Chapter 24

Recursion and Divide-and-Conquer

Chapter 24: Recursion and Divide-and-Conquer

You write a function that calls itself within its own body. The program does not spin into an infinite loop โ€” because each call shrinks the problem slightly, until it bottoms out and returns. This is recursion: a way of thinking that decomposes complex problems into structurally identical smaller problems.

Recursion is not syntactic sugar. It is not a programming trick. It is a computational model. In 1936, Church and Turing independently proved the equivalence of computability โ€” Church through the lambda calculus (whose core mechanism is recursive definition) and Turing through his machine. When you write def f(n): return f(n-1) in Python, you are employing a reasoning structure isomorphic to mathematical induction: assume f(n-1) is correct, prove f(n) is correct.

Divide and Conquer is a specific application pattern of recursion: split a problem into several smaller sub-problems of the same structure, solve them recursively, then combine their solutions into the solution for the original problem. Merge sort, fast exponentiation, Karatsuba multiplication, Strassen matrix multiplication โ€” these seemingly unrelated algorithms all share the same divide-and-conquer framework.

This chapter starts from the three elements of recursion, dives into the mathematical analysis of divide-and-conquer, and lands on high-frequency interview applications. You will see that recursion is more than "a function calling itself," and divide-and-conquer is more than "splitting an array in half." They are among the most powerful paradigms in algorithm design.


Level 1 ยท What You Need to Know

1.1 The Three Elements of Recursion

Every correct recursive function must contain three elements:

Element Definition Consequence if Missing
Base Case Problem is small enough to solve directly Infinite recursion, stack overflow
Recurrence Relation How f(n) is expressed in terms of f(smaller) Cannot construct the recursion
Size Reduction Each recursive call's argument is closer to the base case Infinite recursion

These three elements correspond to the three parts of mathematical induction: the base step, the inductive step, and well-ordering.

Why these three and not something else? Because recursion is fundamentally induction over a well-founded relation. A well-founded relation requires that no infinite descending chain exists โ€” this is precisely the mathematical expression of "the size must decrease." Without a base case, recursion has no endpoint; without a recurrence relation, recursion has no computational content; without size reduction, recursion does not terminate. These three conditions are exactly the necessary and sufficient conditions for recursion to terminate correctly (Floyd, "Assigning Meanings to Programs", 1967).

Factorial: The Hello World of Recursion

def factorial(n: int) -> int:
    """Compute n! = n ร— (n-1) ร— ... ร— 1
    
    Three elements:
    - Base case: n == 0 returns 1 (0! = 1 by mathematical definition)
    - Recurrence: n! = n ร— (n-1)!
    - Size reduction: n โ†’ n-1, strictly decreasing, must reach 0
    """
    if n == 0:          # Base case
        return 1
    return n * factorial(n - 1)  # Recurrence + size reduction

The execution of factorial(4):

factorial(4)
  = 4 * factorial(3)
  = 4 * (3 * factorial(2))
  = 4 * (3 * (2 * factorial(1)))
  = 4 * (3 * (2 * (1 * factorial(0))))
  = 4 * (3 * (2 * (1 * 1)))
  = 4 * (3 * (2 * 1))
  = 4 * (3 * 2)
  = 4 * 6
  = 24

Notice the shape of this expansion: it first extends to the right (recursive calls), then contracts from right to left (return values propagating back). This is the "recur" and "return" of recursion โ€” first descend to the base case, then ascend back with results.

Fibonacci: Recursion vs Iteration

def fib_recursive(n: int) -> int:
    """Naive recursive Fibonacci
    
    Three elements:
    - Base case: n <= 1 returns n
    - Recurrence: F(n) = F(n-1) + F(n-2)
    - Size reduction: n โ†’ n-1 and n โ†’ n-2
    
    Problem: O(2^n) time due to massive redundant computation
    """
    if n <= 1:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n: int) -> int:
    """Iterative Fibonacci โ€” O(n) time, O(1) space"""
    if n <= 1:
        return n
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr

Why is the recursive version slow? Draw the call tree for fib_recursive(5):

                    fib(5)
                   /      \
              fib(4)      fib(3)
             /    \       /    \
         fib(3)  fib(2) fib(2) fib(1)
        /    \   /   \   /   \
    fib(2) fib(1) fib(1) fib(0) fib(1) fib(0)
    /   \
fib(1) fib(0)

fib(3) is computed 2 times, fib(2) is computed 3 times. For fib(n), the total number of calls is approximately phi^n where phi = 1.618 (the golden ratio), yielding exponential time complexity. The iterative version computes each subproblem exactly once, achieving O(n).

General comparison โ€” Recursion vs Iteration:

Dimension Recursion Iteration
Code clarity Often more intuitive (especially for tree structures) May require explicit stack management
Space overhead O(recursion depth) stack space Typically O(1)
Function call overhead Frame creation/destruction per call None
Best for Divide-and-conquer, tree/graph traversal, backtracking Linear recurrences, simple loops

Conclusion: Recursion is not "more advanced" and iteration is not "more primitive." The choice depends on whether the problem structure is naturally recursive. Fibonacci is a linear recurrence โ€” iteration is more natural. Tree traversal is a recursive structure โ€” recursion is more natural.

1.2 The Core Idea of Divide-and-Conquer

Divide-and-conquer algorithms follow a three-step paradigm:

Divide-and-Conquer(P):
    1. DIVIDE: Decompose problem P into k sub-problems Pโ‚, Pโ‚‚, ..., Pโ‚–
    2. CONQUER: Recursively solve each sub-problem
    3. COMBINE: Merge sub-problem solutions into the solution for P

This paradigm was systematically articulated by Knuth in The Art of Computer Programming (1973), though the idea predates him. Any divide-and-conquer algorithm must answer three design questions:

  1. How to divide? โ€” How to partition the problem into sub-problems
  2. When to stop? โ€” At what size to solve directly
  3. How to combine? โ€” How to merge sub-solutions

Preconditions for divide-and-conquer to be effective:

1.3 Merge Sort: The Textbook Divide-and-Conquer

Merge sort is the canonical implementation of divide-and-conquer. John von Neumann first described this algorithm in 1945 (Knuth, TAOCP Vol. 3, 1973).

def merge_sort(arr: list) -> list:
    """Merge sort โ€” O(n log n) time, O(n) space
    
    Three steps:
    1. DIVIDE: Split at the middle into left and right halves
    2. CONQUER: Recursively sort left and right
    3. COMBINE: Merge two sorted arrays
    """
    n = len(arr)
    if n <= 1:  # Base case: 0 or 1 elements are already sorted
        return arr
    
    mid = n // 2
    left = merge_sort(arr[:mid])      # Recursively sort left half
    right = merge_sort(arr[mid:])     # Recursively sort right half
    return merge(left, right)         # Combine


def merge(left: list, right: list) -> list:
    """Merge two sorted arrays into one sorted array โ€” O(n)"""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= ensures stability
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Append remaining elements
    result.extend(left[i:])
    result.extend(right[j:])
    return result

Time complexity analysis:

Let T(n) be the time to merge sort n elements:

Recurrence: T(n) = 2T(n/2) + O(n)

By the Master Theorem: a=2, b=2, f(n)=O(n), log_b(a) = 1, f(n) = Theta(n^1), Case 2, therefore T(n) = O(n log n).

Why is the space complexity O(n)? The merge step requires an auxiliary array. Although recursion depth is O(log n), each level of merging collectively needs O(n) temporary space (shared across all levels at that depth), so total space is O(n).

1.4 Fast Exponentiation: Computing a^n in O(log n)

The naive method of computing a^n multiplies a by itself n times โ€” O(n). When n = 10^9, that means 10^9 multiplications. Fast exponentiation (also called Binary Exponentiation or Exponentiation by Squaring) uses divide-and-conquer to reduce this to O(log n).

Core idea:

$$a^n = \begin{cases} 1 & \text{if } n = 0 \ (a^{n/2})^2 & \text{if } n \text{ is even} \ a \cdot (a^{(n-1)/2})^2 & \text{if } n \text{ is odd} \end{cases}$$

Each recursive step halves the exponent, so only O(log n) multiplications are needed.

def power_recursive(a: float, n: int) -> float:
    """Fast exponentiation โ€” recursive version
    
    Three steps:
    1. DIVIDE: Reduce a^n to the sub-problem a^(n/2)
    2. CONQUER: Recursively compute a^(n/2)
    3. COMBINE: Square (even case) or square-then-multiply (odd case)
    """
    if n == 0:
        return 1
    if n < 0:
        return 1.0 / power_recursive(a, -n)
    
    half = power_recursive(a, n // 2)
    if n % 2 == 0:
        return half * half
    else:
        return half * half * a


def power_iterative(a: float, n: int) -> float:
    """Fast exponentiation โ€” iterative version (more commonly used)
    
    Idea: Express n in binary, process from LSB to MSB
    Example: a^13 = a^(1101โ‚‚) = a^8 ร— a^4 ร— a^1
    """
    if n < 0:
        a = 1.0 / a
        n = -n
    
    result = 1
    base = a
    while n > 0:
        if n & 1:          # Current binary bit is 1
            result *= base
        base *= base       # base successively becomes a, aยฒ, aโด, aโธ, ...
        n >>= 1           # Right shift by one
    return result

Applications of fast exponentiation:

def matrix_power(M: list, n: int, mod: int) -> list:
    """Matrix exponentiation โ€” for accelerating linear recurrences"""
    size = len(M)
    # Identity matrix
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    
    while n > 0:
        if n & 1:
            result = matrix_multiply(result, M, mod)
        M = matrix_multiply(M, M, mod)
        n >>= 1
    return result


def matrix_multiply(A: list, B: list, mod: int) -> list:
    """Matrix multiplication with modular arithmetic"""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] = (C[i][j] + A[i][k] * B[k][j]) % mod
    return C


def fib_fast(n: int, mod: int = 10**9 + 7) -> int:
    """O(log n) computation of the nth Fibonacci number
    
    Using: [F(n+1), F(n)] = [[1,1],[1,0]]^n ร— [1, 0]
    """
    if n <= 1:
        return n
    M = [[1, 1], [1, 0]]
    result = matrix_power(M, n, mod)
    return result[0][1]

1.5 Common Mistakes and Fixes

Mistake 1: Forgetting the base case

# Wrong: infinite recursion
def bad_sum(arr, i=0):
    return arr[i] + bad_sum(arr, i + 1)

# Fixed: add base case
def good_sum(arr, i=0):
    if i == len(arr):  # Base case
        return 0
    return arr[i] + good_sum(arr, i + 1)

Mistake 2: Size not decreasing

# Wrong: n never changes
def bad_power(a, n):
    if n == 0:
        return 1
    return a * bad_power(a, n)  # Should be n-1

# Fixed
def good_power(a, n):
    if n == 0:
        return 1
    return a * good_power(a, n - 1)  # Size decreases

Mistake 3: Incomplete base cases

# Wrong: only handles n==0, not n==1
def bad_fib(n):
    if n == 0:
        return 0
    return bad_fib(n - 1) + bad_fib(n - 2)  # fib(-1) when n=1

# Fixed: cover all base cases
def good_fib(n):
    if n <= 1:  # Handles both n=0 and n=1
        return n
    return good_fib(n - 1) + good_fib(n - 2)

Level 2 ยท How It Works Under the Hood

2.1 Execution of Recursion: Call Stack Visualization

To truly understand recursion, you must understand its execution mechanism. Each function call creates a new stack frame on the call stack, containing:

import sys

def factorial_traced(n: int, depth: int = 0) -> int:
    """Factorial with visualization"""
    indent = "  " * depth
    print(f"{indent}factorial({n}) called")
    
    if n == 0:
        print(f"{indent}factorial({n}) returns 1")
        return 1
    
    result = n * factorial_traced(n - 1, depth + 1)
    print(f"{indent}factorial({n}) returns {result}")
    return result

# Output of factorial_traced(4):
# factorial(4) called
#   factorial(3) called
#     factorial(2) called
#       factorial(1) called
#         factorial(0) called
#         factorial(0) returns 1
#       factorial(1) returns 1
#     factorial(2) returns 2
#   factorial(3) returns 6
# factorial(4) returns 24

The call stack in memory:

When execution reaches factorial(0), the stack holds 5 frames:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ† Stack Top
โ”‚ factorial(0)            โ”‚
โ”‚   n = 0                 โ”‚
โ”‚   return addr โ†’ line X  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ factorial(1)            โ”‚
โ”‚   n = 1                 โ”‚
โ”‚   return addr โ†’ line X  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ factorial(2)            โ”‚
โ”‚   n = 2                 โ”‚
โ”‚   return addr โ†’ line X  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ factorial(3)            โ”‚
โ”‚   n = 3                 โ”‚
โ”‚   return addr โ†’ line X  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ factorial(4)            โ”‚
โ”‚   n = 4                 โ”‚
โ”‚   return addr โ†’ main    โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ† Stack Bottom

Each recursive call pushes a frame; each return pops one. The maximum stack depth equals the maximum recursion depth. For factorial(n), maximum depth is n+1; for binary recursion (like merge sort), maximum depth is O(log n).

Key insight: Recursion depth is NOT the same as total call count. Merge sort on n elements performs O(n log n) operations, but the recursion depth is only O(log n). This means stack space consumption is far less than time consumption.

2.2 Tail Recursion Optimization

Tail recursion occurs when the recursive call is the function's very last operation โ€” nothing needs to be done with the return value after the call returns.

# Not tail-recursive: multiplication happens AFTER the recursive call
def factorial_normal(n):
    if n == 0:
        return 1
    return n * factorial_normal(n - 1)  # Must multiply after return

# Tail-recursive: recursive call is the last operation
def factorial_tail(n, accumulator=1):
    if n == 0:
        return accumulator
    return factorial_tail(n - 1, n * accumulator)  # Nothing to do after this

Why can tail recursion be optimized? Because when the recursive call is the last operation, the current stack frame is no longer needed after the call โ€” its only role would be to receive the return value and pass it unchanged to the caller. So the compiler can reuse the current frame, effectively converting tail recursion into a loop, reducing space from O(n) to O(1).

Languages with Tail Call Optimization (TCO):

Python does NOT support tail recursion optimization. Guido van Rossum explicitly stated in 2009 that it would not be added, for two reasons:

  1. It destroys stack traces, making debugging harder
  2. Python's design philosophy is "explicit is better than implicit"

This means that in Python, even if you write tail-recursive form, it still consumes O(n) stack space. If stack depth matters, you must manually convert to iteration.

2.3 Karatsuba Multiplication

Multiplying two n-digit numbers using the grade-school method requires O(n^2) single-digit multiplications. In 1960, 23-year-old Soviet mathematician Anatoly Karatsuba discovered an O(n^1.585) divide-and-conquer method, disproving Kolmogorov's conjecture that multiplication has a lower bound of Omega(n^2).

Core idea: Split each n-digit number into its high and low halves.

Let x = x_H * B^m + x_L and y = y_H * B^m + y_L, where B is the base and m = n/2.

Naive approach: x * y = x_Hy_H * B^(2m) + (x_Hy_L + x_Ly_H) * B^m + x_Ly_L

This requires 4 multiplications of n/2-digit numbers. Recurrence: T(n) = 4T(n/2) + O(n), which solves to T(n) = O(n^2) โ€” no improvement.

Karatsuba's genius observation: Only 3 multiplications are needed!

Define:

Then: x * y = z2 * B^(2m) + z1 * B^m + z0

Verification: z1 = x_Hy_H + x_Hy_L + x_Ly_H + x_Ly_L - x_Hy_H - x_Ly_L = x_Hy_L + x_Ly_H. Correct!

The recurrence becomes T(n) = 3T(n/2) + O(n). By the Master Theorem: log_2(3) = 1.585, so T(n) = O(n^1.585).

def karatsuba(x: int, y: int) -> int:
    """Karatsuba multiplication
    
    Time: O(n^logโ‚‚3) โ‰ˆ O(n^1.585), where n is the number of digits
    """
    # Base case: small numbers multiply directly
    if x < 10 or y < 10:
        return x * y
    
    # Determine split point
    n = max(len(str(abs(x))), len(str(abs(y))))
    m = n // 2
    
    # Split
    power = 10 ** m
    x_H, x_L = divmod(x, power)
    y_H, y_L = divmod(y, power)
    
    # Three recursive multiplications (not four)
    z2 = karatsuba(x_H, y_H)
    z0 = karatsuba(x_L, y_L)
    z1 = karatsuba(x_H + x_L, y_H + y_L) - z2 - z0
    
    # Combine
    return z2 * (10 ** (2 * m)) + z1 * (10 ** m) + z0

Practical impact: For 1000-digit multiplication, the naive method requires 10^6 single-digit multiplications, while Karatsuba needs only about 10^4.75 โ‰ˆ 59,000 โ€” a speedup of roughly 17x. Python's built-in big integer multiplication (CPython) switches to Karatsuba when numbers exceed a certain size (see Objects/longobject.c in CPython source).

2.4 Strassen Matrix Multiplication

Multiplying two n*n matrices by definition requires O(n^3) multiplications. In 1969, Volker Strassen discovered an O(n^2.807) divide-and-conquer method โ€” the same idea as Karatsuba: partition matrices into four sub-blocks and find a way to use fewer sub-block multiplications.

The naive block multiplication requires 8 multiplications of (n/2)*(n/2) matrices (giving T(n) = 8T(n/2) + O(n^2) = O(n^3)). Strassen reduced 8 to 7 through clever linear combinations:

T(n) = 7T(n/2) + O(n^2) โ†’ T(n) = O(n^(log_2 7)) = O(n^2.807)

Strassen's 7 multiplications:

Mโ‚ = (Aโ‚โ‚ + Aโ‚‚โ‚‚)(Bโ‚โ‚ + Bโ‚‚โ‚‚)
Mโ‚‚ = (Aโ‚‚โ‚ + Aโ‚‚โ‚‚)Bโ‚โ‚
Mโ‚ƒ = Aโ‚โ‚(Bโ‚โ‚‚ - Bโ‚‚โ‚‚)
Mโ‚„ = Aโ‚‚โ‚‚(Bโ‚‚โ‚ - Bโ‚โ‚)
Mโ‚… = (Aโ‚โ‚ + Aโ‚โ‚‚)Bโ‚‚โ‚‚
Mโ‚† = (Aโ‚‚โ‚ - Aโ‚โ‚)(Bโ‚โ‚ + Bโ‚โ‚‚)
Mโ‚‡ = (Aโ‚โ‚‚ - Aโ‚‚โ‚‚)(Bโ‚‚โ‚ + Bโ‚‚โ‚‚)

Cโ‚โ‚ = Mโ‚ + Mโ‚„ - Mโ‚… + Mโ‚‡
Cโ‚โ‚‚ = Mโ‚ƒ + Mโ‚…
Cโ‚‚โ‚ = Mโ‚‚ + Mโ‚„
Cโ‚‚โ‚‚ = Mโ‚ - Mโ‚‚ + Mโ‚ƒ + Mโ‚†

Practical considerations: Strassen's algorithm has large constant factors (many additions required), so it only outperforms naive multiplication when n > a few hundred. NumPy/BLAS uses naive algorithms for small matrices and switches to Strassen or faster algorithms for large ones.

The current fastest known matrix multiplication exponent is approximately 2.371 (Alman & Vassilevska Williams, 2021), but due to enormous constant factors, it is never used in practice.

2.5 Divide-and-Conquer in Sorting

Merge sort and quicksort are both divide-and-conquer algorithms, but their strategies are fundamentally different:

Dimension Merge Sort Quicksort
How to divide By position (mid = n/2) By value (pivot)
Division cost O(1) O(n) (partition)
Combination cost O(n) (merge) O(1) (nothing to combine)
Where work happens During combination During division
Worst-case time O(n log n) O(n^2)
Average time O(n log n) O(n log n)
Stability Stable Unstable
Extra space O(n) O(log n) (stack)

Key insight: Merge sort does its work on the way up ("combine" phase); quicksort does its work on the way down ("divide" phase). Merge sort has an easy split and hard merge; quicksort has a hard split and trivial merge.

def quicksort(arr: list, lo: int = 0, hi: int = None) -> None:
    """Quicksort โ€” in-place divide-and-conquer
    
    Contrast with merge sort:
    - Merge sort: easy divide (split at middle), hard combine (merge sorted arrays)
    - Quicksort: hard divide (partition), easy combine (nothing to do)
    """
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return
    
    # DIVIDE: partition arranges [โ‰คpivot] [pivot] [โ‰ฅpivot]
    pivot_idx = partition(arr, lo, hi)
    
    # CONQUER: recursively sort both sides
    quicksort(arr, lo, pivot_idx - 1)
    quicksort(arr, pivot_idx + 1, hi)
    # COMBINE: nothing! partition already ensures left โ‰ค pivot โ‰ค right


def partition(arr: list, lo: int, hi: int) -> int:
    """Lomuto partition scheme"""
    pivot = arr[hi]
    i = lo - 1
    for j in range(lo, hi):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[hi] = arr[hi], arr[i + 1]
    return i + 1

2.6 Complete Application of the Master Theorem

The Master Theorem provides closed-form solutions for divide-and-conquer recurrences of the form:

T(n) = aT(n/b) + f(n)

where a >= 1 (number of sub-problems), b > 1 (factor by which size shrinks), and f(n) is the combination cost.

Let c_crit = log_b(a). Then:

Case Condition Result
Case 1 f(n) = O(n^c), c < c_crit T(n) = Theta(n^c_crit)
Case 2 f(n) = Theta(n^c_crit * log^k n) T(n) = Theta(n^c_crit * log^(k+1) n)
Case 3 f(n) = Omega(n^c), c > c_crit, plus regularity T(n) = Theta(f(n))

Application examples:

Algorithm Recurrence a b f(n) c_crit Case T(n)
Binary search T(n)=T(n/2)+O(1) 1 2 O(1) 0 Case 2 (k=0) O(log n)
Merge sort T(n)=2T(n/2)+O(n) 2 2 O(n) 1 Case 2 (k=0) O(n log n)
Karatsuba T(n)=3T(n/2)+O(n) 3 2 O(n) 1.585 Case 1 O(n^1.585)
Strassen T(n)=7T(n/2)+O(n^2) 7 2 O(n^2) 2.807 Case 1 O(n^2.807)
Quicksort (best) T(n)=2T(n/2)+O(n) 2 2 O(n) 1 Case 2 (k=0) O(n log n)

Intuitive understanding: c_crit = log_b(a) measures the total work done at the leaves of the recursion tree. If f(n) grows slower than the leaf contribution (Case 1), total time is dominated by leaves. If they are comparable (Case 2), each level contributes equally, multiplied by the number of levels (log n). If f(n) grows faster than the leaves (Case 3), total time is dominated by the root's combination step.


Level 3 ยท What the Theory Says

3.1 The History of Divide-and-Conquer

The divide-and-conquer idea is far older than computer science. The earliest known divide-and-conquer algorithm traces back to a method Carl Friedrich Gauss described in an 1805 manuscript โ€” later proven equivalent to the Fast Fourier Transform (FFT). While computing the orbit of the asteroid Pallas, Gauss needed to evaluate trigonometric series coefficients. He discovered that by decomposing an n-point DFT into two n/2-point DFTs, the computation could be reduced from O(n^2) to O(n log n).

This method was forgotten for over 150 years, until Cooley and Tukey rediscovered and published it in 1965 ("An Algorithm for the Machine Calculation of Complex Fourier Series", Mathematics of Computation). Heideman, Johnson, and Burrus revealed Gauss's priority in their 1984 article "Gauss and the History of the Fast Fourier Transform."

Other early instances of divide-and-conquer:

The term "divide and conquer" itself originates from military strategy (divide et impera, Latin, often attributed to Caesar) and was introduced into computer science by Knuth in The Art of Computer Programming (1973).

3.2 Mathematical Foundations: Recurrence Relations and Characteristic Equations

A recurrence relation defines a sequence where each term is expressed in terms of preceding terms. Solving a recurrence means finding its closed form.

Linear homogeneous recurrence relations have the general form:

a_n = c_1 * a_{n-1} + c_2 * a_{n-2} + ... + c_k * a_{n-k}

The characteristic equation is:

x^k - c_1 * x^(k-1) - c_2 * x^(k-2) - ... - c_k = 0

Theorem (Characteristic Root Method): If the characteristic equation has k distinct roots r_1, r_2, ..., r_k, then the general solution is:

a_n = A_1 * r_1^n + A_2 * r_2^n + ... + A_k * r_k^n

where A_1, ..., A_k are determined by initial conditions.

Example: Deriving the closed form for Fibonacci

F(n) = F(n-1) + F(n-2), F(0)=0, F(1)=1

Characteristic equation: x^2 - x - 1 = 0

Roots: x = (1 +/- sqrt(5)) / 2

Let phi = (1+sqrt(5))/2 (the golden ratio, approximately 1.618) and psi = (1-sqrt(5))/2 (approximately -0.618).

General solution: F(n) = A * phi^n + B * psi^n

Applying initial conditions:

Therefore: F(n) = (phi^n - psi^n) / sqrt(5)

This is Binet's formula (1843). Since |psi| < 1, psi^n approaches 0, so F(n) is approximately phi^n / sqrt(5), rounded to the nearest integer.

This explains why naive recursive Fibonacci has complexity O(phi^n) = O(1.618^n): each call generates two sub-calls, and the total number of calls equals F(n+1), which grows as phi^n by Binet's formula.

3.3 Divide-and-Conquer vs Dynamic Programming

Both divide-and-conquer and dynamic programming decompose problems into sub-problems, but there is a crucial difference:

Dimension Divide-and-Conquer Dynamic Programming
Sub-problem overlap Sub-problems do not overlap Sub-problems overlap extensively
Solution approach Solve each independently Memoization / bottom-up table
Number of sub-problems Usually O(log n) levels Usually O(n) or O(n^2) distinct states
When to use Sub-problems are independent Optimal substructure + overlapping sub-problems

Decision criterion: If you draw the recursion tree and find the same sub-problem appearing at multiple positions (like fib(3) appearing multiple times in the Fibonacci tree), you should use DP. If each sub-problem appears exactly once (like each sub-array in merge sort being sorted exactly once), use divide-and-conquer.

Mathematical perspective: The recursion tree of a divide-and-conquer algorithm is a genuine tree (no repeated nodes) โ€” the number of nodes equals the number of distinct sub-problems. The "recursion tree" of a DP problem is actually a DAG (directed acyclic graph); if expanded as a tree, it would have exponentially many nodes, but after deduplication, only polynomially many remain.

# Divide-and-conquer: each sub-problem solved exactly once
def merge_sort(arr):
    # arr[0:4] and arr[4:8] are distinct sub-problems, never repeated
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

# DP: extensive sub-problem overlap
def fib_memo(n, memo={}):
    # fib(3) might be called from both fib(5) and fib(4)
    if n in memo:
        return memo[n]
    if n <= 1:
        return n
    memo[n] = fib_memo(n-1, memo) + fib_memo(n-2, memo)
    return memo[n]

Edge case: Some problems can be solved with either approach. For example, maximum subarray sum (LeetCode #53) has both a divide-and-conquer solution (O(n log n)) and a Kadane's DP solution (O(n)). The choice depends on your understanding of the problem structure.

3.4 The Akra-Bazzi Theorem

The Master Theorem is convenient but restrictive: sub-problems must all be the same size (n/b). Many real divide-and-conquer algorithms violate this. For example, quicksort's average case splits the array into n/4 and 3n/4.

The Akra-Bazzi Theorem (1998) generalizes the Master Theorem to handle unequal splits:

T(n) = sum_i a_i * T(b_i * n) + f(n)

where 0 < b_i < 1 and a_i > 0.

Find the unique real number p satisfying sum_i a_i * b_i^p = 1. Then:

T(n) = Theta(n^p * (1 + integral from 1 to n of f(u)/(u^(p+1)) du))

Example: Quicksort's average case

Suppose each partition splits the array into n/4 and 3n/4 (not the most typical case, but illustrative):

T(n) = T(n/4) + T(3n/4) + O(n)

Akra-Bazzi: a_1=1, b_1=1/4, a_2=1, b_2=3/4

Find p: (1/4)^p + (3/4)^p = 1

When p=1: 1/4 + 3/4 = 1. Check!

So T(n) = Theta(n * (1 + integral from 1 to n of u/u^2 du)) = Theta(n * (1 + ln n)) = Theta(n log n)

This confirms that even with unbalanced partitions (as long as they are not extremely unbalanced), quicksort remains O(n log n).

3.5 Recursion and Computability

From the perspective of theoretical computer science, recursion has deeper significance.

Primitive Recursive Functions are a class of functions built from these basic operations:

  1. Zero function: Z(n) = 0
  2. Successor function: S(n) = n + 1
  3. Projection function: P_i(x_1,...,x_n) = x_i
  4. Composition
  5. Primitive recursion: from f and g, construct h such that h(0) = f and h(n+1) = g(h(n), n)

Primitive recursive functions include addition, multiplication, exponentiation, and factorial, but not all computable functions are primitive recursive. The Ackermann function (1928) is computable but not primitive recursive โ€” it grows faster than any primitive recursive function.

def ackermann(m: int, n: int) -> int:
    """Ackermann function โ€” computable but not primitive recursive
    
    Growth rate:
    - A(0, n) = n + 1
    - A(1, n) = n + 2
    - A(2, n) = 2n + 3
    - A(3, n) = 2^(n+3) - 3
    - A(4, n) โ‰ˆ 2โ†‘โ†‘(n+3) (hyper-exponential)
    - A(4, 2) = 2^65536 - 3 (a number with about 19,728 digits)
    """
    if m == 0:
        return n + 1
    elif n == 0:
        return ackermann(m - 1, 1)
    else:
        return ackermann(m - 1, ackermann(m, n - 1))

Mu-recursive functions (adding the unbounded search / minimization operator) correspond exactly to all computable functions โ€” this is one formulation of the Church-Turing thesis.


Level 4 ยท Edge Cases and Pitfalls

4.1 Python Recursion Depth Limits

Python's default recursion limit is 1000. Exceeding it raises RecursionError.

import sys

# Check current limit
print(sys.getrecursionlimit())  # Default: 1000

# Modify the limit
sys.setrecursionlimit(10000)

# But this is not a panacea! Two problems:
# 1. The OS stack space is finite (typically 1-8 MB)
# 2. Even with 10000, if each frame is large, you may still segfault

Why does Python limit recursion depth?

Python uses C's call stack to implement Python function calls. The C stack has a fixed size (Linux default 8MB, macOS default 512KB-8MB). Each Python stack frame consumes roughly 1-2 KB, so 1000 levels of recursion consume about 1-2 MB. Without a limit, deep recursion would overflow the C stack, causing a segmentation fault โ€” far harder to debug than an exception.

Common practice in competitive programming:

import sys
import threading

# Method 1: Increase recursion limit + thread stack size
sys.setrecursionlimit(300000)
threading.stack_size(67108864)  # 64MB

def main():
    # Your recursive code here
    pass

thread = threading.Thread(target=main)
thread.start()
thread.join()

# Method 2: Manually convert recursion to iteration (recommended)

4.2 General Methods for Converting Recursion to Iteration

Any recursion can be converted to iteration โ€” because the computer itself uses a stack to implement recursion. You just need to simulate that stack manually.

Method 1: Explicit Stack Simulation

# Recursive DFS
def dfs_recursive(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs_recursive(graph, neighbor, visited)
    return visited

# Iterative version: explicit stack
def dfs_iterative(graph, start):
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                stack.append(neighbor)
    return visited

Method 2: CPS Transform + Trampoline

For more complex recursion (like tree recursion needing post-processing), use Continuation-Passing Style (CPS):

def trampoline(fn):
    """Trampoline: recursion executor that avoids stack overflow"""
    def wrapper(*args):
        result = fn(*args)
        while callable(result):
            result = result()
        return result
    return wrapper

# Factorial using trampoline โ€” no stack overflow
def factorial_cps(n, cont=lambda x: x):
    if n == 0:
        return cont(1)
    # Return a thunk (deferred computation) instead of actual recursive call
    return lambda: factorial_cps(n - 1, lambda x: cont(n * x))

@trampoline
def factorial_safe(n, cont=lambda x: x):
    if n == 0:
        return cont(1)
    return lambda: factorial_safe(n - 1, lambda x: cont(n * x))

# factorial_safe(100000) will NOT overflow the stack

Method 3: Bottom-up (when recursion can be linearized)

# Recursive Fibonacci โ†’ Iterative
# Recursive direction: top-down (n โ†’ n-1 โ†’ ... โ†’ 0)
# Iterative direction: bottom-up (0 โ†’ 1 โ†’ ... โ†’ n)

def fib_bottom_up(n):
    if n <= 1:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]

4.3 High-Frequency Interview Problems

Problem 1: Pow(x, n) โ€” LeetCode #50

def myPow(x: float, n: int) -> float:
    """Implement pow(x, n)
    
    Key pitfalls:
    1. n can be negative
    2. n can be -2^31 (negation overflows 32-bit int, but not in Python)
    3. Must use fast exponentiation, otherwise TLE when n=2^31
    """
    if n < 0:
        x = 1 / x
        n = -n
    
    result = 1.0
    current_product = x
    while n > 0:
        if n & 1:
            result *= current_product
        current_product *= current_product
        n >>= 1
    return result

Follow-up questions in interviews: What if x=0 and n<0? Should handle specially (division by zero). What about x=0 and n=0? Mathematically 0^0 is undefined, but Python's pow(0,0) returns 1 (computer science convention).

Problem 2: Majority Element โ€” LeetCode #169 (Divide-and-Conquer Solution)

def majorityElement(nums: list) -> int:
    """Find the element appearing more than n/2 times
    
    Divide-and-conquer approach:
    - If a is the majority element of the left half OR the right half,
      then a is a candidate for the majority of the whole array
    - Verify by counting
    
    Time: T(n) = 2T(n/2) + O(n) = O(n log n)
    """
    def helper(lo, hi):
        # Base case: single element
        if lo == hi:
            return nums[lo]
        
        mid = (lo + hi) // 2
        left_majority = helper(lo, mid)
        right_majority = helper(mid + 1, hi)
        
        # If both halves agree, that's the answer
        if left_majority == right_majority:
            return left_majority
        
        # Otherwise, count which appears more in the full range
        left_count = sum(1 for i in range(lo, hi + 1) if nums[i] == left_majority)
        right_count = sum(1 for i in range(lo, hi + 1) if nums[i] == right_majority)
        
        return left_majority if left_count > right_count else right_majority
    
    return helper(0, len(nums) - 1)

Interview follow-up: The divide-and-conquer solution is O(n log n), but Boyer-Moore voting algorithm achieves O(n). The value of the D&C solution is demonstrating the paradigm and thought process. Interviewers may ask "Can you do O(n)?"

Problem 3: Maximum Subarray โ€” LeetCode #53 (Divide-and-Conquer vs DP)

def maxSubArray_divide_conquer(nums: list) -> int:
    """Divide-and-conquer solution โ€” O(n log n)
    
    Idea: The maximum subarray either lies entirely in the left half,
          entirely in the right half, or crosses the midpoint.
          Handle the crossing case separately.
    """
    def helper(lo, hi):
        if lo == hi:
            return nums[lo]
        
        mid = (lo + hi) // 2
        
        # Maximum subarray entirely in left half
        left_max = helper(lo, mid)
        # Maximum subarray entirely in right half
        right_max = helper(mid + 1, hi)
        
        # Maximum subarray crossing the midpoint
        # Extend left from mid
        left_sum = float('-inf')
        curr_sum = 0
        for i in range(mid, lo - 1, -1):
            curr_sum += nums[i]
            left_sum = max(left_sum, curr_sum)
        
        # Extend right from mid+1
        right_sum = float('-inf')
        curr_sum = 0
        for i in range(mid + 1, hi + 1):
            curr_sum += nums[i]
            right_sum = max(right_sum, curr_sum)
        
        cross_max = left_sum + right_sum
        
        return max(left_max, right_max, cross_max)
    
    return helper(0, len(nums) - 1)


def maxSubArray_dp(nums: list) -> int:
    """Kadane's Algorithm โ€” O(n) time, O(1) space
    
    DP formulation: dp[i] = maximum subarray sum ending at nums[i]
    Transition: dp[i] = max(nums[i], dp[i-1] + nums[i])
    """
    max_sum = curr_sum = nums[0]
    for i in range(1, len(nums)):
        curr_sum = max(nums[i], curr_sum + nums[i])
        max_sum = max(max_sum, curr_sum)
    return max_sum

Divide-and-conquer vs DP comparison:

Problem 4: Closest Pair of Points

import math

def closest_pair(points: list) -> float:
    """Closest pair of points โ€” O(n log n) divide-and-conquer
    
    Brute force O(nยฒ): check all C(n,2) pairs
    D&C O(n log n): split into left/right halves, find closest pair in each,
                    then handle pairs crossing the dividing line
    """
    def distance(p1, p2):
        return math.sqrt((p1[0]-p2[0])**2 + (p1[1]-p2[1])**2)
    
    def closest_recursive(pts_x, pts_y):
        n = len(pts_x)
        if n <= 3:
            # Brute force
            min_dist = float('inf')
            for i in range(n):
                for j in range(i+1, n):
                    min_dist = min(min_dist, distance(pts_x[i], pts_x[j]))
            return min_dist
        
        mid = n // 2
        mid_point = pts_x[mid]
        
        # Split into left and right halves by x-coordinate
        pts_y_left = [p for p in pts_y if p[0] <= mid_point[0]]
        pts_y_right = [p for p in pts_y if p[0] > mid_point[0]]
        
        # Fix: ensure correct split (handle points with same x-coordinate)
        if len(pts_y_left) > mid:
            pts_y_left = [p for p in pts_y if p[0] < mid_point[0] or
                         (p[0] == mid_point[0] and p[1] <= mid_point[1])]
            pts_y_right = [p for p in pts_y if p not in set(map(tuple, pts_y_left))]
        
        dl = closest_recursive(pts_x[:mid], pts_y_left)
        dr = closest_recursive(pts_x[mid:], pts_y_right)
        d = min(dl, dr)
        
        # Critical step: check pairs crossing the dividing line
        # Only need points with x-coordinate in [mid_x - d, mid_x + d]
        strip = [p for p in pts_y if abs(p[0] - mid_point[0]) < d]
        
        # Key theorem: each point in strip needs to check at most 7 subsequent points
        # (Proof in CLRS Chapter 33)
        for i in range(len(strip)):
            j = i + 1
            while j < len(strip) and (strip[j][1] - strip[i][1]) < d:
                d = min(d, distance(strip[i], strip[j]))
                j += 1
        
        return d
    
    # Pre-sort
    pts_x = sorted(points, key=lambda p: p[0])
    pts_y = sorted(points, key=lambda p: p[1])
    return closest_recursive(pts_x, pts_y)

Why only 7 points need checking in the strip? This is the core lemma of the closest pair problem. In a strip of width 2d and height d, at most 8 points can fit (at most 4 on each side, since any two points on the same side have distance >= d). Excluding itself, each point compares with at most 7 others. This guarantees the strip step is O(n), giving total complexity T(n) = 2T(n/2) + O(n) = O(n log n).

4.4 When to Use Divide-and-Conquer vs DP

This is a common design decision question in interviews. Here are the criteria:

Signals for divide-and-conquer:

  1. The problem naturally splits into independent sub-parts (e.g., left and right halves of an array)
  2. Sub-problems do not share state
  3. The combination step has a clear definition
  4. The recursion tree is a genuine tree (no repeated sub-problems)

Signals for dynamic programming:

  1. The same sub-problem is referenced multiple times in the recurrence
  2. The problem has "optimal substructure" โ€” the optimal solution contains optimal sub-solutions
  3. A state transition equation can be defined
  4. The number of distinct sub-problems is polynomial

Examples in the gray zone:

Problem D&C works? DP works? Optimal choice
Merge sort Yes No D&C
Fibonacci Yes (but exponential) Yes DP
Maximum subarray Yes O(n log n) Yes O(n) DP
Closest pair Yes O(n log n) No D&C
Matrix chain multiplication Yes (but exponential) Yes O(n^3) DP
Fast exponentiation Yes No D&C

Rule of thumb: If sub-problems do not overlap, use divide-and-conquer. If sub-problems overlap but you still use divide-and-conquer, you are doing redundant computation and should switch to DP (add memoization or go bottom-up).

4.5 Recursion in Production Engineering: Pitfalls

Pitfall 1: Recursion + global state = disaster

# Dangerous: modifying global/shared state in recursion
count = 0
def bad_recursive(n):
    global count
    count += 1  # Side effect! Bugs in multithreading, hard to debug
    if n <= 0:
        return
    bad_recursive(n - 1)

# Better: pass state through parameters and return values
def good_recursive(n, count=0):
    if n <= 0:
        return count
    return good_recursive(n - 1, count + 1)

Pitfall 2: Shallow copy trap in recursion

# Wrong: path is mutated after being saved
def bad_backtrack(node, path, results):
    path.append(node.val)
    if not node.left and not node.right:
        results.append(path)  # Shallow reference! Later mutations affect saved results
    # ...

# Correct: store a deep copy
def good_backtrack(node, path, results):
    path.append(node.val)
    if not node.left and not node.right:
        results.append(path[:])  # Critical: path[:] creates a copy
    if node.left:
        good_backtrack(node.left, path, results)
    if node.right:
        good_backtrack(node.right, path, results)
    path.pop()  # Backtrack

Pitfall 3: Base case performance in divide-and-conquer

In practice, when sub-problems become very small (e.g., merge sort with just a few elements), the function call overhead of recursion exceeds the actual computation. High-performance implementations switch to simpler algorithms at small sizes:

def merge_sort_optimized(arr: list) -> list:
    """Production-grade merge sort: switch to insertion sort for small arrays"""
    CUTOFF = 16  # Empirically determined optimal switchover point
    
    if len(arr) <= CUTOFF:
        # Small arrays: insertion sort (cache-friendly, small constants)
        for i in range(1, len(arr)):
            key = arr[i]
            j = i - 1
            while j >= 0 and arr[j] > key:
                arr[j + 1] = arr[j]
                j -= 1
            arr[j + 1] = key
        return arr
    
    mid = len(arr) // 2
    left = merge_sort_optimized(arr[:mid])
    right = merge_sort_optimized(arr[mid:])
    return merge(left, right)

TimSort (Python's built-in sorting algorithm) does exactly this: it first uses insertion sort to create sorted runs of length 32-64, then uses merge sort to combine these runs.

4.6 Parallelization Potential of Divide-and-Conquer

Divide-and-conquer algorithms are naturally suited for parallelization โ€” because sub-problems are independent and can be assigned to different processors.

from concurrent.futures import ProcessPoolExecutor
import multiprocessing

def parallel_merge_sort(arr: list, depth: int = 0) -> list:
    """Parallel merge sort
    
    Only parallelize the first few levels (deeper sub-problems are too small
    to justify parallelization overhead)
    """
    MAX_DEPTH = int(multiprocessing.cpu_count().bit_length())  # logโ‚‚(CPU count)
    
    if len(arr) <= 1:
        return arr
    
    mid = len(arr) // 2
    
    if depth < MAX_DEPTH:
        with ProcessPoolExecutor(max_workers=2) as executor:
            left_future = executor.submit(parallel_merge_sort, arr[:mid], depth+1)
            right_future = executor.submit(parallel_merge_sort, arr[mid:], depth+1)
            left = left_future.result()
            right = right_future.result()
    else:
        left = parallel_merge_sort(arr[:mid], depth+1)
        right = parallel_merge_sort(arr[mid:], depth+1)
    
    return merge(left, right)

Work-Span Model: For parallel divide-and-conquer algorithms, we care about two metrics:

Maximum parallel speedup = W(n) / S(n) (Brent's Theorem).

For merge sort: W(n) = O(n log n), S(n) = O(n) (merge is linear and cannot be fully parallelized). So the speedup upper bound is O(log n). If using parallel merge (binary search + parallel combination), span can be reduced to O(log^2 n), improving speedup to O(n / log n).


Summary

Recursion and divide-and-conquer are not two independent concepts โ€” divide-and-conquer is the most important application pattern of recursion. Mastering them requires understanding:

  1. Recursion = mathematical induction embodied in programming. The three elements (base case, recurrence, size reduction) are all indispensable.
  2. Divide-and-conquer = decompose + recursively solve + combine. Effective divide-and-conquer requires independent sub-problems and manageable combination cost.
  3. The Master Theorem and Akra-Bazzi Theorem provide closed-form complexity solutions for divide-and-conquer recurrences.
  4. Divide-and-conquer vs DP: the deciding factor is whether sub-problems overlap.
  5. Recursion to iteration: explicit stack, trampoline, bottom-up โ€” three methods covering all scenarios.

From Gauss's 1805 FFT precursor, to Karatsuba's 1960 refutation of the multiplication lower bound conjecture, to Strassen's 1969 breakthrough past the O(n^3) matrix multiplication barrier โ€” divide-and-conquer thinking has proven again and again throughout algorithmic history that decomposing one large problem into several smaller ones is often the most profound approach to problem-solving.

Rate this chapter
4.6  / 5  (6 ratings)

๐Ÿ’ฌ Comments