Chapter 35

Number Theory and Mathematical Foundations

Chapter 35: Number Theory and Mathematical Foundations

You wrote an encryption algorithm that needs to compute 2^1000000007 mod 998244353. If you multiply in a loop a billion times, the program won't finish before the heat death of the universe. But with fast exponentiation, 50 multiplications suffice. You need to check if a number is prime โ€” for numbers on the order of 10^18, trial division requires 10^9 divisions, while Miller-Rabin needs only a few dozen modular exponentiations.

Number theory is not a mathematician's toy. It is the cornerstone of cryptography (RSA relies on the difficulty of factoring large primes), the theoretical guarantee of hash functions (uniform distribution of modular arithmetic), and a core tool in competitive programming (combinatorial counting requires modular inverses). This chapter starts from the most basic GCD and gradually builds a complete number theory toolkit.


Level 1 ยท What You Need to Know

1.1 Greatest Common Divisor (GCD) and Euclid's Algorithm

Problem Definition: Given two positive integers a and b, find their greatest common divisor gcd(a, b).

Core Theorem: gcd(a, b) = gcd(b, a mod b), and when b = 0, gcd(a, 0) = a.

Why does this theorem hold? Let d = gcd(a, b), so d|a and d|b. Since a mod b = a - โŒŠa/bโŒ‹ยทb, and d|a and d|b, we have d|(a mod b). In the reverse direction, d|b and d|(a mod b) implies d|(a mod b + โŒŠa/bโŒ‹ยทb) = d|a. Therefore the set of common divisors of (a, b) and (b, a mod b) are identical, so their GCDs are equal.

def gcd(a: int, b: int) -> int:
    """Euclidean algorithm for GCD
    
    Time complexity: O(log(min(a, b)))
    Space complexity: O(1) iterative version
    """
    while b:
        a, b = b, a % b
    return a

# Recursive version (more intuitive but risk of stack overflow)
def gcd_recursive(a: int, b: int) -> int:
    if b == 0:
        return a
    return gcd_recursive(b, a % b)

Complexity Analysis: The number of iterations in Euclid's algorithm is at most O(log(min(a, b))). This is because after every two iterations, the larger number is at least halved. Specifically, if a > b, then a mod b < a/2 (proof: if b โ‰ค a/2, then a mod b < b โ‰ค a/2; if b > a/2, then a mod b = a - b < a/2). Therefore at most 2ยทlogโ‚‚(min(a, b)) iterations are needed.

Worst case: When a and b are adjacent Fibonacci numbers, Euclid's algorithm reaches its worst case. For example, gcd(F_{n+1}, F_n) requires n-1 iterations because F_{n+1} mod F_n = F_{n-1}. This also proves the complexity is O(log n), since F_n โ‰ˆ ฯ†^n / โˆš5, so n โ‰ˆ log_ฯ†(F_n).

1.2 Least Common Multiple (LCM)

Definition: lcm(a, b) is the smallest positive integer divisible by both a and b.

Relationship to GCD: lcm(a, b) = a ยท b / gcd(a, b)

Why does this formula hold? Factor a and b into primes: a = pโ‚^ฮฑโ‚ ยท pโ‚‚^ฮฑโ‚‚ ยท ... , b = pโ‚^ฮฒโ‚ ยท pโ‚‚^ฮฒโ‚‚ ยท .... Then gcd(a, b) = โˆ pแตข^min(ฮฑแตข, ฮฒแตข), lcm(a, b) = โˆ pแตข^max(ฮฑแตข, ฮฒแตข). Since min(x, y) + max(x, y) = x + y, we get gcd(a, b) ยท lcm(a, b) = a ยท b.

def lcm(a: int, b: int) -> int:
    """Least Common Multiple
    
    Note: divide first, multiply second to prevent overflow
    (in fixed-precision languages)
    """
    return a // gcd(a, b) * b

# LCM of multiple numbers
def lcm_multiple(numbers: list) -> int:
    from functools import reduce
    return reduce(lcm, numbers)

Common mistake: Computing a * b / gcd(a, b) directly may cause integer overflow in C/C++/Java. The correct approach is to divide first: a / gcd(a, b) * b. Python's arbitrary-precision integers don't overflow, but in competitive programming with C++ this is critical.

1.3 Primality Testing

Trial Division: A natural number n > 1 is prime if it cannot be divided by any integer from 2 to โˆšn.

Why only check up to โˆšn? If n = a ยท b with a โ‰ค b, then a โ‰ค โˆšn. So if n has a factor between 1 and n (exclusive), there must be one no greater than โˆšn.

def is_prime(n: int) -> bool:
    """Trial division primality test
    
    Time complexity: O(โˆšn)
    """
    if n < 2:
        return False
    if n < 4:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    # All primes > 3 are of the form 6kยฑ1
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

Why the 6kยฑ1 optimization works: All integers can be written as 6k, 6k+1, 6k+2, 6k+3, 6k+4, or 6k+5. Among these, 6k, 6k+2, 6k+4 are divisible by 2, and 6k+3 is divisible by 3. So primes greater than 3 can only be of the form 6k+1 or 6k+5 (i.e., 6k-1). This skips 2/3 of candidate divisors.

1.4 Sieve of Eratosthenes

Problem: Find all primes from 1 to n.

Idea: Starting from 2, mark all multiples of each prime as composite. Unmarked numbers are prime.

def sieve_of_eratosthenes(n: int) -> list:
    """Sieve of Eratosthenes
    
    Time complexity: O(n log log n)
    Space complexity: O(n)
    """
    if n < 2:
        return []
    
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    
    # Only need to sieve up to โˆšn
    i = 2
    while i * i <= n:
        if is_prime[i]:
            # Start marking from iยฒ (i*2, i*3, ..., i*(i-1) already marked)
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
        i += 1
    
    return [i for i in range(2, n + 1) if is_prime[i]]

Why start marking from iยฒ? When processing prime p, the numbers pยท2, pยท3, ..., pยท(p-1) have already been marked when processing 2, 3, ..., p-1. For example, when p=5: 5ยท2=10 (marked by 2), 5ยท3=15 (marked by 3), 5ยท4=20 (marked by 2). The first new number to mark is 5ยท5=25.

Complexity analysis: Total operations are โˆ‘{pโ‰คn, p prime} n/p = n ยท โˆ‘{pโ‰คn} 1/p. By Mertens' theorem (Mertens, 1874), the sum of reciprocals of primes โˆ‘_{pโ‰คn} 1/p = ln(ln n) + M (M is the Meissel-Mertens constant โ‰ˆ 0.2615). Thus total complexity is O(n log log n).

1.5 Linear Sieve (Euler's Sieve)

The problem with the Sieve of Eratosthenes is that a composite number may be marked multiple times. For example, 30 gets marked by 2, 3, and 5. The linear sieve ensures each composite is marked exactly once by its smallest prime factor, achieving strict O(n) time.

def linear_sieve(n: int) -> list:
    """Linear Sieve (Euler's Sieve)
    
    Time complexity: O(n)
    Space complexity: O(n)
    
    Core idea: each composite m is marked only once by its
    smallest prime factor lpf(m)
    """
    is_prime = [True] * (n + 1)
    primes = []
    
    for i in range(2, n + 1):
        if is_prime[i]:
            primes.append(i)
        
        for p in primes:
            if i * p > n:
                break
            is_prime[i * p] = False
            # Key: stop when p divides i
            # Because for subsequent p' > p, the smallest prime
            # factor of i*p' is p (since p|i), and should be
            # marked when processing (i*p'/p) * p
            if i % p == 0:
                break
    
    return primes

Why if i % p == 0: break guarantees linearity: Suppose i = p ยท k. Then for any prime q > p in the primes list, i ยท q = p ยท k ยท q. The smallest prime factor of this number is p (since p|i, so p|iยทq), not q. If we continue marking i ยท q, this number would be marked twice โ€” once now (incorrectly attributing q as smallest prime factor) and once in the future when iterating to k ยท q (correctly marked by p). So we must stop when p|i.

Linear Sieve vs Sieve of Eratosthenes in practice:

Property Eratosthenes Linear Sieve
Time complexity O(n log log n) O(n)
Constant factor Small (cache-friendly) Large (poor branch prediction)
Actual speed (n=10โธ) About the same About the same
Additional info None Gives smallest prime factor
Code complexity Simple Slightly complex

In practice for n โ‰ค 10โธ, both are similarly fast. The linear sieve's advantage is providing each number's smallest prime factor, useful for fast factorization.

1.6 Common Errors and Pitfalls

Error 1: GCD argument order

Euclid's algorithm doesn't require a > b. If a < b, the first step gives a % b = a (since a < b), resulting in gcd(b, a) โ€” the arguments are automatically swapped.

Error 2: LCM overflow

# Dangerous in C++ (not an issue in Python):
# long long lcm = a * b / gcd(a, b);  // a*b may overflow!
# Correct:
# long long lcm = a / gcd(a, b) * b;  // divide first

Error 3: Array bounds in sieve

# Wrong: range should go to n+1
is_prime = [True] * n  # should be n+1
# Accessing is_prime[n] causes out-of-bounds

Error 4: Treating 1 as prime

1 is neither prime nor composite. This convention exists because the Fundamental Theorem of Arithmetic requires unique factorization โ€” if 1 were prime, then 6 = 2 ร— 3 = 1 ร— 2 ร— 3 = 1 ร— 1 ร— 2 ร— 3... would not be unique.


Level 2 ยท How It Works Under the Hood

2.1 Modular Exponentiation (Fast Power)

Problem: Compute a^n mod m, where n can be very large (e.g., 10^18).

Idea: Use binary expansion to reduce n multiplications to O(log n).

a^13 = a^(1101โ‚‚) = a^8 ยท a^4 ยท a^1

Each squaring doubles the exponent while checking if n's current lowest bit is 1.

def power_mod(base: int, exp: int, mod: int) -> int:
    """Modular exponentiation (fast power)
    
    Computes base^exp % mod
    Time complexity: O(log exp)
    Space complexity: O(1)
    """
    result = 1
    base %= mod
    
    while exp > 0:
        if exp & 1:
            result = result * base % mod
        base = base * base % mod
        exp >>= 1
    
    return result

Execution trace: Computing 2^13 mod 1000

Step exp (binary) base result Operation
Init 1101 2 1 -
1 110 4 2 exp&1=1, result*=base; baseยฒ=base
2 11 16 2 exp&1=0; baseยฒ=base
3 1 256 32 exp&1=1, result*=base; baseยฒ=base
4 0 65536 8192 exp&1=1, result*=base; done

Verify: 2^13 = 8192, 8192 mod 1000 = 192. Correct.

Why take mod at every step? Without modular reduction, intermediate results grow exponentially โ€” 2^(10^18) has more digits than atoms in the universe. Taking mod at each step keeps intermediate results below modยฒ (product of two numbers less than mod), making the computation feasible in finite precision.

2.2 Extended Euclidean Algorithm

Problem: Given a, b, find integers x, y such that ax + by = gcd(a, b).

Bรฉzout's Identity: For any integers a, b, there exist integers x, y such that ax + by = gcd(a, b). This is the theoretical basis for the extended Euclidean algorithm.

def extended_gcd(a: int, b: int) -> tuple:
    """Extended Euclidean Algorithm
    
    Returns (g, x, y) such that ax + by = g = gcd(a, b)
    Time complexity: O(log(min(a, b)))
    """
    if b == 0:
        return a, 1, 0  # a*1 + 0*0 = a
    
    g, x1, y1 = extended_gcd(b, a % b)
    # From b*x1 + (a%b)*y1 = g
    # i.e., b*x1 + (a - โŒŠa/bโŒ‹*b)*y1 = g
    # i.e., a*y1 + b*(x1 - โŒŠa/bโŒ‹*y1) = g
    x = y1
    y = x1 - (a // b) * y1
    return g, x, y

Derivation: Suppose we already know gcd(b, a%b) = g with bยทxโ‚ + (a%b)ยทyโ‚ = g.

Since a%b = a - โŒŠa/bโŒ‹ยทb, substituting:

Comparing with aยทx + bยทy = g: x = yโ‚, y = xโ‚ - โŒŠa/bโŒ‹ยทyโ‚.

Iterative version (avoids stack overflow):

def extended_gcd_iterative(a: int, b: int) -> tuple:
    """Extended Euclidean Algorithm - iterative version"""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t
    
    return old_r, old_s, old_t

2.3 Modular Inverse

Definition: The modular inverse of a modulo m is x satisfying aยทx โ‰ก 1 (mod m), denoted aโปยน mod m.

Existence condition: aโปยน mod m exists if and only if gcd(a, m) = 1 (i.e., a and m are coprime).

Method 1: Extended Euclidean

From ax + my = gcd(a, m) = 1, we get ax โ‰ก 1 (mod m), so x is the modular inverse.

def mod_inverse(a: int, m: int) -> int:
    """Modular inverse via extended Euclidean algorithm
    
    Precondition: gcd(a, m) = 1
    Returns x such that a*x โ‰ก 1 (mod m)
    """
    g, x, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError(f"{a} and {m} are not coprime; inverse doesn't exist")
    return x % m  # ensure positive result

Method 2: Fermat's Little Theorem (when m is prime)

Fermat's Little Theorem: If p is prime and gcd(a, p) = 1, then a^(p-1) โ‰ก 1 (mod p).

Therefore a ยท a^(p-2) โ‰ก 1 (mod p), i.e., aโปยน โ‰ก a^(p-2) (mod p).

def mod_inverse_fermat(a: int, p: int) -> int:
    """Modular inverse via Fermat's Little Theorem (only when p is prime)
    
    a^(-1) โ‰ก a^(p-2) (mod p)
    """
    return power_mod(a, p - 2, p)

Comparison:

Method Condition Time Advantage
Extended Euclidean gcd(a, m) = 1 O(log m) General; m need not be prime
Fermat's theorem m is prime O(log m) Concise code, smaller constant

2.4 Combinatorics and Lucas' Theorem

Computing C(n, k):

Method 1 (small scale): Pascal's Triangle

def comb_pascal(n: int, k: int) -> int:
    """Pascal's Triangle method
    
    Time: O(nk), Space: O(k)
    """
    if k > n - k:
        k = n - k
    
    dp = [0] * (k + 1)
    dp[0] = 1
    
    for i in range(1, n + 1):
        for j in range(min(i, k), 0, -1):
            dp[j] = dp[j] + dp[j - 1]
    
    return dp[k]

Method 2 (modular, large scale): Precompute factorials and their inverses

class CombMod:
    """O(1) binomial coefficient queries mod p after O(n) preprocessing"""
    
    def __init__(self, n: int, mod: int):
        self.mod = mod
        self.fact = [1] * (n + 1)
        self.inv_fact = [1] * (n + 1)
        
        for i in range(1, n + 1):
            self.fact[i] = self.fact[i - 1] * i % mod
        
        self.inv_fact[n] = power_mod(self.fact[n], mod - 2, mod)
        for i in range(n - 1, -1, -1):
            self.inv_fact[i] = self.inv_fact[i + 1] * (i + 1) % mod
    
    def comb(self, n: int, k: int) -> int:
        """Compute C(n, k) mod p"""
        if k < 0 or k > n:
            return 0
        return self.fact[n] * self.inv_fact[k] % self.mod * self.inv_fact[n - k] % self.mod

Lucas' Theorem: When p is prime and n, k may be very large (but p is small):

C(n, k) mod p = C(n mod p, k mod p) ยท C(n // p, k // p) mod p

def lucas(n: int, k: int, p: int) -> int:
    """Lucas' Theorem for C(n, k) mod p
    
    Requires: p is prime
    Use when: n, k are large but p is small (e.g., p โ‰ค 10โต)
    Time: O(p + log_p(n) ยท log p)
    """
    fact = [1] * p
    for i in range(1, p):
        fact[i] = fact[i - 1] * i % p
    
    inv_fact = [1] * p
    inv_fact[p - 1] = power_mod(fact[p - 1], p - 2, p)
    for i in range(p - 2, -1, -1):
        inv_fact[i] = inv_fact[i + 1] * (i + 1) % p
    
    def small_comb(n, k):
        if k < 0 or k > n:
            return 0
        return fact[n] * inv_fact[k] % p * inv_fact[n - k] % p
    
    result = 1
    while n > 0 or k > 0:
        ni, ki = n % p, k % p
        result = result * small_comb(ni, ki) % p
        n //= p
        k //= p
    
    return result

Intuition behind Lucas' Theorem: Think of n and k as base-p numbers. C(n, k) mod p equals the product of the binomial coefficients of each digit pair. This is analogous to "digit-by-digit" processing in base 10, but applied to binomial coefficients modulo a prime.

2.5 Variants and Applications of Fast Exponentiation

Matrix exponentiation: Accelerates linear recurrences.

def matrix_mult(A: list, B: list, mod: int) -> list:
    """Matrix multiplication mod"""
    n = len(A)
    m = len(B[0])
    k = len(B)
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for l in range(k):
                C[i][j] = (C[i][j] + A[i][l] * B[l][j]) % mod
    return C

def matrix_power(M: list, n: int, mod: int) -> list:
    """Matrix fast exponentiation
    
    Computes M^n mod p
    Used to accelerate Fibonacci and other linear recurrences
    """
    size = len(M)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    
    while n > 0:
        if n & 1:
            result = matrix_mult(result, M, mod)
        M = matrix_mult(M, M, mod)
        n >>= 1
    
    return result

def fibonacci_fast(n: int, mod: int = 10**9 + 7) -> int:
    """O(log n) computation of the nth Fibonacci number
    
    [F(n+1), F(n)] = [[1,1],[1,0]]^n ยท [1, 0]
    """
    if n <= 1:
        return n
    M = [[1, 1], [1, 0]]
    result = matrix_power(M, n, mod)
    return result[0][1]

Applications:

2.6 Prime Distribution and the Prime Number Theorem

Prime Number Theorem (Hadamard and de la Vallรฉe-Poussin, independently proved in 1896):

ฯ€(n) ~ n / ln(n)

where ฯ€(n) is the number of primes not exceeding n. More precisely:

lim_{nโ†’โˆž} ฯ€(n) / (n / ln n) = 1

Practical implications: This tells us there are approximately n/ln(n) primes in [1, n]. For example:

The practical meaning: near n, on average one in every ln(n) consecutive integers is prime. This means:


Level 3 ยท What the Specifications Say

3.1 History of Euclid's Algorithm (circa 300 BCE)

Euclid's algorithm appears in Book VII of Euclid's Elements (Propositions 1 and 2), circa 300 BCE. This makes it one of the oldest non-trivial algorithms in human history โ€” predating the "mutual subtraction" method in China's Nine Chapters on the Mathematical Art (circa 200 CE) by about 500 years.

Original formulation: Euclid did not use the modulo operation but rather repeated subtraction. Given two line segments AB and CD (AB > CD), repeatedly subtract CD's length from AB until the remainder is less than CD, then swap roles and continue. This is mathematically equivalent to modulo (a mod b = a - โŒŠa/bโŒ‹ยทb), but the modular version converges faster.

Chinese "mutual subtraction" method: The Nine Chapters records a method essentially identical to Euclid's subtraction version: "If both are even, halve them; otherwise, subtract the smaller from the larger repeatedly until they are equal." It first tries dividing both by 2, then uses repeated subtraction.

Lamรฉ's Theorem (1844): Gabriel Lamรฉ proved that the number of steps in Euclid's algorithm is at most 5 times the number of digits in the input. Specifically, if a > b โ‰ฅ 1, the number of divisions is at most 5ยทlogโ‚โ‚€(b) + 1. This is one of the earliest complexity proofs in the history of algorithm analysis.

Core of Lamรฉ's proof: He showed that the smallest inputs requiring n steps are consecutive Fibonacci numbers (F_{n+2}, F_{n+1}). Since F_n โ‰ˆ ฯ†^n/โˆš5 (ฯ† = (1+โˆš5)/2 โ‰ˆ 1.618), the number of steps n โ‰ˆ log_ฯ†(b) = logโ‚โ‚€(b) / logโ‚โ‚€(ฯ†) โ‰ˆ 2.078ยทlogโ‚โ‚€(b), giving the upper bound 5ยทlogโ‚โ‚€(b).

3.2 Fermat's Little Theorem

Theorem (Pierre de Fermat, stated in a 1640 letter; first proved by Euler in 1736):

If p is prime and a is not divisible by p, then a^(p-1) โ‰ก 1 (mod p).

Equivalent form: For any integer a, a^p โ‰ก a (mod p).

Proof (group-theoretic): Consider the nonzero residues {1, 2, ..., p-1} modulo p, which form a group of order p-1 under multiplication (denoted (Z/pZ)*). By Lagrange's theorem, the order of every element divides the group order, i.e., ord(a) | (p-1). Therefore a^(p-1) = (a^ord(a))^((p-1)/ord(a)) = 1^((p-1)/ord(a)) = 1.

Combinatorial proof (without group theory): Consider a, 2a, 3a, ..., (p-1)a modulo p. They are pairwise distinct (otherwise ia โ‰ก ja (mod p) means (i-j)a โ‰ก 0 (mod p), but pโˆคa and 0 < |i-j| < p, contradiction). So they form a permutation of 1, 2, ..., p-1. Therefore:

a ยท 2a ยท 3a ยท ... ยท (p-1)a โ‰ก 1 ยท 2 ยท 3 ยท ... ยท (p-1) (mod p)

a^(p-1) ยท (p-1)! โ‰ก (p-1)! (mod p)

Since (p-1)! is coprime to p, we can cancel to get a^(p-1) โ‰ก 1 (mod p).

Applications:

  1. Computing modular inverse: aโปยน โ‰ก a^(p-2) (mod p)
  2. Primality testing (Fermat test): If a^(n-1) โ‰ข 1 (mod n), then n is definitely composite
  3. Caveat: Carmichael numbers exist (e.g., 561 = 3ยท11ยท17) โ€” they are composite but satisfy a^(n-1) โ‰ก 1 (mod n) for all a coprime to n

3.3 Chinese Remainder Theorem (CRT)

Theorem (Sunzi Suanjing, circa 3rd-5th century CE):

Let mโ‚, mโ‚‚, ..., mโ‚– be pairwise coprime. Then the system of congruences

x โ‰ก aโ‚ (mod mโ‚) x โ‰ก aโ‚‚ (mod mโ‚‚) ... x โ‰ก aโ‚– (mod mโ‚–)

has a unique solution modulo M = mโ‚ยทmโ‚‚ยท...ยทmโ‚–.

Historical context: The original problem in Sunzi Suanjing: "There are things whose number is unknown. When divided by 3, remainder is 2; by 5, remainder is 3; by 7, remainder is 2. What is the number?" (Answer: 23)

Constructive proof:

Let M = mโ‚ยทmโ‚‚ยท...ยทmโ‚– and Mแตข = M/mแตข.

Since gcd(Mแตข, mแตข) = 1, there exists tแตข (the inverse of Mแตข mod mแตข) with Mแตข ยท tแตข โ‰ก 1 (mod mแตข).

The solution is: x = ฮฃแตข aแตข ยท Mแตข ยท tแตข (mod M)

Verification: Taking x mod mโฑผ, all terms except i = j vanish (their Mแตข contains factor mโฑผ), leaving aโฑผ ยท 1 = aโฑผ.

def chinese_remainder_theorem(remainders: list, moduli: list) -> int:
    """Chinese Remainder Theorem
    
    Input: remainders = [a1, a2, ..., ak]
           moduli = [m1, m2, ..., mk] (pairwise coprime)
    Output: x such that x โ‰ก ai (mod mi) for all i
    
    Time: O(k ยท log(max(mi)))
    """
    M = 1
    for m in moduli:
        M *= m
    
    x = 0
    for ai, mi in zip(remainders, moduli):
        Mi = M // mi
        _, ti, _ = extended_gcd(Mi, mi)
        x += ai * Mi * ti
    
    return x % M

Practical example:

# Sunzi Suanjing problem
# x โ‰ก 2 (mod 3), x โ‰ก 3 (mod 5), x โ‰ก 2 (mod 7)
result = chinese_remainder_theorem([2, 3, 2], [3, 5, 7])
print(result)  # 23
# Verify: 23 % 3 = 2, 23 % 5 = 3, 23 % 7 = 2 โœ“

CRT in real systems:

  1. CRT-RSA: Instead of computing c^d mod n (n is 2048-bit), compute c^d mod p and c^d mod q (each 1024-bit) separately, then combine with CRT. This is ~4x faster.

  2. Big integer multiplication: Choose small primes pโ‚, pโ‚‚, ..., multiply mod each pแตข, then reconstruct with CRT.

  3. Multi-modulus technique in contests: When answers may be huge, compute mod 3 large primes separately, then reconstruct with CRT.

3.4 Euler's Theorem and Euler's Totient Function

Euler's totient ฯ†(n): The count of positive integers less than n that are coprime to n.

def euler_phi(n: int) -> int:
    """Euler's totient function
    
    ฯ†(n) = n ยท โˆ(1 - 1/p) for all distinct prime factors p of n
    Time: O(โˆšn)
    """
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result

Euler's Theorem (Euler, 1763): If gcd(a, n) = 1, then a^ฯ†(n) โ‰ก 1 (mod n).

This generalizes Fermat's Little Theorem: when n = p is prime, ฯ†(p) = p-1, reducing to Fermat's theorem.

Properties of Euler's totient:

3.5 Miller-Rabin Primality Test

For large numbers (e.g., 10^18), trial division is too slow. Miller-Rabin is a probabilistic primality test based on the following observation:

Setup: If p is an odd prime, write p-1 = 2^s ยท d (d odd). For any 1 < a < p:

(This follows from Fermat's theorem a^(p-1) โ‰ก 1 and the fact that the only square roots of 1 mod p are ยฑ1)

If n fails this test for some a, then n is definitely composite.

def miller_rabin(n: int, witnesses: list = None) -> bool:
    """Miller-Rabin primality test
    
    For n < 3,317,044,064,679,887,385,961,981,
    witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
    gives a deterministic result (no probability of error).
    """
    if n < 2:
        return False
    if n == 2 or n == 3:
        return True
    if n % 2 == 0:
        return False
    
    # Write n-1 = 2^s * d
    s, d = 0, n - 1
    while d % 2 == 0:
        s += 1
        d //= 2
    
    if witnesses is None:
        witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
    
    for a in witnesses:
        if a >= n:
            continue
        
        x = power_mod(a, d, n)
        
        if x == 1 or x == n - 1:
            continue
        
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    
    return True

Deterministic guarantee: Gary Miller (1976) proved that under the Generalized Riemann Hypothesis (GRH), testing a = 2, 3, ..., 2(ln n)ยฒ suffices for deterministic primality. In practice, for 64-bit integers (n < 2^64), a fixed set of 12 witnesses is sufficient.


Level 4 ยท Edge Cases and Interview Problems

4.1 Interview Problem: Count Primes (LeetCode #204)

Problem: Given integer n, return the count of all primes less than n.

Analysis: Direct application of the Sieve of Eratosthenes. Note: "less than n", not "at most n".

def count_primes(n: int) -> int:
    """LeetCode 204: Count Primes
    
    Time: O(n log log n)
    Space: O(n)
    """
    if n <= 2:
        return 0
    
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    
    i = 2
    while i * i < n:
        if is_prime[i]:
            for j in range(i * i, n, i):
                is_prime[j] = False
        i += 1
    
    return sum(is_prime)

Follow-up questions in interviews:

  1. "Can we use only O(โˆšn) space?" โ€” Segmented Sieve
  2. "What if n = 10^10?" โ€” Meissel-Lehmer algorithm, O(n^(2/3)) time
  3. "How to test a single large number?" โ€” Miller-Rabin

Optimization: Use a bitset instead of bool array to save 8x memory. In Python, use bytearray:

def count_primes_optimized(n: int) -> int:
    """Optimized: only process odd numbers"""
    if n <= 2:
        return 0
    if n <= 3:
        return 1
    
    # Only store odds: sieve[i] represents whether 2i+3 is prime
    size = (n - 3) // 2 + 1
    sieve = bytearray(b'\x01') * size
    
    i = 0
    while (2 * i + 3) ** 2 < n:
        if sieve[i]:
            p = 2 * i + 3
            start = (p * p - 3) // 2
            sieve[start::p] = bytearray(len(sieve[start::p]))
        i += 1
    
    return sum(sieve) + 1  # +1 for prime 2

4.2 Interview Problem: Happy Number (LeetCode #202)

Problem: Determine if a number n is a "happy number." A happy number is defined by replacing the number with the sum of squares of its digits, repeating until it reaches 1 (happy) or loops endlessly (not happy).

Number-theoretic analysis: The key insight is the process must enter a cycle (for a d-digit number, the next value is at most 81d, so the range is finite). Moreover, unhappy numbers always enter the cycle containing 4: 4 โ†’ 16 โ†’ 37 โ†’ 58 โ†’ 89 โ†’ 145 โ†’ 42 โ†’ 20 โ†’ 4.

def is_happy(n: int) -> bool:
    """LeetCode 202: Happy Number
    
    Method 1: Floyd's cycle detection
    Time: O(log n) (practically O(1) since values quickly shrink below 1000)
    Space: O(1)
    """
    def digit_square_sum(num):
        total = 0
        while num:
            num, digit = divmod(num, 10)
            total += digit * digit
        return total
    
    slow = n
    fast = digit_square_sum(n)
    
    while fast != 1 and slow != fast:
        slow = digit_square_sum(slow)
        fast = digit_square_sum(digit_square_sum(fast))
    
    return fast == 1

def is_happy_v2(n: int) -> bool:
    """Method 2: Check if we hit 4"""
    def digit_square_sum(num):
        total = 0
        while num:
            num, digit = divmod(num, 10)
            total += digit * digit
        return total
    
    while n != 1 and n != 4:
        n = digit_square_sum(n)
    
    return n == 1

Why convergence is fast: For n โ‰ค 999 (3 digits), the next value is at most 9ยฒ+9ยฒ+9ยฒ = 243. For n โ‰ค 9999999 (7 digits), the next value is at most 7ร—81 = 567. So regardless of initial value, within a few steps it drops below 1000, then cycles within a finite set.

4.3 Interview Problem: Pow(x, n) (LeetCode #50)

Problem: Implement pow(x, n) โ€” compute x raised to the power n.

Key pitfalls:

  1. n can be negative โ†’ x^(-n) = 1 / x^n
  2. n can be -2^31 (taking absolute value overflows int32)
  3. x = 0 and n < 0 โ†’ undefined (but problem guarantees this won't happen)
def my_pow(x: float, n: int) -> float:
    """LeetCode 50: Pow(x, n)
    
    Fast exponentiation with negative exponent handling
    Time: O(log |n|)
    Space: O(1)
    """
    if n == 0:
        return 1.0
    
    if n < 0:
        x = 1.0 / x
        n = -n
    
    result = 1.0
    current = x
    
    while n > 0:
        if n & 1:
            result *= current
        current *= current
        n >>= 1
    
    return result

Interview discussion points:

4.4 Interview Problem: Trailing Zeroes (LeetCode #172)

Problem: Given integer n, return the number of trailing zeroes in n!.

Number-theoretic analysis: Trailing zeroes come from factors of 10 = 2 ร— 5. In n!, the count of factor 2 far exceeds factor 5 (every other even number contributes at least one 2, but only every 5th number contributes a 5). So trailing zeroes = total count of factor 5 in n!.

Legendre's Formula: The exponent of prime p in n! is ฮฃ_{i=1}^{โˆž} โŒŠn/p^iโŒ‹

def trailing_zeroes(n: int) -> int:
    """LeetCode 172: Factorial Trailing Zeroes
    
    Count trailing zeros = count factor 5 in n!
    Time: O(logโ‚… n)
    Space: O(1)
    """
    count = 0
    power_of_5 = 5
    while power_of_5 <= n:
        count += n // power_of_5
        power_of_5 *= 5
    return count

# More concise version
def trailing_zeroes_v2(n: int) -> int:
    count = 0
    while n >= 5:
        n //= 5
        count += n
    return count

Intuitive explanation:

Example n = 100: โŒŠ100/5โŒ‹ + โŒŠ100/25โŒ‹ + โŒŠ100/125โŒ‹ = 20 + 4 + 0 = 24. So 100! has 24 trailing zeroes.

Advanced variant: "Trailing zeroes of n! in base k" โ€” factor k into primes, use Legendre's formula for each, take the minimum.

4.5 Common Number Theory Pitfalls in Contests

Pitfall 1: Negative numbers in modular arithmetic

# Python: -7 % 3 = 2 (always non-negative) โ€” correct mathematical definition
# C/C++: -7 % 3 = -1 (preserves sign) โ€” needs manual fix
# Safe mod in C++: ((a % m) + m) % m

Pitfall 2: Modular arithmetic doesn't distribute over division

# Wrong: (a / b) % m โ‰  (a % m) / (b % m)
# Correct: (a / b) % m = (a % m) * inverse(b, m) % m

# Example: (10 / 2) % 7 = 5
# Wrong: (10 % 7) / (2 % 7) = 3 / 2 = 1.5 ??
# Correct: (10 % 7) * mod_inverse(2, 7) % 7 = 3 * 4 % 7 = 5
# (since 2 * 4 = 8 โ‰ก 1 (mod 7), so 2โปยน โ‰ก 4 (mod 7))

Pitfall 3: Fast power with base=0 or mod=1

def power_mod_safe(base, exp, mod):
    if mod == 1:
        return 0  # anything mod 1 = 0
    if base == 0:
        return 0 if exp > 0 else 1  # 0^0 usually defined as 1
    # ... normal fast power

Pitfall 4: C(n, k) when k > n

# Must return 0 when k > n; don't let factorial compute negative values
def safe_comb(n, k, mod):
    if k < 0 or k > n:
        return 0
    # ... normal computation

4.6 Complete Number Theory Toolkit for Interviews

class NumberTheoryToolbox:
    """Complete number theory toolkit for contests/interviews"""
    
    def __init__(self, mod: int = 10**9 + 7):
        self.mod = mod
    
    def power(self, base: int, exp: int) -> int:
        result = 1
        base %= self.mod
        while exp > 0:
            if exp & 1:
                result = result * base % self.mod
            base = base * base % self.mod
            exp >>= 1
        return result
    
    def inverse(self, a: int) -> int:
        return self.power(a, self.mod - 2)
    
    def factorial_and_inverse(self, n: int) -> tuple:
        fact = [1] * (n + 1)
        for i in range(1, n + 1):
            fact[i] = fact[i - 1] * i % self.mod
        
        inv_fact = [1] * (n + 1)
        inv_fact[n] = self.power(fact[n], self.mod - 2)
        for i in range(n - 1, -1, -1):
            inv_fact[i] = inv_fact[i + 1] * (i + 1) % self.mod
        
        return fact, inv_fact
    
    def comb(self, n: int, k: int, fact: list, inv_fact: list) -> int:
        if k < 0 or k > n:
            return 0
        return fact[n] * inv_fact[k] % self.mod * inv_fact[n - k] % self.mod
    
    @staticmethod
    def gcd(a: int, b: int) -> int:
        while b:
            a, b = b, a % b
        return a
    
    @staticmethod
    def lcm(a: int, b: int) -> int:
        return a // NumberTheoryToolbox.gcd(a, b) * b
    
    @staticmethod
    def extended_gcd(a: int, b: int) -> tuple:
        if b == 0:
            return a, 1, 0
        g, x, y = NumberTheoryToolbox.extended_gcd(b, a % b)
        return g, y, x - (a // b) * y
    
    @staticmethod
    def crt(remainders: list, moduli: list) -> int:
        M = 1
        for m in moduli:
            M *= m
        result = 0
        for a, m in zip(remainders, moduli):
            Mi = M // m
            _, t, _ = NumberTheoryToolbox.extended_gcd(Mi, m)
            result += a * Mi * t
        return result % M

4.7 Interview Response Strategy

When encountering number theory problems in interviews:

  1. Confirm bounds first: The range of n determines algorithm choice

    • n โ‰ค 10โถ: Sieve / brute force
    • n โ‰ค 10โท: Linear sieve
    • n โ‰ค 10^18: Fast power / Miller-Rabin / Pollard Rho
  2. Three principles of modular arithmetic:

    • Addition, subtraction, multiplication can take mod anytime
    • Division must be converted to multiplication by inverse
    • Take mod at every step to prevent overflow
  3. Common pattern recognition:

    • "Result modulo 10^9+7" โ†’ Need modular inverse, fast power
    • "Compute binomial coefficient" โ†’ Preprocess factorials + Lucas
    • "Greatest common divisor" โ†’ Euclid's algorithm
    • "Congruence equations" โ†’ Extended Euclidean / CRT
  4. Complexity quick reference:

    • gcd: O(log n)
    • Fast power: O(log n)
    • Sieve of Eratosthenes [1,n]: O(n log log n)
    • Linear sieve [1,n]: O(n)
    • Trial division: O(โˆšn)
    • Miller-Rabin: O(k ยท logยฒn), k = number of witnesses
Rate this chapter
4.8  / 5  (3 ratings)

๐Ÿ’ฌ Comments