Chapter 21

Sorting: From Bubble Sort to TimSort

Chapter 21: Sorting — From Bubble Sort to TimSort

Open a Python interpreter, type sorted([3, 1, 4, 1, 5, 9]), and you get [1, 1, 3, 4, 5, 9] in microseconds. This seems trivial, but behind it lies over 60 years of computer science wisdom. Sorting is one of the most thoroughly studied problems in computing — from John von Neumann's merge sort implemented on EDVAC in 1945, to Tim Peters' TimSort designed for CPython in 2002, each generation of algorithms pushes closer to "perfect sorting."

Why does sorting matter so much? Because it is the foundation for other algorithms. Binary search requires sorted arrays; database indexes are essentially sorted mappings; OS process scheduling needs priority ordering; search engine results need relevance ordering. Donald Knuth dedicated an entire volume of "The Art of Computer Programming" (Volume 3) to sorting and searching, noting that "a computer spends roughly 25% of its time sorting."

In this chapter, we start from the simplest bubble sort and work our way up to industrial-grade TimSort and introsort, covering everything you need to know about sorting.


Level 1 · What You Need to Know

21.1 Overview of Eight Classic Sorting Algorithms

Before diving into each algorithm, let's see the big picture. The following table summarizes the core properties of eight classic sorting algorithms:

Algorithm Average Time Worst Time Space Stable Use Case
Bubble Sort O(n²) O(n²) O(1) Yes Teaching only
Selection Sort O(n²) O(n²) O(1) No Fewest swaps
Insertion Sort O(n²) O(n²) O(1) Yes Small/nearly-sorted arrays
Merge Sort O(n log n) O(n log n) O(n) Yes Stability needed, linked lists
Quick Sort O(n log n) O(n²) O(log n) No General purpose, fastest in practice
Heap Sort O(n log n) O(n log n) O(1) No Worst-case guarantee
Counting Sort O(n + k) O(n + k) O(k) Yes Integers with small range
Radix Sort O(d·(n+k)) O(d·(n+k)) O(n+k) Yes Fixed-width integers

Here n is the number of elements, k is the value range, and d is the maximum number of digits.

A critical distinction: the first six are comparison sorts — they determine order only by pairwise comparisons. The last two (plus bucket sort) are non-comparison sorts — they exploit the numerical properties of elements to determine positions. This distinction matters enormously: comparison sorts have a theoretical lower bound of O(n log n), while non-comparison sorts can break through it.

21.2 Bubble Sort

Bubble sort is the most intuitive sorting algorithm: repeatedly traverse the array, compare adjacent elements, and swap them if they're in the wrong order. Each pass "bubbles" at least one largest element to its correct position.

def bubble_sort(arr):
    """Bubble sort: compare and swap adjacent elements"""
    n = len(arr)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break  # Optimization: if no swaps in a pass, array is sorted
    return arr

Analysis:

The only virtues of bubble sort are simplicity and stability. In practice, there is almost never a reason to use it — even among O(n²) algorithms, insertion sort is far superior.

21.3 Selection Sort

Selection sort repeatedly finds the minimum element from the unsorted portion and places it at the end of the sorted portion.

def selection_sort(arr):
    """Selection sort: find minimum, place it at front"""
    n = len(arr)
    for i in range(n):
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
    return arr

Analysis:

21.4 Insertion Sort

Insertion sort works like sorting a hand of cards: take a new card and find its correct position among the cards already in hand.

def insertion_sort(arr):
    """Insertion sort: insert each element into its correct position"""
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return arr

Analysis:

Insertion sort is the fastest O(n²) algorithm in practice, because:

  1. The inner loop exits early (unlike bubble and selection which always complete)
  2. Degrades to O(n) for nearly-sorted data
  3. Cache-friendly: always accesses adjacent memory
  4. Small constant factor: each comparison does one assignment (not a swap)

These properties make insertion sort faster than quicksort for small arrays (n < 16~64). Both TimSort and introsort switch to insertion sort for small subarrays.

21.5 Merge Sort

Merge sort is the classic embodiment of divide and conquer:

  1. Divide: Split the array in half
  2. Conquer: Recursively sort each half
  3. Combine: Merge the two sorted halves into one sorted array
def merge_sort(arr):
    """Merge sort: divide and conquer + merge"""
    if len(arr) <= 1:
        return arr
    
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    
    return merge(left, right)


def merge(left, right):
    """Merge two sorted arrays"""
    result = []
    i = j = 0
    
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= ensures stability
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    
    result.extend(left[i:])
    result.extend(right[j:])
    return result

Analysis:

In-place merge sort exists but is complex (Kronrod 1969), rarely used in practice. Merge sort's main disadvantage is the O(n) extra space.

Where merge sort shines:

21.6 Quick Sort

Quick sort was invented by Tony Hoare in 1959 and remains one of the most widely used sorting algorithms. The core idea:

  1. Choose an element as the pivot
  2. Partition: rearrange so elements less than pivot come before it, greater after
  3. Recursively sort the two partitions
def quicksort(arr, lo=0, hi=None):
    """Quick sort (Lomuto partition scheme)"""
    if hi is None:
        hi = len(arr) - 1
    if lo < hi:
        pivot_idx = partition_lomuto(arr, lo, hi)
        quicksort(arr, lo, pivot_idx - 1)
        quicksort(arr, pivot_idx + 1, hi)
    return arr


def partition_lomuto(arr, lo, hi):
    """Lomuto partition: use last element as pivot"""
    pivot = arr[hi]
    i = lo  # i points to next position in "less than" zone
    for j in range(lo, hi):
        if arr[j] <= pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]
    return i

Lomuto vs Hoare Partition

Lomuto partition (above):

Hoare partition:

def partition_hoare(arr, lo, hi):
    """Hoare partition: scan from both ends"""
    pivot = arr[lo]
    i = lo - 1
    j = hi + 1
    while True:
        i += 1
        while arr[i] < pivot:
            i += 1
        j -= 1
        while arr[j] > pivot:
            j -= 1
        if i >= j:
            return j
        arr[i], arr[j] = arr[j], arr[i]

Randomized Pivot

The worst case O(n²) occurs when the pivot is always the maximum or minimum element. Solution: randomly choose the pivot.

import random

def quicksort_random(arr, lo=0, hi=None):
    """Randomized quicksort"""
    if hi is None:
        hi = len(arr) - 1
    if lo < hi:
        rand_idx = random.randint(lo, hi)
        arr[rand_idx], arr[hi] = arr[hi], arr[rand_idx]
        pivot_idx = partition_lomuto(arr, lo, hi)
        quicksort_random(arr, lo, pivot_idx - 1)
        quicksort_random(arr, pivot_idx + 1, hi)
    return arr

With randomization, the probability of hitting worst-case O(n²) is negligible. Expected time complexity is O(n log n).

Quicksort analysis:

21.7 Heap Sort Overview

Heap sort leverages the max-heap property: the root is always the largest element.

  1. Build heap: O(n)
  2. Repeatedly extract the root (maximum) to the end of the array, reduce heap size by 1, and sift down
def heap_sort(arr):
    """Heap sort: build heap + repeatedly extract max"""
    n = len(arr)
    
    # Build max heap (start from last non-leaf node)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
    
    # Extract max one by one
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        sift_down(arr, 0, i)
    
    return arr


def sift_down(arr, root, size):
    """Sift down operation"""
    while True:
        largest = root
        left = 2 * root + 1
        right = 2 * root + 2
        
        if left < size and arr[left] > arr[largest]:
            largest = left
        if right < size and arr[right] > arr[largest]:
            largest = right
        
        if largest == root:
            break
        arr[root], arr[largest] = arr[largest], arr[root]
        root = largest

Analysis:

See Chapter 12 (Heaps and Priority Queues) for detailed discussion. The key point here: heap sort guarantees worst-case O(n log n), but is slower than quicksort in practice (cache-unfriendly — jumping access patterns).

21.8 Counting Sort

When elements are integers with a limited range, counting sort breaks through the O(n log n) barrier.

def counting_sort(arr, max_val=None):
    """Counting sort: for non-negative integers with bounded range"""
    if not arr:
        return arr
    
    if max_val is None:
        max_val = max(arr)
    
    # Count occurrences
    count = [0] * (max_val + 1)
    for x in arr:
        count[x] += 1
    
    # Prefix sums (ensures stability)
    for i in range(1, len(count)):
        count[i] += count[i - 1]
    
    # Place elements from right to left (ensures stability)
    result = [0] * len(arr)
    for x in reversed(arr):
        count[x] -= 1
        result[count[x]] = x
    
    return result

Analysis:

Counting sort's limitation: k cannot be too large. Sorting 1 million integers in range 0100 is extremely efficient; sorting 100 integers in range 01 billion would require a billion-element count array.

21.9 Bucket Sort

Bucket sort generalizes counting sort: distribute elements into "buckets," sort each bucket internally (usually with insertion sort), then concatenate all buckets in order.

def bucket_sort(arr, num_buckets=10):
    """Bucket sort: good for uniformly distributed floats"""
    if not arr:
        return arr
    
    min_val, max_val = min(arr), max(arr)
    if min_val == max_val:
        return arr
    
    bucket_range = (max_val - min_val) / num_buckets
    buckets = [[] for _ in range(num_buckets)]
    
    for x in arr:
        idx = int((x - min_val) / bucket_range)
        if idx == num_buckets:
            idx -= 1
        buckets[idx].append(x)
    
    # Sort each bucket (using insertion sort)
    result = []
    for bucket in buckets:
        insertion_sort(bucket)
        result.extend(bucket)
    
    return result

Analysis:

21.10 Radix Sort

Radix sort processes digits one at a time, from least significant digit (LSD) or most significant digit (MSD), using a stable sort (like counting sort) at each digit position.

def radix_sort(arr):
    """Radix sort (LSD, least significant digit first)"""
    if not arr:
        return arr
    
    max_val = max(arr)
    exp = 1  # Current digit position: 1, 10, 100, ...
    
    while max_val // exp > 0:
        counting_sort_by_digit(arr, exp)
        exp *= 10
    
    return arr


def counting_sort_by_digit(arr, exp):
    """Counting sort by a specific digit"""
    n = len(arr)
    output = [0] * n
    count = [0] * 10  # digits 0-9
    
    for x in arr:
        digit = (x // exp) % 10
        count[digit] += 1
    
    for i in range(1, 10):
        count[i] += count[i - 1]
    
    for i in range(n - 1, -1, -1):
        digit = (arr[i] // exp) % 10
        count[digit] -= 1
        output[count[digit]] = arr[i]
    
    for i in range(n):
        arr[i] = output[i]

Analysis:

Radix sort excels for fixed-width integers (phone numbers, IDs). When d is constant, time is O(n).


Level 2 · Deeper Understanding

21.11 TimSort: The Choice of Python and Java

TimSort is the sorting algorithm designed by Tim Peters in 2002 for CPython. Python's list.sort() and sorted() both use TimSort underneath, and Java's Arrays.sort() uses TimSort for object arrays. It combines merge sort's stability with insertion sort's efficiency on small or nearly-sorted data.

Core Insight: Exploit Existing Order

Real-world data is rarely completely random. A log file is roughly ordered by timestamp; a user list is roughly ordered by registration date. TimSort's key insight is: first identify naturally ordered segments (runs) in the data, then merge them together using merge sort.

TimSort Execution Flow

Step 1: Identify Runs

Scan left to right, finding natural ascending or descending sequences. Descending sequences are reversed to ascending.

def find_run(arr, lo, hi):
    """Find the natural sorted run starting at lo"""
    if lo + 1 > hi:
        return lo + 1
    
    run_hi = lo + 1
    if arr[run_hi] < arr[lo]:
        # Strictly descending
        while run_hi <= hi and arr[run_hi] < arr[run_hi - 1]:
            run_hi += 1
        # Reverse to ascending
        reverse(arr, lo, run_hi - 1)
    else:
        # Non-descending
        while run_hi <= hi and arr[run_hi] >= arr[run_hi - 1]:
            run_hi += 1
    
    return run_hi

Step 2: Ensure Minimum Run Length (minrun)

If a natural run is too short (shorter than minrun), extend it to minrun length using insertion sort. minrun is typically between 32 and 64, determined by the total array length.

def compute_minrun(n):
    """Compute minrun: take the top 6 bits of n, add 1 if remaining bits are non-zero"""
    r = 0
    while n >= 64:
        r |= n & 1
        n >>= 1
    return n + r

This formula ensures that n/minrun is either a power of 2 or slightly less — making the final merge phase most efficient.

Step 3: Merge Runs

A stack manages the runs. Each time a new run is found, it's pushed onto the stack, then the stack is checked for merge conditions. Tim Peters defined an invariant to ensure merge efficiency:

For the top three runs with lengths A, B, C:

If the invariant is violated, the smaller adjacent runs are merged.

Step 4: Galloping Mode (Accelerated Merge)

During a merge of two runs, if one side "wins" many times consecutively (e.g., elements from the left run are consecutively smaller than those from the right), there are likely more consecutive wins. At this point, switch to binary search mode (galloping) to find the entire winning range at once, then bulk-copy.

def gallop_left(key, arr, lo, hi):
    """Find insertion position of key in arr[lo:hi] (leftmost).
    Uses galloping: exponential jumps, then binary search."""
    last_ofs = 0
    ofs = 1
    
    while ofs < hi - lo and arr[lo + ofs] < key:
        last_ofs = ofs
        ofs = (ofs << 1) + 1  # Exponential growth: 1, 3, 7, 15, ...
    
    ofs = min(ofs, hi - lo)
    
    # Binary search within [last_ofs, ofs)
    last_ofs += lo
    ofs += lo
    while last_ofs < ofs:
        mid = last_ofs + (ofs - last_ofs) // 2
        if arr[mid] < key:
            last_ofs = mid + 1
        else:
            ofs = mid
    
    return ofs

If galloping doesn't yield sufficient benefit (consecutive wins below threshold), it falls back to element-by-element comparison.

TimSort Complexity

TimSort's magic lies in its performance on real-world data: existing order in the data is exploited rather than disrupted and rebuilt.

21.12 Three-Way Partition (Dutch National Flag)

Standard quicksort degrades when facing many duplicate elements. For example, sorting 1 million elements where 99% are the same value: Lomuto partition puts all equal elements on one side, causing severely unbalanced recursion.

Three-way partition divides the array into three parts: less than pivot, equal to pivot, greater than pivot. The "equal" portion is excluded from further recursion.

def quicksort_3way(arr, lo=0, hi=None):
    """Three-way quicksort (Dutch National Flag)"""
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return
    
    # Three-way partition
    pivot = arr[lo]
    lt = lo      # arr[lo..lt-1] < pivot
    gt = hi      # arr[gt+1..hi] > pivot
    i = lo + 1   # arr[lt..i-1] == pivot
    
    while i <= gt:
        if arr[i] < pivot:
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1
            i += 1
        elif arr[i] > pivot:
            arr[gt], arr[i] = arr[i], arr[gt]
            gt -= 1
            # Don't increment i: swapped element hasn't been examined
        else:
            i += 1
    
    # arr[lt..gt] all equal to pivot, no further sorting needed
    quicksort_3way(arr, lo, lt - 1)
    quicksort_3way(arr, gt + 1, hi)

This is Dijkstra's famous "Dutch National Flag Problem" — posed in 1976, where the Dutch flag's three colors (red, white, blue) correspond to the three partitions.

21.13 Why Quicksort Is Fastest in Practice

Asymptotically, merge sort and heap sort are both O(n log n), while quicksort's worst case is O(n²). Yet quicksort is usually fastest in practice. Here's why:

1. Cache Friendliness (Cache Locality)

Quicksort's partition phase scans the array sequentially, which is extremely cache-friendly. Merge sort jumps between two arrays during merging; heap sort's parent-child node accesses are scattered, destroying spatial locality.

On modern CPUs, L1/L2/L3 cache hit rates have enormous performance impact. Quicksort's sequential access pattern results in very few cache misses.

2. Small Constant Factor

Quicksort's inner loop is extremely tight: one comparison + one pointer increment. Merge sort's merge phase requires extra arrays and copy operations.

3. In-Place Sorting

Quicksort needs only O(log n) stack space, while merge sort needs O(n) extra space. Less memory allocation = fewer system calls = faster.

4. Tail Recursion Optimization

Quicksort can apply tail recursion on the longer subarray, guaranteeing stack depth never exceeds O(log n):

def quicksort_tail_optimized(arr, lo, hi):
    """Tail-recursive optimization: always recurse on the shorter half"""
    while lo < hi:
        pivot_idx = partition_lomuto(arr, lo, hi)
        if pivot_idx - lo < hi - pivot_idx:
            quicksort_tail_optimized(arr, lo, pivot_idx - 1)
            lo = pivot_idx + 1
        else:
            quicksort_tail_optimized(arr, pivot_idx + 1, hi)
            hi = pivot_idx - 1

21.14 External Sort

When data exceeds memory capacity (e.g., 100GB of data with only 4GB of RAM), external sorting is needed. The classic approach is multi-way merge sort:

Phase 1: Generate Sorted Runs

  1. Split data into memory-sized chunks (e.g., 4GB each)
  2. Sort each chunk in memory using quicksort/TimSort, write back to disk as a sorted file

Phase 2: Multi-Way Merge

  1. With k sorted files, maintain a min-heap of size k
  2. Repeatedly extract the minimum from the heap and write to output
  3. Read the next element from the file that produced the minimum, push to heap
import heapq

def external_merge_sort(sorted_files, output_file):
    """Multi-way merge: merge k sorted files"""
    file_handles = [open(f) for f in sorted_files]
    
    # Use heap for k-way merge
    heap = []
    for i, fh in enumerate(file_handles):
        line = fh.readline()
        if line:
            heapq.heappush(heap, (int(line.strip()), i))
    
    with open(output_file, 'w') as out:
        while heap:
            val, file_idx = heapq.heappop(heap)
            out.write(f"{val}\n")
            
            next_line = file_handles[file_idx].readline()
            if next_line:
                heapq.heappush(heap, (int(next_line.strip()), file_idx))
    
    for fh in file_handles:
        fh.close()

Optimizations:

The critical metric for external sorting is not comparisons but disk I/O operations. Modern implementations also consider SSD random read/write characteristics.


Level 3 · Theory and History

21.15 Proof of the O(n log n) Lower Bound for Comparison Sorting

Theorem: Any comparison-based sorting algorithm requires at least Omega(n log n) comparisons in the worst case.

Proof (Decision Tree Model):

Abstract any sorting algorithm as a binary decision tree:

There are n! possible permutations of n elements, so the decision tree has at least n! leaves. A binary tree of height h has at most 2^h leaves, therefore:

2^h >= n!
h >= log2(n!)

By Stirling's approximation n! ~ (n/e)^n:

h >= log2((n/e)^n) = n * log2(n/e) = n * log2(n) - n * log2(e)
  = Omega(n log n)

Information-Theoretic Interpretation:

Sorting is essentially determining which of n! permutations the input represents. Each comparison provides 1 bit of information. To distinguish among n! possibilities requires at least log2(n!) = Omega(n log n) bits.

This is why counting sort and radix sort can break through this bound — they are not purely comparison-based; they exploit numerical properties of elements (additional information sources).

21.16 History and Inventors of Sorting Algorithms

The history of sorting algorithms nearly parallels the development of computer science itself:

Year Algorithm Inventor Context
1945 Merge Sort John von Neumann Described in EDVAC report, one of the earliest computer algorithms
1956 Radix Sort Harold H. Seward MIT Master's thesis
1959 Quick Sort Tony Hoare Invented while working on a translation project at Moscow State University
1964 Heap Sort J.W.J. Williams Exploiting the heap data structure
1964 Heap Sort improvement Robert W. Floyd Optimized heap building to O(n)
1969 Shell Sort analysis Knuth Analyzed Shell's (1959) algorithm
1976 Dutch National Flag Edsger Dijkstra Three-way partition problem
1993 Introsort David Musser Hybrid quicksort + heapsort + insertion sort
1997 Introsort paper David Musser Published formal paper
2002 TimSort Tim Peters Designed for CPython
2009 Java TimSort Joshua Bloch Ported TimSort to Java

Tony Hoare was only 25 when he invented quicksort. He later recalled: "I was working on machine translation from Russian to English and needed to sort words. I first thought of bubble sort but it was too slow. I thought for a while and discovered the partition idea."

Tim Peters submitted TimSort to the CPython mailing list with the subject "[Python-Dev] Sorting," accompanied by a 27-page design document (listsort.txt) explaining every design decision in detail. This document remains part of the CPython source tree to this day.

21.17 Strict Definition of Stability and Why It Matters

Definition: A sorting algorithm is stable if whenever a[i] == a[j] and i < j before sorting, a[i] still appears before a[j] after sorting.

In other words: equal elements maintain their original relative order.

Why does stability matter?

  1. Multi-key sorting: Sort by name first, then by age. If the sort is stable, people of the same age remain in name order.
students = [
    ("Alice", 20), ("Bob", 22), ("Charlie", 20), ("Diana", 22)
]

# Sort by name
students.sort(key=lambda x: x[0])
# [("Alice", 20), ("Bob", 22), ("Charlie", 20), ("Diana", 22)]

# Sort by age (stable sort)
students.sort(key=lambda x: x[1])
# [("Alice", 20), ("Charlie", 20), ("Bob", 22), ("Diana", 22)]
# Same-age: Alice still before Charlie, Bob still before Diana
  1. Database queries: SQL's ORDER BY with multiple columns relies on stability.

  2. UI sorting: When users click column headers to sort, they expect rows with equal values to maintain their previous order.

  3. Correctness: In some applications, the relative order of equal elements has semantic meaning (e.g., events with the same timestamp ordered by arrival time).

Python's sort guarantees stability — this is part of the language specification, not merely an implementation detail.

21.18 introsort: The Hybrid Strategy of C++ std::sort

introsort (introspective sort) is a hybrid sorting algorithm proposed by David Musser in 1997. It is the typical implementation of C++ standard library's std::sort. It combines the strengths of three algorithms:

  1. Quicksort: Fastest in the average case
  2. Heapsort: Guarantees worst-case O(n log n)
  3. Insertion sort: Fastest for small arrays

Strategy:

import math

def introsort(arr):
    """introsort: hybrid of quicksort + heapsort + insertion sort"""
    max_depth = 2 * int(math.log2(len(arr))) if arr else 0
    _introsort_helper(arr, 0, len(arr) - 1, max_depth)
    return arr


def _introsort_helper(arr, lo, hi, depth_limit):
    SIZE_THRESHOLD = 16
    
    while hi - lo + 1 > SIZE_THRESHOLD:
        if depth_limit == 0:
            # Too deep, switch to heapsort
            heap_sort_range(arr, lo, hi)
            return
        
        depth_limit -= 1
        # Median-of-three pivot selection
        mid = (lo + hi) // 2
        pivot_idx = median_of_three(arr, lo, mid, hi)
        arr[pivot_idx], arr[hi] = arr[hi], arr[pivot_idx]
        
        p = partition_lomuto(arr, lo, hi)
        _introsort_helper(arr, p + 1, hi, depth_limit)
        hi = p - 1  # Tail recursion optimization
    
    # Small arrays: insertion sort
    insertion_sort_range(arr, lo, hi)


def median_of_three(arr, a, b, c):
    """Median of three: return index of median among arr[a], arr[b], arr[c]"""
    if arr[a] > arr[b]:
        a, b = b, a
    if arr[b] > arr[c]:
        b, c = c, b
    if arr[a] > arr[b]:
        a, b = b, a
    return b

introsort elegantly combines three algorithms' strengths:

Thus introsort is: O(n log n) worst-case, O(n log n) average, in-place, unstable.


Level 4 · Practice and Interviews

21.19 Python Sorting API Deep Dive

sort() vs sorted()

# list.sort() — in-place, returns None, only works on lists
nums = [3, 1, 4, 1, 5]
nums.sort()
# nums = [1, 1, 3, 4, 5]

# sorted() — returns new list, works on any iterable
result = sorted((3, 1, 4, 1, 5))  # tuples work too
# result = [1, 1, 3, 4, 5]

# sorted accepts generators, dict views, etc.
sorted({"b": 2, "a": 1}.items(), key=lambda x: x[1])
# [('a', 1), ('b', 2)]

Key differences:

The key Parameter

# lambda
words = ["banana", "apple", "cherry"]
words.sort(key=lambda w: len(w))
# ['apple', 'banana', 'cherry']

# operator.itemgetter — faster than lambda (C implementation)
from operator import itemgetter
students = [("Alice", 85), ("Bob", 92), ("Charlie", 78)]
students.sort(key=itemgetter(1))  # sort by score
# [('Charlie', 78), ('Alice', 85), ('Bob', 92)]

# operator.attrgetter — sort by attribute
from operator import attrgetter
from dataclasses import dataclass

@dataclass
class Student:
    name: str
    gpa: float

students = [Student("Alice", 3.8), Student("Bob", 3.9)]
students.sort(key=attrgetter('gpa'))

# Multi-level sort: GPA descending, then name ascending
students.sort(key=lambda s: (-s.gpa, s.name))

functools.cmp_to_key

Python 3 removed the cmp parameter from sort, but some comparison logic is hard to express with key (e.g., custom comparison functions). Use cmp_to_key:

from functools import cmp_to_key

def compare_versions(v1, v2):
    """Compare version strings: '1.2.3' vs '1.2.10'"""
    parts1 = list(map(int, v1.split('.')))
    parts2 = list(map(int, v2.split('.')))
    for p1, p2 in zip(parts1, parts2):
        if p1 < p2:
            return -1
        elif p1 > p2:
            return 1
    return len(parts1) - len(parts2)

versions = ["1.2.3", "1.2.10", "1.1.0", "2.0.0"]
versions.sort(key=cmp_to_key(compare_versions))
# ['1.1.0', '1.2.3', '1.2.10', '2.0.0']

Classic interview application — Largest Number (LeetCode 179):

def largest_number(nums):
    """Given integers, concatenate them to form the largest number"""
    strs = [str(n) for n in nums]
    
    def compare(a, b):
        if a + b > b + a:
            return -1
        elif a + b < b + a:
            return 1
        return 0
    
    strs.sort(key=cmp_to_key(compare))
    result = ''.join(strs)
    return '0' if result[0] == '0' else result

21.20 High-Frequency Interview Sorting Problems

Problem 1: Merge Intervals (LeetCode 56)

Given a collection of intervals [start, end], merge all overlapping intervals.

def merge_intervals(intervals):
    """
    Sort then single-pass merge.
    Time O(n log n), Space O(n)
    """
    if not intervals:
        return []
    
    # Sort by start point
    intervals.sort(key=lambda x: x[0])
    
    merged = [intervals[0]]
    for start, end in intervals[1:]:
        if start <= merged[-1][1]:
            # Overlapping: merge (take larger endpoint)
            merged[-1] = [merged[-1][0], max(merged[-1][1], end)]
        else:
            merged.append([start, end])
    
    return merged

# Test
print(merge_intervals([[1,3],[2,6],[8,10],[15,18]]))
# [[1, 6], [8, 10], [15, 18]]

Insight: After sorting, overlapping intervals are necessarily adjacent. So sort + linear scan solves the problem.

Problem 2: Maximum Gap (LeetCode 164)

Given an unsorted array, find the maximum difference between successive elements in sorted form. Required: O(n) time and space.

def maximum_gap(nums):
    """
    Bucket sort insight: max gap >= ceil((max-min)/(n-1))
    Set bucket size to floor((max-min)/(n-1)), answer must be between buckets (not within)
    """
    n = len(nums)
    if n < 2:
        return 0
    
    min_val, max_val = min(nums), max(nums)
    if min_val == max_val:
        return 0
    
    # Bucket size and count
    bucket_size = max(1, (max_val - min_val) // (n - 1))
    bucket_count = (max_val - min_val) // bucket_size + 1
    
    # Each bucket stores only min and max
    buckets_min = [float('inf')] * bucket_count
    buckets_max = [float('-inf')] * bucket_count
    
    for num in nums:
        idx = (num - min_val) // bucket_size
        buckets_min[idx] = min(buckets_min[idx], num)
        buckets_max[idx] = max(buckets_max[idx], num)
    
    # Max gap = some bucket's min - previous non-empty bucket's max
    max_gap = 0
    prev_max = buckets_max[0]
    for i in range(1, bucket_count):
        if buckets_min[i] == float('inf'):
            continue  # Skip empty buckets
        max_gap = max(max_gap, buckets_min[i] - prev_max)
        prev_max = buckets_max[i]
    
    return max_gap

Key insight: Pigeonhole principle — if n numbers are uniformly spread over [min, max], the average gap between adjacent elements is (max-min)/(n-1). The maximum gap must be >= this average. Setting bucket size to this average ensures the maximum gap cannot occur within a bucket, only between adjacent buckets.

Problem 3: Sort Colors (LeetCode 75, Dutch National Flag)

Array contains only 0, 1, 2. Sort in-place with a single pass.

def sort_colors(nums):
    """
    Dutch National Flag: three-way partition
    lo: boundary of 0s (right side)
    hi: boundary of 2s (left side)
    i: current pointer
    """
    lo, i, hi = 0, 0, len(nums) - 1
    
    while i <= hi:
        if nums[i] == 0:
            nums[lo], nums[i] = nums[i], nums[lo]
            lo += 1
            i += 1
        elif nums[i] == 2:
            nums[hi], nums[i] = nums[i], nums[hi]
            hi -= 1
            # Don't increment i: swapped value hasn't been examined
        else:  # nums[i] == 1
            i += 1

# Test
arr = [2, 0, 2, 1, 1, 0]
sort_colors(arr)
print(arr)  # [0, 0, 1, 1, 2, 2]

This is a direct interview application of Dijkstra's Dutch National Flag algorithm. Note it's the same idea as quicksort's three-way partition.

21.21 When to Use Which Sort

In practice, sorting algorithm choice depends on:

Scenario Recommended Reason
General purpose TimSort (Python/Java default) Best for real-world data
Small arrays (n < 64) Insertion sort Small constant, cache-friendly
Large arrays, stability not needed Quicksort/introsort Fastest on average
Integers, small range Counting sort O(n + k), linear time
Integers, fixed width Radix sort O(d*n), linear time
Nearly sorted data TimSort or insertion sort Exploits existing order
Stability required Merge sort/TimSort Quicksort is unstable
Memory constrained Heap sort O(1) extra space
Data doesn't fit in memory External merge sort Minimizes disk I/O
Linked list sorting Merge sort No random access needed
Worst-case guarantee needed Heap sort/introsort Worst-case O(n log n)

Practical Python advice:

21.22 Parallelization Potential of Sorting Algorithms

Different sorting algorithms have vastly different parallelization potential:

Merge Sort — Naturally Parallel

The divide-and-conquer structure means both halves can be sorted independently, and the merge phase can be partially parallelized:

from concurrent.futures import ThreadPoolExecutor

def parallel_merge_sort(arr, depth=0, max_depth=3):
    """Parallel merge sort (using thread pool)"""
    if len(arr) <= 1:
        return arr
    
    mid = len(arr) // 2
    
    if depth < max_depth:
        with ThreadPoolExecutor(max_workers=2) as executor:
            left_future = executor.submit(parallel_merge_sort, arr[:mid], depth+1, max_depth)
            right_future = executor.submit(parallel_merge_sort, arr[mid:], depth+1, max_depth)
            left = left_future.result()
            right = right_future.result()
    else:
        left = parallel_merge_sort(arr[:mid], depth+1, max_depth)
        right = parallel_merge_sort(arr[mid:], depth+1, max_depth)
    
    return merge(left, right)

Note: Python's GIL limits multi-threaded parallelism. For true parallel sorting, use multiprocessing or C extensions.

Quicksort — Also Parallelizable

After partition, both halves are independent, but partition itself is serial. Quicksort's parallel efficiency depends on partition balance.

Sorting Networks — Best for Hardware Parallelism

Sorting networks are fixed compare-and-swap sequences where all comparisons are determined at compile time, making them ideal for GPU and FPGA parallel implementations. Batcher's bitonic merge sort is the classic parallel sorting network.

Parallel Sorting in Practice:


Chapter Summary

Sorting is computer science's "litmus test" — seemingly simple, yet involving core algorithmic ideas: divide and conquer, recursion, randomization, amortized analysis, cache optimization, and parallelization.

Key takeaways:

  1. Comparison sorting has a theoretical lower bound of O(n log n), but non-comparison sorts can achieve O(n) under specific conditions
  2. Quicksort is fastest in practice due to cache friendliness and small constants — but needs randomized pivot to avoid worst case
  3. TimSort is the optimal general-purpose sort for real-world data, exploiting existing order; it's the default in Python/Java
  4. introsort is C++'s choice: quicksort's speed + heapsort's worst-case guarantee + insertion sort's small-array efficiency
  5. Stability is a practical requirement: multi-key sorting, database queries, and UI sorting all need it
  6. Choose the right algorithm based on data characteristics: size, value range, distribution, stability requirements, memory constraints

In the next chapter, we'll discuss search and matching algorithms — close relatives of sorting. Many search problems begin with a sort step.

Rate this chapter
4.7  / 5  (9 ratings)

💬 Comments