Chapter 17

Topological Sort and Directed Graphs

Chapter 17: Topological Sort and Directed Graphs

In the previous chapter, we learned graph representation and traversal — BFS and DFS. Among the many applications of graphs, one class of problems is extremely common and important: given a set of tasks and their dependency relationships, find a valid execution order. For example: university courses have prerequisite relationships — you must complete linear algebra before taking machine learning; in a software project, module A depends on module B — you must compile B before A; in cooking, you must chop vegetables before stir-frying.

This is the problem topological sorting solves. It only applies to Directed Acyclic Graphs (DAGs) — if there is a cycle (A depends on B, B depends on C, C depends on A), no valid ordering can exist.

In this chapter, we will: (1) understand the concept and properties of topological sort; (2) master two classic implementations — BFS (Kahn's algorithm) and DFS; (3) apply topological sort to course scheduling, compilation dependencies, critical paths, and other practical problems; (4) solve related high-frequency interview problems.

Level 1 · What You Need to Know

17.1 What Is Topological Sort?

Definition: Given a DAG G = (V, E), a topological sort is a linear ordering of V such that for every directed edge (u, v), u appears before v in the ordering.

Intuitive understanding: Topological sort is an "ordering that respects dependencies." If A must be completed before B (there is an edge A→B), then A definitely comes before B in the sorted output.

Example: Course dependencies
  Math → Physics → Quantum Mechanics
  Math → Linear Algebra → Machine Learning
  Programming → Data Structures → Algorithms

Valid topological orderings:
  [Math, Programming, Physics, Linear Algebra, Data Structures, QM, ML, Algorithms]
  [Programming, Math, DS, Physics, Linear Algebra, Algorithms, QM, ML]
  ... (multiple valid orderings)

Note: Topological sort is usually NOT unique!

Key properties:

Existence: A topological sort exists ⟺ the graph is a DAG (acyclic)
Non-uniqueness: A DAG typically has multiple valid topological orderings
Only meaningful for directed graphs: Undirected graphs have no "direction" and cannot define "before/after" relationships

17.2 BFS Implementation: Kahn's Algorithm

Kahn's algorithm (1962) has an elegantly simple idea:

Find all vertices with in-degree 0 (tasks with no dependencies) — these can be executed first
Add them to the result and "remove" them from the graph (decrement in-degrees of their neighbors)
Repeat steps 1-2 until all vertices are processed, or no in-degree-0 vertex can be found (indicating a cycle)

from collections import deque, defaultdict

def topological_sort_kahn(num_vertices: int, edges: list) -> list:
    """
    Kahn's Algorithm (BFS Topological Sort)
    
    Args:
        num_vertices: Number of vertices (0 to num_vertices-1)
        edges: Edge list [(u, v), ...] meaning u → v dependency
    
    Returns:
        Topological ordering, or empty list if cycle exists
    
    Time: O(V + E)
    Space: O(V + E)
    """
    # Build graph + compute in-degrees
    graph = defaultdict(list)
    in_degree = [0] * num_vertices
    
    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1
    
    # Enqueue all vertices with in-degree 0
    queue = deque()
    for i in range(num_vertices):
        if in_degree[i] == 0:
            queue.append(i)
    
    result = []
    
    while queue:
        node = queue.popleft()
        result.append(node)
        
        # "Remove" node: decrement neighbors' in-degrees
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    # If result contains all vertices, sort succeeded (no cycle)
    if len(result) == num_vertices:
        return result
    else:
        return []  # Cycle exists


# Example
#   0 → 1 → 3
#   0 → 2 → 3
#   2 → 4
edges = [(0,1), (0,2), (1,3), (2,3), (2,4)]
print(topological_sort_kahn(5, edges))  # [0, 1, 2, 3, 4] or [0, 2, 1, 4, 3] etc.

# Cycle case
edges_cycle = [(0,1), (1,2), (2,0)]
print(topological_sort_kahn(3, edges_cycle))  # [] (cycle)

Why does Kahn's algorithm work?

Intuition: Vertices with in-degree 0 have no unmet dependencies, so they can safely be placed at the current position. After removing them, newly exposed in-degree-0 vertices can also be safely placed. This process is like "peeling an onion layer by layer."

Correctness proof:

If the graph is a DAG, at least one vertex with in-degree 0 must exist (otherwise we could follow incoming edges backwards from any vertex, forming a cycle in the finite graph — contradiction).
After removing in-degree-0 vertices, the remaining graph is still a DAG (subgraphs of DAGs are DAGs).
Therefore the algorithm never gets stuck and eventually processes all vertices.

17.3 DFS Implementation

The key insight for DFS topological sort: when a vertex "finishes" in DFS, all its successors (nodes that depend on it) have already been fully explored. Therefore, listing vertices in reverse order of finish time gives a topological sort.

def topological_sort_dfs(num_vertices: int, edges: list) -> list:
    """
    DFS Topological Sort
    
    Core idea: reverse DFS finish-time ordering
    
    Time: O(V + E)
    Space: O(V + E)
    """
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
    
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * num_vertices
    result = []
    has_cycle = False
    
    def dfs(node):
        nonlocal has_cycle
        if has_cycle:
            return
        
        color[node] = GRAY
        
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                has_cycle = True
                return
            if color[neighbor] == WHITE:
                dfs(neighbor)
                if has_cycle:
                    return
        
        color[node] = BLACK
        result.append(node)  # Add on finish (will be reversed later)
    
    for i in range(num_vertices):
        if color[i] == WHITE:
            dfs(i)
            if has_cycle:
                return []
    
    result.reverse()  # Reverse to get topological order
    return result


# Example
edges = [(0,1), (0,2), (1,3), (2,3), (2,4)]
print(topological_sort_dfs(5, edges))  # A valid topological ordering

# Cycle
edges_cycle = [(0,1), (1,2), (2,0)]
print(topological_sort_dfs(3, edges_cycle))  # []

Why does "reverse finish-time" produce a topological sort?

For any edge (u, v):

When DFS visits u, v is either white (undiscovered) or black (finished).
If v is white: DFS visits v and finishes v before finishing u. So f[v] < f[u].
If v is black: f[v] < f[u] (v finished before u started or before u finishes).
If v is gray: this is a back edge — meaning there is a cycle, not a DAG.

In summary, for DAGs, for any edge (u, v), f[u] > f[v]. Listing vertices by decreasing finish time places u before v. This is exactly the definition of topological sort.

17.4 Cycle Detection and Topological Sort

Cycle detection and topological sort are two sides of the same coin:

Topological sort succeeds → graph is a DAG (no cycle)
Topological sort fails → graph has a cycle

Therefore, many "does a cycle exist?" problems are essentially "try to do a topological sort."

Kahn's cycle detection: If after processing all in-degree-0 nodes, result length < V, some nodes can never reach in-degree 0 — they form a cycle.

DFS cycle detection: If DFS encounters a gray node (currently on the recursion stack), we have found a back edge — a cycle.

def detect_cycle_with_path(num_vertices: int, edges: list) -> list:
    """
    Detect cycle in directed graph and return cycle nodes
    Returns cycle node list if cycle exists, empty list otherwise
    """
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
    
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * num_vertices
    parent = [-1] * num_vertices
    cycle = []
    
    def dfs(node):
        color[node] = GRAY
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                # Found cycle from neighbor to node back to neighbor
                path = [neighbor, node]
                cur = node
                while parent[cur] != neighbor and parent[cur] != -1:
                    cur = parent[cur]
                    path.append(cur)
                cycle.extend(reversed(path))
                return True
            if color[neighbor] == WHITE:
                parent[neighbor] = node
                if dfs(neighbor):
                    return True
        color[node] = BLACK
        return False
    
    for i in range(num_vertices):
        if color[i] == WHITE:
            if dfs(i):
                return cycle
    return []

17.5 Comparing the Two Algorithms

Property	Kahn's (BFS)	DFS Topological Sort
Data structure	Queue + in-degree array	Recursion stack + color marking
Approach	"Output in-degree-0 first"	"Reverse finish-time order"
Cycle detection	result.size < V	Encountering gray node
Lexicographically smallest	Replace queue with min-heap	Hard to achieve
All topological orderings	Easy to extend with backtracking	Difficult
Online/incremental	Better support	Needs full re-run

When to use Kahn's?

Need lexicographically smallest topological ordering
Need to count the number of valid orderings
Need "layer" information (shortest path / critical path)

When to use DFS?

Already have a DFS framework (e.g., doing other analyses simultaneously)
Code brevity is preferred

17.6 Common Mistakes

Mistake 1: Reversing edge direction

# Problem says [a, b] means "to take a, must first take b"
# Edge direction should be b → a
# Many people get this backwards!

# Correct
for a, b in prerequisites:
    graph[b].append(a)  # b is prerequisite, a is dependent
    in_degree[a] += 1

# Wrong
for a, b in prerequisites:
    graph[a].append(b)  # Direction reversed!

Mistake 2: Forgetting disconnected graphs

# A DAG may be disconnected (isolated nodes or multiple components)
# Kahn's handles this naturally: isolated nodes have in-degree 0, get enqueued
# DFS: must call dfs() for ALL unvisited nodes
for i in range(num_vertices):  # Don't only start from node 0!
    if color[i] == WHITE:
        dfs(i)

Mistake 3: Assuming topological sort is unique

# DAG with edges 0→2, 1→2 has valid orderings [0,1,2] and [1,0,2]
# If the problem requires a unique answer (e.g., lexicographically smallest), 
# special handling is needed

Level 2 · How It Works Under the Hood

17.7 Complete Analysis of the Course Schedule Series

LeetCode 207: Course Schedule (Can You Finish?)

This problem appeared in the previous chapter. The core is detecting whether a directed graph has a cycle. Here we provide a detailed execution trace of Kahn's algorithm.

def canFinish(numCourses: int, prerequisites: list) -> bool:
    """
    LeetCode 207: Course Schedule
    Essence: directed graph cycle detection = attempt topological sort
    """
    graph = defaultdict(list)
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    
    queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
    processed = 0
    
    while queue:
        node = queue.popleft()
        processed += 1
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    return processed == numCourses

Execution trace:

Input: numCourses=4, prerequisites=[[1,0],[2,0],[3,1],[3,2]]
Graph: 0→1, 0→2, 1→3, 2→3
In-degrees: [0, 1, 1, 2]

Initial queue: [0] (in-degree 0)
Step 1: Process 0, neighbor 1 in-degree→0, neighbor 2 in-degree→0, queue=[1,2]
Step 2: Process 1, neighbor 3 in-degree→1, queue=[2]
Step 3: Process 2, neighbor 3 in-degree→0, queue=[3]
Step 4: Process 3, no neighbors, queue=[]

Processed 4 nodes = numCourses → no cycle → can finish

LeetCode 210: Course Schedule II (Return the Order)

def findOrder(numCourses: int, prerequisites: list) -> list:
    """
    LeetCode 210: Course Schedule II
    Return a valid course ordering (topological sort)
    Returns empty list if impossible
    """
    graph = defaultdict(list)
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    
    queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
    order = []
    
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    return order if len(order) == numCourses else []


# Test
print(findOrder(4, [[1,0],[2,0],[3,1],[3,2]]))  # [0, 1, 2, 3] or [0, 2, 1, 3]
print(findOrder(2, [[1,0]]))  # [0, 1]
print(findOrder(2, [[0,1],[1,0]]))  # [] (cycle)

DFS version (common interview follow-up):

def findOrder_dfs(numCourses: int, prerequisites: list) -> list:
    """DFS version of Course Schedule II"""
    graph = defaultdict(list)
    for course, prereq in prerequisites:
        graph[prereq].append(course)
    
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * numCourses
    result = []
    
    def dfs(node) -> bool:
        """Returns True if no cycle"""
        color[node] = GRAY
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                return False  # Cycle
            if color[neighbor] == WHITE:
                if not dfs(neighbor):
                    return False
        color[node] = BLACK
        result.append(node)
        return True
    
    for i in range(numCourses):
        if color[i] == WHITE:
            if not dfs(i):
                return []
    
    return result[::-1]  # Reverse

17.8 Compilation Dependency Analysis

One of the most classic engineering applications of topological sort is build systems. In large software projects, source files have dependencies (#include, import), and the compiler must compile them in the correct order.

Core principle of Make and Bazel:

Parse all file dependencies, build a dependency graph (DAG)
Topologically sort the dependency graph to get build order
Compile in topological order (ensuring each file's dependencies are compiled first)

class BuildSystem:
    """Simplified build system"""
    
    def __init__(self):
        self.dependencies = defaultdict(set)
        self.dependents = defaultdict(set)
    
    def add_dependency(self, module: str, depends_on: str):
        """Declare that module depends on depends_on"""
        self.dependencies[module].add(depends_on)
        self.dependents[depends_on].add(module)
        if module not in self.dependents:
            self.dependents[module]
        if depends_on not in self.dependencies:
            self.dependencies[depends_on]
    
    def get_build_order(self) -> list:
        """
        Return compilation order (topological sort)
        Raises exception on circular dependency
        """
        all_modules = set(self.dependencies.keys()) | set(self.dependents.keys())
        in_degree = {m: len(self.dependencies[m]) for m in all_modules}
        
        queue = deque([m for m in all_modules if in_degree[m] == 0])
        order = []
        
        while queue:
            module = queue.popleft()
            order.append(module)
            
            for dependent in self.dependents[module]:
                in_degree[dependent] -= 1
                if in_degree[dependent] == 0:
                    queue.append(dependent)
        
        if len(order) != len(all_modules):
            cycle_modules = [m for m in all_modules if in_degree[m] > 0]
            raise ValueError(f"Circular dependency detected: {cycle_modules}")
        
        return order
    
    def parallel_build_plan(self) -> list:
        """
        Return parallel build plan: each "wave" can be compiled in parallel
        This is essentially the layered result of BFS
        """
        all_modules = set(self.dependencies.keys()) | set(self.dependents.keys())
        in_degree = {m: len(self.dependencies[m]) for m in all_modules}
        
        queue = deque([m for m in all_modules if in_degree[m] == 0])
        waves = []
        
        while queue:
            wave = list(queue)
            waves.append(wave)
            next_queue = deque()
            
            for module in wave:
                for dependent in self.dependents[module]:
                    in_degree[dependent] -= 1
                    if in_degree[dependent] == 0:
                        next_queue.append(dependent)
            
            queue = next_queue
        
        return waves


# Example: C++ project compilation dependencies
build = BuildSystem()
build.add_dependency("main.cpp", "utils.h")
build.add_dependency("main.cpp", "network.h")
build.add_dependency("network.cpp", "network.h")
build.add_dependency("network.cpp", "utils.h")
build.add_dependency("network.h", "utils.h")
build.add_dependency("tests.cpp", "main.cpp")
build.add_dependency("tests.cpp", "network.cpp")

print("Build order:", build.get_build_order())
print("Parallel build plan:", build.parallel_build_plan())
# Wave 1: [utils.h]  (no dependencies)
# Wave 2: [network.h]  (depends only on utils.h)
# Wave 3: [main.cpp, network.cpp]  (can be parallel)
# Wave 4: [tests.cpp]  (depends on the above)

Key insight about parallel builds: In Kahn's algorithm, each "layer" (nodes whose in-degree simultaneously becomes 0) can be processed in parallel. This is the principle behind make -j8 accelerating compilation — it uses the layered structure of topological sort to maximize parallelism without violating dependencies.

17.9 Critical Path (Longest Path)

In project management, the critical path is the minimum time needed to complete a project — it is the longest path in the DAG from start to finish (because all tasks must be completed, and the bottleneck is the longest chain).

Problem definition: Given a DAG where each vertex has a weight (task duration), find the longest path from source to sink.

Key observation: Finding the longest path in a DAG is NOT NP-hard (unlike in general graphs)! Because we can use topological sort + dynamic programming.

def critical_path(num_tasks: int, edges: list, durations: list) -> tuple:
    """
    Critical Path Algorithm
    
    Args:
        num_tasks: Number of tasks
        edges: [(u, v), ...] meaning u must complete before v
        durations: Duration of each task
    
    Returns:
        (earliest completion time, list of tasks on critical path)
    """
    graph = defaultdict(list)
    in_degree = [0] * num_tasks
    
    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1
    
    # Step 1: Topological sort
    topo_order = []
    queue = deque([i for i in range(num_tasks) if in_degree[i] == 0])
    temp_in_degree = in_degree[:]
    
    while queue:
        node = queue.popleft()
        topo_order.append(node)
        for neighbor in graph[node]:
            temp_in_degree[neighbor] -= 1
            if temp_in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    # Step 2: Compute Earliest Start Time (EST) in topological order
    est = [0] * num_tasks
    predecessor = [-1] * num_tasks
    
    for node in topo_order:
        for neighbor in graph[node]:
            new_est = est[node] + durations[node]
            if new_est > est[neighbor]:
                est[neighbor] = new_est
                predecessor[neighbor] = node
    
    # Step 3: Find earliest project completion time
    project_end_time = 0
    end_task = -1
    for i in range(num_tasks):
        finish_time = est[i] + durations[i]
        if finish_time > project_end_time:
            project_end_time = finish_time
            end_task = i
    
    # Step 4: Backtrack critical path
    critical = []
    cur = end_task
    while cur != -1:
        critical.append(cur)
        cur = predecessor[cur]
    critical.reverse()
    
    return project_end_time, critical


# Example: Software project
# Tasks: 0=Requirements(3d), 1=Design(5d), 2=Coding(8d), 3=Testing(4d), 4=Docs(2d), 5=Deploy(1d)
tasks = 6
durations = [3, 5, 8, 4, 2, 1]
deps = [
    (0, 1),  # Requirements → Design
    (0, 4),  # Requirements → Docs
    (1, 2),  # Design → Coding
    (2, 3),  # Coding → Testing
    (3, 5),  # Testing → Deploy
    (4, 5),  # Docs → Deploy
]

total_time, critical = critical_path(tasks, deps, durations)
print(f"Earliest project completion: {total_time} days")
print(f"Critical path: {critical}")
# Critical path: Req(3) → Design(5) → Coding(8) → Testing(4) → Deploy(1) = 21 days
# Docs path: Req(3) → Docs(2) → Deploy(1) = 6 days (non-critical)

Engineering significance of CPM (Critical Path Method):

Determine minimum project duration: The critical path length is the incompressible minimum time
Identify bottlenecks: A 1-day delay on the critical path delays the entire project by 1 day; non-critical tasks have "float"
Resource allocation: Non-critical path tasks can be delayed to focus resources on the critical path

17.10 Topological Sort and Dynamic Programming

Topological sort and dynamic programming have a deep connection: DP on a DAG is computed in topological order.

Many DP problems are essentially computations on DAGs:

States are DAG vertices
State transitions are DAG edges
Computation order is topological sort

def dag_longest_path(num_vertices: int, edges: list, weights: dict) -> int:
    """
    Longest path in a DAG (with edge weights)
    weights: {(u,v): weight} edge weight dictionary
    """
    graph = defaultdict(list)
    in_degree = [0] * num_vertices
    
    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1
    
    # Topological sort
    topo = []
    queue = deque([i for i in range(num_vertices) if in_degree[i] == 0])
    while queue:
        node = queue.popleft()
        topo.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    # DP in topological order
    dist = [0] * num_vertices
    
    for node in topo:
        for neighbor in graph[node]:
            w = weights.get((node, neighbor), 1)
            dist[neighbor] = max(dist[neighbor], dist[node] + w)
    
    return max(dist)


# Example
edges = [(0,1), (0,2), (1,3), (2,3), (3,4)]
weights = {(0,1): 3, (0,2): 2, (1,3): 4, (2,3): 1, (3,4): 5}
print(dag_longest_path(5, edges, weights))  # 0→1→3→4 = 3+4+5 = 12

Classic example: Longest Increasing Subsequence (LIS)

LIS can be modeled as a DAG: if a[i] < a[j] and i < j, there is an edge i→j. LIS is the longest path in this DAG. Although in practice we use the O(n log n) greedy+binary-search solution, the DAG perspective helps understand the problem's essence.

17.11 Layered Topological Sort

Sometimes we need not just a valid topological order, but also the "level" of each node — the earliest round in which it can be processed.

def topological_sort_with_levels(num_vertices: int, edges: list) -> list:
    """
    Layered topological sort: return nodes for each level
    Nodes in the same level have no dependencies between them and can be processed in parallel
    """
    graph = defaultdict(list)
    in_degree = [0] * num_vertices
    
    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1
    
    current_level = [i for i in range(num_vertices) if in_degree[i] == 0]
    levels = []
    
    while current_level:
        levels.append(current_level)
        next_level = []
        
        for node in current_level:
            for neighbor in graph[node]:
                in_degree[neighbor] -= 1
                if in_degree[neighbor] == 0:
                    next_level.append(neighbor)
        
        current_level = next_level
    
    total = sum(len(level) for level in levels)
    if total != num_vertices:
        return []  # Cycle
    
    return levels


# Example
edges = [(0,2), (1,2), (2,3), (2,4), (3,5), (4,5)]
levels = topological_sort_with_levels(6, edges)
print(f"Levels: {levels}")
# [[0, 1], [2], [3, 4], [5]]
# Level 1: 0 and 1 can be parallel
# Level 2: 2 (depends on 0 and 1)
# Level 3: 3 and 4 can be parallel
# Level 4: 5 (depends on 3 and 4)

Meaning of level count: Level = longest path from any in-degree-0 node to the current node. This is equivalent to the "earliest start time" concept.

Level 3 · How the Specification Defines It

17.12 Kahn's 1962 Paper: The BFS Algorithm for Topological Sorting

In 1962, Arthur B. Kahn published "Topological sorting of large networks" (Communications of the ACM, Vol. 5, Issue 11), the first formal description of what we now call "Kahn's algorithm."

Historical context: In the early 1960s, computers were just beginning to be used for large-scale project management (PERT network diagrams, CPM critical path analysis). These applications needed to sort hundreds or thousands of tasks by dependencies. Kahn's paper directly addressed this engineering need.

Kahn's original description (simplified translation):

"Given a finite directed graph representing a partial ordering, the algorithm is:

Find all nodes with no predecessors; output them (their relative order is arbitrary).

Remove these nodes and all edges emanating from them.

Repeat steps 1-2 until the graph is empty. If the graph is non-empty but no nodes without predecessors exist, the graph contains a cycle."

Key contributions:

Formalized the topological sorting problem: Clearly defined linearization of partial orders
Provided an O(V + E) algorithm: Critical given the limited computing power of the era — efficiency meant being able to handle larger networks
Simultaneous cycle detection: The algorithm naturally terminates and reports when a cycle exists
Discussed parallelism: Kahn noted that all in-degree-0 nodes in each round can be processed in parallel — an observation that remains central to parallel build systems 60 years later

Complexity analysis (in Kahn's argument style):

Each vertex is enqueued exactly once and dequeued exactly once → O(V) vertex processing
Each edge is examined exactly once when its source vertex is removed → O(E) edge processing
Total: O(V + E)

This analysis is identical to BFS complexity analysis — in fact, Kahn's algorithm IS a specialized BFS.

17.13 Tarjan's DFS Topological Sort

Robert Tarjan in his 1972 paper "Depth-First Search and Linear Graph Algorithms" (SIAM Journal on Computing) systematized the use of DFS in graph algorithms, including topological sort based on DFS finish times.

Tarjan's insight: DFS finish times naturally give a reverse topological order. This observation seems simple, but it unifies multiple seemingly unrelated graph algorithms:

Topological sort: Reverse finish-time ordering
Strongly connected components (Tarjan's algorithm / Kosaraju's algorithm): Two-pass DFS using finish times
Articulation points and bridges: Using DFS tree structure

Why is Tarjan's method more "profound"?

Kahn's algorithm is "constructive" — it directly builds the ordering. Tarjan's method reveals the relationship between topological sort and the graph's depth structure:

Finish times reflect "dependency depth": a higher finish time means the node is a "shallower" dependency (finishing later = more things depend on it)
The existence of back edges is equivalent to the existence of cycles — a deeper structural insight than "can't find an in-degree-0 node"

Tarjan's correctness proof (simplified):

Let G = (V, E) be a DAG. For any edge (u, v) in E:

When DFS visits u, v is either white (undiscovered) or black (finished)
v cannot be gray (because gray means v is an ancestor of u; adding edge (u,v) would form a cycle — contradicting DAG)
If v is white: v becomes a descendant of u, f[v] < f[u]
If v is black: f[v] < f[u] (v finished before u)
In both cases, f[u] > f[v]

Therefore, vertices listed in decreasing finish time satisfy: for any edge (u,v), u comes before v. This is exactly the definition of topological sort. □

Philosophical difference between Tarjan and Kahn:

Kahn	Tarjan
Bottom-up	Top-down
"Do what's doable first"	"Go deep, then come back"
Incremental (suits dynamic graphs)	Global (single DFS traversal)
More intuitive	Reveals deeper structure

17.14 Partial Orders and Total Orders: Mathematical Foundations

A partial order is a relation satisfying reflexivity, antisymmetry, and transitivity. A DAG's reachability defines a partial order: if v is reachable from u, then u ≤ v. But not all vertex pairs are comparable (unreachable vertices have no ordering relation).

A total order requires all pairs to be comparable.

Topological sort = linear extension of a partial order: Extending a partial order to a total order while preserving all existing relationships.

Dilworth's Theorem (1950): In any finite partially ordered set, the maximum antichain size (largest set of mutually incomparable elements) equals the minimum number of chains (totally ordered subsets) needed to partition the set.

Applied to topological sort: maximum antichain = maximum number of tasks that can be processed simultaneously in any valid schedule.

def max_parallel_width(num_vertices: int, edges: list) -> int:
    """
    Compute DAG's maximum width (maximum antichain size)
    i.e., maximum number of simultaneously active tasks in any valid schedule
    Equals the size of the largest level in layered topological sort
    """
    levels = topological_sort_with_levels(num_vertices, edges)
    if not levels:
        return 0  # Cycle
    return max(len(level) for level in levels)

17.15 Topological Sort in Modern Systems

Package Managers (npm, pip, apt)

Every time you run pip install tensorflow, the package manager needs to:

Resolve all of tensorflow's dependencies (recursively)
Build the dependency graph
Detect circular dependencies (report error if found)
Topologically sort to determine installation order
Install in order (or parallelize independent packages)

Spreadsheets (Excel/Google Sheets)

Cell formula references form a DAG. When you modify a cell, the spreadsheet needs to recalculate all dependent cells in topological order.

class Spreadsheet:
    """Simplified spreadsheet engine"""
    
    def __init__(self):
        self.values = {}
        self.formulas = {}
        self.deps = defaultdict(set)   # cell → cells it depends on
        self.rdeps = defaultdict(set)  # cell → cells depending on it
    
    def set_formula(self, cell: str, formula, dependencies: list):
        """Set cell formula"""
        for old_dep in self.deps[cell]:
            self.rdeps[old_dep].discard(cell)
        
        self.formulas[cell] = formula
        self.deps[cell] = set(dependencies)
        for dep in dependencies:
            self.rdeps[dep].add(cell)
        
        self._recalculate(cell)
    
    def set_value(self, cell: str, value):
        """Set value directly (non-formula)"""
        self.values[cell] = value
        self.formulas.pop(cell, None)
        self.deps[cell] = set()
        self._propagate(cell)
    
    def _propagate(self, changed_cell: str):
        """Recalculate all affected cells in topological order"""
        affected = set()
        queue = deque([changed_cell])
        while queue:
            cell = queue.popleft()
            for dependent in self.rdeps[cell]:
                if dependent not in affected:
                    affected.add(dependent)
                    queue.append(dependent)
        # Topologically sort affected cells and recalculate...
    
    def _recalculate(self, cell: str):
        """Recalculate a single cell"""
        if cell in self.formulas:
            dep_values = {d: self.values.get(d, 0) for d in self.deps[cell]}
            self.values[cell] = self.formulas[cell](dep_values)

React/Vue Reactivity Systems

Frontend framework reactive updates are essentially topological sort:

State variables are source nodes in the DAG
Computed properties are intermediate nodes
UI components are leaf nodes
When source state changes, the framework updates all downstream dependencies in topological order

Level 4 · Edge Cases and Pitfalls

17.16 LeetCode 207: Course Schedule (Detailed Interview Analysis)

We have given code implementations in Levels 1 and 2. Here we focus on follow-up questions and edge cases in interviews.

Follow-up 1: What if the number of courses is very large (10⁶)?

# Kahn's is already O(V+E), can't improve asymptotically
# But we can optimize constants:
# 1. Use arrays instead of dicts (vertices are integers 0~n-1)
# 2. Use list for adjacency instead of defaultdict
# 3. Avoid creating new lists (in-place modification)

def canFinish_optimized(numCourses: int, prerequisites: list) -> bool:
    """Memory-optimized version"""
    graph = [[] for _ in range(numCourses)]
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    
    # Use list as queue (pointer-based dequeue)
    queue = [i for i in range(numCourses) if in_degree[i] == 0]
    idx = 0
    
    while idx < len(queue):
        node = queue[idx]
        idx += 1
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    return len(queue) == numCourses

Follow-up 2: Find all courses that form cycles

def find_cycle_courses(numCourses: int, prerequisites: list) -> list:
    """Find all courses involved in cycles"""
    graph = [[] for _ in range(numCourses)]
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    
    queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
    removed = set()
    
    while queue:
        node = queue.popleft()
        removed.add(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    # Remaining nodes not removed are all involved in cycles
    return [i for i in range(numCourses) if i not in removed]

Edge cases:

numCourses = 0: Return True (empty graph has no cycles)
prerequisites = []: All courses are independent, return True
Self-loop [0, 0]: Course depends on itself, cycle exists, return False
Duplicate edges: Does not affect correctness (in-degree may be over-counted, but Kahn's still works correctly)

17.17 LeetCode 210: Course Schedule II (Return Specific Ordering)

def findOrder(numCourses: int, prerequisites: list) -> list:
    """
    LeetCode 210: Course Schedule II
    Complete interview-level implementation
    """
    graph = [[] for _ in range(numCourses)]
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    
    queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
    order = []
    
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    return order if len(order) == numCourses else []


# For lexicographically smallest topological sort:
import heapq

def findOrder_lexicographic(numCourses: int, prerequisites: list) -> list:
    """Lexicographically smallest topological sort"""
    graph = [[] for _ in range(numCourses)]
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    
    # Use min-heap instead of regular queue
    heap = [i for i in range(numCourses) if in_degree[i] == 0]
    heapq.heapify(heap)
    order = []
    
    while heap:
        node = heapq.heappop(heap)
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                heapq.heappush(heap, neighbor)
    
    return order if len(order) == numCourses else []


# Test
print(findOrder(4, [[1,0],[2,0],[3,1],[3,2]]))
# Possible: [0, 1, 2, 3] or [0, 2, 1, 3]

print(findOrder_lexicographic(4, [[1,0],[2,0],[3,1],[3,2]]))
# Always: [0, 1, 2, 3] (lexicographically smallest)

Interview tips:

Be able to write both Kahn's and DFS approaches
For lexicographically smallest → replace queue with min-heap
Time O(V + E), Space O(V + E)

17.18 LeetCode 269: Alien Dictionary

Problem: Given a list of words sorted in alien alphabet order, determine the order of characters.

Analysis: This is a "reverse-engineer sorting rules from sorted output" problem. The essence is building a constraint graph + topological sort.

def alienOrder(words: list) -> str:
    """
    LeetCode 269: Alien Dictionary
    Infer character ordering from sorted word list
    
    Approach:
    1. Compare adjacent words to find partial order between characters
    2. Build directed graph
    3. Topological sort
    
    Time: O(C) where C is total characters across all words
    Space: O(1) alphabet size fixed at 26
    """
    # Step 1: Collect all characters
    all_chars = set()
    for word in words:
        all_chars.update(word)
    
    # Step 2: Compare adjacent words, extract ordering constraints
    graph = defaultdict(set)
    in_degree = {c: 0 for c in all_chars}
    
    for i in range(len(words) - 1):
        word1, word2 = words[i], words[i + 1]
        
        # Special case: if word1 is longer and is a prefix superset of word2
        # e.g., ["abc", "ab"] → violates sorting rules
        if len(word1) > len(word2) and word1[:len(word2)] == word2:
            return ""  # Invalid input
        
        # Find first differing character
        for c1, c2 in zip(word1, word2):
            if c1 != c2:
                if c2 not in graph[c1]:
                    graph[c1].add(c2)
                    in_degree[c2] += 1
                break  # Only the first difference provides information
    
    # Step 3: Kahn's topological sort
    queue = deque([c for c in all_chars if in_degree[c] == 0])
    result = []
    
    while queue:
        char = queue.popleft()
        result.append(char)
        for neighbor in graph[char]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    if len(result) != len(all_chars):
        return ""  # Cycle (contradictory constraints)
    
    return "".join(result)


# Test
print(alienOrder(["wrt", "wrf", "er", "ett", "rftt"]))
# Output: "wertf"
# Reasoning:
# "wrt" vs "wrf": t before f → t→f
# "wrf" vs "er":  w before e → w→e
# "er" vs "ett":  r before t → r→t
# "ett" vs "rftt": e before r → e→r
# Graph: w→e→r→t→f
# Topological sort: wertf

print(alienOrder(["z", "x"]))  # "zx"
print(alienOrder(["z", "x", "z"]))  # "" (z<x and x<z, contradiction → cycle)

Interview tips:

Key insight: Only adjacent word comparisons are meaningful (non-adjacent relationships are transitively derived)
Only use the first differing character: Subsequent character comparisons are invalid (sorting only looks at the first difference)
Edge cases:
- Invalid input (e.g., ["abc", "ab"])
- Single word (all character orderings are arbitrary)
- Cycle (contradictory constraints)
If unique solution required: When topological sort has multiple in-degree-0 nodes at any step, the order is not unique

17.19 Summary of Topological Sort Pitfalls

Pitfall 1: Edge direction

Different problems have different edge direction conventions!

# Course Schedule: prerequisites[i] = [a, b] means "to take a, first take b"
# Edge direction: b → a (b is prerequisite)
for a, b in prerequisites:
    graph[b].append(a)

# Alien Dictionary: if c1 comes before c2
# Edge direction: c1 → c2
graph[c1].add(c2)

Pitfall 2: Disconnected DAGs

# If the graph has multiple connected components (isolated nodes), 
# they can be placed in any order relative to each other
# Kahn's handles this naturally: all isolated nodes have in-degree 0
# But some problems may require specific ordering (e.g., lexicographically smallest)

Pitfall 3: Duplicate edges

# Duplicate edges [(0,1), (0,1)] may over-count in-degree
# Solution 1: Deduplicate during graph construction
graph = defaultdict(set)  # Use set instead of list

# Solution 2: Deduplicate edges first
edges = list(set(edges))

Pitfall 4: Assuming topological sort works on cyclic graphs

# Topological sort only works on DAGs!
# If the graph has a cycle, must detect/report it first
# Kahn's: return length < V signals a cycle
# DFS: re-encountering a gray node signals a cycle

17.20 Universal Template for Topological Sort Interview Problems

def topological_sort_template(problem_input):
    """
    Universal template for topological sort interview problems
    
    Key steps:
    1. Recognize "this is topological sort" (dependencies, ordering, DAG)
    2. Build graph (watch the edge direction!)
    3. Choose algorithm (Kahn's more versatile, DFS more concise)
    4. Handle edges cases (cycle, isolated nodes, non-uniqueness)
    """
    # Step 1: Extract dependencies from problem description
    # Determine: what are vertices? edges? edge direction?
    
    # Step 2: Build graph
    graph = defaultdict(list)
    in_degree = defaultdict(int)
    all_nodes = set()
    
    for dependency in problem_input:
        u, v = extract_edge(dependency)
        graph[u].append(v)
        in_degree[v] += 1
        all_nodes.add(u)
        all_nodes.add(v)
    
    # Ensure all nodes have in-degree records
    for node in all_nodes:
        if node not in in_degree:
            in_degree[node] = 0
    
    # Step 3: Kahn's algorithm
    queue = deque([n for n in all_nodes if in_degree[n] == 0])
    result = []
    
    while queue:
        node = queue.popleft()
        result.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    # Step 4: Check success
    if len(result) != len(all_nodes):
        return "IMPOSSIBLE"  # Cycle
    
    return result

Signal words for recognizing topological sort problems:

"prerequisite / precondition"
"dependency relationship"
"ordering / arrangement"
"can we finish all?"
"task scheduling"
"compilation order"

17.21 Chapter Summary

Concept	Key Points
Topological sort	Linear ordering of DAG vertices respecting all edge directions
Kahn's algorithm	BFS: output in-degree-0 first, O(V+E)
DFS topological sort	Reverse finish-time order, O(V+E)
Cycle detection	Kahn's: result < V; DFS: encountering gray node
Critical path	DAG longest path = minimum project duration
Parallel scheduling	Layered Kahn's = maximum parallelism

Topological sort is one of the most fundamental and important algorithms for directed graphs. Its applications extend far beyond "sorting" itself — compilers, package managers, spreadsheets, reactive frameworks, project management tools all rely on topological sort under the hood. Understanding topological sort means understanding the core of "optimal scheduling under dependency constraints."

Rate this chapter

4.9 / 5 (15 ratings)

Topological Sort and Directed Graphs

Chapter 17: Topological Sort and Directed Graphs

Level 1 · What You Need to Know

17.1 What Is Topological Sort?

17.2 BFS Implementation: Kahn's Algorithm

17.3 DFS Implementation

17.4 Cycle Detection and Topological Sort

17.5 Comparing the Two Algorithms

17.6 Common Mistakes

Level 2 · How It Works Under the Hood

17.7 Complete Analysis of the Course Schedule Series

LeetCode 207: Course Schedule (Can You Finish?)

LeetCode 210: Course Schedule II (Return the Order)

17.8 Compilation Dependency Analysis

17.9 Critical Path (Longest Path)

17.10 Topological Sort and Dynamic Programming

17.11 Layered Topological Sort

Level 3 · How the Specification Defines It

17.12 Kahn's 1962 Paper: The BFS Algorithm for Topological Sorting

17.13 Tarjan's DFS Topological Sort

17.14 Partial Orders and Total Orders: Mathematical Foundations

17.15 Topological Sort in Modern Systems

Package Managers (npm, pip, apt)

Spreadsheets (Excel/Google Sheets)

React/Vue Reactivity Systems

Level 4 · Edge Cases and Pitfalls

17.16 LeetCode 207: Course Schedule (Detailed Interview Analysis)

17.17 LeetCode 210: Course Schedule II (Return Specific Ordering)

17.18 LeetCode 269: Alien Dictionary

17.19 Summary of Topological Sort Pitfalls

17.20 Universal Template for Topological Sort Interview Problems

17.21 Chapter Summary

💬 Comments