Topological Sort and Directed Graphs
Chapter 17: Topological Sort and Directed Graphs
In the previous chapter, we learned graph representation and traversal — BFS and DFS. Among the many applications of graphs, one class of problems is extremely common and important: given a set of tasks and their dependency relationships, find a valid execution order. For example: university courses have prerequisite relationships — you must complete linear algebra before taking machine learning; in a software project, module A depends on module B — you must compile B before A; in cooking, you must chop vegetables before stir-frying.
This is the problem topological sorting solves. It only applies to Directed Acyclic Graphs (DAGs) — if there is a cycle (A depends on B, B depends on C, C depends on A), no valid ordering can exist.
In this chapter, we will: (1) understand the concept and properties of topological sort; (2) master two classic implementations — BFS (Kahn's algorithm) and DFS; (3) apply topological sort to course scheduling, compilation dependencies, critical paths, and other practical problems; (4) solve related high-frequency interview problems.
Level 1 · What You Need to Know
17.1 What Is Topological Sort?
Definition: Given a DAG G = (V, E), a topological sort is a linear ordering of V such that for every directed edge (u, v), u appears before v in the ordering.
Intuitive understanding: Topological sort is an "ordering that respects dependencies." If A must be completed before B (there is an edge A→B), then A definitely comes before B in the sorted output.
Example: Course dependencies
Math → Physics → Quantum Mechanics
Math → Linear Algebra → Machine Learning
Programming → Data Structures → Algorithms
Valid topological orderings:
[Math, Programming, Physics, Linear Algebra, Data Structures, QM, ML, Algorithms]
[Programming, Math, DS, Physics, Linear Algebra, Algorithms, QM, ML]
... (multiple valid orderings)
Note: Topological sort is usually NOT unique!
Key properties:
- Existence: A topological sort exists ⟺ the graph is a DAG (acyclic)
- Non-uniqueness: A DAG typically has multiple valid topological orderings
- Only meaningful for directed graphs: Undirected graphs have no "direction" and cannot define "before/after" relationships
17.2 BFS Implementation: Kahn's Algorithm
Kahn's algorithm (1962) has an elegantly simple idea:
- Find all vertices with in-degree 0 (tasks with no dependencies) — these can be executed first
- Add them to the result and "remove" them from the graph (decrement in-degrees of their neighbors)
- Repeat steps 1-2 until all vertices are processed, or no in-degree-0 vertex can be found (indicating a cycle)
from collections import deque, defaultdict
def topological_sort_kahn(num_vertices: int, edges: list) -> list:
"""
Kahn's Algorithm (BFS Topological Sort)
Args:
num_vertices: Number of vertices (0 to num_vertices-1)
edges: Edge list [(u, v), ...] meaning u → v dependency
Returns:
Topological ordering, or empty list if cycle exists
Time: O(V + E)
Space: O(V + E)
"""
# Build graph + compute in-degrees
graph = defaultdict(list)
in_degree = [0] * num_vertices
for u, v in edges:
graph[u].append(v)
in_degree[v] += 1
# Enqueue all vertices with in-degree 0
queue = deque()
for i in range(num_vertices):
if in_degree[i] == 0:
queue.append(i)
result = []
while queue:
node = queue.popleft()
result.append(node)
# "Remove" node: decrement neighbors' in-degrees
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
# If result contains all vertices, sort succeeded (no cycle)
if len(result) == num_vertices:
return result
else:
return [] # Cycle exists
# Example
# 0 → 1 → 3
# 0 → 2 → 3
# 2 → 4
edges = [(0,1), (0,2), (1,3), (2,3), (2,4)]
print(topological_sort_kahn(5, edges)) # [0, 1, 2, 3, 4] or [0, 2, 1, 4, 3] etc.
# Cycle case
edges_cycle = [(0,1), (1,2), (2,0)]
print(topological_sort_kahn(3, edges_cycle)) # [] (cycle)
Why does Kahn's algorithm work?
Intuition: Vertices with in-degree 0 have no unmet dependencies, so they can safely be placed at the current position. After removing them, newly exposed in-degree-0 vertices can also be safely placed. This process is like "peeling an onion layer by layer."
Correctness proof:
- If the graph is a DAG, at least one vertex with in-degree 0 must exist (otherwise we could follow incoming edges backwards from any vertex, forming a cycle in the finite graph — contradiction).
- After removing in-degree-0 vertices, the remaining graph is still a DAG (subgraphs of DAGs are DAGs).
- Therefore the algorithm never gets stuck and eventually processes all vertices.
17.3 DFS Implementation
The key insight for DFS topological sort: when a vertex "finishes" in DFS, all its successors (nodes that depend on it) have already been fully explored. Therefore, listing vertices in reverse order of finish time gives a topological sort.
def topological_sort_dfs(num_vertices: int, edges: list) -> list:
"""
DFS Topological Sort
Core idea: reverse DFS finish-time ordering
Time: O(V + E)
Space: O(V + E)
"""
graph = defaultdict(list)
for u, v in edges:
graph[u].append(v)
WHITE, GRAY, BLACK = 0, 1, 2
color = [WHITE] * num_vertices
result = []
has_cycle = False
def dfs(node):
nonlocal has_cycle
if has_cycle:
return
color[node] = GRAY
for neighbor in graph[node]:
if color[neighbor] == GRAY:
has_cycle = True
return
if color[neighbor] == WHITE:
dfs(neighbor)
if has_cycle:
return
color[node] = BLACK
result.append(node) # Add on finish (will be reversed later)
for i in range(num_vertices):
if color[i] == WHITE:
dfs(i)
if has_cycle:
return []
result.reverse() # Reverse to get topological order
return result
# Example
edges = [(0,1), (0,2), (1,3), (2,3), (2,4)]
print(topological_sort_dfs(5, edges)) # A valid topological ordering
# Cycle
edges_cycle = [(0,1), (1,2), (2,0)]
print(topological_sort_dfs(3, edges_cycle)) # []
Why does "reverse finish-time" produce a topological sort?
For any edge (u, v):
- When DFS visits u, v is either white (undiscovered) or black (finished).
- If v is white: DFS visits v and finishes v before finishing u. So f[v] < f[u].
- If v is black: f[v] < f[u] (v finished before u started or before u finishes).
- If v is gray: this is a back edge — meaning there is a cycle, not a DAG.
In summary, for DAGs, for any edge (u, v), f[u] > f[v]. Listing vertices by decreasing finish time places u before v. This is exactly the definition of topological sort.
17.4 Cycle Detection and Topological Sort
Cycle detection and topological sort are two sides of the same coin:
- Topological sort succeeds → graph is a DAG (no cycle)
- Topological sort fails → graph has a cycle
Therefore, many "does a cycle exist?" problems are essentially "try to do a topological sort."
Kahn's cycle detection: If after processing all in-degree-0 nodes, result length < V, some nodes can never reach in-degree 0 — they form a cycle.
DFS cycle detection: If DFS encounters a gray node (currently on the recursion stack), we have found a back edge — a cycle.
def detect_cycle_with_path(num_vertices: int, edges: list) -> list:
"""
Detect cycle in directed graph and return cycle nodes
Returns cycle node list if cycle exists, empty list otherwise
"""
graph = defaultdict(list)
for u, v in edges:
graph[u].append(v)
WHITE, GRAY, BLACK = 0, 1, 2
color = [WHITE] * num_vertices
parent = [-1] * num_vertices
cycle = []
def dfs(node):
color[node] = GRAY
for neighbor in graph[node]:
if color[neighbor] == GRAY:
# Found cycle from neighbor to node back to neighbor
path = [neighbor, node]
cur = node
while parent[cur] != neighbor and parent[cur] != -1:
cur = parent[cur]
path.append(cur)
cycle.extend(reversed(path))
return True
if color[neighbor] == WHITE:
parent[neighbor] = node
if dfs(neighbor):
return True
color[node] = BLACK
return False
for i in range(num_vertices):
if color[i] == WHITE:
if dfs(i):
return cycle
return []
17.5 Comparing the Two Algorithms
| Property | Kahn's (BFS) | DFS Topological Sort |
|---|---|---|
| Data structure | Queue + in-degree array | Recursion stack + color marking |
| Approach | "Output in-degree-0 first" | "Reverse finish-time order" |
| Cycle detection | result.size < V | Encountering gray node |
| Lexicographically smallest | Replace queue with min-heap | Hard to achieve |
| All topological orderings | Easy to extend with backtracking | Difficult |
| Online/incremental | Better support | Needs full re-run |
When to use Kahn's?
- Need lexicographically smallest topological ordering
- Need to count the number of valid orderings
- Need "layer" information (shortest path / critical path)
When to use DFS?
- Already have a DFS framework (e.g., doing other analyses simultaneously)
- Code brevity is preferred
17.6 Common Mistakes
Mistake 1: Reversing edge direction
# Problem says [a, b] means "to take a, must first take b"
# Edge direction should be b → a
# Many people get this backwards!
# Correct
for a, b in prerequisites:
graph[b].append(a) # b is prerequisite, a is dependent
in_degree[a] += 1
# Wrong
for a, b in prerequisites:
graph[a].append(b) # Direction reversed!
Mistake 2: Forgetting disconnected graphs
# A DAG may be disconnected (isolated nodes or multiple components)
# Kahn's handles this naturally: isolated nodes have in-degree 0, get enqueued
# DFS: must call dfs() for ALL unvisited nodes
for i in range(num_vertices): # Don't only start from node 0!
if color[i] == WHITE:
dfs(i)
Mistake 3: Assuming topological sort is unique
# DAG with edges 0→2, 1→2 has valid orderings [0,1,2] and [1,0,2]
# If the problem requires a unique answer (e.g., lexicographically smallest),
# special handling is needed
Level 2 · How It Works Under the Hood
17.7 Complete Analysis of the Course Schedule Series
LeetCode 207: Course Schedule (Can You Finish?)
This problem appeared in the previous chapter. The core is detecting whether a directed graph has a cycle. Here we provide a detailed execution trace of Kahn's algorithm.
def canFinish(numCourses: int, prerequisites: list) -> bool:
"""
LeetCode 207: Course Schedule
Essence: directed graph cycle detection = attempt topological sort
"""
graph = defaultdict(list)
in_degree = [0] * numCourses
for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1
queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
processed = 0
while queue:
node = queue.popleft()
processed += 1
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
return processed == numCourses
Execution trace:
Input: numCourses=4, prerequisites=[[1,0],[2,0],[3,1],[3,2]]
Graph: 0→1, 0→2, 1→3, 2→3
In-degrees: [0, 1, 1, 2]
Initial queue: [0] (in-degree 0)
Step 1: Process 0, neighbor 1 in-degree→0, neighbor 2 in-degree→0, queue=[1,2]
Step 2: Process 1, neighbor 3 in-degree→1, queue=[2]
Step 3: Process 2, neighbor 3 in-degree→0, queue=[3]
Step 4: Process 3, no neighbors, queue=[]
Processed 4 nodes = numCourses → no cycle → can finish
LeetCode 210: Course Schedule II (Return the Order)
def findOrder(numCourses: int, prerequisites: list) -> list:
"""
LeetCode 210: Course Schedule II
Return a valid course ordering (topological sort)
Returns empty list if impossible
"""
graph = defaultdict(list)
in_degree = [0] * numCourses
for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1
queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
return order if len(order) == numCourses else []
# Test
print(findOrder(4, [[1,0],[2,0],[3,1],[3,2]])) # [0, 1, 2, 3] or [0, 2, 1, 3]
print(findOrder(2, [[1,0]])) # [0, 1]
print(findOrder(2, [[0,1],[1,0]])) # [] (cycle)
DFS version (common interview follow-up):
def findOrder_dfs(numCourses: int, prerequisites: list) -> list:
"""DFS version of Course Schedule II"""
graph = defaultdict(list)
for course, prereq in prerequisites:
graph[prereq].append(course)
WHITE, GRAY, BLACK = 0, 1, 2
color = [WHITE] * numCourses
result = []
def dfs(node) -> bool:
"""Returns True if no cycle"""
color[node] = GRAY
for neighbor in graph[node]:
if color[neighbor] == GRAY:
return False # Cycle
if color[neighbor] == WHITE:
if not dfs(neighbor):
return False
color[node] = BLACK
result.append(node)
return True
for i in range(numCourses):
if color[i] == WHITE:
if not dfs(i):
return []
return result[::-1] # Reverse
17.8 Compilation Dependency Analysis
One of the most classic engineering applications of topological sort is build systems. In large software projects, source files have dependencies (#include, import), and the compiler must compile them in the correct order.
Core principle of Make and Bazel:
- Parse all file dependencies, build a dependency graph (DAG)
- Topologically sort the dependency graph to get build order
- Compile in topological order (ensuring each file's dependencies are compiled first)
class BuildSystem:
"""Simplified build system"""
def __init__(self):
self.dependencies = defaultdict(set)
self.dependents = defaultdict(set)
def add_dependency(self, module: str, depends_on: str):
"""Declare that module depends on depends_on"""
self.dependencies[module].add(depends_on)
self.dependents[depends_on].add(module)
if module not in self.dependents:
self.dependents[module]
if depends_on not in self.dependencies:
self.dependencies[depends_on]
def get_build_order(self) -> list:
"""
Return compilation order (topological sort)
Raises exception on circular dependency
"""
all_modules = set(self.dependencies.keys()) | set(self.dependents.keys())
in_degree = {m: len(self.dependencies[m]) for m in all_modules}
queue = deque([m for m in all_modules if in_degree[m] == 0])
order = []
while queue:
module = queue.popleft()
order.append(module)
for dependent in self.dependents[module]:
in_degree[dependent] -= 1
if in_degree[dependent] == 0:
queue.append(dependent)
if len(order) != len(all_modules):
cycle_modules = [m for m in all_modules if in_degree[m] > 0]
raise ValueError(f"Circular dependency detected: {cycle_modules}")
return order
def parallel_build_plan(self) -> list:
"""
Return parallel build plan: each "wave" can be compiled in parallel
This is essentially the layered result of BFS
"""
all_modules = set(self.dependencies.keys()) | set(self.dependents.keys())
in_degree = {m: len(self.dependencies[m]) for m in all_modules}
queue = deque([m for m in all_modules if in_degree[m] == 0])
waves = []
while queue:
wave = list(queue)
waves.append(wave)
next_queue = deque()
for module in wave:
for dependent in self.dependents[module]:
in_degree[dependent] -= 1
if in_degree[dependent] == 0:
next_queue.append(dependent)
queue = next_queue
return waves
# Example: C++ project compilation dependencies
build = BuildSystem()
build.add_dependency("main.cpp", "utils.h")
build.add_dependency("main.cpp", "network.h")
build.add_dependency("network.cpp", "network.h")
build.add_dependency("network.cpp", "utils.h")
build.add_dependency("network.h", "utils.h")
build.add_dependency("tests.cpp", "main.cpp")
build.add_dependency("tests.cpp", "network.cpp")
print("Build order:", build.get_build_order())
print("Parallel build plan:", build.parallel_build_plan())
# Wave 1: [utils.h] (no dependencies)
# Wave 2: [network.h] (depends only on utils.h)
# Wave 3: [main.cpp, network.cpp] (can be parallel)
# Wave 4: [tests.cpp] (depends on the above)
Key insight about parallel builds: In Kahn's algorithm, each "layer" (nodes whose in-degree simultaneously becomes 0) can be processed in parallel. This is the principle behind make -j8 accelerating compilation — it uses the layered structure of topological sort to maximize parallelism without violating dependencies.
17.9 Critical Path (Longest Path)
In project management, the critical path is the minimum time needed to complete a project — it is the longest path in the DAG from start to finish (because all tasks must be completed, and the bottleneck is the longest chain).
Problem definition: Given a DAG where each vertex has a weight (task duration), find the longest path from source to sink.
Key observation: Finding the longest path in a DAG is NOT NP-hard (unlike in general graphs)! Because we can use topological sort + dynamic programming.
def critical_path(num_tasks: int, edges: list, durations: list) -> tuple:
"""
Critical Path Algorithm
Args:
num_tasks: Number of tasks
edges: [(u, v), ...] meaning u must complete before v
durations: Duration of each task
Returns:
(earliest completion time, list of tasks on critical path)
"""
graph = defaultdict(list)
in_degree = [0] * num_tasks
for u, v in edges:
graph[u].append(v)
in_degree[v] += 1
# Step 1: Topological sort
topo_order = []
queue = deque([i for i in range(num_tasks) if in_degree[i] == 0])
temp_in_degree = in_degree[:]
while queue:
node = queue.popleft()
topo_order.append(node)
for neighbor in graph[node]:
temp_in_degree[neighbor] -= 1
if temp_in_degree[neighbor] == 0:
queue.append(neighbor)
# Step 2: Compute Earliest Start Time (EST) in topological order
est = [0] * num_tasks
predecessor = [-1] * num_tasks
for node in topo_order:
for neighbor in graph[node]:
new_est = est[node] + durations[node]
if new_est > est[neighbor]:
est[neighbor] = new_est
predecessor[neighbor] = node
# Step 3: Find earliest project completion time
project_end_time = 0
end_task = -1
for i in range(num_tasks):
finish_time = est[i] + durations[i]
if finish_time > project_end_time:
project_end_time = finish_time
end_task = i
# Step 4: Backtrack critical path
critical = []
cur = end_task
while cur != -1:
critical.append(cur)
cur = predecessor[cur]
critical.reverse()
return project_end_time, critical
# Example: Software project
# Tasks: 0=Requirements(3d), 1=Design(5d), 2=Coding(8d), 3=Testing(4d), 4=Docs(2d), 5=Deploy(1d)
tasks = 6
durations = [3, 5, 8, 4, 2, 1]
deps = [
(0, 1), # Requirements → Design
(0, 4), # Requirements → Docs
(1, 2), # Design → Coding
(2, 3), # Coding → Testing
(3, 5), # Testing → Deploy
(4, 5), # Docs → Deploy
]
total_time, critical = critical_path(tasks, deps, durations)
print(f"Earliest project completion: {total_time} days")
print(f"Critical path: {critical}")
# Critical path: Req(3) → Design(5) → Coding(8) → Testing(4) → Deploy(1) = 21 days
# Docs path: Req(3) → Docs(2) → Deploy(1) = 6 days (non-critical)
Engineering significance of CPM (Critical Path Method):
- Determine minimum project duration: The critical path length is the incompressible minimum time
- Identify bottlenecks: A 1-day delay on the critical path delays the entire project by 1 day; non-critical tasks have "float"
- Resource allocation: Non-critical path tasks can be delayed to focus resources on the critical path
17.10 Topological Sort and Dynamic Programming
Topological sort and dynamic programming have a deep connection: DP on a DAG is computed in topological order.
Many DP problems are essentially computations on DAGs:
- States are DAG vertices
- State transitions are DAG edges
- Computation order is topological sort
def dag_longest_path(num_vertices: int, edges: list, weights: dict) -> int:
"""
Longest path in a DAG (with edge weights)
weights: {(u,v): weight} edge weight dictionary
"""
graph = defaultdict(list)
in_degree = [0] * num_vertices
for u, v in edges:
graph[u].append(v)
in_degree[v] += 1
# Topological sort
topo = []
queue = deque([i for i in range(num_vertices) if in_degree[i] == 0])
while queue:
node = queue.popleft()
topo.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
# DP in topological order
dist = [0] * num_vertices
for node in topo:
for neighbor in graph[node]:
w = weights.get((node, neighbor), 1)
dist[neighbor] = max(dist[neighbor], dist[node] + w)
return max(dist)
# Example
edges = [(0,1), (0,2), (1,3), (2,3), (3,4)]
weights = {(0,1): 3, (0,2): 2, (1,3): 4, (2,3): 1, (3,4): 5}
print(dag_longest_path(5, edges, weights)) # 0→1→3→4 = 3+4+5 = 12
Classic example: Longest Increasing Subsequence (LIS)
LIS can be modeled as a DAG: if a[i] < a[j] and i < j, there is an edge i→j. LIS is the longest path in this DAG. Although in practice we use the O(n log n) greedy+binary-search solution, the DAG perspective helps understand the problem's essence.
17.11 Layered Topological Sort
Sometimes we need not just a valid topological order, but also the "level" of each node — the earliest round in which it can be processed.
def topological_sort_with_levels(num_vertices: int, edges: list) -> list:
"""
Layered topological sort: return nodes for each level
Nodes in the same level have no dependencies between them and can be processed in parallel
"""
graph = defaultdict(list)
in_degree = [0] * num_vertices
for u, v in edges:
graph[u].append(v)
in_degree[v] += 1
current_level = [i for i in range(num_vertices) if in_degree[i] == 0]
levels = []
while current_level:
levels.append(current_level)
next_level = []
for node in current_level:
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
next_level.append(neighbor)
current_level = next_level
total = sum(len(level) for level in levels)
if total != num_vertices:
return [] # Cycle
return levels
# Example
edges = [(0,2), (1,2), (2,3), (2,4), (3,5), (4,5)]
levels = topological_sort_with_levels(6, edges)
print(f"Levels: {levels}")
# [[0, 1], [2], [3, 4], [5]]
# Level 1: 0 and 1 can be parallel
# Level 2: 2 (depends on 0 and 1)
# Level 3: 3 and 4 can be parallel
# Level 4: 5 (depends on 3 and 4)
Meaning of level count: Level = longest path from any in-degree-0 node to the current node. This is equivalent to the "earliest start time" concept.
Level 3 · How the Specification Defines It
17.12 Kahn's 1962 Paper: The BFS Algorithm for Topological Sorting
In 1962, Arthur B. Kahn published "Topological sorting of large networks" (Communications of the ACM, Vol. 5, Issue 11), the first formal description of what we now call "Kahn's algorithm."
Historical context: In the early 1960s, computers were just beginning to be used for large-scale project management (PERT network diagrams, CPM critical path analysis). These applications needed to sort hundreds or thousands of tasks by dependencies. Kahn's paper directly addressed this engineering need.
Kahn's original description (simplified translation):
"Given a finite directed graph representing a partial ordering, the algorithm is:
- Find all nodes with no predecessors; output them (their relative order is arbitrary).
- Remove these nodes and all edges emanating from them.
- Repeat steps 1-2 until the graph is empty. If the graph is non-empty but no nodes without predecessors exist, the graph contains a cycle."
Key contributions:
- Formalized the topological sorting problem: Clearly defined linearization of partial orders
- Provided an O(V + E) algorithm: Critical given the limited computing power of the era — efficiency meant being able to handle larger networks
- Simultaneous cycle detection: The algorithm naturally terminates and reports when a cycle exists
- Discussed parallelism: Kahn noted that all in-degree-0 nodes in each round can be processed in parallel — an observation that remains central to parallel build systems 60 years later
Complexity analysis (in Kahn's argument style):
- Each vertex is enqueued exactly once and dequeued exactly once → O(V) vertex processing
- Each edge is examined exactly once when its source vertex is removed → O(E) edge processing
- Total: O(V + E)
This analysis is identical to BFS complexity analysis — in fact, Kahn's algorithm IS a specialized BFS.
17.13 Tarjan's DFS Topological Sort
Robert Tarjan in his 1972 paper "Depth-First Search and Linear Graph Algorithms" (SIAM Journal on Computing) systematized the use of DFS in graph algorithms, including topological sort based on DFS finish times.
Tarjan's insight: DFS finish times naturally give a reverse topological order. This observation seems simple, but it unifies multiple seemingly unrelated graph algorithms:
- Topological sort: Reverse finish-time ordering
- Strongly connected components (Tarjan's algorithm / Kosaraju's algorithm): Two-pass DFS using finish times
- Articulation points and bridges: Using DFS tree structure
Why is Tarjan's method more "profound"?
Kahn's algorithm is "constructive" — it directly builds the ordering. Tarjan's method reveals the relationship between topological sort and the graph's depth structure:
- Finish times reflect "dependency depth": a higher finish time means the node is a "shallower" dependency (finishing later = more things depend on it)
- The existence of back edges is equivalent to the existence of cycles — a deeper structural insight than "can't find an in-degree-0 node"
Tarjan's correctness proof (simplified):
Let G = (V, E) be a DAG. For any edge (u, v) in E:
- When DFS visits u, v is either white (undiscovered) or black (finished)
- v cannot be gray (because gray means v is an ancestor of u; adding edge (u,v) would form a cycle — contradicting DAG)
- If v is white: v becomes a descendant of u, f[v] < f[u]
- If v is black: f[v] < f[u] (v finished before u)
- In both cases, f[u] > f[v]
Therefore, vertices listed in decreasing finish time satisfy: for any edge (u,v), u comes before v. This is exactly the definition of topological sort. □
Philosophical difference between Tarjan and Kahn:
| Kahn | Tarjan |
|---|---|
| Bottom-up | Top-down |
| "Do what's doable first" | "Go deep, then come back" |
| Incremental (suits dynamic graphs) | Global (single DFS traversal) |
| More intuitive | Reveals deeper structure |
17.14 Partial Orders and Total Orders: Mathematical Foundations
A partial order is a relation satisfying reflexivity, antisymmetry, and transitivity. A DAG's reachability defines a partial order: if v is reachable from u, then u ≤ v. But not all vertex pairs are comparable (unreachable vertices have no ordering relation).
A total order requires all pairs to be comparable.
Topological sort = linear extension of a partial order: Extending a partial order to a total order while preserving all existing relationships.
Dilworth's Theorem (1950): In any finite partially ordered set, the maximum antichain size (largest set of mutually incomparable elements) equals the minimum number of chains (totally ordered subsets) needed to partition the set.
Applied to topological sort: maximum antichain = maximum number of tasks that can be processed simultaneously in any valid schedule.
def max_parallel_width(num_vertices: int, edges: list) -> int:
"""
Compute DAG's maximum width (maximum antichain size)
i.e., maximum number of simultaneously active tasks in any valid schedule
Equals the size of the largest level in layered topological sort
"""
levels = topological_sort_with_levels(num_vertices, edges)
if not levels:
return 0 # Cycle
return max(len(level) for level in levels)
17.15 Topological Sort in Modern Systems
Package Managers (npm, pip, apt)
Every time you run pip install tensorflow, the package manager needs to:
- Resolve all of tensorflow's dependencies (recursively)
- Build the dependency graph
- Detect circular dependencies (report error if found)
- Topologically sort to determine installation order
- Install in order (or parallelize independent packages)
Spreadsheets (Excel/Google Sheets)
Cell formula references form a DAG. When you modify a cell, the spreadsheet needs to recalculate all dependent cells in topological order.
class Spreadsheet:
"""Simplified spreadsheet engine"""
def __init__(self):
self.values = {}
self.formulas = {}
self.deps = defaultdict(set) # cell → cells it depends on
self.rdeps = defaultdict(set) # cell → cells depending on it
def set_formula(self, cell: str, formula, dependencies: list):
"""Set cell formula"""
for old_dep in self.deps[cell]:
self.rdeps[old_dep].discard(cell)
self.formulas[cell] = formula
self.deps[cell] = set(dependencies)
for dep in dependencies:
self.rdeps[dep].add(cell)
self._recalculate(cell)
def set_value(self, cell: str, value):
"""Set value directly (non-formula)"""
self.values[cell] = value
self.formulas.pop(cell, None)
self.deps[cell] = set()
self._propagate(cell)
def _propagate(self, changed_cell: str):
"""Recalculate all affected cells in topological order"""
affected = set()
queue = deque([changed_cell])
while queue:
cell = queue.popleft()
for dependent in self.rdeps[cell]:
if dependent not in affected:
affected.add(dependent)
queue.append(dependent)
# Topologically sort affected cells and recalculate...
def _recalculate(self, cell: str):
"""Recalculate a single cell"""
if cell in self.formulas:
dep_values = {d: self.values.get(d, 0) for d in self.deps[cell]}
self.values[cell] = self.formulas[cell](dep_values)
React/Vue Reactivity Systems
Frontend framework reactive updates are essentially topological sort:
- State variables are source nodes in the DAG
- Computed properties are intermediate nodes
- UI components are leaf nodes
- When source state changes, the framework updates all downstream dependencies in topological order
Level 4 · Edge Cases and Pitfalls
17.16 LeetCode 207: Course Schedule (Detailed Interview Analysis)
We have given code implementations in Levels 1 and 2. Here we focus on follow-up questions and edge cases in interviews.
Follow-up 1: What if the number of courses is very large (10⁶)?
# Kahn's is already O(V+E), can't improve asymptotically
# But we can optimize constants:
# 1. Use arrays instead of dicts (vertices are integers 0~n-1)
# 2. Use list for adjacency instead of defaultdict
# 3. Avoid creating new lists (in-place modification)
def canFinish_optimized(numCourses: int, prerequisites: list) -> bool:
"""Memory-optimized version"""
graph = [[] for _ in range(numCourses)]
in_degree = [0] * numCourses
for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1
# Use list as queue (pointer-based dequeue)
queue = [i for i in range(numCourses) if in_degree[i] == 0]
idx = 0
while idx < len(queue):
node = queue[idx]
idx += 1
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
return len(queue) == numCourses
Follow-up 2: Find all courses that form cycles
def find_cycle_courses(numCourses: int, prerequisites: list) -> list:
"""Find all courses involved in cycles"""
graph = [[] for _ in range(numCourses)]
in_degree = [0] * numCourses
for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1
queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
removed = set()
while queue:
node = queue.popleft()
removed.add(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
# Remaining nodes not removed are all involved in cycles
return [i for i in range(numCourses) if i not in removed]
Edge cases:
numCourses = 0: Return True (empty graph has no cycles)prerequisites = []: All courses are independent, return True- Self-loop
[0, 0]: Course depends on itself, cycle exists, return False - Duplicate edges: Does not affect correctness (in-degree may be over-counted, but Kahn's still works correctly)
17.17 LeetCode 210: Course Schedule II (Return Specific Ordering)
def findOrder(numCourses: int, prerequisites: list) -> list:
"""
LeetCode 210: Course Schedule II
Complete interview-level implementation
"""
graph = [[] for _ in range(numCourses)]
in_degree = [0] * numCourses
for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1
queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
return order if len(order) == numCourses else []
# For lexicographically smallest topological sort:
import heapq
def findOrder_lexicographic(numCourses: int, prerequisites: list) -> list:
"""Lexicographically smallest topological sort"""
graph = [[] for _ in range(numCourses)]
in_degree = [0] * numCourses
for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1
# Use min-heap instead of regular queue
heap = [i for i in range(numCourses) if in_degree[i] == 0]
heapq.heapify(heap)
order = []
while heap:
node = heapq.heappop(heap)
order.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
heapq.heappush(heap, neighbor)
return order if len(order) == numCourses else []
# Test
print(findOrder(4, [[1,0],[2,0],[3,1],[3,2]]))
# Possible: [0, 1, 2, 3] or [0, 2, 1, 3]
print(findOrder_lexicographic(4, [[1,0],[2,0],[3,1],[3,2]]))
# Always: [0, 1, 2, 3] (lexicographically smallest)
Interview tips:
- Be able to write both Kahn's and DFS approaches
- For lexicographically smallest → replace queue with min-heap
- Time O(V + E), Space O(V + E)
17.18 LeetCode 269: Alien Dictionary
Problem: Given a list of words sorted in alien alphabet order, determine the order of characters.
Analysis: This is a "reverse-engineer sorting rules from sorted output" problem. The essence is building a constraint graph + topological sort.
def alienOrder(words: list) -> str:
"""
LeetCode 269: Alien Dictionary
Infer character ordering from sorted word list
Approach:
1. Compare adjacent words to find partial order between characters
2. Build directed graph
3. Topological sort
Time: O(C) where C is total characters across all words
Space: O(1) alphabet size fixed at 26
"""
# Step 1: Collect all characters
all_chars = set()
for word in words:
all_chars.update(word)
# Step 2: Compare adjacent words, extract ordering constraints
graph = defaultdict(set)
in_degree = {c: 0 for c in all_chars}
for i in range(len(words) - 1):
word1, word2 = words[i], words[i + 1]
# Special case: if word1 is longer and is a prefix superset of word2
# e.g., ["abc", "ab"] → violates sorting rules
if len(word1) > len(word2) and word1[:len(word2)] == word2:
return "" # Invalid input
# Find first differing character
for c1, c2 in zip(word1, word2):
if c1 != c2:
if c2 not in graph[c1]:
graph[c1].add(c2)
in_degree[c2] += 1
break # Only the first difference provides information
# Step 3: Kahn's topological sort
queue = deque([c for c in all_chars if in_degree[c] == 0])
result = []
while queue:
char = queue.popleft()
result.append(char)
for neighbor in graph[char]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
if len(result) != len(all_chars):
return "" # Cycle (contradictory constraints)
return "".join(result)
# Test
print(alienOrder(["wrt", "wrf", "er", "ett", "rftt"]))
# Output: "wertf"
# Reasoning:
# "wrt" vs "wrf": t before f → t→f
# "wrf" vs "er": w before e → w→e
# "er" vs "ett": r before t → r→t
# "ett" vs "rftt": e before r → e→r
# Graph: w→e→r→t→f
# Topological sort: wertf
print(alienOrder(["z", "x"])) # "zx"
print(alienOrder(["z", "x", "z"])) # "" (z<x and x<z, contradiction → cycle)
Interview tips:
- Key insight: Only adjacent word comparisons are meaningful (non-adjacent relationships are transitively derived)
- Only use the first differing character: Subsequent character comparisons are invalid (sorting only looks at the first difference)
- Edge cases:
- Invalid input (e.g., ["abc", "ab"])
- Single word (all character orderings are arbitrary)
- Cycle (contradictory constraints)
- If unique solution required: When topological sort has multiple in-degree-0 nodes at any step, the order is not unique
17.19 Summary of Topological Sort Pitfalls
Pitfall 1: Edge direction
Different problems have different edge direction conventions!
# Course Schedule: prerequisites[i] = [a, b] means "to take a, first take b"
# Edge direction: b → a (b is prerequisite)
for a, b in prerequisites:
graph[b].append(a)
# Alien Dictionary: if c1 comes before c2
# Edge direction: c1 → c2
graph[c1].add(c2)
Pitfall 2: Disconnected DAGs
# If the graph has multiple connected components (isolated nodes),
# they can be placed in any order relative to each other
# Kahn's handles this naturally: all isolated nodes have in-degree 0
# But some problems may require specific ordering (e.g., lexicographically smallest)
Pitfall 3: Duplicate edges
# Duplicate edges [(0,1), (0,1)] may over-count in-degree
# Solution 1: Deduplicate during graph construction
graph = defaultdict(set) # Use set instead of list
# Solution 2: Deduplicate edges first
edges = list(set(edges))
Pitfall 4: Assuming topological sort works on cyclic graphs
# Topological sort only works on DAGs!
# If the graph has a cycle, must detect/report it first
# Kahn's: return length < V signals a cycle
# DFS: re-encountering a gray node signals a cycle
17.20 Universal Template for Topological Sort Interview Problems
def topological_sort_template(problem_input):
"""
Universal template for topological sort interview problems
Key steps:
1. Recognize "this is topological sort" (dependencies, ordering, DAG)
2. Build graph (watch the edge direction!)
3. Choose algorithm (Kahn's more versatile, DFS more concise)
4. Handle edges cases (cycle, isolated nodes, non-uniqueness)
"""
# Step 1: Extract dependencies from problem description
# Determine: what are vertices? edges? edge direction?
# Step 2: Build graph
graph = defaultdict(list)
in_degree = defaultdict(int)
all_nodes = set()
for dependency in problem_input:
u, v = extract_edge(dependency)
graph[u].append(v)
in_degree[v] += 1
all_nodes.add(u)
all_nodes.add(v)
# Ensure all nodes have in-degree records
for node in all_nodes:
if node not in in_degree:
in_degree[node] = 0
# Step 3: Kahn's algorithm
queue = deque([n for n in all_nodes if in_degree[n] == 0])
result = []
while queue:
node = queue.popleft()
result.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
# Step 4: Check success
if len(result) != len(all_nodes):
return "IMPOSSIBLE" # Cycle
return result
Signal words for recognizing topological sort problems:
- "prerequisite / precondition"
- "dependency relationship"
- "ordering / arrangement"
- "can we finish all?"
- "task scheduling"
- "compilation order"
17.21 Chapter Summary
| Concept | Key Points |
|---|---|
| Topological sort | Linear ordering of DAG vertices respecting all edge directions |
| Kahn's algorithm | BFS: output in-degree-0 first, O(V+E) |
| DFS topological sort | Reverse finish-time order, O(V+E) |
| Cycle detection | Kahn's: result < V; DFS: encountering gray node |
| Critical path | DAG longest path = minimum project duration |
| Parallel scheduling | Layered Kahn's = maximum parallelism |
Topological sort is one of the most fundamental and important algorithms for directed graphs. Its applications extend far beyond "sorting" itself — compilers, package managers, spreadsheets, reactive frameworks, project management tools all rely on topological sort under the hood. Understanding topological sort means understanding the core of "optimal scheduling under dependency constraints."