Queues and Deques
Chapter 7: Queues and Deques
You stand in line at the cafeteria: whoever arrives first gets served first. You take a number at the bank: the smallest number gets called to the window first. Your computer has ten programs running simultaneously, and the operating system lines them up in a queue, taking turns allocating CPU time.
The common mathematical structure behind these scenarios is the queue. The first element to enter is the first one to leave — First In, First Out, or FIFO.
Queues and stacks (Chapter 5) are "mirror image" data structures: stacks are LIFO, queues are FIFO. Stacks solve "the most recent thing gets processed first" problems (recursion, undo), while queues solve "the earliest thing gets processed first" problems (scheduling, breadth-first search). Together, they cover the vast majority of "process events in temporal order" scenarios.
In this chapter, we start from the simplest queue, progressively dive into circular queues and deques, and finally touch on priority queues and message queues. You will discover that queues are the bridge connecting data structures to operating systems, network protocols, and distributed systems.
Level 1 · What You Need to Know
1.1 The Basic Concept of a Queue
A queue is a restricted linear data structure. It only allows insertion at one end (the rear or back) and deletion at the other end (the front).
Core operations of a queue:
| Operation | Meaning | Time Complexity |
|---|---|---|
enqueue(x) |
Add element x to the rear | O(1) |
dequeue() |
Remove and return the front element | O(1) |
peek() / front() |
View the front element without removing it | O(1) |
Additional operations:
is_empty()— check whether the queue is emptysize()— return the number of elements in the queue
Formal description of the FIFO property:
Given a queue Q on which we perform enqueue(a₁), enqueue(a₂), ..., enqueue(aₙ) in sequence, then performing n dequeue() operations yields the sequence a₁, a₂, ..., aₙ. That is: the output order is identical to the input order.
Why this constraint? The same reasoning as with stacks: constraints bring semantic clarity. When you see a queue, you immediately know "first come, first served." This fairness assumption holds naturally in countless scenarios — waiting in line, scheduling, message passing. The design philosophy of queues is: use structure to guarantee fairness, rather than relying on programmers to manually maintain order.
1.2 Implementing a Queue with Python's collections.deque
Many beginners use Python's list to implement a queue:
# ❌ Wrong approach: using list as a queue
queue = []
queue.append("A") # enqueue
queue.append("B")
queue.pop(0) # dequeue — this is O(n)!
This is a common performance trap. list.pop(0) needs to shift all subsequent elements forward by one position, giving O(n) time complexity. If your queue has one million elements, every dequeue operation moves one million elements — clearly unacceptable.
The correct approach is to use collections.deque:
from collections import deque
class Queue:
"""Queue implementation based on collections.deque"""
def __init__(self):
self._data = deque()
def enqueue(self, item):
"""Add element to the rear — O(1)"""
self._data.append(item)
def dequeue(self):
"""Remove and return the front element — O(1)
Raises:
IndexError: if the queue is empty
"""
if self.is_empty():
raise IndexError("dequeue from empty queue")
return self._data.popleft()
def peek(self):
"""View the front element without removing it — O(1)
Raises:
IndexError: if the queue is empty
"""
if self.is_empty():
raise IndexError("peek from empty queue")
return self._data[0]
def is_empty(self):
"""Check whether the queue is empty — O(1)"""
return len(self._data) == 0
def size(self):
"""Return the number of elements in the queue — O(1)"""
return len(self._data)
def __repr__(self):
return f"Queue({list(self._data)})"
# Usage example
q = Queue()
q.enqueue("Alice")
q.enqueue("Bob")
q.enqueue("Charlie")
print(q.dequeue()) # Alice — first in, first out
print(q.dequeue()) # Bob
print(q.peek()) # Charlie — view without removing
print(q.size()) # 1
Why is deque's popleft() O(1)? Because deque internally is not a contiguous array, but a doubly-linked list of fixed-length blocks. Operations at either end never require moving other elements. We will analyze its internal implementation in detail in Level 2.
1.3 Implementing a Circular Queue
In many low-level systems (OS kernels, embedded devices, network buffers), we cannot use a dynamically-allocated deque. Instead, we need a fixed-size array to implement a queue. This is the circular queue.
The core idea of a circular queue: use an array and two pointers (front, rear) to maintain the queue. When a pointer reaches the end of the array, the modulo operation wraps it back to the beginning, forming a logically circular structure.
class CircularQueue:
"""Array-based circular queue implementation"""
def __init__(self, capacity: int):
"""
Args:
capacity: maximum number of elements the queue can hold
"""
self._data = [None] * (capacity + 1) # allocate one extra space to distinguish full from empty
self._front = 0
self._rear = 0
self._capacity = capacity + 1
def enqueue(self, item) -> bool:
"""Add element to the rear — O(1)
Returns:
True if successful, False if the queue is full
"""
if self.is_full():
return False
self._data[self._rear] = item
self._rear = (self._rear + 1) % self._capacity
return True
def dequeue(self):
"""Remove and return the front element — O(1)
Raises:
IndexError: if the queue is empty
"""
if self.is_empty():
raise IndexError("dequeue from empty circular queue")
item = self._data[self._front]
self._data[self._front] = None # help GC
self._front = (self._front + 1) % self._capacity
return item
def peek(self):
"""View the front element — O(1)"""
if self.is_empty():
raise IndexError("peek from empty circular queue")
return self._data[self._front]
def is_empty(self) -> bool:
"""The queue is empty when: front == rear"""
return self._front == self._rear
def is_full(self) -> bool:
"""The queue is full when: (rear + 1) % capacity == front
Note: We sacrifice one position to distinguish full from empty.
Without this, both states would have front == rear, making them
indistinguishable.
"""
return (self._rear + 1) % self._capacity == self._front
def size(self) -> int:
"""Number of elements currently in the queue"""
return (self._rear - self._front + self._capacity) % self._capacity
def __repr__(self):
if self.is_empty():
return "CircularQueue([])"
items = []
i = self._front
while i != self._rear:
items.append(self._data[i])
i = (i + 1) % self._capacity
return f"CircularQueue({items})"
# Usage example
cq = CircularQueue(5)
for i in range(5):
cq.enqueue(i)
print(cq) # CircularQueue([0, 1, 2, 3, 4])
print(cq.is_full()) # True
print(cq.dequeue()) # 0
print(cq.dequeue()) # 1
cq.enqueue(5) # reuses previously freed space
cq.enqueue(6)
print(cq) # CircularQueue([2, 3, 4, 5, 6])
Key design decisions in circular queues:
-
Why allocate one extra space? If
capacity = 5, we actually allocate 6 slots. This way, the "full" condition is(rear + 1) % capacity == front, and the "empty" condition isfront == rear— the two never conflict. An alternative approach is to maintain asizecounter, but this adds overhead to every operation (though just an increment/decrement, it matters in high-frequency scenarios). -
The meaning of the modulo operation:
(index + 1) % capacityimplements "wrap back to the beginning when reaching the end." This is the essence of "circular" — the pointer walks around the array in a ring. -
Difference from Python's deque: A circular queue has a fixed size; when full, it rejects new elements (or raises an exception).
dequeis dynamically sized and auto-expands. Circular queues suit "buffer" scenarios — where you want to control memory usage rather than allow unbounded growth.
1.4 BFS Basics: Level-Order Traversal and Shortest Paths
The most classic algorithmic application of queues is Breadth-First Search (BFS). BFS is a "layer-by-layer outward expansion" search strategy that naturally requires a queue to maintain "the nodes to visit in the next layer."
The basic BFS template:
from collections import deque
def bfs(graph, start):
"""BFS graph traversal
Args:
graph: graph represented as adjacency list, graph[node] is all neighbors of node
start: starting node
Returns:
visited: all nodes reachable from start
"""
visited = set([start])
queue = deque([start])
while queue:
node = queue.popleft() # dequeue from front
# process current node
print(node, end=" ")
for neighbor in graph[node]:
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor) # enqueue at rear
return visited
BFS for shortest path (unweighted graphs):
In unweighted graphs (all edges have weight 1), BFS naturally finds the shortest path from the source to any reachable node. The intuition is straightforward: BFS visits nodes at distance 1 first, then distance 2, and so on. The first time it reaches any node, that path must be the shortest.
def shortest_path_bfs(graph, start, target):
"""Find the shortest path from start to target in an unweighted graph
Returns:
Path as a list, or None if unreachable
"""
if start == target:
return [start]
visited = {start}
queue = deque([(start, [start])]) # (current node, path from start)
while queue:
node, path = queue.popleft()
for neighbor in graph[node]:
if neighbor not in visited:
new_path = path + [neighbor]
if neighbor == target:
return new_path
visited.add(neighbor)
queue.append((neighbor, new_path))
return None # unreachable
# Example
graph = {
'A': ['B', 'C'],
'B': ['A', 'D', 'E'],
'C': ['A', 'F'],
'D': ['B'],
'E': ['B', 'F'],
'F': ['C', 'E']
}
path = shortest_path_bfs(graph, 'A', 'F')
print(path) # ['A', 'C', 'F'] — shortest path, length 2
Level-order traversal is BFS applied to trees:
def level_order_traversal(root):
"""Level-order traversal of a binary tree
Returns:
result: 2D list where result[i] contains all node values at level i
"""
if not root:
return []
result = []
queue = deque([root])
while queue:
level_size = len(queue) # number of nodes at current level
level_nodes = []
for _ in range(level_size):
node = queue.popleft()
level_nodes.append(node.val)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
result.append(level_nodes)
return result
For more BFS applications — graph traversal, shortest path variants, topological sorting — we will elaborate in Chapter 10 (Graph Fundamentals) and Chapter 16 (Advanced Graph Algorithms).
1.5 Introduction to Priority Queues
A regular queue's rule is "first come, first served." But in many real scenarios, we need "highest priority served first" — this is the priority queue.
The priority queue's interface is similar to a regular queue, but dequeue does not return the earliest-entered element. Instead, it returns the element with the highest priority.
import heapq
class PriorityQueue:
"""Min-priority queue based on heapq"""
def __init__(self):
self._heap = []
self._counter = 0 # tiebreaker for equal priorities
def enqueue(self, item, priority):
"""Add element to the priority queue
Args:
item: the element
priority: priority value (lower number = higher priority)
"""
heapq.heappush(self._heap, (priority, self._counter, item))
self._counter += 1
def dequeue(self):
"""Remove and return the highest-priority (lowest number) element"""
if self.is_empty():
raise IndexError("dequeue from empty priority queue")
priority, _, item = heapq.heappop(self._heap)
return item
def peek(self):
"""View the highest-priority element"""
if self.is_empty():
raise IndexError("peek from empty priority queue")
return self._heap[0][2]
def is_empty(self):
return len(self._heap) == 0
def size(self):
return len(self._heap)
# Example: Hospital emergency triage
er = PriorityQueue()
er.enqueue("Common cold", priority=5)
er.enqueue("Heart attack", priority=1)
er.enqueue("Fracture", priority=3)
print(er.dequeue()) # Heart attack — highest priority
print(er.dequeue()) # Fracture
print(er.dequeue()) # Common cold
Efficient implementation of priority queues relies on the heap data structure. We will cover heap principles and implementation in detail in Chapter 12. For now, you just need to know: Python's heapq module gives you an O(log n) enqueue, O(log n) dequeue priority queue.
1.6 Common Mistake: Using list as a Queue
Let us quantify exactly how slow "using list as a queue" is:
import time
from collections import deque
def benchmark_list_queue(n):
"""Performance of list-as-queue"""
q = []
start = time.perf_counter()
for i in range(n):
q.append(i)
for i in range(n):
q.pop(0) # O(n) operation!
return time.perf_counter() - start
def benchmark_deque_queue(n):
"""Performance of deque-as-queue"""
q = deque()
start = time.perf_counter()
for i in range(n):
q.append(i)
for i in range(n):
q.popleft() # O(1) operation
return time.perf_counter() - start
# Test different scales
for n in [10_000, 100_000, 1_000_000]:
t_list = benchmark_list_queue(n)
t_deque = benchmark_deque_queue(n)
print(f"n={n:>10,}: list={t_list:.4f}s, deque={t_deque:.4f}s, "
f"ratio={t_list/t_deque:.1f}x")
Typical output (exact numbers vary by machine):
n= 10,000: list=0.0150s, deque=0.0008s, ratio=18.8x
n= 100,000: list=1.4500s, deque=0.0070s, ratio=207.1x
n= 1,000,000: list=148.30s, deque=0.0680s, ratio=2181.0x
The gap is quadratic: the list version's total time is O(n²) (n times O(n) pop(0)), while the deque version is O(n) (n times O(1) popleft()). When n grows 10x, the list version slows 100x, while the deque version slows only 10x.
Remember this rule: In Python, never use list.pop(0) as a queue's dequeue operation. Use collections.deque.
Level 2 · How It Works Under the Hood
2.1 The Internal Implementation of collections.deque
In CPython, the source code for collections.deque lives in Modules/_collectionsmodule.c. Its internal structure is neither a simple array nor a simple linked list, but a hybrid: a doubly-linked list of fixed-length blocks.
Each block is an array of 64 PyObject* pointers (BLOCKLEN = 64 in CPython 3.x). Multiple blocks are linked together via a doubly-linked list. The deque object itself maintains:
leftblock— pointer to the leftmost blockrightblock— pointer to the rightmost blockleftindex— index of the next available position in the leftblockrightindex— index of the last element in the rightblock
Internal layout of deque:
leftblock ... rightblock
┌──────────┐ ┌──────────┐ ┌──────────┐
│ [64 slots] │ ←→ │ [64 slots] │ ←→ │ [64 slots] │
└──────────┘ └──────────┘ └──────────┘
↑ leftindex rightindex ↑
The process of appendleft() / popleft():
popleft(): Retrieve the element atleftblock[leftindex], thenleftindex += 1. Ifleftindexexceeds BLOCKLEN (the current block is exhausted), free this block, pointleftblockto the next block, and setleftindex = 0.appendleft(x):leftindex -= 1, place x atleftblock[leftindex]. Ifleftindex < 0(the current block's left side is full), allocate a new block and link it to the left of leftblock.
Why is this design better than a pure linked list?
- Cache-friendly: The 64 pointers within each block are contiguous in memory, giving high CPU cache hit rates when accessing them. In a pure linked list, each node may be scattered across memory.
- Low memory overhead: A pure linked list requires
prevandnextpointers (16 bytes) per element, plus the Python object header (56 bytes). In deque's block structure, each element only needs an 8-byte pointer, and the per-block prev/next overhead is amortized across 64 elements. - O(1) operations at both ends: append/appendleft/pop/popleft are all true O(1), not amortized.
Deque's drawback: Random access is O(n). deque[i] needs to start from one end, jump block by block to find the block containing the i-th element, then locate it within the block. Unlike an array's O(1) direct indexing.
Raymond Hettinger (the primary implementer of deque, Python core developer) has explained this design decision multiple times in the CPython issue tracker and PyCon talks: "deque is positioned as an efficient container for double-ended operations, not a general-purpose random-access sequence. If you need random access, use list."
2.2 Circular Queue vs. Linked-List Queue: Performance Comparison
Queues have two classic underlying implementations: array (circular queue) and linked list. Each has its strengths:
| Dimension | Circular Queue (Array) | Linked-List Queue |
|---|---|---|
| enqueue | O(1), direct write | O(1), allocate new node |
| dequeue | O(1), move pointer | O(1), delete head node |
| Memory layout | Contiguous, cache-friendly | Scattered, cache-unfriendly |
| Memory overhead | Fixed size, may waste space | On-demand, no waste |
| Maximum capacity | Fixed (must be specified) | Theoretically unlimited |
| Allocation/deallocation | None (reuses array space) | Every operation involves alloc/free |
When to choose which?
- Circular queue: When you know the maximum queue length, pursue maximum performance, or work in environments that disallow dynamic memory allocation (embedded, kernel). Linux kernel's
kfifois a circular queue. - Linked-list queue: When queue length is unpredictable, or element sizes vary greatly. Java's
LinkedListimplementing theQueueinterface is a linked-list queue. - Block-based deque (deque's design): Combines advantages of both — decent cache performance (contiguous within blocks) and dynamic growth. Python's
dequeand C++'sstd::dequeboth use this design.
Performance measurement:
import time
from collections import deque
class LinkedNode:
__slots__ = ('val', 'next')
def __init__(self, val):
self.val = val
self.next = None
class LinkedQueue:
"""Queue implemented with a linked list"""
def __init__(self):
self._head = None
self._tail = None
self._size = 0
def enqueue(self, item):
node = LinkedNode(item)
if self._tail:
self._tail.next = node
else:
self._head = node
self._tail = node
self._size += 1
def dequeue(self):
if not self._head:
raise IndexError("empty")
val = self._head.val
self._head = self._head.next
if not self._head:
self._tail = None
self._size -= 1
return val
def benchmark(queue_class, n, *args):
q = queue_class(*args) if args else queue_class()
start = time.perf_counter()
for i in range(n):
if hasattr(q, 'enqueue'):
q.enqueue(i)
else:
q.append(i)
for i in range(n):
if hasattr(q, 'dequeue'):
q.dequeue()
else:
q.popleft()
return time.perf_counter() - start
n = 1_000_000
print(f"deque: {benchmark(deque, n):.4f}s")
print(f"LinkedQueue: {benchmark(LinkedQueue, n):.4f}s")
print(f"CircularQueue:{benchmark(CircularQueue, n, n):.4f}s")
Typical results:
deque: 0.068s
LinkedQueue: 0.350s
CircularQueue:0.120s
deque is fastest because it is implemented in C and its block structure is cache-friendly. CircularQueue is also decent (limited by pure Python overhead). LinkedQueue is slowest because every enqueue allocates a new Python object (node) and every dequeue frees one — Python's object creation overhead is substantial.
2.3 Applications of Double-Ended Queues
A double-ended queue (deque) allows insertion and deletion at both ends simultaneously. Python's collections.deque is itself a full double-ended queue.
Application 1: Palindrome Checking
A palindrome is a string that reads the same forwards and backwards. A deque provides an elegant palindrome check: simultaneously remove characters from both ends and compare them.
from collections import deque
def is_palindrome(s: str) -> bool:
"""Check palindrome using a deque
Idea: Place characters into a deque, then simultaneously
remove from both ends and compare. If every pair matches,
it's a palindrome.
"""
# Preprocessing: keep only alphanumeric characters, lowercase
chars = deque(c.lower() for c in s if c.isalnum())
while len(chars) > 1:
front = chars.popleft()
back = chars.pop()
if front != back:
return False
return True
print(is_palindrome("A man, a plan, a canal: Panama")) # True
print(is_palindrome("race a car")) # False
This solution has O(n) time complexity and O(n) space complexity. While the two-pointer approach achieves O(1) space, the deque method more intuitively demonstrates "double-ended operations" in concept.
Application 2: Sliding Window Maximum
This is a classic interview problem (LeetCode #239): given an array and a window size k, return the maximum value in each window.
The brute-force approach is O(nk), but using a deque achieves O(n):
from collections import deque
def max_sliding_window(nums: list, k: int) -> list:
"""Sliding window maximum — O(n)
Maintain a monotonically decreasing deque:
- The deque stores indices
- Values corresponding to indices in the deque are decreasing
- The front always holds the index of the current window's maximum
Why decreasing? Because if nums[i] >= nums[j] and i > j,
then j can never be the maximum of any future window (i is larger
and appears later). So j can be safely discarded.
"""
if not nums or k == 0:
return []
dq = deque() # stores indices, corresponding values decreasing
result = []
for i in range(len(nums)):
# Remove expired indices from the front (outside window)
while dq and dq[0] < i - k + 1:
dq.popleft()
# Remove all indices from the rear whose values <= current
while dq and nums[dq[-1]] <= nums[i]:
dq.pop()
dq.append(i)
# Once the window is formed, record the maximum
if i >= k - 1:
result.append(nums[dq[0]])
return result
nums = [1, 3, -1, -3, 5, 3, 6, 7]
print(max_sliding_window(nums, 3))
# [3, 3, 5, 5, 6, 7]
This algorithm is also discussed in Chapter 6 (Linked Lists and Sliding Windows). Here we present it from the queue perspective: the deque maintains a monotonically decreasing sequence of "candidate maximums." Each element enters and leaves the deque at most once, so total time complexity is O(n).
2.4 Implementing a Stack Using Queues / a Queue Using Stacks
These two problems are excellent exercises for understanding the essential difference between queues and stacks, and they are high-frequency interview questions.
Implementing a Stack with Two Queues
from collections import deque
class StackUsingQueues:
"""Implement a stack using two queues
Strategy: Make push bear O(n) cost, pop is O(1).
On push: place the new element into empty queue q2, then transfer all
elements from q1 to q2 one by one. Finally swap q1 and q2.
This ensures q1's front is always the most recently pushed element.
"""
def __init__(self):
self._q1 = deque() # main queue
self._q2 = deque() # auxiliary queue
def push(self, x):
"""O(n) — place new element at the front"""
self._q2.append(x)
while self._q1:
self._q2.append(self._q1.popleft())
self._q1, self._q2 = self._q2, self._q1
def pop(self):
"""O(1) — front of queue is the stack top"""
if not self._q1:
raise IndexError("pop from empty stack")
return self._q1.popleft()
def top(self):
"""O(1)"""
if not self._q1:
raise IndexError("top from empty stack")
return self._q1[0]
def empty(self):
return len(self._q1) == 0
Implementing a Queue with Two Stacks (Amortized O(1))
This is more interesting because it demonstrates the power of amortized analysis:
class QueueUsingStacks:
"""Implement a queue using two stacks — amortized O(1)
Strategy:
- in_stack: handles enqueue (push to in_stack)
- out_stack: handles dequeue (pop from out_stack)
When out_stack is empty, transfer all elements from in_stack to out_stack.
After transfer, out_stack's top is the earliest-enqueued element.
Amortized analysis: each element is pushed at most twice (into in_stack once,
into out_stack once) and popped at most twice (from in_stack once, from
out_stack once). So n operations cost O(n) total, amortized O(1) each.
"""
def __init__(self):
self._in_stack = [] # enqueue end
self._out_stack = [] # dequeue end
def enqueue(self, x):
"""O(1) — just push to in_stack"""
self._in_stack.append(x)
def dequeue(self):
"""Amortized O(1) — pop from out_stack"""
if not self._out_stack:
if not self._in_stack:
raise IndexError("dequeue from empty queue")
# Transfer in_stack to out_stack
while self._in_stack:
self._out_stack.append(self._in_stack.pop())
return self._out_stack.pop()
def peek(self):
"""Amortized O(1)"""
if not self._out_stack:
if not self._in_stack:
raise IndexError("peek from empty queue")
while self._in_stack:
self._out_stack.append(self._in_stack.pop())
return self._out_stack[-1]
def is_empty(self):
return not self._in_stack and not self._out_stack
# Verification
q = QueueUsingStacks()
q.enqueue(1)
q.enqueue(2)
q.enqueue(3)
print(q.dequeue()) # 1 — FIFO!
q.enqueue(4)
print(q.dequeue()) # 2
print(q.dequeue()) # 3
print(q.dequeue()) # 4
The key insight of amortized analysis: Although a single dequeue in the worst case is O(n) (transferring all elements from in_stack to out_stack), each element is transferred only once. So n dequeue operations cost O(n) total, amortized O(1) each. This is the same analysis approach as dynamic array's amortized O(1) append.
2.5 Multi-Source BFS
Standard BFS starts from a single source. Multi-source BFS starts from multiple sources simultaneously, expanding layer by layer. Think of it as simultaneously releasing water from multiple cities on a map — the water fronts expand outward and eventually meet.
Classic problem: Given a 0/1 matrix where 0 represents empty land and 1 represents buildings, find the distance from each empty cell to the nearest building.
from collections import deque
def distance_to_nearest_building(grid):
"""Multi-source BFS: compute distance from each cell to nearest building
Idea: Place all buildings (value 1) into the queue simultaneously as
starting points, then BFS outward. The level number when each empty
cell is first visited is its distance to the nearest building.
"""
rows, cols = len(grid), len(grid[0])
distances = [[float('inf')] * cols for _ in range(rows)]
queue = deque()
# All buildings are starting points
for r in range(rows):
for c in range(cols):
if grid[r][c] == 1:
queue.append((r, c, 0))
distances[r][c] = 0
# BFS
directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
while queue:
r, c, dist = queue.popleft()
for dr, dc in directions:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols and distances[nr][nc] == float('inf'):
distances[nr][nc] = dist + 1
queue.append((nr, nc, dist + 1))
return distances
grid = [
[0, 0, 0, 1],
[0, 0, 0, 0],
[1, 0, 0, 0],
[0, 0, 0, 0]
]
result = distance_to_nearest_building(grid)
for row in result:
print(row)
# [2, 3, 2, 0]
# [1, 2, 2, 1]
# [0, 1, 2, 2]
# [1, 2, 3, 3]
Multi-source BFS has the same time complexity as single-source BFS: O(V + E) — because each node is still visited at most once. Its elegance lies in: by placing multiple starting points into the initial queue, it reduces a "multiple single-source BFS" problem of O(k(V+E)) to a single O(V+E) pass.
Similar applications include LeetCode #542 (01 Matrix) and "Rotting Oranges" (#994).
Level 3 · How the Specification Defines It
3.1 Queues in Operating Systems
Queues are among the most frequently used data structures in operating systems. Virtually every scenario that requires "waiting in line to be processed" employs some form of queue.
Process Scheduling Queues
The way operating systems manage the CPU is by maintaining a set of queues. Taking Linux's O(1) scheduler (before CFS) as an example (Ingo Molnar, 2003):
- Ready Queue: All processes ready to run but not yet holding the CPU are lined up here.
- Multilevel Feedback Queue (MLFQ): Proposed by Fernando Corbato in 1962 (Corbato later received the Turing Award partly for this work), it uses multiple queues with different priorities to balance the needs of interactive and batch processes.
- Real-time Scheduling Queue: Used for hard or soft real-time tasks, typically ordered by priority (effectively a priority queue).
Linux O(1) Scheduler Structure (simplified):
Priority Level 0: [P1] → [P5] → [P9] ← highest priority
Priority Level 1: [P2] → [P7]
Priority Level 2: [P3] → [P4] → [P8] → [P6]
...
Priority Level 139: [P10] ← lowest priority
Each priority level is a FIFO queue.
The scheduler selects the front process from the highest non-empty priority.
Why use queues rather than other structures? Because queues naturally guarantee fairness: within the same priority level, processes that arrived first run first. This is the most basic mechanism for preventing starvation.
I/O Request Queues
When multiple processes simultaneously request disk reads/writes, the operating system places these requests into an I/O queue. Disk scheduling algorithms (SCAN, C-SCAN, Deadline, etc.) select the next request to service from this queue.
Linux's Block I/O layer uses struct request_queue to manage I/O requests. Each block device (e.g., /dev/sda) has its own request queue.
// Simplified structure of a request queue in the Linux kernel
struct request_queue {
struct list_head queue_head; // request linked list
struct elevator_type *elevator; // scheduling algorithm (elevator algorithm)
unsigned long nr_requests; // max requests in queue
// ...
};
Why do I/O queues typically have a maximum length? If the queue could grow without bound, a runaway process could issue massive numbers of I/O requests, exhausting all memory. The queue length limit (typically 128 or 256 by default) provides a backpressure mechanism: when the queue is full, new requests are blocked, throttling the request generation rate.
3.2 The Concept of Message Queues
From in-process queues to inter-process queues to inter-machine queues, the concept of a queue naturally extends to message queues.
Message queues are core infrastructure in distributed systems. They decouple message producers from consumers, allowing them to operate at different rates.
Evolution from single-machine queues to distributed message queues:
- In-thread queues:
queue.Queue(Python),BlockingQueue(Java) - Inter-process queues: POSIX message queues (
mq_open), System V message queues - Inter-machine message queues: RabbitMQ, Apache Kafka, Redis Streams, Amazon SQS
Basic message queue model:
Producer → [Message Queue] → Consumer
Key properties:
- Persistence: Messages written to disk survive broker restarts
- Ordering: Messages within the same partition maintain FIFO order
- Acknowledgment: Consumers send ACK after processing; otherwise messages re-enter the queue
- Backpressure: When the queue is full, producers are throttled or rejected
Apache Kafka's Log Structure (Jay Kreps et al., LinkedIn, 2011):
Kafka's core abstraction is not a traditional "queue" but a distributed append-only log. Each partition is essentially a FIFO queue, and consumers track their reading position via offsets. This design enables Kafka to:
- Support multiple consumers independently consuming the same data
- Allow consumers to rewind to any offset and re-consume
- Write sequentially to disk, achieving extremely high throughput (millions of messages per second)
We will discuss Kafka's detailed architecture in the dedicated message queue book.
3.3 Blocking Queues and the Producer-Consumer Model
The Producer-Consumer Pattern is one of the most fundamental design patterns in concurrent programming. Its core component is a bounded blocking queue:
- Bounded: The queue has a maximum capacity
- Blocking: When full, enqueue blocks and waits; when empty, dequeue blocks and waits
import threading
import queue
import time
import random
def producer(q: queue.Queue, name: str):
"""Producer: generates data and places it in the queue"""
for i in range(10):
item = f"{name}-item-{i}"
q.put(item) # automatically blocks if queue is full
print(f"[{name}] Produced: {item}")
time.sleep(random.uniform(0.01, 0.1))
def consumer(q: queue.Queue, name: str):
"""Consumer: takes data from the queue and processes it"""
while True:
try:
item = q.get(timeout=1) # wait up to 1 second
print(f" [{name}] Consumed: {item}")
time.sleep(random.uniform(0.05, 0.15)) # simulate processing
q.task_done()
except queue.Empty:
print(f" [{name}] No more items, exiting.")
break
# Create a bounded queue with capacity 5
bounded_queue = queue.Queue(maxsize=5)
# Start producers and consumers
producers = [
threading.Thread(target=producer, args=(bounded_queue, f"P{i}"))
for i in range(2)
]
consumers = [
threading.Thread(target=consumer, args=(bounded_queue, f"C{i}"))
for i in range(3)
]
for t in producers + consumers:
t.start()
for t in producers:
t.join()
bounded_queue.join() # wait for all tasks to complete
Implementation principles of a blocking queue:
Python's queue.Queue uses condition variables (threading.Condition) internally to implement blocking:
# Core logic of queue.Queue (simplified)
class SimpleBlockingQueue:
def __init__(self, maxsize):
self._queue = deque()
self._maxsize = maxsize
self._mutex = threading.Lock()
self._not_full = threading.Condition(self._mutex)
self._not_empty = threading.Condition(self._mutex)
def put(self, item):
with self._not_full:
while len(self._queue) >= self._maxsize:
self._not_full.wait() # block when full
self._queue.append(item)
self._not_empty.notify() # wake up waiting consumers
def get(self):
with self._not_empty:
while len(self._queue) == 0:
self._not_empty.wait() # block when empty
item = self._queue.popleft()
self._not_full.notify() # wake up waiting producers
return item
Why bounded? Unbounded queues seem convenient, but they are a hazard in production. If consumers cannot keep up with producers, an unbounded queue grows without limit, eventually exhausting memory and causing OOM (Out Of Memory). The backpressure mechanism of bounded queues forces producers to slow down when consumers are overwhelmed — this is critical for system stability.
This principle holds in distributed systems too: Kafka partitions have capacity limits, TCP has receive windows, HTTP/2 has flow control — all of these are essentially applications of "bounded queue + backpressure."
3.4 Multiple Implementations of Priority Queues Compared
The priority queue abstraction is simple: insert(item, priority) and extract_min(). But it has multiple implementations with vastly different performance:
| Implementation | insert | extract_min | peek | Notes |
|---|---|---|---|---|
| Unsorted array | O(1) | O(n) | O(n) | Fast insert, slow extract |
| Sorted array | O(n) | O(1) | O(1) | Fast extract, slow insert |
| Binary heap | O(log n) | O(log n) | O(1) | Balanced tradeoff |
| Fibonacci heap | O(1) amortized | O(log n) amortized | O(1) | Theoretically optimal, complex implementation |
Why is the binary heap the default choice in practice?
- Balanced time complexity: Both insert and extract are O(log n), with no glaring weakness.
- Good spatial locality: A binary heap can be implemented using an array (no pointers needed), making it CPU cache-friendly.
- Simple implementation: Only needs sift_up and sift_down operations.
- Small constant factors: Compared to balanced trees like red-black trees, heap operations are simpler with smaller constants.
Michael Fredman and Robert Tarjan proposed the Fibonacci heap in their 1987 paper "Fibonacci Heaps and Their Uses in Improved Network Optimization Algorithms," optimizing decrease_key to amortized O(1), which is significant for Dijkstra's algorithm. In practice, however, due to large constant factors and implementation complexity, Fibonacci heaps only offer advantages when the number of edges far exceeds the number of vertices.
# Comparison of three priority queue implementations
class UnsortedPQ:
"""Priority queue using unsorted array"""
def __init__(self):
self._data = []
def insert(self, item, priority):
self._data.append((priority, item)) # O(1)
def extract_min(self):
if not self._data:
raise IndexError("empty")
min_idx = 0
for i in range(1, len(self._data)):
if self._data[i][0] < self._data[min_idx][0]:
min_idx = i
# Swap min element with last, then pop — O(n)
self._data[min_idx], self._data[-1] = self._data[-1], self._data[min_idx]
return self._data.pop()[1]
class SortedPQ:
"""Priority queue using sorted array (descending by priority)"""
def __init__(self):
self._data = [] # sorted descending, smallest at end
def insert(self, item, priority):
# Binary search for insertion position — O(log n) search + O(n) shift
import bisect
bisect.insort(self._data, (priority, item))
def extract_min(self):
if not self._data:
raise IndexError("empty")
return self._data.pop(0)[1] # O(n) shift
import heapq
class HeapPQ:
"""Priority queue using binary heap"""
def __init__(self):
self._heap = []
self._counter = 0
def insert(self, item, priority):
heapq.heappush(self._heap, (priority, self._counter, item)) # O(log n)
self._counter += 1
def extract_min(self):
if not self._heap:
raise IndexError("empty")
return heapq.heappop(self._heap)[2] # O(log n)
When to use which?
- If you have only a few elements (< 100), anything works — even O(n) scanning is fast.
- If inserts far outnumber extracts (e.g., collecting lots of data then extracting the smallest few), the unsorted array's O(1) insert may be better.
- In the general case, the binary heap is the best choice. In Python, use the
heapqmodule directly.
We will cover heap internals, heap sort, and the heapq module's implementation in detail in Chapter 12.
Level 4 · Edge Cases and Pitfalls
4.1 High-Frequency Interview Problems
LeetCode #622: Design Circular Queue
class MyCircularQueue:
"""
LeetCode 622: Design Circular Queue
Key points:
1. Array + front/rear pointers
2. Distinguish full from empty: sacrifice one slot, or use a size counter
3. Modulo operation achieves circularity
"""
def __init__(self, k: int):
self._data = [0] * (k + 1)
self._front = 0
self._rear = 0
self._cap = k + 1
def enQueue(self, value: int) -> bool:
if self.isFull():
return False
self._data[self._rear] = value
self._rear = (self._rear + 1) % self._cap
return True
def deQueue(self) -> bool:
if self.isEmpty():
return False
self._front = (self._front + 1) % self._cap
return True
def Front(self) -> int:
if self.isEmpty():
return -1
return self._data[self._front]
def Rear(self) -> int:
if self.isEmpty():
return -1
return self._data[(self._rear - 1 + self._cap) % self._cap]
def isEmpty(self) -> bool:
return self._front == self._rear
def isFull(self) -> bool:
return (self._rear + 1) % self._cap == self._front
Interview traps:
Rear()returns the rear element, not the rear pointer position.rearpoints to "the next writable position," so the rear element is at(rear - 1 + cap) % cap.- Edge case:
rear - 1can be negative (when rear = 0), so add cap before taking modulo to guarantee correctness.
LeetCode #232: Implement Queue using Stacks
class MyQueue:
"""
LeetCode 232: Implement Queue using Stacks
Two stacks: in_stack and out_stack.
push → in_stack
pop/peek → out_stack (transfer from in_stack when empty)
Amortized O(1). In interviews, be prepared to clearly explain the amortized analysis.
"""
def __init__(self):
self._in = []
self._out = []
def push(self, x: int) -> None:
self._in.append(x)
def pop(self) -> int:
self._transfer()
return self._out.pop()
def peek(self) -> int:
self._transfer()
return self._out[-1]
def empty(self) -> bool:
return not self._in and not self._out
def _transfer(self):
"""Transfer all elements from in to out when out is empty"""
if not self._out:
while self._in:
self._out.append(self._in.pop())
Interview follow-ups:
- "Why is it amortized O(1)?" — Explain using the potential method or accounting method: each element is transferred at most once, so n operations cost O(n) total.
- "What if we need strictly O(1) pop?" — Impossible with only stacks. You could use two queues to rearrange during push (but then push is O(n)).
LeetCode #933: Number of Recent Calls
from collections import deque
class RecentCounter:
"""
LeetCode 933: Number of Recent Calls
Maintain a queue storing timestamps of requests within the past 3000 ms.
On each new request, remove all expired timestamps.
This is a classic application of queues as "sliding time windows."
"""
def __init__(self):
self._queue = deque()
def ping(self, t: int) -> int:
self._queue.append(t)
# Remove all requests older than 3000ms
while self._queue[0] < t - 3000:
self._queue.popleft()
return len(self._queue)
# Example
rc = RecentCounter()
print(rc.ping(1)) # 1 — window [-2999, 1], only 1
print(rc.ping(100)) # 2 — window [-2900, 100], has 1, 100
print(rc.ping(3001)) # 3 — window [1, 3001], has 1, 100, 3001
print(rc.ping(3002)) # 3 — window [2, 3002], has 100, 3001, 3002
4.2 deque vs list: Comprehensive Performance Comparison
Let us perform a thorough performance comparison:
import time
from collections import deque
def test_append_right(n):
"""Right-end append: list vs deque"""
# list
lst = []
start = time.perf_counter()
for i in range(n):
lst.append(i)
t_list = time.perf_counter() - start
# deque
dq = deque()
start = time.perf_counter()
for i in range(n):
dq.append(i)
t_deque = time.perf_counter() - start
return t_list, t_deque
def test_append_left(n):
"""Left-end append: list vs deque"""
# list
lst = []
start = time.perf_counter()
for i in range(n):
lst.insert(0, i) # O(n)!
t_list = time.perf_counter() - start
# deque
dq = deque()
start = time.perf_counter()
for i in range(n):
dq.appendleft(i) # O(1)
t_deque = time.perf_counter() - start
return t_list, t_deque
def test_pop_left(n):
"""Left-end pop: list vs deque"""
# list
lst = list(range(n))
start = time.perf_counter()
for i in range(n):
lst.pop(0) # O(n)!
t_list = time.perf_counter() - start
# deque
dq = deque(range(n))
start = time.perf_counter()
for i in range(n):
dq.popleft() # O(1)
t_deque = time.perf_counter() - start
return t_list, t_deque
def test_random_access(n):
"""Random access: list vs deque"""
lst = list(range(n))
dq = deque(range(n))
import random
indices = [random.randint(0, n-1) for _ in range(10000)]
start = time.perf_counter()
for i in indices:
_ = lst[i] # O(1)
t_list = time.perf_counter() - start
start = time.perf_counter()
for i in indices:
_ = dq[i] # O(n)!
t_deque = time.perf_counter() - start
return t_list, t_deque
n = 100_000
print("=" * 60)
print(f"Performance comparison: list vs deque (n={n:,})")
print("=" * 60)
t_list, t_deque = test_append_right(n)
print(f"Append right: list={t_list:.4f}s deque={t_deque:.4f}s "
f"winner={'tie' if abs(t_list-t_deque)/max(t_list,t_deque) < 0.2 else 'list' if t_list < t_deque else 'deque'}")
t_list, t_deque = test_append_left(n)
print(f"Append left: list={t_list:.4f}s deque={t_deque:.4f}s "
f"deque is {t_list/t_deque:.0f}x faster")
t_list, t_deque = test_pop_left(n)
print(f"Pop left: list={t_list:.4f}s deque={t_deque:.4f}s "
f"deque is {t_list/t_deque:.0f}x faster")
t_list, t_deque = test_random_access(n)
print(f"Random access: list={t_list:.4f}s deque={t_deque:.4f}s "
f"list is {t_deque/t_list:.0f}x faster")
Typical results:
============================================================
Performance comparison: list vs deque (n=100,000)
============================================================
Append right: list=0.0052s deque=0.0048s winner=tie
Append left: list=2.1500s deque=0.0048s deque is 448x faster
Pop left: list=1.8900s deque=0.0041s deque is 461x faster
Random access: list=0.0003s deque=0.2100s list is 700x faster
Summary of findings:
| Operation | list | deque | Winner |
|---|---|---|---|
| Right-end append/pop | O(1) amortized | O(1) | Tie |
| Left-end insert/pop | O(n) | O(1) | deque |
Random access [i] |
O(1) | O(n) | list |
| Memory contiguity | Contiguous | Block-based | list slightly better |
Selection rules:
- Only operating at the right end → use list (simpler, O(1) random access)
- Need left-end operations → use deque
- Need random access → use list
- Implementing a queue → always use deque
4.3 When to Use a Queue vs. When to Use a Stack
This is a seemingly simple but genuinely deep question. The core difference lies in the order of search/processing:
| Property | Stack | Queue |
|---|---|---|
| Order | LIFO — last in, first out | FIFO — first in, first out |
| Search strategy | DFS — depth-first | BFS — breadth-first |
| Search behavior | Go deep first, then backtrack | Expand layer by layer |
| Finds shortest path? | Not guaranteed | Guaranteed (unweighted graphs) |
| Space complexity | O(max depth) | O(max width) |
| Typical applications | Recursion, backtracking, bracket matching | Scheduling, level-order traversal, shortest path |
DFS vs BFS selection guide:
- Finding shortest path → BFS. DFS may find a very long path, while BFS guarantees the shortest.
- Checking reachability (only need to know if reachable) → Either works. DFS typically uses less space (recursion stack).
- Search space is very deep but answer is shallow → BFS finds it faster.
- Search space is very wide but answer is deep → DFS saves space.
- Need to enumerate all possibilities → Either works, depends on problem structure.
from collections import deque
def solve_maze_bfs(maze, start, end):
"""BFS maze solving — guaranteed shortest path"""
rows, cols = len(maze), len(maze[0])
visited = {start}
queue = deque([(start, [start])])
while queue:
(r, c), path = queue.popleft()
if (r, c) == end:
return path # First arrival is guaranteed shortest!
for dr, dc in [(0,1), (0,-1), (1,0), (-1,0)]:
nr, nc = r + dr, c + dc
if (0 <= nr < rows and 0 <= nc < cols
and maze[nr][nc] == 0 and (nr, nc) not in visited):
visited.add((nr, nc))
queue.append(((nr, nc), path + [(nr, nc)]))
return None # no solution
def solve_maze_dfs(maze, start, end):
"""DFS maze solving — finds a path, but not guaranteed shortest"""
rows, cols = len(maze), len(maze[0])
visited = set()
def dfs(r, c, path):
if (r, c) == end:
return path
visited.add((r, c))
for dr, dc in [(0,1), (0,-1), (1,0), (-1,0)]:
nr, nc = r + dr, c + dc
if (0 <= nr < rows and 0 <= nc < cols
and maze[nr][nc] == 0 and (nr, nc) not in visited):
result = dfs(nr, nc, path + [(nr, nc)])
if result:
return result
return None
return dfs(start[0], start[1], [start])
# 0 = path, 1 = wall
maze = [
[0, 0, 1, 0, 0],
[0, 1, 0, 1, 0],
[0, 0, 0, 0, 0],
[1, 1, 0, 1, 0],
[0, 0, 0, 1, 0]
]
bfs_path = solve_maze_bfs(maze, (0, 0), (4, 4))
dfs_path = solve_maze_dfs(maze, (0, 0), (4, 4))
print(f"BFS path length: {len(bfs_path)}") # shortest
print(f"DFS path length: {len(dfs_path)}") # may be longer
Another practical consideration: recursion depth limits.
Python's default recursion depth limit is 1000. If your DFS search space exceeds 1000 levels, recursive DFS will raise RecursionError. Solutions:
sys.setrecursionlimit(larger_number)— Simple but inelegant; may cause stack overflow.- Use an explicit stack instead of recursion — Convert DFS to an iterative version with a manually managed stack.
- Switch to BFS — If the problem permits.
def dfs_iterative(graph, start):
"""Iterative DFS — using explicit stack instead of recursion"""
visited = set()
stack = [start]
while stack:
node = stack.pop() # stack: last in, first out
if node in visited:
continue
visited.add(node)
print(node, end=" ")
for neighbor in graph[node]:
if neighbor not in visited:
stack.append(neighbor)
return visited
Note: Iterative DFS and recursive DFS may not have exactly the same traversal order (depending on the order neighbors are pushed), but both are depth-first.
4.4 Summary of Edge Cases and Pitfalls
Pitfall 1: deque's maxlen parameter
from collections import deque
# maxlen automatically discards elements from the other end!
dq = deque(maxlen=3)
dq.append(1)
dq.append(2)
dq.append(3)
dq.append(4) # 1 is automatically discarded!
print(dq) # deque([2, 3, 4], maxlen=3)
dq.appendleft(0) # 4 is automatically discarded!
print(dq) # deque([0, 2, 3], maxlen=3)
This feature can be used to implement a "fixed-size sliding window," but if you do not expect this behavior, it can lead to hard-to-find bugs.
Pitfall 2: deque in multithreaded environments
collections.deque's append() and popleft() are thread-safe in CPython due to the GIL (individual operations are atomic). But this does not mean your code is thread-safe:
from collections import deque
dq = deque()
# ❌ This code is NOT thread-safe!
if len(dq) > 0: # Another thread may dequeue between check and operation
item = dq.popleft() # dq may already be empty
# ✅ Correct approach: use try/except
try:
item = dq.popleft()
except IndexError:
item = None
# ✅ Or use queue.Queue (has built-in locking)
import queue
q = queue.Queue()
Pitfall 3: Forgetting to mark visited in BFS
# ❌ Wrong: marking when dequeuing
while queue:
node = queue.popleft()
visited.add(node) # Too late! The same node may have been enqueued multiple times
for neighbor in graph[node]:
if neighbor not in visited:
queue.append(neighbor)
# ✅ Correct: marking when enqueuing
while queue:
node = queue.popleft()
for neighbor in graph[node]:
if neighbor not in visited:
visited.add(neighbor) # Mark when enqueuing
queue.append(neighbor)
Marking when dequeuing allows the same node to be enqueued multiple times (multiple neighbors point to it), wasting space and time. Marking when enqueuing ensures each node is enqueued at most once.
Pitfall 4: Size calculation in circular queues
# When rear >= front: size = rear - front
# When rear < front: size = rear - front + capacity
# Unified formula:
size = (rear - front + capacity) % capacity
This formula yields 0 when rear == front (empty) and capacity - 1 when (rear + 1) % capacity == front (full, since we sacrificed one slot).
Summary
A queue is the mathematical embodiment of "fairness." When you need to guarantee "first come, first served" semantics, a queue is your choice.
Core takeaways from this chapter:
- Use deque, not list — This is the most important queue practice in Python.
- Circular queues use modulo — Understanding
(index + 1) % capacitymeans understanding the essence of circularity. - BFS uses queues — Queues guarantee "layer-by-layer expansion" order, naturally finding shortest paths.
- Priority queues use heaps — Detailed in Chapter 12.
- Blocking queues decouple producers and consumers — This is the cornerstone of concurrent programming and distributed systems.
Queues appear simple, but their applications span from OS scheduling to distributed messaging systems, threading through every layer of abstraction in computer science. Understanding queues means understanding the fundamental computational pattern of "ordered processing."
Next steps:
- Chapter 10: Graph Traversal (complete coverage of BFS and DFS)
- Chapter 12: Heaps and Priority Queues
- Chapter 16: Advanced Graph Algorithms (Dijkstra and other weighted shortest paths)