Graph Traversals: BFS and DFS
A practical guide to BFS and DFS on graphs — recursive and iterative DFS, BFS with a deque, shortest paths on unweighted graphs, connected components, cycle detection, and five classic practice problems.
What you'll learn
- ✓BFS with a deque and the visited-set pattern
- ✓DFS recursive and iterative — when to use each
- ✓Shortest path on an unweighted graph via BFS
- ✓Counting connected components
- ✓Cycle detection in directed and undirected graphs
- ✓Five classic practice problems solved end-to-end
Prerequisites
- •Read Graphs: Introduction
Once you can store a graph as an adjacency list, you can traverse it. BFS and DFS are the two universal tools — between them they solve a startling fraction of graph problems. This post covers both, the variations that come up in interviews, and a handful of classic problems where they shine.
The setup
We’ll use this graph throughout:
1 --- 2
| |
3 --- 4 --- 5
|
6
graph = {
1: [2, 3],
2: [1, 4],
3: [1, 4],
4: [2, 3, 5, 6],
5: [4],
6: [4],
}
BFS — Breadth-First Search
BFS explores the graph level by level: all vertices at distance 1 from the start, then all at distance 2, and so on. It uses a queue (FIFO) and a visited set to avoid revisiting.
from collections import deque
def bfs(graph, start):
visited = {start}
order = []
q = deque([start])
while q:
node = q.popleft()
order.append(node)
for nbr in graph[node]:
if nbr not in visited:
visited.add(nbr)
q.append(nbr)
return order
print(bfs(graph, 1)) # [1, 2, 3, 4, 5, 6]
Two things to get right every time:
- Mark a node visited when you enqueue it, not when you dequeue it. Otherwise the same node can land in the queue multiple times.
- Use
collections.deque, not a list.popleft()isO(1)on a deque,O(n)on a list.
DFS — Depth-First Search
DFS goes as deep as possible before backtracking. It uses a stack — explicit or via recursion — and the same visited set.
Recursive DFS
def dfs_recursive(graph, start):
visited = set()
order = []
def visit(node):
visited.add(node)
order.append(node)
for nbr in graph[node]:
if nbr not in visited:
visit(nbr)
visit(start)
return order
print(dfs_recursive(graph, 1)) # [1, 2, 4, 3, 5, 6] (order depends on adjacency)
This is the form that reads most naturally — it’s what most people write first. The recursion depth is the longest DFS path, which is O(V) in the worst case. Watch out for Python’s recursion limit on very large or chained graphs.
Iterative DFS
For deep graphs, an explicit stack avoids hitting the recursion limit.
def dfs_iterative(graph, start):
visited = {start}
order = []
stack = [start]
while stack:
node = stack.pop()
order.append(node)
for nbr in graph[node]:
if nbr not in visited:
visited.add(nbr)
stack.append(nbr)
return order
The only difference from BFS is stack.pop() (LIFO) instead of q.popleft() (FIFO). One byte of code, fundamentally different traversal order.
Try it yourself. Run BFS and iterative DFS on the example graph starting from vertex 4. Predict the visitation order before running. Notice how DFS often produces a long chain (4, 6, 5, …) while BFS produces a wide sweep.
Shortest path on an unweighted graph
BFS gives shortest paths by edge count “for free” — the first time you discover a vertex is the shortest way to reach it. Track parents (or distances) as you go.
def shortest_path(graph, start, end):
if start == end:
return [start]
parent = {start: None}
q = deque([start])
while q:
node = q.popleft()
for nbr in graph[node]:
if nbr not in parent:
parent[nbr] = node
if nbr == end:
# reconstruct
path = [end]
while parent[path[-1]] is not None:
path.append(parent[path[-1]])
return path[::-1]
q.append(nbr)
return None # not reachable
print(shortest_path(graph, 1, 6)) # [1, 2, 4, 6]
This works because BFS visits vertices in order of increasing distance. The first time end is discovered, the path through parent is guaranteed shortest.
This only works for unweighted graphs. For weighted graphs you need Dijkstra (non-negative weights) or Bellman-Ford (any weights).
Connected components
A connected component is a maximal set of vertices reachable from each other. Count them by running BFS/DFS from every unvisited vertex:
def count_components(graph):
visited = set()
components = 0
for start in graph:
if start in visited:
continue
components += 1
# BFS to mark the component
q = deque([start])
visited.add(start)
while q:
node = q.popleft()
for nbr in graph[node]:
if nbr not in visited:
visited.add(nbr)
q.append(nbr)
return components
print(count_components(graph)) # 1
The outer loop ensures you start a fresh traversal from every island. Each vertex is visited once across all traversals, so the whole thing is O(V + E).
Cycle detection
Undirected graphs
A cycle exists if during DFS you reach a visited vertex that isn’t your immediate parent.
def has_cycle_undirected(graph):
visited = set()
def dfs(node, parent):
visited.add(node)
for nbr in graph[node]:
if nbr not in visited:
if dfs(nbr, node):
return True
elif nbr != parent:
return True
return False
for start in graph:
if start not in visited:
if dfs(start, None):
return True
return False
The nbr != parent check is essential — without it, the edge you just came from would look like a cycle.
Directed graphs
In a directed graph, “visited” alone isn’t enough. You need three states: unseen, in the current DFS stack, fully processed. A back-edge to an ancestor on the current stack signals a cycle.
WHITE, GRAY, BLACK = 0, 1, 2
def has_cycle_directed(graph):
color = {v: WHITE for v in graph}
def dfs(node):
color[node] = GRAY
for nbr in graph.get(node, []):
if color.get(nbr, WHITE) == GRAY:
return True
if color.get(nbr, WHITE) == WHITE and dfs(nbr):
return True
color[node] = BLACK
return False
return any(color[v] == WHITE and dfs(v) for v in graph)
This three-color DFS is the standard approach. It’s also the foundation of topological sort.
Complexity
Both BFS and DFS visit each vertex and each edge a constant number of times:
- Time:
O(V + E). - Space:
O(V)for the visited set plusO(V)for the queue/stack.
This is as fast as it gets — you can’t beat looking at every vertex and every edge at least once.
Quick decision guide
| Problem | Use |
|---|---|
| Shortest path on unweighted graph | BFS |
| Level-by-level processing | BFS |
| Find any path, any reachable vertex | Either |
| Cycle detection | DFS |
| Topological sort | DFS |
| Connected components | Either |
| Path counting / all paths between two nodes | DFS |
| Memory-constrained wide graph | DFS |
| Memory-constrained deep graph | BFS |
Practice Problems
Five classic problems where these tools shine.
Problem 1: Number of Islands
Problem. Given a 2D grid of '1's (land) and '0's (water), count the number of islands. An island is land connected horizontally or vertically.
Approach. Treat each cell as a vertex; orthogonal land cells are neighbors. Flood-fill each unvisited land cell with DFS or BFS, incrementing the count.
def num_islands(grid):
if not grid:
return 0
rows, cols = len(grid), len(grid[0])
visited = set()
def dfs(r, c):
if (r, c) in visited:
return
if r < 0 or r >= rows or c < 0 or c >= cols:
return
if grid[r][c] == '0':
return
visited.add((r, c))
dfs(r+1, c); dfs(r-1, c); dfs(r, c+1); dfs(r, c-1)
count = 0
for r in range(rows):
for c in range(cols):
if grid[r][c] == '1' and (r, c) not in visited:
count += 1
dfs(r, c)
return count
grid = [
['1','1','0','0'],
['1','0','0','1'],
['0','0','1','1'],
['0','0','0','0'],
]
print(num_islands(grid)) # 2
Time O(rows * cols).
Problem 2: Clone Graph
Problem. Given a reference to a node in a connected undirected graph (each node has a value and a list of neighbors), return a deep copy of the graph.
Approach. BFS or DFS, keeping a dict that maps original nodes to their clones. When you visit a neighbor for the first time, create its clone before recursing.
class Node:
def __init__(self, val, neighbors=None):
self.val = val
self.neighbors = neighbors or []
def clone_graph(node):
if node is None:
return None
cloned = {node: Node(node.val)}
q = deque([node])
while q:
cur = q.popleft()
for nbr in cur.neighbors:
if nbr not in cloned:
cloned[nbr] = Node(nbr.val)
q.append(nbr)
cloned[cur].neighbors.append(cloned[nbr])
return cloned[node]
The key invariant: a node is in cloned if-and-only-if its clone has been created. Building neighbor links is safe once both endpoints exist.
Problem 3: Course Schedule
Problem. Given numCourses and a list of prerequisite pairs [a, b] meaning “to take a you must take b first,” return True if you can finish all courses.
Approach. Build a directed graph from prerequisites. The answer is “can finish” iff the graph has no cycle. Use the three-color DFS.
def can_finish(num_courses, prereqs):
graph = defaultdict(list)
for a, b in prereqs:
graph[b].append(a) # b -> a
WHITE, GRAY, BLACK = 0, 1, 2
color = [WHITE] * num_courses
def has_cycle(node):
color[node] = GRAY
for nbr in graph[node]:
if color[nbr] == GRAY:
return True
if color[nbr] == WHITE and has_cycle(nbr):
return True
color[node] = BLACK
return False
for v in range(num_courses):
if color[v] == WHITE and has_cycle(v):
return False
return True
print(can_finish(2, [[1, 0]])) # True
print(can_finish(2, [[1, 0], [0, 1]])) # False
Time O(V + E).
Problem 4: Pacific Atlantic Water Flow
Problem. Given a m x n matrix of heights, water can flow from a cell to any neighbor with equal or lower height. The Pacific touches the top and left edges; the Atlantic touches the bottom and right. Return all cells from which water can reach both oceans.
Approach. Reverse the question — instead of “can this cell reach an ocean,” ask “which cells can the ocean reach if water flowed uphill?” Run BFS/DFS from each ocean’s border cells with the reversed comparison. The intersection of the two reachable sets is the answer.
def pacific_atlantic(heights):
if not heights:
return []
rows, cols = len(heights), len(heights[0])
def bfs(starts):
seen = set(starts)
q = deque(starts)
while q:
r, c = q.popleft()
for dr, dc in [(1,0),(-1,0),(0,1),(0,-1)]:
nr, nc = r+dr, c+dc
if (0 <= nr < rows and 0 <= nc < cols
and (nr, nc) not in seen
and heights[nr][nc] >= heights[r][c]):
seen.add((nr, nc))
q.append((nr, nc))
return seen
pacific = [(0, c) for c in range(cols)] + [(r, 0) for r in range(rows)]
atlantic = [(rows-1, c) for c in range(cols)] + [(r, cols-1) for r in range(rows)]
return [list(cell) for cell in bfs(pacific) & bfs(atlantic)]
Time O(rows * cols).
Problem 5: Rotting Oranges
Problem. In an m x n grid, 0 is empty, 1 is a fresh orange, 2 is a rotten orange. Every minute, any fresh orange adjacent (4-directionally) to a rotten one becomes rotten. Return the minimum minutes until no fresh oranges remain, or -1 if impossible.
Approach. Multi-source BFS. Start by enqueuing all initially rotten oranges. Each BFS layer corresponds to one minute. Track fresh count; if any are left at the end, return -1.
def oranges_rotting(grid):
rows, cols = len(grid), len(grid[0])
q = deque()
fresh = 0
for r in range(rows):
for c in range(cols):
if grid[r][c] == 2:
q.append((r, c, 0))
elif grid[r][c] == 1:
fresh += 1
minutes = 0
while q:
r, c, t = q.popleft()
minutes = max(minutes, t)
for dr, dc in [(1,0),(-1,0),(0,1),(0,-1)]:
nr, nc = r+dr, c+dc
if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
grid[nr][nc] = 2
fresh -= 1
q.append((nr, nc, t+1))
return minutes if fresh == 0 else -1
The multi-source pattern — push all sources at the start — is the trick that turns “from every rotten orange” into a single linear pass.
Try it yourself. Take any of the five problems above and rewrite it using the other traversal (BFS where I used DFS, or vice-versa). Notice where one is clearly the right tool — Rotting Oranges in particular needs BFS for the layer-based timing.
Recap
You now know:
- BFS uses a deque and visits vertices in order of distance from the start
- DFS uses a stack (or recursion) and dives as deep as possible
- Mark vertices as visited when you enqueue/push, not when you pop
- BFS gives shortest paths on unweighted graphs
- DFS underpins cycle detection and topological sort; remember the three-color trick for directed cycles
- Multi-source BFS turns “from any of these starts” into a single traversal
Next steps
You now have a working toolkit for trees and graphs. The next areas to explore — weighted shortest paths with Dijkstra, minimum spanning trees with Kruskal and Prim, and topological sort for DAGs — all build directly on BFS, DFS, and adjacency lists.
Questions or feedback? Email codeloomdevv@gmail.com.