LeetCode Word Ladder: BFS on Word Graphs
LeetCode 127 walked through — the wildcard-pattern adjacency trick, why BFS is mandatory, and the bidirectional speedup.
What you'll learn
- ✓How to build an implicit graph over words
- ✓Why BFS — not DFS — gives the shortest ladder
- ✓The wildcard-pattern preprocessing trick
- ✓How bidirectional BFS cuts work roughly in half (in exponent)
Prerequisites
- •Graph basics — see Graphs: Introduction
- •BFS comfort — see BFS and DFS
LeetCode 127 — Word Ladder — is the canonical example of a problem where the graph is not given to you. You have to see one. Words are nodes, single-letter substitutions are edges, and shortest path is BFS. The interesting design choice is how you enumerate neighbors fast enough to survive a large word list.
The Problem
Given two words beginWord and endWord and a dictionary wordList, return the length of the shortest transformation sequence from beginWord to endWord such that:
- Only one letter changes per step.
- Every intermediate word is in
wordList. beginWorddoes not need to be inwordList, butendWorddoes.
Return 0 if no such sequence exists. The length includes both endpoints.
Brute Force
For every node you visit, scan the entire word list and check which words differ by exactly one letter. With dictionary size N and word length L, neighbor generation costs O(N * L) per node. Total work climbs toward O(N^2 * L).
from collections import deque
def ladderLength(beginWord, endWord, wordList):
words = set(wordList)
if endWord not in words:
return 0
q = deque([(beginWord, 1)])
visited = {beginWord}
while q:
word, dist = q.popleft()
if word == endWord:
return dist
for cand in list(words):
if sum(a != b for a, b in zip(word, cand)) == 1 and cand not in visited:
visited.add(cand)
q.append((cand, dist + 1))
return 0
Correct, but it dies on the test cases LeetCode actually ships.
Optimal Solution
Precompute a map from wildcard pattern to words. For each word, replace each letter with * and put the word into the bucket for that pattern. Two words share a bucket exactly when they differ in one position.
from collections import deque, defaultdict
def ladderLength(beginWord, endWord, wordList):
words = set(wordList)
if endWord not in words:
return 0
L = len(beginWord)
buckets = defaultdict(list)
for w in words:
for i in range(L):
buckets[w[:i] + "*" + w[i+1:]].append(w)
q = deque([(beginWord, 1)])
visited = {beginWord}
while q:
word, dist = q.popleft()
if word == endWord:
return dist
for i in range(L):
pat = word[:i] + "*" + word[i+1:]
for nbr in buckets[pat]:
if nbr not in visited:
visited.add(nbr)
q.append((nbr, dist + 1))
buckets[pat] = [] # avoid revisiting via this pattern
return 0
Clearing the bucket after use is a small but real speed win — it prevents the same pattern from being revisited deeper in the BFS.
Walk Through an Example
Take beginWord = "hit", endWord = "cog", wordList = ["hot", "dot", "dog", "lot", "log", "cog"].
- Build buckets:
"*ot"->["hot", "dot", "lot"],"h*t"->["hot"],"do*"->["dot", "dog"], and so on. - BFS from
hitwith distance 1.- Patterns for
hit:"*it","h*t","hi*". The only hit is"h*t"->"hot". Enqueue (hot, 2).
- Patterns for
- Pop
hot. Patterns hit"hot","dot","lot". Enqueue (dot, 3), (lot, 3). - Pop
dot. Patterns yield"dog". Enqueue (dog, 4). - Pop
lot. Patterns yield"log". Enqueue (log, 4). - Pop
dog. Patterns include"*og", which leads to"cog". Enqueue (cog, 5). - Pop
log."*og"bucket already cleared. - Pop
cog. Match. Return 5.
Edge Cases
endWordnot in dictionary — return 0 up front. This avoids wasted BFS.beginWord == endWord— by problem convention, the answer is 1, though some interview variants treat this as 0. Clarify before coding.- Empty word list — return 0.
- Multiple shortest ladders — return the length; the path itself is LeetCode 126.
Complexity Analysis
Let N be the dictionary size and L the word length. Bucket construction is O(N * L^2) — there are L patterns per word and each pattern is built in O(L). BFS visits each word at most once and inspects L patterns per pop, each touching a bucket of total size bounded by O(N). Time bound: O(N * L^2). Space is O(N * L^2) for the bucket map.
Bidirectional BFS — run BFS from both ends and stop when the frontiers meet — cuts the explored set roughly square root in size. Worth mentioning in interviews, though the implementation has more moving parts.
How to Explain It in an Interview
Start by saying the words form an implicit graph: edge if they differ by one letter, shortest path by BFS. Then volunteer the cost problem: “Neighbor generation is the bottleneck — naive scan is O(N) per node. I’ll preprocess wildcard patterns so neighbor lookup is O(L) per node.” Sketch the bucket map. Mention the bucket-clearing optimization.
If the interviewer asks “can you do better,” answer bidirectional BFS. If they ask about all shortest ladders (LeetCode 126), say the BFS becomes a layered graph and the reconstruction is the hard part.
Related Problems
- LeetCode 126 — Word Ladder II (return every shortest sequence)
- LeetCode 433 — Minimum Genetic Mutation (same structure, smaller alphabet)
- LeetCode 752 — Open the Lock (BFS on strings with single-position changes)
- LeetCode 815 — Bus Routes (BFS on a constructed graph)
The last one is a great follow-up because it forces you to design a different node type from the obvious one.
Wrap up
Word Ladder is the test of whether you can see the graph that is not drawn. The wildcard bucket is the tactical insight; the strategic insight is BFS for shortest path on unweighted graphs. Pair this problem with BFS and DFS and you have a complete answer to half of the graph problems on a typical interview list.