Skip to content
C Codeloom
DSA

LeetCode Edit Distance: 2D DP Step by Step

LeetCode 72 fully unpacked — the three-operation recurrence, the 2D table, and the rolling-array space optimization that makes interviewers nod.

·5 min read · By Yash Kesharwani
Intermediate 10 min read

What you'll learn

  • The three-operation Levenshtein recurrence
  • How to set up the 2D base cases without bugs
  • The rolling-array trick that drops space to O(n)
  • Why this template recurs across other 2D string DPs

Prerequisites

LeetCode 72 — Edit Distance — is the gold-standard 2D DP problem. There are exactly three allowed operations, exactly three transitions, and exactly one recurrence. Once you have written it correctly once, you have written Longest Common Subsequence, Distinct Subsequences, Interleaving String, and Regular Expression Matching once too. The pattern transfers.

The Problem

Given two strings word1 and word2, return the minimum number of operations to convert word1 into word2. Each operation is one of:

  • Insert a character.
  • Delete a character.
  • Replace one character with another.

This minimum is the Levenshtein distance.

Brute Force

Recurse from the back. Let f(i, j) be the edit distance between word1[:i] and word2[:j]. If word1[i-1] == word2[j-1], no operation is needed at this position and the answer is f(i-1, j-1). Otherwise take the best of three: insert (f(i, j-1) + 1), delete (f(i-1, j) + 1), replace (f(i-1, j-1) + 1).

def minDistance(word1, word2):
    def f(i, j):
        if i == 0:
            return j
        if j == 0:
            return i
        if word1[i - 1] == word2[j - 1]:
            return f(i - 1, j - 1)
        return 1 + min(f(i - 1, j), f(i, j - 1), f(i - 1, j - 1))
    return f(len(word1), len(word2))

Without memoization this is exponential — every call branches three ways for non-matching characters.

Optimal Solution

Tabulate. Let dp[i][j] be the edit distance between the first i characters of word1 and the first j characters of word2. The base cases are dp[0][j] = j (all insertions) and dp[i][0] = i (all deletions). The recurrence matches the recursion above.

def minDistance(word1, word2):
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i - 1] == word2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[m][n]

Each row depends only on the previous row and the current row’s left neighbor, so a rolling array drops space from O(m * n) to O(n):

def minDistance(word1, word2):
    m, n = len(word1), len(word2)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            if word1[i - 1] == word2[j - 1]:
                curr[j] = prev[j - 1]
            else:
                curr[j] = 1 + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[n]

Walk Through an Example

Convert "horse" to "ros".

The DP table fills as follows (rows = horse, cols = ros):

        ''  r  o  s
  ''     0  1  2  3
  h      1  1  2  3
  o      2  2  1  2
  r      3  2  2  2
  s      4  3  3  2
  e      5  4  4  3

Final answer: dp[5][3] = 3. One witnessing edit sequence: replace h with r, delete r, delete e. Three operations.

The replace step at dp[1][1] = 1 is the most instructive cell to inspect — it pulls from dp[0][0] = 0 and adds one.

Edge Cases

  • Either string empty — answer equals the length of the other string. The base row and column handle this.
  • Identical strings — the diagonal stays at zero throughout. Worth confirming with a quick mental run.
  • Strings that share no characters — every cell except the base row/column hits the three-way min.
  • Unicode characters that are visually similar but encode differently — the algorithm operates on code points, not glyphs. If interviewing for a localization-heavy team, mention normalization.

Complexity Analysis

Time is O(m * n) for both the 2D and rolling-array versions. Space is O(m * n) for the full table and O(min(m, n)) if you allocate the rolling array along the shorter dimension. The naive recursion is exponential without memoization and O(m * n) with it.

If “O(m * n) DP” is starting to feel like a familiar shape, that is the point — see Classic DP Problems for several siblings.

How to Explain It in an Interview

Start with the operations. Define dp[i][j] precisely. Articulate base cases explicitly — these are where candidates typically blow it. Walk through the recurrence for both the match case and the mismatch case, naming each of the three transitions (delete = up, insert = left, replace = diagonal).

Mention the rolling array as the immediate optimization. If asked whether you can do better than O(m * n), say “Not in general, though there is a Hyyro-Ukkonen bit-parallel algorithm for small alphabets — not interview-relevant.” This shows breadth without rambling.

  • LeetCode 1143 — Longest Common Subsequence (same shape, different recurrence)
  • LeetCode 583 — Delete Operation for Two Strings
  • LeetCode 97 — Interleaving String
  • LeetCode 115 — Distinct Subsequences

Once you can flip between these by changing only the recurrence body, you have internalized the 2D string DP template.

Wrap up

Edit Distance is the boss fight of 2D DP and the friendliest one to prepare for, because everything generalizes from it. Practice writing the table by hand for a small example; the muscle memory of “diagonal, up, left, plus one” is what carries you through the trickier follow-ups. If the recurrence still feels foreign, anchor it in the DP introduction first and come back.