Problem Overview
LeetCode 72 Edit Distance is one of the most celebrated dynamic programming problems. Given two strings word1 and word2, you must find the minimum number of operations required to convert word1 into word2. The allowed operations are: insert a character, delete a character, or replace a character. The classic example is converting "horse" to "ros" in 3 operations.
Edit distance appears throughout computer science: spell checkers use it to suggest corrections, diff tools use it to compare files line by line, DNA sequence alignment uses it to measure similarity between genomes, and natural language processing uses it for fuzzy string matching. Understanding LeetCode 72 gives you a window into all of these applications.
The problem is tagged Hard on LeetCode but the DP structure is clean and follows a consistent pattern. Once you internalize the three transitions — insert, delete, replace — and understand how they map to movements in the 2D table, the solution writes itself. The key challenge is setting up the base cases and avoiding off-by-one errors.
- Given two strings word1 and word2, return the minimum number of operations to convert word1 to word2
- Allowed operations: insert a character, delete a character, replace a character
- Each operation counts as 1 step regardless of which character is involved
- Classic example: "horse" → "ros" requires 3 operations (edit distance = 3)
- Constraints: 0 <= word1.length, word2.length <= 500
- Both strings consist of lowercase English letters only
2D DP Table
Define dp[i][j] as the minimum number of operations needed to convert word1[0..i-1] into word2[0..j-1] — that is, the first i characters of word1 into the first j characters of word2. The table has (m+1) rows and (n+1) columns, where m = len(word1) and n = len(word2). The extra row and column handle the base cases of empty prefixes.
The recurrence has two cases. If word1[i-1] == word2[j-1] (the current characters match), no operation is needed for this character pair and dp[i][j] = dp[i-1][j-1]. If the characters differ, we take the minimum of three sub-problems and add 1: dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]). Each of those three neighbors corresponds to a different operation.
Fill the table row by row from top-left to bottom-right. Every cell depends only on the cell directly above, to the left, and diagonally above-left — all of which are already computed when you reach cell (i, j). The final answer is dp[m][n], the bottom-right corner representing the full transformation of word1 into word2.
Three Operations Map Directly to Three DP Transitions
The three DP neighbors map directly to the three operations: dp[i][j-1]+1 is insert (we insert a character into word1 to match word2[j-1], advancing j by 1), dp[i-1][j]+1 is delete (we delete word1[i-1] from word1, advancing i by 1), and dp[i-1][j-1]+1 is replace (we replace word1[i-1] with word2[j-1], advancing both i and j). When characters match, dp[i-1][j-1]+0 — a match costs nothing.
Base Cases
The base cases correspond to converting between a non-empty string and an empty string. dp[0][j] = j means: to convert an empty word1 prefix into the first j characters of word2, you need exactly j insert operations — insert each of the j characters one by one. dp[i][0] = i means: to convert the first i characters of word1 into an empty word2 prefix, you need exactly i delete operations.
Concretely, the first row dp[0][0..n] is initialized to [0, 1, 2, ..., n] and the first column dp[0..m][0] is initialized to [0, 1, 2, ..., m]. These represent the cost of going from or to an empty string. Every other cell in the table is filled by the recurrence after these borders are set.
A common mistake is initializing the wrong dimension or forgetting the extra row and column. Remember: the dp table is (m+1) x (n+1), not m x n. Row 0 represents an empty word1 (zero characters), not the first character. Column 0 represents an empty word2. Getting this right is the foundation for the rest of the solution.
- 1Create a (m+1) x (n+1) table dp initialized to zeros
- 2Initialize first row: dp[0][j] = j for j from 0 to n (cost to insert j characters into empty string)
- 3Initialize first column: dp[i][0] = i for i from 0 to m (cost to delete i characters down to empty string)
- 4For i from 1 to m, for j from 1 to n:
- 5 If word1[i-1] == word2[j-1]: dp[i][j] = dp[i-1][j-1] (characters match, zero cost)
- 6 Else: dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) (delete, insert, or replace)
- 7Return dp[m][n]
Connection to LCS
Edit distance and longest common subsequence (LCS) are deeply related problems. When only insert and delete operations are allowed (no replace), the minimum edit distance equals m + n - 2 * LCS(word1, word2). The intuition is that we keep all LCS characters in place, delete the non-LCS characters from word1, and insert the non-LCS characters from word2. The LCS characters never need to be touched.
When replace operations are allowed (the full edit distance), the relationship is more subtle. Replace is strictly more powerful than a delete followed by an insert (two operations reduced to one), so the edit distance with replace is always less than or equal to the insert/delete-only distance. The DP structures for both problems are analogous: both use 2D tables indexed by prefix lengths, and both have a character-match case that costs zero.
Understanding this connection clarifies why the edit distance DP feels similar to the LCS DP. In LCS, the match case takes dp[i-1][j-1]+1 (extend the sequence); in edit distance, the match case takes dp[i-1][j-1]+0 (no cost). The mismatch cases differ: LCS takes max of two neighbors, edit distance takes min of three neighbors plus 1. Same skeleton, different objective and cost structure.
Edit Distance is Levenshtein Distance — Used Everywhere in CS
Edit distance is also called Levenshtein distance after Vladimir Levenshtein who described it in 1965. It is used in spell checkers (suggest the word with minimum edit distance to the typo), DNA sequence alignment (measure genetic similarity), git diff (compute line-level changes between files), and NLP (measure string similarity for fuzzy matching). Understanding LeetCode 72 opens doors to all of these real-world algorithms — it is far more than a LeetCode problem.
Space Optimization
The 2D DP table uses O(m*n) space, but notice that when filling row i, you only need row i-1 and the current row being built. No earlier rows are referenced. This allows a space optimization to O(n) using a single 1D array representing the previous row, plus a variable to track the diagonal (dp[i-1][j-1]) before it is overwritten.
The trick with the diagonal: as you scan left to right updating dp[j], dp[j] before the update holds the value from the previous row (that is, dp[i-1][j]), and dp[j-1] already holds the current row value to the left (dp[i][j-1]). The diagonal dp[i-1][j-1] must be saved in a variable prev before dp[j-1] gets updated. After the update, store the old dp[j] in prev for the next column.
If you want to further optimize, observe that if m > n you can swap word1 and word2 (edit distance is symmetric — the same either way) to keep only O(min(m,n)) space in the array. This is the most space-efficient version: one rolling array of length min(m,n)+1 and one scalar variable for the diagonal.
- 1Initialize dp = [0, 1, 2, ..., n] (the base row for empty word1)
- 2For each row i from 1 to m:
- 3 Set prev = dp[0] (save diagonal before overwriting), then set dp[0] = i (base column value)
- 4 For each column j from 1 to n:
- 5 Save temp = dp[j] (this is dp[i-1][j], the old value before overwrite)
- 6 If word1[i-1] == word2[j-1]: dp[j] = prev (prev holds dp[i-1][j-1])
- 7 Else: dp[j] = 1 + min(dp[j], dp[j-1], prev)
- 8 Set prev = temp (prev becomes dp[i-1][j] for the next column)
- 9Return dp[n]
Code Walkthrough — Python and Java
Python 2D DP: m, n = len(word1), len(word2); dp = [[0]*(n+1) for _ in range(m+1)]; initialize first row and column; nested loops filling the rest. Python is expressive enough that the entire 2D version fits in under 10 lines. For the 1D version: dp = list(range(n+1)); outer loop over i; prev = dp[0], dp[0] = i; inner loop saving prev, updating dp[j]; return dp[n].
Java 2D DP: int[][] dp = new int[m+1][n+1]; initialize borders with for loops; nested for loops with the three-way min using Math.min(Math.min(dp[i-1][j], dp[i][j-1]), dp[i-1][j-1])+1; return dp[m][n]. Java 1D DP follows the same prev variable pattern as Python but requires explicit temp variables. Both versions are O(m*n) time; the 1D version is O(n) space.
The 2D version is recommended for interviews because it is easier to explain and trace. Write the table on a whiteboard, fill in base cases, then demonstrate the recurrence on one or two cells. The 1D version is a follow-up optimization you mention after nailing the 2D solution. Both Python and Java implementations should be practiced until you can write them from memory without looking up the recurrence.
Match Case is dp[i-1][j-1] With Cost Zero — NOT dp[i-1][j-1]+1
When word1[i-1] == word2[j-1], the correct transition is dp[i][j] = dp[i-1][j-1] with cost zero — NOT dp[i-1][j-1]+1. A character match means no operation is needed; you simply inherit the cost of converting the shorter prefixes. Confusing this with the replace case (which adds 1) doubles your edit distance for every matching character and produces completely wrong answers. Always write the match case separately and clearly as dp[i-1][j-1] with no addition.