Edit Distance (Levenshtein Distance)

Find the minimum number of operations (insert, delete, replace) required to transform one string into another.

Language Selection

Choose your preferred programming language

Showing: Python

Edit Distance (Levenshtein Distance)

Problem Statement

Given two strings word1 and word2, return the minimum number of operations required to convert word1 to word2.

You have the following three operations permitted on a word:

  • Insert a character
  • Delete a character
  • Replace a character

Examples

Example 1:

Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation: 
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')

Example 2:

Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation: 
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')

Constraints

  • 0 <= word1.length, word2.length <= 500
  • word1 and word2 consist of lowercase English letters only

Approach 1: Dynamic Programming (2D Table)

Algorithm

Use a 2D DP table where dp[i][j] represents the minimum edit distance between the first i characters of word1 and the first j characters of word2.

Recurrence Relation:

  • If characters match: dp[i][j] = dp[i-1][j-1]
  • If characters don’t match: dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1])

Implementation

Python:

def min_distance(word1, word2):
    """
    Find minimum edit distance using 2D DP
    Time: O(m*n)
    Space: O(m*n)
    """
    m, n = len(word1), len(word2)
    
    # Create DP table
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Initialize base cases
    for i in range(m + 1):
        dp[i][0] = i  # word1 to empty string
    for j in range(n + 1):
        dp[0][j] = j  # empty string to word2
    
    # Fill DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(
                    dp[i-1][j],    # delete from word1
                    dp[i][j-1],    # insert into word1
                    dp[i-1][j-1]   # replace in word1
                )
    
    return dp[m][n]

Java:

class Solution {
    public int minDistance(String word1, String word2) {
        int m = word1.length(), n = word2.length();
        int[][] dp = new int[m + 1][n + 1];
        
        // Initialize base cases
        for (int i = 0; i <= m; i++) {
            dp[i][0] = i;
        }
        for (int j = 0; j <= n; j++) {
            dp[0][j] = j;
        }
        
        // Fill DP table
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i-1) == word2.charAt(j-1)) {
                    dp[i][j] = dp[i-1][j-1];
                } else {
                    dp[i][j] = 1 + Math.min(
                        dp[i-1][j],
                        Math.min(dp[i][j-1], dp[i-1][j-1])
                    );
                }
            }
        }
        
        return dp[m][n];
    }
}

Complexity Analysis

  • Time Complexity: O(m*n) where m and n are lengths of the strings
  • Space Complexity: O(m*n) for the DP table

Key Insights

  1. Dynamic Programming: This is a classic 2D DP problem with optimal substructure
  2. Three Operations: Insert, delete, and replace operations correspond to different DP transitions
  3. Base Cases: Empty string transformations are the base cases
  4. Recurrence Relation: The key insight is the recurrence relation for different operations

Edge Cases

  1. Empty strings: One or both strings are empty
  2. Identical strings: Edit distance is 0
  3. Single character: One string has one character
  4. Completely different: No common characters

Test Cases

# Test cases
assert min_distance("horse", "ros") == 3
assert min_distance("intention", "execution") == 5
assert min_distance("", "a") == 1
assert min_distance("a", "") == 1
assert min_distance("", "") == 0
assert min_distance("abc", "abc") == 0

Follow-up Questions

  1. Edit Distance with Costs: Different costs for insert, delete, replace
  2. Longest Common Subsequence: Find LCS between two strings
  3. Edit Distance Path: Return the actual sequence of operations
  4. Approximate String Matching: Find all substrings within edit distance k

Common Mistakes

  1. Index confusion: Off-by-one errors in DP table indexing
  2. Base case initialization: Incorrect initialization of DP table
  3. Recurrence relation: Wrong formula for the three operations
  4. Boundary conditions: Not handling empty strings properly

Interview Tips

  1. Start with examples: Walk through the DP table construction
  2. Explain the recurrence: Clearly explain why each operation corresponds to a specific transition
  3. Handle edge cases: Always consider empty strings and identical strings
  4. Trace through algorithm: Show how the DP table gets filled