Edit Distance

Find minimum operations to convert one string to another

Language Selection

Choose your preferred programming language

Showing: Python

Problem Statement

Given two strings word1 and word2, return the minimum number of operations required to convert word1 to word2.

You have the following three operations permitted on a word:

  • Insert a character
  • Delete a character
  • Replace a character

Input:

  • Two strings word1 and word2

Output:

  • Minimum number of operations to convert word1 to word2

Constraints:

  • 0 ≤ word1.length, word2.length ≤ 500
  • word1 and word2 consist of lowercase English letters

Examples:

Example 1:

Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation: 
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')

Example 2:

Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation: 
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')

Solutions

Approach 1: 2D Dynamic Programming

Algorithm:

  1. Create 2D DP table where dp[i][j] = min operations to convert word1[0:i] to word2[0:j]
  2. Base cases: dp[i][0] = i (delete all), dp[0][j] = j (insert all)
  3. If characters match: dp[i][j] = dp[i-1][j-1]
  4. If characters don’t match: dp[i][j] = 1 + min(insert, delete, replace)

Python:

def minDistance(word1, word2):
    """
    2D DP solution for edit distance
    Time: O(m * n) where m, n are lengths of word1, word2
    Space: O(m * n) for DP table
    """
    m, n = len(word1), len(word2)
    
    # dp[i][j] = min operations to convert word1[0:i] to word2[0:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Base cases
    # Convert empty string to word2[0:j] requires j insertions
    for j in range(n + 1):
        dp[0][j] = j
    
    # Convert word1[0:i] to empty string requires i deletions
    for i in range(m + 1):
        dp[i][0] = i
    
    # Fill DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                # Characters match, no operation needed
                dp[i][j] = dp[i-1][j-1]
            else:
                # Characters don't match, try all three operations
                insert_op = dp[i][j-1] + 1      # Insert word2[j-1]
                delete_op = dp[i-1][j] + 1      # Delete word1[i-1]
                replace_op = dp[i-1][j-1] + 1   # Replace word1[i-1] with word2[j-1]
                
                dp[i][j] = min(insert_op, delete_op, replace_op)
    
    return dp[m][n]

def minDistance_with_operations(word1, word2):
    """
    Edit distance with operation tracking
    Time: O(m * n)
    Space: O(m * n)
    """
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    ops = [[[] for _ in range(n + 1)] for _ in range(m + 1)]
    
    # Base cases
    for j in range(n + 1):
        dp[0][j] = j
        ops[0][j] = [f"Insert '{word2[k]}'" for k in range(j)]
    
    for i in range(m + 1):
        dp[i][0] = i
        ops[i][0] = [f"Delete '{word1[k]}'" for k in range(i)]
    
    # Fill DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]
                ops[i][j] = ops[i-1][j-1]
            else:
                insert_cost = dp[i][j-1] + 1
                delete_cost = dp[i-1][j] + 1
                replace_cost = dp[i-1][j-1] + 1
                
                min_cost = min(insert_cost, delete_cost, replace_cost)
                dp[i][j] = min_cost
                
                if min_cost == replace_cost:
                    ops[i][j] = ops[i-1][j-1] + [f"Replace '{word1[i-1]}' with '{word2[j-1]}'"]
                elif min_cost == delete_cost:
                    ops[i][j] = ops[i-1][j] + [f"Delete '{word1[i-1]}'"]
                else:
                    ops[i][j] = ops[i][j-1] + [f"Insert '{word2[j-1]}'"]
    
    return dp[m][n], ops[m][n]

# Example usage
print(minDistance("horse", "ros"))  # Output: 3
dist, operations = minDistance_with_operations("horse", "ros")
print(f"Distance: {dist}")
for op in operations:
    print(f"  {op}")

Java:

class Solution {
    /**
     * 2D DP solution for edit distance
     * Time: O(m * n)
     * Space: O(m * n)
     */
    public int minDistance(String word1, String word2) {
        int m = word1.length();
        int n = word2.length();
        
        // dp[i][j] = min operations to convert word1[0:i] to word2[0:j]
        int[][] dp = new int[m + 1][n + 1];
        
        // Base cases
        for (int j = 0; j <= n; j++) {
            dp[0][j] = j;  // Insert j characters
        }
        for (int i = 0; i <= m; i++) {
            dp[i][0] = i;  // Delete i characters
        }
        
        // Fill DP table
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    // Characters match
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    // Characters don't match
                    int insertOp = dp[i][j - 1] + 1;     // Insert
                    int deleteOp = dp[i - 1][j] + 1;     // Delete
                    int replaceOp = dp[i - 1][j - 1] + 1; // Replace
                    
                    dp[i][j] = Math.min(insertOp, Math.min(deleteOp, replaceOp));
                }
            }
        }
        
        return dp[m][n];
    }
}

Approach 2: Space Optimized DP

Since we only need the previous row, optimize space to O(n).

Python:

def minDistance_optimized(word1, word2):
    """
    Space-optimized DP solution
    Time: O(m * n)
    Space: O(min(m, n))
    """
    # Make word2 the shorter string for space optimization
    if len(word1) < len(word2):
        word1, word2 = word2, word1
    
    m, n = len(word1), len(word2)
    
    # Only need current and previous row
    prev = list(range(n + 1))
    curr = [0] * (n + 1)
    
    for i in range(1, m + 1):
        curr[0] = i  # Base case for current row
        
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                curr[j] = prev[j-1]
            else:
                curr[j] = 1 + min(
                    curr[j-1],    # Insert
                    prev[j],      # Delete
                    prev[j-1]     # Replace
                )
        
        # Swap rows
        prev, curr = curr, prev
    
    return prev[n]

# Example usage
print(minDistance_optimized("horse", "ros"))  # Output: 3

Key Insights

  1. Three Operations: Insert, delete, replace are the allowed operations
  2. State Definition: dp[i][j] represents min operations to convert word1[0:i] to word2[0:j]
  3. Base Cases: Converting to/from empty string requires insertions/deletions
  4. Character Match: If characters match, no operation needed
  5. Character Mismatch: Try all three operations and take minimum

Test Cases

def test_edit_distance():
    test_cases = [
        ("horse", "ros", 3),
        ("intention", "execution", 5),
        ("", "abc", 3),
        ("abc", "", 3),
        ("abc", "abc", 0),
        ("a", "b", 1),
        ("sunday", "saturday", 3)
    ]
    
    for word1, word2, expected in test_cases:
        result = minDistance(word1, word2)
        assert result == expected
        print(f"editDistance('{word1}', '{word2}') = {result} ✓")

test_edit_distance()