Longest Common Subsequence

Find the length of the longest common subsequence between two strings

Language Selection

Choose your preferred programming language

Showing: Python

Problem Statement

Given two strings text1 and text2, return the length of their longest common subsequence. If there is no common subsequence, return 0.

A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.

Input:

  • Two strings text1 and text2

Output:

  • Length of the longest common subsequence

Constraints:

  • 1 ≤ text1.length, text2.length ≤ 1000
  • text1 and text2 consist of only lowercase English characters

Examples:

Example 1:

Input: text1 = "abcde", text2 = "ace" 
Output: 3  
Explanation: The longest common subsequence is "ace" and its length is 3.

Example 2:

Input: text1 = "abc", text2 = "abc"
Output: 3
Explanation: The longest common subsequence is "abc" and its length is 3.

Example 3:

Input: text1 = "abc", text2 = "def"
Output: 0
Explanation: There is no such common subsequence, so the result is 0.

Example 4:

Input: text1 = "abcdgh", text2 = "aedfhr"
Output: 3
Explanation: LCS is "adh" with length 3.

Solutions

Approach 1: 2D Dynamic Programming

Algorithm:

  1. Create a 2D DP table where dp[i][j] = LCS length of text1[0…i-1] and text2[0…j-1]
  2. If characters match: dp[i][j] = dp[i-1][j-1] + 1
  3. If characters don’t match: dp[i][j] = max(dp[i-1][j], dp[i][j-1])

Python:

def longestCommonSubsequence(text1, text2):
    """
    2D DP solution for LCS
    Time: O(m * n) where m, n are lengths of text1, text2
    Space: O(m * n) for DP table
    """
    m, n = len(text1), len(text2)
    
    # dp[i][j] = LCS length of text1[0:i] and text2[0:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Fill DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i-1] == text2[j-1]:
                # Characters match, extend LCS
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                # Characters don't match, take max of excluding either character
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    
    return dp[m][n]

def longestCommonSubsequence_with_path(text1, text2):
    """
    LCS with path reconstruction
    Time: O(m * n)
    Space: O(m * n)
    """
    m, n = len(text1), len(text2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Fill DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i-1] == text2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    
    # Reconstruct LCS
    lcs = []
    i, j = m, n
    while i > 0 and j > 0:
        if text1[i-1] == text2[j-1]:
            lcs.append(text1[i-1])
            i -= 1
            j -= 1
        elif dp[i-1][j] > dp[i][j-1]:
            i -= 1
        else:
            j -= 1
    
    return dp[m][n], ''.join(reversed(lcs))

# Example usage
print(longestCommonSubsequence("abcde", "ace"))  # Output: 3
length, lcs = longestCommonSubsequence_with_path("abcde", "ace")
print(f"Length: {length}, LCS: {lcs}")  # Output: Length: 3, LCS: ace

Java:

class Solution {
    /**
     * 2D DP solution for LCS
     * Time: O(m * n)
     * Space: O(m * n)
     */
    public int longestCommonSubsequence(String text1, String text2) {
        int m = text1.length();
        int n = text2.length();
        
        // dp[i][j] = LCS length of text1[0...i-1] and text2[0...j-1]
        int[][] dp = new int[m + 1][n + 1];
        
        // Fill DP table
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (text1.charAt(i - 1) == text2.charAt(j - 1)) {
                    // Characters match
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    // Characters don't match
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        
        return dp[m][n];
    }
    
    /**
     * LCS with path reconstruction
     * Time: O(m * n)
     * Space: O(m * n)
     */
    public String getLCS(String text1, String text2) {
        int m = text1.length();
        int n = text2.length();
        int[][] dp = new int[m + 1][n + 1];
        
        // Fill DP table
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (text1.charAt(i - 1) == text2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        
        // Reconstruct LCS
        StringBuilder lcs = new StringBuilder();
        int i = m, j = n;
        while (i > 0 && j > 0) {
            if (text1.charAt(i - 1) == text2.charAt(j - 1)) {
                lcs.append(text1.charAt(i - 1));
                i--;
                j--;
            } else if (dp[i - 1][j] > dp[i][j - 1]) {
                i--;
            } else {
                j--;
            }
        }
        
        return lcs.reverse().toString();
    }
}

Go:

// longestCommonSubsequence - 2D DP solution
// Time: O(m * n)
// Space: O(m * n)
func longestCommonSubsequence(text1 string, text2 string) int {
    m, n := len(text1), len(text2)
    
    // Create DP table
    dp := make([][]int, m+1)
    for i := range dp {
        dp[i] = make([]int, n+1)
    }
    
    // Fill DP table
    for i := 1; i <= m; i++ {
        for j := 1; j <= n; j++ {
            if text1[i-1] == text2[j-1] {
                // Characters match
                dp[i][j] = dp[i-1][j-1] + 1
            } else {
                // Characters don't match
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
            }
        }
    }
    
    return dp[m][n]
}

func max(a, b int) int {
    if a > b {
        return a
    }
    return b
}

JavaScript:

/**
 * 2D DP solution for LCS
 * Time: O(m * n)
 * Space: O(m * n)
 */
function longestCommonSubsequence(text1, text2) {
    const m = text1.length;
    const n = text2.length;
    
    // Create DP table
    const dp = Array(m + 1).fill().map(() => Array(n + 1).fill(0));
    
    // Fill DP table
    for (let i = 1; i <= m; i++) {
        for (let j = 1; j <= n; j++) {
            if (text1[i - 1] === text2[j - 1]) {
                // Characters match
                dp[i][j] = dp[i - 1][j - 1] + 1;
            } else {
                // Characters don't match
                dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }
    
    return dp[m][n];
}

/**
 * LCS with path reconstruction
 * Time: O(m * n)
 * Space: O(m * n)
 */
function getLCS(text1, text2) {
    const m = text1.length;
    const n = text2.length;
    const dp = Array(m + 1).fill().map(() => Array(n + 1).fill(0));
    
    // Fill DP table
    for (let i = 1; i <= m; i++) {
        for (let j = 1; j <= n; j++) {
            if (text1[i - 1] === text2[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1] + 1;
            } else {
                dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }
    
    // Reconstruct LCS
    const lcs = [];
    let i = m, j = n;
    while (i > 0 && j > 0) {
        if (text1[i - 1] === text2[j - 1]) {
            lcs.push(text1[i - 1]);
            i--;
            j--;
        } else if (dp[i - 1][j] > dp[i][j - 1]) {
            i--;
        } else {
            j--;
        }
    }
    
    return lcs.reverse().join('');
}

C#:

public class Solution {
    /// <summary>
    /// 2D DP solution for LCS
    /// Time: O(m * n)
    /// Space: O(m * n)
    /// </summary>
    public int LongestCommonSubsequence(string text1, string text2) {
        int m = text1.Length;
        int n = text2.Length;
        
        // dp[i,j] = LCS length of text1[0...i-1] and text2[0...j-1]
        int[,] dp = new int[m + 1, n + 1];
        
        // Fill DP table
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (text1[i - 1] == text2[j - 1]) {
                    // Characters match
                    dp[i, j] = dp[i - 1, j - 1] + 1;
                } else {
                    // Characters don't match
                    dp[i, j] = Math.Max(dp[i - 1, j], dp[i, j - 1]);
                }
            }
        }
        
        return dp[m, n];
    }
}

Approach 2: Space Optimized DP

Since we only need the previous row, we can optimize space to O(min(m, n)).

Python:

def longestCommonSubsequence_optimized(text1, text2):
    """
    Space-optimized DP solution
    Time: O(m * n)
    Space: O(min(m, n))
    """
    # Make text1 the shorter string for space optimization
    if len(text1) > len(text2):
        text1, text2 = text2, text1
    
    m, n = len(text1), len(text2)
    
    # Only need previous and current row
    prev = [0] * (m + 1)
    curr = [0] * (m + 1)
    
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            if text1[i-1] == text2[j-1]:
                curr[i] = prev[i-1] + 1
            else:
                curr[i] = max(curr[i-1], prev[i])
        
        # Swap rows
        prev, curr = curr, prev
    
    return prev[m]

def longestCommonSubsequence_1d(text1, text2):
    """
    Further optimized to use only one array
    Time: O(m * n)
    Space: O(min(m, n))
    """
    if len(text1) > len(text2):
        text1, text2 = text2, text1
    
    m, n = len(text1), len(text2)
    dp = [0] * (m + 1)
    
    for j in range(1, n + 1):
        prev = 0  # dp[i-1][j-1]
        for i in range(1, m + 1):
            temp = dp[i]  # Save dp[i][j-1] for next iteration
            if text1[i-1] == text2[j-1]:
                dp[i] = prev + 1
            else:
                dp[i] = max(dp[i], dp[i-1])
            prev = temp
    
    return dp[m]

# Example usage
print(longestCommonSubsequence_optimized("abcde", "ace"))  # Output: 3

Approach 3: Memoization (Top-Down DP)

Python:

def longestCommonSubsequence_memo(text1, text2):
    """
    Top-down DP with memoization
    Time: O(m * n)
    Space: O(m * n)
    """
    memo = {}
    
    def lcs(i, j):
        # Base case
        if i < 0 or j < 0:
            return 0
        
        # Check memo
        if (i, j) in memo:
            return memo[(i, j)]
        
        # Recursive case
        if text1[i] == text2[j]:
            result = 1 + lcs(i - 1, j - 1)
        else:
            result = max(lcs(i - 1, j), lcs(i, j - 1))
        
        memo[(i, j)] = result
        return result
    
    return lcs(len(text1) - 1, len(text2) - 1)

def longestCommonSubsequence_lru(text1, text2):
    """
    Using functools.lru_cache
    Time: O(m * n)
    Space: O(m * n)
    """
    from functools import lru_cache
    
    @lru_cache(maxsize=None)
    def lcs(i, j):
        if i < 0 or j < 0:
            return 0
        
        if text1[i] == text2[j]:
            return 1 + lcs(i - 1, j - 1)
        else:
            return max(lcs(i - 1, j), lcs(i, j - 1))
    
    return lcs(len(text1) - 1, len(text2) - 1)

# Example usage
print(longestCommonSubsequence_memo("abcde", "ace"))  # Output: 3

Key Insights

  1. 2D DP Structure: LCS is a classic 2D DP problem with string indices
  2. Recurrence Relation: If chars match, extend LCS; else take max of two choices
  3. Base Cases: Empty string has LCS length 0 with any string
  4. Space Optimization: Only need previous row, can reduce to O(min(m, n))
  5. Path Reconstruction: Backtrack through DP table to find actual LCS

Edge Cases

  1. Empty strings: Either or both strings are empty
  2. Identical strings: LCS length equals string length
  3. No common characters: LCS length is 0
  4. Single character strings: LCS is 1 if characters match, 0 otherwise
  5. One string is subsequence of other: LCS length equals shorter string length

Test Cases

def test_lcs():
    test_cases = [
        ("abcde", "ace", 3),
        ("abc", "abc", 3),
        ("abc", "def", 0),
        ("", "abc", 0),
        ("abc", "", 0),
        ("", "", 0),
        ("a", "a", 1),
        ("a", "b", 0),
        ("abcdgh", "aedfhr", 3),
        ("AGGTAB", "GXTXAYB", 4)  # LCS: "GTAB"
    ]
    
    for text1, text2, expected in test_cases:
        result = longestCommonSubsequence(text1, text2)
        assert result == expected, f"text1='{text1}', text2='{text2}', expected={expected}, got={result}"
        print(f"LCS('{text1}', '{text2}') = {result} ✓")

test_lcs()

Common Mistakes

  1. Index confusion: Mixing up 0-based and 1-based indexing
  2. Wrong base cases: Not handling empty strings correctly
  3. Character comparison: Using wrong indices for character comparison
  4. Space optimization errors: Incorrect row swapping or variable updates
  5. Path reconstruction: Backtracking in wrong direction

Follow-up Questions

  1. Print LCS: Return the actual subsequence, not just length
  2. All LCS: Find all longest common subsequences
  3. LCS of multiple strings: Extend to k strings
  4. Shortest common supersequence: Related but different problem
  5. Edit distance: Convert one string to another with minimum operations
  6. Longest palindromic subsequence: LCS of string with its reverse

Interview Tips

  1. Start with examples: Work through small examples to understand pattern
  2. Identify subproblems: Explain how LCS breaks into smaller problems
  3. Draw the DP table: Visualize the 2D table filling process
  4. Discuss space optimization: Show progression from 2D to 1D
  5. Handle edge cases: Make sure to test empty strings
  6. Explain recurrence: Clearly state the two choices at each step
  7. Time complexity: Explain why it’s O(m*n) and if it can be improved