Language Selection
Choose your preferred programming language
Regular Expression Matching
Problem Statement
Given an input string s and a pattern p, implement regular expression matching with support for ‘.’ and ‘*’ where:
- ‘.’ Matches any single character
- ‘*’ Matches zero or more of the preceding element
The matching should cover the entire input string (not partial).
Input/Output Specifications
- Input: Two strings
sandpwhere1 <= s.length <= 20,1 <= p.length <= 30 - Output: A boolean indicating whether the pattern matches the string
Constraints
1 <= s.length <= 201 <= p.length <= 30scontains only lowercase English letterspcontains only lowercase English letters, ‘.’, and ‘*’
Examples
Example 1:
Input: s = "aa", p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input: s = "aa", p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'.
Example 3:
Input: s = "ab", p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".
Solution Approaches
Approach 1: Dynamic Programming (Bottom-Up) - Optimal
Algorithm Explanation:
- Create a 2D DP table where dp[i][j] represents if s[0:i] matches p[0:j]
- Handle base cases for empty string and pattern
- For each character, check if it matches and handle ‘*’ cases
- ‘*’ can match zero or more of the preceding character
Implementation:
Python:
def isMatch(s, p):
"""
Regular expression matching using dynamic programming
Time: O(m * n)
Space: O(m * n)
"""
m, n = len(s), len(p)
# dp[i][j] represents if s[0:i] matches p[0:j]
dp = [[False] * (n + 1) for _ in range(m + 1)]
# Base case: empty string matches empty pattern
dp[0][0] = True
# Handle patterns like a*, a*b*, a*b*c*
for j in range(2, n + 1):
if p[j - 1] == '*':
dp[0][j] = dp[0][j - 2]
# Fill the DP table
for i in range(1, m + 1):
for j in range(1, n + 1):
if p[j - 1] == '*':
# Case 1: '*' matches zero characters
dp[i][j] = dp[i][j - 2]
# Case 2: '*' matches one or more characters
if matches(s, p, i - 1, j - 2):
dp[i][j] = dp[i][j] or dp[i - 1][j]
else:
# Direct character match
if matches(s, p, i - 1, j - 1):
dp[i][j] = dp[i - 1][j - 1]
return dp[m][n]
def matches(s, p, i, j):
"""Check if s[i] matches p[j]"""
if j < 0:
return False
return p[j] == '.' or s[i] == p[j]
Java:
class Solution {
public boolean isMatch(String s, String p) {
int m = s.length();
int n = p.length();
boolean[][] dp = new boolean[m + 1][n + 1];
dp[0][0] = true;
for (int j = 2; j <= n; j++) {
if (p.charAt(j - 1) == '*') {
dp[0][j] = dp[0][j - 2];
}
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (p.charAt(j - 1) == '*') {
dp[i][j] = dp[i][j - 2];
if (matches(s, p, i - 1, j - 2)) {
dp[i][j] = dp[i][j] || dp[i - 1][j];
}
} else {
if (matches(s, p, i - 1, j - 1)) {
dp[i][j] = dp[i - 1][j - 1];
}
}
}
}
return dp[m][n];
}
private boolean matches(String s, String p, int i, int j) {
if (j < 0) return false;
return p.charAt(j) == '.' || s.charAt(i) == p.charAt(j);
}
}
Complexity Analysis:
- Time Complexity: O(m * n) - Fill the DP table once
- Space Complexity: O(m * n) - DP table storage
Key Insights
- Dynamic Programming: Break down the problem into smaller subproblems
- Pattern Matching: Handle ‘.’ and ‘*’ characters correctly
- State Transitions: Define clear transitions between states
- Base Cases: Handle empty string and pattern cases
Edge Cases
- Empty String: Handle empty string matching
- Empty Pattern: Handle empty pattern
- Only ‘*’: Pattern with only ‘*’ characters
- Multiple ‘*’: Patterns with multiple ‘*’ characters
- No Match: Cases where no match is possible
Test Cases
def test_isMatch():
assert isMatch("aa", "a") == False
assert isMatch("aa", "a*") == True
assert isMatch("ab", ".*") == True
assert isMatch("aab", "c*a*b") == True
assert isMatch("mississippi", "mis*is*p*.") == False
print("All tests passed!")
test_isMatch()
Follow-up Questions
- Wildcard Matching: Handle ‘?’ and ‘*’ wildcards?
- Multiple Patterns: Match against multiple patterns?
- Pattern Validation: Validate if pattern is well-formed?
- Performance Optimization: Optimize for specific patterns?
- Unicode Support: Handle Unicode characters?
Common Mistakes
- Incorrect ‘*’ Handling: Not handling zero or multiple matches correctly
- Index Errors: Off-by-one errors in string indexing
- Base Case Missing: Not handling empty string/pattern cases
- State Transitions: Incorrect DP state transitions
- Memoization: Not properly implementing memoization
Interview Tips
- Start with Recursion: Begin with recursive approach
- Add Memoization: Optimize with memoization
- Convert to DP: Show bottom-up DP solution
- Handle Edge Cases: Cover empty strings and special patterns
- Explain Complexity: Discuss time and space complexity
Concept Explanations
Dynamic Programming: A technique to solve problems by breaking them into smaller subproblems and storing results to avoid redundant calculations.
Regular Expression: A sequence of characters that defines a search pattern, commonly used for pattern matching in strings.
Pattern Matching: The process of finding patterns within strings, which is fundamental to many text processing applications.
When to Use: Use this pattern when you need to match strings against patterns with special characters, especially when the pattern can match variable-length sequences.