Language Selection
Choose your preferred programming language
Showing: Python
Problem Statement
Given an input string s and a pattern p, implement regular expression matching with support for '.' and '*' where:
'.'Matches any single character'*'Matches zero or more of the preceding element
The matching should cover the entire input string (not partial).
Input:
- A string
s(the input string) - A string
p(the pattern)
Output:
- Boolean indicating if the entire string matches the pattern
Constraints:
- 1 ≤ s.length ≤ 20
- 1 ≤ p.length ≤ 30
- s contains only lowercase English letters
- p contains only lowercase English letters, ‘.’, and ‘*’
- It is guaranteed for each appearance of ‘*’, there will be a previous valid character to match
Examples:
Example 1:
Input: s = "aa", p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input: s = "aa", p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'.
Therefore, by repeating 'a' once, it becomes "aa".
Example 3:
Input: s = "ab", p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".
Example 4:
Input: s = "aab", p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 2 times. Therefore, it matches "aab".
Solutions
Approach 1: 2D Dynamic Programming
Algorithm:
- Create 2D DP table where dp[i][j] = true if s[0:i] matches p[0:j]
- Base case: dp[0][0] = true (empty string matches empty pattern)
- Handle patterns with ‘*’ that can match zero characters
- For each cell, check character match and handle ‘*’ cases
Python:
def isMatch(s, p):
"""
2D DP solution for regex matching
Time: O(m * n) where m = len(s), n = len(p)
Space: O(m * n) for DP table
"""
m, n = len(s), len(p)
# dp[i][j] = True if s[0:i] matches p[0:j]
dp = [[False] * (n + 1) for _ in range(m + 1)]
# Base case: empty string matches empty pattern
dp[0][0] = True
# Handle patterns like "a*", "a*b*", etc. that can match empty string
for j in range(2, n + 1):\n # Pattern like \"a*\" can match empty string\n if p[j-1] == '*':\n dp[0][j] = dp[0][j-2]\n \n # Fill DP table\n for i in range(1, m + 1):\n for j in range(1, n + 1):\n if p[j-1] == '*':\n # '*' can match zero or more of preceding character\n # Case 1: Use * to match zero occurrences (ignore \"x*\")\n dp[i][j] = dp[i][j-2]\n \n # Case 2: Use * to match one or more occurrences\n # Check if preceding character matches current character in s\n if matches(s, p, i, j-1): # s[i-1] matches p[j-2]\n dp[i][j] = dp[i][j] or dp[i-1][j]\n else:\n # Regular character or '.'\n if matches(s, p, i, j):\n dp[i][j] = dp[i-1][j-1]\n \n return dp[m][n]\n\ndef matches(s, p, i, j):\n \"\"\"Check if s[i-1] matches p[j-1]\"\"\"\n return p[j-1] == '.' or s[i-1] == p[j-1]\n\ndef isMatch_verbose(s, p):\n \"\"\"\n Verbose version with detailed comments\n Time: O(m * n)\n Space: O(m * n)\n \"\"\"\n m, n = len(s), len(p)\n dp = [[False] * (n + 1) for _ in range(m + 1)]\n \n # Base case\n dp[0][0] = True\n print(f\"Base case: dp[0][0] = {dp[0][0]}\")\n \n # Handle patterns that can match empty string\n for j in range(2, n + 1):\n if p[j-1] == '*':\n dp[0][j] = dp[0][j-2]\n print(f\"Empty string vs '{p[:j]}': dp[0][{j}] = {dp[0][j]}\")\n \n # Fill table\n for i in range(1, m + 1):\n for j in range(1, n + 1):\n char_s = s[i-1]\n char_p = p[j-1]\n \n if char_p == '*':\n # Zero occurrences\n dp[i][j] = dp[i][j-2]\n \n # One or more occurrences\n if p[j-2] == '.' or s[i-1] == p[j-2]:\n dp[i][j] = dp[i][j] or dp[i-1][j]\n \n print(f\"s[:{i}]='{s[:i]}' vs p[:{j}]='{p[:j]}' (*): dp[{i}][{j}] = {dp[i][j]}\")\n else:\n # Regular character or '.'\n if char_p == '.' or char_s == char_p:\n dp[i][j] = dp[i-1][j-1]\n \n print(f\"s[:{i}]='{s[:i]}' vs p[:{j}]='{p[:j]}' ({char_p}): dp[{i}][{j}] = {dp[i][j]}\")\n \n return dp[m][n]\n\n# Example usage\nprint(isMatch(\"aa\", \"a*\")) # Output: True\nprint(isMatch(\"ab\", \".*\")) # Output: True\nprint(isMatch(\"aab\", \"c*a*b\")) # Output: True\n```\n\n**Java:**\n```java\nclass Solution {\n /**\n * 2D DP solution for regex matching\n * Time: O(m * n)\n * Space: O(m * n)\n */\n public boolean isMatch(String s, String p) {\n int m = s.length();\n int n = p.length();\n \n // dp[i][j] = true if s[0:i] matches p[0:j]\n boolean[][] dp = new boolean[m + 1][n + 1];\n \n // Base case\n dp[0][0] = true;\n \n // Handle patterns like \"a*\", \"a*b*\" that can match empty string\n for (int j = 2; j <= n; j++) {\n if (p.charAt(j - 1) == '*') {\n dp[0][j] = dp[0][j - 2];\n }\n }\n \n // Fill DP table\n for (int i = 1; i <= m; i++) {\n for (int j = 1; j <= n; j++) {\n if (p.charAt(j - 1) == '*') {\n // '*' can match zero or more of preceding character\n // Case 1: zero occurrences\n dp[i][j] = dp[i][j - 2];\n \n // Case 2: one or more occurrences\n if (matches(s, p, i, j - 1)) {\n dp[i][j] = dp[i][j] || dp[i - 1][j];\n }\n } else {\n // Regular character or '.'\n if (matches(s, p, i, j)) {\n dp[i][j] = dp[i - 1][j - 1];\n }\n }\n }\n }\n \n return dp[m][n];\n }\n \n private boolean matches(String s, String p, int i, int j) {\n return p.charAt(j - 1) == '.' || s.charAt(i - 1) == p.charAt(j - 1);\n }\n}\n```\n\n### Approach 2: Recursion with Memoization\n\n**Python:**\n```python\ndef isMatch_recursive(s, p):\n \"\"\"\n Recursive solution with memoization\n Time: O(m * n)\n Space: O(m * n) for memoization + recursion stack\n \"\"\"\n memo = {}\n \n def dp(i, j):\n # Base cases\n if (i, j) in memo:\n return memo[(i, j)]\n \n # If pattern is exhausted\n if j == len(p):\n return i == len(s)\n \n # Check if current characters match\n first_match = i < len(s) and (p[j] == '.' or s[i] == p[j])\n \n # Handle '*' in pattern\n if j + 1 < len(p) and p[j + 1] == '*':\n # Two choices:\n # 1. Don't use the \"x*\" pattern (match zero occurrences)\n # 2. Use the \"x*\" pattern if first character matches\n result = (dp(i, j + 2) or \n (first_match and dp(i + 1, j)))\n else:\n # No '*', must match current character\n result = first_match and dp(i + 1, j + 1)\n \n memo[(i, j)] = result\n return result\n \n return dp(0, 0)\n\ndef isMatch_recursive_verbose(s, p):\n \"\"\"\n Verbose recursive solution\n Time: O(m * n)\n Space: O(m * n)\n \"\"\"\n memo = {}\n \n def dp(i, j, depth=0):\n indent = \" \" * depth\n print(f\"{indent}Checking s[{i}:] = '{s[i:]}' vs p[{j}:] = '{p[j:]}'\")\n \n if (i, j) in memo:\n print(f\"{indent}Found in memo: {memo[(i, j)]}\")\n return memo[(i, j)]\n \n # Pattern exhausted\n if j == len(p):\n result = i == len(s)\n print(f\"{indent}Pattern exhausted, string exhausted: {result}\")\n memo[(i, j)] = result\n return result\n \n # Check current character match\n first_match = i < len(s) and (p[j] == '.' or s[i] == p[j])\n print(f\"{indent}First match: {first_match}\")\n \n if j + 1 < len(p) and p[j + 1] == '*':\n print(f\"{indent}Found '*' pattern: {p[j]}*\")\n # Zero occurrences\n zero_match = dp(i, j + 2, depth + 1)\n print(f\"{indent}Zero occurrences: {zero_match}\")\n \n # One or more occurrences\n if first_match:\n one_or_more = dp(i + 1, j, depth + 1)\n print(f\"{indent}One or more occurrences: {one_or_more}\")\n result = zero_match or one_or_more\n else:\n result = zero_match\n else:\n print(f\"{indent}Regular character match\")\n result = first_match and dp(i + 1, j + 1, depth + 1)\n \n print(f\"{indent}Result: {result}\")\n memo[(i, j)] = result\n return result\n \n return dp(0, 0)\n\n# Example usage\nprint(isMatch_recursive(\"aab\", \"c*a*b\")) # Output: True\n```\n\n## Key Insights\n\n1. **Two Key Cases**: Handle regular characters/dots and '*' patterns differently\n2. **'*' Flexibility**: '*' can match zero or more of the preceding character\n3. **Greedy vs Non-greedy**: '*' requires checking both zero and multiple matches\n4. **Base Cases**: Empty pattern can only match empty string (with exceptions)\n5. **State Definition**: dp[i][j] represents if s[0:i] matches p[0:j]\n\n## Edge Cases\n\n1. **Empty strings**: Both empty should return True\n2. **Pattern with '*'**: \"a*\" can match empty string\n3. **Only '.'**: Single '.' matches any single character\n4. **Only '*'**: Invalid pattern (guaranteed not to happen)\n5. **Multiple '*'**: \"a*b*\" can match various combinations\n\n## Test Cases\n\n```python\ndef test_regex_matching():\n test_cases = [\n (\"aa\", \"a\", False),\n (\"aa\", \"a*\", True),\n (\"ab\", \".*\", True),\n (\"aab\", \"c*a*b\", True),\n (\"mississippi\", \"mis*is*p*.\", False),\n (\"\", \".*\", True),\n (\"\", \"a*\", True),\n (\"a\", \"ab*\", True),\n (\"abc\", \"a.*c\", True),\n (\"abcdef\", \"a.*f\", True)\n ]\n \n for s, p, expected in test_cases:\n result = isMatch(s, p)\n assert result == expected, f\"s='{s}', p='{p}', expected={expected}, got={result}\"\n print(f\"isMatch('{s}', '{p}') = {result} ✓\")\n\ntest_regex_matching()\n```\n\n## Common Mistakes\n\n1. **Wrong '*' handling**: Not considering both zero and multiple matches\n2. **Index errors**: Off-by-one errors when accessing string characters\n3. **Base case confusion**: Not handling empty string/pattern correctly\n4. **Greedy matching**: Assuming '*' should always match as much as possible\n5. **'.' misunderstanding**: Forgetting that '.' matches any single character\n\n## Follow-up Questions\n\n1. **Wildcard matching**: Different pattern with '?' and '*'\n2. **Multiple patterns**: Match against multiple regex patterns\n3. **Case sensitivity**: Handle case-insensitive matching\n4. **Extended regex**: Support for '+', '?', character classes\n5. **Pattern validation**: Check if the pattern itself is valid\n6. **Space optimization**: Reduce space complexity of DP solution\n\n## Interview Tips\n\n1. **Understand the problem**: Clarify what '.' and '*' mean exactly\n2. **Start with examples**: Work through examples to understand patterns\n3. **Break down cases**: Handle '*' and regular characters separately\n4. **Draw DP table**: Visualize the state transitions\n5. **Test edge cases**: Empty strings, patterns with multiple '*'\n6. **Explain both approaches**: Both recursive and iterative DP\n7. **Discuss complexity**: Time and space complexity analysis