Minimum Window Substring

Find the minimum window substring that contains all characters of another string

Language Selection

Choose your preferred programming language

Showing: Python

Minimum Window Substring

Problem Statement

Given two strings s and t of lengths m and n respectively, return the minimum window substring of s such that every character in t (including duplicates) is included in the window. If there is no such window, return the empty string "".

Input/Output Specifications

  • Input: Two strings s and t where 1 <= m, n <= 10^5 and s and t consist of uppercase and lowercase English letters
  • Output: A string representing the minimum window substring

Constraints

  • 1 <= m, n <= 10^5
  • s and t consist of uppercase and lowercase English letters

Examples

Example 1:

Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"
Explanation: The minimum window substring "BANC" includes 'A', 'B', and 'C' from string t.

Example 2:

Input: s = "a", t = "a"
Output: "a"
Explanation: The entire string s is the minimum window.

Example 3:

Input: s = "a", t = "aa"
Output: ""
Explanation: Both 'a's from t must be included in the window. Since the largest window of s only has one 'a', return empty string.

Solution Approaches

Approach 1: Sliding Window with Two Pointers - Optimal

Algorithm Explanation:

  1. Use two pointers (left and right) to maintain a sliding window
  2. Expand the right pointer to include more characters
  3. When all characters from pattern are found, contract the left pointer
  4. Keep track of character frequencies using hash maps
  5. Update the minimum window when a valid window is found

Implementation:

Python:

from collections import defaultdict

def minWindow(s, t):
    """
    Find minimum window substring containing all characters of t
    Time: O(n + m)
    Space: O(n + m)
    """
    if not s or not t or len(s) < len(t):
        return ""
    
    # Count characters in pattern t
    pattern_count = defaultdict(int)
    for char in t:
        pattern_count[char] += 1
    
    # Track characters in current window
    window_count = defaultdict(int)
    
    # Variables for sliding window
    left = 0
    min_len = float('inf')
    min_start = 0
    required_chars = len(pattern_count)
    formed_chars = 0
    
    for right in range(len(s)):
        # Add character to window
        char = s[right]
        window_count[char] += 1
        
        # Check if this character completes a required character
        if char in pattern_count and window_count[char] == pattern_count[char]:
            formed_chars += 1
        
        # Try to contract window from left
        while left <= right and formed_chars == required_chars:
            # Update minimum window
            current_len = right - left + 1
            if current_len < min_len:
                min_len = current_len
                min_start = left
            
            # Remove character from left
            left_char = s[left]
            window_count[left_char] -= 1
            
            # Check if we lost a required character
            if left_char in pattern_count and window_count[left_char] < pattern_count[left_char]:
                formed_chars -= 1
            
            left += 1
    
    return s[min_start:min_start + min_len] if min_len != float('inf') else ""

Java:

import java.util.*;

class Solution {
    public String minWindow(String s, String t) {
        if (s == null || t == null || s.length() < t.length()) {
            return "";
        }
        
        Map<Character, Integer> patternCount = new HashMap<>();
        for (char c : t.toCharArray()) {
            patternCount.put(c, patternCount.getOrDefault(c, 0) + 1);
        }
        
        Map<Character, Integer> windowCount = new HashMap<>();
        
        int left = 0;
        int minLen = Integer.MAX_VALUE;
        int minStart = 0;
        int requiredChars = patternCount.size();
        int formedChars = 0;
        
        for (int right = 0; right < s.length(); right++) {
            char c = s.charAt(right);
            windowCount.put(c, windowCount.getOrDefault(c, 0) + 1);
            
            if (patternCount.containsKey(c) && 
                windowCount.get(c).intValue() == patternCount.get(c).intValue()) {
                formedChars++;
            }
            
            while (left <= right && formedChars == requiredChars) {
                int currentLen = right - left + 1;
                if (currentLen < minLen) {
                    minLen = currentLen;
                    minStart = left;
                }
                
                char leftChar = s.charAt(left);
                windowCount.put(leftChar, windowCount.get(leftChar) - 1);
                
                if (patternCount.containsKey(leftChar) && 
                    windowCount.get(leftChar) < patternCount.get(leftChar)) {
                    formedChars--;
                }
                
                left++;
            }
        }
        
        return minLen == Integer.MAX_VALUE ? "" : s.substring(minStart, minStart + minLen);
    }
}

Complexity Analysis:

  • Time Complexity: O(n + m) - Each character is visited at most twice
  • Space Complexity: O(n + m) - Hash maps store character frequencies

Key Insights

  1. Sliding Window Pattern: Expand right pointer, contract left pointer when valid
  2. Character Frequency Tracking: Use hash maps to track required vs actual counts
  3. Formed Characters Counter: Efficiently track when all required characters are present
  4. Two Pointers Technique: Maintain optimal window boundaries

Edge Cases

  1. Empty Strings: Return empty string
  2. Pattern Longer Than String: Return empty string
  3. No Valid Window: Return empty string
  4. Single Character: Handle single character patterns
  5. Duplicate Characters: Handle multiple occurrences of same character

Test Cases

def test_minWindow():
    assert minWindow("ADOBECODEBANC", "ABC") == "BANC"
    assert minWindow("a", "a") == "a"
    assert minWindow("a", "aa") == ""
    assert minWindow("ab", "b") == "b"
    assert minWindow("bba", "ab") == "ba"
    print("All tests passed!")

test_minWindow()

Follow-up Questions

  1. Longest Substring: Find longest substring with all characters?
  2. Substring with At Most K Distinct: Find substring with at most k distinct characters?
  3. Substring with Exactly K Distinct: Find substring with exactly k distinct characters?
  4. Minimum Window Subsequence: Find minimum window that contains pattern as subsequence?
  5. Multiple Patterns: Handle multiple patterns simultaneously?

Common Mistakes

  1. Wrong Window Expansion: Not properly expanding the right pointer
  2. Incorrect Character Counting: Not tracking character frequencies correctly
  3. Formed Characters Logic: Incorrect logic for tracking when all characters are found
  4. Window Contraction: Not properly contracting the window from left
  5. Edge Case Handling: Not handling empty strings or impossible cases

Interview Tips

  1. Start with Brute Force: Show understanding of the problem
  2. Explain Sliding Window: Describe the two-pointer technique
  3. Discuss Optimization: Mention the character frequency tracking
  4. Handle Edge Cases: Cover empty strings and impossible cases
  5. Code Carefully: Pay attention to pointer movement and counting logic

Concept Explanations

Sliding Window Technique: A technique where we maintain a window of variable size and slide it across the string. We expand the window to include more characters and contract it to find the optimal solution.

Character Frequency Tracking: Using hash maps to track how many times each character appears in the pattern versus the current window. This allows us to efficiently determine when we have all required characters.

Two Pointers Pattern: Using two pointers (left and right) to maintain the window boundaries. The right pointer expands the window, and the left pointer contracts it when the window becomes valid.

When to Use: This pattern is useful for substring problems where you need to find a contiguous subsequence that satisfies certain conditions, especially when the condition involves character frequencies or counts.