Word Break

Determine if a string can be segmented into space-separated words from a dictionary.

Language Selection

Choose your preferred programming language

Showing: Python

Word Break

Problem Statement

Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words.

Note that the same word in the dictionary may be reused multiple times in the segmentation.

Examples

Example 1:

Input: s = "leetcode", wordDict = ["leet","code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example 2:

Input: s = "applepenapple", wordDict = ["apple","pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
Note that you are allowed to reuse a dictionary word.

Example 3:

Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output: false
Explanation: "catsandog" cannot be segmented into dictionary words.

Constraints

  • 1 <= s.length <= 300
  • 1 <= wordDict.length <= 1000
  • 1 <= wordDict[i].length <= 20
  • s and wordDict[i] consist of only lowercase English letters.
  • All the strings of wordDict are unique.

Approach 1: Dynamic Programming (Optimal)

Algorithm

Use dynamic programming to check if each prefix of the string can be segmented.

Steps:

  1. Create a boolean array dp where dp[i] represents if substring s[0:i] can be segmented
  2. Initialize dp[0] = true (empty string is always segmentable)
  3. For each position i, check all possible word endings at position i
  4. If a word ends at position i and the prefix before it is segmentable, mark dp[i] = true
  5. Return dp[n] where n is the length of the string

Implementation

Python:

def wordBreak(s, wordDict):
    """
    Check if string can be segmented using dynamic programming
    Time: O(n^2)
    Space: O(n)
    """
    wordSet = set(wordDict)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True  # Empty string is always segmentable
    
    for i in range(1, n + 1):
        for j in range(i):
            if dp[j] and s[j:i] in wordSet:
                dp[i] = True
                break
    
    return dp[n]

Java:

class Solution {
    /**
     * Check if string can be segmented using dynamic programming
     * Time: O(n^2)
     * Space: O(n)
     */
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> wordSet = new HashSet<>(wordDict);
        int n = s.length();
        boolean[] dp = new boolean[n + 1];
        dp[0] = true; // Empty string is always segmentable
        
        for (int i = 1; i <= n; i++) {
            for (int j = 0; j < i; j++) {
                if (dp[j] && wordSet.contains(s.substring(j, i))) {
                    dp[i] = true;
                    break;
                }
            }
        }
        
        return dp[n];
    }
}

Go:

func wordBreak(s string, wordDict []string) bool {
    /**
     * Check if string can be segmented using dynamic programming
     * Time: O(n^2)
     * Space: O(n)
     */
    wordSet := make(map[string]bool)
    for _, word := range wordDict {
        wordSet[word] = true
    }
    
    n := len(s)
    dp := make([]bool, n+1)
    dp[0] = true // Empty string is always segmentable
    
    for i := 1; i <= n; i++ {
        for j := 0; j < i; j++ {
            if dp[j] && wordSet[s[j:i]] {
                dp[i] = true
                break
            }
        }
    }
    
    return dp[n]
}

JavaScript:

/**
 * Check if string can be segmented using dynamic programming
 * Time: O(n^2)
 * Space: O(n)
 */
function wordBreak(s, wordDict) {
    const wordSet = new Set(wordDict);
    const n = s.length;
    const dp = new Array(n + 1).fill(false);
    dp[0] = true; // Empty string is always segmentable
    
    for (let i = 1; i <= n; i++) {
        for (let j = 0; j < i; j++) {
            if (dp[j] && wordSet.has(s.substring(j, i))) {
                dp[i] = true;
                break;
            }
        }
    }
    
    return dp[n];
}

C#:

public class Solution {
    /// <summary>
    /// Check if string can be segmented using dynamic programming
    /// Time: O(n^2)
    /// Space: O(n)
    /// </summary>
    public bool WordBreak(string s, IList<string> wordDict) {
        var wordSet = new HashSet<string>(wordDict);
        int n = s.Length;
        bool[] dp = new bool[n + 1];
        dp[0] = true; // Empty string is always segmentable
        
        for (int i = 1; i <= n; i++) {
            for (int j = 0; j < i; j++) {
                if (dp[j] && wordSet.Contains(s.Substring(j, i - j))) {
                    dp[i] = true;
                    break;
                }
            }
        }
        
        return dp[n];
    }
}

Complexity Analysis

  • Time Complexity: O(n^2) - Nested loops checking all possible substrings
  • Space Complexity: O(n) - DP array and word set storage

Key Insights

  1. Dynamic Programming: Break problem into overlapping subproblems
  2. Memoization: Cache results to avoid recomputation
  3. Trie Optimization: Use trie for efficient word lookup
  4. Substring Check: Check all possible word endings at each position
  5. Base Case: Empty string is always segmentable

Edge Cases

  1. Empty String: Should return true (no words needed)
  2. Single Word: String equals a dictionary word
  3. No Valid Segmentation: String cannot be segmented
  4. Overlapping Words: Words that share prefixes/suffixes
  5. Repeated Words: Same word used multiple times

Test Cases

def test_wordBreak():
    # Test case 1
    assert wordBreak("leetcode", ["leet", "code"]) == True
    
    # Test case 2
    assert wordBreak("applepenapple", ["apple", "pen"]) == True
    
    # Test case 3
    assert wordBreak("catsandog", ["cats", "dog", "sand", "and", "cat"]) == False
    
    # Edge cases
    assert wordBreak("", ["a", "b"]) == True
    assert wordBreak("a", ["a"]) == True
    assert wordBreak("a", ["b"]) == False
    
    print("All tests passed!")

test_wordBreak()

Follow-up Questions

  1. Word Break II: Return all possible segmentations?
  2. Concatenated Words: Find all concatenated words?
  3. Word Break with Wildcards: Handle wildcard characters?
  4. Word Break with Cost: Minimize cost of segmentation?
  5. Word Break with Constraints: Add additional constraints?

Common Mistakes

  1. Not Handling Empty String: Forgetting that empty string is always segmentable
  2. Wrong DP State: Incorrectly defining what dp[i] represents
  3. Missing Memoization: Not caching results in recursive approach
  4. Inefficient Lookup: Using list instead of set for word lookup
  5. Index Errors: Off-by-one errors in substring operations

Interview Tips

  1. Start with DP: Most intuitive and efficient approach
  2. Explain the State: Clearly define what dp[i] represents
  3. Handle Edge Cases: Mention empty string and single word cases
  4. Discuss Optimization: Show knowledge of trie and memoization
  5. Follow-up Ready: Be prepared for Word Break II

Concept Explanations

Dynamic Programming Approach: We use a boolean array where dp[i] represents whether the substring s[0:i] can be segmented into dictionary words. This allows us to build up the solution from smaller subproblems.

Why DP Works: The problem has optimal substructure - if we can segment a string, then we can segment any prefix of that string. We also have overlapping subproblems - the same substring might be checked multiple times.

Memoization: In the recursive approach, we cache the results of subproblems to avoid recomputing them. This transforms the exponential time complexity into polynomial time.

Trie Optimization: Instead of checking every possible substring against the dictionary, we use a trie to efficiently traverse and find valid words. This reduces the constant factor in the time complexity.

Base Case: The empty string is always segmentable (no words needed), which serves as our base case for the DP approach.

Substring Generation: For each position, we check all possible substrings ending at that position. If a substring is a valid word and the prefix before it is segmentable, then the current position is segmentable.