Problem Walkthrough

Word Break II LeetCode 140 Guide

Solve LeetCode 140 Word Break II using backtracking with memoization to generate all valid sentence segmentations — understanding why Word Break I's pure DP approach cannot enumerate paths and how caching suffix results eliminates exponential re-exploration of overlapping subproblems.


Problem Overview

LeetCode 140 — Word Break II — gives you a string s and a dictionary of strings wordDict. Your task is to add spaces in s to construct a sentence where each word is a valid dictionary word, and return all such possible sentences in any order. Unlike Word Break I (LeetCode 139), which only asks if a valid segmentation exists, Word Break II requires you to enumerate every valid segmentation explicitly.

For example, given s = "catsanddog" and wordDict = ["cat","cats","and","sand","dog"], the output is ["cats and dog", "cat sand dog"]. Both segmentations cover all characters of s with words from the dictionary. If no valid segmentation exists, return an empty list. The problem is harder than it looks because the number of valid sentences can be exponentially large in the worst case.

The challenge combines two algorithmic ideas: feasibility checking (can the string be broken?) and path enumeration (list all ways to break it). Pure dynamic programming from Word Break I answers the first question efficiently but cannot enumerate paths without major extension. The right approach is backtracking guided by the dictionary, with memoization to avoid redundant work on repeated suffixes.

  • Constraints: 1 <= s.length <= 20
  • 1 <= wordDict.length <= 1000
  • 1 <= wordDict[i].length <= 10
  • s and wordDict[i] consist of lowercase English letters only
  • All strings in wordDict are unique
  • A valid segmentation is not guaranteed to exist — if the string cannot be broken, return an empty list

From Word Break I to II

Word Break I (LeetCode 139) is a classic 1-D DP problem. Build a boolean array dp where dp[i] is true if the substring s[0:i] can be segmented into dictionary words. Fill it left to right: for each position i, check every j < i such that dp[j] is true and s[j:i] is in the dictionary. If any such j exists, set dp[i] = true. The answer is dp[len(s)].
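The DP described above can be sketched as follows (the function name `can_break` is mine, not from the problem statement):

```python
def can_break(s, word_dict):
    """Word Break I feasibility check: dp[i] is True if s[:i] can be segmented."""
    word_set = set(word_dict)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True  # the empty prefix is trivially segmentable
    for i in range(1, n + 1):
        for j in range(i):
            # s[:i] is breakable if some breakable prefix s[:j]
            # is followed by a dictionary word s[j:i]
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break
    return dp[n]
```

This same boolean array is what the pre-check below relies on: one O(n^2) pass answers "is any segmentation possible at all?"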

This DP tells you whether a valid segmentation exists but gives no information about which prefixes were chosen at each step. Reconstructing all paths from a feasibility DP requires storing backtracking links at every position, which essentially transforms the DP into the full memoized backtracking solution anyway. It is cleaner and more intuitive to go directly to backtracking for Word Break II.

The transition from I to II is conceptual: shift from "can we reach the end?" to "what are all the routes to the end?" Backtracking answers the second question naturally — at each position, try every prefix that is a dictionary word, recurse on the remaining suffix, and collect results. Memoization is the efficiency layer that prevents the same suffix from being re-solved from scratch each time it is reached via different prefixes.

💡

Use Word Break I as a Pre-Check

Run Word Break I's DP first as a feasibility check. If dp[len(s)] is false, the string cannot be segmented at all — return an empty list immediately without running the expensive backtracking enumeration. This pre-check eliminates the worst-case cost for inputs where no valid sentence exists, at the price of one O(n^2) DP pass.

Backtracking Approach

The backtracking function takes a start index and returns all valid sentences that can be formed from s[start:]. Base case: when start equals len(s), return a list containing one empty string — the sentinel that signals a complete segmentation has been reached. Recursive case: try every end index from start+1 to len(s)+1; if s[start:end] is in the dictionary, recurse on end to get all sentences for the remaining suffix, then prepend s[start:end] to each suffix sentence.

Combining prefix word with suffix results is straightforward: if the current word is "cat" and the suffix returns ["sand dog"], produce "cat sand dog". If start == len(s) the base case returns [""], and "cat" combined with "" produces "cat" — which is handled by checking if the suffix sentence is empty and omitting the extra space. This keeps the combination logic clean and uniform across all recursion levels.

The time complexity without memoization is O(n * 2^n) in the worst case — for a string like "aaa...a" with dictionary ["a","aa","aaa",...], nearly every prefix is valid and the tree fans out exponentially. With memoization, repeated suffix calls are resolved in O(1) and the total work is proportional to the size of the output, which is bounded but can still be exponential when the answer set is large.

  1. Convert wordDict to a set for O(1) lookup
  2. Define backtrack(start) returning list of sentences from s[start:]
  3. Base case: if start == len(s), return [""]
  4. Initialize results = []
  5. Loop end from start+1 to len(s)+1:
  6.   word = s[start:end]
  7.   If word not in wordSet: continue
  8.   For each suffix_sentence in backtrack(end):
  9.     sentence = word + " " + suffix_sentence if suffix_sentence else word
  10.    Append sentence to results
  11. Return results
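The steps above translate directly into a plain (not yet memoized) backtracking sketch — `word_break` is my name for the entry point:

```python
def word_break(s, word_dict):
    """Return all segmentations of s into dictionary words via backtracking."""
    word_set = set(word_dict)

    def backtrack(start):
        # Base case: reaching the end of s signals a complete segmentation.
        if start == len(s):
            return [""]
        results = []
        for end in range(start + 1, len(s) + 1):
            word = s[start:end]
            if word not in word_set:
                continue  # prune: this prefix is not a dictionary word
            for suffix in backtrack(end):
                # Omit the extra space when the suffix is the empty sentinel.
                results.append(word + " " + suffix if suffix else word)
        return results

    return backtrack(0)
```

On the example input this produces both sentences, in some order: `word_break("catsanddog", ["cat","cats","and","sand","dog"])` yields `"cat sand dog"` and `"cats and dog"`.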

Memoization for Overlapping Subproblems

The key observation that motivates memoization is that different prefixes can lead to the same suffix. For s = "catsanddog", both "cat" and "cats" are valid first words. After choosing "cat", the suffix is "sanddog" starting at index 3. After choosing "cats", the suffix is "anddog" starting at index 4. These are different suffixes, so no sharing here. But in a string like "aaa" with dictionary ["a","aa"], the prefix sequence "a","a" and the single prefix "aa" both arrive at index 2 — backtrack(2) is reached twice and solves the same suffix "a". In longer strings with many valid prefixes, this overlap grows rapidly.

Memoization stores the list of sentences for each start index after the first time it is computed. On subsequent calls with the same start, return the cached list immediately. In Python this is most elegantly expressed with @functools.lru_cache on the backtrack function, or with a memo dictionary keyed by start index. In Java, use a HashMap<Integer, List<String>> initialized before the recursive method.
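A minimal sketch of the memo-dictionary variant, keyed by start index as described (names are mine):

```python
def word_break_memo(s, word_dict):
    """Backtracking with a memo dict: memo[start] caches all sentences for s[start:]."""
    word_set = set(word_dict)
    memo = {}

    def backtrack(start):
        if start in memo:
            return memo[start]  # suffix already solved: O(1) reuse
        if start == len(s):
            return [""]
        results = []
        for end in range(start + 1, len(s) + 1):
            word = s[start:end]
            if word in word_set:
                for suffix in backtrack(end):
                    results.append(word + " " + suffix if suffix else word)
        memo[start] = results  # cache before returning
        return results

    return backtrack(0)
```

The only changes from the plain version are the cache lookup at the top and the cache store at the bottom; the enumeration logic is untouched.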

The memoized solution avoids redundant work at the cost of extra memory for the cache. In the worst case the cache holds O(n) entries, each containing a list of sentences. The total memory is proportional to the total size of all cached sentence lists, which is bounded by the output size. For inputs with large valid sentence sets the memory can be substantial, but that is unavoidable — any algorithm must produce the full output.

ℹ️

Without Memoization, Repeated Suffixes Are Re-Explored Exponentially

Without memoization, the same suffix starting at index i can be re-explored once for every distinct prefix sequence that leads to position i. In a string where many prefixes are valid, this creates exponential redundancy: backtrack(5) might be called hundreds of times, each time recomputing the same sentence list from scratch. Memoization makes repeated suffix calls O(1) by caching the result on the first computation.

Using a HashSet for Dictionary Lookup

The first step in any implementation is converting wordDict from a list to a set. List membership checking is O(k) where k is the dictionary size; set lookup is O(1) average case. Inside the backtracking loop, s[start:end] is tested against the dictionary at every position for every recursive call, so the lookup cost compounds quickly. Using a set ensures that dictionary checks do not become the bottleneck.

Try all prefix lengths from 1 to len(s) - start. For each candidate prefix s[start:end], check membership in the word set. Only recurse when the prefix is a valid word — this is the pruning step that keeps the backtracking tree manageable. Early pruning is especially effective when the dictionary is small relative to the string length, because most prefixes will fail the membership check and their subtrees are never explored.

An additional optimization is to limit the maximum prefix length to the longest word in the dictionary. If no word in wordDict exceeds 10 characters (the constraint ceiling), then end - start never needs to exceed 10. Instead of looping end from start+1 to len(s)+1, loop end from start+1 to min(start + max_word_len + 1, len(s) + 1). This caps the inner loop at a constant factor and can dramatically reduce work when the string is long but dictionary words are short.

  1. wordSet = set(wordDict)
  2. max_len = max(len(w) for w in wordDict)
  3. In backtrack(start): loop end from start+1 to min(start+max_len+1, len(s)+1)
  4. Check if s[start:end] in wordSet before recursing
  5. Skip to next end if word not found — no recursion needed
  6. Only recurse with backtrack(end) when s[start:end] is a valid word
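Putting the max-length bound together with memoization gives a sketch like this (function name is mine, assuming a non-empty wordDict):

```python
def word_break_bounded(s, word_dict):
    """Memoized backtracking with the inner loop capped at the longest word."""
    word_set = set(word_dict)
    max_len = max(len(w) for w in word_dict)
    memo = {}

    def backtrack(start):
        if start in memo:
            return memo[start]
        if start == len(s):
            return [""]
        results = []
        # Prefixes longer than max_len can never be dictionary words,
        # so the loop never needs to look past start + max_len.
        for end in range(start + 1, min(start + max_len + 1, len(s) + 1)):
            word = s[start:end]
            if word in word_set:
                for suffix in backtrack(end):
                    results.append(word + " " + suffix if suffix else word)
        memo[start] = results
        return results

    return backtrack(0)
```

Under the problem's constraints (words of at most 10 characters), the inner loop runs at most 10 iterations per position regardless of the length of s.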

Code Walkthrough: Python and Java

In Python, define wordBreak(s, wordDict) with wordSet = set(wordDict). Use @functools.lru_cache(None) on an inner function backtrack(start) that returns a tuple of sentences (tuples are hashable, as lru_cache requires for return values it caches). Base case: return ("",) when start == len(s). The inner loop tries s[start:end] for end in range(start+1, len(s)+1); for each valid word, join it with each suffix sentence and collect. Convert the final result from backtrack(0) to a list — since the constraints guarantee s is non-empty, the bare empty string only ever appears as the base-case sentinel, never in backtrack(0) itself. Time complexity is O(n * 2^n) worst case, but memoization prunes heavily in practice.
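This Python walkthrough can be sketched as follows; the tuple return type is what makes the helper's results cacheable by lru_cache:

```python
from functools import lru_cache

def wordBreak(s, wordDict):
    """lru_cache variant: the cached helper returns a hashable tuple of sentences."""
    wordSet = set(wordDict)

    @lru_cache(None)
    def backtrack(start):
        if start == len(s):
            return ("",)  # sentinel: one empty "sentence" marks a complete split
        results = []
        for end in range(start + 1, len(s) + 1):
            word = s[start:end]
            if word in wordSet:
                for suffix in backtrack(end):
                    results.append(word + " " + suffix if suffix else word)
        return tuple(results)  # tuples are hashable, so lru_cache can store them

    return list(backtrack(0))
```

The cache is keyed by start alone, which is safe because s and wordSet are fixed for the lifetime of the inner function.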

In Java, declare a HashMap<Integer, List<String>> memo. The private List<String> backtrack(String s, Set<String> wordSet, int start, Map<Integer, List<String>> memo) method checks memo first and returns immediately if the key exists. Otherwise, create a new ArrayList<String> result. If start == s.length(), add "" and put result in memo, then return. Loop end from start+1 to s.length()+1: if wordSet.contains(s.substring(start, end)), recurse on end and combine. Store result in memo before returning. Call backtrack(s, wordSet, 0, memo) from the public method.

Both implementations share the same structure: convert dict to set, memoize by start index, base case returns [""] at the end of the string, and combination joins prefix word with suffix sentences using a space. The join logic handles the empty-string base case by returning just the word when the suffix is "". The overall algorithm is correct by induction: if backtrack(end) returns all valid sentences from s[end:], then prepending s[start:end] to each gives all valid sentences from s[start:].

⚠️

Output Can Be Exponentially Large

The result set can grow exponentially. For s = "aaa" with wordDict = ["a","aa","aaa"], there are 4 valid sentences: "a a a", "a aa", "aa a", "aaa". For longer strings of repeated characters with a rich dictionary, the output can contain thousands of sentences. Be prepared for large result sets, and do not bother deduplicating or sorting the output — distinct segmentations always produce distinct sentences, and the problem accepts any order.
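One way to gauge the output size before enumerating anything is to count segmentations with the same DP recurrence, summing counts instead of tracking booleans (this counting helper is my addition, not part of the problem):

```python
def count_segmentations(s, word_set):
    """Count valid segmentations of s without materializing any sentences."""
    n = len(s)
    counts = [0] * (n + 1)
    counts[n] = 1  # one way to segment the empty suffix
    for start in range(n - 1, -1, -1):
        for end in range(start + 1, n + 1):
            if s[start:end] in word_set:
                # every segmentation of s[end:] extends s[start:end]
                counts[start] += counts[end]
    return counts[0]
```

For "aaa" with {"a","aa","aaa"} this returns 4, matching the four sentences above, and the count grows roughly like a tribonacci sequence as the string of repeated characters lengthens.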
