Problem Overview: Find All Words on a Character Board
LeetCode 212 Word Search II gives you an m x n board of characters and a list of words. Your task is to find all words in the list that can be constructed from letters of sequentially adjacent cells on the board. Two cells are adjacent if they are horizontally or vertically neighboring. The same cell may not be used more than once per word, but different words can reuse cells independently.
The output is a list of all words from the input list that appear on the board. Unlike Word Search I (LeetCode 79), which asks whether a single word exists, Word Search II must handle potentially thousands of words in one pass. This distinction is what makes a naive per-word DFS approach unacceptable and motivates the Trie-based solution.
Understanding the constraints helps appreciate why the naive solution fails and why a Trie is the right data structure. The board can be up to 12x12, words can be up to length 10, and the word list can contain up to 30,000 words. The key insight is that searching each word independently ignores the shared prefixes among words — a waste the Trie eliminates.
- m x n board of lowercase English letters
- List of words to search for on the board
- Adjacent means horizontally or vertically neighboring
- Same cell cannot be reused within a single word
- Different words can reuse the same cells independently
- Return all words from the list that are found on the board
Why DFS Per Word Is Too Slow
The Word Search I solution (LeetCode 79) performs DFS from every cell for a single word: O(m*n * 4^L) where L is the word length. Calling this independently for each word in a list of W words gives O(W * m*n * 4^L) total. On a 12x12 board with 30,000 words of average length 7, this is roughly 30,000 * 144 * 4^7 = 88 billion operations — completely impractical.
The fundamental inefficiency is that each per-word DFS re-traverses the same board cells and ignores shared prefix structure. If the word list contains "apple", "application", and "apply", a per-word approach traverses the "appl" prefix three separate times from every starting cell. A Trie collapses all words with shared prefixes into a single tree, so the DFS follows each prefix path only once regardless of how many words share it.
The per-word approach also cannot prune intelligently: it commits to searching for a specific word and must explore all paths that match its characters, even when many of those paths lead nowhere useful for other words. A Trie-guided DFS prunes the moment no Trie edge matches the current character, regardless of which words are being searched.
A Trie Searches All Words Simultaneously in One Traversal
Rather than making W separate DFS calls — one per word — a Trie lets you search all W words in a single traversal of the board. At each cell during backtracking, you follow one Trie edge corresponding to the current character. If that edge does not exist, you prune immediately. If a node marks the end of a word, you collect it. One traversal, all words found — with shared-prefix work done exactly once.
Building the Trie
A TrieNode has two fields: a children map (dict in Python, HashMap in Java) keyed by character, and an optional word field that stores the complete word string if this node marks the end of a word (or null/None if it does not). Storing the full word string at the terminal node — rather than just a boolean flag — makes it easy to append the word directly to results without reconstructing it during backtracking.
Building the Trie means inserting every word from the input list character by character. For each word, start at the root and walk down the Trie, creating new TrieNode objects for characters that do not yet have a corresponding child. After processing all characters of a word, set the word field of the final node to the word string. Inserting W words of average length L takes O(W * L) time and space.
The Trie structure enables two critical optimizations during DFS: prefix pruning (if no child node exists for the current board character, abandon this path entirely) and word deduplication (once a word is found, set its terminal node's word field to null to prevent it from being collected again if the same path is reached via a different board route).
- 1Define TrieNode with children: dict/map and word: str/String (default null)
- 2Create root = TrieNode() as the Trie entry point
- 3For each word in the words list:
- 4 Start at root, walk character by character
- 5 For each char, create a child TrieNode if not present
- 6 After all chars, set node.word = the full word string
- 7Trie is now ready for Trie-guided DFS on the board
Trie-Guided Backtracking on the Board
The main DFS function takes the current board cell (row, col) and the current Trie node. At each step: if the current Trie node has no child for board[row][col], return immediately — this is the key pruning step. Otherwise, descend to the matching child node. If that child node's word field is non-null, a word has been found — add it to results.
To avoid revisiting the same cell within a single word, temporarily mark board[row][col] with a sentinel character like '#' before recursing into the four neighbors. After all recursive calls return, restore the original character. This in-place marking avoids the overhead of a separate visited array and is cache-friendly.
The outer loop calls the DFS function from every cell on the board with the Trie root as the starting node. Words that do not share any prefix with the Trie root's children are pruned on the very first step. Words that share prefixes are explored together along the same DFS paths, paying the traversal cost only once per shared prefix segment.
Pruning Dead Trie Branches After Finding a Word Speeds Up Further Searches
After collecting a found word, set its terminal TrieNode's word field to null so it is not collected again. More aggressively, after the DFS for a child subtree completes, check if that child node has become a leaf (empty children map and null word). If so, remove it from the parent's children map entirely. This shrinks the Trie progressively as words are found, making subsequent DFS calls faster by reducing the number of active Trie nodes to traverse.
Optimization Tricks
The most impactful optimization is removing found words from the Trie to prevent duplicate collection. If the board contains two distinct paths that spell "apple", a naive implementation would add "apple" to results twice. Setting the terminal node's word to null after the first collection ensures each word appears in results at most once without needing a separate result-deduplication set.
Pruning empty Trie branches goes further: after the DFS call for a child returns, if the child node now has an empty children map and a null word field, delete the child from the parent's children map. This makes the Trie progressively smaller as words are found. On boards where many words are quickly found in early DFS calls, this can dramatically reduce the work done for later board cells.
Using '#' as a visited marker instead of a separate boolean visited array has two benefits: it avoids allocating an m*n array per DFS call, and it automatically prevents the DFS from following a Trie edge for '#' (since no word in the list contains '#'). Just remember to restore the cell after backtracking.
- 1Set node.word = null immediately after adding a found word to results (dedup)
- 2After each DFS child call, check if child node is now empty (no children, null word)
- 3If child is empty, remove it from parent's children map (shrink Trie)
- 4Mark visited by setting board[r][c] = '#' before DFS, restore after
- 5Check board[r][c] != '#' at entry to avoid revisiting within same word path
- 6Start DFS from every board cell using the Trie root as starting node
Code Walkthrough — Python and Java
Python solution: class TrieNode: def __init__(self): self.children={}; self.word=None. def findWords(board, words): root=TrieNode(); [insert(root,w) for w in words]; res=[]; [dfs(board,r,c,root,res) for r in range(len(board)) for c in range(len(board[0]))]; return res. def dfs(board,r,c,node,res): if r<0 or r>=len(board) or c<0 or c>=len(board[0]): return; ch=board[r][c]; if ch=='#' or ch not in node.children: return; nxt=node.children[ch]; if nxt.word: res.append(nxt.word); nxt.word=None; board[r][c]='#'; [dfs(board,r+dr,c+dc,nxt,res) for dr,dc in [(0,1),(0,-1),(1,0),(-1,0)]]; board[r][c]=ch; if not nxt.children: del node.children[ch].
Java solution: class Solution { char[][] board; List<String> res = new ArrayList<>(); public List<String> findWords(char[][] board, String[] words) { this.board=board; TrieNode root=new TrieNode(); for(String w:words) insert(root,w); for(int r=0;r<board.length;r++) for(int c=0;c<board[0].length;c++) dfs(r,c,root); return res; } void dfs(int r,int c,TrieNode node) { if(r<0||r>=board.length||c<0||c>=board[0].length) return; char ch=board[r][c]; if(ch=='#'||!node.children.containsKey(ch)) return; TrieNode nxt=node.children.get(ch); if(nxt.word!=null){res.add(nxt.word);nxt.word=null;} board[r][c]='#'; int[][] dirs={{0,1},{0,-1},{1,0},{-1,0}}; for(int[] d:dirs) dfs(r+d[0],c+d[1],nxt); board[r][c]=ch; if(nxt.children.isEmpty()) node.children.remove(ch); } }
Time complexity: O(m*n * 4^L) in the worst case where L is the maximum word length, but Trie pruning makes the practical performance much faster — especially when the word list is large with many shared prefixes. Space complexity: O(total characters in all words) for the Trie plus O(L) recursion stack depth. Compared to the naive O(W * m*n * 4^L) per-word approach, the Trie solution is dramatically faster on real inputs.
Restore the Cell Value After Backtracking — The Most Common Bug
The visited marking trick requires restoring board[r][c] to its original character after the recursive DFS calls return. A common bug is forgetting this restoration, leaving '#' markers in the board permanently. This causes subsequent DFS calls from other starting cells to skip cells they should visit, producing incorrect results. Always save the original character before overwriting: char tmp = board[r][c]; board[r][c] = '#'; dfs(...); board[r][c] = tmp.