Problem Walkthrough

Find All Anagrams in String LeetCode 438

Solve LeetCode 438 Find All Anagrams in a String using a fixed-size sliding window with two character frequency arrays and a matches counter — an O(n) approach that never re-scans the window and finds every anagram start index in one pass.

8 min read|

Find All Anagrams

LeetCode 438

Problem Overview: Finding Every Anagram Start Index

LeetCode 438 Find All Anagrams in a String gives you two strings s and p. Your task is to find all start indices in s where a substring of length len(p) is an anagram of p. An anagram is any rearrangement of p's characters — two strings are anagrams if and only if they have the exact same character frequencies.

The function should return a list of all valid start indices in any order. For example, with s = "cbaebabacd" and p = "abc", the result is [0, 6] because s[0..2] = "cba" and s[6..8] = "bac" are both anagrams of "abc".

This is a medium-difficulty problem that tests your ability to think in terms of frequency windows. The key insight is that anagram detection reduces to frequency comparison, and maintaining two frequency arrays over a sliding fixed-size window lets you detect anagrams in a single pass without recomputing from scratch at each position.

  • s and p consist of lowercase English letters only
  • 1 <= s.length, p.length <= 3 * 10^4
  • Return all start indices of p's anagrams in s (any order)
  • Anagram = same character frequencies, any permutation of characters
  • Substring must have exactly length len(p)
  • Output list may be empty if no anagrams exist

Brute Force and Why It Is Slow

The naive approach checks every substring of s with length len(p) to see if it is an anagram of p. One way is to sort each substring and compare with sorted(p) — this is O(n * k log k) where k = len(p). For the given constraints this is around 3*10^4 * 17 * 4 = 2 million operations, which might barely pass but is far from optimal.

A slightly better brute force builds a frequency array for each window of size k and compares all 26 counts against p's frequency array. Each window comparison is O(26), and there are O(n) windows, giving O(26n) overall — which works but still re-counts every character from scratch for each position instead of reusing the previous window's counts.

The truly efficient solution avoids any full array comparison per step by maintaining a single integer counter called matches. This matches counter tracks how many of the 26 character slots currently have equal counts between the window and p. When matches reaches 26, the window is an anagram. Sliding the window updates only 1-2 character slots per step, making each step O(1).

💡

Matches Counter Key Idea

Use a "matches" integer that counts how many of the 26 characters have equal frequency in the current window and in p. When matches == 26, all characters are balanced — record the start index. This turns the O(26) full comparison per step into O(1) by only updating the 1-2 characters affected by each slide.

Fixed-Size Sliding Window Setup

Start by building the frequency array for p — call it pCount, an array of 26 integers indexed by character (0 = 'a', 25 = 'z'). Then build sCount for the first window of s with length len(p). Both arrays are initialized to all zeros and updated by incrementing the index corresponding to each character.

After filling pCount and sCount for the first window, compute the initial matches value by comparing pCount[i] == sCount[i] for all 26 characters and summing the matching slots. If matches == 26 immediately, index 0 is a valid anagram start.

The window size is always exactly len(p) — it never grows or shrinks. This is a fixed-size sliding window problem, which is simpler than variable-size windows because you always add one character on the right and remove one character on the left in each step.

  1. 1Compute k = len(p); return [] immediately if k > len(s)
  2. 2Build pCount[26] from all characters of p
  3. 3Build sCount[26] from the first k characters of s
  4. 4Compute initial matches = sum(1 for i in range(26) if sCount[i] == pCount[i])
  5. 5If matches == 26, append index 0 to result
  6. 6Begin the sliding loop from index k to len(s) - 1

Sliding the Window One Step at a Time

In each step of the sliding loop, add the new right character (s[r]) and remove the old left character (s[l] where l = r - k). For each of these two characters, update sCount and adjust the matches counter based on whether the count for that character just became equal to or diverged from pCount.

The order matters: first update sCount for the new character, adjust matches for that character, then update sCount for the removed character, adjust matches for it, then check if matches == 26. If it is, append the current left index (l + 1 after removal, or equivalently r - k + 1) to the result list.

Because only two characters change per step — the one entering and the one leaving — only two of the 26 matches slots can change per iteration. This guarantees O(1) work per step regardless of window size, giving the whole algorithm O(n) time overall.

ℹ️

O(1) Per Step via Matches Counter

The matches counter turns the O(26) per-step comparison into O(1). Only the character being added (right) and the character being removed (left) can change their match status. Update matches only for those two characters each iteration. All other 24 characters remain unchanged in both sCount and pCount.

Matches Counter Logic in Detail

When adding a new right character c: increment sCount[c]. Then check: if sCount[c] now equals pCount[c], increment matches (they just became balanced). But if sCount[c] - 1 equaled pCount[c] before the increment (meaning they were equal before and now sCount[c] is one more), decrement matches (they just became unbalanced). These two conditions are mutually exclusive.

When removing the old left character c: decrement sCount[c]. Then check: if sCount[c] now equals pCount[c], increment matches (they just became balanced again). But if sCount[c] + 1 equaled pCount[c] before the decrement (meaning they were equal before and now sCount[c] is one less), decrement matches. Again, mutually exclusive.

This logic is symmetric for adding and removing. It is easy to implement but tricky to get exactly right — check the condition before the update for removal and after the update for addition, or check both before and after. The safest pattern is to check the before-state, update, then check the after-state for each character.

  1. 1Add right char c: if sCount[c] == pCount[c], matches-- (about to become unequal)
  2. 2Add right char c: sCount[c]++
  3. 3Add right char c: if sCount[c] == pCount[c], matches++ (now equal)
  4. 4Remove left char c: if sCount[c] == pCount[c], matches-- (about to become unequal)
  5. 5Remove left char c: sCount[c]--
  6. 6Remove left char c: if sCount[c] == pCount[c], matches++ (now equal)

Code Walkthrough: Python and Java

The Python solution uses two lists of size 26. Build pCount and sCount from the first window. Compute initial matches. Then loop from k to len(s): for each new character add it to sCount with the before/after matches logic, remove the leftmost character with the same logic, and if matches == 26 append r - k + 1. Total time O(n), space O(1) since the arrays are constant size 26.

In Java, use int[26] arrays. The logic is identical. Java's explicit indexing with c - 'a' is straightforward. The loop runs from k to s.length() - 1 inclusive. ArrayList<Integer> collects the result. Both Python and Java solutions run in O(n) time and O(1) space, and the two frequency arrays plus the integer result list are all the memory needed.

A common alternative uses a HashMap<Character, Integer> instead of fixed arrays, which works for Unicode inputs but adds overhead for this problem. Since the problem guarantees lowercase English letters, the fixed int[26] array is optimal — direct index access, no hashing, and the 26-slot comparison at initialization is always exactly 26 iterations regardless of input size.

⚠️

Initialize Matches from the First Window

Do not start with matches = 0 and assume the first window is handled by the loop. Instead, fully compare pCount and sCount after building the first window to get the correct initial matches value, and check if matches == 26 before entering the loop. Starting from 0 without comparison will give wrong results if the first window is an anagram.

Ready to master algorithm patterns?

YeetCode flashcards help you build pattern recognition through active recall and spaced repetition.

Start practicing now