Group Anagrams: The Classic Hash Map Grouping Problem

Group Anagrams (#49) is one of the most elegant problems on LeetCode. It sits at the intersection of string manipulation and hash map usage, and once you see the core insight, the solution practically writes itself. This is the group anagrams leetcode problem that every interviewer loves because it tests whether you can derive a grouping key — a fundamental skill in algorithm design.

The problem appears constantly in real interviews at Amazon, Meta, Bloomberg, and Google. It is rated Medium, but the difficulty is really about recognizing the pattern rather than implementing anything complex. If you understand how hash maps group values by key, you already have ninety percent of the solution.

In this walkthrough, we will break down two clean approaches — the sorted string key and the character count key — explain why both work, cover edge cases, and show you how this pattern extends to dozens of other problems. By the end, you will be able to solve any grouping problem that follows this template.

Understanding the Problem

The problem statement is straightforward: given an array of strings, group the anagrams together. Two strings are anagrams if they contain exactly the same characters in any order. For example, "eat", "tea", and "ate" are all anagrams of each other because they each contain one "a", one "e", and one "t".

You can return the groups in any order, and each group can list its words in any order. The input array can contain empty strings, single characters, and words of varying lengths. The key constraint is that all strings consist of lowercase English letters only, which is important for the character count approach.

What makes this problem interesting is the question it implicitly asks: how do you define a canonical representation that is the same for all anagrams of a word? Once you answer that question, the grouping becomes a single hash map operation. This is the core insight behind every group anagrams leetcode solution.

ℹ️

Interview Frequency

Group Anagrams (#49) is one of the top 10 most-asked Medium problems across all companies — it's the go-to test for hash map grouping skills and appears at Amazon, Meta, and Bloomberg.

Approach 1: Sorted String Key

The most intuitive approach is to sort each word alphabetically and use the sorted version as a hash map key. Since all anagrams contain the same characters, sorting them produces the identical string. For example, "eat", "tea", and "ate" all sort to "aet". You group every word that maps to the same sorted key.

The implementation is remarkably concise. Create a hash map where keys are sorted strings and values are lists of original words. For each word in the input, compute its sorted form, and append the original word to the corresponding list. After processing all words, return the hash map values as your grouped result.

The time complexity is O(n * k log k), where n is the number of strings and k is the maximum length of a string. The k log k factor comes from sorting each word. Space complexity is O(n * k) to store every string in the hash map. In practice, this approach is fast enough for any interview setting and is the version most interviewers expect to see first.

Sort each word alphabetically to create a canonical key
Use a hash map: sorted string -> list of original words
All anagrams share the same sorted key
Time: O(n * k log k) | Space: O(n * k)
Python: defaultdict(list) with "".join(sorted(word)) as the key

Approach 2: Character Count Key

The second approach avoids sorting entirely by counting character frequencies. For each word, create an array of 26 integers representing the count of each letter from a to z. Convert this array to a tuple or string and use it as the hash map key. Two anagrams will always produce the same character count signature.

For example, "eat" produces the count array [1,0,0,0,1,0,...,1,0,...,0] — one "a" at index 0, one "e" at index 4, one "t" at index 19. Every anagram of "eat" produces this exact same array. By converting it to a tuple like (1,0,0,0,1,0,...,1,0,...,0), you get a hashable key for your dictionary.

The time complexity improves to O(n * k) because counting characters is linear in the word length — no sorting needed. The trade-off is that the key itself is longer (26 integers versus the sorted string), but this rarely matters in practice. The character count key anagram approach is worth knowing because interviewers sometimes ask explicitly for a solution without sorting.

Count the frequency of each character (a-z) in each word
Convert the 26-element count array to a tuple or delimited string
Use this count signature as the hash map key
Time: O(n * k) | Space: O(n * k)
Avoids sorting — useful when interviewer asks for linear-time solution

💡

Pro Tip

The sorted-string key approach is cleaner to code in interviews: defaultdict(list) with sorted(word) as the key. Three lines of Python. Use character-count key only if the interviewer asks for O(n*k).

Why These Approaches Work

Both approaches work because anagrams are defined by their character multiset — the collection of characters with their frequencies, ignoring order. Any function that maps a string to a canonical representation of its multiset will produce the same output for all anagrams. Sorting and character counting are two such functions.

The sorted string approach works because sorting is a canonical ordering. If two strings are anagrams, they have the same characters, so sorting produces the same string. The character count approach works because it directly encodes the multiset as a fixed-length vector. Both are valid canonical forms of the same underlying mathematical object.

This insight — that grouping requires a canonical key derived from shared properties — is the real lesson of the group anagrams hash map pattern. You will see it again in problems like Group Shifted Strings (#249), where the key is the sequence of differences between consecutive characters, and Valid Anagram (#242), which is a simplified two-string version of the same idea.

Edge Cases and Pitfalls

Empty strings are valid input and are anagrams of each other. If the input contains multiple empty strings, they should all be grouped together. Both approaches handle this naturally — an empty string sorts to an empty string, and its character count is all zeros.

Single-character strings each form their own group unless duplicates exist. An input like ["a", "b", "a"] should produce [["a", "a"], ["b"]]. When all strings are anagrams of each other, the output is a single group containing every input string. When no two strings are anagrams, each string forms its own group.

A common implementation pitfall in Python involves hashability. Lists are not hashable, so you cannot use a character count list directly as a dictionary key. You must convert it to a tuple first. Similarly, in Java, you would convert the count array to a string using Arrays.toString() or a delimiter-joined format. Missing this detail in an interview can cost you valuable time.

Empty strings group together — both approaches handle this naturally
Single characters form individual groups unless duplicated
All-same-anagram input produces one large group
No-anagram input produces n singleton groups
Convert count arrays to tuples (Python) or strings (Java) for hashability

⚠️

Common Gotcha

In Python, lists aren't hashable so you can't use a character count list as a dict key directly — convert to a tuple: tuple(sorted(word)) or tuple(count). This is a common interview gotcha.

What Group Anagrams Teaches You

The group anagrams explained pattern is really about derived-key grouping with a hash map. You take raw data, compute a canonical key that captures the grouping property, and use a hash map to collect items by that key. This exact template appears in dozens of LeetCode problems and real-world scenarios like database indexing and data deduplication.

Problems that follow the same pattern include Group Shifted Strings (#249), where you compute shift deltas as the key; Longest String Chain (#1048), where you explore word removal paths; and even real-world tasks like grouping log entries by normalized patterns. Once you internalize this approach from solving leetcode 49 solution, you unlock a whole family of problems.

If you want to solidify this pattern through spaced repetition, YeetCode flashcards cover Group Anagrams alongside the full Arrays and Hashing category. Reviewing the pattern regularly — rather than re-solving the problem from scratch — is the fastest way to make the hash map grouping template second nature for your next coding interview.

Group Anagrams LeetCode Solution: Hash Map Grouping Explained