Accounts Merge: Where Union-Find Meets Real Data
Most Union-Find problems on LeetCode operate on integers — node 0 connects to node 3, component sizes grow, and you return a count. Accounts Merge (#721) breaks that mold. Instead of abstract nodes, you are merging real-world user accounts based on shared email addresses. It is the accounts merge leetcode problem that finally makes Union-Find feel practical.
The premise is simple: you receive a list of accounts where each account has a name followed by one or more email addresses. Two accounts belong to the same person if they share at least one email. Your job is to merge all accounts that belong to the same person and return the consolidated list with emails sorted alphabetically.
This problem sits at the intersection of Union-Find and hash map techniques. It is rated medium on LeetCode, but the real challenge is bridging the gap between string data and the integer-based Union-Find structure. Once you see how to map emails to account indices, the solution clicks into place.
Accounts Merge appears frequently in interviews at companies like Google, Meta, and Amazon because it tests whether you can adapt a well-known data structure to a messy, real-world input format. Interviewers love it because candidates who only memorize Union-Find on integer arrays struggle when the input is a list of string arrays. The problem forces you to demonstrate genuine understanding of connected components rather than template recall.
Understanding the Accounts Merge Problem
You are given a list of accounts. Each account is an array where the first element is the account holder's name and the remaining elements are email addresses. For example, ["John", "john@mail.com", "john_work@mail.com"] is one account belonging to John with two emails.
Two accounts must be merged if they share at least one email address. After merging, each resulting account should contain the name followed by all associated emails in sorted order. The output is a list of these merged accounts.
The tricky part is transitivity. If account A shares an email with account B, and account B shares an email with account C, then all three accounts belong to the same person — even though A and C share no email directly. This is exactly the kind of transitive grouping that Union-Find handles naturally.
Consider this concrete example: accounts = [["John","johnsmith@mail.com","john_newyork@mail.com"], ["John","johnsmith@mail.com","john00@mail.com"], ["Mary","mary@mail.com"], ["John","johnnybravo@mail.com"]]. The first two John accounts share "johnsmith@mail.com", so they merge into one account with all three unique emails. The third entry is Mary with her own email. The fourth John has no shared email with any other account, so he remains separate despite having the same name. The output groups the merged Johns together with sorted emails while keeping Mary and the other John as independent accounts.
Why This Problem Matters
Accounts Merge (#721) is the most practical Union-Find problem — it models a real-world scenario (merging duplicate user accounts) that engineers encounter in production systems.
Union-Find Approach for Accounts Merge
The key insight for the leetcode 721 solution is to treat each account index as a node in a Union-Find structure. When two accounts share an email, you union their indices. After processing every email, accounts in the same connected component belong to the same person.
Here is the strategy. Create a hash map that maps each email string to the first account index where you encountered it. As you iterate through the accounts list, for each email in account i, check the map. If the email already exists and was first seen in account j, union(i, j). If it is new, store i as its owner in the map.
After all unions are complete, iterate through the map again. For each email, find the root of the account index it belongs to. Group all emails by their root. Finally, for each group, sort the emails alphabetically and prepend the account name from any account in that component.
You might wonder why Union-Find is preferable to a simple BFS or DFS approach for this problem. Both work, but union find accounts for the incremental nature of the merging process more naturally. With BFS or DFS, you would first need to build an adjacency list — creating edges between all accounts that share an email — and then run a traversal to find connected components. Union-Find skips the explicit graph construction. As you process each email, you immediately merge the relevant accounts. This makes the code shorter, the logic more direct, and the approach easier to explain in an interview setting.
- Create a Union-Find with N components (one per account)
- Build an email-to-account-index map as you iterate
- When an email is seen in a second account, union those two account indices
- Group emails by their root component after all unions
- Sort each group and prepend the name — that is your merged account
Implementation Step by Step
Start by initializing a Union-Find where each account index is its own parent. The parent array has length equal to the number of accounts. Include path compression and union by rank for efficiency.
Next, build the email ownership map. Loop through each account by index. For each email in that account (skipping index 0 which is the name), check if the email is already in the map. If yes, union the current account index with the stored index. If no, add the email to the map with the current account index.
After processing all accounts, create a component map from root index to a list of emails. For every email in the ownership map, find the root of its associated account index and add the email to that root's list. Finally, for each component, sort the email list and prepend the name from any account in that component. The name is the same for all accounts in a component because the problem guarantees accounts with the same name are the only ones that can merge.
A common implementation mistake is trying to union emails directly rather than account indices. Since Union-Find works best with integer keys, you should keep the Union-Find operating on account indices (0 through N-1) and use the email-to-index hash map purely as a lookup bridge. This separation keeps the Union-Find clean and avoids the overhead of creating a string-based union structure. The hash map does the translation work, and the Union-Find does the grouping work — each data structure handles what it is best at.
- 1Initialize Union-Find with parent[i] = i for each account index
- 2For each account i, iterate its emails (indices 1 to end)
- 3If email exists in the map at index j, call union(i, j)
- 4Otherwise, set map[email] = i
- 5Build component groups: for each email in map, find root of its index, add email to that root's group
- 6Sort each group alphabetically, prepend accounts[root][0] as the name
- 7Return all groups as the result
Core Technique
Map each email to the FIRST account index you see it in. When you see the same email in another account, union those two account indices. This connects accounts through shared emails.
Visual Walkthrough: Merging Accounts Through Shared Emails
Consider the input [["John","a@mail","b@mail"], ["John","b@mail","c@mail"], ["Mary","d@mail"]]. There are three accounts with indices 0, 1, and 2. Start with each as its own component.
Process account 0: emails "a@mail" and "b@mail" are new. Map becomes {a@mail: 0, b@mail: 0}. No unions yet.
Process account 1: email "b@mail" is already in the map pointing to account 0. Union(1, 0) — accounts 0 and 1 are now connected. Email "c@mail" is new, map adds {c@mail: 1}. Since 1 and 0 share a root, c@mail effectively belongs to the same component.
Process account 2: email "d@mail" is new. Map adds {d@mail: 2}. No unions. Account 2 stays separate.
Now group by root. Emails a@mail, b@mail, and c@mail all trace back to root 0. Email d@mail traces to root 2. Sort each group: component 0 becomes ["John", "a@mail", "b@mail", "c@mail"], component 2 becomes ["Mary", "d@mail"]. The two John accounts merged through the shared email "b@mail", exactly as expected.
Notice how the union operations create a transitive closure. Account 0 and account 1 were never compared directly as whole entities — they were connected because "b@mail" appeared in both. If we added a fourth account ["John","c@mail","e@mail"], it would union with account 1 through "c@mail", and transitively join the component containing accounts 0 and 1. This cascading merge through shared attributes is the core power of Union-Find, and it happens automatically through the find operation with path compression.
Complexity and Edge Cases
The time complexity is O(N * K * alpha(N) + E log E) where N is the number of accounts, K is the maximum emails per account, alpha is the inverse Ackermann function from Union-Find, and E is the total number of unique emails (for the final sort). In practice this is nearly O(N * K + E log E) since alpha grows astronomically slowly.
Space complexity is O(N + E) for the Union-Find parent array and the email-to-index map. The component grouping phase uses O(E) additional space for the grouped email lists.
Several edge cases deserve attention. An account with a single email that appears nowhere else stays as-is. Multiple accounts can chain together through different shared emails — account A shares email X with B, and B shares email Y with C, merging all three. Two people with the same name but no shared emails remain separate accounts. An account that is already fully merged (no overlapping emails with any other account) simply gets its emails sorted.
In interviews, the follow-up question is often about optimizing the final sort. One approach is to use a sorted set (like a TreeSet in Java) instead of a plain list when collecting emails by component, which maintains sorted order during insertion rather than requiring a sort at the end. Another common follow-up asks about handling the problem at scale — what if there are millions of accounts across distributed systems? The answer involves distributed Union-Find or MapReduce-style connected components, but the single-machine version using the approach described here is what interviewers expect you to implement.
- Single-email accounts with no overlap remain unchanged
- Transitive chains: A-B share email X, B-C share email Y — all three merge
- Same name, no shared emails — accounts stay separate
- Already-merged accounts just get emails sorted in the output
- Empty email lists are not valid input per the problem constraints
Name Trap
Same name does NOT mean same person — only shared emails connect accounts. Two "John" accounts with completely different emails remain separate.
What Accounts Merge Teaches You
This problem is a masterclass in adapting Union-Find to non-integer data. The core trick — mapping strings to indices so Union-Find can operate on them — appears in many real-world systems. Social networks use the same idea to suggest friends, ad platforms use it to deduplicate user profiles, and identity resolution systems merge records exactly this way.
Accounts merge leetcode also reinforces a broader pattern: when you need to find connected components and the connections are defined implicitly (shared attributes rather than explicit edges), Union-Find with a hash map bridge is your go-to strategy. The same template solves problems like Redundant Connection (#684), Number of Connected Components (#323), and Longest Consecutive Sequence (#128).
If you want to drill this pattern until it becomes second nature, YeetCode flashcards break Union-Find problems into the key decision points: when to union, what to use as the representative element, and how to extract the final grouping. Recognizing the accounts merge pattern means you can solve any connected-components-on-strings problem in an interview without hesitation.
The accounts merge explained here sits in a family of Union-Find problems that share the same "map non-integer data to indices, then union" pattern. Similar problems include Sentence Similarity II (#737), where you union words that are synonyms, and Evaluate Division (#399), where you build weighted union structures. Recognizing this family means you can solve any of them by identifying what the "nodes" are, what constitutes a "connection", and what the final grouping query looks like. The accounts merge leetcode problem is the perfect entry point into this family because the real-world metaphor makes the abstract structure concrete.