Problem Overview — FreqStack with Push and Pop
LeetCode 895 Maximum Frequency Stack asks you to design a FreqStack class that supports two operations: push(val) pushes an integer val onto the stack, and pop() removes and returns the most frequent element in the stack. If there is a tie in frequency, the element closest to the top of the stack — the most recently pushed among the tied elements — is returned. For example, if you push 5, 7, 5, 7, 4, 5 in that order, the frequencies are 5→3, 7→2, 4→1. Calling pop() returns 5 because it has the highest frequency of 3. After that pop, 5→2, 7→2, 4→1. The next pop() returns 7 because 5 and 7 are tied at frequency 2 and 7 was pushed more recently than 5 among the frequency-2 layer.
The challenge is achieving O(1) time complexity for both push and pop. With up to 2*10^4 calls and values up to 10^8, any solution that scans all elements or uses a heap with lazy deletion in O(log n) per operation is technically possible but misses the point. The problem is designed to test whether you can recognize that grouping elements by frequency into separate stacks naturally handles both the frequency-maximization and the recency tie-breaking in constant time.
This problem appears in FAANG interviews as a test of data structure creativity. The elegant solution requires two insight leaps: first, that you need to track frequency per value rather than globally, and second, that rather than sorting or heaping by frequency you simply maintain one stack per frequency level and track the current maximum frequency separately.
- Implement FreqStack with push(val) and pop()
- pop() returns the most frequent element in the stack
- Ties in frequency broken by most recently pushed (recency)
- val: 0 to 10^8, at most 2*10^4 calls to push and pop
- It is guaranteed that pop() is never called on an empty FreqStack
- Both push and pop must run in O(1) time
Why a Simple Max-Heap Fails
The instinct when you see "return the most frequent element" is to reach for a max-heap keyed by frequency. You could maintain a heap where each entry stores (frequency, push_timestamp, value), and pop the heap to get the most frequent — with ties broken by the largest timestamp (most recent push). This approach is conceptually correct and gives you the right answer.
The problem is time complexity. A heap gives O(log n) push and O(log n) pop, not O(1). When you push a value that already exists, you need to update its entry in the heap — heap update is O(log n) with a standard binary heap because you must find the element first. You can work around this with lazy deletion: mark old entries as stale and skip them on pop, but this bloats the heap and still costs O(log n) amortized per operation. For an interview that explicitly asks for O(1), the heap is the wrong tool.
A second structural issue is that the heap approach conflates the frequency tracking and the recency tracking into a single key tuple. When multiple elements share the same frequency, you need a secondary sort by recency. Managing this correctly with lazy deletion becomes error-prone. The two-map solution completely sidesteps these problems by making frequency stacks do the recency tracking automatically — the most recently pushed element at a given frequency is simply the top of that frequency's stack.
The Key Insight: Group Elements by Frequency into Separate Stacks
Instead of sorting or heaping by frequency, maintain one stack per frequency level. The stack at groupMap[f] contains all values that currently have frequency f, in push order. When you need the most frequent element with tie-breaking by recency, you simply pop the top of groupMap[maxFreq] — the most recently pushed element at the highest frequency level. No sorting, no O(log n) heap operations, just O(1) stack access.
Data Structure Design — Two Maps and maxFreq
The solution uses two hash maps and one integer. The first map, freqMap, maps each value to its current frequency — how many times that value has been pushed without a corresponding pop removing it. Initially empty, freqMap grows as values are pushed and shrinks (frequencies decrement) as values are popped. The second map, groupMap, maps each frequency integer to a stack (list in Python) of values that currently have exactly that frequency. Initially empty, groupMap gains entries as push adds elements to frequency layers.
The integer maxFreq tracks the current maximum frequency across all pushed elements. It starts at 0 and is updated on push: after incrementing freqMap[val], if the new frequency exceeds maxFreq, update maxFreq. On pop, maxFreq may decrease: after popping from groupMap[maxFreq], if that stack becomes empty, decrement maxFreq by 1. Crucially, maxFreq can only decrease by 1 on each pop — you cannot jump from maxFreq=5 to maxFreq=3 in a single pop because each pop removes one element from the current max-frequency layer, which means at most one level collapses.
The elegance of this design is that an element at frequency f also exists implicitly at all lower frequency levels. When you push val for the third time (so freqMap[val] goes from 2 to 3), you add val to groupMap[3]. But val also remains in groupMap[2] and groupMap[1] from its earlier pushes. This is intentional. When val is popped at frequency 3, it is removed from groupMap[3] and freqMap[val] decrements to 2 — it is now correctly accessible at groupMap[2] for the next pop that needs frequency-2 elements. The lower-frequency stacks act as a history of the element's push sequence.
- 1freqMap: dict mapping val -> current frequency (int)
- 2groupMap: dict mapping freq -> stack (list) of vals at that frequency
- 3maxFreq: int tracking the current maximum frequency, starts at 0
- 4Python: use defaultdict(int) for freqMap, defaultdict(list) for groupMap
- 5Java: use HashMap<Integer, Integer> for freqMap, HashMap<Integer, Stack<Integer>> for groupMap
- 6Both structures are O(n) space total where n is the number of push calls
Push Operation — Increment and Update maxFreq
The push(val) operation has three steps, all O(1). First, increment freqMap[val] by 1. If val is not yet in freqMap, initialize it to 0 before incrementing (or use defaultdict(int) in Python which initializes to 0 automatically). After this step, freqMap[val] holds the new frequency of val — call it f.
Second, append val to groupMap[f]. This records that val has been pushed to frequency level f, and positions it as the most recent element at that frequency. If groupMap[f] does not exist yet, initialize it as an empty list or stack before appending. In Python with defaultdict(list), this is automatic. In Java, you must check if the key f exists in groupMap and create a new Stack if not: groupMap.putIfAbsent(f, new Stack<>()).
Third, update maxFreq: set maxFreq = max(maxFreq, f). Since f is the new frequency of val and frequencies only increase by 1 per push, f can be at most maxFreq + 1. If f equals maxFreq + 1, then val is the new most-frequent element and maxFreq must advance. If f is less than or equal to maxFreq, the maximum does not change. All three steps are O(1) hash map and integer operations.
An Element Appears in Multiple Frequency Stacks Simultaneously
When you push val for the third time, val is added to groupMap[3]. It already exists in groupMap[1] (from the first push) and groupMap[2] (from the second push). All three entries coexist. This is by design — they represent val's push history at each frequency milestone. When val is eventually popped at frequency 3, only groupMap[3] is affected. The entries in groupMap[1] and groupMap[2] remain untouched and will be used if val needs to be returned at those frequency levels later.
Pop Operation — Remove from maxFreq Layer
The pop() operation has four steps, all O(1). First, pop the top element from groupMap[maxFreq]. Call this element val. This is the most recently pushed element at the current maximum frequency — exactly what the problem requires. In Python this is groupMap[maxFreq].pop(). In Java this is groupMap.get(maxFreq).pop().
Second, decrement freqMap[val] by 1. This reflects that val is now at a lower frequency in the data structure. Its next appearance in groupMap will be at frequency freqMap[val] (the decremented value). If freqMap[val] reaches 0, you can optionally delete the key from freqMap to save space, though it is not required for correctness.
Third, check if groupMap[maxFreq] is now empty. If the stack at maxFreq has no more elements, no element currently has that frequency, so decrement maxFreq by 1. This is the only case where maxFreq decreases. If groupMap[maxFreq] still has elements, maxFreq stays the same — there are other elements with the same maximum frequency waiting to be returned by future pop calls. Fourth, return val.
- 1val = groupMap[maxFreq].pop() — get most recent element at max frequency
- 2freqMap[val] -= 1 — decrement frequency of the popped element
- 3If groupMap[maxFreq] is empty: maxFreq -= 1 — lower the max frequency tracker
- 4Return val
- 5O(1) total: one stack pop, one dict decrement, one conditional decrement, one return
- 6maxFreq can only decrease by 1 per pop — no multi-level collapse possible
Code Walkthrough — Python and Java
The Python implementation uses defaultdict for clean initialization. In __init__, set self.freqMap = defaultdict(int) and self.groupMap = defaultdict(list) and self.maxFreq = 0. The push method: self.freqMap[val] += 1; f = self.freqMap[val]; self.groupMap[f].append(val); self.maxFreq = max(self.maxFreq, f). The pop method: val = self.groupMap[self.maxFreq].pop(); self.freqMap[val] -= 1; if not self.groupMap[self.maxFreq]: self.maxFreq -= 1; return val. The entire class is about 10 lines. Both push and pop are O(1) amortized. Space is O(n) where n is the total number of push calls.
The Java implementation uses HashMap with explicit initialization. In the constructor: freqMap = new HashMap<>(); groupMap = new HashMap<>(); maxFreq = 0. The push method: int f = freqMap.getOrDefault(val, 0) + 1; freqMap.put(val, f); groupMap.putIfAbsent(f, new Stack<>()); groupMap.get(f).push(val); if (f > maxFreq) maxFreq = f. The pop method: int val = groupMap.get(maxFreq).pop(); freqMap.put(val, freqMap.get(val) - 1); if (groupMap.get(maxFreq).isEmpty()) maxFreq--; return val. O(1) for both operations, O(n) space.
A subtle correctness point: you should not clean up old groupMap entries when a frequency stack becomes empty during pop. The only action needed is decrementing maxFreq. If you also delete groupMap[maxFreq], you lose history for values at lower frequencies. Similarly, do not delete freqMap entries when a value's frequency hits 0 unless you add a null check before accessing it again — the defaultdict(int) in Python handles this safely, but in Java you must use getOrDefault.
Do Not Remove Values from Lower-Frequency Stacks on Push
A common mistake is trying to "move" val from groupMap[f-1] to groupMap[f] on push — removing it from the old stack and adding it to the new one. This breaks the algorithm. The element must remain in all lower-frequency stacks because those entries represent its history. When val is eventually popped from groupMap[f], its entry in groupMap[f-1] is exactly the record needed for its next pop. The pop operation handles cleanup naturally — you never need to modify lower-frequency stacks during push.