Problem Walkthrough

Kth Largest in Stream LeetCode 703

Maintain a min-heap of exactly k elements where the heap top is always the kth largest element seen so far — any element smaller than the heap top is irrelevant and gets evicted immediately, keeping the heap lean and every add operation O(log k).

7 min read|

Kth Largest in Stream

LeetCode 703

Problem Overview

LeetCode 703 — Kth Largest Element in a Stream — asks you to design a class KthLargest that tracks a stream of integers and returns the kth largest element after each new value is added. The class must support a constructor and an add method that both operate efficiently across many calls.

The challenge here is that the stream is unbounded — you cannot afford to sort all elements each time add is called. You need a data structure that maintains partial order and answers the kth largest query in sub-linear time. A heap is the natural fit.

This is a foundational streaming algorithm problem. The same min-heap-of-size-k pattern appears in top-k frequent elements, find median from data stream, and nearly every interview question that involves maintaining a running statistic over an ordered stream.

  • Design KthLargest class: constructor(k, nums) initializes with an integer k and an initial list nums
  • add(val) inserts val into the stream and returns the current kth largest element
  • The kth largest means the kth largest in sorted order — not the kth distinct element
  • Multiple calls to add must each return the correct kth largest at that point in the stream
  • Constraints: 1 ≤ k ≤ 10⁴, 0 ≤ nums.length ≤ 10⁴, -10⁴ ≤ nums[i] ≤ 10⁴, at most 10⁴ calls to add

Min-Heap of Size k

The key insight is maintaining a min-heap of exactly k elements. The min-heap keeps its smallest element at the top (index 0). If the heap always contains the k largest elements seen so far, then the top of the heap — the smallest among those k — is exactly the kth largest overall.

Any element smaller than the heap top cannot be in the top-k and can be discarded. When a new element arrives that is larger than the heap top, it displaces the current kth largest (the old top) and becomes part of the new top-k. The old top is evicted. The heap top becomes the new kth largest.

This structure is extremely cache-friendly. The heap never grows beyond k elements, so memory usage is O(k) regardless of how many total elements have been streamed. The top is always accessible in O(1), and push/pop operations are O(log k) — independent of total stream length.

💡

Counterintuitive: MIN-heap for kth LARGEST

To find the kth LARGEST element, use a MIN-heap of size k. The smallest element in the heap is the kth largest because exactly k-1 elements above it in the stream are all larger. A max-heap of size k would give you the kth smallest instead.

Constructor

The constructor takes k and an initial array nums. The goal is to initialize the heap so it contains the k largest elements from nums. One approach: heapify all nums in O(n), then repeatedly pop the smallest until the heap size reaches k. Each pop is O(log n), and at most n-k pops are needed, giving O(n log n) total.

A more efficient approach: heapify the first k elements in O(k), then for each remaining element, if it is greater than the heap top, push it and pop the old top. This runs in O(n log k) overall — better when k is small relative to n. Both approaches are valid within the problem constraints.

After the constructor finishes, the heap contains exactly k elements (or fewer if nums has fewer than k elements — the problem guarantees there will always be at least k elements in the stream before add is called). The heap top is the current kth largest.

  1. 1Heapify all nums into a min-heap
  2. 2While heap size > k, pop the smallest element
  3. 3Heap now contains exactly the k largest elements from nums
  4. 4Heap top (heap[0]) is the current kth largest

Add Operation

The add operation is elegantly simple. Push val onto the heap — now the heap may have k+1 elements. If it does, pop the smallest. The heap is back to size k. Return heap[0], which is the kth largest.

This O(log k) operation is the heart of the design. Since the heap never exceeds k+1 elements during add, push and pop each touch at most O(log k) nodes. The total work per add call is bounded by the heap height, which is log k.

After each add, the heap invariant holds: it contains the k largest elements seen so far, and the top is the kth largest. This invariant is maintained through every insertion regardless of whether val is larger or smaller than the current kth largest.

ℹ️

No-op disguised as push+pop

If val is smaller than heap[0] (the current kth largest), pushing then immediately popping removes val itself. The heap content is unchanged. The kth largest does not change. This is effectively a no-op — but the code handles it uniformly without any special case.

Walk-Through Example

Let k=3 and initial nums=[4,5,8,2]. After heapify and trimming to size 3, the heap contains [4,5,8] with top=4 (the 3rd largest). The first add call arrives.

add(3): push 3 → heap becomes [3,4,5,8] with size 4. Pop smallest (3) → heap is [4,5,8]. Return heap[0] = 4. The 3rd largest of {4,5,8,2,3} = 4. Correct.

add(5): push 5 → heap becomes [4,5,5,8] with size 4. Pop smallest (4) → heap is [5,5,8]. Return heap[0] = 5. The 3rd largest of {4,5,8,2,3,5} = 5. Correct. Each add step takes O(log 3) = O(1) constant time for this small k.

  1. 1k=3, nums=[4,5,8,2]: heapify → pop 2 → heap=[4,5,8], top=4
  2. 2add(3): push → [3,4,5,8], pop 3 → [4,5,8], return 4
  3. 3add(5): push → [4,5,5,8], pop 4 → [5,5,8], return 5
  4. 4add(10): push → [5,5,8,10], pop 5 → [5,8,10], return 5
  5. 5Each add is O(log k) — the heap never grows beyond k+1 elements

Code Walkthrough — Python and Java

In Python, use the heapq module which provides a min-heap. Constructor: self.k = k; self.heap = nums; heapq.heapify(self.heap); while len(self.heap) > k: heapq.heappop(self.heap). Add: heapq.heappush(self.heap, val); if len(self.heap) > self.k: heapq.heappop(self.heap); return self.heap[0]. Constructor is O(n log k), add is O(log k), space is O(k).

In Java, use PriorityQueue which is a min-heap by default. Constructor: this.k = k; this.pq = new PriorityQueue<>(); for (int num : nums) { pq.offer(num); if (pq.size() > k) pq.poll(); }. Add: pq.offer(val); if (pq.size() > k) pq.poll(); return pq.peek(). Same O(n log k) constructor and O(log k) add complexity.

Both implementations share the same algorithmic structure: maintain a min-heap bounded to size k. The heap library handles all balancing automatically. The only language-specific detail is that Python uses heapq.heappush/heappop on a list, while Java uses PriorityQueue.offer/poll. Space complexity is O(k) for the heap storage.

⚠️

Do not use heapq.nlargest

Python's heapq.nlargest(k, stream) won't work here because it builds a new heap from scratch on every call — O(n) per add instead of O(log k). You must maintain a persistent heap instance across calls to achieve the O(log k) per-add guarantee that makes this design efficient.

Ready to master algorithm patterns?

YeetCode flashcards help you build pattern recognition through active recall and spaced repetition.

Start practicing now