Problem Overview: Merge K Sorted Lists
LeetCode 23 — Merge K Sorted Lists — gives you an array of k linked lists, each sorted in ascending order, and asks you to merge all of them into a single sorted linked list. It is one of the most frequently asked Hard problems in FAANG interviews and a benchmark for mastery of heap and divide-and-conquer patterns.
The problem is a natural extension of merging two sorted lists (LeetCode 21). Once you can merge two lists in O(n) time, the question becomes: how do you efficiently extend that to k lists without doing redundant work?
Understanding the constraints is essential for choosing the right approach. k can be as large as 10^4, and the total number of nodes across all lists can also reach 10^4. This means the brute-force and naive pairwise merge approaches will not be competitive.
- Input: array of k linked lists, each sorted in ascending order
- Output: one sorted linked list containing all nodes
- k can be up to 10^4 (up to ten thousand lists)
- Total nodes across all lists can be up to 10^4
- Node values fit in a 32-bit signed integer
Brute Force and Naive Merge
The simplest approach is to collect all node values into an array, sort it, then build a new linked list from the sorted values. This runs in O(n log n) time where n is the total number of nodes. While easy to implement, it does not exploit the fact that the input lists are already sorted — sorting all values from scratch wastes the pre-sorted structure.
A slightly smarter but still suboptimal approach is to merge lists one by one: merge list 1 and list 2 into a result, then merge that result with list 3, and so on. The problem is that the running result grows with each merge. In the worst case — all n nodes in the first list and one node per remaining list — each merge scans the entire result. Total work becomes O(n * k), which is O(n^2) when k is close to n.
Neither approach is acceptable when k and n approach 10^4. The key insight that unlocks the optimal solutions is that at any given step you only need to compare k candidates — one from the front of each list — rather than rescanning the entire merged result.
Key Insight
Both optimal approaches achieve O(n log k) by limiting comparisons to k candidates at each step. The min-heap keeps exactly k elements; divide-and-conquer reduces the problem size by half each round.
Min-Heap Approach
The min-heap (priority queue) approach directly models the observation above. Push the head node of each non-empty list into a min-heap keyed by node value. The heap always contains at most k elements — one per active list.
At each step, pop the smallest node from the heap, append it to the result list, and push that node's next pointer into the heap (if it exists). You are always choosing the global minimum from k candidates in O(log k) time. Repeat until the heap is empty.
The total work is O(n log k): n pop-and-push operations, each costing O(log k). Space is O(k) for the heap itself, plus O(1) extra beyond the output list. This is the approach most interviewers expect for this problem.
1. Push the head of each non-empty list into a min-heap keyed by node value
2. While the heap is non-empty, pop the minimum node
3. Append the popped node to the result linked list
4. If the popped node has a next pointer, push that next node into the heap
5. Return the result list head (use a dummy head node to simplify edge cases)
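The steps above can be sketched in Python as follows. This is a minimal illustration, assuming the usual LeetCode `ListNode` definition (reproduced here so the sketch is self-contained); the function name `merge_k_lists` and the `(val, index, node)` tuple layout are our own choices, not mandated by the problem.

```python
import heapq

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def merge_k_lists(lists):
    """Merge k sorted linked lists via a min-heap of (val, index, node)."""
    heap = []
    # Seed the heap with the head of each non-empty list.
    # The index i breaks ties so Python never compares ListNode objects.
    for i, node in enumerate(lists):
        if node:
            heapq.heappush(heap, (node.val, i, node))

    dummy = ListNode()                 # dummy head simplifies edge cases
    tail = dummy
    while heap:
        _, i, node = heapq.heappop(heap)   # global minimum among k candidates
        tail.next = node
        tail = node
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    return dummy.next
```

Each node is pushed and popped exactly once, and every heap operation costs O(log k), giving the O(n log k) total described above.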
Divide and Conquer
The divide-and-conquer approach mirrors merge sort's merge phase. In the first round, pair up adjacent lists and merge each pair: merge(list[0], list[1]), merge(list[2], list[3]), and so on. After one round you have k/2 merged lists. Repeat until only one list remains.
Each round processes all n total nodes once — O(n) work per round. With k lists, there are log k rounds (since you halve the number of lists each round). Total time is O(n log k), matching the heap approach.
The space advantage over the heap is that divide-and-conquer uses only O(log k) stack space for the recursion, compared to O(k) for the heap. In practice, for large k, the divide-and-conquer approach can also exhibit better cache locality since it operates on contiguous list pairs rather than randomly accessing heap slots.
Cache Locality
Divide-and-conquer has better cache locality than the heap approach because it works on contiguous list pairs. Each merge pass accesses memory sequentially, while the heap randomly accesses k different list positions.
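A compact sketch of the pairing strategy, written iteratively to keep stack usage minimal. The `ListNode` definition and the helper name `merge_two` are assumptions for illustration; the pairing loop is one reasonable way to realize the rounds described above.

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def merge_two(a, b):
    """Iteratively merge two sorted lists in O(len(a) + len(b))."""
    dummy = ListNode()
    tail = dummy
    while a and b:
        if a.val <= b.val:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b          # splice in whichever list remains
    return dummy.next

def merge_k_lists(lists):
    """Pair up adjacent lists and merge each pair until one list remains."""
    if not lists:
        return None
    while len(lists) > 1:
        merged = []
        for i in range(0, len(lists), 2):
            a = lists[i]
            b = lists[i + 1] if i + 1 < len(lists) else None
            merged.append(merge_two(a, b))
        lists = merged          # k halves each round -> log k rounds
    return lists[0]
```

Each round is a tight linear scan over all n nodes, and the list count halves every round, matching the O(n log k) bound.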
Comparing the Two Approaches
Both the min-heap and divide-and-conquer approaches achieve O(n log k) time complexity, which is optimal for this problem. The differences show up in space usage, implementation complexity, and constant factors.
The heap uses O(k) extra space to maintain the priority queue. For k = 10^4 this is 10,000 heap entries — modest in absolute terms but larger than the divide-and-conquer stack. The heap approach is typically easier to code in an interview: initialize the heap, run the loop, done.
Divide-and-conquer uses O(log k) stack space for recursion. Its implementation requires managing the pairing logic and a recursive or iterative merge, which adds a few more lines. However, the constant factor is often better in practice since each merge is a tight linear scan without heap overhead.
1. Time complexity: both approaches are O(n log k)
2. Heap space: O(k) for the priority queue
3. Divide-and-conquer space: O(log k) for the recursion stack
4. Heap: simpler to code, widely expected in interviews
5. Divide-and-conquer: better constants and cache behavior for large k
Code Walkthrough: Python and Java
In Python, the heap version requires special handling because ListNode is not comparable by default. The cleanest fix is to push tuples of (val, index, node) where index is a tiebreaker that prevents Python from attempting to compare ListNode objects directly. The index can simply be the position of the list in the original array.
The divide-and-conquer version in Python is equally straightforward: a helper function merges two sorted lists iteratively, and the main function repeatedly pairs and merges lists until one remains. In Java, PriorityQueue takes a comparator; prefer (a, b) -> Integer.compare(a.val, b.val) over the common (a, b) -> a.val - b.val idiom, since the subtraction can overflow for values near the 32-bit limits. Both languages must handle the edge case where the input array is empty or contains only null lists.
For the recursive divide-and-conquer in Java, be careful about the base cases: zero lists returns null, one list returns that list unchanged, two lists goes to the two-list merge helper. The recursive structure then mirrors merge sort cleanly.
Python Gotcha
In Python, ListNode is not comparable by default. If you push raw ListNode objects into heapq, any comparison between two nodes raises a TypeError; even (val, node) pairs fail when two values are equal, because Python then falls back to comparing the nodes themselves. Always push (val, index, node) tuples, or define __lt__ on the ListNode class.
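The __lt__ alternative can be sketched as below. The ListNode definition is a hypothetical stand-in (LeetCode supplies its own class, so in practice you would subclass or patch it); the point is only that a defined __lt__ lets heapq order nodes directly.

```python
import heapq

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

    def __lt__(self, other):
        # Lets heapq compare nodes directly; duplicate values no longer
        # trigger an unsupported fallback comparison (TypeError).
        return self.val < other.val

heap = []
for v in (3, 1, 3):            # duplicate values are fine now
    heapq.heappush(heap, ListNode(v))
```

With this in place, the heap solution can push bare nodes instead of (val, index, node) tuples.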