What Is a Minimum Spanning Tree
A minimum spanning tree (MST) is a subset of edges in a weighted undirected graph that connects all n nodes using exactly n-1 edges while minimizing the total edge weight. The MST spans the entire graph (reaches every node) and is acyclic (no cycles). If all edge weights are distinct, the MST is unique.
MSTs arise in real-world problems whenever you need to connect multiple nodes at minimum cost: laying network cables between cities, designing road networks, clustering data points, and building communication infrastructure. The classic LeetCode MST problems disguise these scenarios as point grids, city connections, or pipe networks.
Two fundamental properties define MSTs: the cut property (the minimum-weight edge crossing any cut of the graph belongs to some MST) and the cycle property (the maximum-weight edge in any cycle does not belong to any MST). These properties underpin both Kruskal's and Prim's algorithms.
- Connect all n nodes with exactly n-1 edges — minimizing total edge weight
- MST is unique when all edge weights are distinct
- Acyclic by definition — adding any extra edge creates a cycle
- Applications: network design, clustering, pipe/road infrastructure
- Cut property: minimum edge crossing any cut belongs to an MST
Kruskal's Algorithm
Kruskal's algorithm is edge-centric. Start by sorting all edges by weight in ascending order. Iterate through sorted edges and greedily add each edge to the MST if it does not form a cycle — i.e., if the two endpoints belong to different connected components. Stop once n-1 edges have been added. The result is the MST.
Cycle detection is performed using Union-Find (Disjoint Set Union). Before adding an edge (u, v), call find(u) and find(v). If they return the same root, the edge would create a cycle — skip it. Otherwise, call union(u, v) to merge their components and include the edge. This keeps the time complexity at O(E log E) — dominated by the initial sort.
Kruskal's is particularly natural when the input is given as an edge list. It processes edges globally in sorted order rather than exploring locally from a node. This makes it straightforward to implement and reason about for interview settings.
Kruskal's Is Edge-Centric
Kruskal's is edge-centric: great when you have an edge list and E is small relative to V^2. Sort edges once, then Union-Find does near O(1) per edge. Total: O(E log E).
Union-Find for Kruskal's
Union-Find (Disjoint Set Union) is the data structure that makes Kruskal's efficient. It maintains a collection of disjoint sets, each representing a connected component. Two operations: find(x) returns the root/representative of x's component; union(x, y) merges the components containing x and y.
With path compression and union by rank, both operations run in nearly O(1) amortized time (inverse Ackermann function). Path compression flattens the tree during find — each node on the path points directly to the root after the call. Union by rank attaches the shorter tree under the taller one to keep trees flat.
Cycle detection with Union-Find is simple: before adding edge (u, v), if find(u) == find(v), they're already connected — adding this edge creates a cycle. Skip it. If find(u) != find(v), union them and include the edge. After n-1 successful unions, the MST is complete.
- 1Initialize parent[i] = i and rank[i] = 0 for each node i
- 2find(x): if parent[x] != x, recurse and apply path compression
- 3union(x, y): find roots of both; attach smaller rank tree under larger
- 4For each edge (u, v) in sorted order: if find(u) != find(v), union and add edge
- 5Stop when n-1 edges added — MST is complete
- 6Cycle detected when find(u) == find(v) — skip that edge
Prim's Algorithm
Prim's algorithm is vertex-centric. Start from any arbitrary node and maintain a min-heap (priority queue) of edges connecting the visited set to the unvisited set. At each step, extract the minimum-weight edge from the heap, mark the new node as visited, and add all its edges to unvisited neighbors into the heap. Repeat until all nodes are visited.
The greedy choice at each step — always picking the cheapest edge from the current MST boundary to an unvisited node — is correct by the cut property. The visited set grows one node at a time. Each new node brings its adjacent edges into the heap. The total time complexity is O(E log V) with a binary heap.
Prim's naturally works with adjacency lists. It is similar in structure to Dijkstra's algorithm — both use a min-heap and visited set — but Prim's minimizes edge weight to the MST boundary rather than cumulative distance from the source.
Prim's Is Vertex-Centric
Prim's is vertex-centric: better for dense graphs or when you start from a specific node. Min-heap always pops the cheapest edge to an unvisited node. O(E log V) with a binary heap.
Kruskal vs Prim
Both algorithms produce a valid minimum spanning tree. The choice between them depends on graph density and input format. Kruskal's is more natural when you have an edge list; Prim's is more natural when you have an adjacency list. For sparse graphs (E much less than V^2), Kruskal's is efficient. For dense graphs (E close to V^2), Prim's with a Fibonacci heap can be faster, though binary heap Prim's is the practical interview choice.
In LeetCode problems, the input often comes as an edge list or a coordinate grid. For edge-list inputs like connecting cities (LC 1135), Kruskal's is the direct approach — sort edges, apply Union-Find. For point-grid inputs like min cost to connect all points (LC 1584), both work but Prim's can avoid generating all O(V^2) edges up front by expanding greedily.
Both algorithms are O(E log E) or O(E log V) in practice. Neither is strictly better in all cases. Interview tip: if you remember Union-Find well, default to Kruskal's. If you're more comfortable with BFS/heap traversal, default to Prim's.
- 1Kruskal's: better for sparse graphs (E << V^2), needs edge list, Union-Find based
- 2Prim's: better for dense graphs, adjacency list, heap based
- 3Kruskal's time: O(E log E) dominated by edge sort
- 4Prim's time: O(E log V) with binary heap
- 5Both produce a valid MST — choose based on input format and familiarity
- 6For interview: Union-Find comfort → Kruskal; heap/BFS comfort → Prim
Common MST LeetCode Problems
Connecting Cities With Minimum Cost (LC 1135) is the canonical MST problem. Given n cities and weighted edges between them, find the minimum cost to connect all cities. Directly apply Kruskal's: sort edges, Union-Find to avoid cycles. Prim's also works with an adjacency list. The answer is the sum of the n-1 MST edges.
Min Cost to Connect All Points (LC 1584) gives you 2D coordinates and defines edge weight as Manhattan distance between any two points. You have a complete graph of V^2/2 edges. Prim's shines here: start from point 0, maintain the minimum distance from each unvisited point to the current MST, greedily pick the closest. O(V^2) time without a heap, O(V^2 log V) with one.
Optimize Water Distribution in a Village (LC 1168) adds a twist: you can build a well at any house (cost given) or lay pipes between houses (edge weights). The trick is adding a virtual node 0 connected to each house with edge weight equal to the well cost, then running standard MST on the augmented graph.
MST Pattern Recognition
If the problem says "connect all nodes with minimum total cost" it is almost certainly MST. Do not overcomplicate it with Dijkstra or DP. Sort edges + Union-Find (Kruskal) or min-heap expansion (Prim) — pick one and go.