Skip to content
C Codeloom
DSA

Kruskal's and Prim's MST Algorithms

Build minimum spanning trees with Kruskal's union-find approach and Prim's priority queue approach, with side-by-side Python implementations.

·5 min read · By Yash Kesharwani
Intermediate 10 min read

What you'll learn

  • What a minimum spanning tree is and why it matters
  • Kruskal's greedy edge-sort approach with union-find
  • Prim's greedy node-growth approach with a priority queue
  • Complexity trade-offs between the two algorithms
  • When MSTs show up in real systems

Prerequisites

  • Graphs basics from /blog/graphs-introduction
  • BFS and DFS from /blog/graphs-bfs-and-dfs
  • Sorting from /blog/sorting-algorithms-overview

A minimum spanning tree, or MST, is a subset of edges that connects every node in a weighted undirected graph using the smallest possible total edge weight, and contains no cycles. Two classical algorithms compute it: Kruskal’s and Prim’s. Both are greedy, both produce correct MSTs, and they differ mainly in the data structures they use.

Why MSTs Matter

MSTs solve the cheapest-to-connect problem. They are used to lay out network cables, design circuit boards, cluster data points, build approximations for the traveling salesman problem, and prune redundant edges from dense networks.

A spanning tree of a graph with V nodes always has exactly V minus one edges. The MST is the spanning tree with minimum total weight. If the graph is disconnected, the same algorithms produce a minimum spanning forest.

Kruskal’s Algorithm

Kruskal sorts all edges by weight, then iterates through them in order. For each edge, if its two endpoints are not already in the same connected component, add the edge to the MST and merge the components. If they are already connected, skip the edge to avoid creating a cycle.

The key data structure is union-find, also called disjoint-set union. It supports two operations: find the component of a node, and union two components together. With path compression and union by rank, both operations run in nearly constant amortized time.

class DSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

def kruskal(num_nodes, edges):
    edges_sorted = sorted(edges, key=lambda e: e[2])
    dsu = DSU(num_nodes)
    mst = []
    total = 0
    for u, v, w in edges_sorted:
        if dsu.union(u, v):
            mst.append((u, v, w))
            total += w
            if len(mst) == num_nodes - 1:
                break
    return mst, total

edges = [
    (0, 1, 4),
    (0, 2, 3),
    (1, 2, 1),
    (1, 3, 2),
    (2, 3, 4),
    (3, 4, 2),
]
print(kruskal(5, edges))

Kruskal runs in O(E log E) for sorting plus near-linear union-find work, which simplifies to O(E log V) since E is at most V squared.

Prim’s Algorithm

Prim grows a single tree starting from an arbitrary node. At each step, it picks the lightest edge that connects a node already in the tree to a node not yet in the tree. The priority queue stores candidate edges by weight.

import heapq
from math import inf

def prim(graph, start=0):
    visited = {start}
    pq = []
    for v, w in graph[start]:
        heapq.heappush(pq, (w, start, v))

    mst = []
    total = 0
    while pq and len(visited) < len(graph):
        w, u, v = heapq.heappop(pq)
        if v in visited:
            continue
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for nxt, nw in graph[v]:
            if nxt not in visited:
                heapq.heappush(pq, (nw, v, nxt))
    return mst, total

graph = {
    0: [(1, 4), (2, 3)],
    1: [(0, 4), (2, 1), (3, 2)],
    2: [(0, 3), (1, 1), (3, 4)],
    3: [(1, 2), (2, 4), (4, 2)],
    4: [(3, 2)],
}
print(prim(graph, 0))

Prim with a binary heap runs in O((V + E) log V). With a Fibonacci heap the bound improves to O(E + V log V), but the constants make this impractical for most workloads.

Choosing Between Them

On sparse graphs, Kruskal often wins because sorting edges is cheap and union-find is extremely fast. On dense graphs, Prim is competitive because most edges are relevant and the priority queue handles them in a streaming fashion.

Implementation taste matters too. Kruskal feels like sorting plus union-find, which is elegant if you already have those tools. Prim feels like a weighted BFS, which is natural if you have just written Dijkstra. In fact Prim and Dijkstra differ only in what gets stored in the priority queue.

Correctness Intuition

Both algorithms are special cases of the cut property: for any partition of vertices into two non-empty sets, the lightest edge crossing the cut belongs to some MST. Kruskal applies this globally by picking the lightest safe edge anywhere. Prim applies it locally by picking the lightest edge crossing the cut between the current tree and the rest.

Ties matter for uniqueness. If multiple edges share the same weight, the graph may have several distinct MSTs, all with the same total weight. The algorithms find some MST, not the MST.

Practical Applications

Network design uses MSTs to minimize cable or fiber cost. Clustering algorithms like single-linkage hierarchical clustering implicitly compute an MST and then cut it at chosen edges. Approximation algorithms for the Euclidean traveling salesman problem use MSTs to get within a factor of 2 of optimal. Image segmentation uses MSTs on pixel grids to group similar regions.

For comparing trade-offs to other graph algorithms, see /blog/graphs-bfs-and-dfs and /blog/big-o-notation-explained.

Wrap up

Kruskal sorts edges and joins components using union-find. Prim grows one tree using a priority queue. Both produce an MST in roughly O(E log V) time. Pick Kruskal for sparse edge lists, Prim for dense adjacency representations, and know that both encode the same underlying greedy cut property.