Union-Find / Disjoint Set Union (DSU) Explained

Intermediate 10 min read

What you'll learn

✓What disjoint sets are and the find / union operations
✓How the parent array represents a forest of trees
✓Why union by rank and path compression yield near-O(1) amortized cost
✓How to apply DSU to connected components and Kruskal's MST
✓Common pitfalls when implementing union-find from scratch

Prerequisites

•Arrays: [Arrays Introduction](/blog/arrays-introduction)
•Big-O basics: [Big-O Notation Explained](/blog/big-o-notation-explained)
•Graphs: [Graphs: BFS and DFS](/blog/graphs-bfs-and-dfs)

The Disjoint Set Union (DSU) structure, often called Union-Find, keeps track of a collection of disjoint sets under two operations:

find(x) returns a representative element for the set containing x.
union(x, y) merges the sets containing x and y.

It sounds simple, but with the right optimizations both operations run in O(alpha(n)) amortized time, where alpha is the inverse Ackermann function. For any practical input, that is effectively a constant. DSU shows up wherever you need to dynamically track connectivity: Kruskal’s minimum spanning tree, percolation, image segmentation, equivalence classes, and offline graph queries.

The parent array

Represent each set as a tree whose root is the set’s representative. Store the trees implicitly in a parent array where parent[i] is i’s parent, and a root points to itself.

class DSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.components = n

Initially every element is its own root, so there are n singleton sets.

Find with path compression

A naive find walks up the parent pointers until it reaches a root. That is correct but slow if the trees grow tall. Path compression flattens the path on the way back: every node visited becomes a direct child of the root.

def find(self, x):
    while self.parent[x] != x:
        self.parent[x] = self.parent[self.parent[x]]  # halving
        x = self.parent[x]
    return x

The version above uses “path halving”, which is slightly faster than full recursive compression and avoids stack overflows on deep chains. A purely recursive variant is clearer if you prefer:

def find(self, x):
    if self.parent[x] != x:
        self.parent[x] = self.find(self.parent[x])
    return self.parent[x]

Union by rank

Without care, union could repeatedly attach the larger tree under the smaller, growing depth unnecessarily. Union by rank always attaches the shorter tree under the taller. Rank approximates tree height; it only increases when two trees of equal rank merge.

def union(self, x, y):
    rx, ry = self.find(x), self.find(y)
    if rx == ry:
        return False  # already connected
    if self.rank[rx] < self.rank[ry]:
        rx, ry = ry, rx
    self.parent[ry] = rx
    if self.rank[rx] == self.rank[ry]:
        self.rank[rx] += 1
    self.components -= 1
    return True

A common alternative is union by size: track the size of each tree and attach the smaller under the larger. Both achieve the same amortized bound; size has the bonus that you can answer “how big is x’s component?” in O(alpha(n)).

Why near-O(1) amortized?

With both path compression and union by rank, any sequence of m operations on n elements runs in O(m * alpha(n)) total time. The inverse Ackermann alpha(n) grows so slowly that it is at most 4 for any n that fits in the observable universe. In practice, treat DSU operations as constant time and feel only a tiny bit guilty about it.

The proof is famously subtle, but the intuition is: path compression makes trees flat very quickly, and union by rank prevents pathological deep attachments in the first place.

Problem 1: Number of connected components

Given n nodes labeled 0..n-1 and an edge list, return the number of connected components.

You could run BFS or DFS as in Graphs: BFS and DFS. DSU gives an equally clean solution and is preferred when edges arrive in a stream and you only need connectivity questions.

def count_components(n, edges):
    dsu = DSU(n)
    for u, v in edges:
        dsu.union(u, v)
    return dsu.components

Time is O((n + m) * alpha(n)) which is effectively linear.

Problem 2: Kruskal’s minimum spanning tree

Given a weighted undirected graph, build a spanning tree of minimum total edge weight.

Sort edges by weight, then greedily add each edge that connects two previously disconnected components. DSU answers “are these endpoints already connected?” in near-constant time, which is what makes Kruskal’s algorithm efficient.

def kruskal(n, edges):
    edges.sort(key=lambda e: e[2])  # (u, v, w)
    dsu = DSU(n)
    mst_weight = 0
    mst_edges = []
    for u, v, w in edges:
        if dsu.union(u, v):
            mst_weight += w
            mst_edges.append((u, v, w))
            if len(mst_edges) == n - 1:
                break
    return mst_weight, mst_edges

Sorting dominates the runtime at O(m log m). Each union is effectively constant, so the total is O(m log m).

When not to use DSU

DSU answers “are these in the same set?” but it does not give you the actual path between elements, and it does not support efficient split or remove operations. If you need to undo unions, you need a specialized “DSU with rollback” variant that omits path compression. If you need shortest paths or traversal order, you still need BFS/DFS or Dijkstra.

For problems that look like “process events online and ask connectivity questions”, DSU is almost always the right call. For problems that look like “explore from a single source”, a traversal is better. Many DP problems on graphs combine both; see Dynamic Programming Introduction for the broader pattern of combining structures.

Common pitfalls

Forgetting to compress on every find. Even a single uncompressed path can degrade later operations.
Calling union without find. Always union the roots, not the input elements directly, or rank updates become meaningless.
Mixing union by rank and union by size sloppily. Pick one and stay consistent; the bookkeeping is different.
Returning the wrong “already connected” signal. Returning a bool from union is convenient: callers can detect cycles in one step.

Wrap up

Union-Find is one of the highest-leverage data structures in competitive programming and interviews. Memorize the parent array, the compressed find, the rank-based union, and the “treat it as constant time” rule. Once those three pieces are reflexive, problems involving connectivity, equivalence classes, or greedy edge selection become almost mechanical.