Skip to content
C Codeloom
SQL

SQL Recursive CTEs: A Practical Tutorial

Learn how recursive CTEs work in SQL, how to traverse hierarchies and graphs in pure SQL, and how to avoid the common termination and performance pitfalls.

·4 min read · By Codeloom
Intermediate 8 min read

What you'll learn

  • What a recursive CTE is and how it executes
  • How anchor and recursive parts combine
  • How to walk trees, graphs, and number sequences
  • How to avoid infinite recursion with cycle detection
  • When to use a CTE vs an application-side traversal

Prerequisites

  • Basic SQL familiarity

What and Why

Some data is naturally hierarchical: org charts, category trees, threaded comments, dependency graphs, file systems. Querying these in plain SQL is awkward without a loop, and writing the loop in the application means many round trips.

Recursive Common Table Expressions (CTEs) let SQL itself express the loop. You define a starting set, then a rule for extending it. The database keeps applying the rule until no new rows are produced. The result is a clean, set-based way to walk arbitrarily deep structures in a single query.

Mental Model

A recursive CTE has two parts joined by UNION ALL. The anchor query produces the initial rows, for example the root of the tree. The recursive query references the CTE name itself and produces the next level, given the previous level. The engine repeats the recursive step using the most recently produced rows as input, accumulating results until the step returns zero rows.

Conceptually, think of it as a queue. Start the queue with anchor rows. Pop the current frontier, apply the recursive step, push the new frontier. Stop when the frontier is empty. The accumulated rows are the answer.

This model handles trees naturally and, with cycle detection, also general graphs.

Hands-on Example

Suppose an employees table has id, name, and manager_id. Listing everyone under a manager is a single recursive CTE.

WITH RECURSIVE org AS (
  -- anchor: the chosen manager
  SELECT id, name, manager_id, 1 AS depth
  FROM employees
  WHERE id = 42

  UNION ALL

  -- recursive: direct reports of the previous level
  SELECT e.id, e.name, e.manager_id, o.depth + 1
  FROM employees e
  JOIN org o ON e.manager_id = o.id
)
SELECT * FROM org ORDER BY depth, name;

 anchor:   [ #42 Alice ]                  depth 1

 step 1 -> direct reports of Alice
           [ #51 Bob, #52 Carol ]          depth 2

 step 2 -> direct reports of {Bob, Carol}
           [ #61 Dan, #62 Eve, #63 Faye ]  depth 3

 step 3 -> direct reports of {Dan, Eve, Faye}
           [ ]                              STOP

 final result = union of all levels
A recursive CTE expands a frontier level by level until no new rows appear.

The same shape works for category trees, parent-child threads, and any other adjacency-list structure.

Common Pitfalls

The most dangerous pitfall is infinite recursion. If your data has cycles, or your join condition is wrong, the CTE will keep producing rows forever (or until memory runs out). Defend against this by tracking visited nodes in an array column and filtering them out, or by limiting depth.

WITH RECURSIVE walk AS (
  SELECT id, ARRAY[id] AS path, 1 AS depth FROM nodes WHERE id = 1
  UNION ALL
  SELECT n.id, w.path || n.id, w.depth + 1
  FROM nodes n JOIN edges e ON e.dst = n.id
  JOIN walk w ON e.src = w.id
  WHERE NOT n.id = ANY(w.path) AND w.depth < 20
)
SELECT * FROM walk;

Another mistake is using UNION instead of UNION ALL. UNION deduplicates each step, which is usually slower and can hide logical bugs. Use UNION ALL and handle uniqueness explicitly if needed.

Performance can also collapse if the recursive step is not indexed. The join inside the recursion runs once per level, so missing indexes hurt repeatedly.

Practical Tips

Always include a depth column. Even if you do not need it in the output, it makes debugging dramatically easier and gives you a natural cap to prevent runaways.

Use EXPLAIN ANALYZE to inspect the plan. Look for the “Worktable Scan” and confirm the recursive join uses an index on the parent or edge key.

For very deep or wide traversals, consider materializing the closure (storing ancestor-descendant pairs) once, then querying it directly. CTEs are great, but a closure table can be unbeatable for read-heavy workloads.

Keep the CTE focused. If you need to compute multiple things, run the recursion to get the row set, then join to other tables in a second CTE or in the final SELECT.

Wrap-up

Recursive CTEs let SQL traverse hierarchies and graphs in a single, set-based query. Anchor plus recursive step plus stopping condition is the whole pattern. Watch out for cycles and missing indexes, add a depth cap as a safety net, and reach for closure tables when reads vastly outnumber writes. With these habits, recursive queries become a reliable everyday tool rather than a last resort.