SQL Recursive CTEs: A Practical Tutorial
Learn how recursive CTEs work in SQL, how to traverse hierarchies and graphs in pure SQL, and how to avoid the common termination and performance pitfalls.
What you'll learn
- ✓What a recursive CTE is and how it executes
- ✓How anchor and recursive parts combine
- ✓How to walk trees, graphs, and number sequences
- ✓How to avoid infinite recursion with cycle detection
- ✓When to use a CTE vs an application-side traversal
Prerequisites
- •Basic SQL familiarity
What and Why
Some data is naturally hierarchical: org charts, category trees, threaded comments, dependency graphs, file systems. Querying these in plain SQL is awkward without a loop, and writing the loop in the application means many round trips.
Recursive Common Table Expressions (CTEs) let SQL itself express the loop. You define a starting set, then a rule for extending it. The database keeps applying the rule until no new rows are produced. The result is a clean, set-based way to walk arbitrarily deep structures in a single query.
Mental Model
A recursive CTE has two parts joined by UNION ALL. The anchor query produces the initial rows, for example the root of the tree. The recursive query references the CTE name itself and produces the next level, given the previous level. The engine repeats the recursive step using the most recently produced rows as input, accumulating results until the step returns zero rows.
Conceptually, think of it as a queue. Start the queue with anchor rows. Pop the current frontier, apply the recursive step, push the new frontier. Stop when the frontier is empty. The accumulated rows are the answer.
This model handles trees naturally and, with cycle detection, also general graphs.
Hands-on Example
Suppose an employees table has id, name, and manager_id. Listing everyone under a manager is a single recursive CTE.
WITH RECURSIVE org AS (
-- anchor: the chosen manager
SELECT id, name, manager_id, 1 AS depth
FROM employees
WHERE id = 42
UNION ALL
-- recursive: direct reports of the previous level
SELECT e.id, e.name, e.manager_id, o.depth + 1
FROM employees e
JOIN org o ON e.manager_id = o.id
)
SELECT * FROM org ORDER BY depth, name;
anchor: [ #42 Alice ] depth 1
step 1 -> direct reports of Alice
[ #51 Bob, #52 Carol ] depth 2
step 2 -> direct reports of {Bob, Carol}
[ #61 Dan, #62 Eve, #63 Faye ] depth 3
step 3 -> direct reports of {Dan, Eve, Faye}
[ ] STOP
final result = union of all levels
The same shape works for category trees, parent-child threads, and any other adjacency-list structure.
Common Pitfalls
The most dangerous pitfall is infinite recursion. If your data has cycles, or your join condition is wrong, the CTE will keep producing rows forever (or until memory runs out). Defend against this by tracking visited nodes in an array column and filtering them out, or by limiting depth.
WITH RECURSIVE walk AS (
SELECT id, ARRAY[id] AS path, 1 AS depth FROM nodes WHERE id = 1
UNION ALL
SELECT n.id, w.path || n.id, w.depth + 1
FROM nodes n JOIN edges e ON e.dst = n.id
JOIN walk w ON e.src = w.id
WHERE NOT n.id = ANY(w.path) AND w.depth < 20
)
SELECT * FROM walk;
Another mistake is using UNION instead of UNION ALL. UNION deduplicates each step, which is usually slower and can hide logical bugs. Use UNION ALL and handle uniqueness explicitly if needed.
Performance can also collapse if the recursive step is not indexed. The join inside the recursion runs once per level, so missing indexes hurt repeatedly.
Practical Tips
Always include a depth column. Even if you do not need it in the output, it makes debugging dramatically easier and gives you a natural cap to prevent runaways.
Use EXPLAIN ANALYZE to inspect the plan. Look for the “Worktable Scan” and confirm the recursive join uses an index on the parent or edge key.
For very deep or wide traversals, consider materializing the closure (storing ancestor-descendant pairs) once, then querying it directly. CTEs are great, but a closure table can be unbeatable for read-heavy workloads.
Keep the CTE focused. If you need to compute multiple things, run the recursion to get the row set, then join to other tables in a second CTE or in the final SELECT.
Wrap-up
Recursive CTEs let SQL traverse hierarchies and graphs in a single, set-based query. Anchor plus recursive step plus stopping condition is the whole pattern. Watch out for cycles and missing indexes, add a depth cap as a safety net, and reach for closure tables when reads vastly outnumber writes. With these habits, recursive queries become a reliable everyday tool rather than a last resort.
Related articles
- SQL CTE vs Subquery vs Temp Table: Picking the Right SQL Tool
Compare common table expressions, subqueries, and temporary tables in SQL. Learn when each shines, the performance trade-offs, and concrete examples.
- SQL SQL Aggregate Functions and GROUP BY in Depth
How aggregate functions interact with GROUP BY, HAVING, and window functions, with practical patterns and pitfalls every backend engineer should know.
- SQL SQL EXPLAIN ANALYZE Deep Dive
How to read EXPLAIN ANALYZE output, interpret cost estimates, spot bad plans, and use it to drive real performance improvements in production databases.
- SQL SQL EXPLAIN and Query Planning Made Approachable
Learn how to read EXPLAIN output, what scan and join types mean, and how to spot the indexes and rewrites that make slow queries fast.