LeetCode Serialize and Deserialize Binary Tree
LeetCode 297 fully unpacked — the preorder-with-nulls encoding, the iterator-driven decoder, and why level-order is a worse choice than it looks.
What you'll learn
- ✓Why nulls must appear in the serialization
- ✓The clean preorder-with-sentinels encoding
- ✓How to deserialize using a shared iterator
- ✓Trade-offs vs level-order encodings
Prerequisites
- •Preorder traversal — see Binary Tree Traversals
- •Recursive thinking — see Recursion Fundamentals
LeetCode 297 — Serialize and Deserialize Binary Tree — is the classic systems-flavored tree problem. You are not asked to compute anything; you are asked to design an encoding that round-trips perfectly. Most candidates get the serialize right and stumble on the deserialize because they forget the one invariant that makes both halves trivial: preorder plus explicit nulls is enough.
The Problem
Implement two functions: serialize(root) returns a string representation of a binary tree, and deserialize(data) reconstructs the original tree from that string. The encoding format is your call. Trees may contain arbitrary integer values and arbitrary shapes, including empty.
Brute Force
A common first attempt: store only inorder. That fails — inorder alone does not uniquely determine a binary tree because you cannot tell where one subtree ends and the next begins without knowing the root.
A second attempt: store inorder plus preorder. That works but requires two passes and an index search to rebuild. Acceptable but clunky for a single function. Worth mentioning as a warm-up, then setting aside.
Optimal Solution
Use preorder with an explicit null sentinel. Every recursive call emits either a value or the sentinel before recursing into children. On the decode side, walk the tokens in order and consume them with the same recursion shape.
class Codec:
def serialize(self, root):
out = []
def dfs(node):
if not node:
out.append("#")
return
out.append(str(node.val))
dfs(node.left)
dfs(node.right)
dfs(root)
return ",".join(out)
def deserialize(self, data):
tokens = iter(data.split(","))
def build():
val = next(tokens)
if val == "#":
return None
node = TreeNode(int(val))
node.left = build()
node.right = build()
return node
return build()
The Python iter object is the key — passing the same iterator into every recursive call automatically advances the cursor. No index bookkeeping required.
Walk Through an Example
Take the tree:
1
/ \
2 3
/ \
4 5
Serialization runs preorder: 1, 2, #, #, 3, 4, #, #, 5, #, #. Joined: "1,2,#,#,3,4,#,#,5,#,#".
Deserialization:
- Read 1. Build node. Recurse left.
- Read 2. Build node. Recurse left.
- Read
#. Return None. Recurse right. - Read
#. Return None. Node 2 is done. - Back at node 1, recurse right. Read 3. Build. Recurse left.
- Read 4. Build. Two
#tokens make 4 a leaf. - Back at node 3, recurse right. Read 5. Two
#tokens make 5 a leaf.
The tree is reconstructed exactly.
Edge Cases
- Empty tree — serializes to
"#", deserializes to None. - A single node —
"7,#,#". - Negative values — make sure your tokenizer treats
-as part of the number, not a delimiter. - Very large trees — Python recursion limits matter. Bump
sys.setrecursionlimitor write the iterative version with an explicit stack. - Duplicates — totally fine. The encoding does not care.
Complexity Analysis
Both serialize and deserialize are O(n) time, where n is the number of nodes (including the implicit null leaves we emit). Space is O(n) for the output string and O(h) for recursion, where h is the tree height. Level-order encodings have the same asymptotics but pay more in string size for sparse trees because they fill every position up to the deepest level.
For a refresher on why “O(n) time, O(h) space” is the standard tree budget, see Big-O Notation Explained.
How to Explain It in an Interview
Lead with the invariant: “Preorder alone is ambiguous — you need to know where missing children are. The fix is a null sentinel.” Sketch the serializer. For the deserializer, draw attention to the shared iterator: it is the trick that turns a fiddly index problem into a clean recursion.
If asked about level-order, acknowledge it works but point out two costs: the BFS code is longer, and sparse trees waste bytes. If asked about JSON or other structured formats, note that they are fine but most interviewers want a flat tokenized format because it shows you understand the algorithm.
Related Problems
- LeetCode 449 — Serialize and Deserialize BST (lets you skip the nulls)
- LeetCode 105 — Construct Binary Tree from Preorder and Inorder
- LeetCode 428 — Serialize and Deserialize N-ary Tree
- LeetCode 652 — Find Duplicate Subtrees (uses serialization as a hash key)
The N-ary case is a fun extension — instead of two recursive calls per node, you emit the child count first.
Wrap up
Serialize and Deserialize Binary Tree rewards a small, sharp insight: preorder traversal plus a null token is a self-describing format. Internalize the iterator-driven decoder — it generalizes to almost every recursive-parsing problem you will see, from expression trees to nested JSON. The same recursive shape you use in tree traversals shows up here doing double duty.