SQL Transactions and Isolation Levels Explained

Intermediate 10 min read

What you'll learn

✓How BEGIN, COMMIT, and ROLLBACK group statements into atomic units
✓What the ACID properties actually guarantee
✓The four standard isolation levels and what each costs
✓Dirty reads, non-repeatable reads, and phantom reads in practice
✓Postgres defaults and when to deviate from them

Prerequisites

•Comfort writing queries: [SQL SELECT basics](/blog/sql-select-basics)
•Familiarity with multi-table queries: [SQL joins](/blog/sql-joins)

A transaction is a sequence of statements that the database treats as a single, indivisible unit. Either every statement takes effect, or none of them do. Isolation levels then decide how transactions running at the same time see each other. Get these two ideas right and you avoid an entire class of bugs that are nearly impossible to reproduce after the fact.

BEGIN, COMMIT, and ROLLBACK

The basic shape of a transaction is the same everywhere.

BEGIN;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

COMMIT;

If anything between BEGIN and COMMIT fails, you call ROLLBACK instead and the database undoes every change made inside the transaction. Without a transaction, a crash between the two UPDATE statements leaves account 1 short by 100 and account 2 unchanged. With a transaction, the database guarantees that does not happen.

Most client libraries wrap this for you, but it is worth knowing the raw commands so you can drop into psql and debug without a framework in the way.

ACID

ACID is the shorthand for what a transactional database promises.

Atomicity. All statements in a transaction succeed together or fail together. There is no partial state visible after a crash.
Consistency. The database moves from one valid state to another. Constraints like foreign keys and CHECK clauses are enforced at commit time.
Isolation. Concurrent transactions do not corrupt each other. The isolation level decides exactly how strict this is.
Durability. Once a transaction commits, the change survives crashes, power loss, and process restarts.

Atomicity and durability are usually free as long as you actually use transactions. Isolation is where engineering tradeoffs live.

What can go wrong without isolation

Three classic anomalies show up when transactions are not isolated enough.

A dirty read happens when one transaction reads data that another transaction has written but not yet committed. If the writer rolls back, the reader saw a value that never officially existed.

A non-repeatable read happens when a transaction reads the same row twice and sees a different value the second time, because another transaction committed an update in between.

A phantom read happens when a transaction runs the same query twice and the second run returns rows that did not exist on the first run, because another transaction inserted them.

The four standard isolation levels

The SQL standard defines four levels. Each is stricter than the previous, preventing more anomalies at the cost of more contention.

Level	Dirty read	Non-repeatable read	Phantom read
Read uncommitted	possible	possible	possible
Read committed	prevented	possible	possible
Repeatable read	prevented	prevented	possible
Serializable	prevented	prevented	prevented

You set the level per transaction.

BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- statements
COMMIT;

You can also set it as a session default.

SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Postgres defaults

Postgres defaults to read committed. Each statement sees a fresh snapshot of committed data. Two reads in the same transaction can see different values if another transaction commits in between. This is the right default for most application code: it is fast and prevents the worst anomaly (dirty reads) while imposing minimal locking.

Postgres also implements repeatable read using snapshot isolation. The whole transaction sees a single snapshot taken at the first statement, which also prevents phantoms in practice for most workloads. The standard allows phantoms at this level, but Postgres goes further.

Postgres serializable is implemented with serializable snapshot isolation. The database tracks read and write dependencies between concurrent transactions and aborts any transaction that would produce a non-serializable schedule. You see the abort as a serialization_failure error and retry. This is the strongest guarantee you can ask for and the most likely to surface as application-visible retries.

A concrete example

Consider a counter that two processes both want to increment.

-- session A
BEGIN;
SELECT value FROM counters WHERE name = 'orders'; -- returns 42
-- session B does the same thing, also reads 42
UPDATE counters SET value = 43 WHERE name = 'orders';
COMMIT;
-- session B then writes 43 as well, both increments collapse into one

This is a lost update. Under read committed, two clients can both read 42 and both write 43. Three solutions, ordered from least to most invasive.

Do the math in SQL. Replace SELECT then UPDATE with UPDATE counters SET value = value + 1 WHERE name = 'orders'. The row is locked for the update and no value is lost.
Take an explicit row lock. SELECT value FROM counters WHERE name = 'orders' FOR UPDATE blocks the second reader until the first transaction commits.
Use serializable isolation. The database will abort one of the two transactions, and your application retries.

The first option is almost always the right answer. Reach for higher isolation when the logic is more complex than a single arithmetic update.

Locking briefly

Even at read committed, writes acquire row locks. A second transaction trying to update the same row waits until the first either commits or rolls back. This is usually invisible until you have a long-running transaction holding locks while others queue behind it. The fix is to keep transactions short. Avoid network calls, file I/O, and user input between BEGIN and COMMIT. Compute what you need first, then open the transaction, write quickly, and commit.

Indexes interact with locking. A query that scans the whole table can hold many more locks than a query that uses an index. See SQL indexes and performance for how to make hot writes cheap.

Savepoints

Inside a transaction you can mark a savepoint, then roll back to it without aborting the whole transaction.

BEGIN;
INSERT INTO orders (customer_id, total) VALUES (1, 100);
SAVEPOINT before_items;

INSERT INTO order_items (order_id, sku) VALUES (currval('orders_id_seq'), 'X');
-- something fails
ROLLBACK TO SAVEPOINT before_items;

INSERT INTO order_items (order_id, sku) VALUES (currval('orders_id_seq'), 'Y');
COMMIT;

Savepoints are useful inside ORMs that batch many statements per request.

Practical advice

Wrap multi-statement writes in a transaction by default. The cost is negligible and the bug class it prevents is severe.
Stick with read committed unless you have evidence you need more.
When you do need more, prefer SELECT ... FOR UPDATE over jumping to serializable. The cost is more predictable.
If you go serializable, your application must catch the serialization_failure error and retry. Wrap your transaction runner in a loop with a small backoff.
Keep transactions short. Long transactions are the leading cause of mysterious lock waits and deadlocks.

Wrap up

Transactions and isolation levels are a small surface with deep consequences. Use BEGIN and COMMIT for every write that touches more than one row. Understand the three anomalies and the four levels well enough to pick the right one. Lean on Postgres defaults until you have a real reason to change them, and when you do change them, change them deliberately with an eye on contention and retries. The reward is software that behaves the same under load as it does in your tests.