Skip to content
C Codeloom
SQL

SQL NULL Handling Best Practices

Learn how NULL behaves in SQL, why three-valued logic trips up queries, and the patterns that keep your data consistent and your queries correct.

·4 min read · By Codeloom
Intermediate 8 min read

What you'll learn

  • What NULL really means in SQL
  • How three-valued logic affects WHERE clauses
  • When to use COALESCE, NULLIF, and IS DISTINCT FROM
  • Indexing and aggregation behavior around NULLs
  • Schema design patterns to minimize NULL surprises

Prerequisites

  • Familiar with HTTP and databases

What and Why

NULL is SQL’s way of saying “unknown” or “not applicable”. It is not zero, not an empty string, and not false. Because of that, it follows three-valued logic: a comparison can be TRUE, FALSE, or UNKNOWN. Most beginners assume NULL behaves like a default value, and that assumption is the root of many silent bugs: rows missing from reports, joins dropping unexpectedly, and aggregations returning surprising totals.

Understanding NULL is critical because almost every nullable column you add introduces branching logic in your queries. Done poorly, this leads to inconsistent business rules; done well, it lets your schema model genuine uncertainty.

Mental Model

Think of NULL as a sealed envelope. You know an envelope exists, but you cannot see inside. Two sealed envelopes are not equal, because you do not know what is in either of them. That is why NULL = NULL returns UNKNOWN, not TRUE. Filters in WHERE only keep rows where the predicate is TRUE, so any row producing UNKNOWN is dropped.

Aggregations follow a different rule: most functions like SUM, AVG, and COUNT(column) skip NULLs entirely. COUNT(*) counts rows regardless. This asymmetry is the source of countless reporting discrepancies.

Hands-on Example

Imagine a users table where deleted_at is NULL for active users and a timestamp for deleted ones. A naive query like WHERE deleted_at != '2026-01-01' silently excludes all active users because NULL comparisons return UNKNOWN.

  deleted_at  -->  predicate  -->  result   keeps row?
----------       ---------       -------   ----------
2026-01-01       != date         FALSE     no
2025-12-01       != date         TRUE      yes
NULL             != date         UNKNOWN   no  <-- bug
Three-valued logic flow for a WHERE clause

The fix is explicit:

SELECT *
FROM users
WHERE deleted_at IS NULL
   OR deleted_at <> '2026-01-01';

Or use IS DISTINCT FROM, which treats NULL as a comparable value:

SELECT *
FROM users
WHERE deleted_at IS DISTINCT FROM '2026-01-01';

For default-style fallback values, COALESCE returns the first non-NULL argument:

SELECT id, COALESCE(nickname, full_name, 'Anonymous') AS display_name
FROM users;

And NULLIF(a, b) returns NULL when a = b, useful for avoiding division by zero:

SELECT total / NULLIF(count, 0) AS average FROM stats;

Common Pitfalls

  • Using = NULL instead of IS NULL. The first always returns UNKNOWN and matches nothing.
  • Assuming NOT IN (subquery) works when the subquery contains NULL. It does not: any UNKNOWN comparison kills the row. Prefer NOT EXISTS.
  • Forgetting that COUNT(column) ignores NULLs while COUNT(*) does not, leading to inflated or deflated metrics.
  • Indexing a nullable column and expecting it to speed up IS NULL lookups on every database. Some engines exclude NULLs from B-tree indexes by default.
  • Concatenating strings with NULL in databases like Oracle versus PostgreSQL produces different results, breaking portability.
  • Treating NULL as a sentinel for “no value yet”. Often a status column or separate table models intent more clearly.

Practical Tips

Decide NULL semantics at design time. For every nullable column, write down what NULL means in business terms. If you cannot articulate it, the column probably should be NOT NULL with a sensible default.

Prefer NOT NULL with explicit defaults for booleans, counts, and flags. Reserve NULL for cases like “the user has not yet provided an answer”.

When writing reports, normalize NULL early. Wrap nullable columns in COALESCE near the source so downstream logic does not have to remember the asymmetry.

Use database-specific helpers when they improve clarity: IS DISTINCT FROM, IFNULL, and NVL all exist for a reason. Keep their differences documented in your team’s style guide.

Add tests that explicitly include NULL rows. Most regressions involving NULL ship because the developer only tested with non-null fixtures.

Wrap-up

NULL is powerful but unforgiving. Treat it as genuine uncertainty, lean on COALESCE, NULLIF, and IS [NOT] DISTINCT FROM, and design schemas where NULL has a clear, documented meaning. Done well, NULL handling becomes invisible; done poorly, it becomes the source of your next production incident.