Idempotency Keys: Making APIs Safe to Retry

Intermediate 9 min read

What you'll learn

✓ Why retries are unavoidable in distributed systems
✓ What an idempotency key is and how it works
✓ How to design the request and response storage
✓ How to handle concurrent retries safely
✓ Where to put idempotency in your stack

Prerequisites

• Basic REST: see [What is REST?](/blog/what-is-rest)

A client sends a POST to create a payment. The network drops the response. The client retries. Without protection, the server now has two payments. Idempotency keys are the standard fix.

The problem

Some HTTP methods are idempotent by definition. GET, PUT, and DELETE are specified so that repeating the same call leaves the server in the same state as making it once. POST makes no such promise. A POST that is retried twice can turn one order into three.

Network failures, gateway timeouts, mobile radios losing signal, server restarts, queue redeliveries: all of these turn one logical request into many physical ones. You cannot prevent the retries. You can make them safe.

What an idempotency key is

An idempotency key is a client-generated unique string sent on a request. The server remembers the result of the first request for that key and returns the same result for any future request with the same key.

The contract is:

Same key, same request body: the stored response is returned and the work does not run again.
Same key, different request body: an error.
New key: a new request, processed normally, with a new response.

The key usually travels in a request header, for example Idempotency-Key: 8a4b...uuid. Stripe popularized this pattern, and it is now common in payment, billing, and order APIs. See [What is REST?](/blog/what-is-rest) for the broader API context.
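The three rules of the contract can be sketched with an in-memory map. This is only an illustration: `handleOnce`, `fingerprint`, and `createPayment` are hypothetical names, the 409 status for a mismatched body is one possible choice, and a real server needs shared, durable storage rather than a `Map`.

```javascript
const crypto = require("node:crypto");

const store = new Map(); // key -> { hash, response }

// JSON.stringify is sensitive to key order; good enough for a demo.
function fingerprint(body) {
  return crypto.createHash("sha256").update(JSON.stringify(body)).digest("hex");
}

// `work` is a hypothetical function that performs the side effect.
function handleOnce(key, body, work) {
  const hash = fingerprint(body);
  const seen = store.get(key);
  if (seen) {
    if (seen.hash !== hash) {
      // Same key, different body: reject.
      return { status: 409, body: { error: "key reused with different body" } };
    }
    return seen.response; // same key, same body: replay the stored response
  }
  const response = work(body); // new key: run the work once and remember it
  store.set(key, { hash, response });
  return response;
}

let payments = 0;
const createPayment = (body) => ({
  status: 201,
  body: { id: ++payments, amount: body.amount },
});

const first = handleOnce("key-1", { amount: 100 }, createPayment);
const retry = handleOnce("key-1", { amount: 100 }, createPayment);
const mismatch = handleOnce("key-1", { amount: 999 }, createPayment);
const fresh = handleOnce("key-2", { amount: 100 }, createPayment);
```

Here `retry` is the very same response object as `first`, `mismatch` is an error, and only `first` and `fresh` actually create payments.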

A minimal implementation

The storage table holds the key, a fingerprint of the request, the response, and a status.

CREATE TABLE idempotency (
  key            text PRIMARY KEY,
  request_hash   text NOT NULL,
  status         text NOT NULL,  -- 'in_progress' | 'completed'
  response_code  int,
  response_body  jsonb,
  created_at     timestamptz NOT NULL DEFAULT now(),
  expires_at     timestamptz NOT NULL
);

The handler logic:

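async function handle(req) {
  const key = req.headers["idempotency-key"];
  if (!key) return badRequest("missing idempotency key");

  // Fingerprint the body so a reused key with a different payload is detected.
  const hash = sha256(canonicalize(req.body));

  // Claim the key atomically. ON CONFLICT DO NOTHING returns no row when the
  // key already exists, so two concurrent requests cannot both win the claim.
  const claimed = await db.oneOrNone(
    `INSERT INTO idempotency(key, request_hash, status, expires_at)
     VALUES ($1, $2, 'in_progress', now() + interval '24 hours')
     ON CONFLICT (key) DO NOTHING
     RETURNING key`,
    [key, hash]
  );

  if (!claimed) {
    const existing = await db.oneOrNone(
      "SELECT * FROM idempotency WHERE key = $1",
      [key]
    );
    // The row can vanish between the two statements if the first attempt
    // failed and released the key. Treat that like "in progress".
    if (!existing) return reply(409, { error: "request in progress" });
    if (existing.request_hash !== hash) {
      return conflict("idempotency key reused with different body");
    }
    if (existing.status === "completed") {
      return reply(existing.response_code, existing.response_body);
    }
    return reply(409, { error: "request in progress" });
  }

  let result;
  try {
    result = await doTheWork(req.body);
  } catch (err) {
    // Release the key so a retry can run the work instead of seeing
    // 'in_progress' until the row expires.
    await db.none("DELETE FROM idempotency WHERE key = $1", [key]);
    throw err;
  }

  if (result.status >= 500) {
    // Do not cache server errors: the work may not have completed.
    await db.none("DELETE FROM idempotency WHERE key = $1", [key]);
    return reply(result.status, result.body);
  }

  await db.none(
    `UPDATE idempotency
     SET status='completed', response_code=$2, response_body=$3
     WHERE key=$1`,
    [key, result.status, result.body]
  );

  return reply(result.status, result.body);
}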

Three things matter here: claim the key before doing the work, hash the body so that key reuse with a different payload is caught, and give every row an expiry.
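The handler calls `canonicalize` and `sha256` without defining them. One possible sketch, assuming the body has already been parsed from JSON, sorts object keys recursively so that two bodies differing only in key order produce the same fingerprint. For stricter guarantees across languages, a standardized scheme such as RFC 8785 (JSON Canonicalization Scheme) may be a better fit.

```javascript
const crypto = require("node:crypto");

// Serialize a JSON-compatible value with object keys in sorted order, so
// {a:1,b:2} and {b:2,a:1} produce the same string. Arrays keep their order.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const parts = Object.keys(value)
      .sort()
      .map((k) => JSON.stringify(k) + ":" + canonicalize(value[k]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}

// Hex-encoded SHA-256 of a string.
function sha256(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}
```

Hashing the canonical form rather than the raw bytes means a client that reorders fields between attempts is not rejected for "changing" the body.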

Concurrency

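What if two retries arrive at the same time? The primary key serializes them: only one INSERT for a given key can succeed. The handler has to treat the losing insert as "someone else holds this key", either by catching the unique violation or by using ON CONFLICT DO NOTHING, rather than letting it surface as a 500. The loser then reads the row, sees in_progress, and returns 409. The client retries after backoff, and by then the first request has usually completed and the response is cached.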

If you cannot rely on a primary-key insert, use an advisory lock per key. Do not rely on application-level mutexes. They only coordinate threads inside one process, not multiple processes or hosts.
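A per-key advisory lock could look roughly like this in PostgreSQL. The sketch assumes a pg-promise-style `db` object (`db.tx` and `t.one`), and `lockIdFor` and `withKeyLock` are hypothetical helper names. `pg_advisory_xact_lock` takes a 64-bit integer, so the key is hashed down to one first.

```javascript
const crypto = require("node:crypto");

// Map an idempotency key to a signed 64-bit integer, the argument type of
// pg_advisory_xact_lock(bigint). A collision between two different keys only
// causes extra waiting, not incorrect results.
function lockIdFor(key) {
  const digest = crypto.createHash("sha256").update(key, "utf8").digest();
  return digest.readBigInt64BE(0);
}

// Run `fn` inside a transaction while holding the advisory lock for `key`.
// Assumes `db.tx(callback)` opens a transaction and passes a transaction
// object with `one(sql, params)`; adapt this to your driver.
async function withKeyLock(db, key, fn) {
  return db.tx(async (t) => {
    // Blocks until no other transaction holds the lock for this key.
    // The lock is released automatically at commit or rollback.
    await t.one("SELECT pg_advisory_xact_lock($1::bigint)", [
      lockIdFor(key).toString(),
    ]);
    return fn(t);
  });
}
```

Because the lock is transaction-scoped, a crashed request cannot leave it held; the database releases it when the connection's transaction ends.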

What to do inside the work

The cached response is only safe if the work itself is correct. Two patterns:

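Wrap the real operation and the idempotency row write in one database transaction, so the work and its record commit or roll back together. A retry then finds the committed row and returns the cached response without repeating the work. This only covers work that lives in the same database.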
Use the key as a deduplication token inside downstream systems too. Pass it to the payment gateway and to your message bus so partial failures do not duplicate side effects.

If the work creates an external side effect (charge a card, send an email), make sure the downstream call is also idempotent or guarded by the same key.
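One way to carry the key downstream is to derive a stable sub-key per side effect, so each step dedupes on its own even if the request is replayed after a partial failure. Everything in this sketch is hypothetical: `FakeGateway` stands in for a payment provider that accepts an idempotency key, and the `outbox` map stands in for a keyed message table.

```javascript
// In-memory stand-in for a gateway that dedupes on an idempotency key.
class FakeGateway {
  constructor() {
    this.charges = new Map();
  }
  charge(idempotencyKey, amount) {
    if (!this.charges.has(idempotencyKey)) {
      this.charges.set(idempotencyKey, {
        chargeId: this.charges.size + 1,
        amount,
      });
    }
    return this.charges.get(idempotencyKey);
  }
}

// One stable sub-key per side effect, derived from the request's key.
function stepKey(requestKey, step) {
  return `${requestKey}:${step}`;
}

function placeOrder(gateway, outbox, requestKey, order) {
  const charge = gateway.charge(stepKey(requestKey, "charge"), order.amount);
  // The outbox is keyed too, so a replay does not enqueue a second email.
  const emailKey = stepKey(requestKey, "receipt-email");
  if (!outbox.has(emailKey)) {
    outbox.set(emailKey, { to: order.email, chargeId: charge.chargeId });
  }
  return charge;
}

const gateway = new FakeGateway();
const outbox = new Map();
const order = { amount: 100, email: "a@example.com" };
const attempt1 = placeOrder(gateway, outbox, "key-1", order);
const attempt2 = placeOrder(gateway, outbox, "key-1", order); // replay
```

After both attempts there is still one charge and one queued email, which is the property the downstream systems need to provide.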

Storage and expiry

Keys should live long enough to cover realistic retry windows. 24 hours is a reasonable default for synchronous APIs. For background jobs, 7 days is common. Expire and clean up. Idempotency tables grow fast.

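For high throughput, move the store into Redis and let a TTL handle expiry. Keep the response there, and fall back to the database when Redis is cold. Watch the eviction policy: under memory pressure, any policy other than noeviction can drop keys that are still in flight. If you must allow eviction, volatile-ttl evicts the keys closest to expiry first, which are the ones least likely to still be retried.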
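With Redis, the claim step maps naturally onto `SET` with `NX` (only set if absent) and `EX` (expiry in seconds). The sketch below assumes a connected node-redis v4 client, where `set(key, value, { NX: true, EX: n })` resolves to `"OK"` or `null`. The `idem:` prefix and the function names are illustrative choices.

```javascript
const TTL_SECONDS = 24 * 60 * 60;

// Try to claim the key. Resolves to true for the first caller only.
async function claimKey(redis, key, requestHash) {
  const record = JSON.stringify({ status: "in_progress", requestHash });
  const res = await redis.set(`idem:${key}`, record, {
    NX: true, // only set if the key does not exist yet
    EX: TTL_SECONDS,
  });
  return res === "OK";
}

// Store the final response. XX only overwrites an existing claim, so an
// evicted or expired key is not silently recreated.
async function completeKey(redis, key, requestHash, response) {
  const record = JSON.stringify({ status: "completed", requestHash, response });
  await redis.set(`idem:${key}`, record, { XX: true, EX: TTL_SECONDS });
}

// Read back the stored record, or null if the key is unknown.
async function readKey(redis, key) {
  const raw = await redis.get(`idem:${key}`);
  return raw === null ? null : JSON.parse(raw);
}
```

A caller would run `claimKey` first, do the work only when it returns true, and otherwise use `readKey` to decide between replaying the response, rejecting a mismatched hash, or answering "in progress".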

Where to put it in the stack

You have three reasonable places:

At the HTTP layer, in a middleware. Easy to add, blind to business semantics.
In the service handler. Knows the request, can hash precisely, can include the cache write in the same transaction.
At the message bus, for asynchronous flows. The consumer dedupes by key.

Pick one primary layer per flow. Two layers of idempotency are workable; three is a debugging nightmare.

Client responsibilities

Clients must:

Generate a UUID per logical operation, not per HTTP attempt.
Reuse the same key across retries until they give up.
Generate a new key for a genuinely new operation, even if the body looks the same.

Mobile and web SDKs should hide this from product code. Use a small helper that owns the key and the retry loop.
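Such a helper might look like the sketch below. `withIdempotentRetry` is a hypothetical name, and `send` is a caller-supplied function that performs one HTTP attempt with the given headers and resolves to an object with a numeric `status`. The retry policy (409 and 5xx are retried, with exponential backoff) is an assumption, not a rule from any particular API.

```javascript
const crypto = require("node:crypto");

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withIdempotentRetry(
  send,
  { maxAttempts = 4, baseDelayMs = 200 } = {}
) {
  // One key per logical operation, reused for every attempt below.
  const key = crypto.randomUUID();
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const res = await send({ "Idempotency-Key": key });
      // 409 (still in progress) and 5xx are retried; anything else is final.
      if (res.status !== 409 && res.status < 500) return res;
      lastError = new Error(`retryable status ${res.status}`);
    } catch (err) {
      // Network failure: the outcome is unknown, so retry with the same key.
      lastError = err;
    }
    if (attempt < maxAttempts) {
      await sleep(baseDelayMs * 2 ** (attempt - 1)); // exponential backoff
    }
  }
  throw lastError;
}

// Usage with fetch against a placeholder URL:
//   withIdempotentRetry((headers) =>
//     fetch("https://api.example.com/payments", {
//       method: "POST",
//       headers: { ...headers, "Content-Type": "application/json" },
//       body: JSON.stringify({ amount: 100 }),
//     })
//   );
```

Product code calls the helper once per user action and never sees the key, which makes it hard to accidentally generate a new key per attempt.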

What not to do

Do not derive the key from the request body alone. Two legitimate identical requests become one.
Do not store only the key without the body hash. A reused key on a different body silently returns the wrong response.
Do not skip idempotency on PUT just because it is idempotent in theory. If your PUT triggers side effects like notifications, retries can still duplicate them.
Do not cache 5xx responses. The work may not have completed. Cache success and explicit client errors only.

Wrap up

Idempotency keys cost a small table and a hash, and they remove an entire class of production incidents. Make them required on every state-changing endpoint, hash the body, store the response, and let your clients retry with confidence. Reliability in distributed systems is mostly about giving callers a safe way to try again.