Asynchronous Processing & Messaging

Idempotency & Deduplication

In any distributed messaging system, the same message can arrive more than once. A producer retries after a timeout; a broker crashes mid-acknowledgement; a network partition causes a delivery to be confirmed on one side but not the other. These are not edge cases — they are everyday realities at scale. The solution is to design every consumer to be idempotent and to use deduplication wherever idempotency alone is not enough.

What Idempotency Means

An operation is idempotent if performing it multiple times produces exactly the same result as performing it once. The term comes from mathematics: applying the same function repeatedly leaves the state unchanged after the first application.

Concrete examples:

  • Idempotent: Setting a user's email to alice@example.com. Running the UPDATE ten times leaves exactly the same row.
  • Not idempotent: Debiting $10 from a wallet. Running that operation ten times charges $100.
  • Idempotent: Marking an order as SHIPPED. Transitioning from SHIPPED to SHIPPED is a no-op.
  • Not idempotent: Incrementing a view counter. Each duplicate message inflates the count.

Key principle: Never assume a message arrives exactly once. Design your consumers so that a duplicate message is harmless. At-least-once delivery is the strongest guarantee most brokers offer in practice; exactly-once is expensive and often illusory at the application level.
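
A minimal sketch of the difference, using a hypothetical in-memory user record and wallet (the function names set_email and debit are illustrative only). Delivering each message twice leaves the idempotent update unchanged but doubles the debit:

```python
def set_email(user: dict, email: str) -> None:
    """Idempotent: the final state depends only on the argument."""
    user["email"] = email


def debit(wallet: dict, amount: int) -> None:
    """Not idempotent: every call changes the balance again."""
    wallet["balance"] -= amount


user = {"email": "old@example.com"}
wallet = {"balance": 100}

# Simulate the broker delivering each message twice.
for _ in range(2):
    set_email(user, "alice@example.com")
    debit(wallet, 10)

print(user["email"])      # alice@example.com
print(wallet["balance"])  # 80, not the intended 90
```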

Idempotency Keys

The standard pattern is to attach a stable, unique idempotency key to every message at the point of creation — typically a UUID generated by the producer. The consumer records which keys it has already processed. On receiving a message, it checks: have I seen this key before? If yes, it acknowledges and discards without re-executing the side effect.

The key must be:

  • Globally unique: a UUID v4, or a domain-scoped composite key such as order_id:event_type. Do not include a retry-attempt counter in the key, since that would change it on every retry.
  • Stable across retries: the producer must send the same key on every retry of the same logical operation, not generate a fresh UUID each time.
  • Stored durably: in Redis with a TTL (with persistence enabled if keys must survive a Redis restart), or in a database table, so the check survives consumer restarts.

Figure: Idempotency key deduplication flow. The producer attaches a key (for example "uuid-7f3a") and may send the message more than once; the broker delivers it at least once; the consumer looks up the key in the dedup store (Redis or a database table) before executing the side effect. If the key has been seen, it acks and skips; if it is new, it processes the message and saves the key. The key must be stable across retries: generated once by the producer, not per attempt.
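
The flow above can be sketched as follows. This is an illustration under stated assumptions, not a production consumer: new_message, send_with_retries, and Consumer are hypothetical names, the "broker" is a plain function call, and the seen-set stands in for Redis or a database table. The lookup here is not atomic; the Atomicity section covers why that matters under concurrency.

```python
import uuid


def new_message(payload):
    """Build a message; the key is generated exactly once, here."""
    return {"key": str(uuid.uuid4()), "payload": payload}


def send_with_retries(message, deliver, attempts=3):
    """Resend the SAME message, so every attempt carries the same key."""
    for _ in range(attempts):
        deliver(message)


class Consumer:
    def __init__(self):
        self.seen = set()      # stand-in for the durable dedup store
        self.processed = []    # records each executed side effect

    def handle(self, message):
        if message["key"] in self.seen:
            return "skipped"   # duplicate: ack without re-executing
        self.processed.append(message["payload"])  # the side effect
        self.seen.add(message["key"])
        return "processed"


consumer = Consumer()
results = []
msg = new_message({"order_id": 42, "event": "created"})
send_with_retries(msg, lambda m: results.append(consumer.handle(m)))
print(results)  # ['processed', 'skipped', 'skipped']
```

Had send_with_retries called new_message on each attempt, every retry would carry a fresh key and the consumer would process all three.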

Where to Store Processed Keys

The dedup store is a critical component. Common choices:

  • Redis SET NX with TTL: fast (sub-millisecond) and suited to keys that can expire (e.g., after 24 h). Use SET key 1 EX 86400 NX: Redis replies OK on the first write (process) and nil on a duplicate (skip). This is a common choice in high-throughput systems.
  • Database unique constraint: a table with a UNIQUE index on the idempotency key. INSERT IGNORE (MySQL) or INSERT ... ON CONFLICT DO NOTHING (PostgreSQL, SQLite) atomically skips a key that already exists. Slightly slower but durable and transactional: you can update the business row and insert the key in one transaction.
  • Broker-level deduplication: some brokers deduplicate at the broker itself. AWS SQS FIFO accepts a MessageDeduplicationId and does not deliver duplicates sent within a 5-minute window; RabbitMQ can do something similar with a deduplication plugin. This offloads the logic but does not protect against application-level replays outside that window.

Practical tip: Combine two layers. Use broker-level dedup to catch fast retries (within seconds), and a database unique constraint to catch delayed replays that arrive after the broker window closes. The database constraint is your final safety net.
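
As a rough sketch of the database option, the example below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and the debit_once helper are assumptions made for illustration; INSERT OR IGNORE is SQLite's counterpart to the statements named above. The key insert and the business update share one transaction, so either both happen or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processed_keys (idem_key TEXT PRIMARY KEY);
    CREATE TABLE wallets (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    INSERT INTO wallets (id, balance) VALUES (1, 100);
""")


def debit_once(conn, idem_key, wallet_id, amount):
    """Record the key and apply the debit in one transaction.

    Returns True if the debit was applied, False if the key was a duplicate.
    """
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_keys (idem_key) VALUES (?)",
            (idem_key,),
        )
        if cur.rowcount == 0:  # key already present: duplicate delivery
            return False
        conn.execute(
            "UPDATE wallets SET balance = balance - ? WHERE id = ?",
            (amount, wallet_id),
        )
        return True


first = debit_once(conn, "uuid-7f3a", 1, 10)
second = debit_once(conn, "uuid-7f3a", 1, 10)  # redelivery of the same message
balance = conn.execute("SELECT balance FROM wallets WHERE id = 1").fetchone()[0]
print(first, second, balance)  # True False 90
```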

Atomicity: The Double-Spend Problem

A subtle but critical correctness issue arises if you check the dedup store and execute the business logic as two separate steps. Between those two steps another thread could process the same message. The solution is an atomic check-and-set:

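  • With Redis: SET key 1 EX 86400 NX is a single atomic command, so two consumers cannot both claim the same key. The side effect still runs as a separate step, so delete the key if processing fails; otherwise the redelivered message will be skipped.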
  • With a relational database: wrap the key insert and the business operation in a single transaction. The unique constraint will cause the second transaction to fail and roll back cleanly.

Avoid "check then act": Never do SELECT → decide → INSERT as separate steps with nothing atomic guarding them (a unique constraint inside a transaction, or a primitive such as SET NX). Under concurrent load, two consumers can both pass the SELECT check and both proceed to execute, producing a duplicate. This is a classic TOCTOU (time-of-check to time-of-use) race.
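
To make the atomic check-and-set concrete, here is a small in-process sketch. KeyStore is a hypothetical stand-in for a store with set-if-absent semantics (such as Redis SET ... NX); it uses a lock so that the check and the write cannot interleave. Eight threads receive the same message, and only one executes the side effect.

```python
import threading


class KeyStore:
    """In-process stand-in for an atomic set-if-absent operation."""

    def __init__(self):
        self._keys = set()
        self._lock = threading.Lock()

    def set_if_absent(self, key):
        # Check and write happen under one lock, so they cannot interleave.
        with self._lock:
            if key in self._keys:
                return False
            self._keys.add(key)
            return True


store = KeyStore()
charges = []  # list.append is thread-safe in CPython


def consume(message):
    if store.set_if_absent(message["key"]):
        charges.append(message["amount"])  # the side effect


message = {"key": "k-001", "amount": 10}
workers = [threading.Thread(target=consume, args=(message,)) for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(charges)  # [10]
```

If set_if_absent were split into a separate "contains" call followed by an "add" call without the lock, two threads could both see the key as missing and both charge.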

Making Non-Idempotent Operations Safe

When you cannot rewrite the underlying operation to be naturally idempotent (e.g., charging a payment gateway), the pattern is to fence the operation with a status machine:

  1. Before calling the external service, write a record to the database with status PENDING and the idempotency key, inside a transaction.
  2. If that write fails with a unique-key violation, another worker has started or completed the operation — stop and return.
  3. Call the external service, then update the record to COMPLETE with the result.
  4. On any crash between steps 1 and 3, a recovery job sees the PENDING row and retries, passing the same idempotency key to the external API so the payment gateway deduplicates on its side (Stripe and PayPal, among others, support idempotency keys).

Figure: Payment idempotency fence with a status machine. A charge event with key "k-001" inserts a database row (key=k-001, status=PENDING); a conflict on insert means the event is a duplicate, so the worker stops. Otherwise the worker calls the payment gateway (POST /charge with the header Idempotency-Key: k-001) and then updates the row to status=COMPLETE. If a crash occurs between the insert and the update, a recovery job retries with the same key and the gateway deduplicates.
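
The four steps can be sketched with sqlite3 and a fake gateway. Everything named here is hypothetical: FakeGateway only imitates a gateway that deduplicates by key, and start_payment, finish_payment, and recover are illustrative helpers, not any real payment API. The example simulates a crash after the gateway was charged but before the row was marked COMPLETE.

```python
import sqlite3


class FakeGateway:
    """Hypothetical gateway that deduplicates by idempotency key."""

    def __init__(self):
        self.charges = {}

    def charge(self, idem_key, amount):
        # A repeated key returns the original charge instead of a new one.
        if idem_key not in self.charges:
            charge_id = f"ch_{len(self.charges) + 1}"
            self.charges[idem_key] = {"id": charge_id, "amount": amount}
        return self.charges[idem_key]["id"]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (idem_key TEXT PRIMARY KEY, amount INTEGER NOT NULL,"
    " status TEXT NOT NULL, charge_id TEXT)"
)


def start_payment(conn, idem_key, amount):
    """Steps 1-2: claim the key with a PENDING row; False means a duplicate."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO payments (idem_key, amount, status)"
                " VALUES (?, ?, 'PENDING')",
                (idem_key, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def finish_payment(conn, gateway, idem_key, amount):
    """Step 3: call the gateway with the same key, then mark COMPLETE."""
    charge_id = gateway.charge(idem_key, amount)
    with conn:
        conn.execute(
            "UPDATE payments SET status = 'COMPLETE', charge_id = ?"
            " WHERE idem_key = ?",
            (charge_id, idem_key),
        )


def recover(conn, gateway):
    """Step 4: retry every PENDING row, reusing its original key."""
    rows = conn.execute(
        "SELECT idem_key, amount FROM payments WHERE status = 'PENDING'"
    ).fetchall()
    for idem_key, amount in rows:
        finish_payment(conn, gateway, idem_key, amount)


gateway = FakeGateway()
claimed = start_payment(conn, "k-001", 10)    # first worker claims the key
gateway.charge("k-001", 10)                   # gateway charged, then the worker "crashes"
duplicate = start_payment(conn, "k-001", 10)  # redelivery is fenced off
recover(conn, gateway)                        # recovery retries with the same key
status = conn.execute("SELECT status FROM payments").fetchone()[0]
print(claimed, duplicate, status, len(gateway.charges))  # True False COMPLETE 1
```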

TTL and Key Expiry

Dedup keys do not need to live forever. Choose a TTL that covers your retry window with a comfortable margin. If your retry policy gives up after 1 hour with exponential back-off, a 24-hour TTL is safe. For payment operations where a customer might dispute a charge days later, keep keys for 7–30 days. When keys expire, you accept that a message replayed after the window will be processed again — make sure that is acceptable for your use case, or use permanent database records for high-stakes operations.
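
The trade-off can be illustrated with a small in-memory store whose keys expire. TtlKeyStore and its injectable clock are assumptions made for this sketch (it is single-threaded and not a substitute for Redis); the fake clock lets the example move time forward deterministically. With a 24-hour TTL, a retry one hour later is rejected, but a replay after the key has expired is processed again.

```python
import time


class TtlKeyStore:
    """In-memory stand-in for a dedup store whose keys expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._expires_at = {}

    def set_if_absent(self, key):
        now = self.clock()
        expiry = self._expires_at.get(key)
        if expiry is not None and expiry > now:
            return False  # still inside the dedup window
        self._expires_at[key] = now + self.ttl
        return True


# A fake clock lets the example jump forward in time deterministically.
now = [0.0]
store = TtlKeyStore(ttl_seconds=24 * 3600, clock=lambda: now[0])

first = store.set_if_absent("uuid-7f3a")          # new key
now[0] += 3600                                    # retry one hour later
within_window = store.set_if_absent("uuid-7f3a")
now[0] += 24 * 3600                               # replay after the key has expired
after_expiry = store.set_if_absent("uuid-7f3a")

print(first, within_window, after_expiry)  # True False True
```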

Practical Checklist

  • Every producer assigns a UUID idempotency key once and reuses it on all retries.
  • Every consumer checks the key atomically before executing the side effect.
  • The dedup store uses an atomic primitive (SET NX or unique constraint).
  • Business logic and key recording happen in the same transaction where possible.
  • TTL is set long enough to cover the full retry window, plus margin.
  • External services (payment gateways, email providers) receive the same key so they can deduplicate independently.