Idempotency: Making Operations Safe to Retry

A customer clicks “Pay” on your checkout page. The request goes out. The network times out. The client does not know if the payment went through. It retries. The payment goes through twice. The customer is charged twice. Your support queue fills up with angry emails.

This is the retry problem. And idempotency is the solution.

What idempotency means

An operation is idempotent if performing it multiple times has the same effect as performing it once. The result is the same whether you call it once or a hundred times.
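A minimal sketch of the definition in Python. The `set_name` and `add_credit` functions and the dictionary standing in for a user record are hypothetical, chosen only to contrast the two behaviors.

```python
# Hypothetical user record, used only for illustration.
user = {"name": "Bob", "credit": 0}

def set_name(record, name):
    """Idempotent: repeating the call leaves the same state as calling it once."""
    record["name"] = name

def add_credit(record, amount):
    """Not idempotent: every call changes the state again."""
    record["credit"] += amount

# Apply both operations three times, as a retrying client might.
for _ in range(3):
    set_name(user, "Alice")
    add_credit(user, 10)

print(user)  # {'name': 'Alice', 'credit': 30}
```

The name ends up the same as after a single call, while the credit reflects every repetition.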

Naturally idempotent operations:

GET /users/123 - Reading data does not change state. Call it 100 times, same result.
PUT /users/123 {name: "Alice"} - Setting a value to the same value is idempotent. The 10th call has the same effect as the first.
DELETE /users/123 - Deleting something that is already deleted is a no-op (or returns 404, which is fine).

Not naturally idempotent:

POST /payments - Creating a payment twice creates two payments.
POST /orders - Creating an order twice creates two orders.
POST /emails/send - Sending an email twice sends two emails.

The challenge: POST requests that create resources are not naturally idempotent. But they need to be safe to retry.

Idempotency keys

The standard solution: the client generates a unique key for each operation and includes it in the request. The server uses this key to detect and deduplicate retries.

How it works:

Client generates a unique ID (UUID) for the operation and sends it in a header, for example Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
Client sends the request with the idempotency key
Server checks if it has seen this key before
If not: process the request, store the result with the key
If yes: return the stored result without processing again
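The steps above can be sketched with an in-memory server. `PaymentServer`, `create_payment`, and the `pay_N` identifiers are hypothetical names for illustration; a real server would keep the keys in durable storage rather than a dictionary.

```python
import uuid

class PaymentServer:
    """Toy server that remembers the response for every idempotency key it has seen."""

    def __init__(self):
        self.responses = {}  # idempotency key -> stored response
        self.charges = []    # side effects that were actually performed

    def create_payment(self, idempotency_key, amount):
        # Seen before: return the stored result without processing again.
        if idempotency_key in self.responses:
            return self.responses[idempotency_key]
        # Not seen: process the request, then store the result under the key.
        payment_id = f"pay_{len(self.charges) + 1}"
        self.charges.append((payment_id, amount))
        response = {"payment_id": payment_id, "amount": amount}
        self.responses[idempotency_key] = response
        return response

server = PaymentServer()
key = str(uuid.uuid4())                  # generated once by the client
first = server.create_payment(key, 500)  # original request
retry = server.create_payment(key, 500)  # retry with the same key
```

Here `retry` is the same response as `first`, and `server.charges` holds a single entry.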

graph TB
subgraph first["First Request"]
  C1["Client<br/>idempotency-key: abc123"] -->|"POST /payments"| S1["Server"]
  S1 -->|"Check key: not seen"| DB1["Store result<br/>key=abc123<br/>result=payment_id_456"]
  DB1 -->|"Process payment"| PAY["Payment processor"]
  PAY -->|"success"| S1
  S1 -->|"201 Created<br/>payment_id: 456"| C1
end

subgraph retry["Retry - Same Key"]
  C2["Client<br/>idempotency-key: abc123"] -->|"POST /payments (retry)"| S2["Server"]
  S2 -->|"Check key: FOUND"| DB2["Return stored result<br/>key=abc123<br/>result=payment_id_456"]
  DB2 -->|"200 OK<br/>payment_id: 456<br/>(same as before)"| C2
end

style DB1 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style DB2 fill:#E1F5EE,stroke:#0F6E56,color:#085041
style PAY fill:#FAEEDA,stroke:#854F0B,color:#633806

Implementing idempotency keys

Server-side storage

Store idempotency keys in a fast, durable store (Redis or a database table):

idempotency_keys table:
- key (unique)
- request_hash (hash of the request body)
- response_status
- response_body
- created_at
- expires_at

On each request:

Look up the key
If found and request hash matches: return stored response
If found and request hash does not match: return 422 (same key, different request - client bug)
If not found: process the request, store the result, return the response
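One way to sketch the lookup flow is with SQLite from the Python standard library. The `handle_request`, `request_hash`, and `charge` functions are hypothetical, and the table omits the `created_at` and `expires_at` columns to keep the example short.

```python
import hashlib
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE idempotency_keys (
           key TEXT PRIMARY KEY,
           request_hash TEXT NOT NULL,
           response_status INTEGER NOT NULL,
           response_body TEXT NOT NULL
       )"""
)

def request_hash(body):
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def handle_request(key, body, process):
    digest = request_hash(body)
    row = conn.execute(
        "SELECT request_hash, response_status, response_body "
        "FROM idempotency_keys WHERE key = ?",
        (key,),
    ).fetchone()
    if row is not None:
        stored_hash, status, stored_body = row
        if stored_hash != digest:
            # Same key, different request: treat as a client bug.
            return 422, {"error": "idempotency key reused with a different request"}
        return status, json.loads(stored_body)
    # Not found: process, store the result, return the response.
    status, response = process(body)
    with conn:
        conn.execute(
            "INSERT INTO idempotency_keys VALUES (?, ?, ?, ?)",
            (key, digest, status, json.dumps(response)),
        )
    return status, response

calls = []

def charge(body):
    """Hypothetical processing step; records each real execution."""
    calls.append(body)
    return 201, {"payment_id": len(calls)}

first = handle_request("key-1", {"amount": 500}, charge)
retry = handle_request("key-1", {"amount": 500}, charge)
mismatch = handle_request("key-1", {"amount": 900}, charge)
```

In this run `charge` executes once: the retry replays the stored response and the mismatched body is rejected with 422.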

Key expiration

Idempotency keys should expire. A key from 30 days ago is unlikely to be retried. Expire keys after 24-72 hours. After expiration, the same key can be reused (though clients should generate new keys for new operations).
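A minimal sketch of expiration, assuming an in-memory store. `ExpiringKeyStore` and its `clock` parameter are hypothetical; the injectable clock exists only so that expiry can be exercised without waiting.

```python
import time

class ExpiringKeyStore:
    """Stores a response per key and forgets it once the TTL has passed."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (expires_at, response)

    def put(self, key, response):
        self.entries[key] = (self.clock() + self.ttl, response)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self.clock() >= expires_at:
            # Expired keys behave as if they had never been seen.
            del self.entries[key]
            return None
        return response

    def purge_expired(self):
        """Remove all expired keys; returns how many were removed."""
        now = self.clock()
        expired = [k for k, (exp, _) in self.entries.items() if now >= exp]
        for k in expired:
            del self.entries[k]
        return len(expired)
```

With a database table, the equivalent would typically be a periodic delete of rows whose `expires_at` is in the past.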

Handling in-progress requests

What if two requests with the same key arrive simultaneously? The first request is being processed. The second arrives before the first completes.

Options:

Return 409 Conflict (request in progress, try again later)
Use a distributed lock: the first request acquires a lock on the key, the second waits

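Redis SET key value NX EX 30 (set only if the key does not exist, expire after 30 seconds) is a common pattern for this. The expiry should be longer than the slowest expected request; otherwise a second request can acquire the lock while the first is still running.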
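The 409 option can be sketched without a Redis server by imitating the set-if-absent-with-expiry behavior in memory. `KeyLocks` and `handle_with_lock` are hypothetical names; this is a single-process stand-in, not a distributed lock.

```python
import threading
import time

class KeyLocks:
    """In-memory stand-in for `SET key value NX EX ttl`."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._mutex = threading.Lock()
        self._expiry = {}  # key -> time at which the lock expires

    def acquire(self, key, ttl_seconds=30):
        with self._mutex:
            now = self._clock()
            expires_at = self._expiry.get(key)
            if expires_at is not None and expires_at > now:
                return False  # someone else holds an unexpired lock
            self._expiry[key] = now + ttl_seconds
            return True

    def release(self, key):
        with self._mutex:
            self._expiry.pop(key, None)

locks = KeyLocks()

def handle_with_lock(key, process):
    # A second request with the same key is rejected while the first runs.
    if not locks.acquire(key):
        return 409, {"error": "a request with this key is in progress"}
    try:
        return 201, process()
    finally:
        locks.release(key)
```

A production lock would usually also store an owner token, so that a slow request cannot release a lock that has already expired and been acquired by another request.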

Where it breaks or gets interesting

Idempotency vs at-most-once vs at-least-once

At-most-once delivery - The operation is performed zero or one times. If the network fails, the operation might not happen. No retries. Used when duplicate operations are worse than missing operations (sending a notification).

At-least-once delivery - The operation is performed one or more times. Retries are used to ensure delivery. Requires idempotency to handle duplicates. Used when missing operations are worse than duplicates (processing a payment).

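Exactly-once delivery - The operation takes effect exactly once. A network cannot guarantee that a message is delivered exactly once, so in practice this means at-least-once delivery combined with idempotent processing (often called effectively-once). This is the goal for most critical operations.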

Idempotency in distributed systems

In a distributed system, the idempotency key store must be consistent. If you store keys in Redis and Redis fails over, recently written keys might be lost, because Redis replication is asynchronous by default. A retry after the failover would be processed as a new request.

For critical operations (payments), use a database with ACID transactions for idempotency key storage. The key lookup and the operation should be in the same transaction.
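A minimal sketch of writing the payment and its key atomically, using SQLite as a stand-in for any ACID database. The table layout, `record_payment`, and the `fail_midway` flag (which simulates a crash between the two writes) are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
db.execute(
    "CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, payment_id INTEGER NOT NULL)"
)

def record_payment(key, amount, fail_midway=False):
    # `with db` commits if the block succeeds and rolls back if it raises.
    with db:
        row = db.execute(
            "SELECT payment_id FROM idempotency_keys WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]  # retry: return the existing payment
        cur = db.execute("INSERT INTO payments (amount) VALUES (?)", (amount,))
        payment_id = cur.lastrowid
        if fail_midway:
            raise RuntimeError("simulated crash between the two writes")
        db.execute("INSERT INTO idempotency_keys VALUES (?, ?)", (key, payment_id))
        return payment_id
```

If the block fails after the payment row is written, the rollback removes that row as well, so a later retry with the same key starts from a clean state. The primary key on `key` also rejects a concurrent duplicate insert.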

The two-phase commit alternative

For operations that span multiple services (create order + charge payment + update inventory), idempotency alone is not enough. You need one of:

A distributed transaction (two-phase commit) - complex and fragile
The Saga pattern - a sequence of local transactions with compensating actions
Idempotency at each step plus a coordinator that tracks overall progress
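The Saga option can be sketched as a list of (action, compensation) pairs. `run_saga` and the step functions are hypothetical; real steps would call other services and would each need to be idempotent themselves.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Compensate in reverse order for every step that already succeeded.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True

log = []

def update_inventory():
    raise RuntimeError("item out of stock")  # simulated failure in the last step

ok = run_saga([
    (lambda: log.append("order created"), lambda: log.append("order cancelled")),
    (lambda: log.append("payment charged"), lambda: log.append("payment refunded")),
    (update_inventory, lambda: log.append("inventory restored")),
])
```

In this run the inventory step fails, so the payment is refunded and the order cancelled, in that order, and `ok` is `False`.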

Consumer idempotency in message queues

Most message queues provide at-least-once delivery, so your consumer may receive the same message more than once. The consumer must be idempotent.

Pattern: include a unique message ID in every message. The consumer stores processed message IDs. Before processing, check if the ID has been seen. If yes, skip. If no, process and store the ID.
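A minimal sketch of this pattern, assuming messages are dictionaries with hypothetical `id` and `body` fields. The processed IDs live in memory here; a real consumer would persist them.

```python
class IdempotentConsumer:
    """Skips any message whose ID has already been processed."""

    def __init__(self, handler):
        self.handler = handler
        self.processed_ids = set()

    def on_message(self, message):
        message_id = message["id"]
        if message_id in self.processed_ids:
            return False  # duplicate delivery: skip
        self.handler(message["body"])
        # Mark as processed only after the handler succeeded.
        self.processed_ids.add(message_id)
        return True

sent = []
consumer = IdempotentConsumer(sent.append)

# The queue redelivers m1, as at-least-once delivery allows.
deliveries = [
    {"id": "m1", "body": "welcome"},
    {"id": "m1", "body": "welcome"},
    {"id": "m2", "body": "receipt"},
]
results = [consumer.on_message(m) for m in deliveries]
```

Because the ID is recorded after the handler runs, a handler that raises leaves the message unmarked and eligible for redelivery.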

Real-world systems

Stripe - Accepts an Idempotency-Key header on POST requests. Keys can be pruned once they are at least 24 hours old. Reusing a key with different request parameters returns an error. Stripe’s documentation recommends idempotency keys for requests that create or update objects, but does not require them.

PayPal - PayPal-Request-Id header for idempotency. Keys are stored for 72 hours.

AWS - Many AWS APIs support idempotency tokens: EC2 ClientToken, MessageDeduplicationId on SQS FIFO queues, DynamoDB conditional writes.

Kafka - Producer idempotency: each producer is assigned a unique ID and attaches a per-partition sequence number to what it sends. The broker discards messages whose producer ID and sequence number it has already written.

Twilio - I-Twilio-Idempotency-Token header on webhook requests, so the receiving application can detect retried webhooks.

How to apply it in practice

When to require idempotency keys

Require idempotency keys for:

Payment operations
Order creation
Email/SMS sending
Any operation that creates a resource and should not be duplicated

Do not require them for:

Read operations (naturally idempotent)
Operations where duplicates are acceptable (logging, analytics)

Client-side implementation

The client is responsible for generating and storing idempotency keys:

Generate a UUID before making the request
Store the key with the operation details (in case of retry)
Include the key in the request header
On timeout or network error: retry with the same key
On success: discard the stored key
On permanent failure (4xx, except a 409 that signals the original request is still in progress): discard the stored key, do not retry
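The client steps can be sketched as a retry loop that generates the key once and reuses it. `send_with_retries`, `PermanentError`, and the `send(payload, key)` callback are hypothetical, and the key is held in a local variable rather than persisted, which a real client would want for retries that survive a restart.

```python
import time
import uuid

class PermanentError(Exception):
    """Stands in for a 4xx response that retrying cannot fix."""

def send_with_retries(send, payload, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    key = str(uuid.uuid4())  # generated once, before the first attempt
    for attempt in range(max_attempts):
        try:
            # Every attempt carries the same key.
            return send(payload, key)
        except PermanentError:
            raise  # do not retry permanent failures
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s
```

The `sleep` parameter is injectable so that the backoff schedule can be tested without real waiting.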

Idempotency key scope

Keys should be scoped to the operation type. A key used for a payment should not be reused for an order. Include the operation type in the key or use separate key namespaces.
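One way to namespace keys is to prefix the client's key with the operation type before storing it. The `scoped_key` and `first_use` helpers and the `operation:key` format are assumptions for illustration.

```python
def scoped_key(operation, client_key):
    """Combine the operation type and the client's key into one storage key."""
    if not operation or ":" in operation:
        raise ValueError("operation must be a non-empty name without ':'")
    return f"{operation}:{client_key}"

seen_keys = set()

def first_use(operation, client_key):
    """Return True the first time a key is used for a given operation type."""
    key = scoped_key(operation, client_key)
    if key in seen_keys:
        return False
    seen_keys.add(key)
    return True
```

With this scheme, the same client key used for a payment and for an order is stored as two independent entries, so neither request is mistaken for a retry of the other.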

FAQ

Q: What is the difference between idempotency and immutability?

Idempotency means an operation can be performed multiple times with the same result. Immutability means data cannot be changed after creation. They are related but different. An immutable data store (append-only log) is not idempotent on its own: appending the same record twice produces two entries unless writes are deduplicated by a key. And idempotency does not require immutability - you can have idempotent operations on mutable data (using idempotency keys to deduplicate).

Q: Should the server or the client generate the idempotency key?

The client generates the key. The client knows when it is retrying the same operation. The server cannot distinguish a retry from a new request without the key. The client generates a UUID before making the request and uses the same UUID for all retries of that operation.

Q: What happens if the idempotency key store goes down?

If the key store is unavailable, you have two options: fail the request (safe but reduces availability) or process without idempotency checking (risky - might create duplicates). For critical operations like payments, fail the request. For less critical operations, you might accept the risk of occasional duplicates. Design your system to detect and handle duplicates even if idempotency checking fails (e.g., check for duplicate orders by user + product + timestamp before creating).

Interview questions

Q1: A user clicks “Submit Order” twice quickly. How do you prevent duplicate orders?

Strong answer: Multiple layers of defense. First, disable the submit button after the first click (client-side). Second, use an idempotency key: generate a UUID when the order form is loaded, include it in the submit request. The server checks if this key has been used. If yes, return the existing order. If no, create the order and store the key. Third, add a database unique constraint on (user_id, idempotency_key) to prevent race conditions where two requests arrive simultaneously. The unique constraint ensures only one INSERT succeeds even if two requests arrive at the same time. The idempotency key approach handles the case where the first request succeeded but the client did not receive the response (network timeout).
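The unique-constraint layer can be sketched with SQLite. The `orders` table layout and `create_order` are hypothetical; the point is that the insert is attempted first and the constraint decides which request wins.

```python
import sqlite3

orders_db = sqlite3.connect(":memory:")
orders_db.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           user_id INTEGER NOT NULL,
           idempotency_key TEXT NOT NULL,
           item TEXT NOT NULL,
           UNIQUE (user_id, idempotency_key)
       )"""
)

def create_order(user_id, idempotency_key, item):
    """Return (order_id, created); created is False for a duplicate submit."""
    try:
        with orders_db:
            cur = orders_db.execute(
                "INSERT INTO orders (user_id, idempotency_key, item) VALUES (?, ?, ?)",
                (user_id, idempotency_key, item),
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The constraint rejected the duplicate: return the existing order.
        row = orders_db.execute(
            "SELECT id FROM orders WHERE user_id = ? AND idempotency_key = ?",
            (user_id, idempotency_key),
        ).fetchone()
        return row[0], False

first_click = create_order(1, "form-key-1", "book")
second_click = create_order(1, "form-key-1", "book")
```

Both clicks resolve to the same order ID, and only the first reports that it created a row.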

Q2: You are building a payment service. A payment request times out. The client does not know if the payment was processed. How do you handle this?

Strong answer: The client should retry with the same idempotency key. The server checks the key: if the payment was processed, return the existing payment result. If not, process it now. The client gets a definitive answer either way. On the server side: store the idempotency key and result in the same database transaction as the payment record. This ensures atomicity - either both the payment and the key are stored, or neither is. Use a database with ACID transactions for this, not Redis (which might lose data on failover). The client should implement exponential backoff for retries: wait 1 second, then 2 seconds, then 4 seconds. After N retries, surface an error to the user and let them check their payment history.

Q3: How do you implement idempotency for a Kafka consumer that processes payment events?

Strong answer: Each payment event has a unique event ID. The consumer maintains a processed events table in the same database as the payment records. Before processing an event: check if the event ID is in the table. If yes, skip (already processed). If no, process the event and insert the ID in the same database transaction as the payment processing. This ensures atomicity: if the payment processing fails, the event ID is not marked as processed, so the event will be retried. If the payment processing succeeds but the consumer crashes before committing the Kafka offset, the event will be redelivered, and the idempotency check prevents double processing. A Redis set cannot share a transaction with the payment write, so it only suits handlers that can tolerate an occasional duplicate. Expire old event IDs after 7 days to prevent the table from growing indefinitely.
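A minimal sketch of the transactional part of this answer, using SQLite in place of the payment database and leaving out the Kafka client entirely. The table names, `process_event`, `purge_old_events`, and the `fail` flag that simulates a processing error are all assumptions.

```python
import sqlite3

SEVEN_DAYS = 7 * 24 * 3600

events_db = sqlite3.connect(":memory:")
events_db.execute(
    "CREATE TABLE processed_events (event_id TEXT PRIMARY KEY, processed_at REAL NOT NULL)"
)
events_db.execute("CREATE TABLE ledger (event_id TEXT NOT NULL, amount INTEGER NOT NULL)")

def process_event(event, now):
    """Apply the event and mark it processed in one transaction."""
    try:
        with events_db:  # commit on success, roll back on any exception
            # The primary key rejects an event ID that was already recorded.
            events_db.execute(
                "INSERT INTO processed_events VALUES (?, ?)", (event["id"], now)
            )
            events_db.execute(
                "INSERT INTO ledger VALUES (?, ?)", (event["id"], event["amount"])
            )
            if event.get("fail"):
                raise RuntimeError("simulated processing failure")
    except sqlite3.IntegrityError:
        return "skipped"  # redelivered event: already processed
    return "processed"

def purge_old_events(now, max_age=SEVEN_DAYS):
    """Delete event IDs older than max_age seconds; returns rows removed."""
    with events_db:
        cur = events_db.execute(
            "DELETE FROM processed_events WHERE processed_at < ?", (now - max_age,)
        )
    return cur.rowcount
```

A failed event leaves no row in either table, so redelivery processes it normally, while a redelivered event that already succeeded is skipped.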