Cache Patterns: Reading and Writing Through a Cache Correctly


Your product page loads in 800ms. The database query takes 600ms. You add Redis. You cache the product data. The page now loads in 50ms. Three days later, a product’s price changes. The cache still shows the old price. Customers are buying at the wrong price. You flush the cache. The database gets hammered by 10,000 simultaneous requests. It falls over.

You cached correctly but invalidated incorrectly. Caching is not just “store this in Redis.” It is a set of patterns with specific tradeoffs around consistency, performance, and failure modes.

The four core cache patterns

Cache-aside (lazy loading)

The application manages the cache directly. On a read: check the cache first. If it is there (cache hit), return it. If not (cache miss), fetch from the database, store in the cache, return the result.

On a write: update the database, then invalidate (delete) the cache entry. The next read will repopulate it.
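The read and write paths above can be sketched in a few lines. This is a minimal illustration, not a production client: the `database` and `cache` dictionaries are hypothetical stand-ins for a real database and a cache such as Redis, and `get_product` and `update_price` are names invented for the example.

```python
from typing import Optional

# Hypothetical stand-ins: a dict for the database and a dict for the cache.
database = {"product:123": {"name": "Widget", "price": 1999}}
cache: dict = {}
db_reads = 0  # counts database queries so the effect of caching is visible


def get_product(key: str) -> Optional[dict]:
    """Read path: try the cache, fall back to the database on a miss."""
    global db_reads
    if key in cache:                 # cache hit
        return cache[key]
    db_reads += 1                    # cache miss: query the database
    row = database.get(key)
    if row is not None:
        cache[key] = dict(row)       # populate the cache for later reads
    return row


def update_price(key: str, price: int) -> None:
    """Write path: update the database first, then delete the cache entry."""
    database[key]["price"] = price
    cache.pop(key, None)             # the next read repopulates the entry


get_product("product:123")           # miss, reads the database
get_product("product:123")           # hit, served from the cache
update_price("product:123", 2499)    # invalidates the cached copy
latest = get_product("product:123")  # miss again, picks up the new price
```

Four calls result in two database reads: one for the cold cache and one after the invalidation.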

graph LR
subgraph read["Cache-Aside Read"]
  APP1["Application"] -->|"1. Get product:123"| CACHE1["Cache"]
  CACHE1 -->|"2. Cache miss"| APP1
  APP1 -->|"3. Query DB"| DB1["Database"]
  DB1 -->|"4. Return data"| APP1
  APP1 -->|"5. Store in cache"| CACHE1
  APP1 -->|"6. Return to client"| CLIENT1["Client"]
end

style CACHE1 fill:#E1F5EE,stroke:#0F6E56,color:#085041
style DB1 fill:#EEEDFE,stroke:#534AB7,color:#3C3489

Pros: Simple. Only caches data that is actually requested. Cache failures do not break reads (just slower). Works with any database.

Cons: Cache miss penalty (3 round trips instead of 1). Stale data between write and cache invalidation. Cache stampede on cold start.

Best for: Read-heavy workloads where cache misses are acceptable. Most web applications use this pattern.

Read-through

The cache sits in front of the database. The application only talks to the cache. On a cache miss, the cache itself fetches from the database and populates itself.
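A rough sketch of the idea, assuming the cache is handed a loader function when it is created. `ReadThroughCache`, `rows`, and `users` are hypothetical names, and a real loader would run a database query rather than a dictionary lookup.

```python
from typing import Callable, Dict, Optional


class ReadThroughCache:
    """Cache that owns the loader, so callers never query the database."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self._loader = loader            # how the cache reaches the database
        self._entries: Dict[str, str] = {}
        self.loads = 0                   # number of times the loader ran

    def get(self, key: str) -> Optional[str]:
        if key not in self._entries:
            self.loads += 1
            value = self._loader(key)    # the cache, not the caller, fetches
            if value is None:
                return None
            self._entries[key] = value
        return self._entries[key]


# Hypothetical backing store standing in for a database table.
rows = {"user:1": "Ada"}
users = ReadThroughCache(rows.get)
first = users.get("user:1")    # miss: loaded through the cache
second = users.get("user:1")   # hit: the loader is not called again
```

The calling code only ever sees `get`; the miss handling that cache-aside puts in the application lives inside the cache here.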

Pros: Application code is simpler (no cache miss handling). Cache is always populated before returning to the application.

Cons: First request for any key is always slow (cache miss). Cache must know how to talk to the database. Less flexible than cache-aside.

Best for: When you want to hide cache complexity from the application. Used by some ORM-level caches.

Write-through

Every write goes to both the cache and the database synchronously. The write is not acknowledged until both succeed.
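A minimal sketch of a synchronous write path. Writing to the database before the cache is an assumption of this example, chosen so that a failed database write never leaves a value in the cache; `WriteThroughCache` and the dictionary database are hypothetical.

```python
from typing import Dict


class WriteThroughCache:
    """Writes go to the database and the cache before put() returns."""

    def __init__(self, database: Dict[str, str]) -> None:
        self._database = database        # hypothetical stand-in for a real DB
        self._entries: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        # Database first: if this raises, the cache is left untouched and
        # the caller sees the failure instead of an acknowledged write.
        self._database[key] = value
        self._entries[key] = value

    def get(self, key: str) -> str:
        return self._entries[key]


db: Dict[str, str] = {}
profiles = WriteThroughCache(db)
profiles.put("user:1:profile", "Ada")
```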

Pros: Cache stays consistent with the database as long as both writes succeed or fail together. No stale reads after writes.

Cons: Higher write latency (two writes instead of one). Cache fills with data that might never be read (write-heavy data that is rarely read wastes cache space).

Best for: Read-heavy data that is written infrequently. User profiles, configuration data.

Write-behind (write-back)

Writes go to the cache immediately. The cache asynchronously writes to the database in the background.
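The sketch below illustrates the buffering, under the assumption that a timer or background worker calls `flush()` periodically. `WriteBehindCounter` is a hypothetical class and the dictionary stands in for the database.

```python
from typing import Dict, Set


class WriteBehindCounter:
    """Buffers writes in memory and flushes them to the database later."""

    def __init__(self, database: Dict[str, int]) -> None:
        self._database = database          # hypothetical stand-in for a DB
        self._entries: Dict[str, int] = {}
        self._dirty: Set[str] = set()      # keys changed since the last flush
        self.db_writes = 0

    def increment(self, key: str) -> int:
        # Only the cache is touched, so the caller never waits on the database.
        if key not in self._entries:
            self._entries[key] = self._database.get(key, 0)
        self._entries[key] += 1
        self._dirty.add(key)
        return self._entries[key]

    def flush(self) -> None:
        # One database write per dirty key, however many increments it had.
        for key in self._dirty:
            self._database[key] = self._entries[key]
            self.db_writes += 1
        self._dirty.clear()


db: Dict[str, int] = {}
views = WriteBehindCounter(db)
for _ in range(100):
    views.increment("page:home:views")
unflushed = dict(db)   # still empty: a crash here would lose all 100 writes
views.flush()          # in production a timer or worker would call this
```

One hundred increments collapse into a single database write, and the `unflushed` snapshot shows what the database would hold if the cache died before the flush.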

Pros: Very low write latency. Batches multiple writes to the same key into one database write.

Cons: Risk of data loss if the cache fails before writing to the database. Complex to implement correctly. Consistency is harder to reason about.

Best for: High-frequency writes where some data loss is acceptable. Analytics counters, view counts, game scores.

graph TB
subgraph patterns["Cache Pattern Comparison"]
  CA["Cache-Aside
App manages cache
Lazy population"]
  RT["Read-Through
Cache fetches from DB
Transparent to app"]
  WT["Write-Through
Sync write to cache+DB
Always consistent"]
  WB["Write-Behind
Async write to DB
Low latency writes"]
end

subgraph tradeoffs["Tradeoffs"]
  T1["Simple, stale risk
Cache miss penalty"]
  T2["Simpler app code
First request slow"]
  T3["Consistent, slower writes
Cache bloat risk"]
  T4["Fast writes, data loss risk
Complex recovery"]
end

CA --- T1
RT --- T2
WT --- T3
WB --- T4

style CA fill:#E1F5EE,stroke:#0F6E56,color:#085041
style RT fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style WT fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style WB fill:#FAEEDA,stroke:#854F0B,color:#633806

Cache invalidation strategies

Cache invalidation is famously hard. There are two main approaches:

TTL-based expiration - Every cache entry has a time-to-live. After the TTL expires, the next read fetches fresh data. Simple, predictable, but data can be stale for up to TTL seconds.

Event-based invalidation - When data changes, explicitly delete or update the cache entry. More complex but more accurate. Requires the write path to know what to invalidate.

The hybrid approach: use a short TTL as a safety net, plus explicit invalidation on writes. The TTL catches cases where invalidation fails.
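As a rough sketch of the hybrid approach, the cache below expires entries by TTL and also exposes an explicit `invalidate`. `TTLCache` is a hypothetical in-memory class, and the injectable clock is an assumption made so that expiry can be shown without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Entries expire after ttl seconds and can also be deleted explicitly."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock    # injectable so expiry is testable without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:    # safety net: the TTL has run out
            del self._entries[key]
            return None
        return value

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)       # event-based path, called on writes


now = [0.0]                                # fake clock for the demonstration
prices = TTLCache(ttl=60, clock=lambda: now[0])
prices.set("product:123:price", 1999)
fresh = prices.get("product:123:price")    # within the TTL
prices.invalidate("product:123:price")     # a write happened: delete right away
after_write = prices.get("product:123:price")
prices.set("product:123:price", 2499)
now[0] = 61.0                              # no invalidation arrived; TTL expires
after_ttl = prices.get("product:123:price")
```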

Where it breaks or gets interesting

The cache stampede (thundering herd)

A popular cache entry expires. Simultaneously, 1,000 requests come in for that entry. All 1,000 see a cache miss. All 1,000 query the database. The database gets 1,000 simultaneous queries for the same data. It falls over.

Solutions:

  • Probabilistic early expiration - Before the TTL expires, randomly start refreshing the cache. The probability of refreshing increases as the TTL approaches zero. This spreads the refresh load over time.
  • Mutex/lock - When a cache miss occurs, acquire a lock. Only one request fetches from the database. Others wait for the lock and then read from the cache. Adds latency but prevents stampede.
  • Background refresh - Serve stale data while asynchronously refreshing the cache. The user gets a slightly stale response but never waits for a database query.
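The mutex option can be sketched with one lock per key inside a single process. This is an illustration under that assumption; `get_with_lock` and `slow_query` are hypothetical names, and the 50ms sleep stands in for an expensive query.

```python
import threading
import time
from typing import Callable, Dict, List

cache: Dict[str, str] = {}
locks: Dict[str, threading.Lock] = {}
locks_guard = threading.Lock()   # protects the locks dictionary itself
db_queries = 0


def slow_query(key: str) -> str:
    """Hypothetical expensive database query."""
    global db_queries
    db_queries += 1
    time.sleep(0.05)
    return f"value-for-{key}"


def get_with_lock(key: str, load: Callable[[str], str]) -> str:
    value = cache.get(key)
    if value is not None:
        return value
    with locks_guard:
        lock = locks.setdefault(key, threading.Lock())
    with lock:                       # only one thread per key gets past here
        value = cache.get(key)       # re-check: another thread may have filled it
        if value is None:
            value = load(key)
            cache[key] = value
    return value


results: List[str] = []


def worker() -> None:
    results.append(get_with_lock("product:123", slow_query))


threads = [threading.Thread(target=worker) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Twenty concurrent readers produce one database query. Across several application servers the in-process lock would need to become a distributed one, for example a Redis key set with the `NX` and `PX` options, which this sketch does not cover.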

Cache penetration

Cache penetration happens when requests ask for keys that do not exist in the database. Nothing is ever cached for them, so every request misses the cache and hits the database. If an attacker sends millions of requests for non-existent user IDs, the database gets hammered.

Solutions:

  • Cache negative results - Store a “not found” marker in the cache with a short TTL. Subsequent requests for the same key return the cached “not found” without hitting the database.
  • Bloom filter - A probabilistic data structure that can tell you a key definitely does not exist; it can return false positives but never false negatives. Populate it with every valid key and check it before the cache. If the bloom filter says “no,” skip both the cache and the database.
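Both defenses can be combined in one lookup function, sketched below. Everything here is illustrative: `BloomFilter` is a small hand-rolled implementation built on SHA-256 rather than a library class, the sizes are arbitrary, and the “not found” marker is stored without the short TTL a real deployment would give it.

```python
import hashlib
from typing import Dict, Iterator, Optional

NOT_FOUND = object()   # sentinel that marks a cached "no such row" result


class BloomFilter:
    """Small Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, hashes: int = 3) -> None:
        self._size = size_bits
        self._hashes = hashes
        self._bits = bytearray(size_bits // 8)

    def _positions(self, key: str) -> Iterator[int]:
        for i in range(self._hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self._size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self._bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


database: Dict[str, str] = {"user:1": "Ada"}   # hypothetical user table
known_keys = BloomFilter()
for existing in database:
    known_keys.add(existing)

cache: Dict[str, object] = {}
db_queries = 0


def get_user(key: str, use_bloom: bool = True) -> Optional[object]:
    global db_queries
    if use_bloom and not known_keys.might_contain(key):
        return None                      # definitely absent: skip cache and DB
    if key in cache:
        hit = cache[key]
        return None if hit is NOT_FOUND else hit
    db_queries += 1
    row = database.get(key)
    cache[key] = NOT_FOUND if row is None else row   # cache the miss too
    return row


found = get_user("user:1")
get_user("user:999", use_bloom=False)   # first lookup reaches the database
get_user("user:999", use_bloom=False)   # second is answered by the marker
```

The last two calls bypass the filter on purpose so that the negative-caching path is visible on its own: the missing key costs one database query, not two.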

Cache inconsistency with write-then-invalidate

The most common cache-aside pattern: write to the database, then delete the cache entry. But there is a race condition:

  1. Request A reads from cache: miss
  2. Request A reads from database: gets value V1
  3. Request B writes V2 to database
  4. Request B deletes cache entry
  5. Request A stores V1 in cache

Now the cache has stale data V1 even though the database has V2. The cache will serve V1 until it expires.
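The interleaving is easier to see when replayed one step at a time. The snippet below is a deterministic replay of the five steps, with dictionaries as hypothetical stand-ins for the database and the cache; no real concurrency is involved.

```python
database = {"product:123": "V1"}
cache: dict = {}

# Step 1: request A checks the cache and misses.
a_hit = "product:123" in cache
# Step 2: request A reads V1 from the database.
a_value = database["product:123"]
# Step 3: request B writes V2 to the database.
database["product:123"] = "V2"
# Step 4: request B deletes the cache entry (a no-op, nothing is cached yet).
cache.pop("product:123", None)
# Step 5: request A stores the value it read earlier.
cache["product:123"] = a_value
```

At the end the database holds V2 and the cache holds V1, because B's delete ran before A's store and so had nothing to remove.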

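The fix is not to swap the order: deleting the cache entry before writing to the database opens an even wider window, because any reader that misses between the delete and the write repopulates the cache with the old value. More reliable options are to delete the entry a second time after a short delay (delayed double delete), to store a version number with each value and refuse to overwrite a newer version with an older one, or to take a distributed lock around the read-and-populate path. Or accept the brief inconsistency if your TTL is short.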

The write-behind data loss scenario

Write-behind caches buffer writes in memory. If the cache server crashes before flushing to the database, those writes are lost. For financial data, this is catastrophic. For view counts, it is acceptable.

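If you use write-behind, give the cache persistence (Redis AOF or RDB snapshots) and replication so a single node failure does not lose everything. These narrow the loss window rather than close it: RDB snapshots lose whatever was written since the last snapshot, and AOF with the default once-per-second fsync can lose about a second of writes.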

Real-world systems

Facebook TAO - Facebook’s distributed cache for social graph data. It acts as a read-through and write-through cache in front of MySQL, with careful invalidation across regions. The published paper reports on the order of a billion reads per second.

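Twitter - Has used Redis for home timeline caching. Timelines are pre-computed and cached: when an account tweets, the tweet is written into the cached timelines of its followers (fan-out on write). For accounts with very large followings that fan-out is too expensive, so their tweets are merged in when a follower reads the timeline.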

Amazon - DynamoDB Accelerator (DAX) is a read-through and write-through cache for DynamoDB. It is largely transparent to the application: you swap the DynamoDB client for the API-compatible DAX client, and DAX handles cache misses automatically.

Cloudflare - CDN caching uses TTL-based expiration with cache-control headers. Supports cache purge API for explicit invalidation.

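Redis - One of the most widely used cache layers for web applications. Cache-aside works with plain GET, SET, and DEL commands; read-through, write-through, and write-behind need application code or a library layered on top, because Redis itself does not talk to your database.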

How to apply it in practice

Choosing a pattern

  • Default choice: cache-aside with TTL. Simple, works for most cases.
  • Need reads to reflect every write immediately: write-through. Accept the write latency.
  • High write throughput, some data loss OK: write-behind. Use for counters and metrics.
  • Want to hide cache from application code: read-through (requires cache that supports it).

Setting TTLs

  • Static data (product catalog, configuration): 1-24 hours
  • Semi-static data (user profiles, prices): 5-60 minutes
  • Dynamic data (inventory counts, session data): 30 seconds to 5 minutes
  • Real-time data (stock prices, live scores): no cache, or 1-5 seconds
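One way to keep these choices in one place is a small lookup table. The values below are illustrative picks from inside the ranges above, not recommendations, and `TTL_SECONDS` and `ttl_for` are hypothetical names.

```python
# Illustrative TTLs in seconds, each picked from inside the ranges above.
TTL_SECONDS = {
    "static": 6 * 60 * 60,     # product catalog, configuration
    "semi_static": 15 * 60,    # user profiles, prices
    "dynamic": 60,             # inventory counts, session data
    "real_time": 2,            # stock prices, live scores
}


def ttl_for(data_class: str) -> int:
    """Look up the TTL for a data class, failing loudly on unknown names."""
    try:
        return TTL_SECONDS[data_class]
    except KeyError:
        raise ValueError(f"no TTL configured for {data_class!r}") from None
```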

Cache key design

Good cache keys are:

  • Unique: user:123:profile not user_profile
  • Versioned: user:123:profile:v2 for schema changes
  • Namespaced: prefix by service to avoid collisions in shared caches
  • Not too long: Redis keys are stored in memory
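A small helper can enforce these rules wherever keys are built. This is a sketch under assumed conventions: the segment order (service, entity, id, field, version) and the `cache_key` function are choices made for this example, not a standard.

```python
def cache_key(service: str, entity: str, entity_id: int, field: str, version: int = 1) -> str:
    """Build a key such as 'shop:user:123:profile:v2'."""
    parts = [service, entity, str(entity_id), field, f"v{version}"]
    for part in parts:
        # Reject segments that would make the key ambiguous to split or read.
        if not part or ":" in part or " " in part:
            raise ValueError(f"invalid key segment: {part!r}")
    return ":".join(parts)


key = cache_key("shop", "user", 123, "profile", version=2)
```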

FAQ

Q: Should you cache at the database level or the application level?

Both have their place. Database-level caching (query cache, buffer pool) is automatic and transparent. Application-level caching (Redis, Memcached) gives you control over what is cached, for how long, and with what invalidation strategy. For most applications, application-level caching is more effective because you can cache at the right granularity (a computed result, not just a raw query result) and invalidate precisely when data changes.

Q: How do you handle cache warming after a deployment?

A cold cache after deployment causes a stampede as all requests miss the cache simultaneously. Strategies: pre-warm the cache before switching traffic (run a script that populates common cache entries), use a blue-green deployment where the new version warms up while the old version handles traffic, or shift traffic to the new version gradually so the cache fills under real load without overwhelming the database. For critical caches, maintain a “warm standby” cache that is kept in sync with the primary.

Q: When should you NOT cache?

Skip the cache when data changes very frequently (every second), because caching then adds complexity without benefit. Skip it when data is user-specific and rarely repeated (personalized recommendations computed fresh each time), when the database query is already fast (under 5ms) and the network round trip to Redis would cost about as much, and when consistency is critical and you cannot tolerate any staleness (financial transactions, inventory counts during checkout).

Interview questions

Q1: You are building a product page that shows price and inventory. The price changes rarely, inventory changes frequently. How do you cache this?

Strong answer: Cache price and inventory separately with different TTLs. Price: cache-aside with a 1-hour TTL, plus explicit invalidation when the price changes. Inventory: either do not cache (query the database directly for accuracy during checkout) or cache with a very short TTL (30 seconds) and accept brief staleness for display purposes. Never use a cached inventory count for the actual purchase decision - always check the database with a SELECT ... FOR UPDATE at checkout time. This separates the display concern (can be slightly stale) from the transactional concern (must be accurate).

Q2: Your Redis cache is getting hammered by cache stampedes. Walk through how you would implement probabilistic early expiration.

Strong answer: Instead of relying only on a hard TTL, store the expiry time and the time the last recomputation took (delta) alongside the cached value. On each read, draw a random number in (0, 1] and refresh early if -delta * beta * ln(random()) >= remaining_ttl, where beta defaults to 1 and values above 1 make early refreshes more likely. Far from expiry the left-hand side almost never exceeds the remaining TTL, so almost no request refreshes; as the remaining TTL approaches zero, the probability rises toward 1. In effect, one or a few requests refresh the entry shortly before it expires while the rest keep reading the cached value. The key insight: you are trading a small amount of extra database load (some requests refresh early) for eliminating the stampede (requests no longer all miss at the same instant). This scheme is often called XFetch. Libraries such as dogpile.cache tackle the same problem differently, with a lock that lets one caller regenerate the value while the others wait or receive the stale one.
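A minimal sketch of the refresh decision, assuming the XFetch form of the rule, in which the random early-refresh window is scaled by the time a recomputation takes (`recompute_seconds`) and a tuning factor `beta`. The function name and the injectable `rand` parameter are hypothetical, added so the behavior can be checked deterministically.

```python
import math
import random
from typing import Callable


def should_refresh_early(
    remaining_ttl: float,
    recompute_seconds: float,
    beta: float = 1.0,
    rand: Callable[[], float] = random.random,
) -> bool:
    """Return True when this request should refresh the entry ahead of expiry."""
    if remaining_ttl <= 0:
        return True                       # already expired: must refresh
    u = 1.0 - rand()                      # in (0, 1], so log(u) is defined
    early_by = -recompute_seconds * beta * math.log(u)
    return early_by >= remaining_ttl


# With a 0.5s recompute time, an entry 60s from expiry is left alone,
# while an entry 0.5s from expiry is refreshed by roughly a third of reads.
far = sum(should_refresh_early(60.0, 0.5) for _ in range(10_000))
near = sum(should_refresh_early(0.5, 0.5) for _ in range(10_000))
```

A caller that gets `True` would recompute the value and reset the TTL, ideally in the background, while every other request keeps serving the cached copy.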

Q3: Explain the difference between cache invalidation and cache expiration and when you would use each.

Strong answer: Cache expiration is time-based: the entry automatically becomes invalid after a TTL. Simple to implement, no coordination needed, but data can be stale for up to TTL seconds. Cache invalidation is event-based: when the underlying data changes, you explicitly delete or update the cache entry. More accurate (data is fresh immediately after a write) but requires the write path to know what to invalidate, which can be complex. Use expiration when: brief staleness is acceptable, the data changes unpredictably, or you want a simple safety net. Use invalidation when: you need near-real-time consistency, you know exactly what changes when data is written, and you can afford the complexity. In practice, use both: explicit invalidation on writes plus a TTL as a fallback for cases where invalidation fails or is missed.