Cache Eviction Policies: What Gets Kicked Out When the Cache Is Full


Your Redis cache is configured with 4GB of memory. It fills up. Redis starts evicting entries to make room for new ones. Your cache hit rate drops from 95% to 60%. Database load spikes. Latency increases. You check what is being evicted: the most frequently accessed product pages. The eviction policy is set to allkeys-random. Redis is randomly evicting entries, including your hottest data.

Changing the policy to allkeys-lru fixes it. Cache hit rate goes back to 95%. The eviction policy is a small configuration change with a large impact.

Why eviction policies exist

A cache has finite memory. When it fills up and a new entry needs to be stored, something must be removed. The eviction policy determines which entry to remove.

The goal: evict entries that are least likely to be needed again, keeping the entries most likely to be requested in the future.

The challenge: you cannot know the future. Eviction policies use heuristics based on past access patterns to predict future access.

The main eviction policies

LRU (Least Recently Used)

Evict the entry that was accessed least recently. The assumption: if something has not been accessed in a while, it is less likely to be accessed soon.

LRU is the most commonly used policy and works well for most workloads. It naturally keeps “hot” data in the cache - frequently accessed entries are always recently used.

Implementation: A doubly linked list + hash map. The list is ordered by access time. On access, move the entry to the front. On eviction, remove from the back. O(1) for both operations.

Redis implementation: Redis uses an approximation of LRU (samples a random subset of keys and evicts the least recently used among them) rather than true LRU, for memory efficiency.

graph LR
subgraph lru["LRU Cache - 4 slots, adding E evicts A"]
  A["A
Accessed: 1min ago"] -->|"oldest"| B["B
Accessed: 30s ago"]
  B --> C["C
Accessed: 10s ago"]
  C -->|"newest"| D["D
Accessed: 2s ago"]
  E["New entry E"] -->|"evicts A"| A
end

style A fill:#FCEBEB,stroke:#A32D2D,color:#791F1F
style B fill:#FAEEDA,stroke:#854F0B,color:#633806
style C fill:#E1F5EE,stroke:#0F6E56,color:#085041
style D fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style E fill:#EEEDFE,stroke:#534AB7,color:#3C3489

LFU (Least Frequently Used)

Evict the entry that has been accessed the fewest times. The assumption: if something has been accessed rarely, it is less likely to be accessed in the future.

LFU is better than LRU for workloads with a stable set of hot keys. A key accessed 1,000 times will not be evicted just because it was not accessed in the last minute.

Problem: LFU has a “cache pollution” problem. An entry accessed many times in the past but not recently will stay in the cache indefinitely, even if it is no longer relevant. Solutions: decay the frequency count over time (time-decayed LFU), or use a sliding window.

Redis 4.0+ added LFU support with a time-decayed frequency counter.

FIFO (First In, First Out)

Evict the oldest entry, regardless of how often it has been accessed. Simple but rarely optimal - a frequently accessed entry that was added early will be evicted before a rarely accessed entry added recently.

Use case: When all entries have equal value and you just want to limit cache size. Rarely used in practice for general caching.

TTL-based expiration

Not strictly an eviction policy, but entries are removed when their time-to-live expires. Works alongside other eviction policies: entries expire on TTL, and if the cache fills up before TTLs expire, the eviction policy kicks in.

Random eviction

Evict a random entry. Surprisingly effective for some workloads (uniform access patterns) and terrible for others (skewed access patterns). Simple to implement. Redis supports allkeys-random and volatile-random.

MRU (Most Recently Used)

Evict the most recently used entry. Counterintuitive, but useful for specific access patterns: if you are scanning through a large dataset once (sequential scan), the most recently accessed entry is the least likely to be needed again. Prevents cache pollution from one-time scans.

graph TB
subgraph policies["Eviction Policy Comparison"]
  LRU2["LRU
Evict least recently used"]
  LFU2["LFU
Evict least frequently used"]
  FIFO2["FIFO
Evict oldest entry"]
  TTL2["TTL
Evict expired entries"]
  RAND["Random
Evict random entry"]
end

subgraph best_for["Best For"]
  B1["General purpose
Most workloads"]
  B2["Stable hot keys
Frequency matters more than recency"]
  B3["Equal-value entries
Simple use cases"]
  B4["Time-sensitive data
Sessions, tokens"]
  B5["Uniform access
Simple implementation"]
end

LRU2 --- B1
LFU2 --- B2
FIFO2 --- B3
TTL2 --- B4
RAND --- B5

style LRU2 fill:#E1F5EE,stroke:#0F6E56,color:#085041
style LFU2 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style FIFO2 fill:#F1EFE8,stroke:#888780,color:#444441
style TTL2 fill:#FAEEDA,stroke:#854F0B,color:#633806
style RAND fill:#F1EFE8,stroke:#888780,color:#444441

Redis eviction policies in detail

Redis has 8 eviction policies, split into two groups:

allkeys-* - Apply to all keys in the cache:

  • allkeys-lru - Evict least recently used key
  • allkeys-lfu - Evict least frequently used key
  • allkeys-random - Evict random key

volatile-* - Apply only to keys with a TTL set:

  • volatile-lru - Evict least recently used key with TTL
  • volatile-lfu - Evict least frequently used key with TTL
  • volatile-random - Evict random key with TTL
  • volatile-ttl - Evict key with shortest remaining TTL

noeviction - Return an error when memory is full. No eviction.

Choosing for Redis:

  • Pure cache (all keys are cache entries): allkeys-lru or allkeys-lfu
  • Mixed use (some keys are persistent, some are cache): volatile-lru (only evicts keys with TTL, preserving persistent keys)
  • Session store: volatile-ttl (evict sessions closest to expiry first)

Where it breaks or gets interesting

The scan problem with LRU

A batch job scans through 10 million product records once. Each product is accessed exactly once. LRU evicts the oldest entries - which are the hot product pages that users actually visit. The batch scan pollutes the cache.

Solutions: use a separate cache for batch operations, use MRU for the batch cache, or implement a “scan-resistant” LRU that uses a two-queue approach (new entries go to a probationary queue, only promoted to the main LRU queue after a second access).

LFU frequency decay

Without decay, LFU keeps entries that were popular months ago but are no longer relevant. Redis’s LFU implementation uses a logarithmic counter that decays over time. The lfu-decay-time setting controls how fast the counter decays. A value of 1 means the counter halves every minute of inactivity.

Cache size and hit rate relationship

The relationship between cache size and hit rate is not linear. For most workloads, it follows a power law: doubling the cache size from 1GB to 2GB might increase hit rate from 80% to 90%. Doubling again to 4GB might only increase it to 93%. The marginal benefit of more cache decreases as you capture more of the working set.

The “working set” is the set of data that is actively accessed. If your working set fits in the cache, hit rate is high. If it does not, you are constantly evicting data you need.

The cold start problem

A new cache instance has no data. Every request is a cache miss. The database gets hammered until the cache warms up. This is the cold start problem.

Mitigations: pre-warm the cache before switching traffic, use a persistent cache (Redis with RDB/AOF snapshots), or use a gradual traffic shift (canary deployment) so the cache warms up under partial load.

Real-world systems

CPU L1/L2/L3 caches - Hardware caches use LRU or pseudo-LRU (approximations for speed). The CPU automatically manages eviction. This is the original context where LRU was developed.

Redis - Configurable eviction policy. allkeys-lru is the most common choice for pure caches. Approximated LRU for memory efficiency.

Memcached - Uses a slab allocator with LRU per slab class. Each slab class handles a range of object sizes. LRU is applied within each slab class.

Browser cache - Uses a combination of TTL (Cache-Control headers) and LRU for disk space management. The browser evicts cached resources when disk space is needed.

CDN edge caches - TTL-based expiration with LRU for space management. Popular content stays cached; unpopular content expires or is evicted.

Operating system page cache - The OS caches disk pages in RAM using a variant of LRU (Linux uses a two-list LRU: active and inactive lists). When RAM is needed, inactive pages are evicted first.

How to apply it in practice

Monitoring cache effectiveness

Key metrics:

  • Hit rate - Percentage of requests served from cache. Target: 90%+ for most caches.
  • Eviction rate - How many entries are being evicted per second. High eviction rate means the cache is too small or the eviction policy is wrong.
  • Memory usage - How full is the cache? If consistently at 100%, consider increasing size or reviewing what is being cached.

Sizing your cache

Start with the working set size. If your hot data is 10GB, a 4GB cache will have a low hit rate. A 12GB cache will capture most of the working set.

Use the 80/20 rule as a starting point: 20% of your data accounts for 80% of your reads. Cache that 20%.

Monitor hit rate and eviction rate. If eviction rate is high and hit rate is low, increase cache size. If hit rate is high and eviction rate is low, you might be able to reduce cache size.

When to use LFU over LRU

Use LFU when:

  • Your hot keys are stable over time (the same products are always popular)
  • You have bursty access patterns where a key is accessed many times in a short window
  • You want to protect frequently accessed keys from being evicted by one-time scans

Use LRU when:

  • Access patterns change over time (trending content, seasonal products)
  • You want simplicity and predictability
  • You are unsure which to use (LRU is the safer default)

FAQ

Q: What happens when Redis runs out of memory with noeviction policy?

Redis returns an OOM command not allowed when used memory is greater than maxmemory error for write commands. Read commands still work. This is appropriate when Redis is used as a primary data store (not just a cache) and you cannot afford to lose data. For pure caches, allkeys-lru is better - it degrades gracefully by evicting old entries rather than returning errors.

Q: How do you choose between volatile-lru and allkeys-lru?

Use volatile-lru when Redis stores both persistent data (no TTL) and cache data (with TTL). The policy only evicts keys with TTL, protecting your persistent data. Use allkeys-lru when Redis is a pure cache and all keys are expendable. allkeys-lru is more aggressive - it can evict any key, which gives it more flexibility to free memory.

Q: Can you implement your own eviction policy?

In Redis, no - you choose from the built-in policies. In application-level caches (in-process caches like Caffeine for Java, or custom implementations), you can implement any policy. Caffeine supports W-TinyLFU, a sophisticated policy that combines LRU and LFU with a frequency sketch for better hit rates than either alone. For most use cases, the built-in policies are sufficient.

Interview questions

Q1: Your Redis cache has a 70% hit rate. You want to improve it. What do you investigate?

Strong answer: Start by understanding what is being missed. Sample cache misses and categorize them: are they misses for keys that were never cached (cold data), keys that were evicted (eviction pressure), or keys that expired (TTL too short)? Check the eviction rate - if it is high, the cache is too small or the eviction policy is wrong. Check the TTL distribution - if many keys expire before being accessed again, TTLs might be too short. Look at the access pattern - if 30% of your data accounts for 95% of reads, make sure that 30% fits in the cache. Consider switching from LRU to LFU if your hot keys are stable. Also check if there are cache-busting patterns in the application (generating unique cache keys for each request, for example).

Q2: You are building a news feed cache. Articles are popular for a few hours after publication, then rarely accessed. Which eviction policy would you choose?

Strong answer: LRU is the right choice here. Articles are popular right after publication (recently accessed) and become less popular over time (less recently accessed). LRU naturally keeps recent articles in the cache and evicts old ones. LFU would be worse: an article that was very popular yesterday would have a high frequency count and stay in the cache even after it is no longer being accessed, evicting newer articles that are currently popular. Also set TTLs on articles (24-48 hours) so they expire even if they are not evicted by LRU. Use volatile-lru if you have other data in Redis that should not be evicted.

Q3: Explain the two-queue (2Q) algorithm and why it is better than simple LRU for some workloads.

Strong answer: Simple LRU has a weakness: a one-time sequential scan evicts hot data. The 2Q algorithm addresses this with two queues: a small FIFO queue (A1) for new entries, and a larger LRU queue (Am) for entries that have been accessed more than once. New entries go into A1. If they are accessed again while in A1, they are promoted to Am. Entries in A1 are evicted FIFO (oldest first). Entries in Am are evicted LRU. The benefit: a one-time scan fills A1 and gets evicted FIFO without touching Am. Frequently accessed entries get promoted to Am and are protected. This is similar to how the Linux page cache works (active and inactive lists) and how some database buffer pools work. The tradeoff is slightly more complexity and the need to tune the relative sizes of A1 and Am.