Cache Penetration, Breakdown and Avalanche
Chapter 34: Cache Penetration, Hotspot Invalidation, and Cache Avalanche
The three classic cache failure patterns — penetration, hotspot invalidation (breakdown), and avalanche — are the most operationally significant Redis problems. Each has a distinct trigger, impact radius, and solution set. This chapter provides precise definitions, quantified impact analysis, and production-ready code for every pattern.
34.1 Cache Penetration
34.1.1 Definition and Root Causes
Cache penetration occurs when a request targets a key that does not exist in either the cache or the database. Every such request bypasses the cache and hits the database directly.
Normal flow:
request(user:1001) → Redis HIT → return cached value
Penetration flow:
request(user:-1) → Redis MISS → DB query → not found → no cache write
→ next request(user:-1) → Redis MISS → DB query → ... ∞
Trigger scenarios:
- Attackers or scrapers using random non-existent IDs (
user:-1,product:9999999999) - Old references to deleted records
- Frontend bugs passing invalid parameters (null, 0, empty string)
Impact: Each penetrating request hits the database. At scale this is equivalent to a load test against the DB, potentially causing connection pool exhaustion and cascading failures.
34.1.2 Solution 1: Cache Null Values
Store a sentinel value in Redis when the DB lookup returns nothing, with a short TTL:
import redis
import json
r = redis.Redis(decode_responses=True)
NULL_SENTINEL = '__NULL__'
def get_user(user_id: int):
key = f'user:{user_id}'
cached = r.get(key)
if cached is not None:
if cached == NULL_SENTINEL:
return None # Negative cache hit — return immediately
return json.loads(cached)
# Cache miss — query the database
user = db.query_user(user_id)
if user is None:
# Cache the negative result with a short TTL (5 minutes)
# Short TTL: limits memory usage and allows recovery if the record is later created
r.setex(key, 300, NULL_SENTINEL)
return None
r.setex(key, 3600, json.dumps(user))
return user
Limitations:
- Attackers using unique random IDs each time still bypass the cache once per ID, but prevent repeated DB hits for the same ID.
- Null entries consume Redis memory — keep TTL short and monitor memory growth.
- When a record is created, proactively delete the null sentinel to avoid stale negatives within the TTL window.
34.1.3 Solution 2: Bloom Filter
A Bloom filter is a space-efficient probabilistic structure that answers: "Is this element definitely not in the set, or possibly in the set?" It has no false negatives (zero miss rate) but a tunable false positive rate.
Mechanism: k hash functions map each element to k positions in an m-bit array and set those bits. A query returns "not present" if any queried bit is 0 (100% certain); "possibly present" if all queried bits are 1 (with false positive probability).
False positive rate formula (k hash functions, m bits, n elements):
FPR ≈ (1 - e^(-kn/m))^k
Optimal k for given m and n: k = (m/n) × ln(2)
Storage sizing for target FPR p and n elements:
m = -n × ln(p) / (ln(2))²
| n (elements) | p (FPR) | m (bits) | Memory |
|---|---|---|---|
| 1 million | 1% | 9,585,059 | ~1.1 MB |
| 10 million | 0.1% | 143,775,879 | ~17 MB |
| 100 million | 0.01% | 1,917,011,718 | ~228 MB |
Implementation with Redisson (Java):
// Initialize (idempotent — safe to call on restart)
RBloomFilter<Long> bloom = redisson.getBloomFilter("known_user_ids");
bloom.tryInit(10_000_000L, 0.001); // 10M elements, 0.1% FPR
// Warm up during startup
try (Stream<Long> ids = userRepository.streamAllIds()) {
ids.forEach(bloom::add);
}
log.info("Bloom filter initialized, count: {}", bloom.count());
// Request handling
public User getUser(long userId) {
// Bloom filter: "not present" is certain
if (!bloom.contains(userId)) {
return null; // short-circuit — never hits DB
}
// "Possibly present" — proceed with cache/DB lookup
String cached = redis.get("user:" + userId);
if (cached != null) return deserialize(cached);
User user = db.findById(userId);
if (user != null) {
redis.setex("user:" + userId, 3600, serialize(user));
}
return user;
}
// Synchronize on record creation
public User createUser(User user) {
User saved = userRepository.save(user);
bloom.add(saved.getId()); // Keep filter in sync
return saved;
}
Implementation with Redis Stack / RedisBloom (Python):
r = redis.Redis()
# Create filter
r.bf().create('known_users', 0.001, 10_000_000)
# Batch warm-up
user_ids = db.get_all_user_ids()
pipe = r.pipeline(transaction=False)
for i, uid in enumerate(user_ids):
pipe.bf().add('known_users', uid)
if i % 5000 == 0:
pipe.execute()
pipe = r.pipeline(transaction=False)
pipe.execute()
# Request handler
def get_user(user_id: int):
if not r.bf().exists('known_users', user_id):
return None # definite non-existence
return fetch_from_cache_or_db(user_id)
Operational note: Bloom filters cannot remove elements. For datasets with frequent deletions (e.g., soft-deleted records), use a Counting Bloom Filter or periodically rebuild the filter from a database snapshot.
34.2 Cache Hotspot Invalidation (Breakdown)
34.2.1 Definition
Cache hotspot invalidation (sometimes called "cache breakdown") occurs when a single high-traffic key expires at the exact moment a large number of concurrent requests are accessing it. All requests simultaneously experience a cache miss and race to query the database.
Timeline:
t=0: 100,000 concurrent requests for "product:SKU001:detail"
t=0: key expires (TTL reaches zero)
t=0+: 100,000 simultaneous DB queries for the same record
result: database connection pool exhausted, query timeouts cascade
Distinction from penetration:
- Penetration: key never existed in DB (invalid key)
- Breakdown: key existed and was cached, but expires under high concurrency
34.2.2 Solution 1: Mutex Lock
Serialize cache rebuilding so only one thread queries the database; all others wait and then read the freshly-populated cache:
import redis, time, json, threading
r = redis.Redis(decode_responses=True)
def get_product(sku_id: str) -> dict:
key = f'product:{sku_id}'
cached = r.get(key)
if cached:
return json.loads(cached)
lock_key = f'lock:{key}'
acquired = r.set(lock_key, 1, nx=True, ex=5) # 5-second lock TTL
if acquired:
try:
# Double-check: another thread may have rebuilt the cache while we acquired the lock
cached = r.get(key)
if cached:
return json.loads(cached)
product = db.query_product(sku_id)
if product:
r.setex(key, 3600, json.dumps(product))
return product
finally:
r.delete(lock_key)
else:
time.sleep(0.05)
return get_product(sku_id) # retry after brief wait
func getProduct(ctx context.Context, rdb *redis.Client, skuID string) (*Product, error) {
key := "product:" + skuID
lockKey := "lock:" + key
cached, err := rdb.Get(ctx, key).Bytes()
if err == nil {
var p Product
if err := json.Unmarshal(cached, &p); err == nil {
return &p, nil
}
}
ok, err := rdb.SetNX(ctx, lockKey, 1, 5*time.Second).Result()
if err != nil {
return nil, fmt.Errorf("lock error: %w", err)
}
if ok {
defer rdb.Del(ctx, lockKey)
// Double-check after acquiring lock
if cached, err = rdb.Get(ctx, key).Bytes(); err == nil {
var p Product
json.Unmarshal(cached, &p)
return &p, nil
}
product, err := db.QueryProduct(skuID)
if err != nil {
return nil, err
}
data, _ := json.Marshal(product)
rdb.SetEx(ctx, key, data, time.Hour)
return product, nil
}
time.Sleep(50 * time.Millisecond)
return getProduct(ctx, rdb, skuID)
}
Mutex lock drawbacks: all waiting threads are blocked until the lock holder completes the DB query and cache write. Under extreme concurrency, the wait queue can itself become a latency problem.
34.2.3 Solution 2: Logical Expiration
Remove the Redis TTL entirely. Embed an expiration timestamp inside the value. When a logically-expired entry is read, trigger an asynchronous background refresh and return the stale value immediately to the caller. This trades strong consistency for zero blocking.
import json, time, threading
from dataclasses import dataclass, asdict
from typing import Any, Callable, Optional
@dataclass
class LogicalEntry:
value: Any
expire_at: float # Unix timestamp
r = redis.Redis(decode_responses=True)
_refresh_in_progress: set = set()
_lock = threading.Lock()
def get_logical(
key: str,
loader: Callable[[], Any],
ttl: int = 3600
) -> Optional[Any]:
raw = r.get(key)
if raw is None:
# Cold start: load synchronously
data = loader()
entry = LogicalEntry(value=data, expire_at=time.time() + ttl)
r.set(key, json.dumps(asdict(entry))) # no Redis TTL
return data
entry = LogicalEntry(**json.loads(raw))
if entry.expire_at > time.time():
return entry.value # logically fresh
# Logically expired — trigger async refresh, return stale value immediately
with _lock:
already_refreshing = key in _refresh_in_progress
if not already_refreshing:
_refresh_in_progress.add(key)
if not already_refreshing:
def refresh():
try:
new_data = loader()
new_entry = LogicalEntry(value=new_data, expire_at=time.time() + ttl)
r.set(key, json.dumps(asdict(new_entry)))
finally:
with _lock:
_refresh_in_progress.discard(key)
threading.Thread(target=refresh, daemon=True).start()
return entry.value # return stale data, never blocks
# Usage
product = get_logical(
key=f'product:{sku_id}',
loader=lambda: db.query_product(sku_id),
ttl=3600
)
Comparison: Mutex vs. Logical Expiration:
| Dimension | Mutex Lock | Logical Expiration |
|---|---|---|
| Consistency | Strong (blocks until rebuilt) | Eventual (stale reads allowed) |
| Availability | Low during rebuild | Always available |
| Caller latency | May spike during lock contention | Consistently low |
| Complexity | Moderate | Higher (async refresh, process coordination) |
| Best for | Financial/inventory accuracy | Product listings, content feeds |
34.3 Cache Avalanche
34.3.1 Two Root Causes
Cache avalanche: a large portion of cached keys expire simultaneously (or Redis becomes unavailable), sending the full traffic load to the database.
Cause A — Synchronized expiration (common TTL set for all keys during a data refresh job):
# Anti-pattern: all products expire at the same time
for product in products:
r.setex(f'product:{product.id}', 3600, serialize(product))
# At t=1 hour, all 50,000 product keys expire simultaneously
Cause B — Redis unavailability (instance crash, network partition, planned maintenance without graceful failover).
34.3.2 Solution 1: Random TTL Jitter
Spread expiration times across a range to prevent synchronized mass expiration:
import random
def cache_product(product_id: int, data: dict, base_ttl: int = 3600) -> None:
# Add ±10% random jitter
jitter = random.randint(-360, 360)
real_ttl = base_ttl + jitter
r.setex(f'product:{product_id}', real_ttl, json.dumps(data))
// Java version using ThreadLocalRandom (thread-safe, no contention)
int jitter = ThreadLocalRandom.current().nextInt(-360, 361); // ±10%
int realTtl = baseTtl + jitter;
jedis.setex(key, realTtl, value);
Tier-based TTL strategy (differentiate by access frequency):
| Data tier | Base TTL | Jitter range | Notes |
|---|---|---|---|
| Hot (trending items) | 3600 s | ±5% | Tight jitter; high value |
| Normal | 1800 s | ±20% | Moderate spread |
| Cold (archival) | 600 s | ±30% | Wide spread; low DB impact |
34.3.3 Solution 2: Redis High Availability
Sentinel mode: 3 or 5 sentinel processes monitor the primary. On primary failure, sentinels elect a new primary within 30–60 seconds. Business experiences errors during the failover window.
Cluster mode: 16 384 slots distributed across N primaries. A single primary failure affects ~1/N of keys; all other slots remain fully operational. This is the preferred architecture for availability during partial failures.
Cross-datacenter active-active: two independent Redis deployments in separate datacenters with async replication. Either DC can serve reads and writes; conflict resolution requires application-level logic (e.g., last-write-wins or CRDT semantics).
34.3.4 Solution 3: Multi-Layer Cache
Request → L1 (in-process Caffeine) → L2 (Redis) → DB
L1 (in-process cache) has sub-millisecond latency and absorbs traffic when L2 is degraded:
@Configuration
public class CacheConfig {
@Bean
public Cache<String, Object> l1Cache() {
return Caffeine.newBuilder()
.maximumSize(10_000)
.expireAfterWrite(Duration.ofMinutes(5)) // short TTL: L2 is source of truth
.recordStats()
.build();
}
}
@Service
public class ProductService {
@Autowired private Cache<String, Object> l1Cache;
@Autowired private StringRedisTemplate redis;
@Autowired private ProductRepository repo;
public Product getProduct(String id) {
String key = "product:" + id;
// L1 lookup
Product p = (Product) l1Cache.getIfPresent(key);
if (p != null) return p;
// L2 lookup
String json = redis.opsForValue().get(key);
if (json != null) {
p = deserialize(json);
l1Cache.put(key, p);
return p;
}
// DB lookup
p = repo.findById(id).orElse(null);
if (p != null) {
redis.opsForValue().set(key, serialize(p), Duration.ofHours(1));
l1Cache.put(key, p);
}
return p;
}
// Invalidate both layers on write
public void updateProduct(Product p) {
repo.save(p);
redis.delete("product:" + p.getId());
l1Cache.invalidate("product:" + p.getId());
}
}
When Redis is down, L1 continues serving fresh-enough data (up to 5 minutes stale) for the majority of hot keys, reducing DB pressure by 80–95%.
34.3.5 Solution 4: Circuit Breaker and Rate Limiting
When Redis is unavailable, prevent unlimited fallback to the database:
from pybreaker import CircuitBreaker
import pybreaker
# Open after 5 consecutive failures; attempt recovery after 30 s
db_breaker = CircuitBreaker(fail_max=5, reset_timeout=30)
@db_breaker
def query_db(key: str):
return db.get(key)
def get_data(key: str):
# Try Redis first
try:
val = r.get(key)
if val:
return val
except redis.RedisError:
pass # Redis unavailable — fall through to DB
# Rate-limited DB fallback
try:
return query_db(key)
except pybreaker.CircuitBreakerError:
# Circuit is open — DB already overloaded
return get_degraded_fallback(key)
def get_degraded_fallback(key: str):
# Options: return stale data from L1, return a default, or raise 503
return l1_cache.get(key) or {'error': 'service_degraded'}
// Spring Boot + Resilience4j
@CircuitBreaker(name = "redis-cb", fallbackMethod = "fallbackToDb")
@RateLimiter(name = "redis-rl")
public String getFromRedis(String key) {
return redisTemplate.opsForValue().get(key);
}
@RateLimiter(name = "db-rl", fallbackMethod = "getDegraded")
public String fallbackToDb(String key, Exception ex) {
log.warn("Redis unavailable, falling back to DB for key: {}", key);
return dbRepository.findByKey(key);
}
public String getDegraded(String key, Exception ex) {
metrics.increment("cache.avalanche.degraded");
return ""; // or throw ServiceUnavailableException
}
Resilience4j configuration in application.yaml:
resilience4j:
circuitbreaker:
instances:
redis-cb:
slidingWindowSize: 20
failureRateThreshold: 50
waitDurationInOpenState: 30s
permittedNumberOfCallsInHalfOpenState: 3
ratelimiter:
instances:
db-rl:
limitForPeriod: 500
limitRefreshPeriod: 1s
timeoutDuration: 0
34.4 Side-by-Side Comparison
| Dimension | Penetration | Hotspot Invalidation | Avalanche |
|---|---|---|---|
| Trigger | Query for non-existent key | Hot key expires during peak traffic | Many keys expire simultaneously / Redis down |
| Affected keys | One (invalid) key | One hot key | Large number of keys |
| DB impact | Sustained trickle | Instantaneous spike | Massive instantaneous flood |
| Primary solution | Bloom filter | Logical expiration | Random TTL jitter + multi-layer cache |
| Safety net | Null value caching | Mutex lock | Circuit breaker + rate limiter + degradation |
34.5 Monitoring and Alerting
Key metrics to watch:
# Cache hit ratio (early warning for penetration or avalanche onset)
redis-cli INFO stats | grep -E 'keyspace_hits|keyspace_misses'
# hit_rate = hits / (hits + misses) — alert if below 80%
# Client connection count (spikes precede DB connection pool exhaustion)
redis-cli INFO clients | grep connected_clients
# Memory usage (null-value caching can cause memory bloat)
redis-cli INFO memory | grep used_memory_human
Prometheus alert rules:
groups:
- name: redis_cache_health
rules:
- alert: CacheMissRatioHigh
expr: >
rate(redis_keyspace_misses_total[5m]) /
(rate(redis_keyspace_hits_total[5m]) + rate(redis_keyspace_misses_total[5m]) + 0.001) > 0.3
for: 2m
labels:
severity: warning
annotations:
summary: "Redis cache miss ratio exceeds 30% — possible penetration or avalanche"
- alert: RedisConnectionSurge
expr: redis_connected_clients > 2000
for: 1m
labels:
severity: critical
annotations:
summary: "Redis connection count surge — possible avalanche in progress"
- alert: RedisMemoryUsageHigh
expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.85
for: 5m
labels:
severity: warning
annotations:
summary: "Redis memory usage above 85% — risk of eviction impacting hit ratio"