Cache Architecture Design
MySQL + Redis Cache Architecture
Caching is one of the most effective ways to reduce MySQL read pressure. A well-designed cache can reduce MySQL QPS by 90%+, while a poorly designed one causes cache stampede, data inconsistency, and other disasters. This chapter covers cache pattern selection, the three classic problems and their solutions, and data consistency guarantees.
1. Why Cache?
A memory access takes ~100ns, an NVMe SSD read ~100μs, and a MySQL query over the network ~1-10ms. Caching serves data at memory speed and avoids the database round-trip.
| Layer | Latency | Best For |
|---|---|---|
| In-process (Caffeine) | <1μs | Very hot data, small size, tolerates brief staleness |
| Redis distributed cache | 0.5–2ms | Hot data, cross-service sharing, atomic ops needed |
| MySQL replica | 5–20ms | Complex queries, near-realtime reads |
| MySQL primary | 5–50ms | Writes and strong-consistency reads |
2. Cache Read/Write Patterns
2.1 Cache-Aside (most common)
// Read: check cache → DB fallback → populate cache
// Write: update DB → DELETE cache key (not update)
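Why DELETE instead of UPDATE on write? Under concurrent writes, cache updates can land in a different order than the DB commits: writer A commits, writer B commits, B updates the cache, and then A's delayed cache update overwrites it, so the cache holds A's stale value while the DB holds B's. Deleting the key forces the next read to fetch fresh data from the DB.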
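A minimal sketch of the Cache-Aside read and write paths described above. The two dicts are stand-ins for Redis and MySQL so the example runs without a server, and the function names `get_user` and `update_user_name` are hypothetical.

```python
import json

# Assumption: plain dicts stand in for Redis (cache) and MySQL (db).
cache = {}
db = {1001: {"id": 1001, "name": "alice"}}


def get_user(user_id):
    """Read path: check cache, fall back to DB, then populate cache."""
    key = f"user:profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    row = db.get(user_id)
    if row is not None:
        # With a real Redis client this would be a SET with an expiry.
        cache[key] = json.dumps(row)
    return row


def update_user_name(user_id, name):
    """Write path: update the DB first, then delete (not update) the key."""
    db[user_id]["name"] = name
    cache.pop(f"user:profile:{user_id}", None)
```

After `update_user_name(1001, "bob")` the key is gone from the cache, so the next `get_user(1001)` reloads the fresh row from the DB.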
2.2 Write-Through
Every write updates both cache and DB synchronously (usually via a cache layer). Higher write latency, better consistency. Good for read-heavy, consistency-critical data.
2.3 Write-Behind (Write-Back)
Write only to cache, flush to DB asynchronously in batches. Highest write throughput but risk of data loss on cache failure. Good for counters and analytics.
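As a rough illustration of Write-Behind for counters, the sketch below buffers increments in memory and hands them to the database in batches. The class name `WriteBehindCounter`, the `flush_fn` callback, and the count-based flush trigger are assumptions made for this example; production setups usually also flush on a timer.

```python
from collections import Counter


class WriteBehindCounter:
    """Buffers increments and flushes them to a backing store in batches."""

    def __init__(self, flush_fn, batch_size=100):
        self._pending = Counter()
        self._flush_fn = flush_fn      # hypothetical: writes a {key: delta} dict to the DB
        self._batch_size = batch_size
        self._ops = 0

    def incr(self, key, amount=1):
        self._pending[key] += amount
        self._ops += 1
        if self._ops >= self._batch_size:
            self.flush()

    def flush(self):
        if not self._pending:
            return
        # Write first, clear afterwards: if flush_fn raises, the buffered
        # deltas stay in memory instead of being dropped.
        self._flush_fn(dict(self._pending))
        self._pending.clear()
        self._ops = 0
```

Anything still sitting in `_pending` is lost if the process dies before the next flush, which is the data-loss risk mentioned above.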
3. Three Classic Cache Problems
🔴 Cache Stampede (Breakdown)
What: A hot key expires, and thousands of concurrent requests hit the DB simultaneously.
Solutions:
- Distributed mutex lock: Only one request queries DB; others wait or return stale value
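- Logical expiry: The key has no Redis TTL; the expiry timestamp is stored inside the value. When a reader sees a soft-expired value, one request refreshes it asynchronously while all others keep getting the stale value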
- No expiry + explicit invalidation: Hot data lives forever, deleted explicitly on write
🟠 Cache Avalanche
What: Large numbers of keys expire simultaneously, or Redis cluster goes down, flooding the DB.
Solutions:
- TTL jitter (primary fix): TTL = base + random(0, jitter) to spread expiry times
- Cache warmup: Pre-load hot data before peak events
- Circuit breaker: Return defaults when DB pressure is too high
- Redis HA: Sentinel or Cluster to prevent single-point failure
- Multi-tier cache: Local in-process cache as fallback when Redis is down
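The TTL jitter formula from the list above is small enough to show directly. The helper name `ttl_with_jitter` and the example values (one hour base, five minutes of jitter) are illustrative assumptions.

```python
import random


def ttl_with_jitter(base_seconds, jitter_seconds, rng=random):
    """Return base + a random offset in [0, jitter] so keys do not expire together."""
    return base_seconds + rng.randint(0, jitter_seconds)


# Example: keys written in the same batch get expiry times spread over 5 minutes.
ttls = [ttl_with_jitter(3600, 300) for _ in range(5)]
```

The resulting value would be passed as the expiry when writing each key.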
🔵 Cache Penetration
What: Queries for non-existent data always miss cache and hit DB (since DB also has nothing). Malicious actors can exploit this.
Solutions:
- Cache null values: Store a short-TTL (1-5 min) null marker when DB returns empty
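- Bloom filter: BF.ADD / BF.EXISTS, provided by the RedisBloom module (loadable on Redis 4.0+). Add every existing key to the filter and reject lookups the filter reports as absent before they reach the DB. No false negatives; the false-positive rate is configurable (~1% is a common target).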
4. Cache-Database Consistency
Recommended pattern — write DB first, then delete cache:
- Update MySQL
- Delete (not update) the Redis key
- Optional: delete the key again after a short delay (e.g. 500ms) to evict a stale value that a concurrent reader may have backfilled after the first delete
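The three steps above can be sketched as follows, with a `threading.Timer` scheduling the second delete. The dicts stand in for MySQL and Redis, and `update_with_double_delete` is a hypothetical helper; a real service would more likely hand the second delete to a delay queue.

```python
import threading

cache = {"user:profile:1001": "old"}   # stand-in for Redis
db = {1001: "old"}                     # stand-in for MySQL


def update_with_double_delete(user_id, value, delay_seconds=0.5):
    key = f"user:profile:{user_id}"
    db[user_id] = value                # 1. update the database
    cache.pop(key, None)               # 2. delete the cache key
    # 3. delete again later, in case a reader that fetched the old row
    #    wrote it back into the cache after step 2.
    timer = threading.Timer(delay_seconds, cache.pop, args=(key, None))
    timer.daemon = True
    timer.start()
    return timer
```

The delay is a tuning assumption: it should be longer than a typical read plus cache backfill, and longer than replica lag if reads go to a replica.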
For high-consistency requirements, use CDC-driven cache invalidation: Canal or Debezium reads the MySQL binlog and publishes change events to a message queue, and a cache invalidation service consumes those events and deletes the matching Redis keys. This decouples cache logic from business code.
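The consumer side of that pipeline might look roughly like the sketch below. The event shape (`{"table": ..., "pk": ...}`) is a simplified, hypothetical format rather than the actual Canal or Debezium payload, a `queue.Queue` stands in for the message queue, and the table-to-key mapping is an assumption.

```python
import queue

# Hypothetical mapping from table name to the cache key pattern it affects.
KEY_PATTERNS = {
    "users": "user:profile:{id}",
    "orders": "order:status:{id}",
}

cache = {                               # stand-in for Redis
    "user:profile:1001": "alice",
    "order:status:ORD20240101": "PAID",
}


def invalidate_from_event(event):
    """Delete the cache key affected by one change event; return the key or None."""
    pattern = KEY_PATTERNS.get(event["table"])
    if pattern is None:
        return None                     # table is not cached: ignore
    key = pattern.format(id=event["pk"])
    cache.pop(key, None)
    return key


def drain(events):
    """Consume all currently queued events and return the keys that were deleted."""
    deleted = []
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            break
        key = invalidate_from_event(event)
        if key is not None:
            deleted.append(key)
    return deleted
```

Because deleting a key is idempotent, replaying or duplicating an event is harmless in this design.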
5. Multi-tier Cache Architecture
For extreme performance: in-process cache (Caffeine, <1μs) → Redis (1ms) → MySQL (10ms). Each miss backfills upper layers. Handle invalidation carefully — in-process caches are per-node, so updates must propagate via Redis pub/sub or short TTLs.
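A minimal sketch of the tiered lookup and backfill, assuming dicts as stand-ins for the in-process cache, Redis, and MySQL. The `trace` list exists only to show which tier answered each call.

```python
local_cache = {}                                   # per-node, in-process tier
redis_cache = {}                                   # stand-in for Redis
db = {"product:detail:SKU2024001": "widget"}       # stand-in for MySQL
trace = []                                         # records which tier served each read


def get(key):
    if key in local_cache:
        trace.append("local")
        return local_cache[key]
    if key in redis_cache:
        trace.append("redis")
        local_cache[key] = redis_cache[key]        # backfill the upper tier
        return local_cache[key]
    trace.append("mysql")
    value = db.get(key)
    if value is not None:
        redis_cache[key] = value                   # backfill both tiers
        local_cache[key] = value
    return value


def invalidate(key):
    redis_cache.pop(key, None)
    # Only clears this node's copy; other nodes still need a pub/sub
    # notification or a short local TTL to drop theirs.
    local_cache.pop(key, None)
```

The first read of a key is served by MySQL, repeat reads on the same node by the local tier, and reads on a node with a cold local cache by Redis.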
6. Redis Key Naming
-- Pattern: business:object:id[:field]
user:profile:1001
product:detail:SKU2024001
order:status:ORD20240101
session:token:abc123
-- Avoid: plain IDs without prefixes, underscores (use colons)
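One way to keep keys consistent with this pattern is to build them through a single helper. `make_key` below is a hypothetical function, and rejecting parts that contain a colon is an assumption of this sketch rather than a Redis rule.

```python
def make_key(business, obj, identifier, field=None):
    """Build a key of the form business:object:id[:field]."""
    parts = [business, obj, str(identifier)]
    if field is not None:
        parts.append(field)
    for part in parts:
        # A colon inside a part would make the segments ambiguous.
        if not part or ":" in part:
            raise ValueError(f"invalid key part: {part!r}")
    return ":".join(parts)


profile_key = make_key("user", "profile", 1001)
```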
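Big Key warning: as a rule of thumb, a String >10KB or a Hash/List/Set with >5000 elements counts as a big key. Redis executes commands on a single thread, so a slow operation on a big key blocks every other client while it runs. Split big keys into sharded sub-keys or compress the values.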
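Splitting into sharded sub-keys can be sketched by hashing each member to a shard suffix. The `:shard:N` suffix, the shard count of 16, and the helper names are assumptions for illustration.

```python
import zlib
from collections import Counter


def shard_key(base_key, member, num_shards=16):
    """Map a member of a large collection to one of num_shards sub-keys."""
    shard = zlib.crc32(member.encode("utf-8")) % num_shards
    return f"{base_key}:shard:{shard}"


def shard_sizes(base_key, members, num_shards=16):
    """Count how many members land in each sub-key."""
    return Counter(shard_key(base_key, m, num_shards) for m in members)
```

Reads and writes for a given member always compute the same sub-key, so each individual collection stays well below the big-key threshold as long as the shard count is chosen for the expected total size.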