Chapter 1

Redis Is More Than a Cache

1.1 Origin: A Side Project That Changed Infrastructure

In 2009, Salvatore Sanfilippo (known online as antirez) was building a real-time web analytics tool called LLOOGG. The system needed to maintain per-user access logs as bounded lists: append new entries, evict old ones when the list exceeded a threshold. He tried MySQL. He tried PostgreSQL. Both proved unacceptably slow, because every append and trim on a disk-backed relational table paid for disk I/O under a write-heavy workload.

His solution: write a memory-resident data structure server from scratch. The first commit was a few hundred lines of C, supporting only lists and sets. He posted it to Hacker News. By the end of that night, someone had submitted a Python client patch.

Timeline of Redis stewardship:

2009-04-10  First public release (v0.001), Hacker News debut
2010-03-15  VMware hires antirez for full-time development
2013-05-17  Pivotal takes over sponsorship (VMware spin-off)
2015-07-15  antirez moves from Pivotal to Redis Labs (founded in 2011 as Garantia Data)
2020-06-30  antirez steps down as Redis maintainer, returns to personal projects
2021-08-11  Redis Labs rebrands to Redis
2024-03-20  Redis Inc. changes license from BSD-3 to RSALv2 + SSPLv1
2024-03-28  Linux Foundation announces Valkey fork
2024-04-xx  AWS, Google, Alibaba Cloud, Tencent Cloud back Valkey

The arc is worth internalizing: Redis was never roadmapped by a product team. It was a programmer scratching his own itch. That origin explains both its strengths (pragmatic, tight, fast) and its historical weaknesses (limited multi-tenancy, asynchronous replication by default).

1.2 Six Core Use Cases

1.2.1 Caching

The canonical use case, though easily oversimplified. Redis as a cache isn't valuable because it is "fast" — it is valuable because it absorbs read pressure that would otherwise hit a slower, more expensive data store.

The standard cache-aside pattern:

import json
from typing import Optional

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def get_user(user_id: int) -> Optional[dict]:
    cache_key = f"user:{user_id}"

    raw = r.get(cache_key)
    if raw is not None:
        return json.loads(raw)

    # Cache miss: hit the database (db is the application's own database handle)
    user = db.query_one("SELECT * FROM users WHERE id = %s", [user_id])
    if user is None:
        return None  # Missing rows are not cached here; see cache penetration below

    # Write back with TTL
    r.setex(cache_key, 3600, json.dumps(user, default=str))
    return user

Three failure modes deserve their own architecture decisions — they are not just edge cases:

Cache penetration: queries for keys that will never exist (e.g., invalid user IDs). Defense: Bloom filter or negative-cache sentinel with short TTL.
Cache avalanche: mass TTL expiry at the same moment after a restart. Defense: randomized TTL jitter (base_ttl + random.randint(0, 600)).
Cache stampede (thundering herd): many requests simultaneously miss for the same key. Defense: mutex lock on first miss, or probabilistic early refresh (PER algorithm).
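The three defenses can be combined in a single read path. What follows is a minimal sketch rather than a production implementation: `client` is assumed to be a `redis.Redis` instance created with `decode_responses=True`, `loader` is a hypothetical callable that queries the primary store, and the `MISSING` sentinel and the `lock:` key prefix are illustrative names.

```python
import json
import random
import time

MISSING = "__missing__"  # hypothetical sentinel stored for keys that do not exist


def jittered_ttl(base_ttl: int, spread: int = 600) -> int:
    """Avalanche defense: spread expiries over [base_ttl, base_ttl + spread]."""
    return base_ttl + random.randint(0, spread)


def get_cached(client, key, loader, base_ttl=3600, lock_ttl_ms=5000, retries=50):
    """Cache-aside read with negative caching, TTL jitter, and a rebuild mutex."""
    for _ in range(retries):
        raw = client.get(key)
        if raw == MISSING:
            return None              # penetration defense: known-missing key
        if raw is not None:
            return json.loads(raw)   # ordinary cache hit

        # Stampede defense: only the caller that wins this lock rebuilds the entry.
        if client.set(f"lock:{key}", "1", nx=True, px=lock_ttl_ms):
            try:
                value = loader()
                if value is None:
                    client.set(key, MISSING, ex=60)  # short TTL for negatives
                else:
                    client.set(key, json.dumps(value, default=str),
                               ex=jittered_ttl(base_ttl))
                return value
            finally:
                # Short-lived rebuild lock; a stricter lock would verify
                # ownership before deleting (see section 1.2.3).
                client.delete(f"lock:{key}")

        time.sleep(0.05)  # another caller is rebuilding; poll the cache again

    return loader()  # give up waiting and read the primary store directly
```

The polling loop trades a little latency for simplicity. A Bloom filter in front of the cache, or probabilistic early refresh, would replace the sentinel and the mutex respectively.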

1.2.2 Leaderboards and Counters

The Sorted Set (ZSet) is arguably Redis's most elegant data structure for this use case. Internally it is a dual structure: a skip list for range queries and a hash table for O(1) score lookup by member. Both are maintained in sync on every write.

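# Ingest scores (NX adds new members only and never updates an existing score;
# use GT instead to keep each player's best score, Redis 6.2+)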
ZADD leaderboard:game:2024 NX 9850  "player:1001"
ZADD leaderboard:game:2024 NX 12400 "player:2007"
ZADD leaderboard:game:2024 NX 11200 "player:3015"

# Top 10, highest score first
ZREVRANGE leaderboard:game:2024 0 9 WITHSCORES

# Player rank (0-indexed; ZREVRANK for descending)
ZREVRANK leaderboard:game:2024 "player:1001"
# → 2 (third place)

# Score for one player
ZSCORE leaderboard:game:2024 "player:2007"
# → "12400"

# Players in 3rd through 7th place (0-indexed ranks 2 to 6)
ZREVRANGE leaderboard:game:2024 2 6 WITHSCORES

Atomic counters with rate limiting:

# Atomic increment — safe under any concurrency
INCR page:views:home
INCRBY article:42:likes 5

# Rate limiting: max 100 requests per minute per user
SET rate:api:user:1001 0 NX EX 60    # Initialize if not exists, expire in 60s
INCR rate:api:user:1001              # Increment
# In application: if value > 100, reject
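The application-side check can be sketched in Python as follows. This is a fixed-window limiter under stated assumptions: `client` is assumed to be a `redis.Redis` instance, and `allow_request` is a hypothetical helper name. The key layout mirrors the commands above.

```python
def allow_request(client, user_id: int, limit: int = 100, window_s: int = 60) -> bool:
    """Fixed-window limiter: True while the caller is within `limit` per window."""
    key = f"rate:api:user:{user_id}"

    # Start the window only if no counter exists yet (SET ... NX EX).
    client.set(key, 0, nx=True, ex=window_s)
    count = client.incr(key)

    # If the key expired between SET and INCR, INCR recreated it without a TTL.
    # Restore the expiry so the counter cannot live forever.
    if client.ttl(key) == -1:
        client.expire(key, window_s)

    return count <= limit
```

A fixed window allows up to twice the limit across a window boundary. Where that matters, a sliding window built on a sorted set is the usual alternative.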

1.2.3 Distributed Locking

The naive SETNX key value + EXPIRE key 30 pattern has a race condition: if the process dies between SETNX and EXPIRE, the lock never expires. The atomic single-command form eliminates this:

SET lock:resource:checkout-42 "owner:uuid-abc123" NX PX 30000
# NX  = only set if key does not exist
# PX  = expiry in milliseconds
# Returns: "OK" on success, nil on failure (lock taken)

Lock release must be owner-verified and atomic. A plain DEL without ownership check is wrong — you might delete another process's lock. The canonical approach uses a Lua script (Lua execution is atomic in Redis):

-- release_lock.lua
-- KEYS[1] = lock key
-- ARGV[1] = expected owner value
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end

RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
else
    return 0
end
"""

release_lock = r.register_script(RELEASE_SCRIPT)
released = release_lock(keys=['lock:resource:checkout-42'], args=['owner:uuid-abc123'])
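Acquisition and release are easier to keep paired when wrapped together. The sketch below assumes `client` is a `redis.Redis` instance; `redis_lock` is a hypothetical helper, and a random UUID stands in for the owner value so that each acquisition is distinguishable.

```python
import uuid
from contextlib import contextmanager

RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
else
    return 0
end
"""


@contextmanager
def redis_lock(client, name: str, ttl_ms: int = 30000):
    """Yield True if the lock was acquired, False otherwise."""
    token = str(uuid.uuid4())  # unique owner value for this acquisition
    acquired = bool(client.set(name, token, nx=True, px=ttl_ms))
    try:
        yield acquired
    finally:
        if acquired:
            # Owner-verified release: deletes the key only if it still holds our token.
            client.eval(RELEASE_SCRIPT, 1, name, token)


# Usage (assuming r is a connected redis.Redis client):
#
# with redis_lock(r, "lock:resource:checkout-42") as ok:
#     if ok:
#         ...  # critical section; must finish before ttl_ms elapses
```

If the critical section can outlive the TTL, the lock expires underneath it. The script then refuses to delete a lock that another process has since acquired, which is the safe outcome.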

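Redlock: The multi-node algorithm (acquire the lock on a majority, N/2+1, of N independent Redis nodes) is controversial. Martin Kleppmann's 2016 critique argues that its reliance on timing assumptions (bounded clock drift, process pauses, network delay) makes it unsafe when correctness depends on the lock, and recommends fencing tokens for that case. For most production systems, where the lock is an efficiency optimization, a single Redis node lock is sufficient; design your application to tolerate occasional lock failures gracefully.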

1.2.4 Message Queues

Redis Streams (introduced in 5.0) is the most operationally complete option:

# Producer
XADD orders * order_id 10086 amount 299.00 user_id 1001 status pending

# Create consumer group (read from the beginning with 0, or only new entries with $)
XGROUP CREATE orders order-processors $ MKSTREAM

# Consumer reads up to 10 messages, blocks 2s if queue empty
XREADGROUP GROUP order-processors worker-1 COUNT 10 BLOCK 2000 STREAMS orders >

# Acknowledge processed message
XACK orders order-processors "1703123456789-0"

# Check pending (unacknowledged) messages
XPENDING orders order-processors - + 10

# Claim a stale pending message after 60s (recovery)
XCLAIM orders order-processors worker-2 60000 "1703123456789-0"
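A consumer built on these commands reads a batch, processes each entry, and acknowledges only what succeeded. The sketch below assumes `client` is a `redis.Redis` instance using the default RESP2 reply format, where `xreadgroup` returns a list of `[stream, [(id, fields), ...]]` pairs. `consume_batch` and `handler` are hypothetical names.

```python
def consume_batch(client, handler, stream="orders", group="order-processors",
                  consumer="worker-1", count=10, block_ms=2000) -> int:
    """Read one batch for this consumer; return how many entries were acknowledged."""
    # ">" asks for entries never delivered to any consumer in the group.
    response = client.xreadgroup(group, consumer, {stream: ">"},
                                 count=count, block=block_ms)
    acked = 0
    for _stream_name, messages in response or []:
        for message_id, fields in messages:
            try:
                handler(message_id, fields)
            except Exception:
                # Leave the entry unacknowledged; it stays in the pending list,
                # where XPENDING and XCLAIM can recover it later.
                continue
            client.xack(stream, group, message_id)
            acked += 1
    return acked
```

Acknowledging after the handler returns gives at-least-once delivery, so handlers should be idempotent.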

List-based queues: simpler, no consumer groups, suitable for basic job queues:

# Producer
LPUSH job-queue '{"type":"send_email","to":"user@example.com","template":"welcome"}'

# Consumer (blocking pop, 0 = wait indefinitely)
BRPOP job-queue 0

# Reliable queue pattern: BRPOPLPUSH atomically moves the job to a processing list
# (30 = block timeout in seconds; deprecated since Redis 6.2 in favor of BLMOVE)
BRPOPLPUSH job-queue job-queue:processing 30

Pub/Sub: broadcast only, no persistence, no delivery guarantee. Use it for real-time notifications where missed messages are acceptable (e.g., live dashboard updates).

1.2.5 Session Storage

Stateless service architectures require centralized session storage. Redis is the de facto standard:

# Flask-Session configuration
from flask import Flask
from flask_session import Session
import redis

app = Flask(__name__)
app.config.update(
    SESSION_TYPE='redis',
    SESSION_REDIS=redis.StrictRedis(host='redis', port=6379),
    SESSION_USE_SIGNER=True,           # HMAC-sign session ID
    PERMANENT_SESSION_LIFETIME=7200,   # 2 hours in seconds
    SESSION_KEY_PREFIX='sess:',
)
Session(app)

Two session TTL strategies:

Absolute expiry: session expires N seconds after creation, regardless of activity. Simple, but active users are logged out mid-session when the deadline arrives.
Sliding expiry: TTL resets on every request. Implement with EXPIRE session:{token} 7200 on each authenticated request.
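The two strategies differ only in whether the TTL is ever refreshed. A minimal sketch, assuming `client` is a `redis.Redis` instance and that sessions are stored as strings under `session:{token}`. `create_session` and `touch_session` are hypothetical helper names.

```python
def create_session(client, token: str, payload: str, ttl_s: int = 7200) -> None:
    """Absolute expiry: the TTL is set once, at creation."""
    client.set(f"session:{token}", payload, ex=ttl_s)


def touch_session(client, token: str, ttl_s: int = 7200) -> bool:
    """Sliding expiry: call on every authenticated request to reset the TTL.

    Returns False if the session no longer exists (already expired or deleted).
    """
    return bool(client.expire(f"session:{token}", ttl_s))
```

Calling only `create_session` yields absolute expiry. Adding `touch_session` to the request path turns it into sliding expiry.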

1.2.6 Real-Time Analytics

HyperLogLog for cardinality estimation (UV counting). Each key uses at most 12KB of memory regardless of the number of elements, with a standard error of about 0.81%:

PFADD uv:2024-01-15 "uid-a1b2" "uid-c3d4" "uid-e5f6"
PFCOUNT uv:2024-01-15           # Estimated unique visitors

# Merge daily HLLs into weekly
PFMERGE uv:2024-W03 uv:2024-01-15 uv:2024-01-16 uv:2024-01-17
PFCOUNT uv:2024-W03             # Weekly UV estimate

Bitmaps for boolean state per user (e.g., daily check-ins):

# User 1001 checked in on January 15 (offset 14; offsets are 0-indexed)
SETBIT checkin:1001:2024-01 14 1

# Count total check-in days this month
BITCOUNT checkin:1001:2024-01

# Find first day user did NOT check in
BITPOS checkin:1001:2024-01 0

# Bitwise AND across a user cohort: on which days did every user check in?
BITOP AND result:all_checkin checkin:1001:2024-01 checkin:1002:2024-01
BITCOUNT result:all_checkin
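The offset arithmetic is the part most easily gotten wrong, so it helps to derive it from a date rather than type it by hand. A small sketch, assuming `client` is a `redis.Redis` instance and the `checkin:{user}:{YYYY-MM}` key layout used above. The helper names are hypothetical.

```python
from datetime import date


def checkin_key(user_id: int, day: date) -> str:
    """One bitmap per user per month, e.g. checkin:1001:2024-01."""
    return f"checkin:{user_id}:{day:%Y-%m}"


def check_in(client, user_id: int, day: date) -> None:
    # Offset 0 is the 1st of the month, so the 15th maps to offset 14.
    client.setbit(checkin_key(user_id, day), day.day - 1, 1)


def days_checked_in(client, user_id: int, month: date) -> int:
    """Number of set bits, i.e. distinct check-in days in that month."""
    return client.bitcount(checkin_key(user_id, month))
```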

1.3 Why Single-Threaded Outperforms Multi-Threaded

1.3.1 Clarifying the "Single-Threaded" Claim

The single-threaded label applies specifically to command processing. Redis has long used background threads for slow housekeeping, and Redis 6.0 added optional I/O threads:

Thread	Responsibility
Main thread	Event loop, command parsing, command execution
`bio_close_file`	Asynchronous file close (post-AOF-rewrite)
`bio_aof_fsync`	AOF flush to disk
`bio_lazy_free`	Lazy deletion (UNLINK, FLUSHDB ASYNC)
`io_threads[0..N]`	Network socket writes, plus reads and protocol parsing when `io-threads-do-reads` is enabled (Redis 6.0+, off by default)

The command execution path remains single-threaded. This is a deliberate architectural choice, not a limitation.

1.3.2 The Economics of Lock Contention

Consider a multi-threaded cache server where each command takes ~1μs to execute. If 8 threads contend for a shared hash table lock:

Thread A holds lock, executes in 1μs
Threads B–H spin-wait: each wastes CPU cycles
On lock release, one thread acquires it; OS scheduler overhead ≈ 1–10μs
Cache line invalidation (the lock's memory location bounces between CPU caches): ~100–300ns per transfer

At 100,000 QPS with 8 threads, lock contention overhead alone can consume a substantial share of CPU time; 20–40% is an illustrative range, not a measurement. The single-threaded model avoids this cost entirely on the command path.

Measured throughput comparison (redis-benchmark, single node, 1 Gbps NIC, pipeline=1):

Command	QPS
PING (inline)	120,000
GET	108,000
SET	104,000
INCR	106,000
LPUSH	100,000
HSET	95,000

With pipelining (16 commands per round trip):

Command	QPS
GET	1,100,000
SET	1,050,000

The bottleneck at pipeline=1 is network round-trip time, not CPU. The single-threaded model is not the limiting factor.
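Because each round trip carries a fixed cost, batching commands reduces the number of trips rather than the work per command. A minimal sketch of client-side pipelining, assuming `client` is a `redis.Redis` instance. `fetch_many` and `round_trips` are hypothetical helpers.

```python
def round_trips(n_commands: int, pipeline_depth: int) -> int:
    """Network round trips needed when sending `pipeline_depth` commands per trip."""
    return -(-n_commands // pipeline_depth)  # ceiling division


def fetch_many(client, keys, batch_size: int = 16):
    """GET many keys using one round trip per batch instead of one per key."""
    values = []
    for start in range(0, len(keys), batch_size):
        # transaction=False: plain pipelining, no MULTI/EXEC wrapper.
        pipe = client.pipeline(transaction=False)
        for key in keys[start:start + batch_size]:
            pipe.get(key)
        values.extend(pipe.execute())  # replies come back in command order
    return values
```

With 16 commands per trip, 1,000 GETs need `round_trips(1000, 16)`, that is 63 trips, instead of 1,000.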

1.3.3 CPU Cache Locality

A single thread running on one CPU core keeps the hottest part of its working set (the main dict, active key data) in that core's L1/L2 cache (typically 256KB–4MB per core). Multi-threaded access requires cache coherence protocols (MESI or similar). When Thread B reads data that Thread A just modified:

Thread A's core sends a cache line invalidation broadcast
Thread B's core fetches the updated line from LLC or RAM
Round trip: ~40ns (LLC hit) to ~100ns (RAM)

For Redis-scale operations (1–5μs each), a single ~100ns cache miss represents 2–10% of total command latency. Single-threaded execution avoids these coherence misses on the command path, which keeps per-command latency more predictable.

1.3.4 The Real Bottleneck

Single-threaded command processing does have limits:

O(N) commands on large collections: KEYS *, SMEMBERS on 10M elements, SORT — all block the event loop. Mitigation: use SCAN, SSCAN, avoid SORT on large sets.
Large value serialization: a 5MB HGETALL response serializes on the main thread. Keep values small.
Single-core ceiling: ~1–2M simple commands/sec on modern hardware. Scale horizontally with Redis Cluster.
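The SCAN-based mitigation can be sketched as an incremental cleanup that never issues a single O(N) command. This assumes `client` is a `redis.Redis` instance; `unlink_matching` is a hypothetical helper. It combines `scan_iter` with `UNLINK`, so memory is reclaimed by the lazy-free thread.

```python
def unlink_matching(client, pattern: str, batch: int = 500) -> int:
    """Remove keys matching `pattern` in small batches; return how many were removed."""
    removed = 0
    chunk = []
    # scan_iter walks the keyspace with SCAN cursors, a bounded step per call,
    # instead of blocking the event loop the way KEYS does.
    for key in client.scan_iter(match=pattern, count=batch):
        chunk.append(key)
        if len(chunk) >= batch:
            removed += client.unlink(*chunk)
            chunk = []
    if chunk:
        removed += client.unlink(*chunk)
    return removed
```

`count` is a hint to the server rather than a guarantee, so the helper enforces its own batch size before each `UNLINK`.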

Rule of thumb: any command taking more than 1ms is a latency bomb. Use redis-cli --latency-history and SLOWLOG GET 10 to catch offenders.
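The same check can be automated. The sketch below assumes `client` is a `redis.Redis` instance whose `slowlog_get` returns dictionaries with `duration` (in microseconds) and `command` fields. `slow_commands` is a hypothetical helper.

```python
def slow_commands(client, threshold_us: int = 1000, limit: int = 10):
    """Return (duration_us, command) pairs at or above the threshold, slowest first."""
    offenders = [
        (entry["duration"], entry["command"])
        for entry in client.slowlog_get(limit)
        if entry["duration"] >= threshold_us
    ]
    return sorted(offenders, key=lambda pair: pair[0], reverse=True)
```

The server only records commands slower than `slowlog-log-slower-than` (10,000 microseconds by default), so that setting may need lowering before a 1ms threshold catches anything.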

1.4 The Event-Driven Architecture: Reactor Pattern

Redis's networking layer is built on the ae (Async Events) library — approximately 800 lines of portable C. The design is a classic single-threaded Reactor:

                  ┌──────────────────────────────┐
 Client connect → │  Multiplexer (epoll/kqueue)  │
                  └──────────────┬───────────────┘
                                 │ Ready file descriptors
                  ┌──────────────▼───────────────┐
                  │     aeProcessEvents()         │
                  │     (main event dispatcher)   │
                  └──┬──────────┬────────────┬───┘
                     │          │            │
          ┌──────────▼──┐  ┌────▼────┐  ┌───▼──────────┐
          │  File Events│  │  Time   │  │  After-Sleep  │
          │  (read/write│  │  Events │  │  Hook         │
          │   on sockets│  │ (cron)  │  │               │
          └─────────────┘  └─────────┘  └──────────────┘

Platform selection at compile time:

/* ae.c — auto-selects best available multiplexer */
#ifdef HAVE_EVPORT
#include "ae_evport.c"   /* Solaris event ports */
#elif defined(HAVE_EPOLL)
#include "ae_epoll.c"    /* Linux — O(1) ready-event retrieval */
#elif defined(HAVE_KQUEUE)
#include "ae_kqueue.c"   /* macOS, FreeBSD */
#else
#include "ae_select.c"   /* POSIX fallback — O(n) polling */
#endif

Why epoll over select: select can only watch descriptors numbered below FD_SETSIZE (typically 1024), and the kernel scans the whole set on every call, which is O(n). epoll uses kernel-maintained event tables; epoll_wait() returns only ready descriptors, so its cost is O(1) relative to total connections. At 50,000 concurrent connections, select is not merely slow but unusable; epoll handles that load efficiently.

Why not a thread pool: For Redis's command profile (each command executes in microseconds), the overhead of a thread pool — task queue locking, condition variable signaling, context switches — would dwarf the execution time. Nginx uses the same architectural reasoning.
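The Reactor shape is small enough to show in full. The sketch below uses Python's standard `selectors` module, which (much like ae) picks epoll, kqueue, or a fallback depending on the platform. It is an illustration of the pattern, not a rendering of Redis's C code. The upper-casing echo handler and the function names are invented for the example.

```python
import selectors
import socket


def make_echo_handler(sel: selectors.BaseSelector):
    """Build a file-event callback that echoes received bytes in upper case."""
    def on_readable(conn: socket.socket, mask: int) -> None:
        data = conn.recv(4096)
        if data:
            conn.sendall(data.upper())   # "execute the command" and reply
        else:
            sel.unregister(conn)         # peer closed: stop watching the socket
            conn.close()
    return on_readable


def run_reactor(sel: selectors.BaseSelector, max_iterations: int,
                timeout: float = 0.1) -> None:
    """Single-threaded event loop: wait for ready descriptors, dispatch callbacks."""
    for _ in range(max_iterations):
        for key, mask in sel.select(timeout=timeout):
            callback = key.data          # the handler registered with the socket
            callback(key.fileobj, mask)


# Usage: register one end of a socket pair and drive the loop by hand.
if __name__ == "__main__":
    selector = selectors.DefaultSelector()
    client_end, server_end = socket.socketpair()
    selector.register(server_end, selectors.EVENT_READ, make_echo_handler(selector))
    client_end.sendall(b"ping")
    run_reactor(selector, max_iterations=1)
    print(client_end.recv(4096))
```

Every handler runs to completion on the one thread, which is why a slow callback delays every other client. The same property makes O(N) commands dangerous in Redis.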

1.5 Positioning Relative to MySQL and MongoDB

1.5.1 Three-Way Comparison

Dimension	Redis	MySQL (InnoDB)	MongoDB (WiredTiger)
Primary storage	RAM	Disk (B+ tree)	Disk (B+ tree)
Read latency (p50)	< 0.5ms	1–5ms	1–5ms
Write latency (p50)	< 0.5ms	5–20ms	2–10ms
Data capacity	RAM-bounded	Multi-TB	Multi-TB
Query language	Key-based, limited	Full SQL	MQL (rich)
Transactions	Limited (Lua/MULTI)	Full ACID	Multi-doc (4.0+)
Replication	Async (default)	Semi-sync, GTID	Raft-based replica sets
Data model	KV + typed structures	Relational	Document (BSON)
TTL support	Native, per-key	Via application	Via TTL index

1.5.2 Collaborative Architecture

In a production e-commerce system, each database fills a distinct role:

┌─────────────────────────────────────────────────────┐
│                     Application                      │
└──────────┬──────────────┬───────────────┬────────────┘
           │              │               │
   ┌───────▼──────┐ ┌─────▼──────┐ ┌────▼──────────┐
   │    Redis     │ │   MySQL    │ │   MongoDB     │
   │              │ │            │ │               │
   │ • Session    │ │ • users    │ │ • events      │
   │ • Cart       │ │ • orders   │ │   (clickstream│
   │ • Inventory  │ │ • products │ │    logs)      │
   │   (DECR)     │ │ • payments │ │               │
   │ • Leaderboard│ │            │ │               │
   │ • Rate limits│ │            │ │               │
   └──────────────┘ └────────────┘ └───────────────┘

The decision rule is straightforward: if the data must persist with ACID guarantees, use MySQL. If it requires flexible schema and rich queries, use MongoDB. If it needs to be read or modified in under 1ms or has temporal characteristics (TTL, sliding window), use Redis.

1.6 When Redis Is the Wrong Choice

1.6.1 Data Exceeds Available RAM

Redis holds the full dataset in memory. Plan for usage to stay below roughly 70–80% of available RAM: fork-based persistence (RDB snapshots, AOF rewrites) needs copy-on-write headroom, and under memory pressure the OS begins swapping pages. On a system without swap, the OOM killer becomes a risk. On a system with swap, a single access to a swapped-out key can take 10ms, which is indistinguishable from a database query.

If you need TB-scale random access to key-value data, evaluate Aerospike (hybrid memory/SSD architecture, ~64 bytes of RAM per record regardless of value size) or RocksDB-backed stores.

1.6.2 Complex Query Requirements

Redis has no query planner, no secondary indexes (unless you build them manually as sets), and no join operator. If you find yourself iterating over SMEMBERS results and filtering in application code, you've exceeded Redis's query capabilities:

# Anti-pattern: simulating a WHERE clause in application code
# (r is the redis.Redis client created earlier)
members = r.smembers("orders:user:1001")
result = [
    r.hgetall(f"order:{oid}")
    for oid in members
    if float(r.hget(f"order:{oid}", "amount") or 0) > 100
]
# Up to two round trips per order (O(n) in total): use the database

1.6.3 Strong Consistency Requirements

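Redis's default replication is asynchronous. A write acknowledged by the primary may not yet be on any replica. If the primary fails before replication completes, that write is lost. WAIT numreplicas timeout blocks the client until the given number of replicas have acknowledged the write or the timeout elapses. It narrows the loss window but adds latency, stalls for the full timeout when replicas are unavailable, and does not turn Redis into a strongly consistent store.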
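In client code, the write and the WAIT should travel on the same connection, since WAIT refers to writes issued by the connection that calls it. A hedged sketch, assuming `client` is a `redis.Redis` instance and that its pipeline object accepts `wait` like any other command. `replicated_set` is a hypothetical helper.

```python
def replicated_set(client, key: str, value: str,
                   min_replicas: int = 1, timeout_ms: int = 100) -> bool:
    """SET, then WAIT on the same connection; True if enough replicas acknowledged."""
    # A non-transactional pipeline sends both commands over one connection.
    pipe = client.pipeline(transaction=False)
    pipe.set(key, value)
    pipe.wait(min_replicas, timeout_ms)
    _set_reply, acked = pipe.execute()
    # WAIT returns the number of replicas that acknowledged, which may be
    # fewer than requested if the timeout elapsed first.
    return acked >= min_replicas
```

A `False` result does not undo the write. The key is already set on the primary, so the caller has to decide whether to retry, alert, or compensate.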

For financial transactions, payment processing, or inventory systems where data loss is unacceptable, use Amazon MemoryDB for Redis (which commits writes to a multi-AZ transactional log) or a traditional RDBMS.

1.6.4 Cold or Archival Data

Redis capacity is paid for in RAM. Data accessed once per day (historical reports, audit logs, monthly aggregates) is expensive to keep in memory. Store it in S3, BigQuery, or a cold DynamoDB tier. Redis is for hot data that justifies the memory cost.

1.6.5 Very Large Values

Single values exceeding ~10KB begin to impact performance:

Serialization and deserialization block the main thread
Network bandwidth becomes a bottleneck for large GETs
Memory fragmentation increases with varied value sizes

Practical limits: values up to 1KB are optimal; 1–10KB acceptable; 10KB–100KB requires justification; 100KB+ is an architectural smell. Store large blobs in object storage (S3/GCS/OSS); keep only the reference URL in Redis.

1.7 Summary

Redis's design philosophy can be summarized as: solve the right problem, in memory, with the simplest correct data structure. Its single-threaded event loop, in-memory storage, and purposeful data structures form a mutually reinforcing system — each design choice supports the others.

Understanding where Redis excels and where it fails is more valuable than memorizing command syntax. Chapter 2 examines the competitive landscape: Memcached, Aerospike, Dragonfly, Valkey, and KeyDB — each making different trade-offs from the same starting point.
