Source Code Setup: Build, Debug and File Map
Chapter 21. Source Code Reading Environment: Compilation, Debugging, and the Core File Map
Understanding Redis at a deep level requires confronting the source code directly. This chapter establishes a complete source code reading environment, from compilation flags to GDB single-step debugging and from the file map to the startup chain, so that you can understand Redis from an engineer's perspective rather than a user's.
21.1 Obtaining the Source Code
Redis source code is hosted on GitHub at https://github.com/redis/redis. For study, check out a specific release tag rather than tracking the unstable branch, so that line numbers and function names stay fixed while you read.
git clone https://github.com/redis/redis.git
cd redis
git checkout 7.2.0 # Lock to the stable 7.2.0 release
git log --oneline -10 # Review recent commits and version history
Key changes in the 7.x series compared to 6.x:
- listpack fully replaces ziplist (began in 6.2, completed in 7.0)
- OBJECT ENCODING output changes: ziplist → listpack
- Multi-threaded I/O performance improvements
- Functions (Lua scripting replacement) formally introduced
21.2 Compilation: Preserving Debug Information
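The default make builds with optimization enabled (-O2 in older releases, -O3 with link-time optimization in recent Makefiles). The compiler inlines functions and reorders code, so single-stepping jumps around and many locals show up as optimized out. When reading source code, always disable optimization, either with the flags below or with the Makefile's noopt target (make noopt):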
# Disable optimization, keep full debug symbols
make CFLAGS="-g -O0" MALLOC=libc
# On Linux using jemalloc (the default)
make CFLAGS="-g -O0" MALLOC=jemalloc
# Parallel compilation for speed
make -j$(nproc) CFLAGS="-g -O0"
# Run the full test suite (requires Tcl 8.5+)
make test
# Run specific test files (names are relative to tests/, without the .tcl suffix)
./runtest --single unit/type/hash
./runtest --single unit/expire
Build artifacts are placed in src/:
- src/redis-server: the server binary
- src/redis-cli: the command-line client
- src/redis-benchmark: the benchmarking tool
- src/redis-sentinel: the sentinel program (actually a symlink to redis-server)
- src/redis-check-rdb: RDB file integrity checker
- src/redis-check-aof: AOF file integrity checker
Starting the server in debug mode:
src/redis-server --loglevel debug --save "" --appendonly no
# --save "" Disable RDB to avoid fork() interfering with debugging
# --appendonly no Disable AOF
# --loglevel debug Verbose logging
21.3 The Core File Map
Redis 7.2 contains approximately 150,000 lines of C code spread across 150+ files in the src/ directory. The following categorizes the most important files.
Core Framework Layer
| File | Lines (approx.) | Purpose |
|---|---|---|
| server.h | 3,800 | Global definitions: redisServer, client, redisObject, redisDb, redisCommand, and most other core struct declarations (dict and listpack are declared in their own headers) |
| server.c | 7,500 | Server body: main(), initServer(), serverCron(), beforeSleep(), command table |
| ae.c | 800 | Event loop: aeEventLoop, aeProcessEvents, aeCreateFileEvent, aeCreateTimeEvent |
| ae_epoll.c | 150 | epoll implementation (Linux) |
| ae_kqueue.c | 150 | kqueue implementation (macOS/BSD) |
| ae_select.c | 100 | select fallback implementation |
| ae_evport.c | 150 | Solaris event port implementation |
Network and Protocol Layer
| File | Lines (approx.) | Purpose |
|---|---|---|
| networking.c | 4,200 | Client connection management, RESP protocol parsing, readQueryFromClient, addReply family |
| resp_parser.c | 200 | Parser for RESP2/RESP3 replies, used when replies from scripts and modules are processed (RESP3 itself arrived in Redis 6) |
| tls.c | 1,200 | TLS/SSL support (requires OpenSSL) |
| unix.c | 100 | Unix Domain Socket support |
Data Type Command Layer
| File | Lines (approx.) | Purpose |
|---|---|---|
| t_string.c | 900 | String commands: set/get/incr/append/getrange/setrange |
| t_hash.c | 800 | Hash commands: hset/hget/hmset/hgetall |
| t_list.c | 700 | List commands: lpush/rpush/lrange/blpop |
| t_set.c | 900 | Set commands: sadd/smembers/sinter/sunion |
| t_zset.c | 2,800 | ZSet commands + complete skiplist implementation (zskiplist/zskiplistNode) |
| t_stream.c | 3,500 | Stream commands: xadd/xread/xgroup/xack |
Low-Level Data Structure Layer
| File | Lines (approx.) | Purpose |
|---|---|---|
| sds.c / sds.h | 1,000 | SDS dynamic strings: sdsnew/sdscat/sdsMakeRoomFor |
| dict.c / dict.h | 1,500 | Incremental hash table: dictCreate/dictAdd/dictFind/dictRehash |
| listpack.c / listpack.h | 1,100 | listpack compact list (primary encoding format in 7.0+) |
| ziplist.c | 1,500 | ziplist (deprecated in 7.0, retained for compatibility) |
| quicklist.c | 1,200 | quicklist: doubly-linked list of listpack nodes |
| intset.c | 400 | intset: integer set encoding (small Set backend) |
| rax.c / rax.h | 1,600 | Radix tree (prefix tree), used in Streams and cluster routing |
Persistence Layer
| File | Lines (approx.) | Purpose |
|---|---|---|
| rdb.c | 3,500 | RDB persistence: rdbSave/rdbLoad/rdbSaveObject |
| rdb.h | 200 | RDB format constant definitions |
| aof.c | 3,000 | AOF persistence: feedAppendOnlyFile/rewriteAppendOnlyFile |
| rio.c / rio.h | 500 | Streaming I/O abstraction: supports file/buffer/fd backends |
Replication and Cluster Layer
| File | Lines (approx.) | Purpose |
|---|---|---|
| replication.c | 5,500 | Master-replica replication: PSYNC protocol, backlog, partial resync |
| sentinel.c | 5,000 | Sentinel mode: failure detection, leader election, notifications |
| cluster.c | 10,000+ | Cluster mode (largest file): Gossip, slot migration, failover |
| cluster.h | 500 | Cluster data structure definitions |
Memory Management Layer
| File | Lines (approx.) | Purpose |
|---|---|---|
| object.c | 1,200 | redisObject creation/release/reference counting/type checks/encoding conversion |
| db.c | 1,000 | dbAdd/dbDelete/expireIfNeeded/lookupKey/selectDb |
| expire.c | 300 | Periodic deletion: activeExpireCycle |
| evict.c | 600 | Memory eviction: 8 policy implementations, performEvictions |
| lazyfree.c | 400 | Background lazy deletion: bioCreateLazyFreeJob |
| zmalloc.c | 400 | Memory allocation wrapper: zmalloc/zfree/used_memory tracking |
| defrag.c | 1,100 | Active defragmentation (activedefrag); the MEMORY command and fragmentation stats are implemented in object.c |
Background Task Layer
| File | Lines (approx.) | Purpose |
|---|---|---|
| bio.c | 400 | Background I/O threads: 3 task types (fsync/lazyFree/closeFd) |
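| threads_mngr.c | 200 | Runs a callback on every thread via signals, used to collect stack traces for crash reports (7.2+); I/O multi-threading itself (Redis 6+) is implemented in networking.c |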
21.4 GDB Debugging in Practice
Debugging the SET Command
# Terminal 1: Launch GDB
gdb src/redis-server
(gdb) break setCommand # Set breakpoint at setCommand in t_string.c
(gdb) break t_string.c:100 # Breakpoint by line number
(gdb) run redis.conf # Start the server
# Terminal 2: Send a command
redis-cli SET hello world
# Terminal 1: GDB stops at the breakpoint
(gdb) bt # Print backtrace (call stack)
(gdb) frame 0 # Switch to frame 0
(gdb) info args # Show function arguments
(gdb) p c->argc # Client argument count (should be 3: SET hello world)
(gdb) p (char*)c->argv[0]->ptr # First argument (command name: "SET"); ptr is a void*, so cast it
(gdb) p (char*)c->argv[1]->ptr # Key ("hello")
(gdb) p (char*)c->argv[2]->ptr # Value ("world")
(gdb) n # Next (step over)
(gdb) s # Step (step into)
(gdb) c # Continue
Inspecting a redisObject
(gdb) p *c->argv[1] # Print the redisObject for the key
# Output similar to:
# {type = 0, encoding = 8, lru = 12345678, refcount = 1, ptr = 0x7f...}
(gdb) p (char*)c->argv[1]->ptr # Print as string if embstr/raw encoding
(gdb) x/10c c->argv[1]->ptr # Examine memory as characters (first 10 bytes)
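The numeric type and encoding fields in that output correspond to the OBJ_* and OBJ_ENCODING_* constants in server.h. The following is a minimal sketch of a lookup table for reading them, assuming the values used in the 7.x series (check them against your own checkout). The helper names obj_type_name and obj_encoding_name are hypothetical and do not exist in Redis.

```c
#include <stddef.h>

/* Constant names indexed by their numeric value. The values are assumed to
 * match server.h in the 7.x series; verify them in your checkout. */
static const char *const obj_type_names[] = {
    "OBJ_STRING", "OBJ_LIST", "OBJ_SET", "OBJ_ZSET",
    "OBJ_HASH", "OBJ_MODULE", "OBJ_STREAM"
};

static const char *const obj_encoding_names[] = {
    "OBJ_ENCODING_RAW", "OBJ_ENCODING_INT", "OBJ_ENCODING_HT",
    "OBJ_ENCODING_ZIPMAP", "OBJ_ENCODING_LINKEDLIST", "OBJ_ENCODING_ZIPLIST",
    "OBJ_ENCODING_INTSET", "OBJ_ENCODING_SKIPLIST", "OBJ_ENCODING_EMBSTR",
    "OBJ_ENCODING_QUICKLIST", "OBJ_ENCODING_STREAM", "OBJ_ENCODING_LISTPACK"
};

/* Hypothetical helper: name of the `type` value printed by GDB. */
const char *obj_type_name(unsigned type) {
    size_t n = sizeof(obj_type_names) / sizeof(obj_type_names[0]);
    return type < n ? obj_type_names[type] : "unknown";
}

/* Hypothetical helper: name of the `encoding` value printed by GDB. */
const char *obj_encoding_name(unsigned encoding) {
    size_t n = sizeof(obj_encoding_names) / sizeof(obj_encoding_names[0]);
    return encoding < n ? obj_encoding_names[encoding] : "unknown";
}
```

With this table, the sample output above (type = 0, encoding = 8) reads as a string object stored with the embstr encoding.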
Debugging Time Events (serverCron)
(gdb) break serverCron
(gdb) commands # Auto-execute on each hit
> print server.hz
> print server.unixtime
> continue
> end
GDB Quick Reference
break func Set breakpoint at function entry
break file:line Set breakpoint at specific line
watch expr Watchpoint on write to expression
rwatch expr Watchpoint on read of expression
info break List all breakpoints
delete N Delete breakpoint N
print expr Print expression value
x/Nfu addr Examine memory (N=count, f=format, u=unit)
ptype var Print variable's full type
whatis var Short type description
set var name=value Modify a variable at runtime
finish Run until current function returns
up / down Navigate call stack frames
info locals Print all locals in current frame
21.5 The Startup Chain Explained
Redis's startup chain begins at main(). Understanding this chain is fundamental to understanding the overall architecture.
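// server.c (condensed outline: the real main() has more steps, and their order and signatures differ between versions)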
int main(int argc, char **argv) {
// 1. Locale and timezone initialization
setlocale(LC_COLLATE, "");
tzset();
// 2. OOM handler registration
zmalloc_set_oom_handler(redisOutOfMemoryHandler);
// 3. Random seed initialization
srand(time(NULL)^getpid());
// 4. Default configuration (populates the global server struct)
initServerConfig();
// 5. Module system initialization
moduleInitModulesSystem();
// 6. Parse command-line arguments, load config file
loadServerConfig(configfile, options);
// 7. Sentinel/cluster mode detection and setup
if (server.sentinel_mode) initSentinelConfig();
// 8. ACL system initialization
ACLInit();
// 9. Core server initialization (eventloop, port binding, etc.)
initServer();
// 10. Load modules from the queue
moduleLoadFromQueue();
// 11. Load data from disk (RDB or AOF)
loadDataFromDisk();
// 12. Cluster/sentinel mode post-init
if (server.cluster_enabled) clusterInit();
// 13. Enter the event loop (never returns under normal operation)
aeSetBeforeSleepProc(server.el, beforeSleep);
aeSetAfterSleepProc(server.el, afterSleep);
aeMain(server.el);
// 14. Cleanup after event loop exits (on shutdown)
aeDeleteEventLoop(server.el);
return 0;
}
Key Steps Inside initServer()
void initServer(void) {
// Create the event loop, the core of everything
server.el = aeCreateEventLoop(server.maxclients + CONFIG_FDSET_INCR);
// Create the database array (16 databases by default)
server.db = zmalloc(sizeof(redisDb) * server.dbnum);
for (int i = 0; i < server.dbnum; i++) {
server.db[i].dict = dictCreate(&dbDictType);
server.db[i].expires = dictCreate(&dbExpiresDictType);
server.db[i].blocking_keys = dictCreate(&keylistDictType);
server.db[i].watched_keys = dictCreate(&keylistDictType);
server.db[i].id = i;
}
// Bind and listen on TCP port
listenToPort(server.port, &server.ipfd);
// Register the serverCron time event: it first fires after 1 ms, then reschedules itself through its return value (1000/hz ms)
aeCreateTimeEvent(server.el, 1, serverCron, NULL, NULL);
// Register accept file events for each listening fd
for (int i = 0; i < server.ipfd.count; i++) {
aeCreateFileEvent(server.el, server.ipfd.fd[i], AE_READABLE,
acceptTcpHandler, NULL);
}
// Initialize LRU clock
server.lruclock = getLRUClock();
}
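The second argument of aeCreateTimeEvent is the delay in milliseconds before the first call. After that, the handler's own return value sets the delay until the next call, and returning AE_NOMORE (defined as -1 in ae.h) removes the event. The sketch below illustrates that rescheduling rule with a virtual clock. It is not the ae.c implementation, and time_event, process_time_event, cron_like, and simulate are hypothetical names introduced for illustration only.

```c
#define NOMORE (-1) /* plays the role of AE_NOMORE */

/* Handler signature for this sketch: returns the delay in ms until the
 * next call, or NOMORE to stop. */
typedef int time_proc(long long now_ms, void *data);

typedef struct {
    long long when_ms; /* absolute time of the next call */
    time_proc *proc;   /* handler to invoke */
    int active;        /* 0 once the handler has returned NOMORE */
} time_event;

/* Fire the event if it is due and reschedule it from the handler's return
 * value. Returns 1 if the handler ran, 0 otherwise. */
int process_time_event(time_event *te, long long now_ms, void *data) {
    if (!te->active || now_ms < te->when_ms) return 0;
    int next = te->proc(now_ms, data);
    if (next == NOMORE) te->active = 0;
    else te->when_ms = now_ms + next;
    return 1;
}

/* Stand-in for serverCron with hz = 10: count the call, ask for 100 ms. */
int cron_like(long long now_ms, void *data) {
    (void)now_ms;
    int *calls = data;
    (*calls)++;
    return 1000 / 10;
}

/* Advance a virtual clock one millisecond at a time and count how often
 * the handler runs within duration_ms. */
int simulate(long long duration_ms) {
    int calls = 0;
    time_event te = { 1, cron_like, 1 }; /* first call due at t = 1 ms */
    for (long long now = 0; now < duration_ms; now++)
        process_time_event(&te, now, &calls);
    return calls;
}
```

Under these assumptions the handler runs at t = 1, 101, 201, and so on, which is why a 1 ms initial delay does not mean a 1 ms polling interval.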
serverCron: The System Heartbeat
serverCron runs approximately server.hz times per second (default 10 = every 100ms). It schedules all periodic maintenance tasks:
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    // Simplified outline; the real function is much longer.
    // Always executed:
    server.unixtime = time(NULL);
    server.mstime = mstime();
    updateCachedTime(0);
    // Every 100ms:
    run_with_period(100) {
        trackInstantaneousMetric(STATS_METRIC_COMMAND, ...);
    }
    // Every call: client housekeeping, then database maintenance
    clientsCron();
    databasesCron(); // Slow active-expire cycle, incremental rehashing, resizing
    // Check for completed BGSAVE / BGREWRITEAOF child processes:
    if (hasActiveChildProcess()) {
        checkChildrenDone(); // waitpid-based check for a finished child
    }
    // Every 1000ms (1 second):
    run_with_period(1000) {
        replicationCron(); // Replication heartbeat, timeouts, reconnection
    }
    // Every 5000ms:
    run_with_period(5000) {
        // Update slow-changing statistics
    }
    server.cronloops++;
    return 1000 / server.hz; // Return next invocation interval in ms
}
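run_with_period gates a block so that it runs only on some serverCron ticks. The sketch below shows one way such gating can work, assuming logic along the lines of the server.h macro: a period no longer than one tick runs every time, otherwise the block runs when a loop counter (server.cronloops in Redis) is a multiple of period/tick. The function names are hypothetical, and hz is assumed to be between 1 and 1000 so that the tick length is never zero.

```c
/* Returns nonzero when a task with the given period should run on the tick
 * numbered `cronloops`. Assumes 1 <= hz <= 1000. */
int should_run_with_period(int period_ms, int hz, int cronloops) {
    int tick_ms = 1000 / hz;            /* length of one serverCron tick */
    if (period_ms <= tick_ms) return 1; /* cannot run more often than the tick */
    return (cronloops % (period_ms / tick_ms)) == 0;
}

/* Count how many of the first `ticks` ticks would run the task. */
int count_runs(int period_ms, int hz, int ticks) {
    int runs = 0;
    for (int loop = 0; loop < ticks; loop++)
        if (should_run_with_period(period_ms, hz, loop)) runs++;
    return runs;
}
```

At hz = 10, a 1000 ms task runs on every tenth tick and a 100 ms task on every tick. Periods are therefore only as precise as the tick length, which is one reason raising hz makes periodic work more responsive at the cost of more CPU.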
21.6 RESP Protocol Parsing in Source Code
Understanding readQueryFromClient in networking.c is essential to understanding Redis's network model:
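// Triggered when the client's socket becomes readable, e.g. after redis-cli sends SET hello world
// (simplified: error handling, buffer sizing, and I/O-thread handoff are omitted)
void readQueryFromClient(connection *conn) {
    client *c = connGetPrivateData(conn);
    size_t qblen = sdslen(c->querybuf);
    size_t readlen = PROTO_IOBUF_LEN;
    // Read data from the connection into querybuf
    c->querybuf = sdsMakeRoomFor(c->querybuf, readlen);
    int nread = connRead(c->conn, c->querybuf + qblen, readlen);
    if (nread <= 0) return; // The real code distinguishes retry, error, and EOF
    sdsIncrLen(c->querybuf, nread);
    // Parse the RESP protocol
    processInputBuffer(c);
}
void processInputBuffer(client *c) {
    while (c->qb_pos < sdslen(c->querybuf)) {
        // The first byte selects the request type: '*' starts a RESP multibulk request
        if (!c->reqtype)
            c->reqtype = (c->querybuf[c->qb_pos] == '*') ? PROTO_REQ_MULTIBULK : PROTO_REQ_INLINE;
        if (c->reqtype == PROTO_REQ_INLINE) {
            if (processInlineBuffer(c) != C_OK) break;
        } else if (c->reqtype == PROTO_REQ_MULTIBULK) {
            // Parse: *3\r\n$3\r\nSET\r\n$5\r\nhello\r\n$5\r\nworld\r\n
            if (processMultibulkBuffer(c) != C_OK) break;
        }
        // Once a full command is parsed, execute it
        if (c->argc > 0) {
            processCommand(c);
        }
    }
}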
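To make the multibulk format concrete, here is a small standalone parser for one complete request such as *3\r\n$3\r\nSET\r\n$5\r\nhello\r\n$5\r\nworld\r\n. It is a sketch under simplifying assumptions and not the logic of processMultibulkBuffer. It expects the whole request in one buffer, treats a partial request as an error rather than waiting for more data, and caps numbers at a small bound. The names resp_arg, read_int_line, and parse_multibulk are hypothetical.

```c
#include <stddef.h>

/* One parsed argument: a pointer into the input buffer plus its length. */
typedef struct {
    const char *ptr;
    size_t len;
} resp_arg;

/* Read a non-negative decimal number terminated by \r\n at buf[*pos].
 * On success store it in *out, move *pos past the \r\n, and return 0.
 * Return -1 if the line is incomplete, malformed, or the number is too big. */
static int read_int_line(const char *buf, size_t len, size_t *pos, long *out) {
    size_t i = *pos;
    long v = 0;
    if (i >= len || buf[i] < '0' || buf[i] > '9') return -1;
    while (i < len && buf[i] >= '0' && buf[i] <= '9') {
        v = v * 10 + (buf[i] - '0');
        if (v > 1000000) return -1; /* arbitrary cap, avoids overflow in this sketch */
        i++;
    }
    if (i + 1 >= len || buf[i] != '\r' || buf[i + 1] != '\n') return -1;
    *out = v;
    *pos = i + 2;
    return 0;
}

/* Parse one complete multibulk request into argv (at most max_args entries).
 * Returns the argument count, or -1 if the input is incomplete or malformed. */
int parse_multibulk(const char *buf, size_t len, resp_arg *argv, int max_args) {
    size_t pos = 0;
    long count, bulklen;

    if (len == 0 || buf[pos] != '*') return -1;
    pos++;
    if (read_int_line(buf, len, &pos, &count) != 0) return -1;
    if (count > max_args) return -1;

    for (long i = 0; i < count; i++) {
        if (pos >= len || buf[pos] != '$') return -1;
        pos++;
        if (read_int_line(buf, len, &pos, &bulklen) != 0) return -1;
        /* The payload must be followed by its own \r\n terminator. */
        if ((size_t)bulklen + 2 > len - pos) return -1;
        if (buf[pos + bulklen] != '\r' || buf[pos + bulklen + 1] != '\n') return -1;
        argv[i].ptr = buf + pos;
        argv[i].len = (size_t)bulklen;
        pos += (size_t)bulklen + 2;
    }
    return (int)count;
}
```

Because every bulk string carries an explicit length, the parser never scans the payload for delimiters. That is what makes the protocol binary-safe and cheap to parse.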
21.7 Source Code Reading Methodology
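1. Start from command handlers. Every Redis command maps to a C function through the redisCommandTable array. In 6.x the table was written by hand in server.c; from 7.0 on it is generated from the JSON files under src/commands/, so the entries look different there. The 6.x form is the easiest to read: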
struct redisCommand redisCommandTable[] = {
{"get", getCommand, 2, "read-only fast @string", ...},
{"set", setCommand, -3, "write use-memory @string", ...},
{"hset", hsetCommand, -4, "write use-memory fast @hash", ...},
// ... 200+ commands
};
2. Use ctags or cscope for navigation:
ctags -R src/ # Generate tags file
# In Vim: Ctrl+] to jump to definition, Ctrl+T to return
cscope -Rb # Build cscope database
3. Use VSCode with clangd: generate compile_commands.json (for example by wrapping the build with a tool such as bear) for precise jump-to-definition and type inference.
4. Anchor on redisObject. Almost all data is ultimately stored in robj (redisObject). Understanding the type/encoding/ptr triad is the key to understanding all encoding conversions:
typedef struct redisObject {
unsigned type:4; // OBJ_STRING / OBJ_LIST / OBJ_SET / OBJ_ZSET / OBJ_HASH
unsigned encoding:4; // OBJ_ENCODING_RAW / EMBSTR / INT / LISTPACK / SKIPLIST...
unsigned lru:24; // LRU clock value (seconds resolution) or LFU data (access time plus counter)
int refcount; // Reference count; shared objects such as the integers 0-9999 carry a sentinel value and are never freed
void *ptr; // Pointer to actual data (or direct integer value)
} robj;
5. Performance reference numbers to keep in mind:
- A fresh Redis instance uses ~1 MB of RAM
- Each additional client connection uses ~20 KB
- An empty string key costs ~64 bytes overhead
- The event loop handles ~1 million events per second on modern hardware
- epoll_wait call overhead: ~1-2 microseconds per wakeup
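These baseline figures can be combined into a rough capacity estimate. The sketch below simply adds them up, assuming the chapter's approximate numbers (about 1 MB base, about 20 KB per client, about 64 bytes of overhead per string key). Real usage depends on the allocator, encodings, and buffer sizes. estimate_memory_bytes is a hypothetical helper, not part of Redis.

```c
/* Rules of thumb quoted in this chapter; rough figures, not measurements. */
enum {
    BASE_BYTES = 1024 * 1024,     /* ~1 MB for a fresh instance */
    PER_CLIENT_BYTES = 20 * 1024, /* ~20 KB per client connection */
    PER_KEY_OVERHEAD_BYTES = 64   /* ~64 bytes per string key, excluding payload */
};

/* Back-of-the-envelope memory estimate in bytes. */
unsigned long long estimate_memory_bytes(unsigned long long clients,
                                         unsigned long long keys,
                                         unsigned long long avg_payload_bytes) {
    return BASE_BYTES
         + clients * PER_CLIENT_BYTES
         + keys * (PER_KEY_OVERHEAD_BYTES + avg_payload_bytes);
}
```

For example, 10,000 idle clients alone come to roughly 205 MB by this estimate, before any data is stored. This is the kind of scaling argument these numbers support.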
Chapter Summary
- Redis 7.2 source is approximately 150,000 lines of C code, with core logic concentrated in src/
- Compile with -g -O0 to retain full debug information for GDB
- GDB allows single-step tracing of any command execution path
- The startup chain: main() → initServerConfig() → initServer() → loadDataFromDisk() → aeMain() → permanent event loop
- serverCron is the system heartbeat, centrally scheduling all periodic tasks
- The redisObject type/encoding/ptr triad is the master key to understanding all data encodings
- Begin reading from redisCommandTable: every command has a corresponding C function entry point
- Real performance numbers matter: memorize baseline costs (per-connection RAM, per-key overhead) to reason about scaling