Chapter 8

quicklist: listpack + Doubly Linked List Hybrid

Chapter 8: quicklist — listpack + Doubly Linked List Hybrid Design

8.1 Design Background and Evolution

8.1.1 Three Eras of Redis List Implementation

The Redis List data type has undergone three distinct internal representations, each solving the problems of its predecessor.

Era 1 (Redis 2.x): ziplist

LPUSH key a b c d ... (N elements stored as one ziplist)

Memory:
[zlbytes(4)][zltail(4)][zllen(2)][entry1][entry2]...[entryN][0xFF]
└──────────────── single contiguous allocation ──────────────────┘

Strengths:
  - No pointer overhead — pure data in one block
  - CPU cache-friendly (sequential memory)

Weaknesses:
  - LINSERT requires memmove of O(bytes) — for a 1MB list, moves 1MB of data
  - List growth causes realloc + potential memcpy
  - Cascading update risk (see Chapter 6)
  - Entire list in one allocation: memory allocator pressure

Era 2: linkedlist (doubly linked list)

head ──► [node1] ──► [node2] ──► [node3] ──► ... ──► NULL
          prev=NULL   prev=node1   prev=node2
         [sds data]  [sds data]   [sds data]

Strengths:
  - O(1) LPUSH/RPUSH/LPOP/RPOP — no memmove, no realloc
  - Insertions affect only adjacent pointers

Weaknesses:
  - Each node is a separate malloc allocation → memory fragmentation
  - Pointer overhead: prev(8B) + next(8B) = 16 bytes per node
  - Memory is non-contiguous → poor cache behavior
  - For a 10-byte element: 16 bytes of pointers + 3 bytes SDS header = 19 bytes overhead for 10 bytes data

Era 3 (Redis 3.2+): quicklist (current)

Doubly linked list of listpack nodes:

 head                                                          tail
  │                                                             │
  ▼                                                             ▼
[QLnode1] ◄──► [QLnode2] ◄──► [QLnode3] ◄──► [QLnode4] ◄──► [QLnode5]
│listpack│      │  LZF   │    │  LZF   │    │  LZF   │    │listpack│
│ (RAW)  │      │compress│    │compress│    │compress│    │ (RAW)  │
└────────┘      └────────┘    └────────┘    └────────┘    └────────┘
  (hot)            (cold)        (cold)        (cold)        (hot)

compress=1: head + tail nodes stay uncompressed (frequent LPUSH/RPOP)
            interior nodes compressed with LZF (rarely accessed)
fill=-2: each listpack capped at 8KB

8.2 quicklistNode and quicklist Structure

8.2.1 C Struct Definitions (quicklist.h)

/* Single node in the quicklist — represents one listpack */
typedef struct quicklistNode {
    struct quicklistNode *prev;    /* previous node */
    struct quicklistNode *next;    /* next node */
    unsigned char *entry;          /* pointer to listpack (or LZF compressed data) */
    size_t sz;                     /* byte size of entry (uncompressed size) */
    unsigned int count : 16;       /* number of elements in this node's listpack */
    unsigned int encoding : 2;     /* QUICKLIST_NODE_ENCODING_RAW=1
                                      QUICKLIST_NODE_ENCODING_LZF=2 */
    unsigned int container : 2;    /* QUICKLIST_NODE_CONTAINER_PLAIN=1  (single large element)
                                      QUICKLIST_NODE_CONTAINER_PACKED=2 (listpack) */
    unsigned int recompress : 1;   /* node was temporarily decompressed; recompress before next access */
    unsigned int attempted_compress : 1; /* tried to compress but node is too small */
    unsigned int dont_compress : 1; /* prevent compression (e.g., node is being iterated) */
    unsigned int extra : 9;        /* reserved */
} quicklistNode;

/* The quicklist itself — a doubly-linked list of quicklistNodes */
typedef struct quicklist {
    quicklistNode *head;                   /* head node */
    quicklistNode *tail;                   /* tail node */
    unsigned long count;                   /* total element count across all nodes */
    unsigned long len;                     /* number of quicklistNodes */
    signed int fill : QL_FILL_BITS;        /* max node size policy */
    unsigned int compress : QL_COMP_BITS;  /* LZF compression depth (nodes from each end to leave uncompressed) */
    unsigned int bookmark_count : QL_BM_BITS;
    quicklistBookmark bookmarks[];         /* for SCAN cursor support */
} quicklist;

8.2.2 Memory Layout Visualization

quicklist struct (on heap, ~48 bytes):
┌─────────┬─────────┬──────────┬──────┬────────────┐
│head ptr │tail ptr │ count    │ len  │fill|compress│
│  8 bytes│  8 bytes│  8 bytes │8bytes│  4 bytes   │
└────┬────┴────┬────┴──────────┴──────┴────────────┘
     │         │
     ▼         ▼
┌──────────────┐              ┌──────────────┐
│quicklistNode │◄────────────►│quicklistNode │
│prev=NULL     │              │next=NULL     │
│next ─────────┼──────────►   │prev ─────────│◄──
│entry ────────┼──►           │entry ────────┼──►
│sz=7800       │              │sz=3200       │
│count=642     │              │count=260     │
│encoding=RAW  │              │encoding=LZF  │
└──────────────┘              └──────────────┘
       │                              │
       ▼                              ▼
┌─────────────────────┐    ┌─────────────────────┐
│  listpack (7.8KB)   │    │  LZF blob (~1.8KB)  │
│  [total-bytes=7800] │    │  original sz=3200   │
│  [num-elems=642]    │    │  compressed: 56%    │
│  [elem1]...[elem642]│    │  ratio              │
│  [0xFF]             │    │                     │
└─────────────────────┘    └─────────────────────┘

8.3 The fill Parameter Explained

8.3.1 list-max-listpack-size Configuration

The fill field (signed int) determines the maximum capacity of each quicklistNode's listpack:

Positive values — element count limit:
  fill = 1:    max 1 element per node
  fill = 128:  max 128 elements per node (size unconstrained)
  fill = 512:  max 512 elements per node

Negative values — byte size limit:
  fill = -1:   max  4,096 bytes per listpack  (4KB)
  fill = -2:   max  8,192 bytes per listpack  (8KB)  ← DEFAULT
  fill = -3:   max 16,384 bytes per listpack (16KB)
  fill = -4:   max 32,768 bytes per listpack (32KB)
  fill = -5:   max 65,536 bytes per listpack (64KB)

8.3.2 Implementation

/* quicklist.c — check if a value of size sz can be inserted into node */
REDIS_STATIC int
_quicklistNodeAllowInsert(const quicklistNode *node,
                          const int fill, const size_t sz) {
    if (unlikely(!node || !node->entry)) return 0;

    /* PLAIN nodes (large elements) cannot be combined */
    if (unlikely(QL_NODE_IS_PLAIN(node) || isLargeElement(sz))) return 0;

    /* SIZE_ESTIMATE_OVERHEAD accounts for listpack element overhead (~11 bytes) */
    if (likely(_quicklistNodeSizeMeetsOptimizationRequirement(
            node->sz + sz + SIZE_ESTIMATE_OVERHEAD, fill)))
        return 1;
    else if (!sizeMeetsSafetyLimit(node->sz + sz + SIZE_ESTIMATE_OVERHEAD))
        return 0;  /* absolute hard limit: never exceed 8KB safety margin */
    else if ((int)node->count < fill)
        return 1;  /* positive fill: count-based limit */
    return 0;
}

/* Map negative fill value to byte limit */
REDIS_STATIC int
_quicklistNodeSizeMeetsOptimizationRequirement(const size_t sz, const int fill) {
    if (fill >= 0) return 0;  /* positive fill uses count limit, not size */
    size_t offset = (-fill) - 1;
    static const size_t optimization_level[] = {4096, 8192, 16384, 32768, 65536};
    if (offset < 5)
        return sz <= optimization_level[offset];
    return 0;
}

8.3.3 Why 8KB is the Default Sweet Spot

Analysis of fill=-2 (8KB per listpack):

Typical element sizes:
  Short strings (URLs, keys):   50-200 bytes
  Small JSON objects:          200-500 bytes
  Log lines:                   100-500 bytes

Elements per node with fill=-2:
  @ 50 bytes/element:  8192 / (50+11) ≈ 134 elements per node
  @ 200 bytes/element: 8192 / (200+11) ≈  39 elements per node
  @ 500 bytes/element: 8192 / (500+11) ≈  16 elements per node

Cache behavior:
  L1 cache: 32KB (typical) — 8KB listpack fits in L1
  L1 cache line: 64 bytes → 8KB = 128 cache lines (prefetchable)
  Memory page: 4KB → 8KB = 2 pages (minimal TLB pressure)

memmove cost within one listpack:
  8KB memmove on modern hardware: ~0.5μs
  Acceptable for Redis's single-threaded model

8.4 LZF Compression

8.4.1 list-compress-depth Configuration

compress = 0:  No compression (default)
compress = 1:  1 node from each end uncompressed; rest compressed
compress = 2:  2 nodes from each end uncompressed; rest compressed
compress = N:  N nodes from each end uncompressed; rest compressed

Visual with compress=2, 8 nodes:
  [node1] [node2] [node3] [node4] [node5] [node6] [node7] [node8]
  ─ RAW ──── RAW ─── LZF ─── LZF ─── LZF ─── LZF ─── RAW ─── RAW ─

Access patterns this optimizes for:
  - LPUSH/RPUSH: hit node1 or node8 (RAW, no decompression needed)
  - LPOP/RPOP:   hit node1 or node8 (RAW)
  - LRANGE 0 N:  iterate from node1 forward (decompress each interior as needed)
  - Interior nodes: rarely accessed → compression reduces memory

8.4.2 LZF Algorithm Internals

LZF is a LZ77-variant designed for speed over compression ratio:

/* Conceptual LZF compress implementation */
int lzf_compress(const void *in_data, unsigned int in_len,
                 void *out_data, unsigned int out_len) {
    /*
     * State: hash table mapping 3-byte sequences to their last seen position
     * Hash: HTAB[((v >> 8) ^ v) & HMASK]  where v = 3 bytes packed as int
     *
     * For each input position:
     *   1. Compute 3-byte hash at current position
     *   2. Look up hash table for a recent match
     *   3a. Match found within MAX_DISTANCE bytes?
     *       Output back-reference: [dist_high | len-2] [dist_low]
     *       Advance ip by match length
     *   3b. No match: output literal byte, advance ip by 1
     *   4. Update hash table
     *
     * Back-reference encoding (2 bytes):
     *   Byte 0: [len-2 in high 3 bits] [dist_high in low 5 bits]
     *           len: 3-8 bytes short ref, 9+ bytes long ref
     *   Byte 1: dist_low (8 bits)
     *   Max distance: (1<<13) = 8192 bytes
     */
}

8.4.3 Compression/Decompression in quicklist

/* quicklist.c — compress a RAW node with LZF */
REDIS_STATIC int quicklistCompressNode(quicklistNode *node) {
    if (node->encoding == QUICKLIST_NODE_ENCODING_LZF) return 0;
    if (node->dont_compress) return 0;

    /* Try LZF compression */
    quicklistLZF *lzf = zmalloc(sizeof(*lzf) + node->sz);
    lzf->sz = lzf_compress(node->entry, node->sz, lzf->compressed,
                           node->sz);  /* output buffer same size as input */
    if (lzf->sz == 0 || lzf->sz >= node->sz) {
        /* Compression made it larger or failed: mark as attempted */
        node->attempted_compress = 1;
        zfree(lzf);
        return 0;
    }
    lzf = zrealloc(lzf, sizeof(*lzf) + lzf->sz);  /* shrink to actual size */
    zfree(node->entry);
    node->entry = (unsigned char *)lzf;
    node->encoding = QUICKLIST_NODE_ENCODING_LZF;
    return 1;
}

/* Decompress a node for access (marks recompress=1) */
REDIS_STATIC void quicklistDecompressNodeForUse(quicklistNode *node) {
    if (node->encoding != QUICKLIST_NODE_ENCODING_LZF) return;

    quicklistLZF *lzf = (quicklistLZF *)node->entry;
    void *data = zmalloc(node->sz);   /* node->sz = original uncompressed size */
    if (lzf_decompress(lzf->compressed, lzf->sz, data, node->sz) == 0) {
        zfree(data);
        return;
    }
    zfree(node->entry);
    node->entry = data;
    node->encoding = QUICKLIST_NODE_ENCODING_RAW;
    node->recompress = 1;   /* flag: must recompress after use */
}

8.4.4 Real-World Compression Numbers

Test: Redis List storing 1000 Nginx access log lines (avg 180 bytes/line)
      fill=-2, 7 quicklistNodes needed

Without compression (compress=0):
  7 × ~8KB = ~56KB of listpack data
  + quicklist overhead: ~7 × 56 bytes (quicklistNode structs) = ~392 bytes
  Total: ~56.4KB

With compression (compress=1):
  2 end nodes uncompressed: 2 × 8KB = 16KB
  5 interior nodes LZF compressed:
    Nginx log compression ratio: ~3:1 (repetitive timestamps, URLs, IPs)
    5 × 8KB / 3 = ~13.3KB compressed
  Total: 16 + 13.3 = ~29.3KB

Memory savings: (56.4 - 29.3) / 56.4 ≈ 48% reduction

LZF performance on 8KB block:
  Compression:   ~20μs at 400 MB/s
  Decompression: ~10μs at 800 MB/s
  
Access cost: compress=1, accessing interior node:
  Decompress (10μs) + access + recompress (20μs) = ~30μs extra latency

8.5 LPUSH / RPUSH Operation Flow

/* quicklist.c — push a value to the head of the quicklist */
int quicklistPushHead(quicklist *quicklist, void *value, size_t sz) {
    quicklistNode *orig_head = quicklist->head;

    if (unlikely(isLargeElement(sz))) {
        /* Large element (> 8KB): create a PLAIN node — not a listpack */
        __quicklistInsertPlainNode(quicklist, quicklist->head, value, sz, 0);
        return 1;
    }

    if (likely(
        _quicklistNodeAllowInsert(quicklist->head, quicklist->fill, sz))) {
        /* Head node has capacity: insert directly into its listpack */
        quicklist->head->entry = lpPrepend(quicklist->head->entry, value, sz);
        quicklistNodeUpdateSz(quicklist->head);
    } else {
        /* Head node is full: create a new head node */
        quicklistNode *node = quicklistCreateNode();
        node->entry = lpPrepend(lpNew(0), value, sz);
        quicklistNodeUpdateSz(node);
        _quicklistInsertNodeBefore(quicklist, quicklist->head, node);
    }
    quicklist->count++;
    quicklist->head->count++;

    /* If a new head was created, check if old head needs LZF compression */
    return (orig_head != quicklist->head);
}

Operation complexity summary:

LPUSH/RPUSH:
  Head/tail node has space: O(1) — lpPrepend/lpAppend on listpack
  Head/tail node full:      O(1) — allocate new node + lpPrepend

LPOP/RPOP:
  From listpack (node has >1 element): O(1) — lpDeleteFirst/lpDeleteLast
  Node becomes empty: O(1) — free listpack, remove node, free quicklistNode

LLEN:
  Direct read of quicklist->count: O(1)

LRANGE start stop:
  O(log N) to skip to start node (node count traversal), then O(M) scan
  Actually O(start + M) due to sequential node traversal

8.6 LINDEX and LINSERT: O(n) Random Access

/* O(n) — must traverse nodes until cumulative count reaches idx */
quicklistEntry quicklistIndex(quicklist *quicklist, long long idx) {
    quicklistNode *n;
    long long accum = 0;
    int forward = (idx >= 0);

    if (!forward) {
        /* Negative index: count from tail */
        idx = (-idx) - 1;
        n = quicklist->tail;
    } else {
        n = quicklist->head;
    }

    while (n) {
        if ((accum + n->count) > idx) {
            /* Target element is in this node */
            quicklistDecompressNodeForUse(n);  /* decompress if needed */
            long offset = idx - accum;
            if (!forward) offset = n->count - 1 - offset;
            unsigned char *elem = lpIndex(n->entry, offset);
            /* ... return entry */
            n->recompress = 1;
            break;
        }
        if (forward) {
            accum += n->count;
            n = n->next;
        } else {
            accum += n->count;
            n = n->prev;
        }
    }
}

Real-world cost of LINDEX:

Scenario: 10,000-element list, fill=-2, compress=1
  Nodes: ~10000/134 ≈ 75 nodes

LINDEX 5000 (middle element):
  Phase 1: Traverse ~37 nodes to find target node
           37 × (pointer dereference + count check) ≈ 0.5μs
  Phase 2: Decompress target node (interior: LZF): ~10μs
  Phase 3: lpIndex traversal within listpack (~67 elements): ~0.3μs
  Phase 4: Mark recompress=1, recompress when iter releases: ~20μs
  Total: ~31μs (30× slower than a hash lookup)

Lesson: Use Redis List for queue semantics (LPUSH/RPOP).
        For random access by index, consider Hash with integer keys.

8.7 Iterator and LRANGE Implementation

/* quicklist iterator */
typedef struct quicklistIter {
    quicklist *quicklist;    /* the quicklist being iterated */
    quicklistNode *current;  /* current node */
    unsigned char *zi;       /* current element pointer within current node's listpack */
    long offset;             /* element offset within current node */
    int direction;           /* AL_START_HEAD or AL_START_TAIL */
} quicklistIter;

/* Iterate to next element */
int quicklistNext(quicklistIter *iter, quicklistEntry *entry) {
    /* ... */
    if (iter->zi == NULL) {
        /* Start of node: decompress if needed */
        quicklistDecompressNodeForUse(iter->current);
        iter->zi = lpFirst(iter->current->entry);
    } else {
        iter->zi = lpNext(iter->current->entry, iter->zi);
    }

    if (iter->zi) {
        /* Got an element from current node */
        entry->zi = iter->zi;
        /* ... populate entry */
        return 1;
    }

    /* Current node exhausted — move to next node */
    quicklistCompress(iter->quicklist, iter->current);  /* recompress */
    iter->current = (iter->direction == AL_START_HEAD) ?
                    iter->current->next : iter->current->prev;
    /* ... reset zi for new node */
}

LRANGE uses this iterator, making it significantly more efficient than repeated LINDEX calls:

LRANGE key 0 999 (1000 elements from a 10K-element list):

Using iterator:
  1. Skip to first node containing idx=0: O(1) — always head node
  2. Sequentially iterate 1000 elements across nodes:
     - Within each node: sequential listpack traversal (cache-friendly)
     - Node boundaries: recompress current, decompress next
  Total: O(1000) element visits + O(node_count_in_range) decompress ops
  
  Contrast with 1000 × LINDEX: O(1000 × N/2) = O(500K) — 500x worse

8.8 Large Element Handling (PLAIN nodes)

Redis 7.0 introduced handling for elements larger than the fill threshold:

/* Elements larger than 8KB get their own PLAIN node (not a listpack) */
#define QL_FILL_PLAIN_NODE_THRESHOLD 8192  /* bytes */

static int isLargeElement(size_t sz) {
    return sz >= QL_FILL_PLAIN_NODE_THRESHOLD;
}

/* PLAIN node memory layout:
   quicklistNode.container = QUICKLIST_NODE_CONTAINER_PLAIN
   quicklistNode.entry → raw data bytes (not a listpack)
   quicklistNode.sz    = actual byte size of the element
   quicklistNode.count = 1 (always)
*/

This ensures that even extreme-size values (e.g., a 1MB JSON blob) don't break the fixed-size listpack model.

8.9 Performance Benchmark

Environment: Redis 7.0, list-max-listpack-size=-2, list-compress-depth=0
Hardware: 4-core Xeon, 16GB RAM, no persistence

Command         List Size    Throughput    P99 Latency
──────────────────────────────────────────────────────
LPUSH           10K          2.1M ops/s    0.08ms
RPUSH           10K          2.1M ops/s    0.08ms
LPOP            10K          2.0M ops/s    0.09ms
RPOP            10K          2.0M ops/s    0.09ms
LRANGE 0 99     10K          180K ops/s    0.5ms
LRANGE 0 999    10K           30K ops/s    2.8ms
LRANGE 0 9999   10K            3K ops/s   28ms
LINDEX (middle) 10K          450K ops/s    0.3ms
LINDEX (middle) 100K          80K ops/s    1.5ms
LINSERT         10K          420K ops/s    0.35ms

With compress=1 (interior nodes LZF):
  LPUSH/RPUSH:   ~same (head/tail always RAW)
  LRANGE 0 999:  ~22K ops/s (vs 30K) — ~27% slower due to decompress
  Memory:        ~48% savings for repetitive string data

8.10 Summary

quicklist represents a principled engineering trade-off:

Characteristic Pure listpack Pure linkedlist quicklist
Head/tail push O(1) + realloc O(1) O(1) amortized
Random access O(n) elements O(n) nodes O(n) elements
Memory density Highest (contiguous) Lowest (fragmented) Medium
Cache behavior Excellent Poor Good
LZF compression Not supported Not supported Supported
Maximum memmove Entire list None Bounded (fill size)

The key invariant that makes quicklist work: by bounding listpack size, the cost of any single memmove operation is bounded to at most fill_bytes bytes (default 8KB). This converts unbounded worst-case behavior into a predictable, manageable constant.

Combined with LZF compression for cold interior nodes, quicklist delivers the performance profile that Redis List users experience: fast queue operations at the ends, acceptable (if not optimal) range scans, and memory efficiency for large lists.

Rate this chapter
4.8  / 5  (50 ratings)

💬 Comments