SDS Dynamic String: Why Not C String
Chapter 5: SDS — Simple Dynamic String: Why Not C Strings?
5.1 The Five Fundamental Flaws of C Strings
Before examining SDS, we must understand precisely why C strings fail as a foundation for a high-performance data store. C strings are null-terminated char arrays — a design that made sense in 1970s Unix, but introduces five structural problems for any system that processes untrusted data at high throughput.
5.1.1 O(n) Length Retrieval
The C standard library's strlen() scans every byte until it finds a null terminator:
/* Standard strlen implementation */
size_t strlen(const char *s) {
const char *p = s;
while (*p != '\0') p++;
return p - s;
}
For a 10 MB string, this requires 10 million memory reads. If strlen() is called in a loop — during string concatenation, pattern matching, or log formatting — the cost accumulates rapidly. On modern CPUs with L1 cache lines of 64 bytes, scanning a cold string causes a cache miss every 64 bytes, making this particularly expensive for large strings.
SDS solution: The len field is stored directly in the header. sdslen() is a single memory read — O(1) regardless of string size:
/* src/sds.h */
static inline size_t sdslen(const sds s) {
unsigned char flags = s[-1]; /* flags is at offset -1 from buf */
switch (flags & SDS_TYPE_MASK) {
case SDS_TYPE_5:
return SDS_TYPE_5_LEN(flags); /* high 5 bits of flags = length */
case SDS_TYPE_8:
return SDS_HDR(8, s)->len; /* direct field read */
case SDS_TYPE_16:
return SDS_HDR(16, s)->len;
case SDS_TYPE_32:
return SDS_HDR(32, s)->len;
case SDS_TYPE_64:
return SDS_HDR(64, s)->len;
}
return 0;
}
5.1.2 Buffer Overflow
Standard C string functions do not check destination buffer capacity:
char s1[6] = "Hello"; /* 6 bytes: 5 chars + '\0' */
char s2[] = " World"; /* 7 bytes */
strcat(s1, s2); /* UNDEFINED BEHAVIOR: writes 13 bytes into 6-byte buffer */
/* This overwrites adjacent stack/heap memory */
/* CVE-2015-3416, CVE-2018-16395, and hundreds of others trace back to this */
SDS solution: Every write operation calls sdsMakeRoomFor() before modifying the buffer, guaranteeing sufficient capacity:
/* src/sds.c */
sds sdsMakeRoomFor(sds s, size_t addlen) {
size_t avail = sdsavail(s); /* avail = alloc - len */
if (avail >= addlen) return s; /* Already enough space, no-op */
size_t len = sdslen(s);
size_t newlen = len + addlen;
/* Space preallocation policy */
if (newlen < SDS_MAX_PREALLOC) /* SDS_MAX_PREALLOC = 1MB (1 << 20) */
newlen *= 2; /* Double for small strings */
else
newlen += SDS_MAX_PREALLOC; /* Add 1MB for large strings */
/* Type may change (e.g., sdshdr8 → sdshdr16 if newlen > 255) */
return _sdsMakeRoomFor(s, addlen, 1);
}
The buffer is always valid before any write; the SDS contract guarantees len ≤ alloc at all times.
5.1.3 Frequent Reallocation
Appending to a C string requires realloc on every operation:
/* Naive C string append (must realloc every time) */
char *s = strdup("hello");
for (int i = 0; i < 1000; i++) {
size_t len = strlen(s);
s = realloc(s, len + 4 + 1); /* +4 for " foo", +1 for '\0' */
strcat(s, " foo");
}
/* 1000 realloc calls = 1000 potential memory copies */
realloc in the worst case: allocates new memory, copies the old contents, frees the old allocation. For a string growing from 1KB to 2KB, that is 1KB of data copied per realloc. For a 1MB string: 1MB copied per append.
SDS solution: Space preallocation (covered in 5.4) combined with lazy space release (5.5) dramatically reduces realloc frequency.
5.1.4 Cannot Store Binary Data Containing Null Bytes
The null terminator convention makes C strings fundamentally incompatible with binary data:
/* Attempt to store a JPEG file header */
char data[] = "\xff\xd8\xff\xe0\x00\x10JFIF";
strlen(data); /* Returns 4! Stops at the '\xff' that happens to be... */
/* Actually stops at the first 0x00 byte */
/* 0x00 at offset 4 terminates the string — the "JFIF" marker is invisible */
memcpy(dest, data, strlen(data)); /* Copies only 4 bytes of the 10-byte header */
Any protocol that uses 0x00 bytes as part of its framing — gzip, PNG, TLS records, MessagePack — cannot be safely stored in a C string.
SDS solution: The len field records the actual byte count; buf[] is treated as an arbitrary byte sequence:
# Store a raw PNG header (contains 0x00 bytes)
redis-cli SET imgmeta $'\x89\x50\x4e\x47\x0d\x0a\x1a\x0a\x00\x00\x00\x0d'
redis-cli STRLEN imgmeta # → 12 (correct: counts all bytes including 0x00)
# Binary-safe GETRANGE
redis-cli GETRANGE imgmeta 0 3 # → \x89PNG (first 4 bytes, correct)
5.1.5 Unsafe Standard Library Functions
Functions like gets(), scanf("%s", buf), sprintf() without width limits, and strcpy() are the root cause of decades of security vulnerabilities. Even "safer" alternatives (snprintf, strncat) require the programmer to manually calculate remaining buffer space on every call — and mistakes are common.
SDS eliminates all use of these functions in the hot path. All string operations go through the SDS API, which handles bounds checking internally.
5.2 The Five SDS Header Structures
SDS selects the smallest header that can represent the string's length, minimizing per-string overhead:
/* src/sds.h — all headers use __attribute__((packed)) */
/* sdshdr5: length stored in high 5 bits of flags; max string: 31 bytes */
struct __attribute__ ((__packed__)) sdshdr5 {
unsigned char flags; /* bits [7:3] = length (0-31); bits [2:0] = SDS_TYPE_5 */
char buf[]; /* no len or alloc fields — cannot grow without type change */
};
/* sdshdr8: max string length = 255 bytes (uint8_t) */
struct __attribute__ ((__packed__)) sdshdr8 {
uint8_t len; /* Used bytes in buf (not counting '\0') */
uint8_t alloc; /* Total allocated bytes in buf (not counting '\0') */
unsigned char flags; /* bits [2:0] = SDS_TYPE_8 (0x01) */
char buf[]; /* String content + '\0' */
};
/* sdshdr16: max string length = 65,535 bytes (64 KB) */
struct __attribute__ ((__packed__)) sdshdr16 {
uint16_t len;
uint16_t alloc;
unsigned char flags; /* SDS_TYPE_16 (0x02) */
char buf[];
};
/* sdshdr32: max string length = 4,294,967,295 bytes (~4 GB) */
struct __attribute__ ((__packed__)) sdshdr32 {
uint32_t len;
uint32_t alloc;
unsigned char flags; /* SDS_TYPE_32 (0x03) */
char buf[];
};
/* sdshdr64: max string length = 2^64 - 1 bytes (theoretical; RAM is the real limit) */
struct __attribute__ ((__packed__)) sdshdr64 {
uint64_t len;
uint64_t alloc;
unsigned char flags; /* SDS_TYPE_64 (0x04) */
char buf[];
};
5.2.1 __attribute__((packed)): Eliminating Alignment Padding
Without packed, the C compiler aligns struct fields to their natural alignment boundary:
/* Without packed: sdshdr8 layout on a 64-bit compiler */
struct sdshdr8_unaligned {
uint8_t len; /* offset 0 (1 byte) */
/* 1 byte padding to align alloc to 2-byte boundary */
uint8_t alloc; /* offset 2 (1 byte) */
/* 1 byte padding */
unsigned char flags; /* offset 4 (1 byte) */
/* padding to next alignment boundary */
char buf[]; /* offset varies */
};
/* sizeof: potentially 6+ bytes */
/* With packed: sdshdr8 actual layout */
struct sdshdr8_packed {
uint8_t len; /* offset 0 */
uint8_t alloc; /* offset 1 */
unsigned char flags; /* offset 2 */
char buf[]; /* offset 3 */
};
/* sizeof: 3 bytes (zero waste) */
For sdshdr8, packed saves 3 bytes per header. At 100 million short strings, this is 300 MB of RAM savings.
Performance trade-off: Unaligned memory access is slower on some architectures (particularly older ARM and MIPS). On modern x86/x86_64 with hardware alignment handling, the penalty is negligible (the CPU silently issues a second fetch for crossing a cache line boundary). Redis's primary deployment target is x86_64 Linux servers, making this trade-off acceptable.
5.2.2 SDS Type Selection Logic
/* src/sds.c */
static inline char sdsReqType(size_t string_size) {
if (string_size < 1<<5) return SDS_TYPE_5; /* < 32 bytes */
if (string_size < 1<<8) return SDS_TYPE_8; /* < 256 bytes */
if (string_size < 1<<16) return SDS_TYPE_16; /* < 65,536 bytes */
#if (LONG_MAX == LLONG_MAX)
if (string_size < 1ll<<32) return SDS_TYPE_32; /* < 4 GB (64-bit OS) */
#endif
return SDS_TYPE_64;
}
Header sizes by type:
| SDS Type | Header Size | Max String Length | Notes |
|---|---|---|---|
| sdshdr5 | 1 byte | 31 bytes | No alloc field; rarely used for values |
| sdshdr8 | 3 bytes | 255 bytes | Used by embstr for strings ≤ 44 bytes |
| sdshdr16 | 5 bytes | 65,535 bytes (64 KB) | |
| sdshdr32 | 9 bytes | 4,294,967,295 bytes (4 GB) | |
| sdshdr64 | 17 bytes | Effectively unlimited | RAM is the real constraint |
Practical note: sdshdr5 is used in Redis primarily for key names (which are typically short). It has no alloc field, making it unsuitable for values that might grow. sdshdr8 is the workhorse for most short values.
5.2.3 The flags Field: Dual Purpose
sdshdr5 flags byte:
Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 | Bit 2 Bit 1 Bit 0
───────────────────────────────────────────────────────────
String length (5 bits) | SDS type (3 bits)
0 to 31 | 000 = SDS_TYPE_5
Other sdshdr* flags byte:
Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 | Bit 2 Bit 1 Bit 0
───────────────────────────────────────────────────────────
Reserved (0) | SDS type (3 bits)
| 001 = SDS_TYPE_8
| 010 = SDS_TYPE_16
| 011 = SDS_TYPE_32
| 100 = SDS_TYPE_64
The SDS pointer (of type sds, which is typedef char *) points to buf[0]. The flags byte is always at buf[-1]. This placement allows the SDS type to be determined from any sds pointer with a single byte read:
/* Determine SDS type from any sds pointer */
#define SDS_TYPE_MASK 7
unsigned char flags = s[-1];
int type = flags & SDS_TYPE_MASK; /* Mask off upper bits */
/* Access the header of the correct type */
#define SDS_HDR(T, s) ((struct sdshdr##T *)((s) - (sizeof(struct sdshdr##T))))
/* Example: get len for sdshdr8 */
uint8_t len = SDS_HDR(8, s)->len; /* s - 3 bytes = start of sdshdr8 */
5.3 Complete Memory Layout Diagram
Storing the string "Redis" (5 bytes) as an sdshdr8:
Memory address Hex Meaning
──────────────────────────────────────────────────────────────────
ptr - 3 0x05 sdshdr8.len = 5 (string length)
ptr - 2 0x05 sdshdr8.alloc = 5 (allocated capacity)
ptr - 1 0x01 sdshdr8.flags: bits[2:0] = 001 = SDS_TYPE_8
ptr + 0 0x52 buf[0] = 'R'
ptr + 1 0x65 buf[1] = 'e'
ptr + 2 0x64 buf[2] = 'd'
ptr + 3 0x69 buf[3] = 'i'
ptr + 4 0x73 buf[4] = 's'
ptr + 5 0x00 buf[5] = '\0' (NUL terminator, not in len)
──────────────────────────────────────────────────────────────────
Hex dump of the entire sdshdr8 structure:
Offset: 00 01 02 03 04 05 06 07 08
Value: 05 05 01 52 65 64 69 73 00
─────── ─ ───────────── ──
hdr flg "Redis" \0
The sds variable (the pointer returned by sdsnew()) points to offset 3 (the start of buf). This makes SDS pass-by-pointer compatible with C string functions — printf("%s", sds_ptr) works correctly because buf is null-terminated.
Accessing the header from buf:
sds s = sdsnew("Redis");
/* s points to buf[0] */
printf("%s\n", s); /* Works: C string semantics */
printf("%d\n", sdslen(s)); /* 5: O(1) length */
/* Navigate to header via negative offset */
struct sdshdr8 *hdr = (struct sdshdr8 *)(s - 3);
/* or equivalently: */
struct sdshdr8 *hdr = SDS_HDR(8, s);
printf("len=%d alloc=%d\n", hdr->len, hdr->alloc); /* len=5 alloc=5 */
5.4 Space Preallocation
When SDS needs to grow (e.g., APPEND command), it allocates more than the minimum required — called space preallocation:
/* src/sds.c — simplified preallocation logic */
sds sdsMakeRoomFor(sds s, size_t addlen) {
size_t avail = sdsavail(s);
if (avail >= addlen) return s; /* Already sufficient */
size_t len = sdslen(s);
size_t newlen = len + addlen;
if (newlen < SDS_MAX_PREALLOC) {
newlen *= 2; /* Double: less than 1 MB */
} else {
newlen += SDS_MAX_PREALLOC; /* Add 1 MB: 1 MB or larger */
}
/* SDS_MAX_PREALLOC = 1 << 20 = 1,048,576 bytes */
char type = sdsReqType(newlen);
/* ... realloc, possibly upgrade type (e.g., sdshdr8 → sdshdr16) ... */
return newsh->buf;
}
Effect on a growing string (100-byte segments appended repeatedly):
| State | len |
alloc |
Realloc? |
|---|---|---|---|
| Initial | 0 | 0 | — |
| After 1st APPEND (100 bytes) | 100 | 200 | Yes |
| After 2nd APPEND (100 bytes) | 200 | 200 | No (avail = 100) |
| After 3rd APPEND (100 bytes) | 300 | 600 | Yes |
| After 4th APPEND (100 bytes) | 400 | 600 | No (avail = 200) |
| After 5th APPEND (100 bytes) | 500 | 600 | No (avail = 100) |
| After 6th APPEND (100 bytes) | 600 | 600 | No (avail = 0) |
| After 7th APPEND (100 bytes) | 700 | 1400 | Yes |
Without preallocation: 7 reallocs for 7 appends. With preallocation: 3 reallocs. The savings ratio improves as the string grows and doubling allocates larger chunks.
The 1 MB cap rationale: Doubling a 500 MB string would allocate 1 GB of additional memory — completely unreasonable. The cap at 1 MB means large strings grow in 1 MB increments, which is predictable and bounded. In practice, any value approaching 1 MB in Redis represents a design smell and should be refactored.
Practical verification:
SET mystring ""
DEBUG OBJECT mystring
# serializedlength:0 encoding:embstr
# Append enough to trigger RAW encoding and watch alloc grow
for i in {1..5}; do redis-cli APPEND mystring "$(python3 -c "print('x'*100, end='')") "; done
DEBUG OBJECT mystring
# You can observe the encoding change and length growth
STRLEN mystring # → 505 (5 * 101 bytes)
5.5 Lazy Space Release
When a string is shortened — by SETRANGE, by a Lua script overwriting with a shorter value, or by explicit SDSRANGE — SDS does not immediately release the freed space:
/* src/sds.c */
void sdssetlen(sds s, size_t newlen) {
unsigned char flags = s[-1];
switch(flags & SDS_TYPE_MASK) {
case SDS_TYPE_8:
SDS_HDR(8, s)->len = newlen; /* Reduce len */
/* alloc is NOT changed — free space is retained */
break;
case SDS_TYPE_16:
SDS_HDR(16, s)->len = newlen;
break;
/* ... */
}
s[newlen] = '\0'; /* Place NUL at new end position */
}
After sdssetlen(s, 3) on a string with len=10, alloc=20:
lenbecomes 3allocremains 20sdsavail()returns 17 (20 - 3)- The 7 bytes between position 3 and 10 are logically gone but physically present
Benefit: If the string grows again (APPEND within the same size range), the existing allocation is reused immediately. For strings that oscillate in size (common in protocol parsing buffers), lazy release prevents realloc oscillation.
Explicit release: When Redis needs to reclaim memory (e.g., before saving an RDB snapshot, or when resizing internal buffers), it calls sdsRemoveFreeSpace():
/* src/sds.c */
sds sdsRemoveFreeSpace(sds s, int would_regrow) {
void *sh;
char type, oldtype = s[-1] & SDS_TYPE_MASK;
int hdrlen, oldhdrlen = sdsHdrSize(oldtype);
size_t len = sdslen(s);
size_t avail = sdsavail(s);
if (avail == 0) return s; /* Nothing to release */
/* Possibly downgrade type: if len now fits in a smaller header */
type = sdsReqType(len);
hdrlen = sdsHdrSize(type);
/* Realloc to exact size: hdrlen + len + 1 (NUL) */
/* ... */
}
5.6 Binary Safety
The len field makes SDS binary-safe by construction. The string is delimited by len, not by the first \0 byte:
/* Binary-safe string comparison */
int sdscmp(const sds s1, const sds s2) {
size_t l1 = sdslen(s1);
size_t l2 = sdslen(s2);
size_t minlen = (l1 < l2) ? l1 : l2;
int cmp = memcmp(s1, s2, minlen); /* memcmp, not strcmp */
/* strcmp stops at '\0'; memcmp compares all minlen bytes */
if (cmp == 0) return (l1 > l2) ? 1 : (l1 < l2) ? -1 : 0;
return cmp;
}
memcmp compares exactly minlen bytes regardless of \0 bytes — the core of binary-safe comparison. strcmp would give incorrect results for strings containing embedded nulls.
Practical binary storage:
# Store a TLS certificate (binary DER format)
# In production, use base64 or store as a file reference — but SDS can handle raw binary
redis-cli SET cert "$(cat certificate.der | base64 -d)"
# Store arbitrary bytes including 0x00
redis-cli SET sensor:reading $'\x02\xFF\x00\x1A\x3B\x00\xC4'
redis-cli STRLEN sensor:reading # → 7 (counts all bytes)
redis-cli GETRANGE sensor:reading 2 3 # → \x00\x1A (bytes at offset 2-3)
5.7 SDS and C String Interoperability
SDS maintains a '\0' byte at buf[len] (not counted in len) specifically to allow SDS strings to be passed to C library functions:
/* sds is typedef'd as char* */
typedef char *sds;
sds s = sdsnew("Hello, World");
/* These all work correctly for non-binary SDS strings: */
printf("%s\n", s); /* Prints via NUL terminator */
puts(s); /* Prints via NUL terminator */
fprintf(log_file, s); /* Writes via NUL terminator */
fwrite(s, 1, sdslen(s), f);/* Binary-safe: uses len explicitly */
/* Correct internal Redis usage (always uses sdslen): */
size_t n = sdslen(s); /* O(1) */
The compatibility guarantee: Since buf always has '\0' at position len, any C function that treats SDS as a C string will get correct behavior for strings that do not contain embedded nulls. For binary data, always use sdslen() and memcpy()/fwrite() rather than C string functions.
5.8 The embstr / raw Split: Deep Implementation Analysis
5.8.1 Why 44 Bytes Exactly?
The 44-byte threshold for embstr is derived from three constants:
jemalloc size class that encompasses the combined structure: 64 bytes
Budget allocation:
sizeof(robj) = 16 bytes (type:4 + encoding:4 + lru:24 + refcount + ptr)
sizeof(sdshdr8) = 3 bytes (len + alloc + flags, each 1 byte, packed)
NUL terminator = 1 byte
─────────────────────────────
Total overhead = 20 bytes
Remaining for data = 64 - 20 = 44 bytes
The 64-byte jemalloc size class is the critical constant. jemalloc's small size classes go: 8, 16, 32, 48, 64, 80, 96... bytes. A 44-byte string requires 20 + 44 + 1 = 65 bytes, which rounds up to jemalloc's 80-byte class — but Redis sets the threshold at 44, not 43, because the 64-byte allocation (for strings ≤ 44 bytes) fits exactly in one CPU cache line.
Verification:
/* src/object.c */
#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
robj *createStringObject(const char *ptr, size_t len) {
if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT)
return createEmbeddedStringObject(ptr, len);
else
return createRawStringObject(ptr, len);
}
# Verify the boundary behavior
python3 -c "import subprocess; [subprocess.run(['redis-cli', 'SET', f'k{n}', 'x'*n]) for n in [43,44,45]]"
redis-cli OBJECT ENCODING k43 # → embstr
redis-cli OBJECT ENCODING k44 # → embstr
redis-cli OBJECT ENCODING k45 # → raw
5.8.2 embstr Allocation: One malloc Call
/* src/object.c */
robj *createEmbeddedStringObject(const char *ptr, size_t len) {
robj *o = zmalloc(sizeof(robj) + sizeof(struct sdshdr8) + len + 1);
/* ─────────────────────────────────────────────────────
16 bytes 3 bytes len bytes 1 byte(\0)
robj sdshdr8 header content NUL
*/
struct sdshdr8 *sh = (void *)(o + 1); /* sdshdr8 starts right after robj */
o->type = OBJ_STRING;
o->encoding = OBJ_ENCODING_EMBSTR;
o->ptr = sh + 1; /* ptr points to sh->buf (the string content) */
o->refcount = 1;
o->lru = LRU_CLOCK();
sh->len = len;
sh->alloc = len; /* alloc == len: embstr never has extra space */
sh->flags = SDS_TYPE_8;
if (ptr) {
memcpy(sh->buf, ptr, len);
}
sh->buf[len] = '\0';
return o;
}
Memory layout for "hello" (5 bytes) as embstr:
Single allocation of 25 bytes, rounded up to 32 by jemalloc:
Byte 0: type:4=0(STRING), encoding:4=8(EMBSTR), lru:24=... ⎫ robj
Byte 4: refcount = 1 ⎪ (16 bytes)
Byte 8: ptr → &byte[19] ⎭
Byte 16: len = 5 ⎫ sdshdr8
Byte 17: alloc = 5 ⎪ (3 bytes)
Byte 18: flags = 0x01 (SDS_TYPE_8) ⎭
Byte 19: 'h' ⎫ buf[]
Byte 20: 'e' ⎪ (6 bytes)
Byte 21: 'l' ⎪
Byte 22: 'l' ⎪
Byte 23: 'o' ⎪
Byte 24: '\0' ⎭
(Bytes 25-31: jemalloc internal metadata / padding)
Both the robj and sdshdr8 fit within a single 64-byte cache line. Accessing o->ptr to get the string content, and sdslen(o->ptr) to get its length, both resolve to data in the same cache line — zero additional cache misses after the initial load.
5.8.3 The embstr → raw Conversion Path
embstr strings are effectively immutable. Redis converts them to raw on any modification:
/* src/t_string.c — APPEND command */
void appendCommand(client *c) {
long long totlen;
robj *append;
robj *o = lookupKeyWrite(c->db, c->argv[1]);
if (o == NULL) {
o = createRawStringObject(c->argv[2]->ptr, sdslen(c->argv[2]->ptr));
dbAdd(c->db, c->argv[1], o);
} else {
if (checkStringLength(c, curlen, applen) != C_OK) return;
/* This call converts embstr or INT to raw before modifying */
o = dbUnshareStringValue(c->db, c->argv[1], o);
/* Now safe to modify: o->encoding is RAW */
o->ptr = sdscatlen(o->ptr, append, applen);
}
// ...
}
/* src/db.c */
robj *dbUnshareStringValue(redisDb *db, robj *key, robj *o) {
if (o->refcount != 1 || o->encoding == OBJ_ENCODING_EMBSTR) {
robj *decoded = getDecodedObject(o);
/* getDecodedObject converts INT → string, EMBSTR → RAW */
o = createRawStringObject(decoded->ptr, sdslen(decoded->ptr));
decrRefCount(decoded);
dbOverwrite(db, key, o);
}
return o;
}
SET mykey "hello"
OBJECT ENCODING mykey # → embstr (5 bytes ≤ 44)
APPEND mykey " world" # dbUnshareStringValue called → converts to raw
OBJECT ENCODING mykey # → raw
# "hello world" is 11 bytes — still < 44, but encoding stays raw
# Once converted to raw, never reverts to embstr
STRLEN mykey # → 11
This is by design: dbUnshareStringValue converts embstr to raw once and for all. Tracking whether the modified string would fit back in embstr would add complexity for marginal benefit — the conversion cost (one malloc) is a one-time operation.
5.9 SDS API Reference
| Function | Purpose | Complexity |
|---|---|---|
sdsnew(s) |
Create SDS from C string | O(n) |
sdsnewlen(s, len) |
Create SDS from byte array + explicit length | O(n) |
sdsempty() |
Create empty SDS | O(1) |
sdslen(s) |
Get string length | O(1) |
sdsavail(s) |
Get unused allocated space | O(1) |
sdsdup(s) |
Duplicate SDS | O(n) |
sdsfree(s) |
Free SDS (includes header) | O(1) |
sdscpy(s, t) |
Copy C string into SDS (resizes if needed) | O(n) |
sdscat(s, t) |
Append C string (resizes if needed) | O(n) |
sdscatsds(s, t) |
Append SDS to SDS | O(n) |
sdscatlen(s, t, len) |
Append binary data of explicit length | O(n) |
sdscmp(s1, s2) |
Binary-safe comparison | O(n) |
sdsrange(s, start, end) |
In-place substring (like Python slice) | O(n) |
sdstrim(s, cset) |
Strip leading/trailing chars from set | O(m×n) |
sdsMakeRoomFor(s, n) |
Ensure n bytes of additional capacity | O(n) if realloc |
sdsRemoveFreeSpace(s, r) |
Release unused allocated space | O(n) realloc |
sdsIncrLen(s, n) |
Advance len by n (after manual write into buf) | O(1) |
5.10 SDS Usage Throughout Redis Internals
SDS is not just for user-facing string values. It permeates Redis's internal implementation:
| Context | Usage |
|---|---|
| Key names | Every key in the main dict is an SDS |
| Hash fields and values | SDS in HT encoding; embedded in listpack in listpack encoding |
| ZSet members | SDS in skiplist nodes |
| Set members | SDS in HT encoding (non-integer members) |
client.querybuf |
Incoming network data buffer (SDS) |
client.buf |
Outgoing response buffer (fixed array, not SDS) |
server.aof_buf |
AOF write buffer — grows as commands are buffered |
server.aof_rewrite_buf_blocks |
AOF rewrite background buffer |
| Lua script responses | Intermediate string results |
The AOF buffer deserves special attention:
/* src/aof.c — command appended to AOF buffer after execution */
void feedAppendOnlyFile(struct redisCommand *cmd, int dictid, robj **argv, int argc) {
sds buf = sdsempty();
// ... format command into RESP protocol ...
buf = sdscatlen(buf, cmd_str, cmd_len);
server.aof_buf = sdscatlen(server.aof_buf, buf, sdslen(buf));
/* server.aof_buf grows with each command; flushed to disk periodically */
}
The AOF buffer can grow to many megabytes between flushes. SDS's space preallocation reduces the realloc frequency even for this internal buffer — the same mechanism that benefits user-facing APPEND operations.
5.11 Common SDS-Related Performance Pitfalls
5.11.1 Repeated APPEND to Large Strings
# Anti-pattern: building a large log in a single key
for i in {1..100000}; do
redis-cli APPEND log:daily "$(date): event $i\n"
done
# When log:daily reaches 10 MB, each APPEND involves:
# 1. Lock the string
# 2. sdsMakeRoomFor: realloc (copies 10 MB of data)
# 3. Append new content
# 4. Total: O(N) per append on large string
Solution: Shard the log across multiple keys (by time bucket, sequence number, or hash), or use a Redis List/Stream, which grows via linked structure addition (O(1) append):
# Better: use a List (O(1) LPUSH/RPUSH)
RPUSH log:daily "$(date): event $i"
# Or Stream for time-series with consumer groups
XADD log:stream * msg "event $i" ts "$(date +%s)"
5.11.2 Unintentional embstr → raw Conversion
SET counter 0 # INT encoding (shared integer)
INCR counter # → 1 (still INT)
OBJECT ENCODING counter # → int
APPEND counter "" # Empty append still triggers INT → embstr → raw conversion!
OBJECT ENCODING counter # → raw
# From this point, counter is no longer INT-encoded
# Memory: 16 bytes (raw robj) + 4 bytes (sdshdr8) + 2 bytes ("1\0") ≈ 32 bytes
# vs. INT encoding: 16 bytes (robj, no SDS allocation)
Rule: Never mix APPEND with INCR/DECR on the same key. If you need both string manipulation and numeric operations, store the counter as a separate key.
5.11.3 Large Value Key Scan
# Checking encoding of all keys (dangerous on large datasets)
for key in $(redis-cli KEYS "*"); do
redis-cli OBJECT ENCODING "$key"
done
# KEYS * is O(N) and blocks Redis; never use in production
# Use SCAN for safe iteration:
redis-cli --scan --pattern "*" | while read key; do
echo "$key: $(redis-cli OBJECT ENCODING "$key")"
done
5.12 Summary
SDS addresses every fundamental shortcoming of C strings in a way that is both memory-efficient and practical:
| Problem | C String | SDS |
|---|---|---|
| Length retrieval | O(n) strlen scan | O(1) len field read |
| Buffer overflow | Undefined behavior | Checked by sdsMakeRoomFor |
| Frequent realloc | Every modification | Amortized via preallocation |
| Binary data with 0x00 | Impossible | Native via len field |
| Unsafe stdlib functions | Everywhere | Avoided entirely |
The five SDS header types (sdshdr5 through sdshdr64) each use __attribute__((packed)) to eliminate alignment padding, trading microscopic access penalties for significant memory savings at scale.
The embstr/raw split reflects a careful analysis of jemalloc's size classes: a 44-byte string with its 3-byte sdshdr8 header and 16-byte robj wrapper fits exactly in a 64-byte allocation — one cache line, one malloc. This is not accidental; it represents the kind of deliberate system-level thinking that makes Redis code worth studying.
In the next chapter, we examine listpack — the data structure that replaced ziplist in Redis 7.0 and serves as the compact encoding for small Hash, List, and ZSet values.