Chapter 13

HDD vs SSD: What's the Real Difference

A hard disk drive is like a spinning vinyl record—except instead of music, it stores your files. An SSD has no moving parts at all; it's more like a giant USB drive. Both can hold terabytes of data, but they work in fundamentally different ways—and those differences are exactly why opening a large file on an SSD feels instant while an HDD makes you wait for the spinning circle.

Core Concepts

Hard Disk Drives (HDD): The Spinning Platter

A mechanical hard drive stacks multiple magnetic platters on a shared spindle, like a set of records:

HDD internal structure:

         Spindle motor (7200 RPM)
              ↓
   ┌──────────────────────────────┐
   │  ●────────────────────────●  │ ← magnetic platter (multiple stacked)
   │  ●────────────────────────●  │
   │  ●────────────────────────●  │
   └──────────────────────────────┘
           ↑
     Actuator arm
     One read/write head per platter surface
     Head floats 3–5 nm above the platter surface
     (1/20,000th the diameter of a human hair)

Data is written on concentric circular tracks, which are divided into sectors of 512 bytes or 4 KB each.

Reading or writing any piece of data requires three steps:

Seek                    Rotational latency              Transfer
────────────────────────────────────────────────────────────────
Move head to right track   Wait for sector to spin   Data flows from
                           under the head            platter to head
~3–10 ms                   ~4 ms avg at 7200 RPM     ~0.1 ms/sector

This is the fundamental reason HDDs are slow at random I/O: every random access requires both a seek and a rotational wait, averaging 5–15 ms. Sequential access is different—the head doesn't need to move, data just flows continuously, reaching 150–200 MB/s.

HDD typical performance (7200 RPM desktop drive):
┌────────────────────────┬──────────────┐
│ Sequential read/write  │ 150–200 MB/s │
│ Random read (4 KB)     │ ~0.5 MB/s    │  ← 300–400× slower!
│ Average seek time      │ 8–12 ms      │
│ IOPS (4 KB random)     │ ~100         │
└────────────────────────┴──────────────┘
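
The figures in this table follow from the three-step model above. Here is a minimal sketch of that arithmetic, assuming an 8 ms average seek (the low end of the range in the table) and the ~0.1 ms transfer time from the diagram. The function names are invented for this example.

```python
def avg_rotational_latency_ms(rpm):
    """On average the platter turns half a revolution before the sector arrives."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2

def random_iops(seek_ms, rpm, transfer_ms=0.1):
    """One random access = seek + rotational wait + transfer."""
    access_ms = seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms
    return 1000 / access_ms

SEEK_MS = 8.0   # assumed average seek time, for illustration only
RPM = 7200

rot = avg_rotational_latency_ms(RPM)
iops = random_iops(SEEK_MS, RPM)
throughput_mb_s = iops * 4096 / 1_000_000   # each access fetches 4 KB

print(f"Rotational latency: {rot:.2f} ms")
print(f"Random 4 KB IOPS:   {iops:.0f}")
print(f"Random throughput:  {throughput_mb_s:.2f} MB/s")
```

With these inputs the result lands a little under 100 IOPS and well under 1 MB/s, the same order of magnitude as the table. A shorter assumed seek time pushes it closer to the ~100 IOPS figure.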

SSDs: Flash Memory Storage

An SSD has no moving parts. Its core storage element is NAND Flash, based on the floating gate transistor:

Floating gate transistor structure:

     Control gate
          │
    ──────┼──────
     Floating gate  ← completely surrounded by oxide; charge is trapped here
    ──────┼──────
          │
         Channel
  Source               Drain

How it stores data:
  Floating gate has charge → high threshold voltage → reads as "0" (programmed)
  Floating gate has no charge → low threshold voltage → reads as "1" (erased)

Writing injects electrons into the floating gate through quantum tunneling by applying a high voltage to the control gate. Erasing pulls the electrons back out, again with high voltage.

SLC / MLC / TLC / QLC: How Many Bits Per Cell

The amount of charge trapped in the floating gate can be precisely controlled, allowing each cell to represent multiple bits by distinguishing multiple charge levels:

┌──────┬───────────┬────────────────┬─────────┬────────────────────────┬───────────────────────────────┐
│ Type │ Bits/cell │ Voltage levels │ Speed   │ Endurance (P/E cycles) │ Typical use                   │
├──────┼───────────┼────────────────┼─────────┼────────────────────────┼───────────────────────────────┤
│ SLC  │ 1         │ 2              │ Fastest │ Longest (~100K)        │ Enterprise SSD                │
│ MLC  │ 2         │ 4              │ Fast    │ Long (~10K)            │ High-end consumer             │
│ TLC  │ 3         │ 8              │ Medium  │ Medium (~3K)           │ Mainstream SSDs (most drives) │
│ QLC  │ 4         │ 16             │ Slower  │ Shorter (~1K)          │ High-capacity/budget drives   │
└──────┴───────────┴────────────────┴─────────┴────────────────────────┴───────────────────────────────┘

Lifespan is measured in P/E cycles (Program/Erase): one full write-then-erase counts as one cycle. Cells that store more bits have smaller voltage margins between states, are more sensitive to the stress of high-voltage programming, and wear out sooner.
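
The shrinking margin is easy to see with a little arithmetic. The sketch below spreads the charge levels evenly across a fixed threshold-voltage window. The 6 V window is a hypothetical round number chosen for illustration, not a figure for any real NAND part, and real levels are not spaced perfectly evenly.

```python
def voltage_levels(bits_per_cell):
    """Each extra bit doubles the number of charge levels to distinguish."""
    return 2 ** bits_per_cell

WINDOW_V = 6.0  # hypothetical usable threshold-voltage window, illustration only

for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
    levels = voltage_levels(bits)
    # Spacing between adjacent levels if they are spread evenly across the window
    spacing = WINDOW_V / (levels - 1)
    print(f"{name}: {levels:2d} levels, ~{spacing:.2f} V between neighbors")
```

Going from SLC to QLC cuts the spacing between neighboring levels by a factor of 15 under this assumption, so a much smaller drift in stored charge is enough to misread a cell.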

SSD vs HDD Performance Comparison

┌─────────────────────────┬──────────────────┬────────────────────┐
│                         │ HDD (7200 RPM)   │ NVMe SSD           │
├─────────────────────────┼──────────────────┼────────────────────┤
│ Sequential read         │ 150–200 MB/s     │ 3500–7000 MB/s     │
│ Sequential write        │ 120–160 MB/s     │ 3000–7000 MB/s     │
│ Random read (4 KB)      │ ~0.5 MB/s        │ 2000–4000 MB/s     │
│ Random read IOPS (4 KB) │ ~100             │ 500,000–1,000,000  │
│ Access latency          │ 5–15 ms          │ 0.05–0.1 ms        │
│ Noise                   │ Yes (motor spin) │ None               │
│ Shock resistance        │ Poor             │ Excellent          │
│ Price per GB            │ ~$0.02–0.03      │ ~$0.08–0.15        │
└─────────────────────────┴──────────────────┴────────────────────┘
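
The latency and IOPS rows are linked by queue depth. With a single request in flight, a 0.1 ms latency allows only about 10,000 IOPS. The six-figure numbers require many requests outstanding at once, which is Little's law (throughput = concurrency / latency). Below is a rough sketch, with queue depths picked purely for illustration. It assumes latency stays constant as depth grows, which real drives only approximate.

```python
def iops(queue_depth, latency_ms):
    """Little's law: completed requests per second = requests in flight / latency."""
    return queue_depth / (latency_ms / 1000)

for label, latency_ms, depths in [
    ("HDD", 10.0, [1]),
    ("NVMe SSD", 0.1, [1, 32, 64]),
]:
    for qd in depths:
        print(f"{label:8s} QD{qd:<3d} {iops(qd, latency_ms):>10,.0f} IOPS")
```

This is why SSD benchmarks quote IOPS at a stated queue depth: a workload that issues one small read at a time sees the latency advantage but not the headline IOPS figure.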

Wear Leveling and Write Amplification

Flash cells survive only a limited number of erase cycles. If every rewrite of the same logical address landed on the same physical cells, those cells would wear out long before the rest of the drive. SSD firmware therefore includes a Flash Translation Layer (FTL), which plays a role similar to virtual memory: it maps logical addresses to physical locations, and on each write it steers the data to the blocks with the most remaining life, spreading wear evenly across the drive. This is wear leveling.
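
The idea can be sketched with a toy mapping table. This is a deliberately simplified, hypothetical model (the class and its fields are invented here). It pretends each physical page can be erased on its own, which real NAND does not allow, and it ignores garbage collection entirely.

```python
class ToyFTL:
    """Toy flash translation layer: one logical page maps to one physical page.

    Hypothetical and heavily simplified: every physical page can be erased
    individually, which real NAND does not allow (erases happen per block).
    """

    def __init__(self, physical_pages):
        self.erase_counts = [0] * physical_pages   # wear per physical page
        self.free = set(range(physical_pages))     # pages available for writing
        self.mapping = {}                          # logical address -> physical page

    def write(self, logical_addr):
        # Wear leveling: pick the free page that has been erased the fewest times
        target = min(self.free, key=lambda p: self.erase_counts[p])
        self.free.remove(target)
        old = self.mapping.get(logical_addr)
        if old is not None:
            # The previous copy is now stale: erase it and return it to the pool
            self.erase_counts[old] += 1
            self.free.add(old)
        self.mapping[logical_addr] = target
        return target

ftl = ToyFTL(physical_pages=8)
for _ in range(800):
    ftl.write(logical_addr=0)      # hammer a single logical address

print(ftl.erase_counts)
```

Although the host rewrote one logical address 800 times, the erases end up spread almost evenly over all eight physical pages instead of landing on a single one.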

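Write amplification is a related phenomenon. A flash page cannot be overwritten in place: it must be erased first, and the minimum erase unit (an erase block, typically hundreds of kilobytes to a few megabytes) is far larger than the minimum write unit (a page, typically 4–16 KB). In the simplest scheme, updating just one 4 KB page within a block whose other pages still hold live data means reading the entire block into a buffer, modifying the 4 KB, erasing the block, and writing everything back. Real FTLs avoid doing this on every write by placing the new version in a fresh page and marking the old one invalid, but garbage collection must still copy the remaining live pages elsewhere before a block can be erased. Either way, the data actually written to flash exceeds what the host requested. The ratio of the two is the write amplification factor: the higher it is, the faster the drive wears. Good FTL design keeps it low.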
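
As a worked example, the write amplification factor for the simplest case described above can be computed directly. The 256 KB block size is just one illustrative value, and the function name is invented for this sketch.

```python
def write_amplification(host_bytes, flash_bytes):
    """WAF = bytes physically written to flash / bytes the host asked to write."""
    return flash_bytes / host_bytes

PAGE = 4 * 1024           # 4 KB page
BLOCK = 256 * 1024        # 256 KB erase block (illustrative size)

# Worst case: one page changes, the whole block is rewritten
worst = write_amplification(host_bytes=PAGE, flash_bytes=BLOCK)

# Best case: the host rewrites the whole block, so nothing extra is copied
best = write_amplification(host_bytes=BLOCK, flash_bytes=BLOCK)

print(f"Update 1 page in a full block: WAF = {worst:.0f}")
print(f"Rewrite the whole block:       WAF = {best:.0f}")
```

Real drives sit somewhere between these extremes, depending on the workload and on how much spare capacity the FTL has to work with.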

Try It Yourself

Check SSD health and P/E cycle count via SMART on Linux:

# Install smartmontools (Debian/Ubuntu)
sudo apt install smartmontools

# View SSD health data
sudo smartctl -a /dev/nvme0

# Key fields to look for:
# "Percentage Used" (NVMe)   or   Media_Wearout_Indicator (some SATA SSDs)
# Shows how much of the rated endurance has been consumed

# Measure sequential read/write speed
# Note: /tmp is RAM-backed (tmpfs) on many systems, where direct I/O may fail
# and the disk is never touched; if so, point the path at the drive under test.
# Write a 2 GB test file, bypassing the page cache (oflag=direct)
dd if=/dev/zero of=/tmp/test bs=1M count=2048 oflag=direct

# Clear the page cache, then read the file back
sync && sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
dd if=/tmp/test of=/dev/null bs=1M iflag=direct

# Remove the test file
rm /tmp/test
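
smartctl can also emit JSON (the `-j` flag, available in smartmontools 7.0 and later), which is easier to script against than the text report. Below is a sketch of pulling the wear figure out of that output. The field names should match what recent smartctl versions report for NVMe drives, but treat them as an assumption and check them against your own output. The sample JSON is hypothetical.

```python
import json

def nvme_wear(smart_json_text):
    """Return (percentage_used, terabytes_written) from `smartctl -j -a` output."""
    log = json.loads(smart_json_text)["nvme_smart_health_information_log"]
    # NVMe reports writes in "data units" of 1000 * 512 bytes each
    tb_written = log["data_units_written"] * 512_000 / 1e12
    return log["percentage_used"], tb_written

# Hypothetical sample, trimmed to the two fields used here
sample = '''
{"nvme_smart_health_information_log":
  {"percentage_used": 3, "data_units_written": 25000000}}
'''

used, tb = nvme_wear(sample)
print(f"Endurance used: {used}%  Written: {tb:.1f} TB")
```

To try it on a real drive, save the report with `sudo smartctl -j -a /dev/nvme0 > smart.json` and pass the file's contents to `nvme_wear`.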

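Measure sequential vs. random read time in Python:

import os, random, time

PATH = '/tmp/disktest'     # put this on the drive under test; /tmp is often RAM-backed
SIZE = 100 * 1024 * 1024   # 100 MB
BLOCK = 4096               # 4 KB

# Create the test file and force it out to the device
with open(PATH, 'wb') as f:
    f.write(os.urandom(SIZE))
    f.flush()
    os.fsync(f.fileno())

def drop_cache(path):
    # Ask the kernel to evict this file's cached pages (Linux; advisory only).
    # Without this, the reads below are served from RAM, not from the drive.
    fd = os.open(path, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)

# Sequential read of the whole file, 4 KB at a time
drop_cache(PATH)
with open(PATH, 'rb', buffering=0) as f:
    t = time.perf_counter()
    while f.read(BLOCK):
        pass
    print(f"Sequential 100 MB: {time.perf_counter()-t:.3f} s")

# 1000 random 4 KB reads (about 4 MB in total; the HDD worst case)
drop_cache(PATH)
positions = random.sample(range(0, SIZE - BLOCK, BLOCK), 1000)
with open(PATH, 'rb', buffering=0) as f:
    t = time.perf_counter()
    for pos in positions:
        f.seek(pos)
        f.read(BLOCK)
    print(f"Random 1000 reads: {time.perf_counter()-t:.3f} s")

os.unlink(PATH)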

🔬 Going Deeper

3D NAND: stacking layers vertically

Early flash memory arranged cells in a flat 2D grid on silicon. As manufacturers pushed cell density higher, cells got closer together, interference between them worsened, and endurance fell. In 2013, Samsung introduced V-NAND: instead of cramming cells closer together in 2D, they stacked them vertically in 3D layers—current mainstream drives stack 200+ layers. With larger cells and more room between them, interference is lower and endurance actually improves compared to late-generation 2D NAND. The trade-off is that each layer has slightly different electrical characteristics, demanding sophisticated signal processing in the FTL.

NVMe vs SATA: the interface sets the ceiling

Early SSDs used the SATA interface originally designed for HDDs, which caps throughput at 600 MB/s—nowhere near what flash can deliver. NVMe (Non-Volatile Memory Express) was designed from scratch for flash: it runs directly over the PCIe bus, has lower command overhead, and supports up to 65,535 queues with 65,535 commands each (versus SATA's single queue of 32 commands). Today's top NVMe SSDs on PCIe 5.0 deliver sequential reads exceeding 12 GB/s.
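
The interface ceilings can be derived from the link parameters. This sketch uses the published line rates and encodings for SATA III and PCIe. It ignores packet and protocol overhead, so real drives deliver somewhat less. The x4 link width is an assumption (it is the usual width for M.2 SSDs), and the function name is invented here.

```python
def link_mb_per_s(gigatransfers_per_s, lanes, payload_bits, encoded_bits):
    """Usable bandwidth of a serial link after line-encoding overhead."""
    raw_bits_per_s = gigatransfers_per_s * 1e9 * lanes
    return raw_bits_per_s * payload_bits / encoded_bits / 8 / 1e6

# SATA III: 6 Gb/s, one lane, 8b/10b encoding
sata3 = link_mb_per_s(6, 1, 8, 10)
# PCIe 4.0 and 5.0, x4 links, 128b/130b encoding
pcie4_x4 = link_mb_per_s(16, 4, 128, 130)
pcie5_x4 = link_mb_per_s(32, 4, 128, 130)

print(f"SATA III:    {sata3:7.0f} MB/s")
print(f"PCIe 4.0 x4: {pcie4_x4:7.0f} MB/s")
print(f"PCIe 5.0 x4: {pcie5_x4:7.0f} MB/s")

# Maximum commands that can be outstanding, per the figures above
print(f"SATA queue slots: {1 * 32:,}")
print(f"NVMe queue slots: {65_535 * 65_535:,}")
```

The SATA result reproduces the 600 MB/s cap, and the PCIe 5.0 x4 ceiling of roughly 15.7 GB/s leaves room for the 12 GB/s-class drives mentioned above.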

ZNS SSD: making flash last longer

Traditional SSDs hide the FTL from the host, so the host's write patterns often don't align with how flash erases blocks, and write amplification rises. ZNS (Zoned Namespaces, part of the NVMe specification) exposes the SSD as a set of zones that must each be written sequentially, aligning host write patterns with flash erase units. This can bring write amplification from well above 1 down toward 1, extending endurance and improving throughput. RocksDB (through the ZenFS plugin) and Ceph already have ZNS support, and it's increasingly common in data-center SSDs.
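
The sequential-write rule can be modeled with a single write pointer per zone. This is a toy illustration of the constraint only, not the NVMe command set. The class and method names are hypothetical.

```python
class Zone:
    """Toy model of one zone: writes must land exactly at the write pointer."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.write_pointer = 0          # next block that may be written

    def write(self, block_offset, num_blocks=1):
        if block_offset != self.write_pointer:
            raise ValueError("zone writes must be sequential")
        if self.write_pointer + num_blocks > self.capacity:
            raise ValueError("zone is full")
        self.write_pointer += num_blocks

    def reset(self):
        # The only way to reuse space: discard the whole zone at once,
        # which lets the drive erase its flash blocks without copying live data
        self.write_pointer = 0

zone = Zone(capacity_blocks=4)
zone.write(0)
zone.write(1, num_blocks=2)

try:
    zone.write(0)                       # overwrite in place: rejected
except ValueError as err:
    print("rejected:", err)

zone.reset()
zone.write(0)                           # allowed again after a reset
print("write pointer:", zone.write_pointer)
```

Because space is reclaimed only by resetting whole zones, the host decides what to discard, and the drive never has to relocate live pages on the host's behalf.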

Where to learn more

"The Internals of PostgreSQL" (free online book) — Chapter 8 analyzes how HDD vs SSD characteristics influence database I/O strategies, bridging theory and engineering.
Flash Memory Summit technical presentations (flashmemorysummit.com) — Annual reports on NAND technology advances; 3D NAND, ZNS, and CXL are covered firsthand as they emerge.
Systems Performance by Brendan Gregg — Chapter 9 covers disk I/O analysis with tools like iostat and biolatency, with a focus on measuring and diagnosing real-world I/O problems.
