Chapter 28

Database Sharding in Practice

MySQL Sharding Complete Guide

Sharding is MySQL's ultimate horizontal scalability solution and one of the most complex architectural decisions. This chapter covers when to shard, shard key design, middleware selection, cross-shard queries, distributed transactions, and zero-downtime data migration.

Architecture principle: Sharding is a last resort, not a first choice. Before sharding, exhaust these options: index optimization, read/write splitting, caching, and vertical database splitting (microservices). Premature sharding introduces catastrophic complexity.

1. When to Shard?

1.1 Single-node MySQL limits

A high-end server (32-core, 256GB RAM, NVMe SSD) can handle roughly:

Metric Rough Limit Notes
QPS (simple queries) ~50,000 Indexed point lookups
Write TPS ~5,000–10,000 fsync'd transaction writes
Rows per table 100M–500M Watch query performance above 20M
Data per instance 1–5 TB Degrades once data exceeds Buffer Pool

1.2 Shard decision tree

Table > 20M rows? ├── No → optimize indexes first └── Yes → Are queries slow? ├── No → keep watching (50M may still be fine) └── Yes → Have you tried: ├── Better indexes / query rewriting → do first ├── Larger Buffer Pool → do first ├── Read/write splitting (replica) → do first ├── Vertical split (microservices DB) → do first └── All done and still not enough → consider sharding

2. Split Types

2.1 Vertical splitting (do this first)

Move different business tables to different databases — typically mirrors microservice boundaries. Reduces coupling, no complex shard routing needed.

2.2 Horizontal sharding

Distribute rows of the same table across multiple databases/tables using a shard key. Each shard holds a partition of the data.

3. Shard Key Design

3.1 Selection principles

  1. High query routing rate: 80%+ of queries must carry this field, or you'll scatter-gather every query
  2. Uniform data distribution: avoid hotspots (one shard much larger than others)
  3. Natural business clustering: keep one user's data on one shard to enable single-shard joins
  4. Immutable: changing a shard key value means migrating data across shards — extremely costly

3.2 Sharding strategy comparison

Strategy Pros Cons
Hash modulo Even distribution, simple Re-sharding requires rebalancing all data
Consistent hashing Only migrate ~1/N data when adding shards Complex, potential imbalance
Range Easy to add new shards Hotspot on latest range
Time range Natural archival Write hotspot on current period
Lookup table Flexible routing Need to maintain mapping table

4. Global Unique IDs

AUTO_INCREMENT is only unique within a single database. Options:

5. Middleware Selection

Scenario Recommendation
Java stack, mid-scale ShardingSphere JDBC mode
Multi-language, proxy needed ShardingSphere Proxy
Kubernetes, large-scale Vitess
Cloud SaaS PlanetScale (managed Vitess)
Go, simple routing Custom per-shard connection pool routing

6. Cross-Shard Query Challenges

7. Distributed Transactions

Approach Consistency Performance Recommended For
XA 2-Phase Commit Strong Low Finance, strict consistency
Seata AT Eventual Medium Java stack, low intrusion
TCC Eventual High High concurrency with compensation
Outbox Pattern Eventual High Most practical

8. Zero-Downtime Migration

Dual-write migration (recommended):

  1. Build new sharded cluster
  2. Enable dual-write (write to both old and new, read from old)
  3. Backfill historical data in batches (rate-limited to avoid impacting production)
  4. Validate consistency between old and new
  5. Gradually shift read traffic to new cluster, monitor
  6. Atomically switch write traffic; keep old DB read-only as rollback option
  7. Decommission old DB after stability confirmed
Rate this chapter
4.5  / 5  (4 ratings)

💬 Comments