Chapter 48

Percona Toolkit in Practice

Percona Toolkit — Essential DBA Toolkit

Percona Toolkit is the most important open-source toolkit for MySQL DBAs, maintained by Percona with 30+ tools covering performance analysis, online DDL, data consistency verification, data archival, and more. This chapter covers the 5 most essential tools.

1. Installation

# macOS
brew install percona-toolkit

# RHEL/CentOS
yum install -y percona-toolkit

# Ubuntu/Debian
apt-get install -y percona-toolkit

# Verify
pt-query-digest --version

2. pt-query-digest — Slow Query Log Analysis

Analyzes MySQL slow query logs, General Log, and Binlog. Groups queries by fingerprint and produces aggregated statistics — the fastest way to find performance hotspots.

# Most useful: rank by total Query_time (identifies the biggest offenders)
pt-query-digest /var/log/mysql/slow.log \
  --limit 20 --order-by Query_time:sum

# Filter by database
pt-query-digest /var/log/mysql/slow.log \
  --filter '$event->{db} eq "mydb"' --limit 20

# Live capture via SHOW PROCESSLIST
pt-query-digest --processlist h=localhost,u=root,p=password \
  --interval 1 --run-time 60

# Analyze binlog for write patterns
mysqlbinlog /var/lib/mysql/binlog.000001 | pt-query-digest --type binlog

Output interpretation: Look at Response time % (total time share) and R/Call (avg time per call). High total time = optimize first. High max time but low total = occasional rogue query.

3. pt-online-schema-change — Online DDL

Performs DDL changes on large tables without locking. Creates a shadow table, copies data in chunks, then swaps tables atomically.

# Add index to 100M-row table without locking
pt-online-schema-change \
  --host=localhost -u root -p password \
  --alter="ADD INDEX idx_created (created_at)" \
  D=mydb,t=orders \
  --chunk-size=5000 \
  --max-load="Threads_running=25" \
  --critical-load="Threads_running=50" \
  --sleep=0.5 \
  --execute

# Dry run first (no --execute)
pt-online-schema-change \
  --alter="ADD INDEX idx_test (col1)" D=mydb,t=orders \
  --dry-run

pt-osc vs gh-ost: gh-ost (GitHub, open-source) uses binlog instead of triggers — less primary pressure, safer for replicated environments. Recommended for large-scale production.

4. pt-table-checksum — Replication Consistency Verification

# Check primary/replica consistency
pt-table-checksum \
  --host=primary_host -u root -p password \
  --databases=mydb \
  --replicate=percona.checksums

# Find inconsistent chunks
SELECT db, tbl, chunk, master_cnt, this_cnt
FROM percona.checksums
WHERE master_cnt != this_cnt OR master_crc != this_crc;

# Fix inconsistencies with pt-table-sync
pt-table-sync --execute --replicate=percona.checksums \
  h=primary,u=root,p=pass h=replica

5. pt-archiver — Low-impact Data Archival

# Archive 2-year-old orders to history table (rate-limited)
pt-archiver \
  --source h=primary,D=mydb,t=orders,u=root,p=pass \
  --dest   h=archive_db,D=archive,t=orders_history,u=root,p=pass \
  --where  "created_at < '2022-01-01'" \
  --limit 1000 --sleep 0.1

# Purge old logs (delete only, no archive)
pt-archiver \
  --source h=localhost,D=mydb,t=access_logs,u=root,p=pass \
  --where "created_at < '2022-01-01'" \
  --limit 5000 --sleep 0.05 --purge

6. pt-kill — Query Management

# Kill SELECT queries running over 60 seconds
pt-kill --host=localhost -u root -p pass \
  --match-command Query --match-info "^SELECT" \
  --busy-time 60 --kill --interval 5 --run-time 3600

7. Other Notable Tools

Tool Purpose
pt-duplicate-key-checker Find duplicate/redundant indexes
pt-index-usage Find unused indexes from General Log
pt-mysql-summary Generate comprehensive MySQL diagnostic report
pt-config-diff Compare configuration between two MySQL instances
pt-heartbeat Precisely measure replication lag
pt-slave-delay Intentionally delay replica (for mistake recovery)
Rate this chapter
4.6  / 5  (3 ratings)

💬 Comments