Percona Toolkit in Practice
Percona Toolkit — Essential DBA Toolkit
Percona Toolkit is the most important open-source toolkit for MySQL DBAs, maintained by Percona with 30+ tools covering performance analysis, online DDL, data consistency verification, data archival, and more. This chapter covers the 5 most essential tools.
1. Installation
# macOS
brew install percona-toolkit
# RHEL/CentOS
yum install -y percona-toolkit
# Ubuntu/Debian
apt-get install -y percona-toolkit
# Verify
pt-query-digest --version
2. pt-query-digest — Slow Query Log Analysis
Analyzes MySQL slow query logs, General Log, and Binlog. Groups queries by fingerprint and produces aggregated statistics — the fastest way to find performance hotspots.
# Most useful: rank by total Query_time (identifies the biggest offenders)
pt-query-digest /var/log/mysql/slow.log \
--limit 20 --order-by Query_time:sum
# Filter by database
pt-query-digest /var/log/mysql/slow.log \
--filter '$event->{db} eq "mydb"' --limit 20
# Live capture via SHOW PROCESSLIST
pt-query-digest --processlist h=localhost,u=root,p=password \
--interval 1 --run-time 60
# Analyze binlog for write patterns
mysqlbinlog /var/lib/mysql/binlog.000001 | pt-query-digest --type binlog
Output interpretation: Look at Response time % (total time share) and R/Call (avg time per call). High total time = optimize first. High max time but low total = occasional rogue query.
3. pt-online-schema-change — Online DDL
Performs DDL changes on large tables without locking. Creates a shadow table, copies data in chunks, then swaps tables atomically.
# Add index to 100M-row table without locking
pt-online-schema-change \
--host=localhost -u root -p password \
--alter="ADD INDEX idx_created (created_at)" \
D=mydb,t=orders \
--chunk-size=5000 \
--max-load="Threads_running=25" \
--critical-load="Threads_running=50" \
--sleep=0.5 \
--execute
# Dry run first (no --execute)
pt-online-schema-change \
--alter="ADD INDEX idx_test (col1)" D=mydb,t=orders \
--dry-run
pt-osc vs gh-ost: gh-ost (GitHub, open-source) uses binlog instead of triggers — less primary pressure, safer for replicated environments. Recommended for large-scale production.
4. pt-table-checksum — Replication Consistency Verification
# Check primary/replica consistency
pt-table-checksum \
--host=primary_host -u root -p password \
--databases=mydb \
--replicate=percona.checksums
# Find inconsistent chunks
SELECT db, tbl, chunk, master_cnt, this_cnt
FROM percona.checksums
WHERE master_cnt != this_cnt OR master_crc != this_crc;
# Fix inconsistencies with pt-table-sync
pt-table-sync --execute --replicate=percona.checksums \
h=primary,u=root,p=pass h=replica
5. pt-archiver — Low-impact Data Archival
# Archive 2-year-old orders to history table (rate-limited)
pt-archiver \
--source h=primary,D=mydb,t=orders,u=root,p=pass \
--dest h=archive_db,D=archive,t=orders_history,u=root,p=pass \
--where "created_at < '2022-01-01'" \
--limit 1000 --sleep 0.1
# Purge old logs (delete only, no archive)
pt-archiver \
--source h=localhost,D=mydb,t=access_logs,u=root,p=pass \
--where "created_at < '2022-01-01'" \
--limit 5000 --sleep 0.05 --purge
6. pt-kill — Query Management
# Kill SELECT queries running over 60 seconds
pt-kill --host=localhost -u root -p pass \
--match-command Query --match-info "^SELECT" \
--busy-time 60 --kill --interval 5 --run-time 3600
7. Other Notable Tools
| Tool | Purpose |
|---|---|
| pt-duplicate-key-checker | Find duplicate/redundant indexes |
| pt-index-usage | Find unused indexes from General Log |
| pt-mysql-summary | Generate comprehensive MySQL diagnostic report |
| pt-config-diff | Compare configuration between two MySQL instances |
| pt-heartbeat | Precisely measure replication lag |
| pt-slave-delay | Intentionally delay replica (for mistake recovery) |