Chapter 22
Backup & Recovery
MySQL Backup and Recovery Complete Guide
Backup and recovery are critical operational capabilities. This guide covers backup strategies, tools, recovery procedures, and RPO/RTO planning.
1. Backup Types
1.1 Logical vs Physical Backups
LOGICAL BACKUPS (SQL dumps):
mysqldump:
├─ Exports as SQL statements
├─ CREATE TABLE, INSERT statements
├─ Human-readable, portable
└─ Slow for large databases (I/O intensive)
Usage:
mysqldump -u root -p --all-databases > backup.sql
mysqldump -u root -p mydb > mydb_backup.sql
mysqldump -u root -p mydb mytable > mytable_backup.sql
Restoration:
mysql -u root -p < backup.sql
mysql -u root -p mydb < mydb_backup.sql
Advantages:
├─ Easy to understand (just SQL)
├─ Version-independent (portable between versions)
├─ Can restore specific tables
└─ Can be version-controlled
Disadvantages:
├─ Very slow for large databases
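├─ Blocking by default (locks tables during dump; --single-transaction avoids this for InnoDB)
├─ Large file size (plain-text SQL, uncompressed unless piped through a compressor)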
└─ Long restore time
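The blocking drawback can usually be avoided when every table uses InnoDB. The following is a minimal sketch under that assumption; the function name dump_database, the DRY_RUN switch, and the output path are illustrative and not part of mysqldump itself. Credentials are assumed to come from an option file such as ~/.my.cnf.

```shell
# dump_database DB OUTFILE
# Dumps one database with a consistent snapshot, then compresses the file.
# With DRY_RUN=1 the commands are printed instead of executed.
dump_database() (
  db="$1"
  outfile="$2"
  # --single-transaction: consistent InnoDB snapshot without locking tables
  # --quick: stream rows instead of buffering whole tables in memory
  # --routines: include stored procedures and functions
  set -- mysqldump --single-transaction --quick --routines \
    --result-file="$outfile" "$db"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
    echo "gzip $outfile"
  else
    "$@" && gzip "$outfile"
  fi
)
```

Calling dump_database mydb /backup/mydb.sql with DRY_RUN=1 set first prints the two commands, which is a cheap way to review them before a real run.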
PHYSICAL BACKUPS (binary files):
Percona XtraBackup:
├─ Copies InnoDB data files directly
├─ Non-blocking for InnoDB tables (runs alongside live traffic)
├─ Very fast for large databases
└─ Binary format (not portable across architectures)
Usage:
xtrabackup --backup --target-dir=/backup/base/
xtrabackup --prepare --target-dir=/backup/base/
Restoration:
xtrabackup --copy-back --target-dir=/backup/base/
Advantages:
├─ Very fast (direct file copy)
├─ Non-blocking (can back up a running server)
├─ Compact storage through incremental and compressed (--compress) backups
└─ Fast restore (direct file copy)
Disadvantages:
├─ Binary format (version-specific)
├─ Requires xtrabackup tool
└─ Less portable
1.2 Full vs Incremental Backups
FULL BACKUP:
├─ Copy entire database
├─ Large size (all data)
├─ Slow (read all data)
└─ Fast restore (single backup)
INCREMENTAL BACKUP:
├─ Copy only changed blocks since last backup
├─ Small size (only changes)
├─ Fast (read only changes)
└─ Slow restore (need multiple backups)
Timeline:
Day 1: Full backup (100GB)
Day 2: Incremental (5GB changes)
Day 3: Incremental (3GB changes)
Day 4: Incremental (7GB changes)
Restore Day 4 state: Apply 4 backups in sequence
Cost: 100 + 5 + 3 + 7 = 115GB backup storage
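Restoring the Day 4 state means merging each incremental into the full backup, oldest first, before copying anything back. The sketch below shows one way to do that with XtraBackup's --incremental-basedir, --apply-log-only, and --incremental-dir options. The helper names (run, take_incremental, prepare_chain) and the directory layout are assumptions made for illustration.

```shell
# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# take_incremental BASEDIR TARGETDIR
# Copies only the pages changed since the backup stored in BASEDIR.
take_incremental() {
  run xtrabackup --backup --target-dir="$2" --incremental-basedir="$1"
}

# prepare_chain FULL_DIR [INC_DIR...]
# Merges the incrementals into the full backup in the order given.
prepare_chain() (
  full="$1"
  shift
  if [ "$#" -eq 0 ]; then
    # No incrementals: a plain prepare finishes the backup.
    run xtrabackup --prepare --target-dir="$full"
    exit
  fi
  # Replay the redo log on the base but skip the rollback phase,
  # so that later incrementals can still be applied.
  run xtrabackup --prepare --apply-log-only --target-dir="$full" || exit 1
  while [ "$#" -gt 0 ]; do
    inc="$1"
    shift
    if [ "$#" -gt 0 ]; then
      run xtrabackup --prepare --apply-log-only \
        --target-dir="$full" --incremental-dir="$inc" || exit 1
    else
      # Last incremental: omit --apply-log-only so the rollback phase runs.
      run xtrabackup --prepare \
        --target-dir="$full" --incremental-dir="$inc" || exit 1
    fi
  done
)
```

For the timeline above, prepare_chain /backup/base /backup/inc1 /backup/inc2 /backup/inc3 issues four prepare commands, matching the four backups that have to be applied in sequence.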
BEST STRATEGY:
Weekly full + daily incremental:
└─ Storage: 100GB + (6 × ~5GB) = 130GB/week
Monthly full + weekly incremental + daily binary log archive:
├─ Maximum flexibility
├─ Point-in-time recovery (to exact second)
└─ Higher storage
2. Backup Tools Comparison
mysqldump (Built-in):
├─ Cost: Free
├─ Speed: Slow (minutes to hours)
├─ Blocking: Yes by default (table locks; avoidable with --single-transaction on InnoDB)
├─ Portability: High
└─ Use: Small databases, logical backups
Percona XtraBackup (Open Source):
├─ Cost: Free
├─ Speed: Fast (minutes)
├─ Blocking: No (background)
├─ Portability: Medium (binary)
└─ Use: Production, large databases
MySQL Enterprise Backup (Commercial):
├─ Cost: $$$ (with MySQL Enterprise)
├─ Speed: Very fast
├─ Blocking: No
├─ Portability: Medium (binary)
└─ Use: High-uptime enterprises
AWS RDS (Cloud):
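├─ Cost: Included in RDS up to the provisioned database storage; extra backup storage is billed
├─ Speed: Fast (automated storage-level snapshots)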
├─ Blocking: No (automated)
├─ Portability: AWS-specific
└─ Use: Cloud environments
RECOMMENDATION:
├─ Development: mysqldump
├─ Production small: mysqldump + cron
├─ Production large: Percona XtraBackup
└─ Cloud: Native backup (RDS, etc)
3. Recovery Procedures
3.1 Full Recovery from Backup
SCENARIO: Server crashed, data corrupted
RECOVERY STEPS (XtraBackup):
1. Stop MySQL
systemctl stop mysql
2. Back up current corrupted data
mv /var/lib/mysql /var/lib/mysql.backup
3. Prepare backup
xtrabackup --prepare --target-dir=/backup/base/
4. Copy backup to MySQL directory
xtrabackup --copy-back --target-dir=/backup/base/
5. Fix permissions
chown -R mysql:mysql /var/lib/mysql
6. Start MySQL
systemctl start mysql
7. Verify data
SELECT COUNT(*) FROM tables...;
Typical restore time:
├─ Stop MySQL: 10 seconds
├─ Copy files: 1-5 minutes (SSD faster)
├─ Start MySQL: 20 seconds
└─ Total: roughly 2-7 minutes RTO including prepare and permission steps (copy time grows with data size)
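Two conditions are easy to miss under pressure: the backup must be prepared, and the target data directory must be empty before --copy-back. A small pre-flight check can catch both. This is a sketch, assuming the xtrabackup_checkpoints file that XtraBackup writes into the backup directory reports backup_type = full-prepared after a completed prepare; the function names are hypothetical.

```shell
# backup_is_prepared DIR
# Succeeds if DIR holds a fully prepared XtraBackup backup.
backup_is_prepared() {
  grep -q '^backup_type = full-prepared' "$1/xtrabackup_checkpoints" 2>/dev/null
}

# datadir_is_empty DIR
# Succeeds if DIR does not exist or contains no entries.
datadir_is_empty() {
  [ ! -e "$1" ] || [ -z "$(ls -A "$1")" ]
}

# preflight BACKUP_DIR DATADIR
# Prints "ok" or the first reason the restore should not proceed.
preflight() {
  backup_is_prepared "$1" || { echo "backup not prepared: $1"; return 1; }
  datadir_is_empty "$2" || { echo "datadir not empty: $2"; return 1; }
  echo "ok"
}
```

Running preflight /backup/base /var/lib/mysql between steps 3 and 4 would stop the procedure before a copy into a half-cleared directory.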
3.2 Point-in-Time Recovery
SCENARIO: Someone ran DELETE without WHERE clause
Recovery strategy:
├─ Full backup from before incident
├─ Apply binary logs up to before DELETE
└─ Restore specific table
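STEPS:
1. Restore the full backup to a separate recovery instance (not over production)
xtrabackup --copy-back ...
2. Find the DELETE timestamp in the binlog
mysqlbinlog --verbose --base64-output=DECODE-ROWS \
/var/log/mysql/binlog.000001 | grep -B 10 -m 1 "DELETE FROM"
-- The "#<date> <time>" header line above the match shows the exact timestamp
3. Replay binary logs on the recovery instance, stopping just before the DELETE
mysqlbinlog /var/log/mysql/binlog.000001 \
--start-position=<position saved with the backup> \
--stop-datetime='2024-04-24 15:00:00' | \
mysql -h <recovery-host> -u root -p
4. Create a temporary database on production and load the recovered table into it
CREATE DATABASE recovery_db;
mysqldump -h <recovery-host> -u root -p mydb mytable > mytable_recovered.sql
mysql -u root -p recovery_db < mytable_recovered.sql
5. Copy table from recovery database
INSERT INTO mydb.mytable
SELECT * FROM recovery_db.mytable;
6. Verify data
SELECT COUNT(*) FROM mytable;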
KEY POINTS:
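├─ Keep binary logs: set binlog_expire_logs_seconds = 2592000 (30 days; expire_logs_days = 30 before MySQL 8.0)
├─ Enable binary logging: log_bin = /var/log/mysql/binlog (enabled by default in MySQL 8.0)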
├─ Save backup metadata: which binlog, which position
└─ Test recovery procedures regularly
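The backup metadata mentioned above is what tells the binlog replay where to start. XtraBackup records it in a file named xtrabackup_binlog_info inside the backup directory (binlog file name, position, and optionally a GTID set). The sketch below reads that file and prints a matching mysqlbinlog command; the function names and the binlog directory argument are assumptions for illustration.

```shell
# binlog_start BACKUP_DIR
# Prints "<binlog file> <position>" as recorded at backup time.
binlog_start() (
  # Fields are whitespace-separated; anything after the position is ignored.
  read -r file pos _rest < "$1/xtrabackup_binlog_info" || [ -n "$file" ] || exit 1
  echo "$file $pos"
)

# replay_command BACKUP_DIR BINLOG_DIR STOP_DATETIME
# Prints the mysqlbinlog invocation that replays changes made after the
# backup and stops at STOP_DATETIME.
replay_command() (
  start="$(binlog_start "$1")" || exit 1
  file="${start% *}"
  pos="${start#* }"
  echo "mysqlbinlog --start-position=$pos --stop-datetime='$3' $2/$file"
)
```

If the incident happened several binlog files after the backup, every later file has to be appended to the printed command in order; --start-position applies only to the first file listed.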
4. Backup Strategy Planning
DEFINE REQUIREMENTS:
RPO (Recovery Point Objective):
├─ Can you afford to lose 1 hour of data?
├─ Can you afford to lose 1 day of data?
└─ Smaller RPO = more frequent backups
RTO (Recovery Time Objective):
├─ Can service be down 1 hour?
├─ Can service be down 5 minutes?
└─ Smaller RTO = faster recovery needed (physical backups, standby replicas)
EXAMPLE STRATEGIES:
High Availability (RPO < 15 min, RTO < 1 min):
├─ Multi-master with semi-sync replication
├─ Hourly backups for verification
├─ Automated failover
└─ Real-time replicas
Standard Production (RPO < 1 hour, RTO < 15 min):
├─ Daily full backup (XtraBackup)
├─ Hourly incremental backups
├─ Binary logs kept for 30 days (PITR)
└─ Weekly restoration tests
Cost-Conscious (RPO < 1 day, RTO < 1 hour):
├─ Daily mysqldump at 2 AM (small databases)
├─ Or weekly full + daily incremental with XtraBackup (larger databases)
├─ Stored on cheap storage (S3 Glacier; confirm retrieval time fits the RTO)
└─ Monthly restoration drills
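The three example tiers can be read as a lookup: given how much data loss and downtime the business tolerates, pick the least demanding tier whose targets still fit. A minimal sketch of that lookup, using the thresholds listed above; the function name and tier labels are hypothetical.

```shell
# suggest_strategy RPO_MINUTES RTO_MINUTES
# Arguments are the maximum tolerated data loss and downtime, in minutes.
# Prints the least demanding tier whose targets stay within both limits.
suggest_strategy() {
  if [ "$1" -ge 1440 ] && [ "$2" -ge 60 ]; then
    echo "cost-conscious"          # RPO < 1 day, RTO < 1 hour
  elif [ "$1" -ge 60 ] && [ "$2" -ge 15 ]; then
    echo "standard-production"     # RPO < 1 hour, RTO < 15 min
  else
    echo "high-availability"       # RPO < 15 min, RTO < 1 min
  fi
}
```

For example, suggest_strategy 1440 30 prints standard-production: a day of data loss would be tolerable, but 30 minutes of downtime rules out the cost-conscious tier.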
STORAGE CALCULATION:
Database size: 500GB
Full backup: 150GB (compression ratio roughly 3:1)
Weekly strategy: 1 full + 6 incrementals (~5GB each)
├─ Weekly storage: 150GB + (6 × 5GB) = 180GB
└─ With 3-month retention (13 weeks): 13 × 180GB ≈ 2.3TB
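The storage figures follow from simple arithmetic, which is worth scripting so the estimate can be rerun when sizes change. This sketch takes three months as 13 weeks, which is an assumption; the function names are illustrative.

```shell
# weekly_storage_gb FULL_GB INCREMENTAL_GB INCREMENTALS_PER_WEEK
weekly_storage_gb() {
  echo $(( $1 + $2 * $3 ))
}

# retention_storage_gb WEEKLY_GB WEEKS
retention_storage_gb() {
  echo $(( $1 * $2 ))
}

weekly="$(weekly_storage_gb 150 5 6)"
echo "Weekly: ${weekly}GB"                                  # Weekly: 180GB
echo "13 weeks: $(retention_storage_gb "$weekly" 13)GB"     # 13 weeks: 2340GB
```

The same functions reproduce the earlier 100GB example: weekly_storage_gb 100 5 6 gives 130.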
5. Testing Recovery
NEVER rely on untested backups!
MONTHLY RECOVERY TEST:
#!/bin/bash
# Test backup restoration
# 1. Restore to test server
#    (the backup must already be prepared with xtrabackup --prepare)
xtrabackup --copy-back \
--target-dir=/backup/latest \
--datadir=/data/test/
chown -R mysql:mysql /data/test/
# 2. Start MySQL
systemctl start mysql@test
# 3. Run validation queries
mysql -u root < /test/validation.sql
# 4. Compare with production
#    (--skip-comments drops host and timestamp headers that would always differ;
#    rows written to production after the backup will still appear in the diff)
mysqldump -u root --skip-comments prod_db > /tmp/prod.sql
mysqldump -u root --skip-comments test_db > /tmp/test.sql
diff /tmp/prod.sql /tmp/test.sql
# 5. Check data integrity
# - Row counts match?
# - Checksums match?
# - Foreign keys valid?
# 6. Log results
# - Backup age
# - Restore time
# - Verification status
# - Any issues found
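For the checksum question in step 5, a full dump diff is often heavier than needed. One lighter approach is to record CHECKSUM TABLE results on production at backup time and compare them with the restored copy. The sketch below assumes credentials come from an option file; the function names and the file layout (one "db.table checksum" pair per line) are illustrative.

```shell
# table_checksums DB TABLE...
# Prints one "db.table<TAB>checksum" line per table.
table_checksums() (
  db="$1"
  shift
  list=""
  for t in "$@"; do
    list="${list:+$list, }$db.$t"
  done
  mysql -N -B -e "CHECKSUM TABLE $list"
)

# mismatched_tables PROD_FILE TEST_FILE
# Prints tables whose checksum differs or that are missing on either side.
mismatched_tables() {
  awk '
    FILENAME == ARGV[1] { prod[$1] = $2; next }
    { seen[$1] = 1; if (!($1 in prod) || prod[$1] != $2) print $1 }
    END { for (t in prod) if (!(t in seen)) print t }
  ' "$1" "$2" | sort
}
```

An empty result from mismatched_tables means every recorded table was restored with identical contents; any table name printed is a candidate for the CRITICAL data-mismatch alert.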
ALERTS:
├─ If restore fails → CRITICAL
├─ If restore > expected RTO → WARNING
├─ If data mismatches → CRITICAL
└─ Missing backups → CRITICAL
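The alert rules above map directly onto a small decision function that a restore test can call at the end of its run. A minimal sketch; the function name and argument order are assumptions, and the output would normally be handed to whatever alerting system is in place.

```shell
# alert_level RESTORE_OK RESTORE_SECONDS RTO_SECONDS DATA_MATCHES
# RESTORE_OK and DATA_MATCHES are 1 (yes) or 0 (no).
alert_level() {
  if [ "$1" != "1" ] || [ "$4" != "1" ]; then
    echo "CRITICAL"    # restore failed or restored data does not match
  elif [ "$2" -gt "$3" ]; then
    echo "WARNING"     # restore worked but took longer than the RTO
  else
    echo "OK"
  fi
}
```

For instance, alert_level 1 1200 900 1 prints WARNING: the restore succeeded and the data matched, but 20 minutes exceeds a 15-minute RTO.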
6. Best Practices
- Back up regularly: daily minimum for production
- Test recovery monthly: untested backups are worthless
- Keep binary logs: they enable point-in-time recovery
- Store offsite: protect against data center failure
- Encrypt backups: protect sensitive data
- Document procedures: recovery is stressful, so clear steps are essential
- Monitor backup jobs: failures should alert immediately
- Use versioning: keep multiple backup versions
- Calculate RTO/RPO: understand what's acceptable
- Automate backups: manual backups get skipped
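Monitoring can start with something as simple as checking that the newest backup file exists and is recent enough. The sketch below relies on find's -mmin test, which is available in GNU and BSD find; the function names and the message format are hypothetical.

```shell
# backup_is_fresh FILE MAX_AGE_MINUTES
# Succeeds if FILE exists and was modified within the last MAX_AGE_MINUTES.
backup_is_fresh() {
  [ -e "$1" ] && [ -n "$(find "$1" -mmin "-$2")" ]
}

# check_backup FILE MAX_AGE_MINUTES
# Prints a one-line status suitable for a monitoring check.
check_backup() {
  if [ ! -e "$1" ]; then
    echo "CRITICAL: backup missing: $1"
  elif ! backup_is_fresh "$1" "$2"; then
    echo "CRITICAL: backup older than $2 minutes: $1"
  else
    echo "OK: $1"
  fi
}
```

A daily schedule might call check_backup with a limit slightly above 1440 minutes, so that a single skipped run is reported within the hour rather than at the next restore test.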
Conclusion
Backup and recovery are not optional. Plan your backup strategy based on RPO/RTO requirements, test regularly, and monitor backup job success. A backup that hasn't been tested is not a backup at all.