Chapter 21
High Availability Architecture
MySQL High Availability Architecture
High availability ensures MySQL databases remain accessible during failures. This guide covers architecture patterns, failover mechanisms, and operational challenges.
1. HA Fundamentals
1.1 Metrics for High Availability
RTO (Recovery Time Objective):
├─ How long until service restored after failure?
├─ Master failure: 30 seconds → 30 minute RTO unacceptable
└─ Target: < 1 minute for most applications
RPO (Recovery Point Objective):
├─ How much data loss acceptable?
├─ Synchronous replication: RPO ≈ 0 (no data loss)
├─ Asynchronous replication: RPO = hours of data
└─ Target: Minutes or less of data loss
AVAILABILITY = (Total Time - Downtime) / Total Time
99% availability = 3.65 days/year downtime
99.9% availability = 8.76 hours/year downtime
99.99% availability = 52.6 minutes/year downtime
High Availability requires:
├─ Redundancy (multiple servers)
├─ Automated detection (health checks)
├─ Automated failover (switch to replica)
├─ Data consistency (no data loss)
└─ Monitoring (alerts on degradation)
1.2 Master-Slave Replication
SYNCHRONOUS vs ASYNCHRONOUS:
Asynchronous (Traditional):
Master writes
↓ (immediately returns success)
Binlog sent to replica (backgroundthread)
↓ (1-10ms delay)
Replica applies changes
Risks:
├─ Master crashes before binlog sent → Data loss
├─ Replica hasn't applied → Stale reads
└─ Network latency visible to applications
Synchronous (MySQL 5.7+):
Master writes
↓
Binlog sent to replica
↓
Replica acknowledges receipt
↓ (then client returns success)
Benefits:
├─ No data loss (RPO ≈ 0)
├─ Consistent replicas
Costs:
├─ Higher latency (wait for replica)
├─ Slower writes (network round trip)
└─ If replica down → Master waits (blocked writes)
SEMI-SYNCHRONOUS (Hybrid):
Master writes
↓
Binlog sent to ONE replica
↓ (faster than waiting for all)
Master proceeds (acceptable compromise)
Best of both:
├─ Good performance (don't wait for all)
├─ Good safety (wait for at least one replica)
└─ Default choice for most deployments
2. HA Solutions Comparison
2.1 MySQL MHA (Master High Availability)
Architecture:
Master (primary writes)
↓
Slave 1 (replication)
↓
Slave 2 (replication)
MHA Manager (external process)
VIP (virtual IP)
Failover Process:
1. Monitor detects master down (multiple health checks)
2. Check which slave has latest binary log events
3. Apply unapplied logs from other slaves to best slave
4. Promote best slave to master
5. Update other slaves to replicate from new master
6. Move VIP to new master
7. Update application connection string
Features:
├─ Automatic master failover
├─ No single point of failure
├─ Preserves data consistency
└─ Fast failover (usually < 30 seconds)
Challenges:
├─ Split-brain: network partition causes two masters
├─ Complex configuration (requires scripting)
├─ External dependency (MHA manager process)
└─ Manual intervention for some scenarios
CONFIGURATION:
masterha_default.cnf:
[server default]
user=monitoring_user
password=password
ping_interval=5
[server1]
hostname=master.example.com
[server2]
hostname=slave1.example.com
[server3]
hostname=slave2.example.com
FAILOVER COMMAND:
masterha_master_switch --conf=/etc/mha.cnf --master_state=dead
-- Automatically promotes best slave
-- Updates replication topology
2.2 Keepalived + MySQL
Architecture:
Master (MySQL + Keepalived) ← VIP
Slave (MySQL + Keepalived)
Uses VRRP (Virtual Router Redundancy Protocol)
Failover:
1. Keepalived health check fails on master
2. VIP automatically moves to slave
3. Applications connect to VIP (transparent failover)
4. Slave already in replication, ready as master
Advantages:
├─ Simple (L3 failover with VIP)
├─ Fast (sub-second)
├─ Network-transparent (VIP handles routing)
└─ No external process needed (kernel VRRP)
Challenges:
├─ Only 2-node (master + slave)
├─ No automatic slave promotion
├─ Manual SQL changes on new master
├─ Requires split-brain detection
KEEPALIVED CONFIG:
vrrp_instance MySQL {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
virtual_ipaddress {
192.168.1.100 # VIP
}
notify_master /usr/local/bin/mysql_promote.sh
}
FAILOVER SCRIPT (mysql_promote.sh):
#!/bin/bash
# Called when this becomes MASTER
mysql -e "SET GLOBAL read_only = OFF;"
mysql -e "CHANGE MASTER TO master_host='';" # Stop replication
# Update application DNS/VIP routing
2.3 MySQL InnoDB Cluster (MGR)
Group Replication (MySQL 8.0+):
Architecture:
Node 1 (Primary, writes)
↕ Group Replication
Node 2 (Secondary, reads)
↕
Node 3 (Secondary, reads)
Features:
├─ Multi-master (any node can write)
├─ Automatic consistency checks
├─ Built-in quorum (prevents split-brain)
├─ Transparent failover
└─ Read-replica scaling (distribute reads)
Failover:
1. Primary fails
2. Group detects failure (quorum voting)
3. Group selects new primary automatically
4. Seconds downtime (automatic, no human intervention)
Benefits:
├─ Truly automated (no external process)
├─ Consistent (built-in conflict detection)
├─ Easy deployment (MySQL Router handles routing)
└─ Modern solution
Challenges:
├─ MySQL 8.0+ only (relatively new)
├─ Higher write overhead (group communication)
├─ Slower writes (consensus protocol)
├─ Not suitable for high-write workloads
DEPLOYMENT:
dba.createCluster('myCluster', {
gtidSetIsComplete: true
})
cluster.addInstance('[email protected]:3306')
cluster.addInstance('[email protected]:3306')
MySQL Router (automatic failover):
mysqlrouter --bootstrap user@primary:3306
# Handles routing to primary/secondaries
3. Failover Challenges
3.1 Split-Brain Problem
SCENARIO: Network partition between master and slaves
Before failure:
Master (primary writes)
│ ← Network partition
├─ Slave 1 (replication stops)
└─ Slave 2 (replication stops)
Without proper handling:
1. Monitoring detects master unreachable
2. Promotes Slave 1 to master
3. Slave 1 starts accepting writes
4. Original master still running, also accepting writes
5. TWO MASTERS (split-brain) = data corruption!
SOLUTION 1: Quorum
Majority of servers must agree to failover
├─ 3-node cluster: need 2/3 nodes alive
├─ 5-node cluster: need 3/5 nodes alive
└─ Prevents minority from claiming leadership
SOLUTION 2: Fencing
Original master forced offline if not part of quorum
├─ STONITH (Shoot The Other Node In The Head)
├─ Kill master's network port
├─ Kill master process
└─ Ensures only one master alive
SOLUTION 3: VIP Takeover
Virtual IP moves to new master
├─ Original master IP still there, but VIP gone
├─ Applications connect to VIP only
└─ Original master isolated from network
RECOVERY:
1. Failover completes (new master active)
2. Original master back online
3. Reset original master as slave:
CHANGE MASTER TO MASTER_HOST='new_master';
START SLAVE;
4. Replication catches up
3.2 Data Consistency During Failover
PROBLEM: Binlog events not replicated before master fails
Timeline:
1. Master receives write from client
2. Master writes to disk (fsync)
3. Returns success to client
4. Binlog sent to slave (async)
5. Master crashes before binlog sent
6. Data lost on all slaves!
SOLUTION 1: Synchronous Replication
Master waits for slave to acknowledge before returning
├─ Safety: RPO ≈ 0 (no data loss)
└─ Cost: Higher latency
SOLUTION 2: Semi-Synchronous Replication
Master waits for at least ONE slave to acknowledge
├─ Balance: Good safety, good performance
└─ Most common in production
SOLUTION 3: Binlog Filtering + Recovery
Identify unapplied binlog events on slaves
├─ MHA: PT-uname-binlog analyzes latest master binlog
├─ Apply unapplied events to best slave before promotion
└─ Maximizes data consistency
BEST PRACTICE:
SET GLOBAL binlog_format = 'ROW'; # Row-based safer
SET GLOBAL binlog_row_image = 'FULL'; # Full image for recovery
SET GLOBAL rpl_semi_sync_master_enabled = 1; # Semi-sync
4. Monitoring and Alerts
CRITICAL ALERTS:
1. Master Unavailable
SELECT 1; # Heartbeat every 5 seconds
If fails → Failover trigger
2. Replication Lag
SHOW SLAVE STATUS\G
Seconds_Behind_Master > 60?
Alert: "Slave lagging X seconds"
3. Slave SQL Error
SHOW SLAVE STATUS\G
Last_Error != ''?
Alert: "Slave SQL error: [error message]"
4. Binary Log Position Divergence
Compare SHOW MASTER STATUS with each slave
If replicas at different positions → potential issue
5. Disk Space
df -h on data directory
If > 90% → will run out soon
MONITORING COMMANDS:
-- Replication health
SHOW SLAVE STATUS\G
-- Compare master/slave
SHOW MASTER STATUS;
(Run on slave) SHOW SLAVE STATUS\G
-- Connection status
SHOW PROCESSLIST;
-- Binlog status
SHOW BINARY LOGS;
SHOW BINLOG EVENTS LIMIT 10;
TOOLS:
MySQL Enterprise Monitor (paid)
├─ Real-time monitoring
├─ Automatic alerting
└─ Performance baselines
Percona Monitoring and Management (PMM)
├─ Open source
├─ Grafana dashboards
├─ Query analytics
MySQL Utilities (mysqlreplicate, mysqlrplcheck)
├─ Monitor replication health
├─ Verify consistency
5. Solution Comparison Matrix
| Feature | MHA | Keepalived | Group Replication |
|---|---|---|---|
| Nodes | 3+ | 2 | 3+ |
| Failover Time | 10-30s | <1s | 5-10s |
| Automation | High | Medium | Very High |
| Complexity | High | Low | Medium |
| Multi-Master | No | No | Yes |
| Read Distribution | Manual | Manual | Automatic |
| MySQL Version | 5.5+ | 5.5+ | 8.0+ |
6. Best Practices
- Use semi-synchronous replication — balances safety and performance
- Deploy 3+ nodes — enables quorum for split-brain prevention
- Implement monitoring — detect failures quickly
- Test failover regularly — catch issues before production failure
- Document procedures — manual failover needed sometimes
- Separate monitoring cluster — monitoring shouldn't depend on primary
- Use VIP/DNS — transparent failover for applications
- Configure fencing — prevent split-brain scenarios
- Plan for recovery — failed master needs reset as slave
- Consider InnoDB Cluster — simpler for MySQL 8.0+ environments
Conclusion
High availability requires combining replication, monitoring, and automated failover. Choose solution based on node count, version, and automation needs. Test thoroughly before production deployment to ensure failover procedures work as expected.