Chapter 21

High Availability Architecture

MySQL High Availability Architecture

High availability ensures MySQL databases remain accessible during failures. This guide covers architecture patterns, failover mechanisms, and operational challenges.

1. HA Fundamentals

1.1 Metrics for High Availability


RTO (Recovery Time Objective):
├─ How long until service restored after failure?
├─ Master failure: 30 seconds → 30 minute RTO unacceptable
└─ Target: < 1 minute for most applications

RPO (Recovery Point Objective):
├─ How much data loss acceptable?
├─ Synchronous replication: RPO ≈ 0 (no data loss)
├─ Asynchronous replication: RPO = hours of data
└─ Target: Minutes or less of data loss

AVAILABILITY = (Total Time - Downtime) / Total Time

99% availability = 3.65 days/year downtime
99.9% availability = 8.76 hours/year downtime
99.99% availability = 52.6 minutes/year downtime

High Availability requires:
├─ Redundancy (multiple servers)
├─ Automated detection (health checks)
├─ Automated failover (switch to replica)
├─ Data consistency (no data loss)
└─ Monitoring (alerts on degradation)

1.2 Master-Slave Replication


SYNCHRONOUS vs ASYNCHRONOUS:

Asynchronous (Traditional):
  Master writes
    ↓ (immediately returns success)
  Binlog sent to replica (backgroundthread)
    ↓ (1-10ms delay)
  Replica applies changes

Risks:
├─ Master crashes before binlog sent → Data loss
├─ Replica hasn't applied → Stale reads
└─ Network latency visible to applications

Synchronous (MySQL 5.7+):
  Master writes
    ↓
  Binlog sent to replica
    ↓
  Replica acknowledges receipt
    ↓ (then client returns success)

Benefits:
├─ No data loss (RPO ≈ 0)
├─ Consistent replicas

Costs:
├─ Higher latency (wait for replica)
├─ Slower writes (network round trip)
└─ If replica down → Master waits (blocked writes)

SEMI-SYNCHRONOUS (Hybrid):
  Master writes
    ↓
  Binlog sent to ONE replica
    ↓ (faster than waiting for all)
  Master proceeds (acceptable compromise)

Best of both:
├─ Good performance (don't wait for all)
├─ Good safety (wait for at least one replica)
└─ Default choice for most deployments

2. HA Solutions Comparison

2.1 MySQL MHA (Master High Availability)


Architecture:
  Master (primary writes)
    ↓
  Slave 1 (replication)
    ↓
  Slave 2 (replication)
  MHA Manager (external process)
  VIP (virtual IP)

Failover Process:
1. Monitor detects master down (multiple health checks)
2. Check which slave has latest binary log events
3. Apply unapplied logs from other slaves to best slave
4. Promote best slave to master
5. Update other slaves to replicate from new master
6. Move VIP to new master
7. Update application connection string

Features:
├─ Automatic master failover
├─ No single point of failure
├─ Preserves data consistency
└─ Fast failover (usually < 30 seconds)

Challenges:
├─ Split-brain: network partition causes two masters
├─ Complex configuration (requires scripting)
├─ External dependency (MHA manager process)
└─ Manual intervention for some scenarios

CONFIGURATION:

masterha_default.cnf:
  [server default]
  user=monitoring_user
  password=password
  ping_interval=5

  [server1]
  hostname=master.example.com

  [server2]
  hostname=slave1.example.com

  [server3]
  hostname=slave2.example.com

FAILOVER COMMAND:

masterha_master_switch --conf=/etc/mha.cnf --master_state=dead
-- Automatically promotes best slave
-- Updates replication topology

2.2 Keepalived + MySQL


Architecture:
  Master (MySQL + Keepalived) ← VIP
  Slave (MySQL + Keepalived)

  Uses VRRP (Virtual Router Redundancy Protocol)

Failover:
1. Keepalived health check fails on master
2. VIP automatically moves to slave
3. Applications connect to VIP (transparent failover)
4. Slave already in replication, ready as master

Advantages:
├─ Simple (L3 failover with VIP)
├─ Fast (sub-second)
├─ Network-transparent (VIP handles routing)
└─ No external process needed (kernel VRRP)

Challenges:
├─ Only 2-node (master + slave)
├─ No automatic slave promotion
├─ Manual SQL changes on new master
├─ Requires split-brain detection

KEEPALIVED CONFIG:

vrrp_instance MySQL {
  state MASTER
  interface eth0
  virtual_router_id 51
  priority 100
  advert_int 1

  virtual_ipaddress {
    192.168.1.100  # VIP
  }

  notify_master /usr/local/bin/mysql_promote.sh
}

FAILOVER SCRIPT (mysql_promote.sh):
#!/bin/bash
# Called when this becomes MASTER
mysql -e "SET GLOBAL read_only = OFF;"
mysql -e "CHANGE MASTER TO master_host='';"  # Stop replication
# Update application DNS/VIP routing

2.3 MySQL InnoDB Cluster (MGR)


Group Replication (MySQL 8.0+):

Architecture:
  Node 1 (Primary, writes)
    ↕ Group Replication
  Node 2 (Secondary, reads)
    ↕
  Node 3 (Secondary, reads)

Features:
├─ Multi-master (any node can write)
├─ Automatic consistency checks
├─ Built-in quorum (prevents split-brain)
├─ Transparent failover
└─ Read-replica scaling (distribute reads)

Failover:
1. Primary fails
2. Group detects failure (quorum voting)
3. Group selects new primary automatically
4. Seconds downtime (automatic, no human intervention)

Benefits:
├─ Truly automated (no external process)
├─ Consistent (built-in conflict detection)
├─ Easy deployment (MySQL Router handles routing)
└─ Modern solution

Challenges:
├─ MySQL 8.0+ only (relatively new)
├─ Higher write overhead (group communication)
├─ Slower writes (consensus protocol)
├─ Not suitable for high-write workloads

DEPLOYMENT:

dba.createCluster('myCluster', {
  gtidSetIsComplete: true
})

cluster.addInstance('[email protected]:3306')
cluster.addInstance('[email protected]:3306')

MySQL Router (automatic failover):
  mysqlrouter --bootstrap user@primary:3306
  # Handles routing to primary/secondaries

3. Failover Challenges

3.1 Split-Brain Problem


SCENARIO: Network partition between master and slaves

Before failure:
  Master (primary writes)
    │ ← Network partition
    ├─ Slave 1 (replication stops)
    └─ Slave 2 (replication stops)

Without proper handling:
1. Monitoring detects master unreachable
2. Promotes Slave 1 to master
3. Slave 1 starts accepting writes
4. Original master still running, also accepting writes
5. TWO MASTERS (split-brain) = data corruption!

SOLUTION 1: Quorum
Majority of servers must agree to failover
├─ 3-node cluster: need 2/3 nodes alive
├─ 5-node cluster: need 3/5 nodes alive
└─ Prevents minority from claiming leadership

SOLUTION 2: Fencing
Original master forced offline if not part of quorum
├─ STONITH (Shoot The Other Node In The Head)
├─ Kill master's network port
├─ Kill master process
└─ Ensures only one master alive

SOLUTION 3: VIP Takeover
Virtual IP moves to new master
├─ Original master IP still there, but VIP gone
├─ Applications connect to VIP only
└─ Original master isolated from network

RECOVERY:
1. Failover completes (new master active)
2. Original master back online
3. Reset original master as slave:
   CHANGE MASTER TO MASTER_HOST='new_master';
   START SLAVE;
4. Replication catches up

3.2 Data Consistency During Failover


PROBLEM: Binlog events not replicated before master fails

Timeline:
1. Master receives write from client
2. Master writes to disk (fsync)
3. Returns success to client
4. Binlog sent to slave (async)
5. Master crashes before binlog sent
6. Data lost on all slaves!

SOLUTION 1: Synchronous Replication
Master waits for slave to acknowledge before returning
├─ Safety: RPO ≈ 0 (no data loss)
└─ Cost: Higher latency

SOLUTION 2: Semi-Synchronous Replication
Master waits for at least ONE slave to acknowledge
├─ Balance: Good safety, good performance
└─ Most common in production

SOLUTION 3: Binlog Filtering + Recovery
Identify unapplied binlog events on slaves
├─ MHA: PT-uname-binlog analyzes latest master binlog
├─ Apply unapplied events to best slave before promotion
└─ Maximizes data consistency

BEST PRACTICE:
SET GLOBAL binlog_format = 'ROW';  # Row-based safer
SET GLOBAL binlog_row_image = 'FULL';  # Full image for recovery
SET GLOBAL rpl_semi_sync_master_enabled = 1;  # Semi-sync

4. Monitoring and Alerts


CRITICAL ALERTS:

1. Master Unavailable
   SELECT 1;  # Heartbeat every 5 seconds
   If fails → Failover trigger

2. Replication Lag
   SHOW SLAVE STATUS\G
   Seconds_Behind_Master > 60?
   Alert: "Slave lagging X seconds"

3. Slave SQL Error
   SHOW SLAVE STATUS\G
   Last_Error != ''?
   Alert: "Slave SQL error: [error message]"

4. Binary Log Position Divergence
   Compare SHOW MASTER STATUS with each slave
   If replicas at different positions → potential issue

5. Disk Space
   df -h on data directory
   If > 90% → will run out soon

MONITORING COMMANDS:

-- Replication health
SHOW SLAVE STATUS\G

-- Compare master/slave
SHOW MASTER STATUS;
(Run on slave) SHOW SLAVE STATUS\G

-- Connection status
SHOW PROCESSLIST;

-- Binlog status
SHOW BINARY LOGS;
SHOW BINLOG EVENTS LIMIT 10;

TOOLS:

MySQL Enterprise Monitor (paid)
├─ Real-time monitoring
├─ Automatic alerting
└─ Performance baselines

Percona Monitoring and Management (PMM)
├─ Open source
├─ Grafana dashboards
├─ Query analytics

MySQL Utilities (mysqlreplicate, mysqlrplcheck)
├─ Monitor replication health
├─ Verify consistency

5. Solution Comparison Matrix

Feature MHA Keepalived Group Replication
Nodes 3+ 2 3+
Failover Time 10-30s <1s 5-10s
Automation High Medium Very High
Complexity High Low Medium
Multi-Master No No Yes
Read Distribution Manual Manual Automatic
MySQL Version 5.5+ 5.5+ 8.0+

6. Best Practices

Conclusion

High availability requires combining replication, monitoring, and automated failover. Choose solution based on node count, version, and automation needs. Test thoroughly before production deployment to ensure failover procedures work as expected.

Rate this chapter
4.7  / 5  (10 ratings)

💬 Comments