Sentinel Guide

1. Architecture Overview

Redis Sentinel provides automatic failover, service discovery, and configuration management. Run at least 3 Sentinel instances: marking a master as objectively down takes agreement from a quorum of Sentinels, and electing a failover leader takes a majority of all Sentinels. With only 2 Sentinels the majority is 2, so losing either Sentinel (or a network partition between them) blocks failover entirely.

               ┌──────────────┐
               │  Sentinel-1  │ :26379
               │   (voter)    │
               └──────┬───────┘
                      │
       ┌──────────────┼──────────────┐
       │              │              │
┌──────┴───────┐  ┌───┴──────┐ ┌─────┴────────┐
│  Sentinel-2  │  │  Master  │ │  Sentinel-3  │
│   (voter)    │  │  :6379   │ │   (voter)    │
└──────────────┘  └────┬─────┘ └──────────────┘
                  ┌────┴─────┐
             ┌────┴───┐  ┌───┴────┐
             │Replica1│  │Replica2│
             │ :6380  │  │ :6381  │
             └────────┘  └────────┘

quorum=2: at least 2 of 3 Sentinels must agree to trigger failover

Deploy Sentinel instances on separate physical machines or availability zones to avoid single points of failure. Each Sentinel independently evaluates master health; distributing across failure domains ensures quorum can still be reached even if one machine goes down.

2. Quick Setup

Step 1: Install Redis

    # Ubuntu/Debian
    sudo apt update && sudo apt install redis-server -y

    # CentOS/RHEL
    sudo yum install epel-release -y && sudo yum install redis -y

    # macOS
    brew install redis

    # Verify
    redis-server --version

Step 2: Configure Master

    # /etc/redis/redis-master.conf
    bind 0.0.0.0
    port 6379
    requirepass YourStrongPassword
    masterauth YourStrongPassword
    dir /var/lib/redis
    appendonly yes
    appendfsync everysec

Step 3: Configure Replicas

    # /etc/redis/redis-replica1.conf (replica2 similar, port 6381)
    bind 0.0.0.0
    port 6380
    replicaof 192.168.1.10 6379
    masterauth YourStrongPassword
    requirepass YourStrongPassword
    dir /var/lib/redis-replica1
    appendonly yes

Step 4: Configure Sentinels

    # /etc/redis/sentinel.conf (repeat on 3 machines)
    port 26379
    daemonize yes
    logfile /var/log/redis/sentinel.log
    dir /tmp
    sentinel monitor mymaster 192.168.1.10 6379 2
    sentinel auth-pass mymaster YourStrongPassword
    sentinel down-after-milliseconds mymaster 5000
    sentinel parallel-syncs mymaster 1
    sentinel failover-timeout mymaster 60000
    sentinel deny-scripts-reconfig yes

Step 5: Start Everything

    # Start master
    redis-server /etc/redis/redis-master.conf

    # Start replicas
    redis-server /etc/redis/redis-replica1.conf
    redis-server /etc/redis/redis-replica2.conf

    # Start sentinels (on each sentinel machine)
    redis-sentinel /etc/redis/sentinel.conf

    # Verify replication
    redis-cli -a YourStrongPassword INFO replication

    # Verify sentinel
    redis-cli -p 26379 SENTINEL master mymaster

3. sentinel.conf Complete Reference

| Directive | Default | Description |
| --- | --- | --- |
| sentinel monitor <name> <ip> <port> <quorum> | - | Monitor a master. Quorum is the minimum number of Sentinels needed for ODOWN. Recommended: N/2+1 (2 for 3 nodes, 3 for 5 nodes) |
| sentinel down-after-milliseconds | 30000 | Time without response before SDOWN. Production: 5000-10000 ms. Too low causes false positives |
| sentinel parallel-syncs | 1 | Number of replicas reconfigured simultaneously after failover. Higher = faster recovery, but replicas are unavailable during sync |
| sentinel failover-timeout | 180000 | Failover timeout in ms. After this, Sentinel considers the failover failed and allows another Sentinel to retry. Set to at least 2x the replication timeout |
| sentinel auth-pass | - | Master password. Sentinel uses this to authenticate to master and replicas |
| sentinel auth-user | - | Redis 6+ ACL username, used with auth-pass |
| sentinel deny-scripts-reconfig | yes | Prevent changing notification script paths via SENTINEL SET. Must be enabled in production for security |
| sentinel notification-script | - | Script executed on state changes. Receives event type and description as arguments |
| sentinel client-reconfig-script | - | Script executed after failover with old/new master addresses. Useful for updating DNS or load balancers |
| sentinel resolve-hostnames | no | Redis 6.2+. Allows hostnames instead of IPs; useful for containers and DNS environments |

4. Sentinel API Commands

| Command | Description |
| --- | --- |
| SENTINEL masters | List all monitored masters and their state |
| SENTINEL master <name> | Return detailed info for a specific master (IP, port, flags, replica count, etc.) |
| SENTINEL replicas <name> | List all replicas of a master |
| SENTINEL sentinels <name> | List other Sentinel instances monitoring this master |
| SENTINEL get-master-addr-by-name <name> | Return current master IP and port (used by clients for service discovery) |
| SENTINEL ckquorum <name> | Check if enough Sentinels are online to reach quorum |
| SENTINEL failover <name> | Manually trigger failover (no agreement from other Sentinels needed) |
| SENTINEL reset <pattern> | Reset state for matching masters, clearing known replicas and Sentinels |
| SENTINEL flushconfig | Force write running configuration to sentinel.conf |
| SENTINEL pending-scripts | List pending notification scripts in queue |
| SENTINEL myid | Return the unique ID of this Sentinel instance |
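
The service-discovery flow behind SENTINEL get-master-addr-by-name can be sketched roughly as below. This is a minimal illustration, not how any particular client library implements it: `discover_master`, `fake_query`, and the `query` callable are hypothetical names, and `query` stands in for the network call that sends the command to one Sentinel and returns its two-element reply.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def discover_master(
    sentinels: Iterable[Address],
    master_name: str,
    query: Callable[[Address, str], Optional[Tuple[str, str]]],
) -> Address:
    """Ask each Sentinel in turn for the current master address.

    `query` is a stand-in (an assumption of this sketch) for sending
    SENTINEL get-master-addr-by-name <master_name> to one Sentinel.
    It returns (ip, port) as strings, or None if the name is unknown,
    and may raise OSError when the Sentinel is unreachable.
    """
    for sentinel in sentinels:
        try:
            reply = query(sentinel, master_name)
        except OSError:
            continue  # Sentinel unreachable: try the next one
        if reply is not None:
            ip, port = reply
            return ip, int(port)
    raise LookupError(f"no Sentinel returned an address for {master_name!r}")


def fake_query(sentinel: Address, name: str) -> Optional[Tuple[str, str]]:
    """Simulated replies: the first Sentinel is down, the others answer."""
    if sentinel[0] == "sentinel1":
        raise OSError("connection refused")
    return ("192.168.1.10", "6379")


sentinel_list = [("sentinel1", 26379), ("sentinel2", 26379)]
print(discover_master(sentinel_list, "mymaster", fake_query))
```

Passing the query function in keeps the fallback logic testable without a running Sentinel; a real client would also re-run discovery after a connection error or a +switch-master event.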

5. Failover Process

SDOWN → ODOWN → Leader Election → Replica Selection → Promotion → Reconfiguration

| Phase | Details |
| --- | --- |
| SDOWN | Subjective Down: a single Sentinel has not received a valid PING reply from the master within down-after-milliseconds. This is a unilateral judgment and does not trigger failover. |
| ODOWN | Objective Down: when quorum Sentinels report SDOWN, the state escalates to ODOWN. Sentinels confirm via SENTINEL is-master-down-by-addr messages. |
| Leader Election | Using a Raft-like algorithm, Sentinels vote to elect a leader to perform failover. Requires majority (N/2+1) votes. Each Sentinel votes only once per epoch to prevent duplicate elections. |
| Replica Selection | The leader Sentinel selects the best replica by: priority (replica-priority, lower = higher, 0 = never promote), replication offset (most up-to-date data), then Run ID (lexicographically smallest) as tiebreaker. |
| Promotion | Sends REPLICAOF NO ONE to the chosen replica, making it a standalone master. Waits for confirmation that the role switch is complete. |
| Reconfiguration | All other replicas execute REPLICAOF <new-master> to follow the new master. Sentinel updates its config and publishes +switch-master via Pub/Sub to notify clients. |

There is a brief write unavailability window during failover (typically 1-3 seconds once the failover itself starts). After an unplanned master failure, writes are also unavailable for the down-after-milliseconds detection period that precedes it. Asynchronous replication means recently written data (not yet synced to replicas) may be lost. Use min-replicas-to-write and min-replicas-max-lag to limit the data loss window.
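
The replica selection order described above (priority, then replication offset, then Run ID) can be expressed as a single sort key. The sketch below is illustrative only: `ReplicaInfo` and `pick_replica` are hypothetical names, and the check is reduced to the three criteria listed in this section plus the "disconnected" and "priority 0" exclusions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReplicaInfo:
    run_id: str
    priority: int          # replica-priority; 0 means "never promote"
    repl_offset: int       # how much of the master's stream was processed
    connected: bool = True


def pick_replica(replicas: List[ReplicaInfo]) -> Optional[ReplicaInfo]:
    """Return the promotion candidate, or None if no replica is eligible."""
    candidates = [r for r in replicas if r.connected and r.priority != 0]
    if not candidates:
        return None
    # Lower priority wins, then higher offset, then smallest run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    ReplicaInfo(run_id="bbb", priority=100, repl_offset=500),
    ReplicaInfo(run_id="ccc", priority=100, repl_offset=900),
    ReplicaInfo(run_id="aaa", priority=0, repl_offset=1000),   # never promoted
    ReplicaInfo(run_id="aab", priority=100, repl_offset=900),
]
print(pick_replica(replicas).run_id)
```

Here the replica with priority 0 is skipped despite having the highest offset, and the tie between the two replicas at offset 900 is broken by Run ID.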

6. Client Connection

Node.js (ioredis)

    const Redis = require("ioredis");

    const client = new Redis({
      sentinels: [
        { host: "sentinel1", port: 26379 },
        { host: "sentinel2", port: 26379 },
        { host: "sentinel3", port: 26379 },
      ],
      name: "mymaster",
      password: "YourStrongPassword",
      sentinelPassword: "SentinelPass", // Redis 6+ sentinel ACL
      role: "master",
      enableTLSForSentinelMode: false,
      sentinelRetryStrategy: (times) => Math.min(times * 100, 3000),
    });

    client.on("error", (err) => console.error("Redis error:", err));

Python (redis-py)

    from redis.sentinel import Sentinel

    sentinel = Sentinel(
        [("sentinel1", 26379), ("sentinel2", 26379), ("sentinel3", 26379)],
        socket_timeout=0.5,
        sentinel_kwargs={"password": "SentinelPass"},
        password="YourStrongPassword",
    )

    master = sentinel.master_for("mymaster", socket_timeout=0.5)
    replica = sentinel.slave_for("mymaster", socket_timeout=0.5)

    master.set("key", "value")
    print(replica.get("key"))  # read from replica

Java (Lettuce)

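    import io.lettuce.core.ReadFrom;
    import io.lettuce.core.RedisClient;
    import io.lettuce.core.RedisURI;
    import io.lettuce.core.codec.StringCodec;
    import io.lettuce.core.masterreplica.MasterReplica;
    import io.lettuce.core.masterreplica.StatefulRedisMasterReplicaConnection;

    // Discover the master named "mymaster" through any of the three Sentinels
    RedisURI uri = RedisURI.Builder
        .sentinel("sentinel1", 26379, "mymaster")
        .withSentinel("sentinel2", 26379)
        .withSentinel("sentinel3", 26379)
        .withPassword("YourStrongPassword".toCharArray())
        .build();

    RedisClient client = RedisClient.create(uri);
    StatefulRedisMasterReplicaConnection<String, String> conn =
        MasterReplica.connect(client, StringCodec.UTF8, uri);

    // Prefer replicas for reads, fall back to the master
    conn.setReadFrom(ReadFrom.REPLICA_PREFERRED);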

Go (go-redis)

    import (
        "time"

        "github.com/redis/go-redis/v9"
    )

    client := redis.NewFailoverClient(&redis.FailoverOptions{
        MasterName:       "mymaster",
        SentinelAddrs:    []string{"sentinel1:26379", "sentinel2:26379", "sentinel3:26379"},
        SentinelPassword: "SentinelPass",
        Password:         "YourStrongPassword",
        DB:               0,
        DialTimeout:      5 * time.Second,
        ReadTimeout:      3 * time.Second,
        WriteTimeout:     3 * time.Second,
    })

7. Sentinel vs Redis Cluster

| Feature | Sentinel | Redis Cluster |
| --- | --- | --- |
| Primary Use | High availability (HA) | HA + data sharding |
| Data Sharding | No, single master holds all data | Auto-sharded across 16384 hash slots |
| Data Capacity | Limited to single-node memory | Scales linearly |
| Min Nodes | 3 Sentinels + 1 master + 2 replicas = 6 | 3 masters + 3 replicas = 6 |
| Failover Time | ~5-15s | ~1-5s |
| Client Complexity | Requires Sentinel-aware client | Requires Cluster-aware client, handles MOVED/ASK redirects |
| Multi-key Ops | Fully supported (single master) | Only for same-slot keys, use {hash_tag} |
| Best For | Datasets < 25GB, simple HA needed | Large datasets, horizontal scaling |
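
To make the "same-slot keys" restriction concrete, the sketch below computes a Redis Cluster hash slot in pure Python. It assumes the published Cluster key-hashing rule (CRC16 of the key modulo 16384, hashing only the text inside the first non-empty {...} when present); `key_slot` is a hypothetical helper name, and the standard-library `binascii.crc_hqx` is used here as the CRC16 implementation.

```python
import binascii


def key_slot(key: bytes) -> int:
    """Map a key to one of the 16384 Redis Cluster hash slots."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % 16384


# Keys sharing a hash tag land in the same slot, so multi-key
# commands across them are allowed in Cluster mode.
print(key_slot(b"{user:42}:profile") == key_slot(b"{user:42}:cart"))
```

With Sentinel none of this applies: a single master holds every key, so multi-key commands work regardless of key names.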

8. Production Best Practices

Network & Deployment Topology

Deploy 3 or 5 Sentinels across different availability zones or racks. Avoid co-locating Sentinel with Redis data nodes on the same machine -- if that machine dies, you lose both a voter and a monitored instance. Use an odd number of Sentinels (3 or 5); even numbers add complexity without improving fault tolerance.

Quorum Sizing

Quorum controls the ODOWN detection threshold, but failover election requires a majority (N/2+1). 3 nodes quorum=2: tolerates 1 Sentinel failure. 5 nodes quorum=3: tolerates 2 failures, suitable for 3-datacenter deployments.
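
The arithmetic behind these numbers can be checked with a few lines of code. This is a small sketch, with `sentinel_limits` as a hypothetical helper name; it assumes a failover needs votes from a majority of all configured Sentinels and from at least quorum Sentinels, whichever is larger.

```python
def sentinel_limits(total: int, quorum: int) -> dict:
    """Summarize what a Sentinel group of `total` nodes can tolerate."""
    majority = total // 2 + 1
    # A failover needs a majority of votes, and never fewer than quorum.
    votes_needed = max(majority, quorum)
    return {
        "majority": majority,
        "votes_needed_for_failover": votes_needed,
        "tolerated_sentinel_failures": total - votes_needed,
    }


for total, quorum in [(3, 2), (4, 2), (5, 3)]:
    print(total, quorum, sentinel_limits(total, quorum))
```

The 4-node row shows why even counts are discouraged: a fourth Sentinel raises the majority to 3 but still tolerates only 1 failure, the same as 3 nodes.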

Monitoring & Alerting

    # Subscribe to all Sentinel events for monitoring
    redis-cli -p 26379 PSUBSCRIBE '*'

    # Key events to alert on:
    #   +sdown / -sdown     : SDOWN state changes
    #   +odown / -odown     : ODOWN state changes
    #   +switch-master      : master address changed (failover completed)
    #   +failover-end       : failover finished
    #   -failover-abort-*   : failover aborted (investigate!)

    # Check quorum health periodically
    redis-cli -p 26379 SENTINEL ckquorum mymaster
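
An alerting hook usually needs the new master address out of the +switch-master event. The sketch below parses that event's payload, assuming the format `<master-name> <old-ip> <old-port> <new-ip> <new-port>`; `SwitchMaster` and `parse_switch_master` are hypothetical names, and the sample addresses are placeholders.

```python
from typing import NamedTuple, Tuple


class SwitchMaster(NamedTuple):
    master_name: str
    old_addr: Tuple[str, int]
    new_addr: Tuple[str, int]


def parse_switch_master(payload: str) -> SwitchMaster:
    """Parse the message body published on the +switch-master channel."""
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return SwitchMaster(name, (old_ip, int(old_port)), (new_ip, int(new_port)))


event = parse_switch_master("mymaster 192.168.1.10 6379 192.168.1.11 6380")
print(f"{event.master_name}: {event.old_addr} -> {event.new_addr}")
```

A payload with the wrong number of fields raises ValueError from the tuple unpacking, which is a reasonable signal to log the raw message instead of acting on it.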

Security: TLS + ACL

    # sentinel.conf: TLS (Redis 6+)
    tls-port 26380
    port 0
    tls-cert-file /etc/redis/tls/sentinel.crt
    tls-key-file /etc/redis/tls/sentinel.key
    tls-ca-cert-file /etc/redis/tls/ca.crt
    tls-replication yes
    tls-auth-clients optional

    # Redis 6+ ACL: create sentinel-specific user
    # On master/replica redis.conf:
    user sentinel-user on >SentinelPass ~* +ping +info +multi +exec +subscribe +psubscribe +publish +slaveof +replicaof +config|rewrite +config|set +client|setname +client|kill +client|getname

Data Safety Settings

    # On master redis.conf: reject writes if too few replicas are connected.
    # This limits data loss during network partitions.
    min-replicas-to-write 1
    # Maximum replica lag, in seconds (redis.conf does not allow trailing comments)
    min-replicas-max-lag 10
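
The effect of these two settings can be modeled in a few lines. This is a simplified sketch of the rule, not Redis source code: `master_accepts_writes` is a hypothetical name, and each entry in `replica_lags` is the number of seconds since that replica last acknowledged the master.

```python
from typing import Iterable


def master_accepts_writes(
    replica_lags: Iterable[float],
    min_replicas_to_write: int = 1,
    min_replicas_max_lag: float = 10,
) -> bool:
    """Return True if enough replicas are within the allowed lag."""
    if min_replicas_to_write == 0:
        return True  # feature disabled
    good = sum(1 for lag in replica_lags if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write


print(master_accepts_writes([2, 15]))    # one replica is fresh enough
print(master_accepts_writes([15, 30]))   # all replicas lag too far behind
print(master_accepts_writes([]))         # isolated master with no replicas
```

The last case is the partition scenario: an old master cut off from its replicas stops accepting writes instead of silently diverging from the newly promoted master.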

9. Troubleshooting

| Issue | Cause & Solution |
| --- | --- |
| Split-Brain | Network partition causes two masters to coexist. Prevention: set min-replicas-to-write 1 so the isolated old master rejects writes when it cannot reach replicas. Ensure Sentinels are deployed across at least 3 network zones. |
| SDOWN Flapping | Sentinel repeatedly toggles between SDOWN and normal. Usually caused by down-after-milliseconds set too low or network latency jitter. Fix: increase timeout to 10000ms+, check network quality between Sentinel and master. |
| Failover Not Triggering | Possible causes: (1) not enough Sentinels online for quorum -- verify with SENTINEL ckquorum. (2) A failed failover within the failover-timeout cooldown period. (3) No eligible replicas (all have priority=0 or are disconnected). |
| Replication Lag After Failover | Replicas of the new master may need a full sync (RDB transfer) when a partial resync is not possible. Mitigation: enable repl-diskless-sync yes (the default since Redis 7.0), increase repl-backlog-size to make partial resync more likely, and keep parallel-syncs at 1 so that replicas do not all sync at once and saturate bandwidth. |
| Sentinel Config Drift | Sentinel auto-rewrites sentinel.conf (adding discovered replicas and Sentinels). If config management tools (Ansible/Puppet) overwrite this file, state becomes inconsistent. Fix: only manage the initial config, let Sentinel self-govern after startup. |

10. FAQ

Can Sentinel monitor multiple Redis masters?

Yes. Add multiple sentinel monitor lines in sentinel.conf, each with a different name. Each master group can have its own quorum, timeout, and password settings. This is useful when providing HA for multiple independent applications.
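
As a rough illustration, a sentinel.conf covering several master groups can be generated from a list of groups. The sketch below only uses directives described in this guide; `render_monitor_block`, the group names, and the addresses are hypothetical placeholders.

```python
def render_monitor_block(name, ip, port, quorum,
                         down_after_ms=5000, failover_timeout_ms=60000):
    """Return the sentinel.conf lines for one monitored master group."""
    return "\n".join([
        f"sentinel monitor {name} {ip} {port} {quorum}",
        f"sentinel down-after-milliseconds {name} {down_after_ms}",
        f"sentinel parallel-syncs {name} 1",
        f"sentinel failover-timeout {name} {failover_timeout_ms}",
    ])


# Hypothetical groups: names and addresses are placeholders.
groups = [
    ("sessions", "192.168.1.10", 6379, 2),
    ("cache", "192.168.1.20", 6379, 2),
]
config = "\n\n".join(render_monitor_block(*group) for group in groups)
print(config)
```

Because every per-master directive repeats the group name, each group keeps its own quorum and timeouts even though all groups share one Sentinel process.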

What are the considerations for Sentinel in Docker/Kubernetes?

The main challenge in containers is NAT and port mapping. Sentinel broadcasts its discovered IP to other Sentinels, but in Docker this may be an internal container IP. Solutions: (1) Use sentinel announce-ip and sentinel announce-port to declare externally reachable addresses. (2) Redis 6.2+ supports sentinel resolve-hostnames yes for DNS names. (3) In Kubernetes, use StatefulSet + Headless Service.

Can data be lost during Sentinel failover?

Yes, because Redis uses asynchronous replication. The master accepts writes and syncs to replicas in the background. If the master crashes before sync completes, un-propagated writes are lost. Use min-replicas-to-write + min-replicas-max-lag to shrink the loss window: the master rejects writes when not enough low-lag replicas are connected, limiting maximum data loss to max-lag seconds.

What is the difference between quorum and majority?

Quorum is the minimum Sentinels needed to detect ODOWN, configurable in sentinel monitor. Majority is N/2+1, used for leader election. Even if quorum is set to 1, failover still requires a majority of Sentinels online to elect a leader. Example with 5 Sentinels + quorum=2: 2 agreements mark ODOWN, but 3 must be online to elect a leader and execute failover.

How to achieve zero-downtime Redis version upgrades?

Recommended rolling upgrade: (1) Upgrade all replicas first (restart one at a time). (2) Manually trigger SENTINEL failover mymaster to switch master to an upgraded replica. (3) Upgrade the old master (now a replica). (4) Finally, upgrade Sentinel nodes one by one. Service remains available throughout, with only 1-3 seconds write interruption during failover.

How large a Redis deployment can Sentinel handle?

Sentinel is designed for single-master scenarios where data fits in one machine's memory (typically 25-100GB). If you need more capacity than a single machine provides, or need to scale writes horizontally, use Redis Cluster. Sentinel can monitor multiple master groups simultaneously (each fails over independently), but each group remains a single-master architecture.
