Leader Election
Also known as:
Master Election
Coordinator Election
Definition
Leader election is how a group of nodes picks one node to coordinate writes, order operations, or own a shard. The winner stays leader for some time. If it dies or stops sending heartbeats, the rest of the group runs another election to pick a new one.
Key Takeaways
- A single leader makes coordination simple. Clients have one place to send writes and ordering happens in one place.
- Most leader elections are built on a consensus protocol such as Paxos or Raft, or on leases backed by a strongly consistent store.
- Without fencing tokens, an old leader that resumes after a long pause (a garbage collection stall, for example) can keep writing as if it still held the lease. That is one cause of split brain.
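- Failover time is mostly determined by the lease TTL or election timeout, not by how fast the new leader can boot. The other nodes cannot take over until the old lease has expired or the timeout has fired.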
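The fencing point above can be illustrated with a minimal sketch. It assumes the protected resource itself checks tokens; `FencedStore`, its `write` method, and the token values are hypothetical names chosen for illustration, not the API of any particular system.

```python
class FencedStore:
    """Hypothetical storage service that rejects writes carrying a stale fencing token."""

    def __init__(self):
        self.highest_token = 0  # newest fencing token seen so far
        self.data = {}

    def write(self, key, value, token):
        # A token lower than the newest one seen means the writer
        # was leader in an earlier election and has since been replaced.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.data[key] = value
        return True


store = FencedStore()
store.write("config", "from leader A", token=1)  # accepted
store.write("config", "from leader B", token=2)  # accepted: B won a later election
# Leader A wakes up after a pause and still believes it is leader.
stale = store.write("config", "late write from A", token=1)  # rejected
```

The check lives in the store rather than in the leader because a paused leader cannot be trusted to notice on its own that its lease has expired.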
How It Works
- Each candidate tries to claim the leader slot in some shared place, like a Raft term vote, a ZooKeeper ephemeral znode, or an etcd lease key.
- The shared place makes sure only one candidate can hold the slot at a time.
- The winner publishes itself as leader, takes a fencing token, and starts sending heartbeats to keep the lease alive.
- If heartbeats stop or the lease expires, the other nodes start a new election, and the winner receives a higher token than any earlier leader.
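The steps above can be sketched with an in-memory stand-in for the shared place. This is a single-process illustration under stated assumptions: `LeaseStore`, `try_acquire`, and `heartbeat` are hypothetical names, time is passed in explicitly as `now` so the example is deterministic, and a real store would have to make the acquire step atomic across nodes (for example with a compare-and-swap).

```python
class LeaseStore:
    """Hypothetical in-memory stand-in for a strongly consistent store."""

    def __init__(self, ttl):
        self.ttl = ttl          # lease duration in seconds
        self.holder = None      # current leader, if any
        self.expires_at = 0.0   # time at which the current lease lapses
        self.token = 0          # fencing token, incremented on every election

    def try_acquire(self, node, now):
        # The slot is free if nobody holds it or the lease has expired.
        if self.holder is None or now >= self.expires_at:
            self.holder = node
            self.expires_at = now + self.ttl
            self.token += 1
            return self.token
        return None

    def heartbeat(self, node, token, now):
        # Renew only if the caller still holds an unexpired lease
        # and presents the current fencing token.
        if self.holder == node and token == self.token and now < self.expires_at:
            self.expires_at = now + self.ttl
            return True
        return False


store = LeaseStore(ttl=10.0)
token_a = store.try_acquire("node-a", now=0.0)         # node-a wins the slot
token_b = store.try_acquire("node-b", now=1.0)         # slot is held, no token
renewed = store.heartbeat("node-a", token_a, now=5.0)  # lease extended to 15.0
# node-a stalls and misses its heartbeats; the lease lapses at 15.0.
token_b2 = store.try_acquire("node-b", now=16.0)       # node-b wins with a higher token
late = store.heartbeat("node-a", token_a, now=17.0)    # old leader can no longer renew
```

In this sketch node-b cannot take over before time 15.0, which is why the lease TTL, rather than startup speed, dominates failover time.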
Where It Is Used
- Kubernetes uses a Lease object to elect the active controller manager and scheduler.
- Kafka’s controller is elected through ZooKeeper or, in KRaft mode, through a Raft-based quorum. It then chooses the leader replica for each partition.
- etcd and Consul expose leader election built on lease or session primitives combined with watches. Vault uses a lock in its HA storage backend to elect the active node.