Software Engineering Glossary

Gossip Protocol

Also known as: Epidemic Protocol Gossip Dissemination

A gossip protocol is a way to spread information through a cluster by having each node talk to a few random peers. Updates spread like a rumor and reach every node in a few rounds. There is no central server, so the protocol keeps working even when nodes fail or the network gets messy.

Key Takeaways

  • Gossip scales to thousands of nodes because each node only talks to a handful of peers per round.
  • It is fast in practice. An update usually reaches the whole cluster in O(log N) rounds.
  • Gossip is good for membership, failure detection, and metadata. It is not good for data that needs strong consistency.
  • It survives partitions and uneven failures better than a central control plane because there is no single point of failure.

How It Works

  1. On each gossip tick, every node picks one or a few random peers.
  2. It sends them a summary of its current view, like membership, version vectors, and heartbeat counters.
  3. The peer compares its own state, takes anything newer, and sends back anything the sender is missing.
  4. Both nodes end up with the same view of that subset, and the cycle repeats.

Where It Is Used

  • Cassandra uses gossip to spread cluster membership, schema versions, and load info.
  • DynamoDB and Riak use gossip style protocols and Phi accrual failure detection for membership and anti entropy.
  • Consul, Serf, and Memberlist use SWIM, a gossip based failure detector, for service health checks.