Primary-replica vs peer-to-peer replication
One node owns writes and the rest copy its changes, or every node is an equal that can take writes: simplicity and a single write order versus availability and no single bottleneck.
Two replication shapes
When you keep copies of data on multiple nodes, who is allowed to write?
- Primary-replica (leader-based) — one node (the primary) accepts writes and streams them to read-only replicas.
- Peer-to-peer (leaderless / multi-leader) — every node is an equal that can accept reads and writes; nodes gossip changes to each other.
Primary-replica
- Data flow: writes → primary → replicas (one direction).
- Wins: simple and consistent — one node orders all writes, so no write–write conflicts; replicas scale reads; clear failover target.
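- Costs: the primary is a write bottleneck and a single point of failure for writes; when it dies you need failover (electing or promoting a replica), and with asynchronous replication any writes the new primary had not yet received are lost; replica reads can be stale.
- Use when: read-heavy workloads, moderate write volume, and a need for one authoritative write order (reads from replicas may still lag). The default for most relational databases.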
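The one-directional data flow and the stale-read cost can be sketched in a few lines. This is a toy in-memory model, not how any particular database implements replication; the `Primary` and `Replica` classes and the explicit `replicate()` step (standing in for asynchronous log shipping) are assumptions made for illustration.

```python
class Replica:
    """Read-only copy that applies the primary's log in order."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def apply(self, log):
        # Apply only the entries this replica has not seen yet, in log order.
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)

    def read(self, key):
        return self.data.get(key)


class Primary:
    """The only node that accepts writes, so it alone defines the write order."""

    def __init__(self, replicas):
        self.data = {}
        self.log = []  # ordered list of (key, value) writes
        self.replicas = replicas

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def replicate(self):
        # Asynchronous shipping, modeled here as an explicit step.
        for replica in self.replicas:
            replica.apply(self.log)


replica = Replica()
primary = Primary([replica])

primary.write("cart", ["book"])
stale = replica.read("cart")  # replica has not caught up yet
primary.replicate()
fresh = replica.read("cart")  # replica now matches the primary
```

Between `write` and `replicate`, the replica returns `None` for `"cart"`: that gap is the replication lag behind stale reads, and it is also the window of writes that would be lost if the primary died before shipping its log.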
Peer-to-peer
- Data flow: any node takes writes; changes propagate to peers (gossip).
- Wins: no single write bottleneck or SPOF; excellent write availability and scalability; great for multi-region (write locally anywhere).
- Costs: write conflicts when two nodes update the same key concurrently — needs resolution (last-write-wins, version vectors, CRDTs); typically eventual consistency; much harder to reason about.
- Use when: very high write volume, multi-region writes, or availability matters more than strict consistency. The Dynamo/Cassandra model (paired with quorums).
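Version vectors, one of the resolution tools named above, are what let a node tell "this version replaces that one" apart from "these two were written concurrently". A minimal sketch, assuming each version carries a dict from node id to a per-node update counter; the function names `compare` and `merge` and the node ids are hypothetical.

```python
def compare(a, b):
    """Compare two version vectors (dicts of node id -> counter).

    Returns "equal", "a_newer", "b_newer", or "concurrent".
    """
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither history contains the other: a conflict
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(a, b):
    """Element-wise maximum: the vector that covers both histories."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


base = {"n1": 1}
on_n1 = {"n1": 2}           # n1 updated the key again
on_n2 = {"n1": 1, "n2": 1}  # n2 updated the same key independently

causal = compare(on_n1, base)      # on_n1 descends from base
conflict = compare(on_n1, on_n2)   # written without seeing each other
merged = merge(on_n1, on_n2)       # vector to attach to the resolved value
```

The vector only detects the conflict. Choosing the surviving value is still up to the application or to a policy such as last-write-wins, which is simpler but silently discards one of the concurrent updates.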
The side-by-side
| | Primary-replica | Peer-to-peer |
|---|---|---|
| Who writes | One primary | Any node |
| Consistency | Stronger, simpler | Eventual; conflicts to resolve |
| Write scaling | Limited (one primary) | Horizontal |
| SPOF on writes | Yes (until failover) | No |
| Complexity | Lower | Higher |
| Examples | Postgres/MySQL, MongoDB | Cassandra, Riak, Amazon's original Dynamo design |
The core trade-off
Simplicity and a single write order (primary-replica) versus write availability and no bottleneck (peer-to-peer), where the peer-to-peer gains are bought with conflict resolution and eventual consistency. It’s CAP/PACELC again: leader-based setups tend to lean CP (consistent), leaderless ones tend to lean AP (available).
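In quorum-based leaderless stores, how far the system leans toward availability is usually tunable per operation. With N replicas, W write acknowledgements, and R read responses, the common rule of thumb is that R + W > N makes every read quorum overlap every write quorum. The sketch below only does that arithmetic; the function names are hypothetical, and overlap alone does not give linearizability in a real system.

```python
def quorums_overlap(n, w, r):
    """True if every read quorum shares at least one node with every write quorum."""
    return r + w > n


def tolerated_failures(n, w, r):
    """Nodes that can be down while both reads and writes can still reach a quorum."""
    return n - max(w, r)


# N=3, W=2, R=2: reads see the latest acknowledged write, one node may be down.
balanced = (quorums_overlap(3, 2, 2), tolerated_failures(3, 2, 2))

# N=3, W=1, R=1: more available and faster, but a read may miss the latest write.
fast = (quorums_overlap(3, 1, 1), tolerated_failures(3, 1, 1))
```

Here `balanced` evaluates to `(True, 1)` and `fast` to `(False, 2)`: lowering W and R buys tolerance for one more failed node at the price of possibly stale reads.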
The interview cue
“Reads dominate and I want simple consistency, so primary with read replicas; I’ll note the primary is a write bottleneck and design failover. If this had to accept writes in every region with high availability, I’d switch to a leaderless, quorum-based store and handle conflicts with version vectors.” Picking the shape and naming its failure mode is the signal.