Database Replication
Database Replication
A single database server is a single point of failure. If that machine crashes, your entire application goes down. Even before an outage, a single server has a hard ceiling on the read throughput it can handle — every query, analytical report, and background job competes for the same CPU, memory, and I/O. Replication is the practice of keeping synchronized copies of your data on multiple database servers, solving both problems at once: it eliminates the single point of failure and distributes read load across many machines.
Understanding replication deeply — especially the trade-offs between synchronous and asynchronous modes — is essential for any engineer designing systems that must be both highly available and consistent.
Leader-Follower Replication
The most common replication topology is leader-follower (also called primary-replica or master-slave). The rules are simple:
- One node is designated the leader (primary). All writes go to the leader — it is the single authoritative source of truth.
- One or more nodes are followers (replicas). They receive a copy of every write from the leader and apply it to their own storage. Followers service read queries.
- Clients route writes to the leader and reads to any follower (or the leader, depending on consistency needs).
This model is used by PostgreSQL streaming replication, MySQL binary log replication, MongoDB replica sets (in primary-secondary mode), and most managed databases (AWS RDS read replicas, Google Cloud SQL, etc.).
How Replication Works Internally
When a write is committed on the leader, the database records it in a replication log (called the Write-Ahead Log in PostgreSQL, or the binary log in MySQL). The leader streams log entries to each follower. Each follower applies those entries in order, eventually holding an identical copy of the data.
The key question every replication system must answer is: when does the leader consider a write "done"? This gives us the two fundamental modes.
Synchronous Replication
In synchronous mode, the leader waits for acknowledgment from at least one follower before confirming the write to the client. The write is not considered complete until the follower has durably stored the data.
Guarantee: If the leader crashes immediately after confirming a write, at least one follower has the data. No committed write is ever lost.
Cost: Every write must survive a network round-trip to the follower before the client gets a response. If your leader and follower are in the same data center, this adds 1–5 ms. Across regions, it adds 50–200 ms — and that latency is on the critical path of every write.
Asynchronous Replication
In asynchronous mode, the leader confirms the write to the client immediately after writing to its own local disk. It ships the change to followers in the background — the client never waits for followers to acknowledge.
Benefit: Write latency is as low as a single-server database. Follower failures never block the leader. This is the default mode in MySQL replication and the default for AWS RDS read replicas.
Cost: If the leader crashes after confirming a write but before followers have received it, that write is lost. This is called replication lag, and it introduces a consistency risk: a client might write data to the leader and then read from a follower that has not yet applied that write, seeing stale data.
pg_stat_replication.write_lag in PostgreSQL or Seconds_Behind_Source in MySQL) and alert when it exceeds your tolerance.
Synchronous vs. Asynchronous: Side-by-Side Comparison
Failover and Promotion
When the leader fails, one of the followers must be promoted to become the new leader. This is called failover. How smoothly this happens depends on your replication mode:
- With synchronous replication: the promoted follower is guaranteed to have every committed write. Failover is clean and data loss is zero.
- With asynchronous replication: the follower may be slightly behind. Any writes that the leader confirmed to clients but had not yet shipped to the follower are lost. Most managed databases (RDS, Cloud SQL) accept this trade-off because the write loss window is typically sub-second under normal conditions.
After promotion, the old followers must be reconfigured to follow the new leader. In systems like PostgreSQL Patroni or MySQL Orchestrator, this reconfiguration is automatic.
Read Scaling and Read-Your-Writes Consistency
One of the key benefits of follower replicas is read scaling: you can add replicas to absorb more read traffic without touching the leader. A single leader with five followers can serve 5× the read queries of a standalone server, while writes remain on the leader.
However, this introduces a subtle consistency problem. Suppose a user updates their profile picture. The write goes to the leader. A millisecond later the same user refreshes the page — but the read is routed to a follower that has not yet applied the write. The user sees their old profile picture. This violation is called stale read or breaking read-your-writes consistency.
Common mitigations:
- Route reads that the current user just wrote (within N seconds) back to the leader.
- Track the replication position the client last wrote to; route reads only to followers that have caught up to that position.
- Accept a small inconsistency window for data the user did not just modify (e.g. a social feed — seeing a post a second late is fine).
Semi-Synchronous Replication
MySQL offers a middle ground called semi-synchronous replication: the leader waits for at least one follower to confirm it has received the write (but not necessarily flushed it to disk). This eliminates zero data loss on leader crash at lower latency cost than fully synchronous replication, while ensuring the data has left the leader machine.
Summary
Leader-follower replication is the foundational pattern for database high availability and read scaling. The leader is the single write authority; followers stream the replication log and serve reads. The core trade-off is between synchronous mode — which guarantees no committed write is lost on failover but adds network latency to every write — and asynchronous mode — which delivers low-latency writes at the cost of a potential data loss window and eventual consistency for reads. In practice most large-scale systems use asynchronous replication by default and layer compensating techniques (routing, replication position tracking) on top to achieve acceptable consistency for their use cases.