Databases & Storage

Database Replication

18 min Lesson 5 of 10

Database Replication

A single database server is a single point of failure. If that machine crashes, your entire application goes down. Even before an outage, a single server has a hard ceiling on the read throughput it can handle — every query, analytical report, and background job competes for the same CPU, memory, and I/O. Replication is the practice of keeping synchronized copies of your data on multiple database servers, solving both problems at once: it eliminates the single point of failure and distributes read load across many machines.

Understanding replication deeply — especially the trade-offs between synchronous and asynchronous modes — is essential for any engineer designing systems that must be both highly available and consistent.

Leader-Follower Replication

The most common replication topology is leader-follower (also called primary-replica or master-slave). The rules are simple:

One node is designated the leader (primary). All writes go to the leader — it is the single authoritative source of truth.
One or more nodes are followers (replicas). They receive a copy of every write from the leader and apply it to their own storage. Followers service read queries.
Clients route writes to the leader and reads to any follower (or the leader, depending on consistency needs).

This model is used by PostgreSQL streaming replication, MySQL binary log replication, MongoDB replica sets (in primary-secondary mode), and most managed databases (AWS RDS read replicas, Google Cloud SQL, etc.).

Leader-follower topology: all writes go to the single leader; replicas stream changes and serve read queries from the application.

How Replication Works Internally

When a write is committed on the leader, the database records it in a replication log (called the Write-Ahead Log in PostgreSQL, or the binary log in MySQL). The leader streams log entries to each follower. Each follower applies those entries in order, eventually holding an identical copy of the data.

The key question every replication system must answer is: when does the leader consider a write "done"? This gives us the two fundamental modes.

Synchronous Replication

In synchronous mode, the leader waits for acknowledgment from at least one follower before confirming the write to the client. The write is not considered complete until the follower has durably stored the data.

Guarantee: If the leader crashes immediately after confirming a write, at least one follower has the data. No committed write is ever lost.

Cost: Every write must survive a network round-trip to the follower before the client gets a response. If your leader and follower are in the same data center, this adds 1–5 ms. Across regions, it adds 50–200 ms — and that latency is on the critical path of every write.

Synchronous replication can block writes entirely. If the synchronous follower goes offline or becomes slow, the leader cannot confirm any write until that follower recovers. This is why most production systems designate at most ONE synchronous follower and make the rest asynchronous — a configuration PostgreSQL calls "synchronous_standby_names = 1".

Asynchronous Replication

In asynchronous mode, the leader confirms the write to the client immediately after writing to its own local disk. It ships the change to followers in the background — the client never waits for followers to acknowledge.

Benefit: Write latency is as low as a single-server database. Follower failures never block the leader. This is the default mode in MySQL replication and the default for AWS RDS read replicas.

Cost: If the leader crashes after confirming a write but before followers have received it, that write is lost. This is called replication lag, and it introduces a consistency risk: a client might write data to the leader and then read from a follower that has not yet applied that write, seeing stale data.

Replication lag is not a bug, it is a design choice. Under normal conditions asynchronous followers are only milliseconds behind. But under high write load or a slow network, lag can grow to seconds or even minutes. Always measure replication lag in production (e.g. pg_stat_replication.write_lag in PostgreSQL or Seconds_Behind_Source in MySQL) and alert when it exceeds your tolerance.

Synchronous vs. Asynchronous: Side-by-Side Comparison

Synchronous vs asynchronous replication: synchronous waits for a follower ACK before confirming to the client; asynchronous confirms immediately and ships the log in the background.

Failover and Promotion

When the leader fails, one of the followers must be promoted to become the new leader. This is called failover. How smoothly this happens depends on your replication mode:

With synchronous replication: the promoted follower is guaranteed to have every committed write. Failover is clean and data loss is zero.
With asynchronous replication: the follower may be slightly behind. Any writes that the leader confirmed to clients but had not yet shipped to the follower are lost. Most managed databases (RDS, Cloud SQL) accept this trade-off because the write loss window is typically sub-second under normal conditions.

After promotion, the old followers must be reconfigured to follow the new leader. In systems like PostgreSQL Patroni or MySQL Orchestrator, this reconfiguration is automatic.

Test your failover before you need it. Many teams discover their failover is broken only during an actual outage. Run scheduled failover drills (Netflix calls this "Chaos Engineering") to confirm that your application correctly reconnects to the new leader, that your monitoring detects the old leader as down within seconds, and that the total downtime is within your SLA.

Read Scaling and Read-Your-Writes Consistency

One of the key benefits of follower replicas is read scaling: you can add replicas to absorb more read traffic without touching the leader. A single leader with five followers can serve 5× the read queries of a standalone server, while writes remain on the leader.

However, this introduces a subtle consistency problem. Suppose a user updates their profile picture. The write goes to the leader. A millisecond later the same user refreshes the page — but the read is routed to a follower that has not yet applied the write. The user sees their old profile picture. This violation is called stale read or breaking read-your-writes consistency.

Common mitigations:

Route reads that the current user just wrote (within N seconds) back to the leader.
Track the replication position the client last wrote to; route reads only to followers that have caught up to that position.
Accept a small inconsistency window for data the user did not just modify (e.g. a social feed — seeing a post a second late is fine).

Semi-Synchronous Replication

MySQL offers a middle ground called semi-synchronous replication: the leader waits for at least one follower to confirm it has received the write (but not necessarily flushed it to disk). This eliminates zero data loss on leader crash at lower latency cost than fully synchronous replication, while ensuring the data has left the leader machine.

Summary

Leader-follower replication is the foundational pattern for database high availability and read scaling. The leader is the single write authority; followers stream the replication log and serve reads. The core trade-off is between synchronous mode — which guarantees no committed write is lost on failover but adds network latency to every write — and asynchronous mode — which delivers low-latency writes at the cost of a potential data loss window and eventual consistency for reads. In practice most large-scale systems use asynchronous replication by default and layer compensating techniques (routing, replication position tracking) on top to achieve acceptable consistency for their use cases.