Data Consistency & Replication

Distributed Transactions & Two-Phase Commit

18 min Lesson 8 of 10

A distributed transaction is an operation that must succeed or fail atomically across two or more independent services or databases. The classic example: a payment system that must (1) debit the customer account, (2) credit the merchant account, and (3) update an inventory record — all on separate databases. If step 2 succeeds but step 3 crashes, money has moved but stock is wrong. That is the core problem distributed transactions solve.

Why Local ACID Is Not Enough

Within a single database, the engine gives you ACID for free: a BEGIN … COMMIT either applies every write or rolls them all back. Across services, there is no shared transaction log, and each service can see only its own data store. Coordinating them requires a protocol that all participants can trust even when the network fails midway.
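The contrast can be sketched with Python's built-in sqlite3 module. This is only an illustration: the account table, the CHECK constraint that stands in for "insufficient funds", and the helper names (make_db, local_transfer, naive_cross_db_transfer) are all assumptions made for this example.

```python
import sqlite3

def make_db(accounts):
    """Create an in-memory database holding the given {account_id: balance} rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE account ("
        "id TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    db.executemany("INSERT INTO account VALUES (?, ?)", list(accounts.items()))
    db.commit()
    return db

def balance(db, account_id):
    row = db.execute("SELECT balance FROM account WHERE id = ?", (account_id,)).fetchone()
    return row[0]

def local_transfer(db, amount):
    """Credit and debit inside ONE local transaction: all or nothing."""
    try:
        with db:  # commits on success, rolls back if an exception escapes
            db.execute("UPDATE account SET balance = balance + ? WHERE id = 'merchant'", (amount,))
            db.execute("UPDATE account SET balance = balance - ? WHERE id = 'customer'", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK failed, so the credit was rolled back too

def naive_cross_db_transfer(customer_db, merchant_db, amount):
    """Same two writes, but on two databases with no coordination protocol."""
    with merchant_db:  # this local transaction commits on its own
        merchant_db.execute(
            "UPDATE account SET balance = balance + ? WHERE id = 'merchant'", (amount,)
        )
    try:
        with customer_db:
            customer_db.execute(
                "UPDATE account SET balance = balance - ? WHERE id = 'customer'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # too late: the merchant credit is already durable
```

With a customer balance of 50 and a transfer of 80, local_transfer leaves both balances untouched, while naive_cross_db_transfer leaves the merchant credited and the customer never debited.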

Two-Phase Commit (2PC)

Two-Phase Commit is the canonical solution. It introduces a neutral Coordinator (often the service that initiates the transaction) and one or more Participants (each owning a resource — database, queue, cache). The protocol runs in two phases:

Phase 1 — Prepare (Voting): The coordinator sends a PREPARE request to every participant. Each participant writes the intended changes to a durable write-ahead log, locks the relevant rows, and replies YES (ready to commit) or NO (abort).
Phase 2 — Commit or Abort: If all participants voted YES, the coordinator logs COMMIT and sends COMMIT to everyone. If any participant voted NO or timed out, the coordinator logs ABORT and sends ROLLBACK. Participants apply or undo their changes and release locks.

Two-Phase Commit: Phase 1 collects votes; Phase 2 commits only if all voted YES, otherwise rolls back.
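The two phases can be sketched as a small in-memory simulation. This is a teaching sketch under simplifying assumptions, not a production implementation: the Participant class, its will_vote_yes flag, and the list-based logs are hypothetical stand-ins for real resources, durable write-ahead logs, and row locks, and no network or crash handling is modelled.

```python
from dataclasses import dataclass, field

@dataclass
class Participant:
    name: str
    will_vote_yes: bool = True           # hypothetical switch to simulate a NO vote
    state: str = "INIT"                  # INIT -> PREPARED -> COMMITTED or ABORTED
    log: list = field(default_factory=list)

    def prepare(self) -> bool:
        """Phase 1: record intent (stand-in for WAL write + locks) and vote."""
        if not self.will_vote_yes:
            self.log.append("ABORT")
            self.state = "ABORTED"
            return False
        self.log.append("PREPARED")
        self.state = "PREPARED"
        return True

    def commit(self) -> None:
        """Phase 2: apply the changes and release locks."""
        self.log.append("COMMIT")
        self.state = "COMMITTED"

    def rollback(self) -> None:
        """Phase 2: undo the changes and release locks."""
        if self.state != "ABORTED":
            self.log.append("ABORT")
        self.state = "ABORTED"

def two_phase_commit(participants, coordinator_log):
    # Phase 1: collect a vote from every participant.
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except Exception:
            votes.append(False)          # an error or timeout counts as NO

    # The decision is logged BEFORE any Phase 2 message is sent.
    decision = "COMMIT" if all(votes) else "ABORT"
    coordinator_log.append(decision)

    # Phase 2: deliver the same decision to everyone.
    for p in participants:
        if decision == "COMMIT":
            p.commit()
        else:
            p.rollback()
    return decision
```

Calling two_phase_commit with one participant whose will_vote_yes is False drives every participant to ABORTED, including those that voted YES.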

The Blocking Problem

2PC is a blocking protocol. Consider what happens if the coordinator crashes immediately after logging COMMIT but before sending the COMMIT message to participants. The participants are stuck: they have locked their rows and cannot decide whether to commit or abort on their own. They must wait until the coordinator recovers. During that window, those rows are unavailable — potentially for minutes or longer.

Coordinator failure = indefinite lock. This is 2PC's most dangerous failure mode. In production, the coordinator must persist its decision to durable storage (e.g., a WAL or dedicated transaction log database) before sending Phase 2 messages. On recovery, it re-reads its log and retransmits the decision to any participant that has not yet acknowledged.
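One way to picture the recovery step is a coordinator that appends decisions and acknowledgements to an fsynced log file and, after a restart, resends every decision that still lacks an acknowledgement. The sketch below assumes a JSON-lines file format and the hypothetical names DecisionLog, pending_deliveries, and recover; the send callback stands in for whatever transport delivers Phase 2 messages.

```python
import json
import os
import tempfile

class DecisionLog:
    """Append-only JSON-lines log; each append is flushed and fsynced."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())   # durable before any Phase 2 message goes out

    def records(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

def pending_deliveries(log):
    """Return {tx: (decision, [participants that have not acknowledged])}."""
    decisions, expected, acked = {}, {}, {}
    for rec in log.records():
        if rec["type"] == "decision":
            decisions[rec["tx"]] = rec["decision"]
            expected[rec["tx"]] = set(rec["participants"])
        elif rec["type"] == "ack":
            acked.setdefault(rec["tx"], set()).add(rec["participant"])
    pending = {}
    for tx, decision in decisions.items():
        missing = sorted(expected[tx] - acked.get(tx, set()))
        if missing:
            pending[tx] = (decision, missing)
    return pending

def recover(log, send):
    """After a restart: retransmit each logged decision until it is acknowledged."""
    for tx, (decision, missing) in pending_deliveries(log).items():
        for participant in missing:
            if send(participant, tx, decision):   # True means the participant acked
                log.append({"type": "ack", "tx": tx, "participant": participant})

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = DecisionLog(os.path.join(tmp, "coordinator.log"))
        log.append({"type": "decision", "tx": "tx1", "decision": "COMMIT",
                    "participants": ["accounts", "ledger"]})
        # Simulated crash here: nothing was sent. On restart, resend everything.
        recover(log, lambda participant, tx, decision: True)
        print(pending_deliveries(log))
```

Because a decision may be delivered more than once, participants are assumed to treat a repeated COMMIT or ROLLBACK for the same transaction as a no-op.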

Real-World Usage: XA Transactions

The XA standard (defined by the Open Group) is the most common implementation of 2PC you will encounter. MySQL, PostgreSQL, Oracle, and IBM MQ all support XA. Java applications use it via the javax.transaction.xa.XAResource interface. The coordinator role is typically played by a transaction manager such as Atomikos, Narayana, or a JTA-compatible application server.

Example flow in a banking microservices context: the Payment Service acts as coordinator; it opens an XA transaction spanning the Accounts DB (PostgreSQL) and the Ledger DB (MySQL). Both databases prepare and vote YES, the coordinator records the commit decision, and both databases apply their changes. From the user's perspective, the debit and credit take effect as a unit: either both are applied or neither is.
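The lesson's XA example is Java-centric; the same flow can be sketched in Python using the optional two-phase commit extension of the DB-API (PEP 249), whose connection methods are xid, tpc_begin, tpc_prepare, tpc_commit, and tpc_rollback. psycopg2 implements this extension for PostgreSQL; whether your MySQL driver does is an assumption to verify. The function name transfer_with_tpc and the shape of the branches argument are hypothetical.

```python
def transfer_with_tpc(branches, gtrid, format_id=1):
    """Run one global transaction across several DB-API connections.

    branches: list of (branch_name, connection, sql, params) tuples, where each
    connection is assumed to support the PEP 249 two-phase commit extension.
    Returns True if every branch committed, False if the transaction was rolled back.
    """
    begun = []
    try:
        # Do the work on each branch inside its own transaction branch.
        for branch_name, conn, sql, params in branches:
            conn.tpc_begin(conn.xid(format_id, gtrid, branch_name))
            begun.append(conn)
            conn.cursor().execute(sql, params)

        # Phase 1: ask every branch to prepare (vote).
        for conn in begun:
            conn.tpc_prepare()
    except Exception:
        # Any failure before the decision means ABORT for every started branch.
        for conn in begun:
            conn.tpc_rollback()
        return False

    # Phase 2: all branches voted YES, so commit each one.
    # NOTE: a real transaction manager would durably log the decision first and
    # use tpc_recover() after a crash; this sketch omits that on purpose.
    for conn in begun:
        conn.tpc_commit()
    return True
```

A call might pass one branch for the Accounts DB with a debit statement and one for the Ledger DB with a credit statement, sharing a single global transaction id such as "payment-42" (an illustrative value).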

Performance Characteristics

A 2PC commit adds at minimum two network round trips and two disk fsyncs to every transaction: 2 × network RTT + 2 × fsync latency. At 5 ms RTT and 1 ms fsync, that is 2 × 5 + 2 × 1 = 12 ms of pure overhead before any application logic runs. For high-throughput systems (thousands of transactions per second), this compounds quickly. Locks held across the network further reduce concurrency. These costs explain why modern microservice architectures prefer eventual-consistency patterns (Sagas, covered in the next lesson) over 2PC for long-running or cross-service operations.
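The arithmetic can be checked with a couple of helper functions. The function names are hypothetical, and max_serial_tps rests on a simplifying assumption that is not from the lesson: that a contended row stays locked for the entire protocol overhead, so transactions touching that row run strictly one after another.

```python
def two_pc_overhead_ms(rtt_ms, fsync_ms):
    """Minimum protocol overhead: two round trips plus two fsyncs."""
    return 2 * rtt_ms + 2 * fsync_ms

def max_serial_tps(lock_hold_ms):
    """Upper bound on transactions per second against a single locked row,
    assuming the row is held for lock_hold_ms per transaction."""
    return 1000.0 / lock_hold_ms

overhead = two_pc_overhead_ms(rtt_ms=5, fsync_ms=1)   # 2*5 + 2*1 = 12 ms
hot_row_tps = max_serial_tps(overhead)                # 1000 / 12, about 83 per second
```

Under that assumption, a single hot row caps out at roughly 83 transactions per second regardless of how much hardware sits behind it, which is why lock hold time matters as much as raw latency.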

2PC is well-suited for: short, low-latency transactions across a small number of participants (2–3 databases in the same data center), where strict atomicity is non-negotiable — for example, financial double-entry bookkeeping or inventory reservation in a warehouse system. It is ill-suited for multi-service operations spanning the public internet or for operations that take more than a few hundred milliseconds.

Three-Phase Commit (3PC) — a Brief Note

Three-Phase Commit (3PC) attempts to solve the blocking problem by inserting a pre-commit phase between prepare and commit. If the coordinator crashes after pre-commit, participants can safely make a decision without waiting. However, 3PC introduces a new failure mode under network partition (split-brain commit vs. abort) and is rarely used in practice. Most production systems stay with 2PC and invest in high-availability coordinators rather than adopting 3PC's added complexity.

Alternatives and When to Use Them

Because of 2PC's blocking and performance costs, the industry has converged on these alternatives for most microservice scenarios:

Outbox Pattern + CDC: Write the event to an outbox table in the same local transaction as the business data. A Change-Data-Capture process (e.g., Debezium) tails the WAL and publishes the event. Atomic within one DB; eventual across services.
Sagas (Choreography / Orchestration): Break the distributed transaction into a sequence of local transactions, each publishing an event. Compensating transactions handle rollback. Covered in Lesson 9.
Google Spanner / CockroachDB: Distributed databases that implement their own distributed commit protocol internally (Spanner additionally relies on its TrueTime clock API), hiding the complexity from the application and achieving serializable isolation at global scale.

Decision rule: Use 2PC when you control all participants and they are co-located (same data center, sub-millisecond RTT), when you need atomicity and can tolerate the blocking risk, and when the transaction is short-lived (under 100 ms). Otherwise, reach for Sagas or the Outbox pattern.
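The rule can be written down as a small predicate, which makes the thresholds explicit. The TxProfile fields and the recommend function are hypothetical names; the cut-offs (sub-millisecond RTT, under 100 ms duration) are taken directly from the rule above and should be treated as rules of thumb rather than hard limits.

```python
from dataclasses import dataclass

@dataclass
class TxProfile:
    controls_all_participants: bool
    rtt_ms: float                  # network round-trip time between participants
    expected_duration_ms: float    # how long the transaction holds its locks
    needs_strict_atomicity: bool
    tolerates_blocking: bool       # can the system live with the coordinator-failure window?

def recommend(p: TxProfile) -> str:
    """Return "2PC" only when every condition of the decision rule holds."""
    if (
        p.controls_all_participants
        and p.rtt_ms < 1.0
        and p.expected_duration_ms < 100
        and p.needs_strict_atomicity
        and p.tolerates_blocking
    ):
        return "2PC"
    return "Saga or Outbox"
```

For instance, a ledger update across two databases in one data center with 0.3 ms RTT fits the 2PC profile, while the same operation across regions at 40 ms RTT does not.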

2PC versus Saga: choose based on latency requirements, participant topology, and tolerance for partial visibility.

Key Takeaways

2PC coordinates an atomic commit across multiple independent participants using a Prepare → Commit/Abort protocol.
The coordinator is a single point of failure; its log must be durable so it can retransmit decisions after a crash.
Participants hold locks between Phase 1 and Phase 2, creating a blocking window proportional to coordinator recovery time.
XA is the standard implementation; it is well-supported in relational databases and enterprise middleware.
For most modern microservice architectures, the Outbox Pattern and Sagas are preferable — they trade strict atomicity for availability and scalability.