Architecture Patterns

CQRS: Separating Read and Write Models

18 min Lesson 4 of 10

Command Query Responsibility Segregation (CQRS) is an architectural pattern that splits your application into two distinct paths: one for writes (commands that change state) and one for reads (queries that return data). The insight is simple but powerful — the shape of data you need to store reliably rarely matches the shape you need to display efficiently. Forcing both through a single model creates friction that grows with every feature you add.

The Problem with a Unified Model

Imagine an e-commerce order service. Writing an order requires strict validation, business rules, and ACID guarantees. Reading orders for a customer dashboard, however, needs a pre-joined, denormalized view with product names, statuses, and totals — often fetched by millions of concurrent users. A single Order model must satisfy both. The write path slows down because the read path demands complex eager-loading; the read path adds indexes that hurt write throughput. Scaling one means scaling both, even though their bottlenecks are completely different.

Core Concept: Commands vs. Queries

Command — an intent to change state: PlaceOrder, CancelOrder, UpdateInventory. A command is validated, executed against the write model (the command side), and either succeeds or fails. It returns no data beyond an acknowledgement.
Query — a request for data: GetOrderSummary, ListOrdersByCustomer. A query reads from the read model (the query side) and returns a DTO shaped exactly for the caller. It never mutates state.

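Key idea: A method should either change state or return data, never both. This principle, known as Command-Query Separation (CQS), was first articulated by Bertrand Meyer for individual methods. CQRS applies the same split one level up, to whole models and data stores.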
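The split between commands and queries described above can be sketched in a few lines. This is a minimal illustration, not a prescribed design: the class names, the field names, and the use of plain dictionaries as stores are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PlaceOrder:
    """Command: an intent to change state."""
    order_id: str
    customer_id: str
    total_cents: int


@dataclass(frozen=True)
class OrderSummary:
    """DTO returned by the query side, shaped for the caller."""
    order_id: str
    total_cents: int


class OrderCommandHandler:
    """Command side: validates and mutates, returns nothing."""

    def __init__(self, write_store: Dict[str, PlaceOrder]) -> None:
        self._write_store = write_store

    def handle(self, cmd: PlaceOrder) -> None:
        # Failure is signalled by an exception; success by returning None,
        # which acts as the acknowledgement.
        if cmd.total_cents <= 0:
            raise ValueError("order total must be positive")
        if cmd.order_id in self._write_store:
            raise ValueError("duplicate order id")
        self._write_store[cmd.order_id] = cmd


class OrderQueryHandler:
    """Query side: reads from the read model, never mutates it."""

    def __init__(self, read_store: Dict[str, List[OrderSummary]]) -> None:
        self._read_store = read_store

    def list_orders_by_customer(self, customer_id: str) -> List[OrderSummary]:
        # Return a copy so callers cannot change the store through the result.
        return list(self._read_store.get(customer_id, []))
```

Note that the two handlers take different stores in their constructors. Nothing in the query handler can reach the write store, which is the property the pattern is after.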

CQRS splits the application into a write path (commands → normalized DB) and a read path (queries → denormalized read store). Projections keep the read store in sync asynchronously.

The Read Model: Projections

When a command succeeds, it emits a domain event (e.g. OrderPlaced). A projection listens to those events and builds a read-optimized view, perhaps a flat document in Elasticsearch or a materialized table in a dedicated read database. Because the read model lives in its own store and is updated asynchronously, it can be denormalized, pre-joined, and indexed exactly for the queries your UI needs, with no joins at query time.
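A projection can be as small as a function that folds events into a view. The sketch below assumes a hypothetical OrderPlaced event carrying a product name and a total in cents, and uses an in-memory dictionary where a real system would use Elasticsearch or a read database.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class OrderPlaced:
    """Domain event emitted by the write side (fields are illustrative)."""
    order_id: str
    customer_id: str
    product_name: str
    total_cents: int


class CustomerOrdersProjection:
    """Builds a denormalized per-customer order list from OrderPlaced events."""

    def __init__(self) -> None:
        self.view: Dict[str, List[dict]] = {}

    def apply(self, event: OrderPlaced) -> None:
        # Pre-format everything the dashboard needs so the query is a lookup.
        row = {
            "order_id": event.order_id,
            "product_name": event.product_name,
            "total": f"{event.total_cents // 100}.{event.total_cents % 100:02d}",
            "status": "placed",
        }
        self.view.setdefault(event.customer_id, []).append(row)


projection = CustomerOrdersProjection()
for ev in [
    OrderPlaced("o1", "c1", "Keyboard", 4999),
    OrderPlaced("o2", "c1", "Mouse", 1950),
]:
    projection.apply(ev)

# The read side now answers "orders for c1" without touching the write model.
dashboard_rows = projection.view["c1"]
```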

A real example: at Shopify, product listings need data from products, variants, inventory, and pricing tables. On the write side, each is normalized. On the read side, a projection assembles a single document per listing so the storefront query is a single key-value lookup — no joins, sub-millisecond latency even at millions of requests per minute.
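To make the "one document per listing" idea concrete, here is a sketch with an invented schema. The table and field names are illustrative only and say nothing about how Shopify actually stores its data.

```python
from typing import Dict, List

# Normalized write-side data (hypothetical rows).
products: Dict[str, dict] = {"p1": {"title": "Trail Shoe"}}
variants: List[dict] = [
    {"id": "v1", "product_id": "p1", "size": "42"},
    {"id": "v2", "product_id": "p1", "size": "43"},
]
inventory: Dict[str, int] = {"v1": 3, "v2": 0}
prices_cents: Dict[str, int] = {"v1": 8900, "v2": 8900}


def build_listing(product_id: str) -> dict:
    """Join the normalized tables once, at projection time."""
    product_variants = [v for v in variants if v["product_id"] == product_id]
    return {
        "product_id": product_id,
        "title": products[product_id]["title"],
        "variants": [
            {
                "size": v["size"],
                "price_cents": prices_cents[v["id"]],
                "in_stock": inventory[v["id"]] > 0,
            }
            for v in product_variants
        ],
    }


# Read store: one pre-built document per listing, keyed by product id.
listing_store: Dict[str, dict] = {pid: build_listing(pid) for pid in products}

# The storefront query is then a plain key lookup.
doc = listing_store["p1"]
```

The join cost is paid once per change on the projection side instead of once per page view on the read side.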

Scaling Each Side Independently

Because the paths are separate, you can scale them with completely different strategies:

Write side: needs strong consistency and transactional integrity. Typically a single primary relational DB (PostgreSQL, MySQL) with synchronous replication. Scale vertically or by sharding on aggregate ID.
Read side: needs high throughput and low latency. Can be a Redis cache, a read replica cluster, Elasticsearch, or a Cassandra cluster. Add replicas freely; the only writers are the projections, so user-facing commands never contend with reads.
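Sharding the write side by aggregate ID needs a stable mapping from ID to shard. One simple option, shown here as a sketch, is a hash of the ID modulo the shard count. The function name shard_for and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib


def shard_for(aggregate_id: str, shard_count: int) -> int:
    """Map an aggregate ID to a shard index in [0, shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Use a cryptographic hash rather than Python's built-in hash(), which is
    # randomized per process for strings and would route inconsistently.
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Every command for the same order lands on the same shard.
first = shard_for("order-123", 8)
second = shard_for("order-123", 8)
```

Plain modulo routing remaps most IDs whenever the shard count changes, so systems that expect to reshard often use consistent hashing or a lookup table instead.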

Real-world numbers: LinkedIn's feed service handles ~1.5 billion feed reads per day. By maintaining a pre-computed, per-user feed store (the read model) updated asynchronously when connections post content (the write model), each feed read is a single cache lookup rather than a join across tens of millions of rows.

The Trade-off: Eventual Consistency

The synchronization between write and read stores is asynchronous. After a user submits an order, there may be a window — typically milliseconds to a few seconds — where the read model has not yet been updated. If the user immediately refreshes their order list, they might not see the new order yet. This is eventual consistency: the system will converge, but not instantly.

The eventual consistency window: a command is persisted immediately, but the read model reflects the change only after the projection processes the event (typically milliseconds to seconds).
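The window can be reproduced in a few lines by putting a queue between the write store and the read model. Everything here is in memory and the names are hypothetical; in a real deployment the queue would be a message broker and the projection would run in a separate process.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

write_store: Dict[str, str] = {}           # order_id -> customer_id
read_model: Dict[str, List[str]] = {}      # customer_id -> order ids
pending_events: Deque[Tuple[str, str]] = deque()


def place_order(order_id: str, customer_id: str) -> None:
    """Command: persist immediately, then enqueue an event for the read side."""
    write_store[order_id] = customer_id
    pending_events.append((order_id, customer_id))


def run_projection() -> None:
    """Drain the queue and bring the read model up to date."""
    while pending_events:
        order_id, customer_id = pending_events.popleft()
        read_model.setdefault(customer_id, []).append(order_id)


place_order("o1", "c1")

# Inside the window: the write is durable but the read model is stale.
stale_view = list(read_model.get("c1", []))

run_projection()

# After the projection catches up, the two sides have converged.
fresh_view = list(read_model.get("c1", []))
```

Here stale_view is empty and fresh_view contains the new order, which is exactly what a user sees when refreshing too quickly and then again a moment later.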

When to Use CQRS

CQRS is not a default choice — it adds operational complexity. It pays off when:

Read and write loads are significantly asymmetric (e.g. 100:1 reads to writes).
The read model requires complex aggregations that are expensive to compute at query time.
You need multiple read representations of the same data (a mobile app, a dashboard, an analytics pipeline — each with a different shape).
You are already using Event Sourcing (covered in the next lesson), where projections are a natural fit.

Do not reach for CQRS prematurely. For a standard CRUD application with modest traffic, a single model with read replicas is far simpler and usually sufficient. CQRS increases the number of moving parts: you now maintain two schemas and a synchronization mechanism, and you must handle projection failures. Introduce it where the read/write impedance mismatch is measurably causing pain.
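One common way to make projection failures manageable is to make the projection idempotent, so that events can be safely redelivered after a crash. The sketch below assumes each event carries a strictly increasing sequence number and arrives in order; both are assumptions, and the class name is hypothetical.

```python
from typing import Dict, List, Tuple

# (sequence number, customer_id, order_id)
Event = Tuple[int, str, str]


class IdempotentProjection:
    """Applies each event at most once by tracking the last sequence seen."""

    def __init__(self) -> None:
        self.view: Dict[str, List[str]] = {}
        self.last_applied = 0

    def apply(self, event: Event) -> bool:
        seq, customer_id, order_id = event
        if seq <= self.last_applied:
            # Redelivery after a retry or restart: skip it.
            return False
        self.view.setdefault(customer_id, []).append(order_id)
        # In a real store the view update and this checkpoint would need to be
        # written atomically, otherwise a crash between them reintroduces
        # duplicates.
        self.last_applied = seq
        return True


projection = IdempotentProjection()
results = [
    projection.apply((1, "c1", "o1")),
    projection.apply((1, "c1", "o1")),  # duplicate delivery
    projection.apply((2, "c1", "o2")),
]
```

With this in place, the recovery strategy for a failed projection can be as blunt as "replay from the last checkpoint", because replaying too much does no harm.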

CQRS in Practice: Typical Stack

A common production setup: the write side uses PostgreSQL with strict schema validation; commands are dispatched through a message bus (e.g. RabbitMQ, Kafka) and processed by handlers that validate business rules before persisting. On success, a domain event is published to the bus. Projection consumers (separate services or async workers) subscribe to these events and update an Elasticsearch cluster or Redis sorted sets that serve the read API. The read API exposes lightweight GET endpoints that do nothing but look up pre-built documents.
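The flow in the paragraph above can be traced end to end with an in-memory stand-in for each component. This is a sketch under heavy simplification: InMemoryBus is a hypothetical substitute for RabbitMQ or Kafka and delivers synchronously, the dictionaries stand in for PostgreSQL and Elasticsearch, and the topic names are invented.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Optional

Handler = Callable[[dict], None]


class InMemoryBus:
    """Toy message bus: delivers each message to every subscriber in order."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in list(self._subscribers[topic]):
            handler(message)


bus = InMemoryBus()
orders_table: Dict[str, dict] = {}      # stands in for the write database
order_documents: Dict[str, dict] = {}   # stands in for the read store


def handle_place_order(cmd: dict) -> None:
    """Command handler: validate, persist, then publish a domain event."""
    if cmd["quantity"] < 1:
        raise ValueError("quantity must be at least 1")
    orders_table[cmd["order_id"]] = dict(cmd)
    bus.publish("events.OrderPlaced", dict(cmd))


def project_order_placed(event: dict) -> None:
    """Projection consumer: build the document the read API will serve."""
    order_documents[event["order_id"]] = {
        "order_id": event["order_id"],
        "summary": f"{event['quantity']} x {event['sku']}",
    }


def get_order(order_id: str) -> Optional[dict]:
    """Read API: nothing but a lookup of a pre-built document."""
    return order_documents.get(order_id)


bus.subscribe("commands.PlaceOrder", handle_place_order)
bus.subscribe("events.OrderPlaced", project_order_placed)

bus.publish("commands.PlaceOrder", {"order_id": "o1", "sku": "KB-1", "quantity": 2})
document = get_order("o1")
```

Because the toy bus is synchronous, the document is available as soon as publish returns. With a real broker the projection runs later, which is where the eventual consistency window comes from.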

Teams at Netflix, Uber, and Amazon have published variations of this architecture. The specifics differ, but the core split — write to a normalized store, read from a purpose-built view — is consistent across all of them.