Architecture Patterns

Event Sourcing

In conventional systems, a database table stores the current state of an entity. When an order ships, you update a row. The previous state — draft, payment pending, fulfilled — is gone. Event Sourcing inverts this: instead of storing current state, you store an ordered log of every event that ever happened. Current state is a derived view, computed by replaying those events from the beginning (or from a snapshot). The log is the truth; the table is a cache.

This is not a new idea. Your bank's ledger, a Git commit history, and an accounting journal all work the same way: an append-only record from which the current balance or working tree is derived. What is new is applying it deliberately to application-level state in distributed systems at scale.

The Core Model

An event is an immutable fact: something that happened. Events are named in the past tense — OrderPlaced, PaymentReceived, ItemShipped. Each event carries a payload (fields relevant to that fact), a timestamp, and a sequence number or version. Events are appended to an event store; they are never updated or deleted.

To read current state, a consumer replays the event stream: start from zero (or the latest snapshot) and apply each event to a reducer function until you reach the present. For an order:

OrderCreated  { orderId: 42, userId: 7, items: [...] }   → state: { status: "draft" }
PaymentReceived { amount: 99.00 }                          → state: { status: "paid" }
ItemShipped   { trackingId: "1Z999..." }                  → state: { status: "shipped" }
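The replay above can be sketched as a fold over the stream. This is a minimal illustration, not a definitive implementation: the Event class, the apply and replay function names, the "sku-1" item, and the shortened tracking ID are all assumptions made for the example.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Optional


@dataclass(frozen=True)
class Event:
    """An immutable fact: a type name, a payload, and a position in the stream."""
    type: str
    payload: dict
    version: int


def apply(state: dict, event: Event) -> dict:
    """Reducer: return the next state without mutating the previous one."""
    if event.type == "OrderCreated":
        return {**state, **event.payload, "status": "draft"}
    if event.type == "PaymentReceived":
        return {**state, "paid_amount": event.payload["amount"], "status": "paid"}
    if event.type == "ItemShipped":
        return {**state, "tracking_id": event.payload["trackingId"], "status": "shipped"}
    return state  # unknown event types are ignored in this sketch


def replay(events: list, initial: Optional[dict] = None) -> dict:
    """Fold the stream, in version order, into the current state."""
    ordered = sorted(events, key=lambda e: e.version)
    return reduce(apply, ordered, initial or {})


# Placeholder data: "sku-1" and the tracking ID are made up for illustration.
stream = [
    Event("OrderCreated", {"orderId": 42, "userId": 7, "items": ["sku-1"]}, 1),
    Event("PaymentReceived", {"amount": 99.00}, 2),
    Event("ItemShipped", {"trackingId": "1Z999"}, 3),
]

order = replay(stream)
print(order["status"])  # shipped
```

Replaying only a prefix of the stream, for example replay(stream[:2]), yields the state as it stood after the second event.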

Event Sourcing flow: commands produce events that are appended to the event store; projections replay the stream to build read models.

Why Not Just Update a Row?

The answer is in what you lose with a mutable row:

Audit trail: Who changed this order to "cancelled"? When? Why? With event sourcing every mutation is a permanent, queryable fact.
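Time travel: You can reconstruct exactly what an entity looked like at any point in the past. This is indispensable for debugging, audit-driven compliance (for example SOX), and dispute resolution. GDPR pulls in the other direction: personal data must remain erasable on request, so an immutable log needs a deliberate plan for it.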
New projections: The business wants a new analytics view you did not plan for three years ago. In a mutable database, historical data is gone. In an event store, replay the entire history into any new model you like.
Integration via events: Other services can subscribe to the event stream and maintain their own read models. Because the event store is the only place the write side commits to, this avoids the dual-write problem you saw in the CQRS lesson.
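As a rough illustration of the time-travel point, the sketch below answers "what was this order's status at a given moment?" by replaying only the events recorded up to that moment. The timestamps, the utc helper, and the status_as_of function are assumptions made for the example, and the history is assumed to be stored in chronological order.

```python
from datetime import datetime, timezone
from typing import Optional

# Status reached after each event type, following the order example above.
STATUS_AFTER = {
    "OrderCreated": "draft",
    "PaymentReceived": "paid",
    "ItemShipped": "shipped",
}


def utc(hour: int, minute: int) -> datetime:
    """Helper for building made-up timestamps on a single illustrative day."""
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


# (occurred_at, event_type) pairs for one order, oldest first.
history = [
    (utc(9, 0), "OrderCreated"),
    (utc(9, 5), "PaymentReceived"),
    (utc(14, 30), "ItemShipped"),
]


def status_as_of(events: list, moment: datetime) -> Optional[str]:
    """Replay events recorded at or before `moment`; None if the order did not exist yet."""
    status = None
    for occurred_at, event_type in events:
        if occurred_at > moment:
            break  # everything after this point is in the "future"
        status = STATUS_AFTER.get(event_type, status)
    return status


print(status_as_of(history, utc(12, 0)))  # paid
```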

Event Sourcing and CQRS: These two patterns are deeply complementary. CQRS separates write and read models. Event Sourcing gives the write side a single authoritative model (the log). Together they let different services project the same event stream into radically different read models (a relational table for the API, a search index for full-text, a time-series DB for analytics), each of which converges on the same state because it consumes the same stream.

Snapshots — Taming Replay Cost

Replaying thousands of events every time you need current state is expensive. The fix is periodic snapshots: persist the fully-reduced state at version N, then only replay events after that point. A common strategy: snapshot every 50 or 100 events. The event store retains the full history; the snapshot is just a performance cache.
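A minimal sketch of snapshot-assisted loading, under simplifying assumptions: events and snapshots are plain dictionaries held in memory, the reducer just keeps a running total, and the names load, maybe_snapshot, and SNAPSHOT_EVERY are invented for the example.

```python
from typing import Optional

SNAPSHOT_EVERY = 100  # the text suggests every 50 or 100 events


def apply(state: dict, event: dict) -> dict:
    """Toy reducer: keep a running total and remember the last applied version."""
    return {"version": event["version"], "total": state["total"] + event["amount"]}


def load(events: list, snapshot: Optional[dict]) -> tuple:
    """Rebuild state from the snapshot (if any) plus the events recorded after it.

    Returns (state, number_of_events_replayed).
    """
    state = dict(snapshot) if snapshot else {"version": 0, "total": 0}
    tail = [e for e in events if e["version"] > state["version"]]
    for event in tail:
        state = apply(state, event)
    return state, len(tail)


def maybe_snapshot(state: dict) -> Optional[dict]:
    """Return a copy to persist when the version lands on a snapshot boundary."""
    if state["version"] > 0 and state["version"] % SNAPSHOT_EVERY == 0:
        return dict(state)
    return None


events = [{"version": v, "amount": 1} for v in range(1, 251)]

# Without a snapshot, all 250 events are replayed.
full_state, replayed_all = load(events, snapshot=None)

# With a snapshot taken at version 200, only the last 50 are replayed.
state_at_200, _ = load(events[:200], snapshot=None)
fast_state, replayed_tail = load(events, snapshot=maybe_snapshot(state_at_200))

print(replayed_all, replayed_tail)  # 250 50
```

Both paths arrive at the same state; the snapshot only changes how many events have to be applied to get there.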

Snapshots reduce replay cost: load the nearest snapshot, then apply only the events that followed it.

The Event Store in Practice

Popular choices for an event store include:

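EventStoreDB: purpose-built for event sourcing, with a stream per aggregate, built-in projections, and subscriptions. Snapshots are not a built-in feature; applications typically implement them themselves, for example by writing them to a separate stream.
Apache Kafka: a high-throughput log used as an event store. Topics with long or unlimited retention serve as the event history. Widely used when the same stream feeds both sourcing and inter-service messaging. Kafka does not provide a stream per aggregate or an expected-version check on append, so loading a single aggregate and detecting concurrent writes have to be handled elsewhere.
PostgreSQL + append-only table: a pragmatic starting point. One table with columns (aggregate_id, version, event_type, payload JSONB, occurred_at) and a unique constraint on (aggregate_id, version) to detect optimistic concurrency conflicts. The index that backs the constraint also serves lookups by aggregate_id, so no separate index is needed to load a stream.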
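The append-only table and its optimistic concurrency check can be sketched as follows. To keep the example runnable without a server, it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL; the payload column is plain TEXT here rather than JSONB, and the append and read_stream functions, the ConcurrencyConflict exception, and the OrderCancelled event are names assumed for illustration.

```python
import json
import sqlite3

# In-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        aggregate_id TEXT    NOT NULL,
        version      INTEGER NOT NULL,
        event_type   TEXT    NOT NULL,
        payload      TEXT    NOT NULL,  -- would be JSONB in PostgreSQL
        occurred_at  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (aggregate_id, version)
    )
""")


class ConcurrencyConflict(Exception):
    """Another writer already appended this version."""


def append(aggregate_id: str, expected_version: int, event_type: str, payload: dict) -> int:
    """Append one event at expected_version + 1, or raise if that slot is taken."""
    new_version = expected_version + 1
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events (aggregate_id, version, event_type, payload) "
                "VALUES (?, ?, ?, ?)",
                (aggregate_id, new_version, event_type, json.dumps(payload)),
            )
    except sqlite3.IntegrityError as exc:
        # The unique constraint rejected a duplicate (aggregate_id, version).
        raise ConcurrencyConflict(f"{aggregate_id} is already at version {new_version}") from exc
    return new_version


def read_stream(aggregate_id: str) -> list:
    """Load one aggregate's events in version order."""
    rows = conn.execute(
        "SELECT version, event_type, payload FROM events "
        "WHERE aggregate_id = ? ORDER BY version",
        (aggregate_id,),
    )
    return [(version, event_type, json.loads(payload)) for version, event_type, payload in rows]


v1 = append("order-42", 0, "OrderCreated", {"userId": 7})
v2 = append("order-42", v1, "PaymentReceived", {"amount": 99.0})

# A second writer that also read version 1 loses the race.
try:
    append("order-42", v1, "OrderCancelled", {})
    conflict = False
except ConcurrencyConflict:
    conflict = True
```

The losing writer is expected to reload the stream, re-evaluate its command against the newer state, and retry or give up.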

Start simple: If you are new to event sourcing, begin with a Postgres append-only events table and one or two projections. You do not need Kafka or EventStoreDB on day one. The pattern is the idea; the infrastructure can grow with you.

Trade-offs and Pitfalls

Event sourcing is a significant architectural commitment. Understand the costs:

Eventual read models: Projections are updated asynchronously. There is a window (typically milliseconds to low seconds) where a read model lags behind the event store. Design your UX to handle it — optimistic updates, or inform the user that "your changes are being processed".
Schema evolution: Events are immutable, but business rules change. Evolve event schemas carefully using upcasting (transform old event payloads on read), event versioning, or weak schema strategies (JSONB with optional fields). Never delete or alter old events.
Querying is harder: You cannot SELECT * FROM orders WHERE status = 'paid' against an event log. All ad-hoc queries must go to projections. Plan your read models ahead of time, or accept that you will replay to build new ones on demand.
Operational complexity: You now manage an event store, projection workers, and multiple read-model databases instead of one. This overhead is worth it at scale; it may not be worth it for a small team or a low-traffic service.
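Upcasting, mentioned under schema evolution above, can be sketched as a pure function applied on read. The schema_version field, the currency field, and the "USD" default are hypothetical; the point is only that old events are transformed in memory while the stored records stay untouched.

```python
def upcast(event: dict) -> dict:
    """Bring a stored event up to the latest schema version without touching the store."""
    version = event.get("schema_version", 1)
    payload = dict(event["payload"])  # copy, so the stored event is not mutated

    if event["type"] == "PaymentReceived" and version == 1:
        # Assumed change: v1 stored a bare amount, v2 adds an explicit currency.
        payload.setdefault("currency", "USD")
        version = 2

    return {**event, "payload": payload, "schema_version": version}


old_event = {"type": "PaymentReceived", "payload": {"amount": 99.0}}
new_event = {
    "type": "PaymentReceived",
    "payload": {"amount": 50.0, "currency": "EUR"},
    "schema_version": 2,
}

upgraded = upcast(old_event)
print(upgraded["payload"])  # {'amount': 99.0, 'currency': 'USD'}
```

Reducers and projections then only need to understand the latest shape of each event.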

Do not event-source everything. Event sourcing shines for complex domain aggregates with rich history requirements (financial transactions, order lifecycle, audit-heavy workflows). For a simple user-profile CRUD or a lookup table, a normal mutable row is the right choice. Over-applying event sourcing inflates complexity without commensurate benefit.

Real-World Example: E-Commerce Order Service

A major online retailer processes 10 million orders per day. Each order aggregate emits roughly 8 events on average (created, paid, picked, packed, labelled, shipped, out-for-delivery, delivered). That is 80 million events per day, or about 926 events per second on average (80,000,000 / 86,400); peak rates will be several times higher. That write volume is well within reach of EventStoreDB or a Kafka topic with 30-day retention. Read models built from the stream serve the customer-facing order-tracking API (Redis-backed, sub-millisecond reads), the warehouse picking dashboard (Postgres), and the fraud analytics pipeline (ClickHouse time-series).

When the fraud team wants a new signal ("flag any order where payment was received more than 10 minutes after creation"), they build a new projection by replaying the full 30 days of retained history. In a conventional database that only keeps the current row, this timing may never have been stored; in the event store it is there for every order inside the retention window.
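The fraud signal described above could be built as a small projection over the replayed stream. This is a sketch under stated assumptions: events are dictionaries with type, order_id, and occurred_at fields, they arrive in stream order, and the late_payment_orders function and the sample timestamps are invented for the example.

```python
from datetime import datetime, timedelta, timezone

THRESHOLD = timedelta(minutes=10)


def late_payment_orders(events: list) -> set:
    """Return IDs of orders whose payment arrived more than THRESHOLD after creation."""
    created_at = {}
    flagged = set()
    for event in events:  # assumed to be in stream order
        if event["type"] == "OrderCreated":
            created_at[event["order_id"]] = event["occurred_at"]
        elif event["type"] == "PaymentReceived":
            created = created_at.get(event["order_id"])
            if created is not None and event["occurred_at"] - created > THRESHOLD:
                flagged.add(event["order_id"])
    return flagged


def at(hour: int, minute: int) -> datetime:
    """Helper for made-up timestamps on a single illustrative day."""
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


replayed = [
    {"type": "OrderCreated", "order_id": 1, "occurred_at": at(12, 0)},
    {"type": "OrderCreated", "order_id": 2, "occurred_at": at(12, 0)},
    {"type": "PaymentReceived", "order_id": 1, "occurred_at": at(12, 3)},   # 3 minutes later
    {"type": "PaymentReceived", "order_id": 2, "occurred_at": at(12, 25)},  # 25 minutes later
]

print(late_payment_orders(replayed))  # {2}
```

Because the projection is only a function of the stream, it can be run over the retained history first and then kept up to date from the live subscription.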