Scaling & Load Balancing

Stateless vs Stateful Services

One of the most consequential decisions in system design is whether a service should store state internally or delegate that state elsewhere. This single choice determines whether you can spin up ten more servers in sixty seconds — or whether adding a second server breaks everything.

What Is "State"?

State is any information that must persist between requests to serve future requests correctly. Common examples include:

A logged-in user's session data (who they are, what they are allowed to do)
Items in a shopping cart mid-checkout
A partially uploaded file buffer
A WebSocket connection's message history

A stateless service keeps none of this data in its own memory. Every incoming request carries all the information the server needs to handle it — or the server fetches that information from a shared external store. Once the response is sent, the server forgets everything about that request.

A stateful service remembers things between requests in its own process memory or on local disk. Subsequent requests from the same client must reach the same server instance; any other instance lacks the state needed to serve them.
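The difference can be sketched in a few lines of Python. This is an illustration only: the class names are hypothetical, and a plain dict stands in for the shared external store.

```python
from typing import Dict, List


class StatefulCartServer:
    """Keeps carts in its own process memory (the pattern to avoid)."""

    def __init__(self) -> None:
        self._carts: Dict[str, List[str]] = {}  # lost on restart, invisible to peers

    def add_item(self, user_id: str, item: str) -> List[str]:
        self._carts.setdefault(user_id, []).append(item)
        return list(self._carts[user_id])


class StatelessCartServer:
    """Holds no cart data itself; reads and writes a shared store on every call."""

    def __init__(self, store: Dict[str, List[str]]) -> None:
        self._store = store  # assumption: stand-in for an external store

    def add_item(self, user_id: str, item: str) -> List[str]:
        cart = self._store.get(user_id, []) + [item]
        self._store[user_id] = cart
        return list(cart)


# Two stateful instances: the second one never sees the first item.
a, b = StatefulCartServer(), StatefulCartServer()
a.add_item("alice", "book")
stateful_view = b.add_item("alice", "pen")

# Two stateless instances sharing one store: either can continue the cart.
shared: Dict[str, List[str]] = {}
s1, s2 = StatelessCartServer(shared), StatelessCartServer(shared)
s1.add_item("alice", "book")
stateless_view = s2.add_item("alice", "pen")
```

Here `stateful_view` contains only the pen, while `stateless_view` contains both items, because the second stateless instance read the cart the first one wrote.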

Why Stateless Services Scale Horizontally

Imagine you have one application server handling 1,000 requests per minute. Traffic doubles. With a stateless service, the fix is mechanical: launch a second identical server, put a load balancer in front of both, and route requests to either one. Neither server knows or cares which requests the other is handling — they are perfectly interchangeable replicas.

With a stateful service, that move is blocked. If Server A holds Alice's session in RAM and the load balancer sends Alice's next request to Server B, Server B has no idea who Alice is. She gets logged out or, worse, is served from missing state, such as an empty cart.

The horizontal scaling rule: You can scale a tier horizontally if and only if any server in that tier can handle any request equally well. Stateless design is what makes that true.
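To see the rule in action, here is a small simulation of a round-robin load balancer. It is a sketch under stated assumptions: `make_server` and `RoundRobinBalancer` are hypothetical names, servers are plain functions, and a dict plays the role of the session store.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], str]


def make_server(name: str, sessions: Dict[str, str]) -> Handler:
    """Build a request handler that looks sessions up in the given dict."""
    def handle(session_id: str) -> str:
        user = sessions.get(session_id)
        if user is None:
            return f"{name}: 401 unknown session"
        return f"{name}: hello {user}"
    return handle


class RoundRobinBalancer:
    """Sends each request to the next server in turn."""

    def __init__(self, servers: List[Handler]) -> None:
        self._servers = list(servers)
        self._next = 0

    def add_server(self, server: Handler) -> None:
        self._servers.append(server)

    def route(self, session_id: str) -> str:
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server(session_id)


# Stateless tier: every server reads the same shared store.
shared = {"sid-1": "alice"}
stateless_lb = RoundRobinBalancer([make_server("A", shared), make_server("B", shared)])
stateless_lb.add_server(make_server("C", shared))  # scaling out is one line
stateless_replies = [stateless_lb.route("sid-1") for _ in range(3)]

# Stateful tier: only server A saw Alice log in; B has its own empty memory.
stateful_lb = RoundRobinBalancer([make_server("A", {"sid-1": "alice"}), make_server("B", {})])
stateful_replies = [stateful_lb.route("sid-1") for _ in range(4)]
```

In the stateless tier all three servers greet Alice, including the one added later. In the stateful tier every second request lands on B and is rejected.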

Stateless servers share state via an external store so any server handles any request. Stateful servers trap sessions in RAM, forcing sticky routing and blocking true horizontal scale.

The "Sticky Session" Workaround — and Why It Fails

Teams running stateful services often reach for sticky sessions (also called session affinity): the load balancer tags each client with a cookie and always routes that client to the same server. This makes the service appear to scale, but introduces serious problems:

Uneven load. High-traffic users are pinned to one server. That server heats up while others idle. The load balancer is no longer balancing.
No graceful failover. If Server A dies, all of its pinned users lose their sessions instantly. There is nothing to fall back to.
Deployment friction. Rolling deploys become dangerous: draining a server's sticky users before taking it offline requires careful orchestration.
Hard cap on scale. Each server can hold only as many sessions as fit in its RAM. Because sessions are pinned, a newly added server relieves pressure only for new clients; existing sessions cannot be rebalanced onto it.

Sticky sessions are a band-aid, not a solution. They let a stateful service survive behind two servers. They do not let it survive behind two hundred. Treat sticky sessions as a migration step while you move state to a shared store.
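Two of these failure modes, uneven load and lost sessions on failover, are easy to reproduce in a toy model. The sketch below assumes a hypothetical `StickyBalancer` that pins each client to the server that handled its first request and keeps sessions in per-server dicts standing in for RAM.

```python
from collections import Counter
from typing import Dict, List


class StickyBalancer:
    """Pins each client to the server that handled its first request."""

    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)
        self.pinned: Dict[str, str] = {}  # client -> server (the affinity cookie)
        # Per-server session memory; disappears when the server does.
        self.sessions: Dict[str, Dict[str, dict]] = {s: {} for s in servers}
        self._next = 0

    def route(self, client: str) -> str:
        server = self.pinned.get(client)
        if server not in self.sessions:  # first visit, or the pinned server is gone
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.pinned[client] = server
        session = self.sessions[server].setdefault(client, {"requests": 0})
        session["requests"] += 1
        return server

    def kill(self, server: str) -> int:
        """Simulate a crash and return how many sessions were lost with it."""
        lost = len(self.sessions.pop(server))
        self.servers.remove(server)
        return lost


lb = StickyBalancer(["A", "B"])
load: Counter = Counter()
for client, request_count in [("heavy", 98), ("light-1", 1), ("light-2", 1)]:
    for _ in range(request_count):
        load[lb.route(client)] += 1

lost = lb.kill("A")           # every session pinned to A is gone
new_home = lb.route("heavy")  # re-pinned to B with a brand-new, empty session
```

One heavy client leaves server A with 99 of the 100 requests. When A is killed, both of its sessions are lost, and the heavy client starts over on B with a request count of 1.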

How to Externalise State

The correct fix is to move state out of the server process entirely, into a shared store that every server instance can reach. The three main patterns:

1. Shared Session Store (e.g., Redis)

Instead of writing session data to local RAM, write it to a Redis cluster. Every server reads from and writes to the same cluster. The session ID travels in a cookie; the actual session data lives in Redis. Adding a new server requires no session migration: it simply starts reading from Redis too.
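A minimal sketch of this pattern follows. To keep it runnable without a Redis server, it uses a hypothetical `InMemoryStore` that exposes `get`, `setex`, and `delete`, named after the Redis commands they imitate. The key prefix, the 30-minute TTL, and the function names are assumptions for illustration; in a real deployment the store argument would be a Redis client.

```python
import json
import secrets
import time
from typing import Dict, Optional, Tuple


class InMemoryStore:
    """Test stand-in offering the three commands this sketch uses."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[float, str]] = {}

    def setex(self, name: str, ttl_seconds: int, value: str) -> None:
        self._data[name] = (time.monotonic() + ttl_seconds, value)

    def get(self, name: str) -> Optional[str]:
        entry = self._data.get(name)
        if entry is None or entry[0] <= time.monotonic():
            self._data.pop(name, None)  # expired entries behave as missing
            return None
        return entry[1]

    def delete(self, name: str) -> None:
        self._data.pop(name, None)


SESSION_TTL_SECONDS = 1800  # assumption: 30-minute session lifetime


def create_session(store, user_id: str) -> str:
    """Write the session to the shared store; return the ID for the cookie."""
    session_id = secrets.token_urlsafe(32)
    store.setex(f"session:{session_id}", SESSION_TTL_SECONDS,
                json.dumps({"user_id": user_id}))
    return session_id


def load_session(store, session_id: str) -> Optional[dict]:
    """Any server can call this; nothing is read from local memory."""
    raw = store.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else None


def destroy_session(store, session_id: str) -> None:
    """Log out everywhere at once by deleting the shared entry."""
    store.delete(f"session:{session_id}")


store = InMemoryStore()
sid = create_session(store, "alice")
before_logout = load_session(store, sid)
destroy_session(store, sid)
after_logout = load_session(store, sid)
```

Because the session functions take the store as a parameter, every app server runs identical code and none of them holds session data of its own.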

2. Token-Based Authentication (e.g., JWT)

Store no session on the server at all. Issue the client a cryptographically signed token (JWT) that contains the identity and permissions inline. The server validates the signature on every request — no database lookup, no shared store needed for auth. Stateless by construction.

JWT trade-off: a signed token stays valid until it expires, so it cannot be revoked early without a server-side denylist, which reintroduces shared state. For short-lived tokens (5–15 minutes) combined with a refresh-token flow, this is usually acceptable. For long sessions that must be instantly revocable (e.g., after a password change), a Redis-backed session store is safer.

3. Delegating to the Client

Some state — UI preferences, shopping cart items — can live in the browser (localStorage, cookies, client-side state). The server becomes a pure function: input in, output out, no memory between calls. This works well for non-sensitive, user-specific presentation state.
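As a rough illustration, a cart held in the browser can be sent along with each request, leaving the server with a pure function to evaluate. The catalogue, the field names, and `price_cart` are hypothetical. Prices stay on the server in this sketch, since anything held by the client can be edited by the client.

```python
from typing import Dict, List, TypedDict

PRICES_CENTS: Dict[str, int] = {"book": 1500, "pen": 250}  # hypothetical catalogue


class CartLine(TypedDict):
    sku: str
    qty: int


def price_cart(cart: List[CartLine]) -> int:
    """Pure function: the result depends only on the cart sent in the request."""
    total = 0
    for line in cart:
        if line["sku"] not in PRICES_CENTS or line["qty"] <= 0:
            raise ValueError(f"invalid cart line: {line!r}")
        total += PRICES_CENTS[line["sku"]] * line["qty"]
    return total


# The browser sends its whole cart; any server can answer without prior context.
total_cents = price_cart([{"sku": "book", "qty": 2}, {"sku": "pen", "qty": 1}])
```

The server validates the incoming cart on every call rather than trusting it, which is the price of letting the client hold the state.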

Diagram: Externalising Session State with Redis

Three identical app servers all share one Redis session store. Any server can handle any request because session data lives in Redis, not in a server's RAM.

When Stateful Is Intentional

Not all stateful services are design mistakes. Some components are inherently stateful and are designed with that in mind:

Databases and caches — the entire job is storing state. They scale through replication, sharding, and leader-follower patterns (covered in later lessons).
Message queues — Kafka, RabbitMQ. They persist messages across brokers with replication and partitioning.
WebSocket / streaming servers — a live connection is inherently stateful. The usual approach is to route messages through a pub/sub bus (Redis Pub/Sub, Kafka): any server can publish a message addressed to a client, and the server that holds that client's connection delivers it.

The distinction is that these services are purpose-built for state management and expose explicit mechanisms for replication and failover. Your application tier should not be doing that job implicitly in RAM.
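The WebSocket case can be sketched with an in-process stand-in for the bus. Everything here is hypothetical scaffolding: `PubSubBus` imitates a pub/sub system, `SocketServer` imitates a server holding live connections, and a list per client stands in for the socket itself.

```python
from typing import Callable, Dict, List

Callback = Callable[[str, str], None]


class PubSubBus:
    """In-process stand-in for a pub/sub bus shared by all servers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = {}

    def subscribe(self, channel: str, callback: Callback) -> None:
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to every subscriber; return how many received the message."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(channel, message)
        return len(callbacks)


class SocketServer:
    """Holds live connections locally and listens on the bus for their messages."""

    def __init__(self, name: str, bus: PubSubBus) -> None:
        self.name = name
        self.bus = bus
        self.outbox: Dict[str, List[str]] = {}  # client -> messages written to its socket

    def connect(self, client_id: str) -> None:
        self.outbox[client_id] = []
        self.bus.subscribe(f"client:{client_id}", self._deliver)

    def _deliver(self, channel: str, message: str) -> None:
        client_id = channel.split(":", 1)[1]
        if client_id in self.outbox:
            self.outbox[client_id].append(message)

    def send_to(self, client_id: str, message: str) -> int:
        """Any server may call this; the bus reaches the connection's owner."""
        return self.bus.publish(f"client:{client_id}", message)


bus = PubSubBus()
server_a, server_b = SocketServer("A", bus), SocketServer("B", bus)
server_a.connect("alice")                       # Alice's socket lives on A
delivered = server_b.send_to("alice", "hello")  # B originates the message
```

Server B never needs to know where Alice is connected. The connection itself remains stateful, but knowledge of who holds it lives in the bus's subscriptions rather than in application code.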

Practical Checklist: Making a Service Stateless

Audit every variable your process holds between requests. Sessions, in-memory caches, file locks, connection counters — all of these are state.
Move session data to Redis (or a database-backed session store).
Replace session-based identity lookups with JWT validation, or keep thin session tokens that point to Redis entries.
Move any per-server in-memory cache to a shared cache (Redis, Memcached).
Ensure uploaded files go to object storage (S3, GCS) — not to the server's local filesystem.
Remove any background jobs or timers that run inside the web process — move them to a dedicated queue worker.

The twelve-factor app methodology (12factor.net) encodes this principle as Factor VI: Processes are stateless and share-nothing. Reading the original twelve factors is a productive hour for any engineer building scalable services.

Key Takeaways

A stateless service can be horizontally scaled by adding identical replicas — no coordination required.
A stateful service is pinned: requests must reach the server holding their state, creating bottlenecks and fragility.
The solution is to externalise state: Redis for sessions, JWTs for identity, object storage for files, queues for async work.
Stateful infrastructure (databases, queues) is fine because it is built with explicit scaling mechanisms. Stateful application logic is the problem to eliminate.