Caching & CDNs

Distributed Caching with Redis & Memcached

18 min Lesson 6 of 10

Distributed Caching with Redis & Memcached

A single-server cache only helps the one machine that owns it. The moment you scale horizontally — adding a second or third application server — each machine has its own local cache, and they fall out of sync instantly. One server serves a stale price; another serves the updated one. Users on different instances see different data. This is the problem distributed caching solves: a shared, in-memory data store that every application node reads from and writes to, giving the entire fleet a single, consistent view of cached state.

Distributed Cache Architecture: Multiple App Servers Sharing a Redis Cluster Load Balancer routes requests App Server 1 Node.js / PHP / Java App Server 2 Node.js / PHP / Java App Server 3 Node.js / PHP / Java Redis Cluster (Shared Cache) Primary + Replicas — consistent view for all app servers GET / SET GET / SET GET / SET Primary Database cache miss
All application servers share one Redis cluster — a cache write on Server 1 is immediately visible to Servers 2 and 3.

Why Put the Cache In-Memory?

RAM is orders of magnitude faster than disk. A Redis GET typically completes in under 1 ms on the same data-center network, while a PostgreSQL query that hits disk might take 50–200 ms. For a page that aggregates data from a dozen database calls, caching those results means serving the whole page in single-digit milliseconds instead of half a second. At 10,000 requests per second, that gap is the difference between a fleet of 5 servers and a fleet of 50.

In-memory means volatile. If a Redis node loses power, unconfigured instances lose their data. Production setups mitigate this with persistence (RDB snapshots or AOF write-ahead log) and replication, but the primary design contract is still speed, not durability. Size your cache accordingly and never store data you cannot reconstruct.

Redis vs. Memcached — Choosing the Right Tool

Both are battle-tested, extremely fast, and widely supported. The difference is scope:

  • Memcached is deliberately minimal: string key → string value, multi-threaded, no persistence, no replication built in. It does one thing and does it fast. If all you need is a plain object cache and you want the simplest possible operational profile, Memcached is a solid choice. Facebook still runs it at enormous scale for exactly this reason.
  • Redis supports rich data structures — strings, hashes, lists, sorted sets, sets, streams, bitmaps, HyperLogLogs — plus persistence, pub/sub, Lua scripting, Lua-side transactions, cluster mode, and Sentinel-based high availability. In practice, Redis has become the default choice for new systems because it can replace several separate tools at once (cache, message queue, leaderboard, session store, rate-limit counter).
Rule of thumb: choose Memcached when you want purely horizontal, multi-threaded read throughput for simple key-value blobs and your ops team prefers minimal moving parts. Choose Redis when you need data structures, persistence, pub/sub, cluster mode, or scripting — i.e., almost always.

Redis Data Structures That Change System Design

The reason Redis displaces more than just a cache is that its native types map directly onto common system-design patterns:

  • Hash — store a user profile as HSET user:42 name "Alice" role "admin" plan "pro". Update individual fields atomically without fetching and re-serialising the whole object.
  • Sorted Set (ZSET) — leaderboards: ZADD game:scores 9820 "alice". ZRANGE … WITHSCORES returns top-N in O(log N).
  • List — job queues, activity feeds. LPUSH / BRPOP implements a blocking queue without polling.
  • Set — unique visitors per day, set intersections ("users who bought both A and B").
  • Atomic countersINCR rate:ip:1.2.3.4 + EXPIRE is the canonical rate-limiter: no transactions, no locks, single round trip.

Redis Cluster and Sharding

A single Redis node maxes out at whatever RAM the host provides — typically 32–128 GB in production. Beyond that, you shard. Redis Cluster uses consistent hashing over 16,384 hash slots: each key is mapped to a slot via CRC16(key) mod 16384, and slots are divided across master nodes. Every master has one or more replicas for failover. Clients receive MOVED or ASK redirects when they hit the wrong node, and cluster-aware clients cache the slot map to avoid extra round trips.

Redis Cluster: Hash Slots Distributed Across Three Master/Replica Pairs Client cluster-aware driver Master A Slots 0 – 5460 Master B Slots 5461 – 10922 Master C Slots 10923 – 16383 Replica A async replication Replica B async replication Replica C async replication 16,384 slots distributed across masters via CRC16(key) mod 16384
Redis Cluster distributes 16,384 hash slots across master nodes; each master replicates asynchronously to a standby replica.

Operational Patterns

Connection pooling: Opening a TCP connection to Redis on every request is expensive. Every application server should maintain a persistent pool of connections (typically 10–50) and reuse them. Most Redis client libraries do this by default.

Pipelining: Sending multiple commands in a single network round trip (pipeline) reduces per-command latency dramatically when you need to fetch or set dozens of keys at once. Most clients expose a pipeline() or multi() API.

Key design: Use structured, namespaced keys — product:1234:detail, user:42:session — to keep the keyspace readable and enable pattern-based operations like SCAN. Avoid KEYS * in production; it blocks the event loop on large keyspaces.

Never use KEYS * in production. Redis is single-threaded for command execution. A KEYS * scan over millions of entries will block every other command for seconds, effectively taking the cache offline. Use SCAN with a cursor and a small COUNT hint instead — it iterates incrementally without blocking.

High Availability: Sentinel vs. Cluster

For smaller deployments that do not need sharding, Redis Sentinel provides HA without the complexity of cluster mode: one primary, one or more replicas, and three or more Sentinel processes that vote to promote a replica when the primary becomes unreachable. Clients connect to Sentinel first to discover the current primary address. Cluster mode handles both sharding and failover in one system, making it the better choice once your dataset outgrows a single node.

Both Redis and Memcached are mature, production-proven, and capable of handling millions of operations per second on commodity hardware. The decision between them — and between architectures like Sentinel, Cluster, or managed services such as Amazon ElastiCache or Google Memorystore — comes down to which combination of simplicity, features, and operational overhead fits your team and workload.