Distributed Caching with Redis & Memcached
Distributed Caching with Redis & Memcached
A single-server cache only helps the one machine that owns it. The moment you scale horizontally — adding a second or third application server — each machine has its own local cache, and they fall out of sync instantly. One server serves a stale price; another serves the updated one. Users on different instances see different data. This is the problem distributed caching solves: a shared, in-memory data store that every application node reads from and writes to, giving the entire fleet a single, consistent view of cached state.
Why Put the Cache In-Memory?
RAM is orders of magnitude faster than disk. A Redis GET typically completes in under 1 ms on the same data-center network, while a PostgreSQL query that hits disk might take 50–200 ms. For a page that aggregates data from a dozen database calls, caching those results means serving the whole page in single-digit milliseconds instead of half a second. At 10,000 requests per second, that gap is the difference between a fleet of 5 servers and a fleet of 50.
Redis vs. Memcached — Choosing the Right Tool
Both are battle-tested, extremely fast, and widely supported. The difference is scope:
- Memcached is deliberately minimal: string key → string value, multi-threaded, no persistence, no replication built in. It does one thing and does it fast. If all you need is a plain object cache and you want the simplest possible operational profile, Memcached is a solid choice. Facebook still runs it at enormous scale for exactly this reason.
- Redis supports rich data structures — strings, hashes, lists, sorted sets, sets, streams, bitmaps, HyperLogLogs — plus persistence, pub/sub, Lua scripting, Lua-side transactions, cluster mode, and Sentinel-based high availability. In practice, Redis has become the default choice for new systems because it can replace several separate tools at once (cache, message queue, leaderboard, session store, rate-limit counter).
Redis Data Structures That Change System Design
The reason Redis displaces more than just a cache is that its native types map directly onto common system-design patterns:
- Hash — store a user profile as
HSET user:42 name "Alice" role "admin" plan "pro". Update individual fields atomically without fetching and re-serialising the whole object. - Sorted Set (ZSET) — leaderboards:
ZADD game:scores 9820 "alice".ZRANGE … WITHSCORESreturns top-N in O(log N). - List — job queues, activity feeds.
LPUSH/BRPOPimplements a blocking queue without polling. - Set — unique visitors per day, set intersections ("users who bought both A and B").
- Atomic counters —
INCR rate:ip:1.2.3.4+EXPIREis the canonical rate-limiter: no transactions, no locks, single round trip.
Redis Cluster and Sharding
A single Redis node maxes out at whatever RAM the host provides — typically 32–128 GB in production. Beyond that, you shard. Redis Cluster uses consistent hashing over 16,384 hash slots: each key is mapped to a slot via CRC16(key) mod 16384, and slots are divided across master nodes. Every master has one or more replicas for failover. Clients receive MOVED or ASK redirects when they hit the wrong node, and cluster-aware clients cache the slot map to avoid extra round trips.
Operational Patterns
Connection pooling: Opening a TCP connection to Redis on every request is expensive. Every application server should maintain a persistent pool of connections (typically 10–50) and reuse them. Most Redis client libraries do this by default.
Pipelining: Sending multiple commands in a single network round trip (pipeline) reduces per-command latency dramatically when you need to fetch or set dozens of keys at once. Most clients expose a pipeline() or multi() API.
Key design: Use structured, namespaced keys — product:1234:detail, user:42:session — to keep the keyspace readable and enable pattern-based operations like SCAN. Avoid KEYS * in production; it blocks the event loop on large keyspaces.
KEYS * scan over millions of entries will block every other command for seconds, effectively taking the cache offline. Use SCAN with a cursor and a small COUNT hint instead — it iterates incrementally without blocking.
High Availability: Sentinel vs. Cluster
For smaller deployments that do not need sharding, Redis Sentinel provides HA without the complexity of cluster mode: one primary, one or more replicas, and three or more Sentinel processes that vote to promote a replica when the primary becomes unreachable. Clients connect to Sentinel first to discover the current primary address. Cluster mode handles both sharding and failover in one system, making it the better choice once your dataset outgrows a single node.
Both Redis and Memcached are mature, production-proven, and capable of handling millions of operations per second on commodity hardware. The decision between them — and between architectures like Sentinel, Cluster, or managed services such as Amazon ElastiCache or Google Memorystore — comes down to which combination of simplicity, features, and operational overhead fits your team and workload.