The Building Blocks of a System
Every large-scale system — whether it serves ten million users or processes billions of events per day — is assembled from a small, well-understood set of components. Knowing what each component does, why you would reach for it, and what trade-offs it carries is the difference between a system that holds up under pressure and one that collapses. This lesson gives you that toolbox.
The Canonical Five
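Before we dive into each component, here is how they fit together in a typical read-heavy web application. Static assets are served from the CDN; other requests reach the load balancer, which spreads them across application servers; each server checks the cache before querying the database; and slow work is handed to a message queue for background processing.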
1. Load Balancer
A load balancer sits in front of a pool of servers and distributes incoming requests so that no single server becomes a bottleneck. It also serves as the single public entry point, hiding internal topology from clients.
- Layer 4 (transport): Routes by IP/port — extremely fast, but cannot inspect request content.
- Layer 7 (application): Routes by URL, header, or cookie; can do path-based routing (e.g., /api/* → service A, /static/* → service B) and SSL termination.
- Algorithms: round-robin, least-connections, IP-hash (sticky sessions), weighted.
Real examples: AWS ALB (L7), AWS NLB (L4), Nginx, HAProxy, Cloudflare.
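To make the selection algorithms concrete, here is a minimal sketch of round-robin and least-connections picking, plus the path-based routing rule from the Layer 7 bullet. The names Backend, RoundRobinBalancer, LeastConnectionsBalancer, and route_by_path are hypothetical, chosen for illustration; production balancers such as Nginx or HAProxy also handle health checks, timeouts, and connection tracking, which are left out here.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    active_connections: int = 0


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Picks the backend currently handling the fewest connections."""

    def __init__(self, backends):
        self._backends = list(backends)

    def pick(self):
        return min(self._backends, key=lambda b: b.active_connections)


def route_by_path(path):
    """Layer 7 path-based routing, mirroring the /api/* and /static/* example."""
    if path.startswith("/api/"):
        return "service A"
    if path.startswith("/static/"):
        return "service B"
    return "default service"  # hypothetical fallback, not part of the lesson
```

Round-robin ignores how busy each server is, which is fine when requests cost roughly the same; least-connections tends to suit workloads where some requests run much longer than others.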
2. Cache
A cache is a fast, in-memory data store placed between your application and a slower data source (typically a database). When a read-heavy endpoint fetches the same data thousands of times per second, hitting the database each time is wasteful and unsustainable. A cache absorbs those reads.
- Cache-aside (lazy loading): App checks cache first; on a miss, reads from DB and populates the cache. Most common pattern.
- Write-through: Every write goes to both the cache and the DB before it is acknowledged. Cached entries stay in sync with the DB, but every write pays the latency of both updates.
- Write-behind (write-back): Writes go to cache only; DB is updated asynchronously. Very fast writes, but risk of data loss if cache crashes.
Key metric: cache hit rate (aim for >90% on hot paths). Key setting: eviction policy (LRU is the safe default).
Real examples: Redis, Memcached, Varnish (HTTP cache).
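The cache-aside pattern, LRU eviction, and hit rate can be sketched in a few lines. This is an illustration under stated assumptions, not how Redis or Memcached work internally: LRUCache and get_user are hypothetical names, the cache lives in process memory, and a plain dict stands in for the database.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny in-process LRU cache standing in for Redis or Memcached."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def get_user(user_id, cache, db):
    """Cache-aside read: check the cache, fall back to the DB on a miss."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    value = db[user_id]        # the slow read; a dict stands in for the DB
    cache.set(user_id, value)  # populate so the next read is a hit
    return value
```

Calling get_user twice with the same id produces one miss followed by one hit, and hit_rate() reports the fraction of lookups served from memory.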
3. Database
The database is the system of record — the authoritative, durable store. Its design decisions (relational vs. document vs. columnar, primary vs. replica, sharding vs. federation) shape almost everything else.
- Relational (SQL): Strong ACID guarantees, rich query language, best for structured data with complex relationships. PostgreSQL, MySQL, Aurora.
- Document (NoSQL): Flexible schema, horizontal scalability, good for hierarchical/JSON data. MongoDB, DynamoDB, Firestore.
- Read replicas: Copies of the primary that serve read queries. Offloads 80-95% of traffic in typical read-heavy apps, at the cost of a small replication lag (usually <100 ms), so a read can briefly return stale data.
- Sharding: Horizontally partition data across multiple database nodes (e.g., users A–M on shard 1, N–Z on shard 2). Massively increases write throughput, but adds significant operational complexity.
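The routing decisions behind sharding and read replicas can be sketched as follows. All names here (shard_by_range, shard_by_hash, ReplicatedDatabase) are hypothetical. The range function follows the A–M / N–Z example above; the hash variant is a common alternative that the lesson does not cover, included only for contrast.

```python
import hashlib
import itertools


def shard_by_range(username):
    """Range partitioning from the example: A-M on shard 1, N-Z on shard 2.

    Assumption: names that do not start with A-M (including non-letters)
    fall through to shard 2.
    """
    first = username[0].upper()
    return 1 if "A" <= first <= "M" else 2


def shard_by_hash(key, num_shards):
    """Hash partitioning: a stable hash spreads keys across shards."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ReplicatedDatabase:
    """Routes writes to the primary and rotates reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(list(replicas))

    def node_for(self, is_write):
        return self.primary if is_write else next(self._replicas)
```

Range partitioning keeps neighbouring keys together, which helps range scans but can create hot shards when names cluster. Hashing evens out the load, but with a plain modulo most keys move to a different shard whenever the shard count changes.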
4. Message Queue
A message queue decouples the component that produces a unit of work from the component that processes it. The producer sends a message and returns immediately; one or more consumers process it asynchronously in the background.
This is valuable whenever you have:
- Bursty workloads: A sudden spike of 10,000 image-resize jobs does not overwhelm the resizing service — the queue absorbs the burst and workers drain it at their own pace.
- Long-running tasks: Email sending, video encoding, PDF generation — anything too slow to complete synchronously in an HTTP request.
- Reliability: If a consumer crashes mid-task, the message is not lost; because it was never acknowledged, it stays in the queue and is redelivered.
Real examples: Apache Kafka (event streaming, ordered log), Amazon SQS (simple reliable queues), RabbitMQ (routing / pub-sub), Celery (Python task queue that runs on top of a broker such as RabbitMQ or Redis).
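The producer/consumer decoupling described above can be sketched with Python's standard library. This is an in-process stand-in, not a real broker: run_workers and handler are hypothetical names, messages live only in memory, and the sketch assumes handler does not raise. A real queue such as SQS or RabbitMQ adds durability and redelivery on top of the same shape.

```python
import queue
import threading


def run_workers(jobs, num_workers, handler):
    """Enqueue every job up front, then let a fixed worker pool drain them."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = work.get()
            if job is None:      # sentinel: no more work for this worker
                work.task_done()
                return
            outcome = handler(job)
            with lock:           # results is shared between threads
                results.append(outcome)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    for job in jobs:             # the producer returns right after put()
        work.put(job)
    for _ in threads:
        work.put(None)           # one sentinel per worker

    work.join()                  # block until every message is processed
    for t in threads:
        t.join()
    return results
```

A burst of 10,000 jobs only makes the queue longer; the number of jobs being processed at any moment never exceeds num_workers, which is what protects the downstream service.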
5. CDN (Content Delivery Network)
A CDN is a globally distributed network of edge servers (points of presence, or PoPs) that cache and serve content from a location geographically close to the end user. Instead of every user fetching a 1 MB JavaScript bundle from your origin server in Virginia, a user in Tokyo gets it from a PoP 20 ms away.
- Static asset delivery: Images, CSS, JS, fonts — these are the primary use case. Cache-Control headers tell the CDN how long to hold a file.
- Dynamic content acceleration: Modern CDNs (Cloudflare, Fastly) can cache API responses with short TTLs, or route dynamic requests over optimised backbone networks.
- DDoS protection: CDN edge nodes absorb volumetric attacks before traffic ever reaches your origin.
Real examples: Cloudflare, AWS CloudFront, Fastly, Akamai.
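As a rough illustration of how Cache-Control steers a CDN, the sketch below picks a header value per request path. The function name cache_control_for, the suffix list, and the specific lifetimes are assumptions made for this example; the directives themselves (public, max-age, s-maxage, immutable, no-store) are standard HTTP.

```python
def cache_control_for(path):
    """Pick a Cache-Control header value based on the kind of content."""
    static_suffixes = (".js", ".css", ".png", ".jpg", ".woff2")
    if path.endswith(static_suffixes):
        # Long-lived: safe only if file names change when the content changes.
        return "public, max-age=31536000, immutable"
    if path.startswith("/api/"):
        # s-maxage applies to shared caches such as a CDN, not the browser.
        return "public, s-maxage=30"
    # Everything else is treated as uncacheable in this sketch.
    return "no-store"
```

Long lifetimes on static assets only work when file names are versioned (for example, a content hash in the name), since otherwise edge servers keep serving the old file until the TTL expires.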
Putting It Together: the Decision Heuristic
When you are designing a system and you reach a bottleneck, ask:
- Too many requests for one server? → Add a load balancer and scale horizontally.
- Too many repeated reads hitting the database? → Add a cache in front of the DB.
- Writes are slow or processing is long? → Move the work to a message queue and process it asynchronously.
- Database is the bottleneck? → Add read replicas, then consider sharding if writes are the problem.
- Static assets are slow for distant users? → Put them on a CDN.