System Design Fundamentals

The System Design Process

18 min Lesson 2 of 10

Experienced engineers do not sit down and immediately start drawing databases or picking technologies. They follow a repeatable, structured process that turns a vague problem statement, such as "design Twitter" or "build a ride-sharing platform", into a concrete, defensible architecture. This lesson teaches that process step by step, so you can apply it in interviews, design reviews, and real projects.

Why a process matters: Without structure, teams argue about technology choices before they agree on what the system must actually do. A shared process aligns everyone on goals before diving into solutions.

The Five-Phase Process

Every system design session — whether a 45-minute interview or a three-week architecture review — maps to the same five phases:

  1. Gather and clarify requirements
  2. Estimate scale and capacity
  3. Produce a high-level design
  4. Deep dive into critical components
  5. Identify bottlenecks and mitigations

These phases are iterative, not strictly waterfall. A deep-dive finding may force you to revise your estimates. A bottleneck you discover might require revisiting the high-level design. Think of the phases as a loop with a consistent starting point.

[Figure: The Five-Phase System Design Process. 1. Requirements (functional and non-functional) → 2. Estimation (scale, storage, bandwidth) → 3. High-Level Design (components and data flow) → 4. Deep Dive (critical path, schema, APIs) → 5. Bottlenecks (SPOFs, scaling mitigations). A return arrow is labeled "iterate as needed".]
The five phases of system design — with a feedback loop back to earlier phases as new information emerges.

Phase 1 — Gather and Clarify Requirements

Before you draw a single box, spend time asking questions. Requirements fall into two buckets:

  • Functional requirements: what the system does. For a URL shortener, users paste a long URL and receive a short one; when someone visits the short URL, they are redirected to the original.
  • Non-functional requirements: how well the system performs. These include the availability target (99.9% vs 99.99%), acceptable read latency (for example, under 100 ms), durability guarantees, geographic distribution, and compliance constraints.
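
To make the gap between availability targets concrete, it can help to convert each percentage into a yearly downtime budget. The sketch below is illustrative only: the function name `downtime_budget_minutes` is hypothetical, and it assumes a 365-day year.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: ignore leap years


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the downtime allowed per year, in minutes, for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return SECONDS_PER_YEAR * unavailable_fraction / 60


for target in (99.9, 99.99):
    print(f"{target}% -> {downtime_budget_minutes(target):.1f} minutes/year")
```

Under these assumptions, 99.9% allows about 8.8 hours of downtime per year, while 99.99% allows under an hour. That difference is usually what drives the need for redundancy and automated failover.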

Good questions to ask: Who are the users and how do they interact with the system? What does success look like? What are the read vs write ratios? Does data need to be durable? What is the expected growth over the next 1–2 years?

Interview tip: In a 45-minute design session, spend roughly 5 minutes on requirements. Interviewers reward candidates who ask clarifying questions before jumping to solutions — it signals real engineering judgment.

Phase 2 — Estimate Scale and Capacity

Requirements give you the what; estimation gives you the how big. You need rough numbers to make sensible technology choices. Back-of-the-envelope calculations — covered in depth in Lesson 4 — address three areas:

  • Traffic: How many requests per second? (e.g. 100 M daily active users × 10 requests/day = 1 B requests/day; divided by 86,400 seconds/day, that is ~11,600 requests/second on average)
  • Storage: How much data will accumulate? (e.g. 100 B records × 500 bytes average = 50 TB)
  • Bandwidth: What is peak inbound and outbound throughput? Does the system serve large media files or small JSON payloads?

These numbers determine whether you need one database or a distributed cluster, whether you need a CDN, and whether a simple in-process cache is enough or you need a dedicated Redis tier.
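
The traffic and storage arithmetic above can be written as a short script, which makes it easy to rerun when an assumption changes. This is a rough sketch: the function names are hypothetical, and `PEAK_FACTOR` is an assumed multiplier for illustration, not a figure from this lesson.

```python
SECONDS_PER_DAY = 24 * 60 * 60
PEAK_FACTOR = 3  # assumption: peak traffic is about 3x the daily average


def average_rps(daily_active_users: int, requests_per_user_per_day: int) -> float:
    """Average requests per second, spread evenly across the day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def storage_terabytes(record_count: int, avg_record_bytes: int) -> float:
    """Total storage in decimal terabytes (1 TB = 10**12 bytes)."""
    return record_count * avg_record_bytes / 10**12


rps = average_rps(100_000_000, 10)
storage_tb = storage_terabytes(100_000_000_000, 500)

print(f"average: {rps:,.0f} requests/second")
print(f"assumed peak: {rps * PEAK_FACTOR:,.0f} requests/second")
print(f"storage: {storage_tb:,.0f} TB")
```

With the example inputs, the average works out to about 11,574 requests/second and the storage to 50 TB. Sizing for the assumed peak rather than the average is the safer default.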

Phase 3 — High-Level Design

With requirements and scale in hand, sketch the major components and how data flows between them. A high-level diagram typically shows: clients, load balancers, application servers, caches, databases, message queues, and external services. You are not yet deciding which specific database product to use or how to shard it — that comes next.

The goal of this phase is a shared mental model that everyone in the room can reason about. Keep it simple enough to fit on a whiteboard.

[Figure: High-Level Design, URL Shortener Example. Components: Client (browser / app), Load Balancer (round-robin), App Server (write path), App Server (read path), Cache (Redis / Memcached), Primary DB (MySQL / Postgres), Read Replica (scales reads). Edge labels: writes, reads, replication.]
A high-level design for a URL shortener — separating write and read paths, with cache and read replicas.
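
One common way to wire the cache into the read path of such a design is the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache. The sketch below is a minimal illustration in which plain dictionaries stand in for the cache and the database. The class `UrlStore` and its methods are hypothetical names, not part of any library.

```python
from typing import Optional


class UrlStore:
    """Hypothetical stand-in: dicts play the roles of the cache and the database."""

    def __init__(self) -> None:
        self.db: dict[str, str] = {}
        self.cache: dict[str, str] = {}
        self.db_reads = 0  # counts how often the read path falls through to the DB

    def save(self, short_key: str, long_url: str) -> None:
        """Write path: persist to the database and drop any stale cache entry."""
        self.db[short_key] = long_url
        self.cache.pop(short_key, None)

    def resolve(self, short_key: str) -> Optional[str]:
        """Read path: serve from the cache, falling back to the database on a miss."""
        if short_key in self.cache:
            return self.cache[short_key]
        self.db_reads += 1
        long_url = self.db.get(short_key)
        if long_url is not None:
            self.cache[short_key] = long_url  # populate the cache for later reads
        return long_url


store = UrlStore()
store.save("abc123", "https://example.com/a/very/long/path")
store.resolve("abc123")  # miss: reads the database, then fills the cache
store.resolve("abc123")  # hit: served from the cache
print(store.db_reads)
```

In a real deployment the cache would also need an eviction policy and expiry, which this sketch leaves out.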

Phase 4 — Deep Dive into Critical Components

Pick the two or three components that carry the most technical risk and drill down. For each, you should answer:

  • Data model: What does the schema look like? How is data partitioned?
  • API contract: What are the key endpoint signatures and payloads?
  • Algorithm or protocol: How does the component actually do its job? (e.g. consistent hashing for the cache layer, Snowflake IDs for the URL shortener key generation)
  • Failure modes: What happens when this component is slow or unavailable?
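
As an example of the algorithm-level detail a deep dive might reach, here is a minimal sketch of consistent hashing for routing keys to cache nodes. It assumes SHA-256 as the hash function and 100 virtual nodes per server. The class name, node names, and parameters are illustrative choices, not a reference implementation.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to nodes so that removing a node only remaps that node's keys."""

    def __init__(self, nodes, vnodes: int = 100) -> None:
        self._ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(key: str) -> int:
        # Use the first 8 bytes of SHA-256 as a 64-bit position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add_node(self, node: str, vnodes: int = 100) -> None:
        # Each node gets several virtual points to spread load more evenly.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise ValueError("ring has no nodes")
        # Find the first ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
print(ring.get_node("user:42"))
```

The property worth stating in a review is that removing `cache-b` leaves every key that lived on `cache-a` or `cache-c` where it was, so only a fraction of the cache goes cold.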

Focus your deep dive where the interviewer or your team has expressed the most interest. In a URL shortener, that is usually key generation (ensuring uniqueness at scale) and the redirect hot path (sub-10 ms latency target).
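
For the key-generation deep dive, one common approach is to take a unique integer ID and encode it in base 62 to get a short, URL-safe key. The sketch below assumes the integer comes from some unique source, such as a database sequence or a Snowflake-style generator, which is not shown. The function names and alphabet order are illustrative choices.

```python
import string

# 62 URL-safe symbols: digits, then lowercase, then uppercase (order is a choice).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(text: str) -> int:
    """Decode a base-62 string back to the original integer."""
    number = 0
    for ch in text:
        number = number * BASE + ALPHABET.index(ch)
    return number


short_key = encode_base62(123_456_789)
print(short_key, decode_base62(short_key))
```

Because each unique integer maps to exactly one key, uniqueness is inherited from the ID source and no collision check is needed. Seven base-62 characters cover roughly 3.5 trillion distinct keys.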

Common pitfall: Many candidates jump straight to the deep dive without establishing requirements or a high-level design first. This leads to over-engineered solutions that solve the wrong problem. Always anchor the deep dive in the requirements from Phase 1.

Phase 5 — Identify Bottlenecks and Mitigations

No design is perfect. In the final phase, stress-test your architecture by asking:

  • Where are the single points of failure (SPOFs)? How do you eliminate each one? (e.g. add a standby load balancer, use primary-replica failover)
  • What happens when traffic grows 10× overnight? Which component breaks first?
  • Are there hot spots? (e.g. one shard receiving disproportionate traffic for a celebrity user)
  • Is there a thundering herd risk when a cache is cold after a restart?

For each bottleneck, propose a concrete mitigation: horizontal scaling, caching at a higher layer, message queues to absorb traffic spikes, circuit breakers to isolate failure, or geographic sharding to reduce latency for international users.
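
As one illustration of these mitigations, a circuit breaker stops calling a failing dependency for a cooling-off period instead of letting every request wait on it. The sketch below is a simplified version under stated assumptions: the class names, the threshold of 3 failures, and the 30-second timeout are all illustrative, and it is not thread-safe.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so tests can control time
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


breaker = CircuitBreaker()
print(breaker.call(lambda: "dependency responded"))
```

The trade-off is that some requests are rejected even though the dependency may have recovered, in exchange for protecting callers from piling up behind a slow or dead service.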

Best practice: Articulate the trade-offs of each mitigation. Adding a cache improves read latency but introduces consistency complexity. Splitting a monolith into microservices reduces coupling but adds network latency and operational overhead. Showing you understand these trade-offs is the mark of a senior engineer.

Putting It Together: A Worked Example

Imagine you are designing a real-time leaderboard for a mobile game with 50 million daily active users. The five phases unfold like this:

  1. Requirements: Show top-100 players globally; update scores within 2 seconds; read-heavy (100:1 read/write ratio).
  2. Estimation: 50 M users × 20 score updates/day = 1 B writes/day, or ~11,600 writes/sec on average; at a 100:1 ratio that is ~1.16 M reads/sec on average, with peaks higher still.
  3. High-level design: Score ingestion service → message queue → aggregation workers → Redis sorted set → API servers → clients.
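  4. Deep dive: Redis ZADD to write scores and ZRANGE with the REV option to read the top 100 from a single sorted set; shard by region if global consistency is relaxed; remove stale entries explicitly, since a Redis TTL applies to the whole key rather than to individual sorted-set members.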
  5. Bottlenecks: the single Redis sorted set is a SPOF, so add replicas and Redis Sentinel for automatic failover; the global leaderboard is a hot key, so consider approximate top-K with a Count-Min Sketch.
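
The core of the deep-dive step can be prototyped without Redis to check the intended behavior. The sketch below uses an in-memory dictionary as a stand-in for the sorted set. The `Leaderboard` class is hypothetical, and keeping only each player's best score is an assumption about the game's rules, similar in spirit to ZADD with the GT option.

```python
import heapq


class Leaderboard:
    """In-memory stand-in for a sorted set: maps each player to a best score."""

    def __init__(self) -> None:
        self._scores: dict[str, int] = {}

    def submit(self, player: str, score: int) -> None:
        # Assumption: only a higher score replaces the stored one.
        if score > self._scores.get(player, float("-inf")):
            self._scores[player] = score

    def top(self, n: int = 100) -> list[tuple[str, int]]:
        """Return the n highest-scoring (player, score) pairs, best first."""
        return heapq.nlargest(n, self._scores.items(), key=lambda item: item[1])


board = Leaderboard()
for player, score in [("ana", 50), ("bo", 70), ("ana", 40), ("cy", 60)]:
    board.submit(player, score)
print(board.top(2))
```

A prototype like this is useful for agreeing on semantics, such as tie-breaking and whether lower scores overwrite, before committing to a storage engine.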

Notice how each phase feeds the next and the final system is firmly grounded in the requirements stated at the start. That is the power of a repeatable process.