From Requirements to a First Diagram
Every real system design interview or real-world architecture session begins the same way: a wall of words — functional requirements, non-functional constraints, scale estimates — and a blank whiteboard. The skill that separates a senior engineer from a junior one is the ability to translate that wall of words into a coherent high-level architecture diagram in under ten minutes. This lesson walks through that translation process step by step, using a concrete worked example.
Why Diagram Early?
A first diagram is not a detailed design document. It is a communication tool. Its purpose is to:
- Make shared assumptions visible so the whole team can challenge them.
- Reveal missing requirements before any code is written.
- Expose the most important trade-offs early, when they are cheap to change.
- Give the conversation a concrete anchor: "should the cache sit here or here?"
The Five-Step Translation Process
Turning requirements into a diagram follows a repeatable sequence:
- Identify the actors. Who or what initiates requests? (Mobile apps, browsers, third-party services, batch jobs.)
- Identify the core data flows. For each actor, what does it send and what does it expect back?
- Name the logical services. Group responsibilities into named boxes. Each box should have a single, stateable job.
- Add the infrastructure connectors. Where do you need a load balancer, a cache, a message queue, or a CDN?
- Mark the data stores. One box per distinct storage concern (primary DB, search index, object store, cache).
This sequence is intentionally top-down and outside-in: start from the boundary of the system (the actors) and work inward. Do not start with the database — the database is an implementation detail of the service, not the other way around.
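One way to keep the five steps honest is to write the diagram down as data before drawing it. The sketch below is only an illustration: the `Diagram` class, its field names, and the example boxes ("API Service", "Primary DB") are hypothetical and not part of any diagramming tool. It records actors, flows, services, connectors, and stores, then flags flows that point at boxes that do not exist, boxes nothing connects to, and arrows without a label.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Diagram:
    """A first diagram as plain data (hypothetical structure for illustration)."""
    actors: set[str] = field(default_factory=set)             # step 1
    flows: list[tuple[str, str, str]] = field(default_factory=list)  # step 2: (source, target, label)
    services: dict[str, str] = field(default_factory=dict)    # step 3: name -> single job
    connectors: set[str] = field(default_factory=set)         # step 4
    stores: set[str] = field(default_factory=set)             # step 5

    def nodes(self) -> set[str]:
        """Every box on the diagram, regardless of kind."""
        return self.actors | set(self.services) | self.connectors | self.stores

    def dangling_flows(self) -> list[tuple[str, str, str]]:
        """Flows whose source or target is not a box on the diagram."""
        known = self.nodes()
        return [f for f in self.flows if f[0] not in known or f[1] not in known]

    def unconnected_nodes(self) -> set[str]:
        """Boxes that no flow touches."""
        touched = {name for src, dst, _ in self.flows for name in (src, dst)}
        return self.nodes() - touched

    def unlabeled_flows(self) -> list[tuple[str, str, str]]:
        """Arrows with an empty label (read/write, sync/async not stated)."""
        return [f for f in self.flows if not f[2].strip()]


diagram = Diagram(
    actors={"Client"},
    services={"API Service": "validate requests and apply business rules"},
    connectors={"Load Balancer"},
    stores={"Primary DB"},
    flows=[
        ("Client", "Load Balancer", "sync request"),
        ("Load Balancer", "API Service", "sync request"),
        ("API Service", "Primary DB", "sync read/write"),
    ],
)

print(diagram.dangling_flows())     # flows pointing at missing boxes
print(diagram.unconnected_nodes())  # boxes with no arrows
print(diagram.unlabeled_flows())    # arrows missing a label
```

Writing the services as a name-to-job mapping forces the "single, stateable job" test from step 3: if the job string needs an "and also", the box probably wants splitting.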
Worked Example: URL-Shortening Service
Requirements gathered in a prior session:
- Functional: Given a long URL, return a unique short code. Given a short code, redirect to the original URL. Optionally, allow users to choose a custom alias.
- Non-functional: 100 million new links per day (write path, ~1,200 writes/sec averaged over the day); 10 billion redirect lookups per day (read-heavy path, ~115,000 reads/sec averaged over the day); redirects must complete in under 10 ms p99; 99.99 % availability; links never expire unless the user deletes them.
- Out of scope (for this first diagram): analytics dashboard, abuse detection, rate limiting.
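The per-second figures in the non-functional requirements come from dividing the daily totals by the 86,400 seconds in a day, which yields a daily average rate. A minimal sketch of that arithmetic follows; the `PEAK_FACTOR` multiplier is an assumption added for illustration and does not come from the requirements.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Average events per second for a given daily total."""
    return events_per_day / SECONDS_PER_DAY


writes_per_sec = per_second(100_000_000)      # 100 million new links/day
reads_per_sec = per_second(10_000_000_000)    # 10 billion redirects/day
read_write_ratio = reads_per_sec / writes_per_sec

# Hypothetical multiplier: real traffic is rarely flat across the day.
PEAK_FACTOR = 2.0

print(f"writes/sec (avg): {writes_per_sec:,.0f}")
print(f"reads/sec  (avg): {reads_per_sec:,.0f}")
print(f"read:write ratio: {read_write_ratio:.0f}:1")
print(f"reads/sec at assumed {PEAK_FACTOR}x peak: {reads_per_sec * PEAK_FACTOR:,.0f}")
```

The averages round to about 1,157 writes/sec and 115,741 reads/sec, which is where the rounded ~1,200 and ~115,000 figures come from.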
Step 1 — Identify the actors
Two actors: a Browser / Mobile Client that creates short links (write path) and a Browser / Mobile Client that follows short links (read path). Both are the same client type, so they collapse into one box at the edge.
Step 2 — Identify the core data flows
- Write: Client → POST /shorten → returns short_code
- Read: Client → GET /<code> → 301/302 redirect to original URL
Step 3 — Name the logical services
Two distinct responsibilities emerge: a Link Creation Service (accepts long URLs, generates codes, persists mappings) and a Redirect Service (looks up a code and issues a redirect). Keeping them separate lets you scale them independently — redirects are 100× more frequent than creations.
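To make the split concrete, here is a minimal in-memory sketch of the two services. It rests on several assumptions that are not in the requirements: codes are generated by base-62 encoding a counter, a plain `dict` stands in for the data store, and the class and method names (`LinkCreationService.shorten`, `RedirectService.resolve`) are hypothetical. The redirect returns 302 here; whether to use 301 or 302 is a separate design decision.

```python
from __future__ import annotations

import string

# 62 symbols: digits, then lowercase, then uppercase (an assumed alphabet).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62(n: int) -> str:
    """Encode a non-negative integer using the 62-symbol alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class LinkCreationService:
    """Write path: accepts long URLs, generates codes, persists mappings."""

    def __init__(self, store: dict[str, str], start_id: int = 1) -> None:
        self._store = store
        self._next_id = start_id

    def shorten(self, long_url: str, custom_alias: str | None = None) -> str:
        if custom_alias is not None:
            if custom_alias in self._store:
                raise ValueError("alias already taken")
            code = custom_alias
        else:
            code = base62(self._next_id)
            self._next_id += 1
            # Skip generated codes that a user already claimed as an alias.
            while code in self._store:
                code = base62(self._next_id)
                self._next_id += 1
        self._store[code] = long_url
        return code


class RedirectService:
    """Read path: looks up a code and returns (status, location)."""

    def __init__(self, store: dict[str, str]) -> None:
        self._store = store

    def resolve(self, code: str) -> tuple[int, str | None]:
        url = self._store.get(code)
        if url is None:
            return 404, None
        return 302, url


store: dict[str, str] = {}
creator = LinkCreationService(store)
redirector = RedirectService(store)

code = creator.shorten("https://example.com/some/long/path")
print(code, redirector.resolve(code))
```

The two classes share only the store, which is what allows the redirect side to run on many more instances than the creation side.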
Step 4 — Add infrastructure connectors
The redirect path carries about 115,000 reads/sec. A single relational database node can handle roughly 10,000–30,000 simple reads/sec on commodity hardware, so the arithmetic demands a cache in front of the database: at a 99 % cache-hit rate only ~1,150 reads/sec reach the DB, which is easily handled. A load balancer sits in front of both services because both run as multiple instances, for throughput and for availability. A CDN is optional for the redirect service but adds useful geographic edge caching.
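The cache argument is a one-line calculation: the database only sees the fraction of reads that miss the cache. The sketch below reproduces it and also inverts it to find the minimum hit ratio needed for a given database ceiling. The function names are hypothetical; the 115,000 reads/sec and 10,000–30,000 reads/sec figures are the ones used in this step.

```python
def db_reads_per_sec(total_reads_per_sec: float, cache_hit_ratio: float) -> float:
    """Reads that miss the cache and therefore reach the database."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_reads_per_sec * (1.0 - cache_hit_ratio)


def min_hit_ratio(total_reads_per_sec: float, db_capacity_per_sec: float) -> float:
    """Smallest hit ratio that keeps database load within its capacity."""
    return max(0.0, 1.0 - db_capacity_per_sec / total_reads_per_sec)


READS_PER_SEC = 115_000
DB_CEILING_LOW, DB_CEILING_HIGH = 10_000, 30_000

for ratio in (0.0, 0.90, 0.99):
    load = db_reads_per_sec(READS_PER_SEC, ratio)
    print(f"hit ratio {ratio:.0%}: {load:,.0f} DB reads/sec")

print(f"min hit ratio for a {DB_CEILING_LOW:,}/sec DB:  "
      f"{min_hit_ratio(READS_PER_SEC, DB_CEILING_LOW):.1%}")
print(f"min hit ratio for a {DB_CEILING_HIGH:,}/sec DB: "
      f"{min_hit_ratio(READS_PER_SEC, DB_CEILING_HIGH):.1%}")
```

Even a 90 % hit ratio leaves 11,500 reads/sec on the database, which already exceeds the low end of the estimate, so the hit-ratio target is a real requirement and not a nicety.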
Step 5 — Mark the data stores
One primary data store: a key-value or relational database mapping short_code → original_url. Assuming a five-year planning horizon, 100 million new links/day × 365 days × 5 years ≈ 182.5 billion rows, so a sharded relational DB or a distributed key-value or wide-column store (DynamoDB, Cassandra) is appropriate. Add an in-memory cache (Redis) for the hot read path.
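The row count can be extended into two further back-of-envelope numbers: total storage and the shortest code length that can address every row. Both rest on assumptions not stated in the requirements: an average of 500 bytes per row and a 62-symbol code alphabet (digits plus upper- and lower-case letters). Treat the sketch as an estimate under those assumptions.

```python
LINKS_PER_DAY = 100_000_000
YEARS = 5

total_rows = LINKS_PER_DAY * 365 * YEARS

# Assumed average row size (short code, original URL, metadata, overhead).
BYTES_PER_ROW = 500
total_terabytes = total_rows * BYTES_PER_ROW / 1e12


def min_code_length(rows: int, alphabet_size: int = 62) -> int:
    """Shortest code length whose keyspace covers at least `rows` codes."""
    length = 1
    while alphabet_size ** length < rows:
        length += 1
    return length


print(f"rows after {YEARS} years: {total_rows:,}")
print(f"storage at {BYTES_PER_ROW} B/row: {total_terabytes:.2f} TB")
print(f"minimum code length: {min_code_length(total_rows)} characters")
```

Under these assumptions six characters give about 56.8 billion codes, which is too few, while seven give about 3.5 trillion. The storage figure is far beyond a single commodity node, which supports the sharding decision.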
Reading the Diagram Back Against Requirements
A critical habit: once the diagram exists, read each requirement and trace where it is satisfied. This closes the loop and surfaces gaps.
- 100 M writes/day: The load balancer distributes requests across multiple Link Creation Service instances, each of which writes to the sharded Primary DB. ✓
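- 10 B reads/day at <10 ms p99: Redirect Service checks Redis first (an in-memory lookup, typically around a millisecond or less including the network hop inside one data center). Cache-hit ratio target: 99 %. Only cache misses reach the DB. Because a 99 % hit ratio puts the p99 right at the hit/miss boundary, the DB miss path should also stay close to the 10 ms budget. ✓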
- 99.99 % availability (52 min downtime/year): Every box must be replicated. The diagram implies multiple instances behind the load balancer and a DB replica — write those assumptions into your notes now, before the interview moves on. ✓
- Links never expire: No TTL logic shown — by design. ✓
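The availability line can be checked with the same kind of arithmetic. The sketch below converts an availability target into allowed downtime and shows why replication matters. It assumes replicas fail independently, which real deployments only approximate, and the 99 % single-instance figure is a hypothetical input.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime per year for a given availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def replicated_availability(single: float, replicas: int) -> float:
    """Availability of N replicas where any one suffices (independent failures assumed)."""
    return 1.0 - (1.0 - single) ** replicas


def chained_availability(*components: float) -> float:
    """Availability of components that must all be up for a request to succeed."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


print(f"99.99 % target: {downtime_minutes_per_year(0.9999):.2f} min/year")

# Hypothetical: each instance is individually 99 % available.
for n in (1, 2, 3):
    print(f"{n} replica(s): {replicated_availability(0.99, n):.6f}")

# A request passes through several boxes in series; the product is lower than any one box.
print(f"three 99.99 % boxes in series: {chained_availability(0.9999, 0.9999, 0.9999):.6f}")
```

The series calculation is the reason every box on the request path needs replication: three boxes that each meet 99.99 % on their own fall short of 99.99 % end to end.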
Common First-Diagram Mistakes
- Drawing a monolith with one database and calling it done. This is fine only if the scale genuinely fits a single server, so show that you checked the numbers.
- Adding services before justifying them. Every box must earn its place: "I added a separate analytics service because writing analytics on the hot redirect path would add 5–20 ms latency."
- Forgetting the data path direction. An arrow without a direction label is ambiguous. Always mark write vs. read, sync vs. async.
- Treating the first diagram as final. The first diagram should provoke questions. If no one asks "why is the cache here and not there?", the diagram is probably too vague.
From Sketch to Iteration
After presenting the first diagram, the interviewer (or your team lead) will probe the weakest point. Typical probe: "What happens when Redis goes down?" Walk through the diagram: the Redirect Service falls back to the Primary DB. Read capacity drops to the database's ceiling of roughly 30,000 reads/sec (the upper estimate from Step 4), which keeps part of the traffic alive but falls well short of the ~115,000 reads/sec the redirect path needs. Mitigation: Redis Sentinel or Redis Cluster for automatic failover. Now update the diagram with a small note, or draw a second diagram showing the Redis Cluster topology. Each iteration answers one question and raises the level of confidence in the design.
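The fallback described in the Redis probe can be sketched as a cache-aside read that tolerates a cache outage. Everything here is a stand-in: `FakeCache` and `CacheUnavailable` are hypothetical test doubles and not a real Redis client, and a plain `dict` plays the Primary DB. A real client would raise its own connection error type in place of `CacheUnavailable`.

```python
from __future__ import annotations

from collections import Counter


class CacheUnavailable(Exception):
    """Raised by the stand-in cache when it is down."""


class FakeCache:
    """In-memory stand-in for a cache that can be switched off."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}
        self.up = True

    def get(self, key: str) -> str | None:
        if not self.up:
            raise CacheUnavailable
        return self.data.get(key)

    def set(self, key: str, value: str) -> None:
        if not self.up:
            raise CacheUnavailable
        self.data[key] = value


def resolve(code: str, cache: FakeCache, db: dict[str, str], stats: Counter) -> str | None:
    """Cache-aside lookup that falls back to the database if the cache fails."""
    cache_ok = True
    try:
        url = cache.get(code)
    except CacheUnavailable:
        cache_ok = False
        url = None
        stats["cache_errors"] += 1
    if url is not None:
        stats["cache_hits"] += 1
        return url

    stats["db_reads"] += 1
    url = db.get(code)
    if url is not None and cache_ok:
        try:
            cache.set(code, url)  # populate the cache for the next reader
        except CacheUnavailable:
            pass  # the redirect still succeeds without the cache write
    return url


db = {"abc": "https://example.com/landing"}
cache = FakeCache()
stats: Counter = Counter()

resolve("abc", cache, db, stats)   # miss: read from DB, fill cache
resolve("abc", cache, db, stats)   # hit: served from cache
cache.up = False
resolve("abc", cache, db, stats)   # outage: served from DB
print(dict(stats))
```

During the outage every request lands on the database, which is exactly the load spike the probe is asking about; the counters make that visible.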
The discipline of first sketching broadly and then drilling into weak points is the core of the system design process. The diagram is the medium; the conversation around it is the product.