System Design Fundamentals

Gathering & Clarifying Requirements

18 min Lesson 3 of 10

One of the most underestimated skills in system design is the ability to ask the right questions before drawing a single box. In a real interview — or a real engineering project — jumping straight into architecture without understanding the problem leads to designing the wrong system entirely. This lesson teaches you to extract, classify, and prioritize requirements so that every architectural decision you make is grounded in reality.

Key idea: A system design session that starts with five minutes of sharp questions will almost always outperform one that starts with fifteen minutes of drawing diagrams. Requirements are the contract between the problem and the solution.

Two Categories of Requirements

Every system has two complementary kinds of requirements, and you must gather both before sketching an architecture.

Functional Requirements (FR)

Functional requirements describe what the system does — the features and behaviors visible to users or calling services. They answer the question: "What must this system be able to do?"

Users can upload a video up to 5 GB.
The system delivers a shortened URL that redirects to the original within 50 ms.
Riders can track their driver's live location on a map.
The feed service shows posts from people a user follows, ranked by recency.

Functional requirements directly produce the core entities (User, Video, URL, Trip) and the APIs (POST /upload, GET /{shortCode}) of your design. Get these wrong and the whole architecture solves a problem nobody asked for.

Non-Functional Requirements (NFR)

Non-functional requirements describe how well the system does it — qualities like performance, scale, reliability, and security. They answer: "What constraints must the system satisfy?"

Scale: 500 million daily active users; 10 billion URL redirects per day.
Latency: p99 read latency < 100 ms; video transcoding within 5 minutes.
Availability: 99.99 % uptime (about 52.6 minutes of downtime per year).
Consistency: Eventual consistency is acceptable for likes; strong consistency required for payment balances.
Durability: Uploaded files must survive any single datacenter failure.
Security: All user data encrypted at rest and in transit; GDPR compliance required.

NFRs are the invisible hand that forces specific architectural choices. A 99.99 % availability target rules out single-node databases. A sub-100 ms read latency rules out fetching fresh data from a remote replica on every request. These numbers come from the interviewer or the product brief. If one is missing, ask for it or state an explicit assumption; never invent a value silently.

Functional requirements define what the system does; non-functional requirements define how well it must do it. Both feed directly into architecture decisions.
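To see how an availability target turns into a concrete downtime budget, the conversion can be sketched in a few lines. This is a minimal illustration, assuming a 365-day year; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years are ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target} % -> {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the budget by a factor of ten, which is why the availability number is worth confirming before choosing a replication strategy.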

Constraints and Assumptions

Beyond FRs and NFRs, you must surface two more categories explicitly:

Constraints are hard limits that are not negotiable: "We must use the existing PostgreSQL cluster," "The budget allows three cloud regions," "All data must stay within the EU." Constraints reduce your design space immediately.
Assumptions are unknowns you are making explicit: "I will assume the read-to-write ratio is 100:1," "I will assume users are globally distributed," "I will assume peak traffic is 10× average." Every assumption should be stated out loud so it can be confirmed or corrected.

Best practice: Write your assumptions on the whiteboard as you make them. This gives the interviewer a chance to correct you early and demonstrates disciplined thinking. An unstated assumption that turns out to be wrong can invalidate hours of design work.
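Outside an interview, the same habit can be kept as a small assumption log. The sketch below is one possible shape, not a prescribed format; the Assumption class and the unconfirmed helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    statement: str
    confirmed: bool = False  # set to True once the interviewer or stakeholder agrees


def unconfirmed(assumptions):
    """Return the statements that still need to be checked with someone."""
    return [a.statement for a in assumptions if not a.confirmed]


if __name__ == "__main__":
    log = [
        Assumption("Read-to-write ratio is 100:1", confirmed=True),
        Assumption("Users are globally distributed"),
        Assumption("Peak traffic is 10x average"),
    ]
    for statement in unconfirmed(log):
        print("Still to confirm:", statement)
```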

The Five Clarifying Questions You Must Always Ask

Different problems call for different questions, but these five cover the vast majority of system design scenarios:

Scale: "How many daily active users do we expect? What is the expected QPS (queries per second) at peak?" This determines whether you need a monolith, a few services, or a full distributed system.
Read/Write Ratio: "Is this system read-heavy, write-heavy, or balanced?" A 100:1 read-to-write ratio points toward aggressive caching and read replicas. A 1:1 ratio points toward optimizing write throughput.
Data Size: "How much data will we store? What is the average payload size per record?" This drives storage technology choices and the need for sharding.
Consistency vs. Availability: "Is it acceptable for a user to see slightly stale data? What happens if two users update the same record simultaneously?" This maps to the trade-off described by the CAP theorem: during a network partition, the system must give up either consistency or availability.
Latency SLA: "What is the acceptable response time from the user's perspective? Is there a p99 latency target?" As a rule of thumb, sub-10 ms typically requires serving from an in-memory cache; sub-100 ms is achievable with a well-indexed relational database; sub-1 s leaves many more options open.
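Two of these answers translate almost mechanically into design hints. The sketch below encodes the latency tiers and the read/write guidance from the list above; the function names are hypothetical, and the 10:1 cut-off for "read-heavy" is an assumption chosen for illustration, not a standard threshold.

```python
def latency_hint(p99_target_ms: float) -> str:
    """Map a p99 latency target to the rule-of-thumb tiers in the lesson."""
    if p99_target_ms < 10:
        return "serve from an in-memory cache"
    if p99_target_ms < 100:
        return "a well-indexed relational database can work"
    if p99_target_ms < 1000:
        return "many storage options can work"
    return "latency is unlikely to drive the design"


def workload_hint(reads: float, writes: float, read_heavy_at: float = 10.0) -> str:
    """Classify a workload by its read-to-write ratio.

    read_heavy_at is an assumed cut-off for illustration only.
    """
    if reads <= 0 or writes <= 0:
        raise ValueError("reads and writes must be positive")
    ratio = reads / writes
    if ratio >= read_heavy_at:
        return "read-heavy: consider caching and read replicas"
    if ratio <= 1 / read_heavy_at:
        return "write-heavy: optimize write throughput"
    return "balanced: optimize write throughput alongside reads"


if __name__ == "__main__":
    print(latency_hint(50))
    print(workload_hint(reads=10_000_000_000, writes=100_000_000))
```

Treat the output as a prompt for discussion, not a decision: the real choice still depends on the other three answers.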

A Worked Example: Designing a URL Shortener

Let us walk through the requirement-gathering phase for a URL shortener — a classic system design problem.

Functional requirements extracted through dialogue:

A user can submit a long URL and receive a short code (e.g., short.ly/aB3xK).
Anyone visiting the short URL is redirected to the original in under 50 ms.
Optional: users can create custom aliases.
Optional: URLs can expire after a configurable TTL.
Out of scope (confirmed): analytics dashboard, user accounts, paid tiers.

Non-functional requirements extracted:

100 million new URLs created per day; 10 billion redirects per day (100:1 read/write).
p99 redirect latency < 50 ms.
99.99 % availability — the service is widely embedded in external content.
URLs once created are immutable (no editing); eventual consistency for read replicas is acceptable.
Data retention: URLs stored for at least 5 years.

Assumptions stated explicitly:

Average long URL length ≈ 100 bytes; short code ≈ 7 characters.
Storage estimate: 100 M URLs/day × 365 days × 5 years ≈ 182.5 billion records; at an assumed ≈ 500 bytes/record, that is ≈ 91 TB over 5 years.
Average redirect QPS ≈ 10 billion / 86,400 s ≈ 115,000 QPS; with an assumed 3× peak factor, peak redirect QPS ≈ 345,000.

Note: These numbers now force concrete decisions: 345,000 redirects/second cannot be served by a single database node. We need a distributed cache (Redis cluster), read replicas, and likely consistent hashing for key routing. None of that would have been obvious without the numbers.
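The back-of-envelope arithmetic for this example can be reproduced with a short script. This is a minimal sketch, assuming 365-day years and decimal terabytes (10^12 bytes); the function name and result keys are hypothetical. The unrounded results come out slightly above the rounded figures in the text (about 115,700 average and 347,000 peak redirects per second), which does not change the conclusion.

```python
SECONDS_PER_DAY = 86_400


def estimate_capacity(writes_per_day, reads_per_day, bytes_per_record,
                      retention_years, peak_factor):
    """Return rough storage and throughput figures for a read-heavy service."""
    total_records = writes_per_day * 365 * retention_years
    avg_read_qps = reads_per_day / SECONDS_PER_DAY
    return {
        "read_write_ratio": reads_per_day / writes_per_day,
        "total_records": total_records,
        "storage_tb": total_records * bytes_per_record / 1e12,  # decimal TB
        "avg_write_qps": writes_per_day / SECONDS_PER_DAY,
        "avg_read_qps": avg_read_qps,
        "peak_read_qps": avg_read_qps * peak_factor,
    }


if __name__ == "__main__":
    result = estimate_capacity(
        writes_per_day=100_000_000,
        reads_per_day=10_000_000_000,
        bytes_per_record=500,   # assumption stated in the lesson
        retention_years=5,
        peak_factor=3,          # assumption stated in the lesson
    )
    for name, value in result.items():
        print(f"{name}: {value:,.2f}")
```

Keeping the inputs as named parameters makes it easy to rerun the estimate when the interviewer corrects an assumption.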

Requirement Gathering as a Conversation

In an interview, requirement gathering is a structured dialogue, not an interrogation. A useful mental model is the funnel: start broad, then progressively narrow.

Figure: The requirements-gathering funnel. Move from broad problem understanding down to precise constraints before touching the architecture.

Prioritizing: MoSCoW for System Design

Not every feature matters equally. Apply the MoSCoW framework to functional requirements:

Must Have: The system is useless without these. (URL shortening, redirect.)
Should Have: High value, but can ship in a later iteration. (Custom aliases, TTL.)
Could Have: Nice to have if time permits. (QR code generation.)
Won't Have (now): Explicitly out of scope. (Analytics dashboard, billing.)

In a 45-minute interview, designing for every "could have" is a trap. Explicitly parking features in "Won't Have" shows senior-level prioritization and keeps the architecture focused on what actually matters.
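The MoSCoW split can be made explicit with a tiny lookup, which is sometimes useful when a requirements list grows beyond a handful of items. The sketch below uses the URL shortener features from this lesson; the Priority enum and in_scope helper are hypothetical names.

```python
from enum import IntEnum


class Priority(IntEnum):
    MUST = 1
    SHOULD = 2
    COULD = 3
    WONT = 4


FEATURES = {
    "URL shortening": Priority.MUST,
    "Redirect": Priority.MUST,
    "Custom aliases": Priority.SHOULD,
    "TTL expiry": Priority.SHOULD,
    "QR code generation": Priority.COULD,
    "Analytics dashboard": Priority.WONT,
    "Billing": Priority.WONT,
}


def in_scope(features, cutoff=Priority.MUST):
    """Return features at or above the cutoff, most important first.

    "Won't Have" items are never returned, whatever the cutoff.
    """
    selected = [
        (priority, name)
        for name, priority in features.items()
        if priority <= cutoff and priority != Priority.WONT
    ]
    return [name for _, name in sorted(selected)]


if __name__ == "__main__":
    print("Design now:", in_scope(FEATURES))
    print("If time permits:", in_scope(FEATURES, Priority.SHOULD))
```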

Common pitfall: Treating every stated requirement as equally important. If a feature came up only once during the requirements discussion and you then spend 20 minutes on its architecture, you will run out of time for the core components. Clarify priority explicitly: "Should I focus the design on the core redirect path, or is the analytics pipeline equally important?"

Documenting the Requirements

Before moving to architecture, write a brief requirement summary — even on a whiteboard, even in bullet form. A good summary looks like this:

System: URL Shortener

Functional (Must Have):
  - POST /shorten  → accept long URL, return short code
  - GET  /{code}   → redirect to original URL (<50 ms)

Functional (Should Have):
  - Custom alias support
  - URL expiry (configurable TTL)

Out of scope: analytics, user accounts, paid features

Non-Functional:
  - Scale: 100 M writes/day, 10 B reads/day (100:1)
  - Latency: p99 redirect < 50 ms
  - Availability: 99.99 %
  - Consistency: URLs immutable once created; eventually consistent replica reads OK
  - Storage: ~91 TB over 5 years

Assumptions:
  - Avg record size ~500 bytes
  - Peak read QPS ~345,000
  - Global user distribution

This document is your architecture's contract. Every decision you make from here should trace back to one of these lines. If a proposed component does not serve any requirement, remove it.
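The traceability rule can be checked mechanically once requirements have identifiers. The sketch below is one way to do it; the requirement IDs and the component list are hypothetical examples, not part of the summary above.

```python
REQUIREMENTS = {
    "FR-1": "POST /shorten accepts a long URL and returns a short code",
    "FR-2": "GET /{code} redirects to the original URL",
    "NFR-scale": "100 M writes/day, 10 B reads/day",
    "NFR-latency": "p99 redirect < 50 ms",
    "NFR-availability": "99.99 % availability",
}

# Each proposed component lists the requirement IDs it serves.
COMPONENTS = {
    "Shortening service": ["FR-1"],
    "Redirect service": ["FR-2", "NFR-latency"],
    "Redis cluster cache": ["NFR-latency", "NFR-scale"],
    "Read replicas": ["NFR-scale", "NFR-availability"],
    "Analytics pipeline": [],  # out of scope, so it traces to nothing
}


def untraceable(components, requirements):
    """Return components that serve no known requirement."""
    return sorted(
        name for name, ids in components.items()
        if not any(req_id in requirements for req_id in ids)
    )


def uncovered(components, requirements):
    """Return requirement IDs that no component claims to serve."""
    served = {req_id for ids in components.values() for req_id in ids}
    return sorted(req_id for req_id in requirements if req_id not in served)


if __name__ == "__main__":
    print("Remove or justify:", untraceable(COMPONENTS, REQUIREMENTS))
    print("Not yet addressed:", uncovered(COMPONENTS, REQUIREMENTS))
```

The second check works in the opposite direction: it flags requirements that the current design has not addressed at all.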

Interview tip: Spend 5-7 minutes on requirements, then confirm with the interviewer: "Does this capture what you had in mind?" This alignment checkpoint catches misunderstandings early and demonstrates collaborative engineering practice — exactly what senior engineers do on real projects.