Project: Estimate & Sketch a System
You have spent the first nine lessons of this tutorial learning the vocabulary and individual techniques of system design: gathering requirements, back-of-the-envelope estimation, latency and throughput targets, scalability patterns, trade-offs, and first diagrams. This final lesson is a capstone project — you will apply every one of those skills together on a single, realistic problem: designing a social photo-sharing service similar in scale to Instagram at its early-growth phase.
Work through each section yourself before reading the analysis. The habit of committing to an answer before checking the reference solution is what builds real design intuition.
The Problem Statement
You are asked to design the core of a photo-sharing application. Users can:
- Upload photos (JPEG/PNG, up to 10 MB each).
- View a personal home feed of photos from accounts they follow.
- Like and comment on photos.
- Follow and unfollow other users.
Non-functional targets given by the interviewer:
- 50 million daily active users (DAU).
- Each active user uploads an average of 0.1 photos per day and views 50 photos per day.
- The feed must load within 200 ms p95 on a mobile connection.
- Photo uploads can tolerate up to 1 second of latency.
- The system must target 99.9 % availability (about 8.8 hours of downtime per year).
- Data retention: photos are kept permanently; feed events are kept for 90 days.
Step 1: Back-of-the-Envelope Estimation
Before drawing a single box, commit to numbers. Vague intuitions ("it's a lot of reads") are not actionable; concrete estimates are.
Upload throughput
50 M DAU × 0.1 photos/day = 5 million photo uploads per day.
Spread evenly: 5 M ÷ 86,400 s ≈ 58 uploads/sec average.
Peak (assume 3× average): ≈ 175 uploads/sec.
Read / feed throughput
50 M DAU × 50 feed photos/day = 2.5 billion photo views per day.
Average: 2.5 B ÷ 86,400 ≈ 28,900 reads/sec.
Peak: ≈ 87,000 reads/sec.
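The throughput arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch using only the figures given in the problem statement; the constant and function names are illustrative, and the 3× peak factor is the same assumption the text makes.

```python
SECONDS_PER_DAY = 86_400

DAU = 50_000_000
UPLOADS_PER_USER_PER_DAY = 0.1
VIEWS_PER_USER_PER_DAY = 50
PEAK_FACTOR = 3  # assumption from the text: peak traffic is 3x the daily average


def per_second(events_per_day: float) -> float:
    """Convert a daily event count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


uploads_per_day = DAU * UPLOADS_PER_USER_PER_DAY  # 5 million
views_per_day = DAU * VIEWS_PER_USER_PER_DAY      # 2.5 billion

avg_uploads = per_second(uploads_per_day)
avg_reads = per_second(views_per_day)
peak_uploads = avg_uploads * PEAK_FACTOR
peak_reads = avg_reads * PEAK_FACTOR

print(f"uploads/sec: avg {avg_uploads:.1f}, peak {peak_uploads:.1f}")
print(f"reads/sec:   avg {avg_reads:,.0f}, peak {peak_reads:,.0f}")
print(f"read/write ratio: {peak_reads / peak_uploads:.0f}:1")
```

The unrounded values differ slightly from the hand-rounded figures (for example, the peak upload rate comes out just under 175), which is expected: in an estimate, the order of magnitude matters more than the last digit.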
Storage
5 M uploads/day × 3 MB average (after compression) = 15 TB new storage per day.
Over 5 years: 15 TB × 365 × 5 ≈ 27 PB of photo storage.
Metadata (URL, owner, timestamp, likes count): ~200 bytes per photo.
5 M × 365 × 5 × 200 B ≈ 1.8 TB of metadata over 5 years — trivial compared to binary storage.
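As a quick check on the storage figures, the sketch below repeats the calculation with decimal units (1 TB = 10^12 bytes). The variable names are illustrative. Like the estimate in the text, it counts a single copy of each photo and ignores replication and extra thumbnail resolutions.

```python
PHOTOS_PER_DAY = 5_000_000
AVG_PHOTO_BYTES = 3 * 10**6   # 3 MB average after compression (from the text)
METADATA_BYTES = 200          # per photo (from the text)
DAYS = 365 * 5                # five-year horizon

TB = 10**12
PB = 10**15

daily_photo_tb = PHOTOS_PER_DAY * AVG_PHOTO_BYTES / TB
total_photo_pb = PHOTOS_PER_DAY * AVG_PHOTO_BYTES * DAYS / PB
total_metadata_tb = PHOTOS_PER_DAY * METADATA_BYTES * DAYS / TB

print(f"new photo storage per day: {daily_photo_tb:.1f} TB")
print(f"photo storage over 5 years: {total_photo_pb:.1f} PB")
print(f"metadata over 5 years: {total_metadata_tb:.2f} TB")
```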
Bandwidth
Uploads: 175 uploads/sec × 3 MB ≈ 525 MB/s inbound.
Feed reads (thumbnail ~100 KB each): 87,000 reads/sec × 100 KB ≈ 8.7 GB/s outbound.
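The bandwidth numbers follow the same pattern: request rate times payload size. The sketch below also converts to gigabits per second, since network links are usually quoted in bits rather than bytes. The helper names are illustrative.

```python
MB = 10**6
KB = 10**3


def bytes_per_sec(requests_per_sec: int, bytes_per_request: int) -> int:
    """Bandwidth in bytes/sec for a given request rate and payload size."""
    return requests_per_sec * bytes_per_request


def to_gbit_per_sec(byte_rate: float) -> float:
    """Convert bytes/sec to gigabits/sec (8 bits per byte, 10^9 bits per Gbit)."""
    return byte_rate * 8 / 10**9


inbound = bytes_per_sec(175, 3 * MB)          # peak uploads x average photo size
outbound = bytes_per_sec(87_000, 100 * KB)    # peak reads x thumbnail size

print(f"inbound:  {inbound / MB:.0f} MB/s  ({to_gbit_per_sec(inbound):.1f} Gbit/s)")
print(f"outbound: {outbound / 10**9:.1f} GB/s ({to_gbit_per_sec(outbound):.1f} Gbit/s)")
```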
Step 2: Key Observations That Shape the Architecture
The estimates reveal four critical insights:
- Read/write ratio is ~500:1 (87,000 reads vs. 175 writes/sec peak). The system is overwhelmingly read-heavy. Optimize the read path first.
- Photo binary storage is enormous (~27 PB over 5 years). A relational database cannot hold binary blobs at this scale efficiently. A dedicated object store (e.g., S3-compatible) is mandatory.
- 8.7 GB/s outbound bandwidth cannot come from a single origin server. A global CDN must absorb photo delivery traffic.
- Feed generation at 28,900 reads/sec on average (about 87,000 at peak) with a 200 ms budget means you cannot afford to join across all follower relationships at query time. Pre-computing feeds (fan-out on write) or caching them aggressively is required.
Step 3: Identify the Services
Group responsibilities into bounded services, each with a single job:
- Upload Service — accepts photo binary from the client, validates, stores to object store, writes metadata to the DB, and publishes an event to the feed pipeline.
- Feed Service — returns a user's home feed, sourced from a pre-computed cache where possible.
- Media Service — generates thumbnails at multiple resolutions and serves signed CDN URLs.
- Social Graph Service — manages follow/unfollow relationships; queried by the feed pipeline to know whose photos to include.
- Interaction Service — handles likes and comments; these are high-volume but tolerate slightly higher latency than the feed.
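To make the Upload Service's four responsibilities concrete, here is a minimal sketch of its request path. The object store, metadata DB, and event queue are in-memory stand-ins, and every class, field, and key name is hypothetical. A real service would call an object-store SDK, a database driver, and a message broker instead.

```python
import time
import uuid
from dataclasses import dataclass, field

MAX_PHOTO_BYTES = 10 * 1024 * 1024   # 10 MB limit from the problem statement
JPEG_MAGIC = b"\xff\xd8\xff"         # leading bytes of a JPEG file
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"     # 8-byte PNG signature


@dataclass
class UploadService:
    # Hypothetical in-memory stand-ins for the real backing systems.
    object_store: dict = field(default_factory=dict)
    metadata_db: dict = field(default_factory=dict)
    feed_events: list = field(default_factory=list)

    def upload(self, owner_id: int, data: bytes) -> str:
        # 1. Validate size and file type before touching any storage.
        if len(data) > MAX_PHOTO_BYTES:
            raise ValueError("photo exceeds the 10 MB limit")
        if not (data.startswith(JPEG_MAGIC) or data.startswith(PNG_MAGIC)):
            raise ValueError("only JPEG and PNG are accepted")

        photo_id = uuid.uuid4().hex

        # 2. Store the binary in the object store.
        key = f"photos/{owner_id}/{photo_id}"
        self.object_store[key] = data

        # 3. Write the metadata row.
        self.metadata_db[photo_id] = {
            "owner_id": owner_id,
            "object_key": key,
            "created_at": time.time(),
            "likes": 0,
        }

        # 4. Publish an event for the feed pipeline to consume.
        self.feed_events.append(
            {"type": "photo_uploaded", "photo_id": photo_id, "owner_id": owner_id}
        )
        return photo_id
```

Writing the binary before the metadata means a failed upload leaves at worst an orphaned object rather than a metadata row that points at nothing. That ordering is a design choice of this sketch, not something the lesson prescribes.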
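Step 4: The High-Level Architecture Diagram
The diagram places clients behind a CDN (for photo delivery) and an API Gateway with a load balancer (for API calls), which routes to the five services from Step 3. The Upload Service writes photo binaries to the object store and metadata to the sharded Metadata DB, then publishes an event that a Feed Worker consumes to fill the per-user Redis Feed Cache read by the Feed Service. The CDN pulls from the object store only on a cache miss.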
Step 5: Verify the Design Against Each Requirement
A design that cannot be traced back to specific requirements is incomplete. Walk through each non-functional requirement:
500:1 read/write ratio — handled
The Feed Cache (Redis, one sorted-set per user) absorbs the majority of read traffic. The CDN absorbs photo-binary delivery. Only cache misses and cache-warming events touch the database or object store.
Feed latency 200 ms p95 — handled
A Redis sorted-set lookup is typically sub-millisecond on the server. Returning a list of 20 photo URLs from a pre-built feed should take under 5 ms in the service, which leaves nearly all of the 200 ms budget for the mobile network round trip and client rendering. The CDN ensures photos load from the geographically nearest edge node rather than the origin.
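The per-user sorted set can be illustrated without a Redis server. The sketch below is an in-memory stand-in that keeps each feed ordered by timestamp, which is the behaviour a sorted set provides when the score is the upload time. The class name, method names, and the 500-entry cap are assumptions for illustration. In Redis itself the corresponding operations would be ZADD to insert and a reverse-order ZRANGE (or the older ZREVRANGE) to read.

```python
import bisect
from collections import defaultdict

FEED_MAX_ENTRIES = 500  # hypothetical cap so a feed cannot grow without bound


class FeedCache:
    """In-memory stand-in for one sorted set per user, scored by timestamp."""

    def __init__(self) -> None:
        # user_id -> list of (timestamp, photo_id), kept in ascending order
        self._feeds = defaultdict(list)

    def add(self, user_id: str, photo_id: str, timestamp: float) -> None:
        feed = self._feeds[user_id]
        bisect.insort(feed, (timestamp, photo_id))
        if len(feed) > FEED_MAX_ENTRIES:
            # Trim the oldest entries, which sit at the front of the list.
            del feed[: len(feed) - FEED_MAX_ENTRIES]

    def latest(self, user_id: str, count: int = 20) -> list:
        """Return up to `count` photo IDs, newest first."""
        if count <= 0:
            return []
        feed = self._feeds.get(user_id, [])
        return [photo_id for _, photo_id in reversed(feed[-count:])]


cache = FeedCache()
cache.add("u1", "photo-a", timestamp=1.0)
cache.add("u1", "photo-b", timestamp=3.0)
cache.add("u1", "photo-c", timestamp=2.0)
print(cache.latest("u1"))
```

Reading a page is a slice of an already-sorted structure, so the cost of serving a feed does not depend on how many accounts the user follows.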
27 PB photo storage — handled
Object stores (AWS S3, Google Cloud Storage, Azure Blob) are designed for exabyte-scale binary storage with built-in geo-replication. This is precisely the workload they are engineered for; a relational database would be the wrong tool here.
8.7 GB/s outbound bandwidth — handled
No single origin can serve 8.7 GB/s economically or with globally low latency. The CDN distributes that load across hundreds of edge PoPs worldwide. The origin object store only fields cache-miss requests from the CDN — a small fraction of total traffic.
99.9 % availability — partially addressed
Every service tier should run at least two instances behind the load balancer. The Metadata DB needs a primary-replica setup with automated failover. Redis should run in Cluster or Sentinel mode. The CDN provides inherent redundancy. Write these requirements as annotations on the diagram.
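The arithmetic behind the availability target and the "at least two instances" rule can be sketched as follows. The per-instance figure of 99 % is a hypothetical input, and the formulas assume that failures are independent, which is optimistic: correlated failures such as a bad deploy or a zone outage break that assumption.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760


def downtime_hours_per_year(availability: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return (1 - availability) * HOURS_PER_YEAR


def parallel(availability: float, instances: int) -> float:
    """Availability of N redundant instances (up if at least one is up)."""
    return 1 - (1 - availability) ** instances


def serial(*availabilities: float) -> float:
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


print(f"99.9% target allows {downtime_hours_per_year(0.999):.2f} h of downtime per year")
print(f"two 99% instances in parallel: {parallel(0.99, 2):.4%}")
print(f"two 99.95% tiers in series:    {serial(0.9995, 0.9995):.4%}")
```

Redundancy within a tier raises availability, while each additional tier in the request path lowers it, so every tier must individually exceed the end-to-end target.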
Step 6: Identify and Articulate the Key Trade-offs
A strong design is not just a working design — it is one where the designer can name every major trade-off and explain why they made that choice.
Fan-out on write vs. fan-out on read for feed generation
The diagram chooses fan-out on write: when a user uploads a photo, the Feed Worker immediately pushes a reference into every follower's Redis feed cache. This makes reads instant but writes expensive — a celebrity with 10 million followers triggers 10 million cache writes per upload. For an early-growth-phase product where most accounts have <1,000 followers, fan-out on write is the right default. At celebrity scale, a hybrid approach (fan-out on read for accounts above a threshold) is the standard evolution. Name this trade-off explicitly in the interview.
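The hybrid approach can be sketched in a few dozen lines. This is an illustrative in-memory model, not a production design: the class name, the threshold value, and the data layout are all assumptions. Authors below the threshold fan out on write, while authors above it are merged in at read time.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # hypothetical follower count that switches strategy


class HybridFeed:
    def __init__(self, followers: dict, threshold: int = CELEBRITY_THRESHOLD):
        self.followers = followers            # author_id -> set of follower ids
        self.threshold = threshold
        self.following = defaultdict(set)     # follower_id -> set of author ids
        for author, fans in followers.items():
            for fan in fans:
                self.following[fan].add(author)
        self.feed_cache = defaultdict(list)       # user_id -> [(ts, photo_id)]
        self.posts_by_author = defaultdict(list)  # author_id -> [(ts, photo_id)]

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers.get(author, ())) > self.threshold

    def publish(self, author: str, photo_id: str, ts: float) -> int:
        """Record a post and return the number of cache writes it caused."""
        self.posts_by_author[author].append((ts, photo_id))
        if self.is_celebrity(author):
            return 0  # fan-out on read: followers pull this post later
        fans = self.followers.get(author, ())
        for fan in fans:  # fan-out on write: push into every follower's feed
            self.feed_cache[fan].append((ts, photo_id))
        return len(fans)

    def read_feed(self, user: str, count: int = 20) -> list:
        entries = list(self.feed_cache.get(user, []))
        for author in self.following.get(user, ()):
            if self.is_celebrity(author):
                entries.extend(self.posts_by_author.get(author, []))
        entries.sort(reverse=True)  # newest first
        return [photo_id for _, photo_id in entries[:count]]


# Tiny example with a threshold of 2 so that "star" counts as a celebrity.
feed = HybridFeed({"alice": {"u1", "u2"}, "star": {"u1", "u2", "u3"}}, threshold=2)
alice_writes = feed.publish("alice", "p1", ts=1.0)
star_writes = feed.publish("star", "p2", ts=2.0)
print(alice_writes, star_writes, feed.read_feed("u1"))
```

The cost moves rather than disappears: a celebrity post causes no cache writes, but every follower's read now does one extra lookup per celebrity they follow.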
Sharded SQL vs. NoSQL for metadata
The diagram shows a sharded SQL store (e.g., MySQL sharded by user_id). An alternative is a wide-column NoSQL store (Cassandra, DynamoDB). SQL offers richer query flexibility; Cassandra offers easier linear write scalability. At 175 uploads/sec, a well-tuned sharded MySQL cluster is entirely adequate. The choice becomes worth revisiting only if write throughput grows to hundreds of thousands per second, which is not today's problem.
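Sharding by user_id needs a deterministic function from user to shard. The sketch below hashes the ID and takes the result modulo the shard count. The shard count of 16 and the function name are hypothetical, and a real deployment would more likely use a lookup table or consistent hashing so that adding shards does not remap most users.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical cluster size


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index in [0, num_shards)."""
    # A cryptographic hash is stable across processes and machines,
    # unlike Python's built-in hash(), which is randomized for strings.
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Count how 10,000 sequential user IDs spread across the shards.
counts = [0] * NUM_SHARDS
for uid in range(10_000):
    counts[shard_for(uid)] += 1
print(counts)
```

Because all of one user's photo metadata lands on a single shard, "list this user's photos" is a single-shard query, while a query across many users has to touch several shards.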
Signed, short-lived CDN URLs vs. plain public URLs
Photo URLs returned in the feed should be CDN URLs. Using signed, short-lived URLs (expiring in 1 hour) limits hotlinking and unauthorized access, but requires the CDN to validate a token on each request. This adds a small latency penalty, which is acceptable given the security benefit. Write this decision as an annotation on the CDN box.
Step 7: Closing the Loop — What to Say After Presenting the Diagram
After presenting any first diagram, immediately flag what is not yet covered and what the next drill-down would be:
- "I have not yet addressed abuse detection and rate limiting — those would sit in the API Gateway layer."
- "The social graph service (follow relationships) is implied but not detailed. With 50 M daily active users following an average of 300 accounts each, that is at least 15 billion edges, so a graph database or an adjacency-list table in Cassandra is worth discussing."
- "Notification delivery (push notifications on new likes/follows) requires an async queue and a push-notification gateway — not shown here."
- "Multi-region active-active deployment for the 99.9 % target across geographies is the next tier of design."
Flagging open items proactively demonstrates that you understand the full scope of the problem and are choosing what to defer, not unaware of it.
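The edge count quoted for the social graph in the list above is easy to verify, and a rough size estimate helps decide how seriously to take it. The 16 bytes per edge is an assumption (two 8-byte user IDs, with no indexes, timestamps, or replication), so the result is a lower bound on raw size.

```python
USERS = 50_000_000
AVG_FOLLOWS = 300
BYTES_PER_EDGE = 16  # assumption: two 8-byte user IDs and nothing else

edges = USERS * AVG_FOLLOWS
raw_gb = edges * BYTES_PER_EDGE / 10**9

print(f"edges: {edges:,}")
print(f"raw edge storage: {raw_gb:,.0f} GB")
```

Fan-out on write needs the reverse lookup ("who follows this author?") as well as the forward one, so storing each edge in both directions would roughly double this figure.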
Practice Prompts
Apply the same seven-step process to the following systems on your own. Time yourself to 20 minutes per problem:
- A real-time collaborative document editor (like Google Docs) — focus on conflict resolution and operational transformation.
- A ride-sharing dispatch system — focus on geospatial indexing and driver-location updates.
- A global e-commerce checkout service — focus on inventory consistency and payment idempotency.