How the Internet Works (for System Design)
Before you can design a distributed system that serves millions of users, you need a firm mental model of how data actually travels across the internet. This lesson strips away the abstraction and explains the concrete mechanics: IP addresses, packets, routing, and the full journey a request makes from a browser in London to a data centre in Virginia.
IP Addresses: The Internet's Addressing Scheme
Every device connected to the internet is identified by an IP address. The two versions in use today are:
- IPv4 — 32-bit, written as four decimal octets: 93.184.216.34. Provides roughly 4.3 billion unique addresses; IANA's central pool of unallocated IPv4 blocks ran out in 2011.
- IPv6 — 128-bit, written in hex groups: 2606:2800:220:1:248:1893:25c8:1946. Provides about 3.4 × 10^38 addresses, effectively unlimited.
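The address-space sizes quoted above follow directly from the bit widths. A minimal sketch using Python's standard ipaddress module and the two example addresses from the text; the helper name address_space is only illustrative.

```python
import ipaddress


def address_space(bits: int) -> int:
    """Number of distinct addresses that fit in `bits` bits."""
    return 2 ** bits


# The two example addresses from the text.
v4 = ipaddress.ip_address("93.184.216.34")
v6 = ipaddress.ip_address("2606:2800:220:1:248:1893:25c8:1946")

for addr in (v4, v6):
    bits = addr.max_prefixlen  # 32 for IPv4, 128 for IPv6
    print(f"IPv{addr.version}: {bits} bits, {address_space(bits):.2e} addresses")
```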
Within a private network (your home, an AWS VPC, an office LAN) devices use private IP ranges such as 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. These ranges are not routable on the public internet, so a NAT (Network Address Translation) gateway translates between private and public IPs at the boundary. This is why every device in a household can share a single public IP.
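To make the private/public boundary concrete, here is a rough sketch that checks whether an address falls in one of the three private ranges, plus a toy translation table. ToyNat is a hypothetical, heavily simplified model (one public IP, one port per outbound flow), not how any particular gateway is implemented, and 203.0.113.7 is a placeholder from a documentation-only range.

```python
import ipaddress
from typing import Dict, Tuple

# The three private ranges named in the text (defined by RFC 1918).
PRIVATE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_private(address: str) -> bool:
    """True if the IPv4 address lies in one of the private ranges above."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


class ToyNat:
    """Hypothetical NAT model: maps each private (ip, port) to a public port."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.table: Dict[Tuple[str, int], int] = {}

    def translate_outbound(self, src_ip: str, src_port: int) -> Tuple[str, int]:
        key = (src_ip, src_port)
        if key not in self.table:  # first packet of a new flow
            self.table[key] = self.next_port
            self.next_port += 1
        return (self.public_ip, self.table[key])


nat = ToyNat("203.0.113.7")
print(is_private("192.168.1.5"), is_private("93.184.216.34"))
print(nat.translate_outbound("192.168.1.5", 51000))
print(nat.translate_outbound("192.168.1.6", 51000))
```

Two devices end up with the same public IP but different public ports, which is how the gateway tells their return traffic apart.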
Packets: How Data Travels
The internet does not send data as one continuous stream. It breaks data into chunks called packets, each no larger than the link's Maximum Transmission Unit (MTU), which is typically 1,500 bytes on Ethernet. Each packet carries:
- Source IP and Destination IP
- Source port and Destination port (e.g., 443 for HTTPS)
- Sequence number (so the receiver can reassemble in order)
- Payload — the actual chunk of data
- Checksum — for error detection
A 1 MB image becomes roughly 700 packets. Each packet may take a different physical route through the network and arrives independently — the destination reassembles them. This is called packet-switched networking, and it is what makes the internet resilient: if one path fails, routers reroute packets through another.
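The packet count and the reassembly step can be sketched in a few lines. This assumes 40 bytes of headers per packet (a 20-byte IPv4 header plus a 20-byte TCP header, no options) and uses the byte offset as the sequence number; the Packet class and function names are illustrative only.

```python
import random
from dataclasses import dataclass
from typing import List

MTU = 1500                        # Ethernet MTU in bytes, from the text
HEADER_BYTES = 40                 # assumption: 20 B IPv4 + 20 B TCP, no options
MAX_PAYLOAD = MTU - HEADER_BYTES  # 1460 bytes of data per packet


@dataclass
class Packet:
    seq: int        # byte offset of this payload within the original data
    payload: bytes


def packetize(data: bytes, max_payload: int = MAX_PAYLOAD) -> List[Packet]:
    """Split data into packets of at most max_payload bytes each."""
    return [
        Packet(seq=offset, payload=data[offset:offset + max_payload])
        for offset in range(0, len(data), max_payload)
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Restore the original byte order using the sequence numbers."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


image = bytes(i % 251 for i in range(1_000_000))  # stand-in for a 1 MB image
packets = packetize(image)
random.shuffle(packets)  # simulate out-of-order arrival
print(len(packets), reassemble(packets) == image)
```

With these assumptions a 1,000,000-byte payload needs 685 packets, in line with the "roughly 700" figure.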
The Journey of a Request: Step by Step
Let us trace a GET https://api.example.com/users/42 request from start to finish. Even a "simple" API call passes through six distinct stages.
Here is what happens at each stage:
- DNS resolution — the browser has a hostname (api.example.com) but needs an IP. It queries a DNS resolver (covered in Lesson 2). The resolver returns 104.21.80.5. This typically adds 20–120 ms on a cold cache.
- TCP connection — the OS opens a TCP connection to port 443, which costs one round trip for the three-way handshake. HTTPS then needs a TLS handshake on top (one more round trip with TLS 1.3, two with TLS 1.2), and each round trip takes roughly 70–100 ms on a trans-Atlantic link.
- Packet routing — the TCP SYN packet leaves the home router. The NAT gateway rewrites the source IP from private (e.g., 192.168.1.5) to the household's public IP. The packet then hops through roughly 10–20 routers, each consulting its routing table to decide the next hop. Total transit time depends on physical distance: light in fibre travels at ~200,000 km/s, so London→Virginia (about 5,900 km in a straight line) costs roughly 30 ms one way from propagation alone, and more over real cable routes.
- Load balancer — the packet arrives at the data centre's edge. A load balancer terminates TLS (decrypts), reads the HTTP request, and forwards it to a healthy app server over the internal network.
- Application server — your code runs, queries a database, and builds a response.
- Response — the answer travels back through the same stack, re-encrypted, re-packetized, and delivered to the browser.
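The stages above can be turned into a rough latency budget. The sketch below is a simplified model, not a measurement: it assumes one round trip for the TCP handshake, a configurable number of TLS round trips, and one round trip for the HTTP request and response. The 5,900 km figure is an assumed straight-line distance between London and northern Virginia (real cable routes are longer), and the 80 ms round-trip time, 50 ms DNS lookup, and 20 ms server time are illustrative values.

```python
FIBRE_KM_PER_S = 200_000  # approximate speed of light in fibre, from the text


def propagation_ms(distance_km: float) -> float:
    """One-way propagation delay over fibre, ignoring routing and queueing."""
    return distance_km / FIBRE_KM_PER_S * 1000


def cold_request_ms(rtt_ms: float, dns_ms: float,
                    tls_round_trips: int, server_ms: float) -> float:
    """Estimated time to first response on a brand-new connection."""
    # 1 RTT for the TCP handshake, N for TLS, 1 for the HTTP request/response.
    round_trips = 1 + tls_round_trips + 1
    return dns_ms + round_trips * rtt_ms + server_ms


# Assumed straight-line distance; actual fibre paths add to this.
print(f"one-way propagation: {propagation_ms(5_900):.1f} ms")

# Assumed values: 80 ms trans-Atlantic RTT, 50 ms DNS, TLS 1.3, 20 ms of app work.
print(f"cold request: {cold_request_ms(80, 50, 1, 20)} ms")
```

Under these assumptions the network round trips dominate the total, which is why connection reuse and edge termination pay off more than shaving a few milliseconds from application code.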
Latency, Bandwidth, and Throughput
Three numbers every system designer must internalise:
- Latency — time for a single packet to travel from source to destination. Dominated by physical distance (speed of light) and queueing at each hop. Typical values: same data centre <1 ms, cross-country 20–40 ms, trans-Atlantic 70–100 ms.
- Bandwidth — maximum data volume per unit time on a link (e.g., 10 Gbps backbone, 100 Mbps home broadband).
- Throughput — the actual data volume delivered per second, which is always ≤ bandwidth and is limited by the slowest link in the chain (Bottleneck Law) and by TCP's congestion control algorithm.
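A small sketch of how these three numbers interact. The first function is the slowest-link rule from the list above. The second is a deliberately simplified stand-in for congestion control: it assumes a sender can have at most one window of data in flight per round trip, so throughput cannot exceed window size divided by round-trip time. The link speeds, the 65,535-byte window, and the 80 ms round-trip time are illustrative assumptions.

```python
from typing import Sequence


def path_bandwidth_mbps(link_bandwidths_mbps: Sequence[float]) -> float:
    """A path can carry no more than its slowest link."""
    return min(link_bandwidths_mbps)


def window_limited_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound when at most one window is in flight per round trip."""
    bits_per_second = window_bytes * 8 / (rtt_ms / 1000)
    return bits_per_second / 1_000_000


def achievable_mbps(link_bandwidths_mbps: Sequence[float],
                    window_bytes: int, rtt_ms: float) -> float:
    """Throughput is capped by whichever limit is tighter."""
    return min(path_bandwidth_mbps(link_bandwidths_mbps),
               window_limited_mbps(window_bytes, rtt_ms))


links = [100, 10_000, 1_000]  # assumed: home broadband, backbone, DC uplink (Mbps)
print(path_bandwidth_mbps(links))
print(round(window_limited_mbps(65_535, 80), 2))
print(round(achievable_mbps(links, 65_535, 80), 2))
```

In this example the 100 Mbps home link is the bandwidth bottleneck, yet a single connection with a small window over a long round trip delivers far less, which is one reason latency and throughput cannot be reasoned about separately.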
Reference latencies, from fastest to slowest:
- L1 cache hit: ~1 ns
- RAM: ~100 ns
- SSD random read: ~100 µs
- Network, same data centre: ~0.5 ms
- HDD seek: ~10 ms
- Cross-country: ~30 ms
- Trans-Atlantic: ~80 ms
These numbers drive every caching and data-placement decision in system design.
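As one illustration of how these figures feed a caching decision, the sketch below computes the average latency a client sees for a given cache hit ratio. It is a simplified model that treats the miss cost as the full time to reach the origin; the 0.5 ms and 80 ms inputs are the same-data-centre and trans-Atlantic figures from the text, and the hit ratios are made up for the example.

```python
def expected_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Average latency for a cache with the given hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1 - hit_ratio) * miss_ms


# Nearby cache (~0.5 ms) in front of a trans-Atlantic origin (~80 ms).
for ratio in (0.0, 0.5, 0.9, 0.99):
    print(f"hit ratio {ratio:.2f}: {expected_latency_ms(ratio, 0.5, 80):.2f} ms")
```

Because the miss path is over a hundred times slower than the hit path, the average stays dominated by misses until the hit ratio is very high.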
The OSI Model — Why It Matters in Practice
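The OSI (Open Systems Interconnection) model organises networking into 7 layers. As a system designer, you mainly work with layers 3–7, most often Layer 3 (network: IP routing), Layer 4 (transport: TCP and UDP), and Layer 7 (application: HTTP, DNS).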
Understanding which layer a component operates at tells you a great deal about its behaviour. A Layer 4 load balancer routes based on IP/port and cannot inspect HTTP headers; a Layer 7 load balancer can route based on URL path, host header, or cookie, which enables canary deployments, A/B routing, and content-aware caching.
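The difference shows up in what each balancer is able to look at. The sketch below is illustrative only: the pool names, the route table, and the x-canary header are hypothetical, and real load balancers use their own selection algorithms. The Layer 4 function sees only the connection's address and port, while the Layer 7 function sees the request path and headers.

```python
import hashlib
from typing import Dict, List, Tuple


def l4_pick(backends: List[str], src_ip: str, src_port: int) -> str:
    """Layer 4: only addresses and ports are visible, so hash the connection."""
    digest = hashlib.sha256(f"{src_ip}:{src_port}".encode()).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


def l7_pick(routes: List[Tuple[str, str]], default_pool: str,
            path: str, headers: Dict[str, str]) -> str:
    """Layer 7: the HTTP request is visible, so route on its content."""
    if headers.get("x-canary") == "1":  # hypothetical opt-in header
        return "canary"
    # Check the longest prefix first so specific routes win over general ones.
    for prefix, pool in sorted(routes, key=lambda r: len(r[0]), reverse=True):
        if path.startswith(prefix):
            return pool
    return default_pool


BACKENDS = ["app-1", "app-2", "app-3"]                       # hypothetical servers
ROUTES = [("/users", "users-pool"), ("/static", "cache-pool")]  # hypothetical pools

print(l4_pick(BACKENDS, "198.51.100.9", 51000))
print(l7_pick(ROUTES, "default-pool", "/users/42", {}))
print(l7_pick(ROUTES, "default-pool", "/users/42", {"x-canary": "1"}))
```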
CIDR, Subnets, and VPC Design
In cloud environments you design your own IP address space using CIDR (Classless Inter-Domain Routing) notation. 10.0.0.0/16 means the first 16 bits are the network prefix, which leaves 16 host bits and gives you 2^16 = 65,536 addresses. A /24 subnet holds 2^8 = 256 addresses, of which 254 are usable because the network and broadcast addresses are reserved.
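The CIDR arithmetic can be checked with Python's standard ipaddress module. A minimal sketch; the describe helper is illustrative, and its "usable" count applies the conventional minus-two rule, which only makes sense for prefixes of /30 or shorter.

```python
import ipaddress


def describe(cidr: str) -> dict:
    """Summarise an IPv4 network given in CIDR notation."""
    net = ipaddress.ip_network(cidr)
    return {
        "prefix_bits": net.prefixlen,
        "total": net.num_addresses,              # 2 ** (32 - prefix_bits)
        "network": str(net.network_address),     # first address, reserved
        "broadcast": str(net.broadcast_address), # last address, reserved
        "usable": net.num_addresses - 2,         # conventional host count
    }


print(describe("10.0.0.0/16"))
print(describe("10.0.1.0/24"))

# How many /24 subnets fit inside the /16?
vpc = ipaddress.ip_network("10.0.0.0/16")
print(sum(1 for _ in vpc.subnets(new_prefix=24)))
```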
A typical production VPC design separates concerns into subnets:
- Public subnets (10.0.1.0/24, 10.0.2.0/24) — load balancers, NAT gateways, bastion hosts. These have routes to an Internet Gateway.
- Private subnets (10.0.10.0/24, 10.0.11.0/24) — application servers, databases. Outbound-only via a NAT gateway; no inbound from the internet.
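A subnet plan like this is easy to sanity-check before creating anything. The sketch below assumes the VPC is the 10.0.0.0/16 block used earlier and verifies that every subnet sits inside it and that no two subnets overlap. The subnet names and the validate_plan helper are hypothetical; this is not a cloud provider API.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List


def validate_plan(vpc_cidr: str, subnets: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the plan is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = []
    for name, net in nets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside {vpc}")
    for (a, net_a), (b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} ({net_a}) overlaps {b} ({net_b})")
    return problems


# Subnet CIDRs from the text; the names are made up for the example.
PLAN = {
    "public-a": "10.0.1.0/24",
    "public-b": "10.0.2.0/24",
    "private-a": "10.0.10.0/24",
    "private-b": "10.0.11.0/24",
}

print(validate_plan("10.0.0.0/16", PLAN))
print(validate_plan("10.0.0.0/16", dict(PLAN, extra="10.0.1.128/25")))
```

The second call reports an overlap, because 10.0.1.128/25 is the upper half of the first public subnet.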
Key Takeaways
- A request to a remote server crosses 10–20 routing hops and involves DNS, TCP handshake, TLS negotiation, and application-layer processing — every component adds latency.
- Data travels as packets that may take different paths; TCP reassembles them in order.
- Physical distance imposes an irreducible speed-of-light floor on latency — this is why CDNs, edge caches, and multi-region deployments exist.
- Operating at the right OSI layer lets you build smarter infrastructure: L7 load balancers enable routing, auth offloading, and caching that L4 balancers cannot.
- In cloud VPCs, keep backend services in private subnets and expose only the edge (load balancers, CDN) publicly.