How the Internet Works (for System Design)

Before you can design a distributed system that serves millions of users, you need a firm mental model of how data actually travels across the internet. This lesson strips away the abstraction and explains the concrete mechanics: IP addresses, packets, routing, and the full journey a request makes from a browser in London to a data centre in Virginia.

IP Addresses: The Internet's Addressing Scheme

Every device connected to the internet is identified by an IP address. The two versions in use today are:

  • IPv4 — 32-bit, written as four decimal octets: 93.184.216.34. Provides roughly 4.3 billion unique addresses; the central IANA pool of unallocated blocks ran out in 2011.
  • IPv6 — 128-bit, written in hex groups: 2606:2800:220:1:248:1893:25c8:1946. Provides about 3.4 × 10^38 addresses, effectively unlimited.

Within a private network (your home, an AWS VPC, an office LAN) devices use private IP ranges such as 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. A NAT (Network Address Translation) gateway translates between private and public IPs at the boundary, which is why every device in a household can share a single public IP.
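As a small illustration of the addressing scheme above, the sketch below uses Python's standard ipaddress module to report the IP version of an address and whether it falls in a private range. The helper name describe and the constant names are illustrative choices, not part of the lesson.

```python
import ipaddress

# Sizes of the two address spaces described above.
IPV4_SPACE = 2 ** 32    # 4,294,967,296, roughly 4.3 billion
IPV6_SPACE = 2 ** 128   # roughly 3.4e38


def describe(address: str) -> str:
    """Summarise an address: its IP version and whether it is private."""
    ip = ipaddress.ip_address(address)
    scope = "private" if ip.is_private else "public"
    return f"{address}: IPv{ip.version}, {scope}"


if __name__ == "__main__":
    samples = [
        "93.184.216.34",                        # public IPv4 from the text
        "10.1.2.3",                             # inside 10.0.0.0/8
        "172.16.5.4",                           # inside 172.16.0.0/12
        "192.168.1.5",                          # inside 192.168.0.0/16
        "2606:2800:220:1:248:1893:25c8:1946",   # public IPv6 from the text
    ]
    for sample in samples:
        print(describe(sample))
```

A NAT gateway is, in effect, the component that rewrites the "private" addresses in this list to a single "public" one as traffic leaves the network.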

Why this matters for system design: when you deploy services inside a VPC (e.g., AWS, GCP, Azure), your application servers never need public IPs. Only the load balancer or API gateway exposes a public address. This isolates your backend from direct internet exposure.

Packets: How Data Travels

The internet does not send data as a continuous stream. It breaks data into bounded-size chunks called packets (typically up to 1,500 bytes on Ethernet, the Maximum Transmission Unit, or MTU). Across its IP and TCP headers, each packet carries:

  • Source IP and Destination IP
  • Source port and Destination port (e.g., 443 for HTTPS)
  • Sequence number (so the receiver can reassemble in order)
  • Payload — the actual chunk of data
  • Checksum — for error detection

A 1 MB image becomes roughly 700 packets. Each packet may take a different physical route through the network and arrives independently — the destination reassembles them. This is called packet-switched networking, and it is what makes the internet resilient: if one path fails, routers reroute packets through another.
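The arithmetic and the reassembly step can be sketched in a few lines of Python. This is a simplified model, not a real TCP implementation: it assumes 20-byte IPv4 and TCP headers with no options, and it uses the byte offset as the sequence number. The function names are illustrative.

```python
import math
import random

MTU = 1500        # bytes, typical Ethernet MTU
IP_HEADER = 20    # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20   # bytes, TCP header without options (assumption)
MSS = MTU - IP_HEADER - TCP_HEADER   # 1460 payload bytes per packet


def packets_needed(payload_bytes: int, mss: int = MSS) -> int:
    """Number of packets required to carry payload_bytes of data."""
    return math.ceil(payload_bytes / mss)


def segment(data: bytes, mss: int = MSS) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, chunk) pairs, one per packet."""
    return [(offset, data[offset:offset + mss])
            for offset in range(0, len(data), mss)]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Restore the original bytes regardless of arrival order."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(chunk for _, chunk in ordered)


if __name__ == "__main__":
    image = bytes(1_000_000)           # stand-in for a 1 MB image
    packets = segment(image)
    random.shuffle(packets)            # packets may arrive out of order
    print(packets_needed(len(image)))  # 685
    print(reassemble(packets) == image)
```

With these assumptions a 1,000,000-byte payload needs 685 packets, which is where the "roughly 700" figure comes from.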

The Journey of a Request: Step by Step

Let us trace a GET https://api.example.com/users/42 request from start to finish. Even a "simple" API call passes through at least six distinct stages, and many more individual router hops.

[Figure: Journey of an HTTP request from client to server. Browser (client) → Home Router (NAT) → ISP Router (backbone) → Internet (many routers) → Load Balancer (TLS termination) → App Server (your code). Labels: ① DNS lookup, priv→pub IP; ② routing; ③ TLS; ④ forward; response.]
A request travels: Browser → NAT Router → ISP → Internet backbone → Load Balancer → App Server, then the response returns the same way.

Here is what happens at each stage:

  1. DNS resolution — the browser has a hostname (api.example.com) but needs an IP. It queries a DNS resolver (covered in Lesson 2). The resolver returns 104.21.80.5. This typically adds 20–120 ms on a cold cache.
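  2. TCP connection — the OS opens a TCP connection to port 443 with a three-way handshake, which costs one round trip. HTTPS then adds a TLS handshake: one round trip with TLS 1.3, two with TLS 1.2. At a trans-Atlantic round-trip time of 70–100 ms, connection setup alone costs roughly 140–300 ms before the first request byte is sent.
  3. Packet routing — every packet, starting with the TCP SYN, leaves through the home router. The NAT gateway rewrites the source IP from private (e.g., 192.168.1.5) to the household's public IP. The packet then hops through roughly 10–20 routers, each consulting its routing table to decide the next hop. Total transit time depends on physical distance: light in fibre travels at ~200,000 km/s, so the roughly 5,900 km great-circle distance from London to Virginia costs about 30 ms one way (about 60 ms round trip) from propagation alone, and real cable routes are longer.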
  4. Load balancer — the packet arrives at the data centre's edge. A load balancer terminates TLS (decrypts), reads the HTTP request, and forwards it to a healthy app server over the internal network.
  5. Application server — your code runs, queries a database, and builds a response.
  6. Response — the answer travels back through the same stack, re-encrypted, re-packetized, and delivered to the browser.
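The stages above can be turned into a rough latency budget. The sketch below is a back-of-the-envelope model, not a measurement: the 5,900 km distance is an assumed great-circle figure for London to Virginia (real cable routes are longer, so real delays are higher), and the DNS, round-trip, and server times passed in are illustrative inputs. The function names are hypothetical.

```python
FIBRE_SPEED_KM_S = 200_000   # approximate speed of light in fibre (from the text)


def propagation_delay_ms(distance_km: float,
                         speed_km_s: float = FIBRE_SPEED_KM_S) -> float:
    """One-way propagation delay in milliseconds over the given distance."""
    return distance_km / speed_km_s * 1000


def time_to_first_byte_ms(rtt_ms: float, dns_ms: float,
                          tls_round_trips: int, server_ms: float) -> float:
    """Rough time from a cold start until the first response byte arrives."""
    tcp_handshake = rtt_ms                    # SYN and SYN-ACK before data can flow
    tls_handshake = tls_round_trips * rtt_ms  # 1 for TLS 1.3, 2 for TLS 1.2
    request_response = rtt_ms                 # request out, first byte back
    return dns_ms + tcp_handshake + tls_handshake + request_response + server_ms


if __name__ == "__main__":
    one_way = propagation_delay_ms(5_900)     # assumed great-circle distance
    print(f"propagation floor: {one_way:.1f} ms one way")
    total = time_to_first_byte_ms(rtt_ms=80, dns_ms=50,
                                  tls_round_trips=1, server_ms=20)
    print(f"cold request: about {total:.0f} ms to first byte")
```

The model makes the design lesson visible: most of the budget is round trips, so reusing connections and terminating TLS closer to the user save far more than shaving a few milliseconds off application code.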

Latency, Bandwidth, and Throughput

Three numbers every system designer must internalise:

  • Latency — time for a single packet to travel from source to destination, usually quoted as round-trip time (RTT). Dominated by physical distance (speed of light) and queueing at each hop. Typical round-trip values: same data centre <1 ms, cross-country 20–40 ms, trans-Atlantic 70–100 ms.
  • Bandwidth — maximum data volume per unit time on a link (e.g., 10 Gbps backbone, 100 Mbps home broadband).
  • Throughput — the actual data volume delivered per second, which is always ≤ bandwidth and is limited by the slowest link in the chain (Bottleneck Law) and by TCP's congestion control algorithm.

Rule of thumb: latency numbers you must know, from fastest to slowest.

  • L1 cache hit: ~1 ns
  • RAM: ~100 ns
  • SSD random read: ~100 µs
  • Network, same DC: ~0.5 ms
  • HDD seek: ~10 ms
  • Cross-country: ~30 ms
  • Trans-Atlantic: ~80 ms

These numbers drive every caching and data-placement decision in system design.
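To see how latency and bandwidth combine into throughput, the sketch below applies two simplified limits: the slowest link on the path, and the fact that a TCP sender can have at most one window of data in flight per round trip. This is an approximation under stated assumptions (a fixed window, no packet loss), not a model of any particular congestion control algorithm, and the function names are illustrative.

```python
def bottleneck_bps(link_bandwidths_bps: list[float]) -> float:
    """A path can carry no more than its slowest link."""
    return min(link_bandwidths_bps)


def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """At most one window of data can be in flight per round trip."""
    return window_bytes * 8 / rtt_s


def estimated_throughput_bps(link_bandwidths_bps: list[float],
                             window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput: the tighter of the two limits."""
    return min(bottleneck_bps(link_bandwidths_bps),
               window_limited_bps(window_bytes, rtt_s))


if __name__ == "__main__":
    path = [10e9, 1e9, 100e6]   # backbone, metro link, home broadband
    window = 65_535             # bytes; classic TCP window without scaling
    for label, rtt in [("same DC", 0.0005), ("trans-Atlantic", 0.080)]:
        mbps = estimated_throughput_bps(path, window, rtt) / 1e6
        print(f"{label}: about {mbps:.1f} Mbps")
```

On the same path, the short round trip is capped by the 100 Mbps link while the trans-Atlantic transfer is capped near 6.6 Mbps by the window, which is why distance can hurt throughput even when bandwidth is plentiful.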

The OSI Model — Why It Matters in Practice

The OSI (Open Systems Interconnection) model organises networking into 7 layers. As a system designer, you mainly work with layers 3–7:

Figure: OSI model layers relevant to system design, with the layers system designers focus on highlighted.

  • L7 (Application): HTTP, HTTPS, WebSockets, gRPC. Your API logic lives here.
  • L6 (Presentation): TLS/SSL encryption, JSON/gzip encoding. Serialisation and compression.
  • L5 (Session): TCP sessions, auth sessions. Connection lifecycle.
  • L4 (Transport): TCP, UDP; ports, reliability, flow control. Load balancers often work here.
  • L3 (Network): IP addressing and routing. Routers, VPCs, subnets.
  • L2 / L1 (Link and Physical): Ethernet, Wi-Fi, fibre optics. Usually managed by the cloud provider.

The OSI layers most relevant to system design are L3 (IP routing), L4 (TCP/UDP), L6 (TLS), and L7 (HTTP/gRPC).

Understanding which layer a component operates at tells you a great deal about its behaviour. A Layer 4 load balancer routes based on IP/port and cannot inspect HTTP headers; a Layer 7 load balancer can route based on URL path, host header, or cookie, which enables canary deployments, A/B routing, and content-aware caching.
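The difference can be sketched as two routing functions that differ only in what they are allowed to look at. This is a toy model: the pool names, the /static/ path rule, and the canary=1 cookie are hypothetical, and real load balancers use more sophisticated selection and health checking.

```python
import hashlib

# Hypothetical backend pools, for illustration only.
API_POOL = ["api-1", "api-2"]
STATIC_POOL = ["static-1"]
CANARY_POOL = ["api-canary"]


def l4_pick(src_ip: str, src_port: int, backends: list[str]) -> str:
    """Layer 4: only addresses and ports are visible, so hash them."""
    digest = hashlib.sha256(f"{src_ip}:{src_port}".encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]


def l7_pick_pool(path: str, headers: dict[str, str]) -> list[str]:
    """Layer 7: the HTTP request is visible, so route on its content."""
    if "canary=1" in headers.get("Cookie", ""):
        return CANARY_POOL      # canary deployment, selected by cookie
    if path.startswith("/static/"):
        return STATIC_POOL      # path-based routing
    return API_POOL


if __name__ == "__main__":
    print(l4_pick("203.0.113.7", 51514, API_POOL))
    print(l7_pick_pool("/users/42", {}))
    print(l7_pick_pool("/users/42", {"Cookie": "canary=1"}))
    print(l7_pick_pool("/static/logo.png", {}))
```

Note that l4_pick never receives the path or headers at all: with TLS terminated elsewhere, a Layer 4 balancer cannot read them.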

CIDR, Subnets, and VPC Design

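In cloud environments you design your own IP address space using CIDR (Classless Inter-Domain Routing) notation. 10.0.0.0/16 means the first 16 bits are the network prefix, leaving 32 − 16 = 16 host bits and therefore 2^16 = 65,536 addresses. A /24 subnet holds 2^8 = 256 addresses, of which 254 are usable in classic networking because the network and broadcast addresses are reserved. Cloud providers may reserve more; AWS, for example, reserves five addresses in every subnet.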
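The CIDR arithmetic can be checked with Python's standard ipaddress module. The sketch below uses the 10.0.0.0/16 block from the text; the 10.0.1.0/24 subnet and the variable names are illustrative.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
subnet = ipaddress.ip_network("10.0.1.0/24")

# A /16 leaves 32 - 16 = 16 host bits, so 2**16 addresses.
print(vpc.num_addresses)           # 65536

# A /24 leaves 8 host bits, so 2**8 addresses.
print(subnet.num_addresses)        # 256

# The first and last addresses are the network and broadcast addresses.
print(subnet.network_address)      # 10.0.1.0
print(subnet.broadcast_address)    # 10.0.1.255

# hosts() excludes those two, leaving the classic usable count.
usable_hosts = list(subnet.hosts())
print(len(usable_hosts))           # 254

# A /16 can be carved into 2**(24 - 16) = 256 subnets of size /24.
slash_24_subnets = list(vpc.subnets(new_prefix=24))
print(len(slash_24_subnets))       # 256
```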

A typical production VPC design separates concerns into subnets:

  • Public subnets (10.0.1.0/24, 10.0.2.0/24) — load balancers, NAT gateways, bastion hosts. These have routes to an Internet Gateway.
  • Private subnets (10.0.10.0/24, 10.0.11.0/24) — application servers, databases. Outbound-only via a NAT gateway; no inbound from the internet.
Common mistake: placing database servers in public subnets "for convenience" during development, and forgetting to move them before production. A database with a public IP and a weak password is a critical breach waiting to happen. Always put datastores in private subnets with security-group rules that only allow traffic from your application servers.
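A subnet plan like the one above can be sanity-checked before anything is deployed. The sketch below is a minimal validator using the standard ipaddress module and the example CIDR blocks from this section; the function names are hypothetical, and it checks only address layout, not route tables or security-group rules.

```python
import ipaddress
from itertools import combinations

VPC = ipaddress.ip_network("10.0.0.0/16")
PUBLIC = [ipaddress.ip_network(c) for c in ("10.0.1.0/24", "10.0.2.0/24")]
PRIVATE = [ipaddress.ip_network(c) for c in ("10.0.10.0/24", "10.0.11.0/24")]


def validate_plan(vpc, subnets) -> list[str]:
    """Return a list of problems: subnets outside the VPC or overlapping."""
    problems = []
    for candidate in subnets:
        if not candidate.subnet_of(vpc):
            problems.append(f"{candidate} is outside {vpc}")
    for first, second in combinations(subnets, 2):
        if first.overlaps(second):
            problems.append(f"{first} overlaps {second}")
    return problems


def in_private_subnet(address: str) -> bool:
    """True if the address belongs to one of the private subnets."""
    ip = ipaddress.ip_address(address)
    return any(ip in private for private in PRIVATE)


if __name__ == "__main__":
    print(validate_plan(VPC, PUBLIC + PRIVATE))   # [] means no problems found
    print(in_private_subnet("10.0.10.25"))        # database placed correctly
    print(in_private_subnet("10.0.1.25"))         # address in a public subnet
```

A check like in_private_subnet could run in a deployment pipeline to flag a datastore that has been given an address in a public subnet.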

Key Takeaways

  • A request to a remote server crosses 10–20 routing hops and involves DNS, TCP handshake, TLS negotiation, and application-layer processing — every component adds latency.
  • Data travels as packets that may take different paths; TCP reassembles them in order.
  • Physical distance imposes an irreducible speed-of-light floor on latency — this is why CDNs, edge caches, and multi-region deployments exist.
  • Operating at the right OSI layer lets you build smarter infrastructure: L7 load balancers enable routing, auth offloading, and caching that L4 balancers cannot.
  • In cloud VPCs, keep backend services in private subnets and expose only the edge (load balancers, CDN) publicly.