Load Balancing with Nginx
When a single application server can no longer keep up with traffic — or when you need zero-downtime deploys and fault tolerance — you add more servers and put a load balancer in front of them. Nginx does this elegantly in the same process that terminates TLS and serves your static assets. Understanding the upstream algorithms, health-check mechanics, and session-affinity options is the difference between a load balancer that works in staging and one that holds up under Black-Friday traffic.
The upstream Block
Everything starts with an upstream block. You name the pool, list the servers, choose an algorithm, and then proxy_pass to that pool name from any location block.
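A minimal sketch of such a pool and the location that proxies to it. The pool name `app_backend`, the backend addresses, and `example.com` are placeholders chosen for illustration, not values from a real deployment.

```nginx
# Lives in the http context (for example a file included from nginx.conf).
upstream app_backend {                  # hypothetical pool name
    # No algorithm directive here, so round-robin (the default) applies.
    server 10.0.0.11:8080;              # hypothetical backend addresses
    server 10.0.0.12:8080;
    server 10.0.0.13:8080;
}

server {
    listen 80;
    server_name example.com;            # placeholder hostname

    location / {
        # Refer to the pool by the name given in the upstream block.
        proxy_pass http://app_backend;

        # Speak HTTP/1.1 to the backends and clear the Connection header.
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Pass the original host and client address through to the application.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Running `nginx -t` before reloading is a cheap way to catch syntax errors in a block like this.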
Set proxy_http_version 1.1 and clear Connection. HTTP/1.0 (the default) closes the TCP connection after every request. HTTP/1.1 with an empty Connection header lets Nginx keep connections to the upstream alive (in combination with the keepalive directive in the upstream block), reusing sockets for subsequent requests. At scale this is a significant throughput gain.
Upstream Balancing Algorithms
Open-source Nginx ships with the four algorithms below, plus random (optionally with the two parameter) since version 1.15.1. Nginx Plus adds more, such as least_time. Choose based on your workload profile:
- round-robin (default): Each new request goes to the next server in the list, cycling through them equally. Works well when requests are homogeneous in cost and servers are identical. Add the weight parameter to skew distribution toward more powerful nodes.
- least_conn: Each new request goes to the server with the fewest active connections. The right choice for long-lived connections (WebSockets, streaming, slow database queries), where round-robin would pile up on one server while others sit idle.
- ip_hash: Hashes the client IP (first three octets for IPv4) and always routes that client to the same upstream. A basic form of sticky sessions that needs no extra cookies. Breaks when clients are behind a shared NAT or a CDN (all traffic hashes to the same origin IP).
- hash $variable [consistent]: Hashes on any Nginx variable: URI, cookie value, request header. With consistent it uses a consistent-hash ring (Ketama), so adding or removing a server only remaps a fraction of keys, which is useful when proxying to upstream caches.
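The four algorithms above might be configured roughly as follows. Each pool is independent; the pool names and addresses are hypothetical and only there to make the sketch concrete.

```nginx
# Weighted round-robin: the first server receives about three of every four requests.
upstream api_weighted {
    server 10.0.0.11:8080 weight=3;
    server 10.0.0.12:8080;              # weight defaults to 1
}

# least_conn: suited to long-lived connections such as WebSockets.
upstream ws_backend {
    least_conn;
    server 10.0.0.21:9000;
    server 10.0.0.22:9000;
}

# ip_hash: the same client address keeps landing on the same server.
upstream legacy_sticky {
    ip_hash;
    server 10.0.0.31:8080;
    server 10.0.0.32:8080;
}

# Consistent hash on the request URI: a given URL maps to the same cache node,
# and adding or removing a node remaps only part of the key space.
upstream cache_tier {
    hash $request_uri consistent;
    server 10.0.0.41:6081;
    server 10.0.0.42:6081;
    server 10.0.0.43:6081;
}
```

The algorithm directive is placed at the top of each pool here, before the server lines, purely for readability.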
Passive Health Checks
Nginx open-source performs passive health checks: it watches live traffic and marks a server as unhealthy only after it fails real requests. The key parameters live in the upstream server directive:
- max_fails=3: how many failed attempts, counted within the fail_timeout window, mark the server as down (default 1).
- fail_timeout=30s: how long Nginx stops sending requests to the server once the threshold is reached, and also the window in which failed attempts are counted toward max_fails (default 10s).
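A sketch of these parameters in context, paired with proxy_next_upstream so that a failed attempt is retried on another server. The addresses, timeouts, and retry limits are illustrative assumptions, not recommendations.

```nginx
upstream app_backend {                  # hypothetical pool name
    # Three failed attempts within 30s take the server out of rotation for 30s.
    server 10.0.0.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:8080 max_fails=3 fail_timeout=30s;

    # Used only when the primary servers are unavailable.
    server 10.0.0.13:8080 backup;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;

        # Fail fast on a dead backend instead of waiting for the default timeout.
        proxy_connect_timeout 2s;

        # Conditions that trigger a retry on the next server. The http_5xx
        # entries listed here also count as failed attempts toward max_fails.
        proxy_next_upstream error timeout http_502 http_503 http_504;

        # Bound the retries so one bad request cannot walk the whole pool.
        proxy_next_upstream_tries 2;
        proxy_next_upstream_timeout 5s;
    }
}
```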
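Configure proxy_next_upstream. By default Nginx retries only on connection errors and timeouts, so a 502 or 503 response produced by a failing backend pod is returned directly to the user. With conditions such as http_502 listed, Nginx transparently retries on another server. Be careful with mutating requests: retrying a POST that already committed to the database will duplicate the write. For that reason Nginx does not retry non-idempotent requests (POST, LOCK, PATCH) that have already been sent to a backend unless you add the non_idempotent parameter.
Active Health Checks (Nginx Plus / OpenResty / Upstream Check Module)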
Passive checks only detect failures on live traffic. Active checks probe upstreams on a background interval, so a server is removed from rotation before a user hits it. In open-source Nginx you can achieve this with the third-party ngx_http_upstream_check_module, which must be compiled in, or by fronting Nginx with a tool like Consul + consul-template that rewrites the upstream block. Nginx Plus supports active checks natively via the health_check directive.
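A sketch of the Nginx Plus variant. It only works on Nginx Plus; the pool name, the match block name `app_ok`, and the `/healthz` endpoint are assumptions about the application, not something Nginx provides.

```nginx
upstream app_backend {                  # hypothetical pool name
    # Active health checks need a shared memory zone so all workers share state.
    zone app_backend 64k;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

# Defines what a healthy response looks like (http context).
match app_ok {                          # hypothetical name
    status 200;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;

        # Probe /healthz every 5s; 3 failures mark a server unhealthy,
        # 2 consecutive passes bring it back into rotation.
        health_check interval=5s fails=3 passes=2 uri=/healthz match=app_ok;
    }
}
```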
Sticky Sessions
Stateless applications — where any server can handle any request — are always preferred in cloud-native design. But legacy apps often store session data in process memory, making it mandatory that a given client always hits the same server. The diagram below shows both models:
Nginx Plus provides the sticky cookie directive. In open-source Nginx, use ip_hash or hash on a session cookie value, read through a variable such as $cookie_sessionid.
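A sketch of cookie-based affinity in open-source Nginx, assuming the application sets a cookie named `sessionid` (the cookie name, pool name, and addresses are placeholders). The Nginx Plus equivalent is shown commented out for comparison.

```nginx
# Open-source Nginx: route by the value of the application's session cookie.
upstream legacy_app {                   # hypothetical pool name
    hash $cookie_sessionid consistent;
    server 10.0.0.31:8080;
    server 10.0.0.32:8080;
}

# Nginx Plus only: Nginx issues and tracks its own routing cookie.
# upstream legacy_app {
#     server 10.0.0.31:8080;
#     server 10.0.0.32:8080;
#     sticky cookie srv_id expires=1h path=/;
# }

server {
    listen 80;

    location / {
        proxy_pass http://legacy_app;
    }
}
```

One caveat with the hash approach: requests that arrive before the cookie exists all share the same empty key, so first requests are likely to concentrate on one server until the application sets the cookie.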
Upstream Keepalive and Connection Pooling
For high-throughput services, TCP connection setup cost adds up. The keepalive directive in the upstream block tells Nginx to cache a pool of idle connections to each upstream, reusing them across requests. This is distinct from client-facing keepalive and dramatically reduces latency on services doing thousands of requests per second.
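A sketch of an upstream connection pool. The pool size of 32 and the timeout are illustrative values to tune against real traffic, and the pool name and addresses are placeholders.

```nginx
upstream app_backend {                  # hypothetical pool name
    # If a non-default algorithm is used, declare it before keepalive.
    least_conn;

    server 10.0.0.11:8080;
    server 10.0.0.12:8080;

    # Maximum number of idle upstream connections kept open per worker process.
    keepalive 32;

    # Close idle upstream connections after 60s (available since nginx 1.15.3).
    keepalive_timeout 60s;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;

        # Both lines are needed for upstream connections to be reused.
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
```

Note that keepalive caps idle connections only; it does not limit the total number of connections Nginx may open to the pool.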
Observing Load Balancer Behavior
The Nginx stub_status module exposes a minimal status page. For richer upstream-level metrics — active connections per server, health state, requests routed — you need Nginx Plus or a third-party module like nginx-module-vts. In production, most teams ship Nginx logs to a metrics pipeline (Prometheus + nginx-prometheus-exporter, or Datadog) and alert on upstream 5xx rates and response time percentiles rather than polling a status page.
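A sketch of a locked-down status endpoint, assuming Nginx was built with the stub_status module. The port and path are arbitrary choices; exporters such as nginx-prometheus-exporter can be pointed at whatever URL is configured here.

```nginx
server {
    # Bind to loopback so the status page is not reachable from outside.
    listen 127.0.0.1:8081;

    location = /nginx_status {          # hypothetical path
        stub_status;

        # Keep scrape requests out of the access log.
        access_log off;

        # Defence in depth on top of the loopback bind.
        allow 127.0.0.1;
        deny all;
    }
}
```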
Add $upstream_addr to your access log. It records which backend server handled each request, making it trivial to confirm distribution is working and to correlate errors with a specific upstream during an incident. Add $upstream_response_time alongside it to spot latency outliers per server.
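A sketch of a log format carrying those variables. The format name `upstream_log` and the log path are assumptions; the variables themselves are built into Nginx.

```nginx
# In the http context.
log_format upstream_log '$remote_addr [$time_local] "$request" $status '
                        'upstream=$upstream_addr '
                        'upstream_status=$upstream_status '
                        'request_time=$request_time '
                        'upstream_response_time=$upstream_response_time';

access_log /var/log/nginx/access.log upstream_log;
```

When a request is retried on a second server, $upstream_addr and $upstream_response_time hold comma-separated values, one per attempt, which makes retries visible in the log.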