Web Servers & Reverse Proxies

Project: A Production Nginx Front-End

Everything in this tutorial has pointed toward this lesson. You now have the vocabulary: TLS termination, reverse proxying, upstream pools, caching, rate limiting, security headers, and tuning knobs. This project wires those pieces together into a single Nginx configuration that you could adapt and deploy in front of a real application. The sample app is any HTTP process (Node.js, Python, PHP, and so on) listening on 127.0.0.1:8000; the pattern is identical for any language or framework.

We will build the configuration in deliberate layers — first making it work, then making it correct, then making it fast and resilient. That progression mirrors how senior engineers actually iterate on production infrastructure.

Project Architecture

Production Nginx layers: TLS termination, rate limiting, caching, and proxy pass to the upstream app pool.

Step 1 — Directory Layout and Prerequisites

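Before writing a single directive, establish the file layout. Debian and Ubuntu packages ship a sites-available / sites-enabled layout with symlinks so each vhost can be managed and disabled without touching the others. This project uses the simpler conf.d convention instead: every *.conf file in /etc/nginx/conf.d is loaded by the include line in nginx.conf, and a vhost is disabled by renaming or removing its file.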

# Install Nginx and Certbot (Debian/Ubuntu)
apt-get update && apt-get install -y nginx certbot python3-certbot-nginx

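# Create directory structure
mkdir -p /etc/nginx/conf.d
mkdir -p /var/cache/nginx/app_cache
mkdir -p /var/log/nginx
mkdir -p /var/www/app/public
mkdir -p /var/www/app/errors   # custom error pages (429.html, 50x.html)
mkdir -p /var/www/certbot      # webroot referenced by the ACME challenge location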

# Set cache directory ownership
chown -R www-data:www-data /var/cache/nginx

# Obtain a TLS certificate (DNS must already point to this server)
certbot certonly --nginx -d example.com -d www.example.com \
  --non-interactive --agree-tos -m ops@example.com

Step 2 — Global nginx.conf Baseline

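The global file sets worker tuning, the logging format, and the shared rate-limit and cache zones. The zones are declared here rather than in a server block because limit_req_zone and proxy_cache_path are only valid in the http context: each defines a shared memory region that any vhost can then reference by name.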

# /etc/nginx/nginx.conf

user www-data;
worker_processes auto;
worker_rlimit_nofile 65535;
pid /run/nginx.pid;

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

http {
    # --- MIME & basics ---
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;
    sendfile      on;
    tcp_nopush    on;
    tcp_nodelay   on;

    # --- Keep-alive ---
    keepalive_timeout  65;
    keepalive_requests 1000;

    # --- Buffers (tune to match your app response sizes) ---
    client_body_buffer_size    16k;
    client_max_body_size       50m;
    proxy_buffer_size          16k;
    proxy_buffers              4 64k;
    proxy_busy_buffers_size    128k;

    # --- Gzip compression ---
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_min_length 1024;
    gzip_types text/plain text/css application/json application/javascript
               application/xml text/xml image/svg+xml;   # woff2 is already compressed, so it is not listed

    # --- Rate-limiting zones (shared memory) ---
    # General API limit: 30 requests/sec per IP (burst is set where limit_req is applied)
    limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;
    # Login/auth limit: 5 requests/min per IP
    limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;

    # --- Proxy cache zone (disk-backed, 1GB max) ---
    proxy_cache_path /var/cache/nginx/app_cache
                     levels=1:2
                     keys_zone=app_cache:20m
                     max_size=1g
                     inactive=60m
                     use_temp_path=off;

    # --- Structured JSON log format (easy to ship to ELK/Splunk) ---
    log_format json_combined escape=json
      '{'
        '"time":"$time_iso8601",'
        '"ip":"$remote_addr",'
        '"method":"$request_method",'
        '"uri":"$uri",'
        '"status":"$status",'
        '"bytes":"$body_bytes_sent",'
        '"rt":"$request_time",'
        '"upstream_rt":"$upstream_response_time",'
        '"cache":"$upstream_cache_status",'
        '"ua":"$http_user_agent"'
      '}';

    access_log /var/log/nginx/access.log json_combined;
    error_log  /var/log/nginx/error.log warn;

    # --- Hide Nginx version from headers ---
    server_tokens off;

    # --- Include vhost configs ---
    include /etc/nginx/conf.d/*.conf;
}
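
The tuning numbers above interact, so a back-of-the-envelope check is worth doing before load testing. The sketch below assumes two approximations from the Nginx documentation: each proxied request occupies two connections in a worker (one to the client, one to the upstream), and a shared-memory zone holds roughly 8,000 entries per megabyte on 64-bit platforms. The function names are illustrative and not part of any tool.

```shell
#!/bin/sh
# Rough capacity estimates for the nginx.conf above (approximations only).

# Concurrent proxied clients: every proxied request holds two connections
# (client side + upstream side), so halve the raw connection budget.
max_proxied_clients() {   # usage: max_proxied_clients WORKERS WORKER_CONNECTIONS
  echo $(( $1 * $2 / 2 ))
}

# Entries a shared-memory zone can track, assuming ~8000 per megabyte.
zone_entries() {          # usage: zone_entries SIZE_MB
  echo $(( $1 * 8000 ))
}

max_proxied_clients 4 4096   # 4 workers x 4096 connections -> 8192
zone_entries 10              # limit_req zone of 10m        -> 80000
zone_entries 20              # cache keys_zone of 20m       -> 160000
```

With worker_processes auto, substitute the number of CPU cores for the first argument. The result should also stay below worker_rlimit_nofile, since every connection consumes a file descriptor.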

Why a JSON log format? Plain combined logs need a regex parser in every log aggregator. A JSON line can be ingested by Elasticsearch, Datadog, or Splunk with little or no parsing configuration, and every field (upstream response time, cache status, method) becomes a filterable dimension. During an incident, that lets you answer a question such as "which URIs are slow at the upstream?" with a query instead of ad-hoc grep and awk. Note that this format logs every value as a string, so cast numeric fields such as rt and bytes in the aggregator.
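
As a quick illustration of treating a log field as a dimension, the following sketch computes a cache hit ratio straight from the json_combined log. It assumes one JSON object per line, as the log_format above emits, and it matches on the "cache" field as a string rather than parsing JSON, so treat it as a convenience and reach for a real JSON tool for anything more involved. The function name is hypothetical.

```shell
#!/bin/sh
# Percentage of cache-eligible requests that were served from the cache.
# Lines whose "cache" field is empty (locations with caching off) are ignored.

cache_hit_ratio() {   # usage: cache_hit_ratio LOGFILE...
  awk '
    /"cache":"HIT"/    { hit++ }    # served from cache
    /"cache":"[A-Z]+"/ { seen++ }   # any request that went through the cache layer
    END {
      if (seen == 0) { print "n/a"; exit }
      printf "%.1f\n", 100 * hit / seen
    }
  ' "$@"
}

# Example:
#   cache_hit_ratio /var/log/nginx/example_access.log
```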

Step 3 — The Vhost Configuration

This is the complete /etc/nginx/conf.d/example.conf. Read the inline comments carefully — each directive has a reason that will matter in production.

# /etc/nginx/conf.d/example.conf

# ── Upstream pool ──────────────────────────────────────────────────────────
upstream app_backend {
    server 127.0.0.1:8000;

    # keepalive: reuse TCP connections to the app (avoids 3-way handshake per req)
    keepalive 32;
    keepalive_requests 100;
    keepalive_timeout  60s;
}

# ── HTTP → HTTPS redirect (permanent 301) ─────────────────────────────────
server {
    listen 80;
    listen [::]:80;
    server_name example.com www.example.com;

    # Allow Let's Encrypt ACME challenge through before redirecting
    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }

    location / {
        return 301 https://example.com$request_uri;
    }
}

# ── www → apex redirect ───────────────────────────────────────────────────
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    http2 on;
    server_name www.example.com;

    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    return 301 https://example.com$request_uri;
}

# ── Primary HTTPS vhost ────────────────────────────────────────────────────
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    http2 on;   # needs Nginx 1.25.1+; on older builds use "listen 443 ssl http2;" instead
    server_name example.com;

    root /var/www/app/public;

    # ── TLS configuration ──────────────────────────────────────────────
    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
    ssl_trusted_certificate /etc/letsencrypt/live/example.com/chain.pem;

    ssl_protocols             TLSv1.2 TLSv1.3;
    ssl_ciphers               ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384;
    ssl_prefer_server_ciphers off;    # TLS 1.3 ignores this; off is correct
    ssl_session_cache  shared:SSL:10m;
    ssl_session_timeout 1d;
    ssl_session_tickets off;          # Forward secrecy: disable session tickets
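    # OCSP stapling only works if the certificate names an OCSP responder.
    # Let's Encrypt has retired its OCSP service, so with its certificates Nginx
    # logs a warning and skips stapling; keep these lines for CAs that still offer OCSP.
    ssl_stapling on;
    ssl_stapling_verify on;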
    resolver 1.1.1.1 8.8.8.8 valid=300s;
    resolver_timeout 5s;

    # ── Security headers ───────────────────────────────────────────────
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;
    add_header X-Frame-Options           "SAMEORIGIN"                                   always;
    add_header X-Content-Type-Options    "nosniff"                                      always;
    add_header Referrer-Policy           "strict-origin-when-cross-origin"              always;
    add_header Permissions-Policy        "geolocation=(), camera=(), microphone=()"     always;
    add_header Content-Security-Policy   "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; font-src 'self'; connect-src 'self'; frame-ancestors 'self';" always;

    # ── Proxy cache defaults for this vhost ────────────────────────────
    proxy_cache             app_cache;
    proxy_cache_valid       200 302  10m;
    proxy_cache_valid       404      1m;
    proxy_cache_use_stale   error timeout updating http_500 http_502 http_503 http_504;
    proxy_cache_lock        on;
    add_header              X-Cache-Status $upstream_cache_status;

    # ── Common proxy headers ───────────────────────────────────────────
    proxy_http_version  1.1;
    proxy_set_header    Host              $host;
    proxy_set_header    X-Real-IP         $remote_addr;
    proxy_set_header    X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header    X-Forwarded-Proto $scheme;
    proxy_set_header    Connection        "";   # required for keepalive upstream

    # ── Static assets — served directly from disk ──────────────────────
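    location ~* \.(css|js|woff2|woff|ttf|ico|png|jpg|jpeg|gif|svg|webp|avif|pdf|zip)$ {
        expires 1y;
        # Caution: an add_header inside a location replaces ALL add_header lines
        # inherited from the server block, so responses from this location lose
        # the security headers above unless they are repeated here.
        add_header Cache-Control "public, immutable";
        access_log off;       # skip logging for static assets (noise reduction)

        # Serve pre-compressed .gz files if the client accepts gzip
        gzip_static on;
    }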

    # ── Auth/login endpoints — strict rate limit ───────────────────────
    location ~* ^/(login|register|password|api/auth) {
        limit_req zone=auth burst=5 nodelay;
        limit_req_status 429;

        proxy_pass http://app_backend;
        proxy_cache off;      # Never cache auth responses
    }

    # ── API endpoints — general rate limit, no caching ────────────────
    location /api/ {
        limit_req zone=api burst=60 nodelay;
        limit_req_status 429;

        proxy_pass http://app_backend;
        proxy_cache off;

        # Skip cache for API — responses are user-specific or mutation-heavy
        proxy_no_cache     1;
        proxy_cache_bypass 1;
    }

    # ── Default: proxy to app with caching enabled ─────────────────────
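    location / {
        limit_req zone=api burst=60 nodelay;

        # Serve a matching file from disk, otherwise hand off to the app.
        # $uri/ is deliberately omitted: the root directory always exists, so a
        # request for / would be treated as a directory and, without an index
        # file, answered with 403 instead of reaching the app.
        try_files $uri @proxy;
    }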

    location @proxy {
        proxy_pass http://app_backend;

        # Only cache GET/HEAD; never cache POST/PUT/DELETE
        proxy_cache_methods GET HEAD;

        # Skip cache if request has Authorization or session cookies
        proxy_cache_bypass $http_authorization $cookie_session;
        proxy_no_cache     $http_authorization $cookie_session;
    }

    # ── Healthcheck endpoint (no rate limit, no logging) ──────────────
    location = /healthz {
        access_log off;
        proxy_pass http://app_backend;
        proxy_cache off;
    }

    # ── Block common attack paths ──────────────────────────────────────
    location ~ /\. {
        deny all;             # Block .htaccess, .env, .git, etc.
    }
    location ~* \.(php|asp|aspx|jsp|cgi)$ {
        deny all;             # No PHP execution in /public on a Node/Python app
    }

    # ── Custom error pages ─────────────────────────────────────────────
    error_page 429 /errors/429.html;
    error_page 502 503 504 /errors/50x.html;
    location ^~ /errors/ {
        internal;
        root /var/www/app;
    }

    access_log /var/log/nginx/example_access.log json_combined;
    error_log  /var/log/nginx/example_error.log warn;
}

Step 4 — Validate, Reload, and Smoke-Test

# 1. Syntax check — ALWAYS do this before reloading
nginx -t

# 2. Reload without dropping connections (zero-downtime)
systemctl reload nginx

# 3. Confirm TLS and HTTP/2
curl -I https://example.com

# 4. Check HSTS header is present
curl -sI https://example.com | grep -i strict

# 5. Probe cache behaviour — first request: MISS, second: HIT
#    (grep -i because HTTP/2 responses print header names in lowercase)
curl -sI https://example.com/about | grep -i x-cache-status
curl -sI https://example.com/about | grep -i x-cache-status

# 6. Verify rate limiting fires correctly (hammer auth endpoint)
for i in $(seq 1 20); do
  curl -o /dev/null -sw "%{http_code}\n" -X POST https://example.com/login
done
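# Expect: the first 6 requests (1 + burst=5) reach the app, then 429 429 429 ...
# (the app's own status for an empty POST may be something other than 200)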

# 7. Confirm upstream keepalive is working: after a few requests, established
#    connections from Nginx to the app on port 8000 should stay open and be reused
ss -tn state established '( dport = :8000 )'
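
The numbers to expect from the rate-limit probe follow from the zone rate and the burst value. The sketch below works through that arithmetic for the auth zone (rate=5r/m, burst=5 nodelay); the helper names are made up for illustration.

```shell
#!/bin/sh
# Predict what the "hammer the auth endpoint" loop should show.

# With "burst=N nodelay", Nginx passes the first request plus N burst slots
# immediately; further requests are rejected until a slot frees up.
immediate_allowance() {   # usage: immediate_allowance BURST
  echo $(( 1 + $1 ))
}

# Seconds for one slot to free up at a per-minute rate.
refill_seconds() {        # usage: refill_seconds REQUESTS_PER_MINUTE
  echo $(( 60 / $1 ))
}

immediate_allowance 5   # auth zone, burst=5 -> 6
refill_seconds 5        # auth zone, 5r/m    -> 12
```

So a fast loop of 20 requests should see 6 responses from the app followed by 14 rejections, and one more request gets through for roughly every 12 seconds the loop runs. Piping the loop output through sort | uniq -c gives a quick tally of status codes.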

Production smoke-test checklist after every Nginx config change:

nginx -t passes with no warnings.
curl -I returns HTTP/2 200 (not 301 loop, not 502).
HSTS header is present on the HTTPS response.
X-Cache-Status: HIT on second request for a cacheable URL.
Rate limit returns 429 (not 500) on the auth path.
Static assets return Cache-Control: public, immutable.
curl https://example.com/.env returns 403.

Step 5 — Auto-Renew TLS and Reload Nginx

Certificates expire every 90 days. Certbot installs a systemd timer (certbot.timer) that runs renewal twice daily, but it does not automatically reload Nginx after renewal. You must hook into the post-renewal step.

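#!/bin/bash
# Save as /etc/letsencrypt/renewal-hooks/deploy/reload-nginx.sh
# (the shebang must be the first line of the file)
set -e
# Reload only if the configuration is valid
nginx -t && systemctl reload nginx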

chmod +x /etc/letsencrypt/renewal-hooks/deploy/reload-nginx.sh

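# Simulate renewal without replacing the certificate. Note that --dry-run
# does not execute deploy hooks, so run the hook once by hand to test it.
certbot renew --dry-run
/etc/letsencrypt/renewal-hooks/deploy/reload-nginx.sh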

Common Production Failure Modes

Every pattern in this configuration exists because something broke in production. Here are the most frequent failure modes engineers encounter with this exact setup:

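502 Bad Gateway after deploy: The app process restarted and Nginx either cannot connect or is holding a pooled keepalive connection to the old process. proxy_next_upstream error timeout invalid_header http_502 lets Nginx retry the request on another server, but only if the upstream block lists more than one. With the single backend used here there is nothing to fail over to, so run at least two app instances and restart them one at a time. Non-idempotent requests such as POST are not retried by default.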
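Cache poisoning via Host header injection: An attacker sends a crafted Host header and Nginx caches a response keyed to, or generated for, a different domain. Fix: set proxy_cache_key "$scheme$host$request_uri" explicitly and make sure requests with an unknown Host never reach the caching vhost. In this configuration only example.com matches its server_name, and any other Host on port 443 lands on the first 443 server block, which only redirects. Keep other client-supplied headers, such as X-Forwarded-Host, out of the cache key.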
HSTS locking out a domain: If you set a long max-age and then need to revert to HTTP (expired cert, misconfiguration), browsers will refuse to connect for the entire HSTS duration. Start with max-age=300 in staging; only set 63072000 (two years) in production when fully stable.
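Rate limiting blocking legitimate users behind NAT: A corporate proxy or carrier NAT can put hundreds of users behind one IP. With 100 users sharing an address, a per-IP limit of 30r/s works out to 0.3r/s per user. Mitigate by raising burst on API routes used by desktop apps, or by keying the zone on something more specific than the address. Use $http_x_forwarded_for as the key only when a proxy you control sets it, because clients can forge that header.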
Certificate renewal fails silently: Certbot renews the cert but the deploy hook was not made executable, so Nginx keeps serving the old certificate from memory until it expires. certbot renew --dry-run does not run deploy hooks, so after adding one, check its permissions and execute it once by hand.

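Never mix add_header at the server level with add_header inside a location block: the two sets do not merge. A location that contains any add_header of its own inherits none of the add_header lines from the server block. If you set HSTS at the server level and then add a single add_header inside location /api/, the HSTS header is silently dropped for all API responses. The static-assets location in the vhost above has exactly this problem: its Cache-Control add_header means static responses go out without the server-level security headers. The fix is to repeat all required security headers in every location that uses add_header (an included snippet file keeps this manageable), or to set headers only at the server level and never inside location blocks.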
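
Because dropped headers fail silently, it helps to check them per route rather than only on the home page. The helper below is a minimal sketch: it reads a response header dump on standard input and prints the names of any required headers that are missing. The function name and the example paths are placeholders, not part of the configuration above.

```shell
#!/bin/sh
# Report which required response headers are absent from a header dump.
# Names are matched case-insensitively because HTTP/2 responses print
# header names in lowercase.

missing_headers() {   # usage: curl -sI URL | missing_headers NAME...
  dump=$(tr -d '\r')  # read stdin once and strip the CR from CRLF line endings
  for name in "$@"; do
    printf '%s\n' "$dump" | grep -qi "^${name}:" || echo "$name"
  done
}

# Example (paths are placeholders for one static, one API, and one page route):
#   for path in /app.css /api/ping /about; do
#     echo "== $path"
#     curl -sI "https://example.com$path" |
#       missing_headers strict-transport-security x-content-type-options
#   done
```

Empty output for a route means every listed header was present; any name printed is a header that the route's location block has dropped.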

What You Have Built

This configuration implements every layer covered in this tutorial: TLS 1.2/1.3 with OCSP stapling and session resumption, HTTP/2, automated certificate renewal, HSTS preloading, a full security header set, disk-backed proxy caching that serves stale responses while an entry is updating or the upstream is failing, per-route rate limiting with burst headroom, upstream keepalive connection pooling, structured JSON access logs, and a clean separation between static file serving (zero app process involvement) and dynamic proxying. Parameterized with a different upstream block and a different domain, the same pattern can front any application in your stack.