Caching & Compression
Two of the highest-leverage performance levers available to a DevOps engineer both live inside Nginx: proxy caching and response compression. Together they eliminate redundant upstream work and slash the bytes shipped across the wire. At big-tech scale — where a single Nginx edge node might field 50,000 requests per second — these are not optional niceties. They are the difference between an origin that handles load comfortably and one that collapses under traffic spikes.
Proxy Caching: How Nginx Stores Upstream Responses
When Nginx sits in front of an upstream (a Node.js service, a PHP-FPM pool, a Rails app, a microservice), it can store the upstream's response as a file on disk, with an index of keys and metadata held in shared memory, and replay that stored copy to subsequent clients without touching the upstream at all. This is proxy caching, configured with the proxy_cache_path and proxy_cache directives.
The storage declaration goes in the http {} block — it is a shared resource across all server blocks. The levels parameter creates a two-tier directory hierarchy so the OS does not suffer from millions of files in a single directory. keys_zone allocates a named shared-memory segment for the cache metadata index (not the bodies themselves). max_size caps disk consumption and triggers LRU eviction when hit. inactive expires entries that have not been requested recently.
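The parameters described above might be combined as in the following minimal sketch. The cache path, the zone name api_cache, the upstream name app_backend and its address, and all sizes and TTLs are illustrative assumptions, not values from a real deployment.

```nginx
# Fragment for the http {} context.
# levels=1:2      two-tier subdirectory layout under the cache path
# keys_zone       shared memory for keys and metadata (bodies stay on disk)
# max_size        disk cap; least recently used entries are evicted first
# inactive        entries not requested for this long are removed
proxy_cache_path /var/cache/nginx/api levels=1:2 keys_zone=api_cache:10m
                 max_size=1g inactive=60m use_temp_path=off;

upstream app_backend {
    server 127.0.0.1:3000;   # hypothetical upstream address
}

server {
    listen 80;
    server_name api.example.com;

    location / {
        proxy_pass http://app_backend;

        proxy_cache api_cache;            # must match the keys_zone name
        proxy_cache_valid 200 301 10m;    # TTL per status code
        proxy_cache_valid 404 1m;

        # Expose the cache result so it can be inspected from the client side.
        add_header X-Cache-Status $upstream_cache_status always;
    }
}
```

The add_header line is what makes the cache status visible to clients; without it the status is only available to logging via the $upstream_cache_status variable.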
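The $upstream_cache_status variable, commonly exposed as an X-Cache-Status response header, reports one of: HIT, MISS, BYPASS, EXPIRED, STALE, UPDATING, or REVALIDATED. Watch this header in production to verify that your cache is actually working before celebrating.
Cache Keys: What Makes a Unique Cache Entry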
By default Nginx builds the cache key from the full request URI including query string: $scheme$proxy_host$request_uri. This is correct for most REST APIs. For authenticated endpoints, vary the key to include the user's identity, or — better — skip caching entirely. For mobile/desktop variant content, include the User-Agent bucket.
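As a sketch of the device-variant case, a map can collapse the User-Agent header into a small number of buckets that are then appended to the key. The variable name $device_bucket, the regular expression, and the api_cache and app_backend names are assumptions made for illustration.

```nginx
# Fragment for the http {} context. Assumes a cache zone "api_cache"
# (declared with proxy_cache_path) and an upstream "app_backend" exist.

# Reduce User-Agent to two coarse buckets; using the raw header in the key
# would create a separate cache entry for nearly every browser build.
map $http_user_agent $device_bucket {
    default                            desktop;
    ~*(android|iphone|ipod|mobile)     mobile;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;
        proxy_cache api_cache;

        # The default key is $scheme$proxy_host$request_uri;
        # this variant appends the device bucket.
        proxy_cache_key "$scheme$proxy_host$request_uri|$device_bucket";
    }
}
```

Keeping the number of buckets small matters: every extra value multiplies the number of entries stored per URL and lowers the hit ratio.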
Cache Invalidation
Cache invalidation is famously hard. In practice there are three approaches at the Nginx layer. First, let entries expire naturally via proxy_cache_valid TTLs, which suits content that changes on a known schedule. Second, purge on demand with an HTTP PURGE request: NGINX Plus, the commercial build, provides a proxy_cache_purge directive, while open-source builds need the third-party ngx_cache_purge module (originally published by FRiCKLE) compiled in. Third, tag entries with Surrogate-Key or Cache-Tag response headers and bulk-purge by tag, a pattern familiar from Fastly, Cloudflare, and Varnish deployments; open-source Nginx has no built-in tag purge, so this approach needs extra tooling or a CDN in front.
For the common case of a CI/CD deploy: script a purge request from your pipeline immediately after the new version goes live. Never rely on TTL expiry when you have just deployed a breaking change.
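The purge endpoint a pipeline would call could look roughly like the sketch below. It uses the proxy_cache_purge directive, which is available in NGINX Plus only; the third-party ngx_cache_purge module uses a different syntax, described in its README. The allowed address ranges and the api_cache and app_backend names are assumptions.

```nginx
# Fragment for the http {} context (NGINX Plus).
# Assumes a cache zone "api_cache" and an upstream "app_backend" exist.

# Only these client addresses may purge (illustrative ranges).
geo $purge_allowed {
    default     0;
    127.0.0.1   1;
    10.0.0.0/8  1;
}

# Non-zero only for PURGE requests from an allowed address.
map $request_method $purge_method {
    PURGE    $purge_allowed;
    default  0;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;
        proxy_cache api_cache;

        # When $purge_method is non-empty and non-zero, the entry matching
        # the request's cache key is removed instead of being served.
        proxy_cache_purge $purge_method;
    }
}
```

Restricting who may send PURGE is the important design choice here: an open purge endpoint lets any client flush the cache and push full load back onto the origin.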
Gzip Compression
Compression reduces response body size before it leaves Nginx, trading a small amount of CPU for significantly less bandwidth and faster perceived load times. For text-based payloads (HTML, JSON, CSS, JS) you can expect 60–80% size reduction. Nginx's gzip module is compiled in by default, but the gzip directive itself defaults to off, so it has to be enabled in configuration (some distribution packages ship a default config that already does this).
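A typical starting point is sketched below. The compression level, minimum length, and MIME type list are illustrative choices to tune against your own traffic, not recommendations from a benchmark.

```nginx
# Fragment for the http {} context.
gzip on;

# Levels run from 1 (fastest) to 9 (smallest); mid-range values usually
# capture most of the size reduction at a fraction of the CPU cost.
gzip_comp_level 5;

# Skip tiny responses, where the gzip overhead outweighs the savings.
gzip_min_length 1024;

# Emit "Vary: Accept-Encoding" so downstream caches keep variants apart.
gzip_vary on;

# Also compress responses to requests that arrived through a proxy.
gzip_proxied any;

# text/html is always compressed and must not be listed again here.
# Already-compressed formats (JPEG, PNG, WOFF2, video) are left out on purpose.
gzip_types text/plain text/css text/javascript application/javascript
           application/json application/xml image/svg+xml;
```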
Brotli Compression
Brotli (Google, 2015) achieves 15–25% better compression ratios than gzip at equivalent CPU cost for text content. All major browsers have supported it since 2017. On the Nginx side you need the ngx_brotli module, which is not compiled into the default package on most distros. On Ubuntu/Debian you can install a packaged module such as libnginx-mod-brotli where your release or repository provides one (package names vary), or compile from source. On Cloudflare-backed deployments, brotli compression is applied at the CDN edge automatically.
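Once the module is installed, enabling it might look like the sketch below. The directive names come from ngx_brotli; the module file paths, compression level, and type list are assumptions, and packaged installs often add the load_module lines for you.

```nginx
# Main context (top of nginx.conf). Only needed when ngx_brotli was built
# as dynamic modules; paths are relative to the Nginx prefix and may differ.
load_module modules/ngx_http_brotli_filter_module.so;
load_module modules/ngx_http_brotli_static_module.so;

events {}

http {
    # On-the-fly brotli for clients that send "Accept-Encoding: br".
    brotli on;
    brotli_comp_level 5;       # 0-11; the highest levels are slow for dynamic content
    brotli_min_length 1024;
    brotli_types text/plain text/css text/javascript application/javascript
                 application/json application/xml image/svg+xml;

    # Keep gzip enabled as the fallback for clients without brotli support.
    gzip on;
    gzip_types text/plain text/css text/javascript application/javascript
               application/json application/xml image/svg+xml;
}
```

When a client accepts both encodings, ngx_brotli takes precedence and the gzip filter is skipped for that response.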
Pre-compressing static assets during CI, for example with gzip -9 and brotli --best (or Webpack's CompressionPlugin), produces .br and .gz files alongside the originals. With gzip_static on and brotli_static on, Nginx serves these pre-compressed files directly instead of compressing on the fly, which sharply reduces edge CPU use at high traffic volumes. This is standard practice at companies like Shopify and GitHub for their static asset pipelines.
Combining Caching and Compression
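When both proxy caching and compression are active, they run at different stages. Nginx writes the upstream response to the cache as it was received; the gzip or brotli filter runs later, on the way out to the client, and only if the client's Accept-Encoding allows it. With an uncompressed upstream, the cached copy is therefore uncompressed and each HIT is compressed again on the fly, which costs some CPU but is still far cheaper than a round trip to the origin. To cache compressed bodies, the upstream itself has to compress, and the cache must then vary on Accept-Encoding (through a Vary response header or the cache key) so that clients without gzip support are never served a compressed copy. Verify the result with curl -s -o /dev/null -D - -H "Accept-Encoding: gzip" https://api.example.com/endpoint and inspect the Content-Encoding and X-Cache-Status headers together. This uses a GET rather than curl -I, because a HEAD response may omit Content-Encoding.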
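A minimal sketch of a server with both features enabled follows. It also includes a static-asset location that uses the gzip_static and brotli_static directives discussed earlier. The api_cache and app_backend names, the /api/ and /assets/ paths, and the document root are hypothetical.

```nginx
# Fragment for the http {} context. Assumes a cache zone "api_cache"
# (declared with proxy_cache_path) and an upstream "app_backend" exist.
server {
    listen 80;                       # TLS settings omitted for brevity
    server_name api.example.com;

    # Compression for dynamic responses.
    gzip on;
    gzip_vary on;
    gzip_types application/json text/css application/javascript;

    # Proxied, cacheable API responses.
    location /api/ {
        proxy_pass http://app_backend;
        proxy_cache api_cache;
        proxy_cache_valid 200 5m;
        add_header X-Cache-Status $upstream_cache_status always;
    }

    # Static assets with pre-compressed siblings on disk,
    # for example app.js, app.js.gz, and app.js.br.
    location /assets/ {
        root /var/www/app;           # hypothetical document root
        gzip_static on;              # requires ngx_http_gzip_static_module
        brotli_static on;            # requires the ngx_brotli static module
    }
}
```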
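Before going live, run through this checklist: (1) Confirm X-Cache-Status: HIT appears in response headers on repeat requests. (2) Confirm Content-Encoding: br or gzip. (3) Check that nginx -t passes before every reload. (4) Monitor /var/log/nginx/error.log for cache permission errors; the worker process user must be able to write to the cache directory. (5) Set proxy_cache_bypass $cookie_session together with proxy_no_cache $cookie_session on authenticated session endpoints, so that responses are neither served from nor stored in the shared cache, to prevent user data cross-contamination.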
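Item (5) can be sketched as follows. The cookie name session comes from the $cookie_session variable in the text; the /account/ path and the api_cache and app_backend names are assumptions. Note that proxy_cache_bypass on its own only skips the cache lookup, so the sketch pairs it with proxy_no_cache to stop the response from being stored.

```nginx
# Fragment for a server {} block. Assumes a cache zone "api_cache"
# and an upstream "app_backend" are defined in the http {} context.
location /account/ {
    proxy_pass http://app_backend;
    proxy_cache api_cache;

    # Do not answer from the cache when a session cookie or an
    # Authorization header is present (any non-empty, non-"0" value triggers it).
    proxy_cache_bypass $cookie_session $http_authorization;

    # Do not store such responses either; without this line a personalised
    # response could be saved and later replayed to a different user.
    proxy_no_cache $cookie_session $http_authorization;

    add_header X-Cache-Status $upstream_cache_status always;
}
```

Requests carrying the cookie should then report X-Cache-Status: BYPASS, which gives a quick way to confirm the rule is active.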