Gateway Cross-Cutting Concerns
An API Gateway sits at the edge of your system — every request from every client passes through it before reaching any downstream service. That position makes the gateway the natural home for cross-cutting concerns: behaviours that every service needs but should not have to implement individually. The two most impactful of those concerns are authentication / authorisation and rate limiting. This lesson explains the why, the how, and the trade-offs that matter in a real distributed system.
Why Centralise These Concerns?
Without a gateway, each microservice has to verify tokens, check scopes, and enforce quotas on its own. That duplication creates three serious problems:
- Inconsistency: services written by different teams enforce rules differently, leading to security gaps or conflicting quota policies.
- Surface area: every service that accepts a raw token is a target. Compromising any one of them exposes others through token replay.
- Maintenance: rotating a signing key, changing a quota tier, or invalidating a compromised token requires touching every service instead of one.
Centralising at the gateway solves all three: there is one enforcement point and one place to change policy, and downstream services can trust that they only receive pre-validated requests, provided they are reachable only through the gateway.
Authentication at the Gateway with JWT
The dominant pattern is JWT (JSON Web Token) verification at the gateway. The client obtains a signed JWT from an identity provider (Auth Service, Keycloak, Auth0, etc.), attaches it as a Bearer token, and the gateway verifies the signature and expiry before forwarding the request. Downstream services receive a plain HTTP request with the claims already extracted — they do not need a Security dependency at all.
In Spring Cloud Gateway (reactive, Spring Boot 3), you implement this as a GlobalFilter:
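A minimal, framework-free sketch of the decision such a filter makes, written in plain Java so that it runs without Spring on the classpath. The names `GatewayAuthSketch` and `TokenVerifier`, and the use of a plain `Map` for headers, are assumptions made for illustration; a real GlobalFilter would apply the same steps to the exchange's request and answer with 401 through the response.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical stand-in for the JWT check: yields the user ID when the token is valid. */
interface TokenVerifier {
    Optional<String> verify(String token);
}

/** Framework-free sketch of the authentication step; this is not the Spring API. */
final class GatewayAuthSketch {
    static final String USER_HEADER = "X-User-Id";
    private static final String BEARER_PREFIX = "Bearer ";

    /**
     * Returns the headers to forward downstream, or empty when the request
     * should be rejected with 401 Unauthorized.
     */
    static Optional<Map<String, String>> authenticate(Map<String, String> incoming,
                                                      TokenVerifier verifier) {
        // HTTP header names are case-insensitive, so copy into a case-insensitive map.
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.putAll(incoming);

        // Drop any client-supplied identity header before doing anything else.
        headers.remove(USER_HEADER);

        String authorization = headers.get("Authorization");
        if (authorization == null || !authorization.startsWith(BEARER_PREFIX)) {
            return Optional.empty();
        }
        String token = authorization.substring(BEARER_PREFIX.length()).trim();

        // Re-add the header only with the value taken from the verified token.
        return verifier.verify(token).map(userId -> {
            headers.put(USER_HEADER, userId);
            return headers;
        });
    }
}
```

Because the identity header is removed before the token is even inspected, a forged value can never survive a failed or missing verification.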
Never trust a client-supplied X-User-Id header: a malicious caller could forge it. The gateway should strip any such header from the incoming request and re-add it only after successful JWT verification. This guarantees the value is authoritative.
The JwtTokenValidator utility itself is a thin wrapper around a JWT library (e.g. jjwt or nimbus-jose-jwt). A minimal example using jjwt:
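As a dependency-free illustration of what such a validator checks, the sketch below verifies an HS256 signature and the `exp` claim using only the JDK (`javax.crypto`) rather than jjwt. The class name `JwtTokenValidatorSketch` is hypothetical, and extracting `sub` and `exp` with regular expressions is a deliberate simplification; production code should rely on a JWT library for parsing and algorithm handling.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative HS256-only validator; the token header's "alg" field is ignored on purpose. */
final class JwtTokenValidatorSketch {
    private static final Pattern SUB = Pattern.compile("\"sub\"\\s*:\\s*\"([^\"]*)\"");
    private static final Pattern EXP = Pattern.compile("\"exp\"\\s*:\\s*(\\d{1,18})");
    private final byte[] secret;

    JwtTokenValidatorSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Computes the base64url-encoded HMAC-SHA256 signature of "header.payload". */
    String sign(String signingInput) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] sig = mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the subject claim if the signature matches and the token has not expired. */
    Optional<String> validate(String token, long nowEpochSeconds) {
        String[] parts = token.split("\\.");
        if (parts.length != 3) {
            return Optional.empty();
        }
        // Compare signatures in constant time to avoid leaking timing information.
        byte[] expected = sign(parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII);
        byte[] actual = parts[2].getBytes(StandardCharsets.US_ASCII);
        if (!MessageDigest.isEqual(expected, actual)) {
            return Optional.empty();
        }
        String payload;
        try {
            payload = new String(Base64.getUrlDecoder().decode(parts[1]), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
        // Reject tokens without an exp claim, or whose exp is not in the future.
        Matcher exp = EXP.matcher(payload);
        if (!exp.find() || Long.parseLong(exp.group(1)) <= nowEpochSeconds) {
            return Optional.empty();
        }
        Matcher sub = SUB.matcher(payload);
        return sub.find() ? Optional.of(sub.group(1)) : Optional.empty();
    }
}
```

The signature is checked before the payload is decoded, so untrusted claim data is never interpreted unless it was signed with the shared secret.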
Rate Limiting at the Gateway
Rate limiting prevents a single client — whether a misbehaving integration, a misconfigured retry loop, or a deliberate attack — from overwhelming backend services. Enforcing it at the gateway means a request that exceeds the quota is rejected before it consumes any downstream CPU, database connections, or I/O.
Spring Cloud Gateway ships a built-in RequestRateLimiter filter backed by Redis (using Lua scripts for atomic token-bucket counting). Add the dependency and configure it in application.yml:
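A sketch of what that configuration could look like, assuming the reactive Redis starter (`spring-boot-starter-data-redis-reactive`) is on the classpath. The route ID, target URI, path, Redis address, and numeric limits are placeholders, and `userKeyResolver` is an assumed bean name. The `spring.cloud.gateway.routes` prefix is the one used by releases commonly paired with Spring Boot 3; check the reference documentation for your version, since property prefixes can change between releases.

```yaml
spring:
  data:
    redis:
      host: localhost        # placeholder Redis address
      port: 6379
  cloud:
    gateway:
      routes:
        - id: orders                      # placeholder route ID
          uri: lb://order-service         # placeholder downstream service
          predicates:
            - Path=/api/orders/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 10    # tokens added per second
                redis-rate-limiter.burstCapacity: 20    # maximum bucket size
                redis-rate-limiter.requestedTokens: 1   # cost of one request
                key-resolver: "#{@userKeyResolver}"     # SpEL reference to a KeyResolver bean
```

With these example values a client can sustain 10 requests per second and briefly burst to 20 before requests are rejected.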
The key-resolver bean determines the bucket identity — per user, per IP, or per API key. A resolver that uses the authenticated user ID forwarded by the JWT filter:
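The resolver's job reduces to mapping a request to a bucket key. The plain-Java sketch below shows that mapping for the user-ID case; `RateLimitKeySketch` and the `user:` key prefix are illustrative assumptions, and the header map is assumed to use the canonical `X-User-Id` spelling. In Spring Cloud Gateway the equivalent logic would live in a KeyResolver bean that reads the header from the exchange and returns the key reactively.

```java
import java.util.Map;
import java.util.Optional;

/** Framework-free sketch of a per-user rate-limit key; not the Spring KeyResolver API. */
final class RateLimitKeySketch {
    /** Returns the bucket key, or empty when no authenticated user ID is present. */
    static Optional<String> resolve(Map<String, String> headers) {
        String userId = headers.get("X-User-Id");
        if (userId == null || userId.trim().isEmpty()) {
            // No verified identity: the caller should reject rather than share a bucket.
            return Optional.empty();
        }
        return Optional.of("user:" + userId.trim());
    }
}
```

Returning an empty result instead of a shared fallback key keeps unauthenticated traffic from draining another user's quota.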
When the bucket is empty, the gateway returns HTTP 429 Too Many Requests by default, along with diagnostic headers that include X-RateLimit-Remaining, X-RateLimit-Replenish-Rate, and X-RateLimit-Burst-Capacity.
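The token-bucket behaviour behind those responses can be illustrated with a small in-memory version. This is a sketch under simplifying assumptions (one bucket, whole-second granularity, one token per request), not the Lua script the gateway actually runs against Redis; `TokenBucketSketch` and its method names are hypothetical.

```java
/** In-memory token bucket; a Redis-backed limiter keeps equivalent state in Redis. */
final class TokenBucketSketch {
    private final long replenishRate;   // tokens added per second
    private final long burstCapacity;   // maximum tokens the bucket can hold
    private long tokens;
    private long lastRefillSeconds;

    TokenBucketSketch(long replenishRate, long burstCapacity, long nowSeconds) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;    // the bucket starts full
        this.lastRefillSeconds = nowSeconds;
    }

    /** Returns the tokens remaining after this request, or -1 if it must be rejected (429). */
    synchronized long tryConsume(long nowSeconds) {
        // Refill in proportion to elapsed time, capped at the burst capacity.
        long elapsed = Math.max(0, nowSeconds - lastRefillSeconds);
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefillSeconds = Math.max(lastRefillSeconds, nowSeconds);

        if (tokens < 1) {
            return -1;
        }
        tokens -= 1;
        return tokens;
    }
}
```

The value returned on success corresponds to what a remaining-tokens header would report, and the refill cap is what bounds a burst.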
Combining Authentication and Rate Limiting
The two filters compose naturally because filter order controls the pipeline. The JWT filter runs first (order -100) and stamps the request with X-User-Id; the rate-limiter filter then uses that header as the bucket key. This means rate limits are always applied per authenticated identity rather than per IP address, which is a poor key because many clients can share one address behind a NAT.
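The effect of ordering can be shown with a toy pipeline in plain Java. `OrderedFilterSketch` is a hypothetical class, not a Spring type, and the order value 1 used for the rate limiter in the usage note is an assumption chosen only because it is larger than -100.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy filter with an explicit order; lower values run earlier. */
final class OrderedFilterSketch {
    final String name;
    final int order;
    final Consumer<Map<String, String>> action;

    OrderedFilterSketch(String name, int order, Consumer<Map<String, String>> action) {
        this.name = name;
        this.order = order;
        this.action = action;
    }

    /** Runs the filters in ascending order against the headers and returns the execution trace. */
    static List<String> run(List<OrderedFilterSketch> filters, Map<String, String> headers) {
        List<OrderedFilterSketch> sorted = new ArrayList<>(filters);
        sorted.sort(Comparator.comparingInt(f -> f.order));

        List<String> trace = new ArrayList<>();
        for (OrderedFilterSketch filter : sorted) {
            filter.action.accept(headers);
            trace.add(filter.name);
        }
        return trace;
    }
}
```

Registering a "jwt" filter at -100 that sets X-User-Id and a "rateLimiter" filter at 1 that reads it yields the trace jwt, rateLimiter regardless of registration order, so the limiter always sees the verified identity.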
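In the route configuration, only the RequestRateLimiter filter has to be declared on each route; the JWT filter needs no route entry because a GlobalFilter applies to every route.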
Security Implications and Trade-Offs
- Gateway as single point of failure: centralising auth means a misconfigured or crashed gateway takes down the entire API surface. Run multiple replicas and write thorough integration tests for the auth filter.
- Clock skew and JWT expiry: JWT expiry is verified against the gateway's system clock. If microservices are spread across VMs with unsynchronised clocks, a token could be accepted at the gateway but appear expired to a downstream verifier. Use NTP and add a small clock-skew tolerance (setAllowedClockSkewSeconds(30) in jjwt).
- Token revocation: JWTs are stateless and cannot be revoked before their expiry. Keep access-token TTL short (5–15 minutes) and issue refresh tokens separately. For immediate revocation, maintain a deny-list in Redis that the gateway checks on every request.
- Rate limit granularity: coarse global limits protect infrastructure; fine-grained per-route limits protect individual services from being pinned. Define both.
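The clock-skew tolerance and the deny-list check from the list above can be combined into a single acceptance test. The sketch below is an assumption-laden illustration: `RevocationCheckSketch` is a hypothetical name, an in-memory set stands in for Redis, and tokens are assumed to carry a unique ID (for example the standard `jti` claim) that can be placed on the deny-list.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of expiry-with-skew plus deny-list checks; the set stands in for a Redis lookup. */
final class RevocationCheckSketch {
    private final Set<String> denyList = ConcurrentHashMap.newKeySet();
    private final long allowedClockSkewSeconds;

    RevocationCheckSketch(long allowedClockSkewSeconds) {
        this.allowedClockSkewSeconds = allowedClockSkewSeconds;
    }

    /** Marks a token ID as revoked; a real deny-list entry could expire with the token. */
    void revoke(String tokenId) {
        denyList.add(tokenId);
    }

    /** Accepts a token only if it is within expiry plus tolerance and has not been revoked. */
    boolean isAcceptable(String tokenId, long expEpochSeconds, long nowEpochSeconds) {
        boolean expired = nowEpochSeconds > expEpochSeconds + allowedClockSkewSeconds;
        return !expired && !denyList.contains(tokenId);
    }
}
```

Because access tokens are short-lived, deny-list entries only need to be kept until the corresponding token would have expired anyway, which keeps the list small.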
Summary
Enforcing authentication and rate limiting at the gateway keeps downstream services clean, ensures consistent policy across the whole system, and dramatically reduces your attack surface. The pattern is: verify the JWT in a high-priority GlobalFilter, stamp trusted claims as headers, and let a Redis-backed RequestRateLimiter filter enforce quotas per authenticated identity. In the next lesson you will see how the registry, config server, and gateway collaborate as a complete infrastructure layer.