Service Discovery, Config & Gateway

Gateway Cross-Cutting Concerns

An API Gateway sits at the edge of your system — every request from every client passes through it before reaching any downstream service. That position makes the gateway the natural home for cross-cutting concerns: behaviours that every service needs but should not have to implement individually. The two most impactful of those concerns are authentication / authorisation and rate limiting. This lesson explains the why, the how, and the trade-offs that matter in a real distributed system.

Why Centralise These Concerns?

Without a gateway, each microservice has to verify tokens, check scopes, and enforce quotas on its own. That duplication creates three serious problems:

Inconsistency: services written by different teams enforce rules differently, leading to security gaps or conflicting quota policies.
Surface area: every service that accepts a raw token is a target. Compromising any one of them exposes others through token replay.
Maintenance: rotating a signing key, changing a quota tier, or invalidating a compromised token requires touching every service instead of one.

Centralising at the gateway solves all three: one enforcement point, one place to change policy, and downstream services can trust they only receive pre-validated requests.

Authentication at the Gateway with JWT

The dominant pattern is JWT (JSON Web Token) verification at the gateway. The client obtains a signed JWT from an identity provider (Auth Service, Keycloak, Auth0, etc.) and attaches it as a Bearer token; the gateway verifies the signature and expiry before forwarding the request. Downstream services receive a plain HTTP request with the claims already extracted into headers, so they do not need a Spring Security dependency at all.

In Spring Cloud Gateway (reactive, Spring Boot 3), you implement this as a GlobalFilter:

import io.jsonwebtoken.Claims;
import io.jsonwebtoken.JwtException;
import org.springframework.cloud.gateway.filter.GatewayFilterChain;
import org.springframework.cloud.gateway.filter.GlobalFilter;
import org.springframework.core.Ordered;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

@Component
public class JwtAuthFilter implements GlobalFilter, Ordered {

    private final JwtTokenValidator validator;   // your JWT utility

    public JwtAuthFilter(JwtTokenValidator validator) {
        this.validator = validator;
    }

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        String path = exchange.getRequest().getPath().value();
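        if (isPublicPath(path)) {
            // Skip auth for /auth/** and /public/**, but still drop client-supplied
            // identity headers so they can never reach a downstream service unverified
            ServerWebExchange stripped = exchange.mutate()
                .request(r -> r.headers(h -> {
                    h.remove("X-User-Id");
                    h.remove("X-User-Roles");
                }))
                .build();
            return chain.filter(stripped);
        }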

        String authHeader = exchange.getRequest()
                .getHeaders().getFirst(HttpHeaders.AUTHORIZATION);

        if (authHeader == null || !authHeader.startsWith("Bearer ")) {
            exchange.getResponse().setStatusCode(HttpStatus.UNAUTHORIZED);
            return exchange.getResponse().setComplete();
        }

        String token = authHeader.substring(7);
        try {
            Claims claims = validator.validate(token);   // throws on bad/expired token

            // Forward claims as headers so downstream services can read them
            ServerWebExchange mutated = exchange.mutate()
                .request(r -> r.header("X-User-Id",    claims.getSubject())
                               .header("X-User-Roles", String.join(",", getRoles(claims))))
                .build();

            return chain.filter(mutated);
        } catch (JwtException ex) {
            exchange.getResponse().setStatusCode(HttpStatus.UNAUTHORIZED);
            return exchange.getResponse().setComplete();
        }
    }

    @Override
    public int getOrder() { return -100; }   // run before routing

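    private boolean isPublicPath(String path) {
        return path.startsWith("/auth/") || path.startsWith("/public/");
    }

    // Assumption: roles are carried in a "roles" claim as a JSON array of strings.
    // Adjust the claim name to match what your identity provider issues.
    @SuppressWarnings("unchecked")
    private java.util.List<String> getRoles(Claims claims) {
        java.util.List<String> roles = claims.get("roles", java.util.List.class);
        return roles != null ? roles : java.util.List.of();
    }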
}

Why mutate the request with headers? Downstream services do not verify the token themselves, so they rely on X-User-Id being authoritative. A client-supplied X-User-Id must never reach them, because a malicious caller could forge it. The gateway should strip any such header from the incoming request and re-add it only after successful JWT verification. This guarantee holds only if downstream services are reachable exclusively through the gateway.
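The strip-then-stamp rule can be sketched outside Spring as a plain function over a header map. This is an illustration under stated assumptions, not gateway code: the class IdentityHeaders and the method sanitise are hypothetical names, and a single-valued map stands in for real HTTP headers. It uses a case-insensitive map because HTTP header names are case-insensitive, so a forged x-user-id must be removed as well.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: models what the gateway does to identity headers.
final class IdentityHeaders {

    // Returns the headers to forward downstream.
    // verifiedUserId is null when the request carried no verified token.
    static Map<String, String> sanitise(Map<String, String> incoming, String verifiedUserId) {
        // Case-insensitive keys, because "x-user-id" and "X-User-Id" are the same header.
        Map<String, String> forwarded = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        forwarded.putAll(incoming);

        // Step 1: drop anything the client claimed about its own identity.
        forwarded.remove("X-User-Id");
        forwarded.remove("X-User-Roles");

        // Step 2: re-add the identity only when it came from a verified token.
        if (verifiedUserId != null) {
            forwarded.put("X-User-Id", verifiedUserId);
        }
        return forwarded;
    }
}
```

A request that arrives with a forged x-user-id and no valid token leaves the function with no identity header at all, which is the property downstream services depend on.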

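The JwtTokenValidator utility itself is a thin wrapper around a JWT library (e.g. jjwt or nimbus-jose-jwt). A minimal example using the jjwt 0.11.x API (in jjwt 0.12 and later the equivalent calls are Jwts.parser(), verifyWith(), parseSignedClaims(), and getPayload()):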

import io.jsonwebtoken.*;
import io.jsonwebtoken.security.Keys;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;
import java.security.Key;
import java.util.Base64;

@Component
public class JwtTokenValidator {

    private final Key signingKey;

    public JwtTokenValidator(@Value("${jwt.secret}") String base64Secret) {
        byte[] keyBytes = Base64.getDecoder().decode(base64Secret);
        this.signingKey = Keys.hmacShaKeyFor(keyBytes);
    }

    public Claims validate(String token) {
        return Jwts.parserBuilder()
                .setSigningKey(signingKey)
                .build()
                .parseClaimsJws(token)    // throws ExpiredJwtException, MalformedJwtException, etc.
                .getBody();
    }
}
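To make concrete what the signature check inside the library call amounts to, the sketch below verifies an HS256 token by hand using only the JDK. It is illustrative and not a replacement for jjwt: the class Hs256Check and its methods are hypothetical names, and it checks the signature only, leaving header inspection and claim validation (including expiry) to a real library.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;

// Hypothetical helper: the HMAC-SHA256 check behind an HS256 JWT.
final class Hs256Check {

    // Computes base64url(HMAC-SHA256(secret, "header.payload")) without padding.
    static String sign(String signingInput, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] sig = mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    // Returns true only if the token has three parts and the signature matches.
    static boolean hasValidSignature(String token, byte[] secret) {
        String[] parts = token.split("\\.", -1);
        if (parts.length != 3) {
            return false;
        }
        String expected = sign(parts[0] + "." + parts[1], secret);
        // Constant-time comparison avoids leaking how many characters matched.
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.US_ASCII),
                parts[2].getBytes(StandardCharsets.US_ASCII));
    }

    // Builds a signed token from raw JSON strings, for demonstration only.
    static String issue(String headerJson, String payloadJson, byte[] secret) {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        String input = enc.encodeToString(headerJson.getBytes(StandardCharsets.UTF_8))
                + "." + enc.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        return input + "." + sign(input, secret);
    }
}
```

Changing a single character of the payload changes the expected signature, which is why a client cannot edit its own claims without knowing the secret.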

Prefer asymmetric keys (RS256 / ES256) in production. With a shared secret (HS256) every service that needs to verify a token must also know the secret and could therefore forge tokens. With RS256 the auth service signs with its private key; the gateway and any other verifier only need the public key.
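The asymmetry can be shown with the JDK alone. In the sketch below, signing requires the private key while verification needs only the public key, so a verifier such as the gateway holds nothing that would let it mint tokens. Rs256Demo and its method names are hypothetical; the code signs an arbitrary string rather than a full JWT, and "SHA256withRSA" is the JDK algorithm name for the RSA PKCS#1 v1.5 with SHA-256 scheme that RS256 uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;

// Hypothetical demo: sign with a private key, verify with the public key only.
final class Rs256Demo {

    // Generates a fresh 2048-bit RSA key pair (the auth service would persist its own).
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Auth-service side: requires the private key.
    static byte[] sign(String signingInput, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Gateway side: the public key is enough to verify, but cannot produce signatures.
    static boolean verify(String signingInput, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            return verifier.verify(signature);
        } catch (SignatureException e) {
            return false; // malformed signature bytes
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```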

Rate Limiting at the Gateway

Rate limiting prevents a single client — whether a misbehaving integration, a misconfigured retry loop, or a deliberate attack — from overwhelming backend services. Enforcing it at the gateway means a request that exceeds the quota is rejected before it consumes any downstream CPU, database connections, or I/O.

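Spring Cloud Gateway ships a built-in RequestRateLimiter filter whose default implementation, RedisRateLimiter, is backed by Redis and uses a Lua script to apply the token-bucket update atomically. Add the spring-boot-starter-data-redis-reactive dependency and configure the filter in application.yml: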

# application.yml
spring:
  cloud:
    gateway:
      routes:
        - id: order-service
          uri: lb://ORDER-SERVICE
          predicates:
            - Path=/orders/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 20      # tokens added per second
                redis-rate-limiter.burstCapacity: 40      # max tokens in the bucket
                redis-rate-limiter.requestedTokens: 1     # tokens consumed per request
                key-resolver: "#{@userKeyResolver}"       # Spring SpEL bean reference
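How the three parameters interact is easiest to see in a small in-memory token bucket. The sketch below is a simplified, single-process illustration of the algorithm and not the Redis script the gateway runs: TokenBucket is a hypothetical class, time is passed in explicitly in whole seconds so the behaviour is deterministic, and the bucket is assumed to start full.

```java
// Hypothetical in-memory token bucket, for illustration only.
final class TokenBucket {

    private final long replenishRate;   // tokens added per second
    private final long burstCapacity;   // maximum tokens the bucket can hold
    private long tokens;                // tokens currently available
    private long lastRefillSeconds;     // time of the last refill calculation

    TokenBucket(long replenishRate, long burstCapacity, long nowSeconds) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;    // assumption: the bucket starts full
        this.lastRefillSeconds = nowSeconds;
    }

    // Returns true and deducts tokens if the request fits; false means "reject with 429".
    synchronized boolean tryConsume(long requestedTokens, long nowSeconds) {
        // Refill for the time that has passed, never exceeding the capacity.
        long elapsed = Math.max(0, nowSeconds - lastRefillSeconds);
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefillSeconds = Math.max(lastRefillSeconds, nowSeconds);

        if (tokens < requestedTokens) {
            return false;
        }
        tokens -= requestedTokens;
        return true;
    }
}
```

With replenishRate 20 and burstCapacity 40, a client that has been idle can send 40 requests in one second; after that it is held to 20 per second until it backs off long enough for the bucket to refill.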

The key-resolver bean determines the bucket identity: per user, per IP, or per API key. The configuration below defines a resolver that uses the authenticated user ID forwarded by the JWT filter, plus an IP-based alternative:

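import org.springframework.cloud.gateway.filter.ratelimit.KeyResolver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import reactor.core.publisher.Mono;

import java.net.InetSocketAddress;

@Configuration
public class RateLimitConfig {

    // @Primary: the gateway auto-configuration injects a single default KeyResolver,
    // so with two resolver beans defined, one of them must be marked as the default.
    @Bean
    @Primary
    public KeyResolver userKeyResolver() {
        return exchange -> {
            // X-User-Id is set by JwtAuthFilter after successful verification
            String userId = exchange.getRequest()
                    .getHeaders().getFirst("X-User-Id");
            // All unauthenticated traffic shares a single "anonymous" bucket
            return Mono.just(userId != null ? userId : "anonymous");
        };
    }

    @Bean
    public KeyResolver ipKeyResolver() {
        return exchange -> {
            // getRemoteAddress() may return null, so guard before dereferencing
            InetSocketAddress remote = exchange.getRequest().getRemoteAddress();
            return Mono.just(remote != null && remote.getAddress() != null
                    ? remote.getAddress().getHostAddress()
                    : "unknown");
        };
    }
}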

When the bucket is empty, the gateway returns HTTP 429 Too Many Requests. The Redis rate limiter also adds diagnostic headers to the response, including X-RateLimit-Remaining, X-RateLimit-Replenish-Rate, and X-RateLimit-Burst-Capacity.

In-memory rate limiting does not work in a multi-instance gateway. Each instance would maintain its own bucket, allowing a client to multiply its effective quota by the number of gateway replicas. Redis (or another shared store) is required for correct distributed enforcement. Never skip the Redis dependency in a horizontally scaled deployment.
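The quota multiplication is simple arithmetic, and a short simulation makes it visible. The sketch below is a toy model under stated assumptions: ReplicaQuotaDemo is a hypothetical class, the limiter is a plain fixed quota rather than a token bucket, and a round-robin load balancer is assumed. It compares one limiter per replica against a single limiter shared by all replicas, which is the role Redis plays.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation: local limiter state versus shared limiter state.
final class ReplicaQuotaDemo {

    // Minimal fixed-quota limiter: allows `limit` requests, then rejects.
    static final class Limiter {
        private final int limit;
        private int used;

        Limiter(int limit) { this.limit = limit; }

        boolean tryAcquire() {
            if (used >= limit) {
                return false;
            }
            used++;
            return true;
        }
    }

    // Sends requests round-robin; replica i consults limiterPerReplica.get(i).
    static int accepted(List<Limiter> limiterPerReplica, int requests) {
        int ok = 0;
        for (int i = 0; i < requests; i++) {
            Limiter limiter = limiterPerReplica.get(i % limiterPerReplica.size());
            if (limiter.tryAcquire()) {
                ok++;
            }
        }
        return ok;
    }

    // Each replica keeps its own counter (in-memory limiting).
    static int withLocalState(int replicas, int limit, int requests) {
        List<Limiter> limiters = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            limiters.add(new Limiter(limit));
        }
        return accepted(limiters, requests);
    }

    // All replicas consult the same counter (shared store).
    static int withSharedState(int replicas, int limit, int requests) {
        Limiter shared = new Limiter(limit);
        List<Limiter> limiters = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            limiters.add(shared);
        }
        return accepted(limiters, requests);
    }
}
```

With 3 replicas, a quota of 10, and 60 requests from one client, local state lets 30 requests through while shared state lets through the intended 10.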

Combining Authentication and Rate Limiting

The two filters compose naturally because filter order controls the pipeline. The JWT filter runs first (order -100) and stamps the request with X-User-Id; the rate-limiter filter then uses that header as the bucket key. Rate limits on protected routes are therefore per authenticated identity rather than per IP address. An IP address is a weaker identity: many clients can share one address behind a NAT or proxy, and a single attacker can spread requests across many addresses.
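The dependency between the two filters can be modelled in a few lines. The sketch below uses hypothetical stand-ins (FilterOrderDemo and Step are not Spring Cloud Gateway types), a hard-coded user ID in place of real token verification, and an assumed positive order of 1 for the rate limiter. It sorts the steps by order, lowest first, and records the bucket key the rate limiter ends up with.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for gateway filters; not Spring Cloud Gateway types.
final class FilterOrderDemo {

    interface Step {
        int order();
        void apply(Map<String, String> headers, List<String> log);
    }

    // Models the JWT filter: runs early and stamps the verified identity.
    static Step jwtStep() {
        return new Step() {
            public int order() { return -100; }
            public void apply(Map<String, String> headers, List<String> log) {
                headers.put("X-User-Id", "user-42"); // pretend the token verified
                log.add("jwt: set X-User-Id");
            }
        };
    }

    // Models the rate limiter: reads the identity header to pick a bucket.
    static Step rateLimitStep() {
        return new Step() {
            public int order() { return 1; } // assumed order, for illustration
            public void apply(Map<String, String> headers, List<String> log) {
                log.add("rate-limit: key=" + headers.getOrDefault("X-User-Id", "anonymous"));
            }
        };
    }

    // Runs the steps lowest order first, regardless of registration order.
    static List<String> run(List<Step> steps) {
        List<Step> sorted = new ArrayList<>(steps);
        sorted.sort(Comparator.comparingInt(Step::order));
        Map<String, String> headers = new HashMap<>();
        List<String> log = new ArrayList<>();
        for (Step step : sorted) {
            step.apply(headers, log);
        }
        return log;
    }
}
```

If the rate limiter ran before the JWT step, the header would still be missing and every request would land in the shared "anonymous" bucket.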

A full route configuration that wires both together:

spring:
  cloud:
    gateway:
      default-filters:
        - name: RequestRateLimiter
          args:
            redis-rate-limiter.replenishRate: 50
            redis-rate-limiter.burstCapacity: 100
            key-resolver: "#{@userKeyResolver}"
      routes:
        - id: product-service
          uri: lb://PRODUCT-SERVICE
          predicates:
            - Path=/products/**
          # JWT check is applied globally via JwtAuthFilter @Component
          # default-filters are added to every route rather than replaced, so this
          # route runs the default limiter as well as the route-level limiter below:
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 10
                redis-rate-limiter.burstCapacity: 20
                key-resolver: "#{@userKeyResolver}"

Security Implications and Trade-Offs

Gateway as single point of failure: centralising auth means a misconfigured or crashed gateway takes down the entire API surface. Run multiple replicas and write thorough integration tests for the auth filter.
Clock skew and JWT expiry: JWT expiry is verified against the gateway's system clock, while the exp claim was set using the identity provider's clock. If the two are not synchronised, a freshly issued token can be rejected or an expired one accepted; the same applies to any downstream service that re-verifies the token. Use NTP and add a small clock-skew tolerance (setAllowedClockSkewSeconds(30) in jjwt).
Token revocation: a JWT is self-contained, so once issued it remains valid until it expires unless the verifier consults external state. Keep access-token TTL short (5–15 minutes) and issue refresh tokens separately. For immediate revocation, maintain a deny-list in Redis, keyed for example by the token's jti claim, that the gateway checks on every request.
Rate limit granularity: coarse global limits protect the infrastructure as a whole; fine-grained per-route limits protect individual services from being saturated by a single client. Define both.
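The clock-skew tolerance and the deny-list from the points above combine into a single acceptance check. The sketch below is a minimal in-memory model, not production code: TokenAcceptance is a hypothetical class, a ConcurrentHashMap stands in for Redis, tokens are identified by an assumed jti value, and the current time is passed in so the logic is easy to test.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the gateway's "is this token still acceptable?" decision.
final class TokenAcceptance {

    private final Duration allowedSkew;

    // Token ID (jti) -> that token's expiry. Stands in for a Redis key with a TTL.
    private final Map<String, Instant> denyList = new ConcurrentHashMap<>();

    TokenAcceptance(Duration allowedSkew) {
        this.allowedSkew = allowedSkew;
    }

    // Revoke a token before its natural expiry.
    void revoke(String tokenId, Instant tokenExpiry) {
        denyList.put(tokenId, tokenExpiry);
    }

    boolean isAcceptable(String tokenId, Instant tokenExpiry, Instant now) {
        // Entries for tokens that have expired anyway no longer need to be kept.
        denyList.values().removeIf(expiry -> expiry.plus(allowedSkew).isBefore(now));

        if (denyList.containsKey(tokenId)) {
            return false; // explicitly revoked
        }
        // Accept until expiry plus the tolerance, to absorb small clock differences.
        return !now.isAfter(tokenExpiry.plus(allowedSkew));
    }
}
```

Storing the token's own expiry alongside each deny-list entry keeps the list bounded: with a 15-minute access-token TTL, no entry needs to live longer than about 15 minutes.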

Summary

Enforcing authentication and rate limiting at the gateway keeps downstream services clean, ensures consistent policy across the whole system, and dramatically reduces your attack surface. The pattern is: verify the JWT in a high-priority GlobalFilter, stamp trusted claims as headers, and let a Redis-backed RequestRateLimiter filter enforce quotas per authenticated identity. In the next lesson you will see how the registry, config server, and gateway collaborate as a complete infrastructure layer.