Service Mesh: Istio & Linkerd

Istio Architecture

Istio is one of the most feature-rich service meshes in production use today. Its architecture is deliberately split into two planes, a control plane and a data plane, so that policy decisions and traffic enforcement stay decoupled. Understanding this split is a prerequisite for diagnosing most Istio failures you will face in production.

The Control Plane: istiod

Before Istio 1.5, the control plane ran as several separate components, including Pilot (xDS config), Citadel (certificate management), and Galley (config validation). In 1.5 these merged into a single binary called istiod. Running as a Deployment in istio-system, istiod handles three responsibilities:

  • xDS server (Pilot subsystem): Watches the Kubernetes API server for Services, Endpoints, VirtualServices, DestinationRules, etc. Translates them into xDS responses (xDS is the family of Envoy discovery service APIs) and streams them to every sidecar over long-lived gRPC streams. The four main APIs are LDS (listeners), RDS (routes), CDS (clusters), and EDS (endpoints).
  • CA / certificate authority (Citadel subsystem): Issues SPIFFE-compliant X.509 certificates to every workload. The istio-agent in each sidecar generates a key pair locally and sends a CSR at startup; istiod signs it and returns the certificate. The private key never leaves the pod. Certificates rotate automatically (default 24-hour lifetime, renewed by the agent well before expiry).
  • Config validation (Galley subsystem): Runs a validating admission webhook that rejects malformed Istio resources before the API server persists them. This is why kubectl apply of an invalid VirtualService fails immediately rather than silently breaking traffic at runtime.
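
The certificate rotation described in the CA bullet can be reasoned about with plain date arithmetic. The sketch below is a simplified model, not Istio's implementation: the function names are hypothetical, the 24-hour lifetime comes from the text, and the 12-hour renewal window is an illustrative value rather than a documented default.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(issued_at: datetime, lifetime: timedelta,
                 renew_before: timedelta) -> datetime:
    """Return the moment an agent should request a fresh certificate."""
    if renew_before >= lifetime:
        raise ValueError("renew_before must be shorter than the lifetime")
    return issued_at + lifetime - renew_before


def outage_budget(now: datetime, issued_at: datetime,
                  lifetime: timedelta) -> timedelta:
    """How long the CA can stay unreachable before this certificate expires."""
    return max(issued_at + lifetime - now, timedelta(0))


issued = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
lifetime = timedelta(hours=24)  # lifetime quoted in the text

# Illustrative renewal window (an assumption, not an Istio default).
renew_at = renewal_time(issued, lifetime, renew_before=timedelta(hours=12))

# A certificate that is already 20 hours old leaves a 4-hour budget.
budget = outage_budget(issued + timedelta(hours=20), issued, lifetime)

print("renew at:", renew_at.isoformat())
print("outage budget:", budget)
```

The second function is the useful one during an incident: it bounds how long a control-plane outage can last before a given workload starts failing mTLS handshakes.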

In large production deployments, istiod is typically run with 3 replicas, a PodDisruptionBudget with minAvailable: 1, and an HPA targeting around 70 % CPU. Because sidecars cache the config they were pushed, a momentary istiod blip does not drop live traffic: sidecars keep their last-known config and continue forwarding while istiod recovers.

istiod is not in the hot path. Once a sidecar has received its xDS config and its mTLS certificate, it enforces policy autonomously. istiod only needs to be reachable to push deltas when config or certs change. This is the fundamental resilience property of the control/data-plane split.

The Data Plane: Envoy Sidecars

Envoy proxy is the universal data-plane component. Every pod in the mesh gets one injected by a mutating admission webhook, which adds the istio-proxy sidecar container and an istio-init init container. The sidecar intercepts all inbound and outbound TCP traffic using iptables rules written by istio-init on pod startup: port 15001 for outbound, port 15006 for inbound.

Key Envoy responsibilities in the mesh:

  • mTLS termination / origination: Envoy holds the workload's SPIFFE SVID and handles TLS handshakes transparently. Application code writes plain HTTP; Envoy upgrades to mTLS at the boundary.
  • Traffic shaping: VirtualService rules (retries, timeouts, weight-based routing, header matching) are rendered as Envoy route configs pushed via RDS/CDS.
  • Telemetry generation: Every Envoy sidecar exposes L7 metrics (requests/s, latency, error rate) on a Prometheus scrape endpoint on port 15090, and sends distributed traces via the configured exporter (Zipkin-compatible, OTLP).
  • Policy enforcement: AuthorizationPolicies are compiled by istiod into Envoy RBAC filter config and pushed as xDS. The sidecar, not the application, enforces allow/deny decisions.

    # Enable sidecar injection for a namespace, then verify the label
    kubectl label namespace production istio-injection=enabled
    kubectl get namespace production --show-labels

    # Inspect what Envoy has received from istiod (the live xDS state)
    istioctl proxy-config listeners deploy/checkout -n production
    istioctl proxy-config routes deploy/checkout -n production --name 8080
    istioctl proxy-config clusters deploy/checkout -n production

    # See the cert istiod issued to a pod
    istioctl proxy-config secret deploy/checkout -n production

    # Full config dump (huge; filter with jq). Assumes curl exists in the proxy image.
    kubectl exec -n production deploy/checkout -c istio-proxy -- \
      curl -s http://localhost:15000/config_dump |
      jq '.configs[] | select(.["@type"] | test("RoutesConfigDump"))'
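
To make the policy-enforcement bullet concrete, here is a minimal sketch of the order in which a sidecar evaluates AuthorizationPolicies: DENY policies first, then ALLOW policies, with "allow everything" as the outcome when no ALLOW policy exists. It is a deliberately reduced model: the CUSTOM action is omitted, rule matching is narrowed to a principal plus a path prefix, and the `Policy`, `Request`, and `is_allowed` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    action: str       # "ALLOW" or "DENY"
    principal: str    # exact caller identity to match, or "*" for any caller
    path_prefix: str  # request path prefix to match


@dataclass(frozen=True)
class Request:
    principal: str
    path: str


def matches(policy: Policy, request: Request) -> bool:
    principal_ok = policy.principal in ("*", request.principal)
    return principal_ok and request.path.startswith(policy.path_prefix)


def is_allowed(policies: list, request: Request) -> bool:
    # 1. Any matching DENY policy rejects the request outright.
    if any(p.action == "DENY" and matches(p, request) for p in policies):
        return False
    # 2. With no ALLOW policies at all, the request is allowed.
    allow = [p for p in policies if p.action == "ALLOW"]
    if not allow:
        return True
    # 3. Otherwise at least one ALLOW policy must match.
    return any(matches(p, request) for p in allow)


CHECKOUT = "cluster.local/ns/production/sa/checkout"    # illustrative identities
INVENTORY = "cluster.local/ns/production/sa/inventory"

policies = [
    Policy("ALLOW", CHECKOUT, "/"),
    Policy("DENY", "*", "/admin"),
]

print(is_allowed(policies, Request(CHECKOUT, "/pay")))          # allowed by ALLOW
print(is_allowed(policies, Request(CHECKOUT, "/admin/users")))  # blocked by DENY
print(is_allowed(policies, Request(INVENTORY, "/pay")))         # no ALLOW matches
```

The practical consequence is that adding the first ALLOW policy to a workload flips it from default-allow to default-deny for every caller the policy does not name.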

Istio Control / Data Plane Architecture

[Diagram: in the control plane (istio-system), istiod (Pilot for xDS, Citadel as CA, Galley validation webhook) watches the Kubernetes API server for CRDs (VS / DR / PA) and pushes xDS to the data plane. The data plane shows three pods, checkout (app :8080), inventory (app :3000), and payment (app :9000), each with an Envoy sidecar and an iptables redirect, exchanging mTLS traffic.]
Istio control plane (istiod) pushes xDS config to every Envoy sidecar; data-plane traffic flows mTLS between sidecars without touching istiod.

How xDS Config Flows

When you apply a VirtualService, the chain is:

  1. kubectl apply hits the API server, which calls istiod's validating admission webhook (the Galley subsystem) to validate the resource.
  2. istiod's Pilot subsystem receives the watch event from the API server.
  3. Pilot recomputes the xDS config for the affected services and pushes updates (incremental where delta xDS is in use) to the affected Envoy sidecars over their open gRPC streams.
  4. Envoy atomically swaps its listener/route config, acknowledging the push (ACK). If Envoy rejects the config (NACK), istiod logs the error — the sidecar keeps the previous working config.
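
The ACK/NACK behaviour in the last step can be sketched as a toy model. This is not Envoy's or istiod's code: the `Sidecar` class, the `push` helper, and the validation rule (every route must name a non-empty cluster) are hypothetical stand-ins chosen to show that a rejected push leaves the previous config in place.

```python
from dataclasses import dataclass, field


@dataclass
class Sidecar:
    """Toy stand-in for an Envoy sidecar that only tracks a route table."""
    name: str
    version: str = ""
    routes: dict = field(default_factory=dict)

    def receive(self, version: str, routes: dict) -> str:
        # Hypothetical validation rule: every route needs a cluster name.
        if any(not cluster for cluster in routes.values()):
            return "NACK"  # reject; version and routes stay untouched
        # Replace the whole table instead of patching entries in place.
        self.version, self.routes = version, dict(routes)
        return "ACK"


def push(sidecars: list, version: str, routes: dict) -> dict:
    """Push one config version to every sidecar and collect the replies."""
    results = {}
    for sidecar in sidecars:
        results[sidecar.name] = sidecar.receive(version, routes)
        if results[sidecar.name] == "NACK":
            print(f"control plane log: {sidecar.name} rejected {version}")
    return results


checkout = Sidecar("checkout")
first = push([checkout], "v1", {"/cart": "cart-v1"})
second = push([checkout], "v2", {"/cart": ""})  # invalid: empty cluster name

print(first, second)
print("still serving:", checkout.version, checkout.routes)
```

After the second push the sidecar still reports v1, which is why a bad config shows up as a NACK in the control-plane logs rather than as an outage.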

In a cluster with 500 pods, this push typically completes in under 2 seconds for a routine change. At 5,000 pods, scaling istiod horizontally (≥ 3 replicas) and tuning PILOT_PUSH_THROTTLE typically keeps it under 10 seconds.

    # Check xDS sync status: any sidecars out of sync?
    istioctl proxy-status

    # Sample output columns:
    # NAME                   CLUSTER  CDS     LDS     EDS     RDS     ECDS  ISTIOD      VERSION
    # checkout-6d8b9f-xtz2p  ...      SYNCED  SYNCED  SYNCED  SYNCED  ...   istiod-xxx  1.20.3
    # payment-7c9b4f-abc1q   ...      STALE   SYNCED  SYNCED  SYNCED  ...   istiod-xxx  1.20.3
    #                                 ^-- CDS stale means Envoy hasn't ACK'd the cluster update yet

    # Tune push throttle for large clusters (env var on the istiod Deployment)
    kubectl set env deployment/istiod -n istio-system PILOT_PUSH_THROTTLE=200
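
For a fleet-wide check, the proxy-status table can be filtered with a few lines of Python. This is a rough sketch that assumes the whitespace-separated, single-token layout of the sample output above (elided fields are kept as "..."); real output may contain multi-word statuses, so treat the parsing as illustrative. The `out_of_sync` helper is a hypothetical name.

```python
SAMPLE = """\
NAME CLUSTER CDS LDS EDS RDS ECDS ISTIOD VERSION
checkout-6d8b9f-xtz2p ... SYNCED SYNCED SYNCED SYNCED ... istiod-xxx 1.20.3
payment-7c9b4f-abc1q ... STALE SYNCED SYNCED SYNCED ... istiod-xxx 1.20.3
"""

XDS_COLUMNS = ("CDS", "LDS", "EDS", "RDS", "ECDS")


def out_of_sync(table: str) -> dict:
    """Map each proxy name to the xDS types it reports as STALE."""
    lines = [line for line in table.splitlines() if line.strip()]
    header = lines[0].split()
    stale_by_proxy = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        stale = [col for col in XDS_COLUMNS if row.get(col) == "STALE"]
        if stale:
            stale_by_proxy[row["NAME"]] = stale
    return stale_by_proxy


report = out_of_sync(SAMPLE)
print(report)
```

In practice the input would come from capturing the command's output; a non-empty result that persists across several runs is the signal worth alerting on, since a brief STALE during a push is normal.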

Certificate Lifecycle

The Citadel subsystem is a full SPIFFE-compliant CA. At pod startup the istio-agent process inside the istio-proxy container generates an ephemeral key pair, creates a CSR with the pod's SPIFFE URI (spiffe://cluster.local/ns/production/sa/checkout), and sends it to istiod over a TLS-secured gRPC channel, authenticating with the pod's Kubernetes service account token. istiod signs it with the mesh CA key. By default the agent holds the key and certificate in memory, serves them to Envoy over SDS on a local Unix domain socket, and rotates them before expiry.
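
The SPIFFE URI in the paragraph above follows a fixed trust-domain / namespace / service-account layout, which makes it easy to build and parse. The helpers below are a small sketch with hypothetical function names; they only handle the `ns/<namespace>/sa/<service-account>` shape shown in the text.

```python
from urllib.parse import urlparse


def spiffe_id(trust_domain: str, namespace: str, service_account: str) -> str:
    """Build a workload identity in the layout used in the text."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def parse_spiffe_id(uri: str) -> dict:
    """Split a SPIFFE URI into trust domain, namespace, and service account."""
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    well_formed = (
        parsed.scheme == "spiffe"
        and parsed.netloc != ""
        and len(parts) == 4
        and parts[0] == "ns"
        and parts[2] == "sa"
        and all(parts)
    )
    if not well_formed:
        raise ValueError(f"not a workload SPIFFE ID: {uri!r}")
    return {
        "trust_domain": parsed.netloc,
        "namespace": parts[1],
        "service_account": parts[3],
    }


uri = spiffe_id("cluster.local", "production", "checkout")
identity = parse_spiffe_id(uri)
print(uri)
print(identity)
```

Because the identity is derived from the namespace and service account rather than from the pod, every replica of a workload presents the same SPIFFE ID, which is what AuthorizationPolicy principals match against.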

Rotate the mesh root CA without downtime by staging the change through the trust bundle: add the new CA certificate to the trust bundle, let it propagate, re-sign workload certs, then remove the old CA. Never swap the root CA atomically: that breaks mTLS between pods that still trust only the old root and pods that already hold certificates issued under the new one.
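
Why the staged rotation works and the atomic swap does not can be shown with a small model. This sketch reduces mTLS to one rule, each side's issuer must appear in the other side's trust bundle, and ignores certificate chains, expiry, and SANs. The `Workload` type and the CA labels are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Workload:
    name: str
    issuer: str              # CA that signed this workload's certificate
    trust_bundle: frozenset  # CAs this workload accepts from peers


def can_handshake(a: Workload, b: Workload) -> bool:
    """Mutual TLS succeeds only if each side trusts the other's issuer."""
    return a.issuer in b.trust_bundle and b.issuer in a.trust_bundle


def broken_pairs(workloads: list) -> list:
    return [
        (a.name, b.name)
        for a, b in combinations(workloads, 2)
        if not can_handshake(a, b)
    ]


# Atomic swap, caught mid-rollout: checkout already moved to the new CA,
# payment has not restarted yet and still trusts only the old one.
atomic = [
    Workload("checkout", "new-ca", frozenset({"new-ca"})),
    Workload("payment", "old-ca", frozenset({"old-ca"})),
]

# Staged rotation: both CAs are in every bundle before any cert is re-signed.
both = frozenset({"old-ca", "new-ca"})
staged = [
    Workload("checkout", "new-ca", both),
    Workload("payment", "old-ca", both),
]

print("atomic swap:", broken_pairs(atomic))
print("staged rotation:", broken_pairs(staged))
```

The old CA can be dropped from the bundle only once no workload in the model still has it as an issuer, which mirrors the final step of the rotation procedure.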

Production Failure Modes

The most common istiod-related incidents at scale:

  • OOM kill of istiod: Each connected Envoy holds an open gRPC stream; 2,000 sidecars ≈ 2–4 GB RSS on istiod. Set memory requests/limits generously (e.g. 2Gi request, 4Gi limit) and watch istiod's memory usage alongside the pilot_xds_push_time histogram.
  • CRD write thundering herd: Deploying 50 VirtualServices in quick succession can trigger a burst of push cycles. Run istioctl analyze before committing and batch your rollouts.
  • Version skew: Envoy sidecars on version N-2 relative to istiod are supported, but xDS NACK rates climb. Track pilot_total_xds_rejects after control-plane upgrades.
  • Stale iptables rules after a sidecar crash: The init container writes iptables rules before the app starts. If the sidecar container later crashes, the rules remain, so traffic is redirected to a port nobody is listening on until the sidecar restarts. Ambient mode, which removes per-pod sidecars, avoids this class of bug (beta in Istio 1.22, GA in 1.24).

Never run istiod with a single replica in production. A rolling restart of a singleton istiod causes every sidecar to lose its xDS stream simultaneously. Sidecars keep serving their last-known config while they reconnect, but they receive no route, endpoint, or certificate updates in the meantime. During a long outage, stale endpoints and cert expiry can begin causing failures in high-churn clusters.

    # Confirm istiod replica count and resource usage
    kubectl get hpa -n istio-system
    kubectl top pod -n istio-system -l app=istiod

    # Key Prometheus metrics to alert on
    # histogram_quantile(0.99, sum by (le) (rate(pilot_xds_push_time_bucket[5m])))  -- p99 push latency; alert > 5s
    # pilot_total_xds_rejects                      -- NACK count; alert if it keeps increasing
    # citadel_server_csr_sign_err_count            -- cert signing failures
    # istio_requests_total{response_code=~"5.."}   -- per-service 5xx reported by sidecars

    # Check mesh version parity
    istioctl version
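
The version-parity check can be automated once the versions are collected. The sketch below flags proxies that fall outside the N-2 window mentioned under "Version skew"; the `max_skew=2` default mirrors that figure from the text, while the function names, the proxy names, and the version numbers are illustrative assumptions.

```python
def major_minor(version: str) -> tuple:
    """Return (major, minor) from a version string such as '1.20.3'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def out_of_window(istiod_version: str, proxies: dict, max_skew: int = 2) -> dict:
    """Return the proxies whose minor version is outside the supported window.

    A proxy is flagged if it is on a different major version, is newer than
    the control plane, or trails it by more than max_skew minor versions.
    """
    cp_major, cp_minor = major_minor(istiod_version)
    flagged = {}
    for name, version in proxies.items():
        major, minor = major_minor(version)
        gap = cp_minor - minor
        if major != cp_major or gap < 0 or gap > max_skew:
            flagged[name] = version
    return flagged


# Illustrative fleet: names and versions are made up for the example.
fleet = {
    "checkout": "1.20.3",
    "payment": "1.18.5",
    "inventory": "1.17.2",
}

flagged = out_of_window("1.20.3", fleet)
print(flagged)
```

Running a check like this right after a control-plane upgrade gives a list of workloads to restart before NACK rates start to climb.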