Istio Architecture
Istio is one of the most feature-complete service meshes in production today. Its architecture is deliberately split into two planes, a control plane and a data plane, so that policy decisions and traffic enforcement stay decoupled. Understanding this split is the prerequisite to diagnosing most Istio failures you will face in production.
The Control Plane: istiod
Before Istio 1.5, the control plane was three separate processes: Pilot (xDS config), Citadel (certificate management), and Galley (config validation). In 1.5 these merged into a single binary called istiod. Running as a Deployment in istio-system, istiod handles three responsibilities:
- xDS server (Pilot subsystem): Watches the Kubernetes API server for Services, Endpoints, VirtualServices, DestinationRules, etc. Translates them into xDS responses (xDS is the collective name for Envoy's discovery service APIs) and streams them to the sidecars over long-lived bidirectional gRPC streams. The four main APIs are LDS (listeners), RDS (routes), CDS (clusters), and EDS (endpoints).
- CA / certificate authority (Citadel subsystem): Issues SPIFFE-compliant X.509 certificates to every workload. Each sidecar generates its key pair locally and presents a CSR at startup; istiod signs it and returns the certificate. The private key never leaves the pod. Certificates have a 24-hour lifetime by default and are rotated automatically before they expire.
- Config validation (Galley subsystem): Runs a validating admission webhook that rejects malformed Istio resources before the API server persists them. This is why kubectl apply of a syntactically wrong VirtualService fails immediately rather than silently breaking traffic at runtime.
In large deployments, istiod is typically run with 3 replicas and a PodDisruptionBudget of 1, with an HPA that scales out above 70 % CPU. Because config is pushed to the sidecars and cached there, a momentary istiod blip does not drop live traffic: sidecars keep their last-known config and continue forwarding while istiod recovers.
The Data Plane: Envoy Sidecars
Envoy proxy is the universal data-plane component. Every pod in the mesh gets one injected by a mutating admission webhook, which adds the istio-proxy sidecar container and the istio-init init container. The sidecar intercepts all inbound and outbound TCP traffic using iptables rules written by the init container on pod startup: port 15001 for outbound, port 15006 for inbound.
Key Envoy responsibilities in the mesh:
- mTLS termination / origination: Envoy holds the workload's SPIFFE SVID and handles TLS handshakes transparently. Application code writes plain HTTP; Envoy upgrades to mTLS at the boundary.
- Traffic shaping: VirtualService rules (retries, timeouts, weight-based routing, header matching) are rendered as Envoy route configs pushed via RDS/CDS.
- Telemetry generation: Every Envoy sidecar emits L7 metrics (requests/s, latency, error rate) to the Prometheus scrape endpoint on port 15090, and sends distributed traces via the configured exporter (Zipkin-compatible, OTLP).
- Policy enforcement: AuthorizationPolicies are compiled by istiod into Envoy RBAC filter config and pushed as xDS. The sidecar, not the application, enforces allow/deny at L7.
Figure: Istio Control / Data Plane Architecture
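The weight-based routing mentioned under traffic shaping can be illustrated with a small sketch. This is not Envoy's implementation; it only shows the idea of mapping a random draw onto cumulative weights. The function name pick_destination and the v1/v2 subset names are hypothetical.

```python
import bisect
import itertools
import random


def pick_destination(routes, draw):
    """Map a draw in [0, total_weight) onto a list of (name, weight) pairs.

    Illustrative only: this mimics the idea behind weighted routing,
    not Envoy's actual selection code.
    """
    names = [name for name, _ in routes]
    cumulative = list(itertools.accumulate(weight for _, weight in routes))
    total = cumulative[-1]
    if not 0 <= draw < total:
        raise ValueError(f"draw must be in [0, {total})")
    # bisect_right finds the first bucket whose upper bound exceeds the draw.
    return names[bisect.bisect_right(cumulative, draw)]


# Hypothetical 90/10 canary split between two subsets.
routes = [("v1", 90), ("v2", 10)]
rng = random.Random(0)
sample = [pick_destination(routes, rng.randrange(100)) for _ in range(1000)]
share_v2 = sample.count("v2") / len(sample)  # close to 0.10 on average
```

Draws 0 to 89 land on v1 and draws 90 to 99 land on v2, so over many requests the split approaches the configured weights.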
How xDS Config Flows
When you apply a VirtualService, the chain is:
- kubectl apply hits the API server; the validating admission webhook (Galley subsystem) checks the resource.
- istiod's Pilot subsystem receives the watch event from the API server.
- Pilot recomputes the xDS snapshot for affected services and pushes the update over the open gRPC streams to the connected Envoy sidecars that need it (incrementally, where delta xDS is in use).
- Envoy atomically swaps its listener/route config, acknowledging the push (ACK). If Envoy rejects the config (NACK), istiod logs the error — the sidecar keeps the previous working config.
In a cluster with 500 pods this push completes in under 2 seconds for a typical change. At 5,000 pods, proper istiod horizontal scaling (≥ 3 replicas, with PILOT_PUSH_THROTTLE tuned) keeps it under 10 seconds.
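The ACK/NACK step in the flow above is what keeps a bad push from breaking traffic. The following toy model sketches that behaviour under stated assumptions: the Sidecar class, the push helper, and the validity rule (route weights must sum to 100) are all hypothetical stand-ins, not Istio or Envoy APIs.

```python
from dataclasses import dataclass, field


def is_valid(config):
    # Hypothetical validity rule, used only for this illustration:
    # a config must define route weights that sum to 100.
    weights = config.get("weights", {})
    return bool(weights) and sum(weights.values()) == 100


@dataclass
class Sidecar:
    """Toy stand-in for an Envoy sidecar holding its last accepted config."""
    name: str
    version: int = 0
    config: dict = field(default_factory=dict)

    def apply(self, version, config):
        """Swap in the new config and ACK, or keep the old one and NACK."""
        if not is_valid(config):
            return "NACK"  # the previous working config stays active
        self.version, self.config = version, dict(config)
        return "ACK"


def push(sidecars, version, config):
    """Toy stand-in for a control-plane push; returns each sidecar's reply."""
    return {s.name: s.apply(version, config) for s in sidecars}


checkout = Sidecar("checkout")
first = push([checkout], 1, {"weights": {"v1": 100}})   # accepted
second = push([checkout], 2, {"weights": {"v1": 50}})   # rejected
```

After the second push the sidecar still serves version 1, which mirrors the last-known-good behaviour described in the list.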
Certificate Lifecycle
The Citadel subsystem is a full SPIFFE-compliant CA. At pod startup the istio-agent process inside the istio-proxy container generates an ephemeral key pair, creates a CSR with the pod's SPIFFE URI (spiffe://cluster.local/ns/production/sa/checkout), and sends it to istiod over a TLS-secured gRPC channel, authenticating with the pod's service account token. istiod signs it with the mesh root CA. By default the agent keeps the certificate and key in memory and hands them to Envoy over a local SDS (Secret Discovery Service) socket; the mesh root certificate is mounted under /var/run/secrets/istio. The agent rotates the certificate before expiry.
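As a rough illustration of the identity format and rotation timing described above, the sketch below builds and parses a SPIFFE URI of the shape shown in the text and computes when a renewal would be requested. The helper names and the grace_ratio parameter are assumptions made for this example; they are not Istio's API, and the ratio is not claimed to be Istio's configured value.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse


def spiffe_id(trust_domain, namespace, service_account):
    """Build a workload identity in the spiffe://<td>/ns/<ns>/sa/<sa> shape."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def parse_spiffe_id(uri):
    """Return (trust_domain, namespace, service_account) or raise ValueError."""
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    if (parsed.scheme != "spiffe" or len(parts) != 4
            or parts[0] != "ns" or parts[2] != "sa"):
        raise ValueError(f"not a workload SPIFFE ID: {uri!r}")
    return parsed.netloc, parts[1], parts[3]


def rotation_time(not_before, not_after, grace_ratio):
    """Time to request a new certificate: when grace_ratio of the lifetime remains.

    grace_ratio is an illustrative parameter for this sketch.
    """
    if not 0 < grace_ratio < 1:
        raise ValueError("grace_ratio must be between 0 and 1")
    lifetime = not_after - not_before
    return not_after - lifetime * grace_ratio


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(hours=24)      # 24-hour lifetime from the text
renew_at = rotation_time(issued, expires, grace_ratio=0.5)
identity = spiffe_id("cluster.local", "production", "checkout")
```

With a 24-hour certificate and an assumed ratio of 0.5, the renewal request would go out 12 hours after issuance.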
Production Failure Modes
The most common istiod-related incidents at scale:
- OOM kill of istiod: Each connected Envoy holds an open gRPC stream; 2,000 sidecars ≈ 2–4 GB RSS on istiod. Set memory requests/limits generously (e.g. 2Gi request, 4Gi limit) and watch the pilot_xds_push_time_seconds metric.
- CRD write thundering herd: Deploying 50 VirtualServices simultaneously triggers 50 push cycles. Use istioctl analyze pre-commit and batch your rollouts.
- Version skew: Envoy sidecars on version N-2 relative to istiod are supported, but xDS NACK rates climb. Track pilot_xds_push_errors_total after control-plane upgrades.
- iptables race at pod startup: The init container writes iptables rules before the app starts, but if the sidecar container crashes and is restarted, the iptables rules remain, so traffic is redirected to a port with no listener until the sidecar is back. Consider ambient mode (PILOT_ENABLE_AMBIENT), which eliminates this class of bug entirely, though it is still GA-stabilizing in 1.21+.
PILOT_DEBOUNCE_MAX (10 s default) of stale config. During a long outage, cert expiry can begin causing mTLS failures in high-churn clusters.
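The OOM figure quoted in the list above (2,000 sidecars ≈ 2–4 GB RSS) works out to roughly 1–2 MB per connected sidecar. The sketch below turns that into a back-of-the-envelope sizing helper. The per-sidecar range is derived only from that one figure, assumes linear scaling, and the function name is hypothetical; real usage depends on config size and churn.

```python
def estimate_istiod_rss_gb(sidecars, mb_per_sidecar_low=1.0, mb_per_sidecar_high=2.0):
    """Return a (low, high) RSS estimate in GB for a given sidecar count.

    The 1-2 MB per sidecar defaults are back-derived from the
    "2,000 sidecars is about 2-4 GB" rule of thumb and assume linear scaling.
    """
    if sidecars < 0:
        raise ValueError("sidecars must be non-negative")
    low = sidecars * mb_per_sidecar_low / 1000
    high = sidecars * mb_per_sidecar_high / 1000
    return low, high


low_gb, high_gb = estimate_istiod_rss_gb(2000)
```

Treat the result as a starting point for requests and limits, then adjust against observed memory usage.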