ReplicaSets & Deployments
Running a single Pod works for experimentation, but it is not production-grade. If the node hosting a bare Pod crashes, nothing recreates it: the Pod is gone and so is your service. The answer is desired state: you tell Kubernetes how many replicas you want and it continuously makes reality match that declaration. ReplicaSets implement this guarantee; Deployments wrap ReplicaSets with safe rollout and rollback mechanics. Together they are how most stateless workloads (APIs, web servers, batch workers) run at scale in production.
Desired State and the Control Loop
Kubernetes is a level-triggered system. You write a spec that declares what you want (desired state). A controller watches the cluster and acts whenever actual state diverges from desired state. The ReplicaSet controller does exactly one thing: it reconciles spec.replicas (desired) against the number of running Pods whose labels match spec.selector (actual). Too few → create Pods. Too many → delete Pods. The spec is stored in etcd; it survives node crashes, restarts, and network partitions.
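The reconciliation described above can be sketched in a few lines of Python. This is a toy model, not the real controller: the Pod and ReplicaSetModel classes, the pod-N naming, and the in-memory list standing in for the cluster are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import count

_pod_ids = count()  # hypothetical name generator for new Pods


@dataclass
class Pod:
    name: str
    labels: dict
    running: bool = True


@dataclass
class ReplicaSetModel:
    """Hypothetical stand-in for a ReplicaSet: a desired count plus a label selector."""
    replicas: int
    selector: dict

    def matches(self, pod: Pod) -> bool:
        # A Pod counts only if it carries every label named in the selector.
        return all(pod.labels.get(k) == v for k, v in self.selector.items())


def reconcile(rs: ReplicaSetModel, pods: list) -> list:
    """One pass of the loop: compare desired with actual, then create or delete."""
    owned = [p for p in pods if p.running and rs.matches(p)]
    others = [p for p in pods if p.running and not rs.matches(p)]
    diff = rs.replicas - len(owned)
    if diff > 0:
        # Too few: create Pods (the "template" here is just the selector labels).
        owned += [Pod(f"pod-{next(_pod_ids)}", dict(rs.selector)) for _ in range(diff)]
    elif diff < 0:
        # Too many: delete the surplus.
        owned = owned[: rs.replicas]
    return owned + others


rs = ReplicaSetModel(replicas=3, selector={"app": "api-server"})
pods = reconcile(rs, [])      # empty cluster: three Pods are created
pods[0].running = False       # simulate a node crash taking one Pod down
pods = reconcile(rs, pods)    # the next pass creates a replacement
print(sorted(p.name for p in pods))
```

Because each pass looks only at the current difference between desired and actual, running it again after any failure converges to the same result, which is the practical meaning of "level-triggered".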
Anatomy of a Deployment
A Deployment manifest has three main sections: metadata, a selector, and a Pod template. The Deployment controller creates a ReplicaSet whose Pod template matches the one you specified, then ensures the correct number of Pods are running from that template. Here is a production-realistic manifest:
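As a rough sketch of such a manifest, the Python snippet below builds one as a dictionary and prints it as JSON, which kubectl apply -f accepts alongside YAML. Every concrete value (the api-server name, image, port, probe path, and resource figures) is an assumption chosen for illustration, not taken from a real service.

```python
import json

# Stable label used by the selector; mutable labels go only on the Pod template.
APP_LABELS = {"app": "api-server"}

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "api-server", "labels": dict(APP_LABELS)},
    "spec": {
        "replicas": 3,
        "revisionHistoryLimit": 10,
        "minReadySeconds": 10,
        "selector": {"matchLabels": dict(APP_LABELS)},
        "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
        },
        "template": {
            "metadata": {"labels": {**APP_LABELS, "version": "1.4.2"}},
            "spec": {
                "containers": [{
                    "name": "api",
                    "image": "registry.example.com/api-server:1.4.2",  # hypothetical image
                    "ports": [{"containerPort": 8080}],
                    "resources": {
                        "requests": {"cpu": "250m", "memory": "256Mi"},
                        "limits": {"cpu": "500m", "memory": "512Mi"},
                    },
                    "readinessProbe": {
                        "httpGet": {"path": "/healthz/ready", "port": 8080},  # hypothetical path
                        "initialDelaySeconds": 5,
                        "periodSeconds": 5,
                    },
                }],
            },
        },
    },
}


def selector_matches_template(manifest: dict) -> bool:
    """Check that every selector label also appears on the Pod template."""
    selector = manifest["spec"]["selector"]["matchLabels"]
    labels = manifest["spec"]["template"]["metadata"]["labels"]
    return all(labels.get(k) == v for k, v in selector.items())


assert selector_matches_template(deployment)
print(json.dumps(deployment, indent=2))
```

The selector_matches_template check mirrors a rule the API server enforces: a Deployment whose selector does not match its own template labels is rejected.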
The spec.selector field is immutable after a Deployment is created. If you need to change the selector labels, you must delete and recreate the Deployment. In practice, big-tech teams use a stable label key like app: api-server in the selector and add mutable metadata like version only to the Pod template labels (not to the selector). If you do change the selector, kubectl apply is rejected with a validation error.
Rolling Updates: How Kubernetes Replaces Pods Safely
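When you change anything in spec.template (a new image tag, an environment variable, a resource limit), Kubernetes creates a new ReplicaSet for the new Pod template and orchestrates a handoff between the old and new ReplicaSets. The two key levers are maxSurge, the number of Pods the rollout may run above spec.replicas, and maxUnavailable, the number of Pods that may be unavailable at any moment during the rollout. Each accepts an absolute number or a percentage of spec.replicas, and each defaults to 25%.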
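To make the two settings concrete, the sketch below computes the Pod-count bounds they imply. The function names are hypothetical; the rounding rule (percentages round up for maxSurge and down for maxUnavailable) follows the documented Deployment behaviour.

```python
def resolve(value, replicas: int, round_up: bool) -> int:
    """Turn an int or a percentage string such as "25%" into a Pod count."""
    if isinstance(value, int):
        return value
    percent = int(value.rstrip("%"))
    if round_up:
        return -(-percent * replicas // 100)  # integer ceiling
    return percent * replicas // 100          # integer floor


def rollout_bounds(replicas: int, max_surge="25%", max_unavailable="25%") -> dict:
    """Bounds that hold throughout a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,             # old + new Pods never exceed this
        "min_available_pods": replicas - unavailable,   # available Pods never drop below this
    }


print(rollout_bounds(10))                                  # defaults: 25% / 25%
print(rollout_bounds(3, max_surge=1, max_unavailable=0))   # conservative settings
```

With 10 replicas and the defaults, the rollout may run up to 13 Pods and must keep at least 8 available. With maxSurge: 1 and maxUnavailable: 0 on 3 replicas, it runs at most 4 Pods and never drops below 3 available, at the cost of a slower rollout.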
The readiness probe is the gatekeeper. Kubernetes considers a Pod "available" only after its readiness probe passes and the Pod has been Ready for at least minReadySeconds. This means a buggy deployment that crashes on startup will stall — the new ReplicaSet never becomes available, maxUnavailable: 0 prevents the old Pods from being terminated, and you have time to notice and roll back before any traffic is lost.
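The stall behaviour can be illustrated with a small simulation. It is a deliberately simplified model, not the real Deployment controller: new Pods either all become ready within one step or never do, and minReadySeconds is ignored. The function name and the tuple layout are assumptions for this sketch.

```python
def rolling_update(replicas, max_surge, max_unavailable, new_pods_become_ready, max_steps=50):
    """Return a list of (old_ready, new_ready, new_pending) states, one per step."""
    old_ready, new_ready, new_pending = replicas, 0, 0
    history = []
    for _ in range(max_steps):
        # Scale up the new ReplicaSet, limited by the surge budget.
        total = old_ready + new_ready + new_pending
        still_needed = replicas - new_ready - new_pending
        new_pending += max(min(replicas + max_surge - total, still_needed), 0)

        # Readiness gate: pending Pods count as available only once ready.
        if new_pods_become_ready:
            new_ready += new_pending
            new_pending = 0

        # Scale down the old ReplicaSet, limited by the availability floor.
        available = old_ready + new_ready
        removable = available - (replicas - max_unavailable)
        old_ready -= max(min(old_ready, removable), 0)

        state = (old_ready, new_ready, new_pending)
        if history and history[-1] == state:
            break  # nothing changed: the rollout has stalled
        history.append(state)
        if old_ready == 0 and new_ready == replicas:
            break  # rollout complete
    return history


healthy = rolling_update(3, max_surge=1, max_unavailable=0, new_pods_become_ready=True)
broken = rolling_update(3, max_surge=1, max_unavailable=0, new_pods_become_ready=False)
print("healthy:", healthy)
print("broken: ", broken)
```

In the broken run the state never moves past three old Pods serving traffic and one new Pod waiting on its readiness probe, which is the window in which you can notice the problem and roll back.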
Performing and Watching a Rollout
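A typical flow is to change the image, then watch the rollout until it completes or times out. The sketch below only builds the kubectl argument lists and prints them, so it can be read without a cluster. The deployment name, container name, image, and namespace are hypothetical; the subcommands used (set image, rollout status, rollout history, rollout undo) are standard kubectl commands.

```python
import shlex


def set_image_cmd(deployment: str, container: str, image: str, namespace: str = "default") -> list:
    """Change one container's image, which triggers a new rollout."""
    return ["kubectl", "-n", namespace, "set", "image",
            f"deployment/{deployment}", f"{container}={image}"]


def rollout_status_cmd(deployment: str, namespace: str = "default", timeout: str = "120s") -> list:
    """Block until the rollout finishes or the timeout expires."""
    return ["kubectl", "-n", namespace, "rollout", "status",
            f"deployment/{deployment}", f"--timeout={timeout}"]


def rollout_history_cmd(deployment: str, namespace: str = "default") -> list:
    """List the revisions that are still retained."""
    return ["kubectl", "-n", namespace, "rollout", "history", f"deployment/{deployment}"]


def rollout_undo_cmd(deployment: str, namespace: str = "default") -> list:
    """Roll back to the previous revision."""
    return ["kubectl", "-n", namespace, "rollout", "undo", f"deployment/{deployment}"]


commands = [
    set_image_cmd("api-server", "api", "registry.example.com/api-server:1.4.3"),
    rollout_status_cmd("api-server"),
    rollout_history_cmd("api-server"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

To execute a command rather than print it, pass the list to subprocess.run(cmd, check=True); a non-zero exit from rollout status is a reasonable signal for a pipeline to run the undo command.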
Rollbacks: Undoing a Bad Deploy
Kubernetes keeps a configurable number of old ReplicaSets around after a rollout completes — controlled by spec.revisionHistoryLimit (default 10). Rolling back restores the previous ReplicaSet's Pod template and scales it back up, repeating the rolling update process in reverse. At Google and similar companies, rollback is a normal operational action, not a crisis — the system is designed to make it fast and safe.
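The bookkeeping behind revisions can be modelled in a few lines. This is a simplified sketch under stated assumptions: the RevisionHistory class is hypothetical, and a Pod template is reduced to a single image tag. It reflects two behaviours of real Deployments: a rollback re-activates the earlier ReplicaSet under a new, higher revision number, and old revisions beyond the history limit are pruned.

```python
class RevisionHistory:
    """Toy model of a Deployment's retained revisions."""

    def __init__(self, limit: int = 10):
        self.limit = limit      # mirrors spec.revisionHistoryLimit
        self.revisions = {}     # revision number -> Pod template (here: an image tag)
        self.current = 0

    def deploy(self, template: str) -> None:
        # Every rollout gets the next revision number.
        self.current = max(self.revisions, default=0) + 1
        self.revisions[self.current] = template
        self._prune()

    def undo(self) -> None:
        old = [r for r in self.revisions if r != self.current]
        if not old:
            raise RuntimeError("no previous revision to roll back to")
        # The previous revision is removed and re-recorded as the newest one.
        template = self.revisions.pop(max(old))
        self.deploy(template)

    def _prune(self) -> None:
        # Keep at most `limit` old revisions besides the current one.
        old = sorted(r for r in self.revisions if r != self.current)
        for r in old[: max(len(old) - self.limit, 0)]:
            del self.revisions[r]


history = RevisionHistory(limit=2)
for tag in ("v1", "v2", "v3"):
    history.deploy(tag)
history.undo()  # roll back from v3 to v2
print(history.current, history.revisions)
```

After the undo, the template from revision 2 is live again as revision 4, and revision 2 no longer appears in the history.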
Old ReplicaSets that are kept for rollback, scaled to zero, still appear in kubectl get replicaset output. High-frequency deploy pipelines (multiple deploys per day) should set spec.revisionHistoryLimit: 3. Keep enough history to cover your typical rollback window: if you release every hour and your MTTR is 30 minutes, 2 revisions are sufficient.
Deployment Strategies Beyond RollingUpdate
Kubernetes natively supports two strategies set via spec.strategy.type:
- RollingUpdate (default): incrementally replaces old Pods with new ones. Zero-downtime when configured correctly (maxUnavailable: 0). Both old and new code run concurrently during the transition, so your API must handle this: backward-compatible schema migrations, no breaking changes within the rollout window.
- Recreate: kills all old Pods before creating new ones. Causes a brief downtime window. Use only for workloads that cannot run two versions simultaneously (e.g. single-writer database tools, legacy apps with exclusive file locks).
More sophisticated strategies — canary (route a small percentage of traffic to the new version) and blue/green (maintain two full environments, flip the Service selector) — are layered on top using multiple Deployments with different label selectors, traffic-splitting Ingress controllers, or a service mesh like Istio. These are covered in the Networking lesson.
Scaling Deployments
Scaling takes effect quickly: Kubernetes updates spec.replicas in etcd and the ReplicaSet controller creates or deletes Pods to match, although new Pods still need to be scheduled, pull their image, and pass their readiness probe before they serve traffic. Manual scaling is useful for incident response; the Horizontal Pod Autoscaler (HPA) automates it based on CPU, memory, or custom metrics and is covered in a later tutorial.
Teams sometimes set spec.strategy.type: Recreate on high-traffic Deployments because it is simpler to reason about. During a deploy, Kubernetes terminates all Pods before starting new ones, so you get a hard outage at least as long as your container startup time (often 15-60 seconds for JVM or Python apps). Always use RollingUpdate with maxUnavailable: 0 for user-facing services, and invest in fast startup times so your readiness probe initial delay can be under 5 seconds.
The readinessProbe is Non-Negotiable
A rolling update without a readiness probe is dangerous. Without it, Kubernetes adds a new Pod to the Service endpoint list the moment it starts — before your app has finished initializing, running migrations, or warming caches. The first real user request hits an unready Pod and fails. Always configure readinessProbe on every container in every Deployment. The probe should test actual business readiness (database connectivity, cache warm-up, dependent service reachable), not just whether the process is alive.
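One way to express "business readiness" is a handler that runs a set of dependency checks and maps the result to an HTTP status for an httpGet readiness probe, which treats any status from 200 to 399 as success. The sketch below is framework-agnostic, and the check names and the readiness function are hypothetical.

```python
from http import HTTPStatus


def readiness(checks: dict) -> tuple:
    """Run each named check and return (status_code, body) for a probe endpoint.

    `checks` maps a name to a zero-argument callable that returns True when
    that dependency is ready. A check that raises counts as not ready.
    """
    failing = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failing.append(name)
    status = HTTPStatus.SERVICE_UNAVAILABLE if failing else HTTPStatus.OK
    return int(status), {"ready": not failing, "failing": failing}


def unreachable():
    raise ConnectionError("dependency not reachable")


# Hypothetical checks: replace the lambdas with real connectivity tests.
print(readiness({"database": lambda: True, "cache_warm": lambda: True}))
print(readiness({"database": lambda: True, "cache_warm": lambda: False}))
print(readiness({"database": unreachable}))
```

Keep each check fast and bounded by a timeout: the probe runs every few seconds on every Pod, and a slow check can itself cause Pods to be marked unready.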