Kubernetes Workloads & Configuration

Project: A Production-Grade Workload

Every concept from this tutorial — ConfigMaps, Secrets, liveness/readiness/startup probes, resource requests and limits, and Horizontal Pod Autoscaling — exists to solve a concrete production problem. In isolation each piece is easy to understand; in combination, at scale, the interactions are what trip teams up. This project wires all of them together into a single, deployable, battle-ready workload that you could ship to a real cluster today.

We will build a stateless API service called order-api. It reads non-sensitive runtime configuration from a ConfigMap, receives database credentials from a Secret as environment variables, exposes HTTP health endpoints consumed by Kubernetes probes, declares CPU/memory budgets that the scheduler and kernel enforce, and scales horizontally under load. Every decision below reflects common practice on teams that run large clusters.

Step 1 — Namespace and Supporting Objects

Always isolate workloads in their own namespace. This gives you RBAC boundaries, resource quotas, and clean kubectl get all -n orders output.

# Namespace
kubectl create namespace orders

# ConfigMap: twelve-factor config, not baked into the image
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: order-api-config
  namespace: orders
data:
  APP_ENV: "production"
  LOG_LEVEL: "warn"
  DB_HOST: "postgres-svc.orders.svc.cluster.local"
  DB_PORT: "5432"
  DB_NAME: "orders"
  MAX_CONN_POOL: "20"
  CACHE_TTL_SECONDS: "300"
EOF

# Secret: stringData accepts plain text; the API server stores it base64-encoded
# (encoded, not encrypted). Never commit real values to git.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: order-api-secret
  namespace: orders
type: Opaque
stringData:
  DB_USER: "order_svc"
  DB_PASSWORD: "s3cr3t-rotate-me"
  JWT_SIGNING_KEY: "HS256-production-key-min-32-chars"
EOF

In real clusters, Secrets usually come from an external secret store (AWS Secrets Manager via ASCP, or HashiCorp Vault via the Vault Agent Injector) or are kept encrypted in Git with a tool such as Sealed Secrets. The stringData shortcut shown here is fine for bootstrapping, but rotate credentials immediately after first deploy and integrate a secret-sync controller before you call the service production-ready.
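
Step 1 names resource quotas as one benefit of a dedicated namespace. A minimal sketch of what that could look like for the orders namespace follows. The object names and every number are assumptions, sized loosely so that 20 pods plus one surge pod fit with headroom; they are not values from this project. The LimitRange is included because, once a quota covers requests and limits, pods that declare none (such as the busybox load generator in Step 4) are rejected unless defaults exist.

```yaml
# ResourceQuota: caps the total resources the orders namespace may claim.
# All values are illustrative assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: orders-quota            # hypothetical name
  namespace: orders
spec:
  hard:
    pods: "25"
    requests.cpu: "7"
    requests.memory: 7Gi
    limits.cpu: "25"
    limits.memory: 13Gi
---
# LimitRange: default requests/limits for containers that declare none,
# so ad-hoc pods are still admitted under the quota above.
apiVersion: v1
kind: LimitRange
metadata:
  name: orders-defaults         # hypothetical name
  namespace: orders
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: 50m
        memory: 64Mi
      default:
        cpu: 200m
        memory: 128Mi
```

If you adopt something like this, revisit the quota whenever you raise the HPA's maxReplicas; a quota that is too tight shows up as pods that are never created rather than as pods stuck in Pending.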

Step 2 — The Deployment Manifest

This is the core of the project. Read through every annotated field — each one addresses a documented production failure mode.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-api
  namespace: orders
  labels:
    app: order-api
    version: "1.0.0"
spec:
  replicas: 2                  # HPA will override this at runtime
  selector:
    matchLabels:
      app: order-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1              # one extra pod during rollout
      maxUnavailable: 0        # zero downtime: never kill before replacement is Ready
  template:
    metadata:
      labels:
        app: order-api
        version: "1.0.0"
    spec:
      terminationGracePeriodSeconds: 30   # time to drain in-flight requests
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
      containers:
        - name: api
          image: registry.example.com/order-api:1.0.0
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: http
          # ── Config from ConfigMap (envFrom = all keys at once) ───────────
          envFrom:
            - configMapRef:
                name: order-api-config
          # ── Secrets injected as individual env vars ──────────────────────
          env:
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: order-api-secret
                  key: DB_USER
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: order-api-secret
                  key: DB_PASSWORD
            - name: JWT_SIGNING_KEY
              valueFrom:
                secretKeyRef:
                  name: order-api-secret
                  key: JWT_SIGNING_KEY
          # ── Resource budget ───────────────────────────────────────────────
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "1000m"     # 1 vCPU: throttled, not killed
              memory: "512Mi"  # OOM-killed if exceeded, so size carefully
          # ── Startup probe: give the JVM / migration time to finish ───────
          startupProbe:
            httpGet:
              path: /healthz/startup
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 24   # 24 * 5s = 2 minutes max startup window
            successThreshold: 1
          # ── Liveness probe: restart the container if it deadlocks ────────
          livenessProbe:
            httpGet:
              path: /healthz/live
              port: 8080
            initialDelaySeconds: 0   # startup probe gates this; 0 is safe here
            periodSeconds: 15
            timeoutSeconds: 3
            failureThreshold: 3      # 3 * 15s = 45 s before restart
          # ── Readiness probe: gate traffic until DB conn pool is warm ─────
          readinessProbe:
            httpGet:
              path: /healthz/ready
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 3      # remove from Service endpoints after 15 s
      # Spread pods across failure domains (requires Kubernetes 1.19+)
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: order-api

A common production incident this manifest is written to avoid: with maxUnavailable: 1 and only replicas: 2, a rolling update may terminate one old pod before its replacement is Ready. For the length of the new pod's startup, a single instance carries all traffic, and if that instance fails or is overloaded there is nothing behind it. Always set maxUnavailable: 0 on low-replica deployments; the cost is one extra pod slot during rollout.

Step 3 — Service and HPA

# ClusterIP Service: stable DNS for in-cluster consumers
apiVersion: v1
kind: Service
metadata:
  name: order-api-svc
  namespace: orders
spec:
  selector:
    app: order-api
  ports:
    - port: 80
      targetPort: 8080
      name: http
  type: ClusterIP
---
# Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-api-hpa
  namespace: orders
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 min before shrinking
      policies:
        - type: Percent
          value: 25
          periodSeconds: 60             # remove at most 25% per minute
    scaleUp:
      stabilizationWindowSeconds: 0     # scale up immediately
      policies:
        - type: Pods
          value: 4
          periodSeconds: 60             # add at most 4 pods per minute

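Setting scaleDown.stabilizationWindowSeconds: 300 explicitly is deliberate. 300 seconds is also the controller's default, but pinning it in the manifest documents the intent and keeps the behavior stable if the cluster-wide default is changed. With a short window, the HPA can shrink from 20 pods toward 2 as soon as a traffic spike subsides, only for the next spike to arrive before replacement pods are warm. This scale-down/scale-up oscillation is commonly called flapping or thrashing, and it hurts p99 latency. The 5-minute window, together with the 25%-per-minute policy, smooths it out.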
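
As a rough worked example of how the CPU target turns into a replica count, the formula below is the one given in the Kubernetes HPA documentation. The input numbers (4 pods averaging 90% utilization) are hypothetical. Utilization is measured relative to each container's CPU request, so with the 250m request from Step 2 the 60% target corresponds to about 150m per pod.

```latex
\[
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
    \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
\[
\left\lceil 4 \times \frac{90}{60} \right\rceil = \lceil 6 \rceil = 6
\]
```

In this hypothetical case the HPA would ask for 2 additional pods, which fits within the scaleUp policy of at most 4 pods per minute. When several metrics are configured, as with CPU and memory here, the HPA computes a desired count for each and uses the largest.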

Step 4 — Deploy and Verify

# Apply everything in one shot (manifest files in ./manifests/)
kubectl apply -f manifests/ -n orders

# Watch the rollout: both pods must show 1/1 READY (one container per pod)
kubectl rollout status deployment/order-api -n orders
kubectl get pods -n orders -w

# Confirm env from ConfigMap and Secret reached the container
kubectl exec -n orders deploy/order-api -- env | grep -E 'APP_ENV|DB_HOST|DB_USER'

# Check probe status (look at the Events section for any probe failures)
kubectl describe pod -n orders -l app=order-api

# Inspect HPA: the TARGETS column shows current vs. target utilization
kubectl get hpa -n orders -w

# Simulate load to trigger HPA scale-up
kubectl run load-gen --image=busybox -n orders --rm -it --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://order-api-svc/api/orders; done"

# After the test, watch HPA scale back down (takes ~5 min due to the stabilization window)
kubectl get hpa order-api-hpa -n orders -w

Architecture Overview

[Figure: Production-Grade Workload Architecture. The HPA (min 2, max 20) scales the order-api Deployment (RollingUpdate). One pod runs in Zone A and one in Zone B, each with a startup probe (/healthz/startup), liveness probe (/healthz/live), and readiness probe (/healthz/ready). The ConfigMap (APP_ENV, DB_HOST, …) and Secret (DB_USER, DB_PASSWORD, …) feed the pods. Resources: requests 250m/256Mi, limits 1/512Mi. The ClusterIP Service order-api-svc listens on :80.]
All workload primitives wired together: HPA drives replica count, ConfigMap and Secret inject config, probes guard traffic, and the Service provides a stable endpoint.

Production Failure Modes to Know

Senior engineers are distinguished by knowing what breaks, not just what works. Here are the failure modes that bite teams most often with this exact setup:

  • Probe endpoint is too expensive. If /healthz/ready executes a database query on every call and Kubernetes polls every 5 seconds across 50 pods, that is 600 DB round-trips per minute from health checks alone. Keep probes lightweight — check an in-memory flag that your app sets after the DB connection pool warms up, not the DB itself.
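  • Memory request far below real usage. With a request of 256 Mi and a limit of 512 Mi, the scheduler reserves only 256 Mi per pod. If the JVM routinely runs above 256 Mi under real load, the node can end up overcommitted, and under memory pressure pods using more than their request are early candidates for eviction. Separately, any container that exceeds its 512 Mi limit is OOM-killed. Size the request from observed steady-state usage, keep the limit within roughly 1.5–2x the request, or use Guaranteed QoS (request == limit) for latency-critical services.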
  • HPA cannot scale because metrics-server is missing. Run kubectl top pods; if it errors, metrics-server is probably not installed. The HPA then reports <unknown> in its TARGETS column and does not scale, which is easy to miss until load arrives. Install it before you need it: kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml.
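  • Rolling update stalls because a Secret was rotated. Environment variables sourced from a Secret are resolved once, at container start. After you rotate DB_PASSWORD, existing pods keep the old value while every new pod picks up the new one, so a rollout or scale-up leaves you with a mix: old pods continue serving as long as the old credential is accepted, and new pods fail readiness if the database does not yet accept the new one (or the reverse once the old credential is revoked). Solution: restart the Deployment (kubectl rollout restart deployment/order-api -n orders) after rotating Secrets so every pod reads the same values.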
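
The Guaranteed QoS option mentioned in the memory bullet above can be sketched as a drop-in replacement for the container's resources block from Step 2. The specific values are assumptions chosen for illustration; the only firm requirement is that requests equal limits for both CPU and memory on every container in the pod.

```yaml
# Guaranteed QoS: requests == limits for CPU and memory.
# Values are illustrative assumptions, not measured sizing for order-api.
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "500m"       # equal to the request
    memory: "512Mi"   # equal to the request
```

After a rollout, the pod's status.qosClass field should read Guaranteed. Note the trade-off: the pod can no longer burst above its request, and because HPA utilization is a percentage of the request, the 60% CPU target would now correspond to 300m per pod rather than 150m.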
The combination of maxUnavailable: 0, topologySpreadConstraints across zones, an HPA minReplicas of at least 2, and a startup probe with a generous failureThreshold is a widely used zero-downtime deployment pattern. Each guard is cheap to add and expensive to retrofit after an incident.

What to Carry Forward

This workload is intentionally stateless. The patterns here — envFrom ConfigMap, individual Secret env vars, three-tier probes, request/limit ratio, HPA with asymmetric scale behavior, and topology spread — apply to almost every Kubernetes microservice you will write in your career. The next natural extension is adding a PodDisruptionBudget (minAvailable: 1) so cluster drain operations during node maintenance cannot take down all replicas simultaneously. That, plus a network policy restricting ingress to only the Ingress controller, completes a production-ready perimeter.
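
The two extensions named above might look roughly like this. Treat it as a sketch under stated assumptions rather than a finished policy: the object names are hypothetical, the ingress controller is assumed to run in a namespace called ingress-nginx, and NetworkPolicy only takes effect if the cluster's network plugin enforces it.

```yaml
# PodDisruptionBudget: keep at least one order-api pod running during
# voluntary disruptions such as node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: order-api-pdb              # hypothetical name
  namespace: orders
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: order-api
---
# NetworkPolicy: allow traffic to the API port only from pods in the
# ingress controller's namespace.
# ASSUMPTION: that namespace is named "ingress-nginx"; adjust for your cluster.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: order-api-ingress-only     # hypothetical name
  namespace: orders
spec:
  podSelector:
    matchLabels:
      app: order-api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
```

With this policy in place, other pods in the orders namespace, including the load-gen pod from Step 4, can no longer reach order-api directly. If in-cluster consumers need access through order-api-svc, add a matching entry under from for them.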