Multi-Stage Builds
Every production image you ship is also an attack surface. A container that carries a full compiler toolchain, test frameworks, and build caches alongside your application binary is not just wasteful — it is a liability. An attacker who gains code execution inside that container inherits every tool you left behind. Multi-stage builds solve this by separating the environment that compiles your software from the environment that runs it, so the final image contains only the artifacts that belong in production.
The Problem with Single-Stage Images
Before multi-stage builds (Docker 17.05, 2017), teams typically wrote one Dockerfile and either accepted a bloated image or maintained a fragile two-script dance: a build script on the host, then a copy into the image. Both approaches have well-known failure modes at scale:
- Image bloat: A typical Go binary compiles to ~10 MB; the golang:1.22 base image is ~850 MB. Shipping that to 500 nodes on every deploy wastes bandwidth, slows pod startup, and inflates your container registry bill.
- Leaked secrets: Build stages often need credentials such as package repository tokens, SSH keys, or NPM_TOKEN. A later RUN rm -rf ~/.ssh does not remove secrets from the image; the earlier layers still contain the files and can be extracted with docker save, and docker history --no-trunc exposes values passed through build arguments.
- Expanded CVE exposure: gcc, make, curl, git, and similar tools carry CVEs continuously. Scanners such as Grype and Trivy will flag them even if they are never reachable at runtime.
How Multi-Stage Builds Work
Each FROM instruction in a Dockerfile starts a new, independent build stage. You can copy artifacts produced in one stage into a later stage using COPY --from=<stage>. Only the final stage ends up in the tagged image; intermediate stages are left out of it. The builder still caches each stage's layers, so rebuild times remain fast.
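The mechanism can be sketched in a few lines. This is a minimal illustration, assuming a Go module whose main package sits at the root of the build context; the stage name build and the paths /out/app and /app are placeholders chosen for the example.

```dockerfile
# syntax=docker/dockerfile:1

# Stage 1: the build environment. "AS build" names the stage so that
# later stages can refer to it.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: the runtime environment. Only this stage is tagged;
# the compiler and sources from stage 1 are not part of it.
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

The only link between the two stages is the COPY --from=build line, so nothing else from the build environment can reach the final image by accident.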
A Production-Grade Go Example
The following Dockerfile reflects patterns used in production Go services at scale. Every instruction has a deliberate reason.
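A sketch of such a Dockerfile, written to match the decisions discussed in this section. The stage names deps and builder come from the text; the main package path ./cmd/app, the output path /out/app, the CA certificate copy, and the numeric user ID are assumptions made for illustration.

```dockerfile
# syntax=docker/dockerfile:1

# deps: download modules only. This stage is rebuilt only when
# go.mod or go.sum change.
FROM golang:1.22 AS deps
WORKDIR /src
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download

# builder: compile a static, stripped binary from the full source tree.
FROM deps AS builder
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -trimpath -ldflags="-s -w" -o /out/app ./cmd/app

# runtime: nothing but the binary and the CA bundle.
FROM scratch
# Assumption: the service makes outbound TLS calls and needs root certificates.
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /out/app /app
# Assumption: an arbitrary non-root UID; scratch has no /etc/passwd,
# so the user is given numerically.
USER 65532:65532
ENTRYPOINT ["/app"]
```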
Key decisions to understand and defend in a code review:
- CGO_ENABLED=0 produces a statically linked binary with zero shared library dependencies, making FROM scratch viable.
- -ldflags="-s -w" strips the symbol table and DWARF debug info, shrinking the binary by 20–30%.
- -trimpath removes local filesystem paths embedded in the binary, avoiding accidental host-path leaks in stack traces.
- The deps stage is intentionally separate from the builder stage so that a code-only change does not re-download modules.
- --mount=type=cache is a BuildKit cache mount: the Go module cache persists across builds on the same host without ever appearing as a committed layer.
Always copy dependency manifests (go.mod, package.json, requirements.txt) and install dependencies before copying source code. Because source code changes on every commit, placing a COPY . . instruction before go mod download would bust the cache on every build and defeat the purpose of layer caching.
Node.js / TypeScript Example
Interpreted-language projects still benefit from multi-stage builds: you can transpile TypeScript, run npm ci with dev dependencies, and ship only the compiled JavaScript and production node_modules.
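One way to lay this out is sketched below. It assumes a package.json with a build script that compiles TypeScript into dist/ and an entry point at dist/index.js; both are hypothetical and will differ per project, as will the choice of node:20-slim as the base image.

```dockerfile
# syntax=docker/dockerfile:1

# build: install all dependencies (including dev) and transpile.
FROM node:20-slim AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Assumption: "build" is a package.json script that runs tsc and emits dist/.
RUN npm run build

# prod-deps: a clean install without devDependencies.
FROM node:20-slim AS prod-deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# runtime: compiled JavaScript plus production node_modules only.
FROM node:20-slim AS runtime
ENV NODE_ENV=production
WORKDIR /app
COPY --from=prod-deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json ./
# The official node images ship an unprivileged "node" user.
USER node
CMD ["node", "dist/index.js"]
```

Installing production dependencies in their own stage avoids pruning a dev install in place, which would leave the removed packages behind in an earlier layer.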
Targeting Specific Stages
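Multi-stage Dockerfiles double as a build matrix. You can build an intermediate stage directly with docker build --target <stage>, which is useful for running tests inside the build environment without polluting the runtime image. In CI, where runners rarely share a local cache, the BuildKit cache can be exported to and imported from a registry; mode=max includes the layers of intermediate stages, not just the final one:
docker buildx build --cache-from type=registry,ref=ghcr.io/org/app:cache --cache-to type=registry,ref=ghcr.io/org/app:cache,mode=max .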
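A dedicated test stage might look like the sketch below. The stage name test and the assumption that the main package sits at the module root are illustrative. With BuildKit, a plain build of the final stage skips the test stage because nothing in the final stage depends on it.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.22 AS deps
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download

# test: built only when requested explicitly, for example:
#   docker build --target test .
FROM deps AS test
COPY . .
RUN go test ./...

# builder: compiles the binary that the runtime stage ships.
FROM deps AS builder
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

FROM scratch
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
```

A failing test makes the RUN instruction exit non-zero, so a CI job that builds the test target fails without ever producing a runtime image.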
Production Failure Modes
Multi-stage builds surface a class of bugs that single-stage builds hide:
- Missing runtime libraries. If you did not use CGO_ENABLED=0 (or equivalent static linking), your binary may depend on glibc or other shared objects that exist in the builder image but not in scratch or a static distroless base. The container fails at startup with a not found error. Fix: run ldd /out/binary in the builder stage to list dynamic dependencies, or switch to a glibc-based distroless image.
- Missing timezone data. FROM scratch has no /usr/share/zoneinfo. If your application calls time.LoadLocation with a named zone, the call returns an error at runtime, and code that ignores that error will typically panic. Fix: COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo.
- Build args that do not carry across stages. An ARG declared in stage 0 is not automatically available in stage 2. Re-declare ARG after each FROM when you need it, but never pass secrets as ARG; use --mount=type=secret instead.
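The fixes for the last two items can be sketched together. Everything named here is an assumption for illustration: the VERSION build argument, a main.version variable in the Go program, the netrc secret ID, and a builder image that ships tzdata under /usr/share/zoneinfo.

```dockerfile
# syntax=docker/dockerfile:1

# An ARG declared before the first FROM is usable in FROM lines only.
ARG GO_VERSION=1.22

FROM golang:${GO_VERSION} AS builder
# Hypothetical build argument; it must be declared in every stage that reads it.
ARG VERSION=dev
WORKDIR /src
COPY . .
# The secret is mounted for this RUN instruction only and is never written
# to a layer. Supply it at build time with:
#   docker build --secret id=netrc,src=$HOME/.netrc .
RUN --mount=type=secret,id=netrc,target=/root/.netrc \
    CGO_ENABLED=0 go build -ldflags="-X main.version=${VERSION}" -o /out/app .

FROM scratch
# Re-declared: the builder stage's VERSION is not visible here.
ARG VERSION=dev
LABEL org.opencontainers.image.version="${VERSION}"
# Timezone database, so time.LoadLocation can resolve named zones.
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
```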
Avoid using ADD to pull remote tarballs in multi-stage builds. ADD https://example.com/tool.tar.gz /opt/ is not cached by content hash, so it re-fetches on every build, and it does not unpack archives fetched from a URL. Use RUN curl | tar -xz inside a stage with a --mount=type=cache, or better yet, pin to a specific digest using an explicit FROM for that tool.
Measuring Impact
After building, always verify the improvement: compare the Size reported by docker image inspect before and after the change, and rescan the new image with your vulnerability scanner.
At Google scale, a 10x image size reduction translates directly to faster cold-start times on Kubernetes (image pull is often the dominant factor in pod startup latency), lower registry storage costs, and a measurably smaller CVE backlog for your security team. Multi-stage builds are not optional hygiene; they are table stakes for any image that ships to production.