Distroless & Minimal Images
Every megabyte you ship in a container image is attack surface, startup latency, and bandwidth cost. At Google scale, where hundreds of thousands of container instances launch every minute, the choice of base image is a first-class security and reliability decision, not an afterthought. This lesson covers the three dominant minimal-image strategies (scratch, Alpine Linux, and Distroless), when each makes sense, and how static binaries enable the most extreme reduction.
The Cost of a Full Base Image
A standard ubuntu:22.04 base image weighs roughly 77 MB on disk (about 29 MB compressed) and ships with a shell, a package manager, coreutils, and a large set of shared libraries. Most server processes need none of these at runtime. They exist to make the image author's life easier during debugging, and they make an attacker's life easier too. Every CVE in bash, apt, or a bundled library is now your CVE to triage, even if your application never invokes that code.
Strategy 1 — FROM scratch
scratch is Docker's empty base image: no files, no shell, no libc. A container built FROM scratch contains only what you COPY into it. This is the absolute minimum.
It works perfectly for statically compiled binaries — programs linked with all their dependencies baked in, requiring no shared libraries from the OS. Go is the canonical example: go build with CGO_ENABLED=0 produces a single self-contained ELF binary that runs from scratch with no other files needed except perhaps CA certificates and timezone data.
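A minimal multi-stage sketch of such a build follows. The Go image tag, the ./cmd/server package path, and the /server binary name are placeholders chosen for illustration, and the sketch assumes a Go module with a go.sum file and a Debian-based builder image that ships a CA bundle and tzdata.

```dockerfile
# Build stage: compile a fully static Go binary.
FROM golang:1.22 AS build
WORKDIR /src

# Download modules first so this layer stays cached until go.mod/go.sum change.
COPY go.mod go.sum ./
RUN go mod download

COPY . .
# CGO_ENABLED=0 avoids linking against libc; -s -w drops symbols and DWARF.
# ./cmd/server is a hypothetical package path.
RUN CGO_ENABLED=0 GOOS=linux go build -trimpath -ldflags="-s -w" -o /out/server ./cmd/server

# Final stage: nothing but the files copied in below.
FROM scratch
# CA bundle for outbound TLS and zoneinfo for time zone lookups.
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /usr/share/zoneinfo /usr/share/zoneinfo
COPY --from=build /out/server /server

# Numeric UID:GID, since scratch has no /etc/passwd to resolve names.
USER 65532:65532
ENTRYPOINT ["/server"]
```

If the service makes no outbound TLS calls and never loads a named time zone, the two supporting COPY lines can be dropped.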
The resulting image is typically 5–15 MB. There is no shell, no ps, no wget, so an attacker who gains code execution inside the container finds no tools to build on. The -ldflags="-s -w" flag strips the symbol table and DWARF debug info, shrinking the binary further.
/etc/passwd: Many applications call getpwnam() or otherwise expect the running UID to resolve to a user name. A scratch image has no /etc/passwd or /etc/group. Either run as a numeric UID (USER 65532 in the Dockerfile) or COPY in minimal passwd and group files from the builder. Kubernetes runAsNonRoot: true can only verify a numeric UID and refuses to start an image whose USER is a name, so the numeric form is the safe choice.
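If the application does need the name to resolve, one way to supply the files is to generate them in an earlier stage. In this sketch the nonroot account name, the alpine tag, and the app binary are illustrative assumptions, not details from this lesson.

```dockerfile
FROM alpine:3.20 AS accounts
# Write single-entry passwd and group files for an unprivileged account.
RUN echo 'nonroot:x:65532:65532:nonroot:/:/sbin/nologin' > /tmp/passwd \
 && echo 'nonroot:x:65532:' > /tmp/group

FROM scratch
COPY --from=accounts /tmp/passwd /etc/passwd
COPY --from=accounts /tmp/group /etc/group
# Hypothetical: a prebuilt static binary named app in the build context.
COPY app /app
# Keep the numeric form so runAsNonRoot can still verify it.
USER 65532:65532
ENTRYPOINT ["/app"]
```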
Strategy 2 — Alpine Linux
Alpine is a 5 MB musl-libc-based Linux distribution. Unlike scratch, it has a shell (ash), a package manager (apk), and just enough of a filesystem to make debugging practical. It is the pragmatic choice when your language runtime requires dynamic linking — Python, Node.js, Ruby, Java all need shared libraries that scratch cannot provide without manual surgery.
Key Alpine discipline: always pass --no-cache to apk add so the package index is not written into the layer, pin exact versions in requirements.txt, and never leave build-only packages (gcc, musl-dev) in the final stage. Alpine images still carry a shell and apk, so despite being tiny they leave an attacker more to work with than Distroless does.
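Those rules might be applied to a Python service roughly as follows. The Python version tag, the /install prefix, the main.py entry point, and the UID are assumptions made for this sketch.

```dockerfile
# Build stage: compilers live here and nowhere else.
FROM python:3.12-alpine AS build
# --no-cache fetches the index on the fly and does not store it in the layer.
RUN apk add --no-cache gcc musl-dev
COPY requirements.txt .
# Install into an isolated prefix so it can be copied as one directory tree.
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Final stage: same base image, but without gcc or musl-dev.
FROM python:3.12-alpine
COPY --from=build /install /usr/local
WORKDIR /app
COPY . .
# Numeric non-root UID; 65532 is an arbitrary choice here.
USER 65532:65532
CMD ["python", "main.py"]
```

Copying the prefix into /usr/local works because that is where the official Python image keeps its interpreter and site-packages; a different base image would need a different target path.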
Strategy 3 — Distroless
Distroless images, maintained by Google, are the sweet spot for most production workloads. They contain only the language runtime and its dependencies: no shell, no package manager, no coreutils. Variants are published for Java, Node.js, and Python 3, plus static and base images for compiled languages such as Go, Rust, and C/C++. They are built from Debian packages with Bazel, so they track Debian's security updates.
The :nonroot tag makes the image run as UID 65532 by default. No USER instruction is required, and the numeric UID satisfies Kubernetes runAsNonRoot. A :debug tag variant adds a BusyBox shell; use it only for emergency debugging, never in scheduled production builds.
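A sketch of a Python service on the :nonroot variant is shown below. It assumes a requirements.txt and a main.py in the build context, and it picks a Debian 12 builder with Python 3.11 so that any compiled wheels match the interpreter in the Debian 12 runtime image; verify that pairing against the tags you actually pull.

```dockerfile
# Build stage: Debian 12 with pip, matching the Debian 12 runtime below.
FROM python:3.11-slim-bookworm AS build
WORKDIR /app
COPY requirements.txt .
# Install dependencies into a plain directory that can be copied across.
RUN pip install --no-cache-dir --target=/app/deps -r requirements.txt
COPY . .

# Final stage: Python interpreter and its libraries only; no shell, no pip.
FROM gcr.io/distroless/python3-debian12:nonroot
WORKDIR /app
COPY --from=build /app /app
# Make the copied dependencies importable.
ENV PYTHONPATH=/app/deps
# The image's entrypoint is the Python interpreter, so CMD is just the script.
CMD ["main.py"]
```

Because the final image has no shell, CMD and ENTRYPOINT must use the JSON array form; the shell form would fail at startup.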
Use apk to install native system libraries at build time, but keep it in the build stage only; use Alpine as a final stage only when Distroless has a coverage gap you cannot work around.
Static Binaries in Depth
A statically linked binary embeds all library code at link time. No ld-linux.so interpreter, no libc.so, no libssl.so: the binary is a self-contained executable. Rust links statically by default when targeting x86_64-unknown-linux-musl. For C/C++, pass -static to GCC. For Go, CGO_ENABLED=0 is usually sufficient; when CGO is required (e.g., SQLite via mattn/go-sqlite3), build against musl, for example with a musl-cross toolchain, and link statically to produce a musl-static binary.
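For the CGO case, one possible approach is sketched below. Note that it swaps the musl-cross cross-compiler for Alpine's native musl toolchain, which achieves the same result when building on the target architecture. The image tag and the /app binary name are assumptions, and some cgo libraries need extra build tags that are not shown.

```dockerfile
# Build on Alpine so the C toolchain targets musl rather than glibc.
FROM golang:1.22-alpine AS build
RUN apk add --no-cache gcc musl-dev
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# External linking with -static folds musl and the cgo code into the binary.
RUN CGO_ENABLED=1 go build \
      -ldflags='-s -w -linkmode external -extldflags "-static"' \
      -o /out/app .

# The static binary needs no loader or libc, so scratch is enough.
FROM scratch
COPY --from=build /out/app /app
USER 65532:65532
ENTRYPOINT ["/app"]
```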
Debugging Distroless in Production
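The most common objection to Distroless and scratch is: "how do I debug?" The answer is ephemeral debug containers, a Kubernetes feature that attaches a fully equipped container to a running pod without restarting the pod or changing its image. A typical invocation is kubectl debug -it POD_NAME --image=busybox --target=CONTAINER_NAME, where POD_NAME and CONTAINER_NAME stand in for your own pod and container.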
This pattern means you never need a shell in your production image. The debug container shares the process namespace of the target container, giving you access to /proc/<pid>/fd, network sockets, and environment variables without altering the image or the running pod's security posture.
Layer Hygiene and Final Checks
Whatever base you choose, apply these final-stage rules: never install build tools in the runtime stage, always set USER to a non-root UID, never copy credentials or secrets into any layer (deleting them in a later step does not remove them from earlier layers; use a BuildKit --mount=type=secret instead), and use --no-cache or --mount=type=cache in build stages to keep package indices and caches out of layers. Use docker image inspect --format '{{json .RootFS.Layers}}' to audit layer count, and docker history --no-trunc to spot accidentally large layers before pushing to the registry.
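The cache-mount rule, together with a BuildKit secret mount that keeps credentials out of layers entirely, might look like the following. The secret id netrc, the image tags, and the /app binary name are hypothetical, and the sketch assumes BuildKit is enabled.

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.22 AS build
WORKDIR /src
COPY go.mod go.sum ./
# The secret is mounted for this RUN only and is never written to a layer.
# The id "netrc" is a hypothetical name chosen for this example.
RUN --mount=type=secret,id=netrc,target=/root/.netrc \
    --mount=type=cache,target=/go/pkg/mod \
    go mod download
COPY . .
# Cache mounts keep module and compiler caches out of the image layers.
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -ldflags="-s -w" -o /out/app .

# Runtime stage: no build tools, non-root by default.
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

The secret is supplied at build time, for example with docker build --secret id=netrc,src=$HOME/.netrc, so it never appears in docker history output.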