Image Size & Build Hygiene
A bloated Docker image is not just an aesthetic problem — it directly impacts pull latency on cold deploys, raises vulnerability surface area, inflates container registry storage costs, and slows every CI pipeline run. At Google, Amazon, and Netflix scale, shaving 200 MB off a base image translates to thousands of dollars of saved bandwidth and measurably faster canary rollouts. This lesson covers the four pillars of production-grade image hygiene: choosing the right base image, excluding unnecessary files with .dockerignore, understanding multi-stage builds as a size strategy, and linting Dockerfiles to catch mistakes before they reach production.
Choosing the Right Base Image
The single highest-leverage decision for image size is the base image. The same application running on node:20 versus node:20-alpine versus node:20-slim can differ by 400 MB — that is 400 MB of binaries your users never execute but your registry stores and your cluster pulls.
- -alpine variants are built on Alpine Linux (~5 MB) and use musl libc instead of glibc. They are a common choice for statically compiled languages (Go, Rust) and for many Node.js or Python workloads. The trade-off: some native extensions (particularly those that link against glibc directly) fail to build or run, so test this before committing to Alpine in production.
- -slim variants use Debian with the majority of non-essential packages removed. They are the safer fallback when Alpine breaks native dependencies: larger than Alpine but fully glibc-compatible.
- Distroless images (from Google) contain only the application runtime and its direct OS dependencies, with no shell, no package manager, and no utilities. An attacker who achieves code execution inside a distroless container cannot run bash, curl, or apt. Used extensively at Google and increasingly by security-conscious teams everywhere.
- scratch is a completely empty base (zero bytes). It is used for single-binary Go or Rust programs that link everything statically; the resulting image is literally just your binary.
Pin base images by digest. A tag such as node:20-alpine is mutable: the registry can silently re-point it to a new image containing a regression or vulnerability. Pinning to @sha256:... guarantees you run exactly the same bytes in CI and in production. Your image-update workflow (Dependabot or Renovate) should be what bumps the digest, not a surprise on the next docker pull.
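As a minimal sketch, a digest-pinned FROM line looks like the fragment below. The digest is deliberately left as a placeholder (it is not a real value and the line will not build until you substitute one); the real digest comes from your registry.

```dockerfile
# Mutable reference: the tag can point at different bytes tomorrow.
# FROM node:20-alpine

# Pinned reference: the tag is kept for readability, but the digest
# decides which image is pulled.
# <digest> is a placeholder. One way to look up the current value is:
#   docker buildx imagetools inspect node:20-alpine
FROM node:20-alpine@sha256:<digest>
```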
The .dockerignore File
Every file in the build context is sent to the Docker daemon before the first RUN instruction executes. On a typical Node.js or Laravel project, the default context (the whole repository) includes node_modules (hundreds of megabytes), .git (tens of megabytes of history), test fixtures, local .env secrets, IDE configuration files, and CI YAML. All of this lands in the daemon's temporary directory, slows the build, and risks leaking secrets into intermediate layers if a COPY . . instruction runs before you realize the scope.
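A .dockerignore file at the root of the build context addresses all of this. Its glob syntax is similar to .gitignore, with one difference that matters in practice: patterns are matched relative to the context root, so a bare node_modules entry excludes only the top-level directory, and **/node_modules is needed to match at any depth.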
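A minimal sketch of a .dockerignore for a Node.js project is shown below. The directory names (tests, coverage, .github) are assumptions about the repository layout; adjust them to match your own project.

```dockerignore
# Dependencies are reinstalled inside the image by npm ci
node_modules
**/node_modules

# Version control history
.git
.gitignore

# Local secrets and environment files
.env
.env.*

# IDE and editor configuration
.vscode
.idea

# Tests, coverage output, and CI configuration (assumed paths)
tests
coverage
.github
```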
A missing .dockerignore can leak secrets into your image layers. If your build context includes .env files and your Dockerfile runs COPY . ., those secrets are baked into that layer and extractable by anyone who can pull the image, even if a subsequent instruction removes the file. Always create .dockerignore before you write a single COPY instruction.
Multi-Stage Builds as a Size Strategy
Multi-stage builds are the most powerful size-reduction technique available. The concept: use a fat builder image with all compiler toolchains, test runners, and dev dependencies to produce your application artifact, then copy only that artifact into a minimal runtime image. The builder stage never ships to users. Its layers may remain in the local build cache after docker build completes, but they are not part of the final image.
Applied to a Node.js application, this pattern can produce a final image under 150 MB from a base that starts over 1 GB.
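A minimal sketch of such a Dockerfile follows. It assumes a TypeScript project whose npm run build script writes compiled output to dist/, an entry point at dist/server.js, and a service listening on port 3000. These names, along with appgroup, are illustrative rather than taken from a real project, and the resulting size will depend on your dependencies.

```dockerfile
# ---- build stage: full toolchain, never shipped ----
FROM node:20 AS build
WORKDIR /app

# Copy only the manifests first so the npm ci layer stays cached
# until dependencies change.
COPY package.json package-lock.json ./
RUN npm ci

# Copy the source and compile (assumes a "build" script that writes to dist/).
COPY . .
RUN npm run build

# ---- production stage: minimal runtime ----
FROM node:20-alpine AS production
ENV NODE_ENV=production
WORKDIR /app

# Install production dependencies only, and drop the npm cache in the same layer.
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force

# Bring over only the compiled artifact from the build stage.
COPY --from=build /app/dist ./dist

# Create an unprivileged user and switch to it (BusyBox syntax on Alpine).
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

EXPOSE 3000
# Exec form so the Node process receives signals directly.
CMD ["node", "dist/server.js"]
```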
What makes this pattern work at scale:
- Layer cache optimization: copying package.json and package-lock.json before the rest of the source means the npm ci layer is invalidated only when dependencies change, not on every source edit. This is one of the most impactful Dockerfile cache optimizations and one of the most commonly omitted.
- Only prod deps ship: npm ci --omit=dev excludes TypeScript, Jest, ESLint, and every other dev tool. For a typical project that can be a 60–80% reduction in node_modules size.
- Non-root user: USER appuser in the final stage means the application process runs unprivileged, so a compromised process or a container escape does not start out with root privileges. Recommended by the CIS Docker Benchmark and required by most enterprise security policies.
- No build tools in the runtime image: the build stage has TypeScript and webpack; the final production stage has neither. An attacker cannot weaponize a compiler they cannot reach.
Linting Dockerfiles with Hadolint
Hadolint is a widely used Dockerfile linter: a static analysis tool that checks your Dockerfile against rules derived from Docker's documented best practices and runs ShellCheck on the shell code embedded in RUN instructions. Many teams run it in CI as a required gate before an image is built.
Common Hadolint rules worth knowing by name:
- DL3006: FROM without an explicit tag (which implicitly floats on latest). Always use a pinned version.
- DL3007: Using the latest tag explicitly. Same issue, explicit form.
- DL3008 / DL3018: apt-get install or apk add without pinning package versions. Breaks reproducibility on cache miss.
- DL3009: apt-get lists not deleted after installing packages. The lists stay in the layer and add size for no benefit.
- DL3015: apt-get install without --no-install-recommends. Pulls in dozens of transitive packages you do not need.
- DL3025: JSON form of CMD/ENTRYPOINT not used. Shell form wraps your command in sh -c, which means signals (like SIGTERM from Kubernetes) do not reach your process directly. They go to the shell, which typically does not forward them, so the process is killed only when the grace period (30 seconds by default in Kubernetes) expires on every rolling deploy.
- SC2086: (from ShellCheck) Unquoted variable in shell, a latent word-splitting and globbing bug.
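As an illustration of what these rules ask for, the fragment below is a sketch that addresses most of them. APP_HOME and the installed packages are hypothetical, and package versions are left unpinned for brevity, so DL3008 would still be reported.

```dockerfile
# Explicit, non-latest tag (DL3006, DL3007).
FROM debian:12-slim

# Quote variables used in shell commands (SC2086). APP_HOME is hypothetical.
ENV APP_HOME=/opt/app
RUN mkdir -p "$APP_HOME"

# Skip recommended packages (DL3015) and delete the apt lists in the
# same layer (DL3009). DL3008 would additionally want name=version pins.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*

# Exec (JSON) form so the process receives SIGTERM directly (DL3025).
CMD ["curl", "--version"]
```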
Running Hadolint locally (for example, from pre-commit or a Makefile target) catches issues in seconds. Running it in CI as a required status check ensures no Dockerfile that fails the standard ever reaches your registry. Many teams also add docker scout cves or trivy image as a second CI gate to catch CVEs in base images after the build. Layer hygiene and vulnerability scanning are complementary, not alternatives.
Additional Build Hygiene Practices
Beyond base image selection, .dockerignore, multi-stage builds, and linting, several smaller habits distinguish a production-quality Dockerfile from a demo one:
- Combine RUN instructions that belong together (for example, apt-get update && apt-get install && rm -rf /var/lib/apt/lists/* in a single RUN). Each RUN creates a layer; splitting them means the update cache and the installed packages live in separate layers, and the rm -rf cleanup in a later layer does not actually reduce the image size (the bytes are still in the earlier layer).
- Clean package manager caches in the same RUN: apt-get clean, rm -rf /var/lib/apt/lists/*, pip install --no-cache-dir, npm ci && npm cache clean --force. If you clean in a later layer, the cache bytes are already committed.
- Declare LABEL metadata, at minimum org.opencontainers.image.source, .version, and .revision, so tooling can trace an image back to its source commit and pipeline run.
- Use COPY, not ADD, unless you specifically need ADD's URL-fetching or tar-extraction features. COPY is explicit and predictable; ADD has surprising auto-extraction behavior.
- Set a HEALTHCHECK so Docker can detect a process that is running but not serving traffic. This is what depends_on: condition: service_healthy in Compose relies on. Kubernetes ignores the Dockerfile HEALTHCHECK and uses the liveness and readiness probes defined in the pod spec instead, so define those separately.
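The habits above can be combined in a single runtime image, sketched below. The repository URL, version string, GIT_SHA build argument, server.js entry point, port 3000, and /healthz endpoint are all assumptions for illustration. The node user is the unprivileged account shipped in the official Node.js images.

```dockerfile
FROM node:20-slim

# Passed at build time, e.g. --build-arg GIT_SHA=$(git rev-parse HEAD)
ARG GIT_SHA=unknown

# OCI metadata so tooling can trace the image back to its source.
LABEL org.opencontainers.image.source="https://example.com/your-org/your-repo" \
      org.opencontainers.image.version="1.0.0" \
      org.opencontainers.image.revision="${GIT_SHA}"

# Update, install, and clean up in one layer so the apt lists never persist.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# COPY rather than ADD: no URL fetching or archive extraction is needed.
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force
COPY . .

# Mark the container unhealthy if the (assumed) health endpoint stops responding.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost:3000/healthz || exit 1

USER node
CMD ["node", "server.js"]
```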