Docker & Containerization

Writing Dockerfiles

A Dockerfile is the single source of truth for how your container image is assembled. Every production image at a serious company — the Node API, the Python ML worker, the Go microservice — starts here. Writing a Dockerfile badly means slow CI builds, bloated images, unpredictable runtime behaviour, and security surface you can't explain. Writing it well means 10-second cache hits, 20 MB final images, and builds that are reproducible across every developer's laptop and every CI runner.

This lesson walks through every instruction that matters, explains why layer ordering is a first-class concern, and shows you what cache-friendly Dockerfiles actually look like in production.

The Core Instructions

FROM — Choosing Your Base

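FROM sets the base image your image builds on. It must be the first instruction in the file, preceded only by comments, parser directives, and ARG declarations that the FROM line itself uses. Instructions that change the filesystem (RUN, COPY, ADD) then add a new layer on top of that base; the remaining instructions only record metadata.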

At big tech, three rules govern FROM:

Always pin an exact digest or at minimum an exact tag — never FROM node:latest. Floating tags break reproducibility silently when upstream publishes a new image.
Prefer official minimal bases: node:22-alpine, python:3.12-slim, golang:1.22-alpine. Alpine and slim variants are dramatically smaller than the default full-Debian images.
For statically-compiled binaries (Go, Rust), prefer FROM scratch — the final stage gets zero OS surface area.

# Bad — unpinned, non-minimal
FROM node:latest

# Better — pinned tag, Alpine base
FROM node:22.3-alpine3.20

# Best (multi-stage) — pin the build stage, scratch or distroless for runtime
FROM node:22.3-alpine3.20 AS builder
FROM gcr.io/distroless/nodejs22-debian12 AS runtime
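
For the FROM scratch rule, a minimal sketch of a Go multi-stage build might look like the following. The module layout (a main package at ./cmd/server with go.mod and go.sum at the repository root), the binary name, and the exact Go tag are assumptions for illustration.

```dockerfile
# Build stage: full Go toolchain (tag and paths are illustrative assumptions)
FROM golang:1.22-alpine AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a statically linked binary that needs no libc at runtime
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server

# Runtime stage: an empty filesystem containing only what is copied in
FROM scratch
# scratch ships no CA bundle; copy one only if the binary makes outbound TLS calls
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]
```

Because scratch has no shell, exec form is the only form that works for ENTRYPOINT here, and docker exec cannot open a shell for debugging. That trade-off is part of the reduced surface area.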

COPY — Bringing Files In

COPY <src> <dest> copies files from the build context (your local filesystem) into the image layer. ADD also exists but should be avoided unless you specifically need its tar-auto-extraction or remote-URL fetch behaviour — both are footguns. Use COPY by default.

Key flags you will actually use in production:

--chown=user:group — set ownership in a single step rather than a separate RUN chown instruction (which would create an extra layer).
--from=builder — copy from another stage in a multi-stage build (covered in a later lesson).
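
As a rough sketch of --chown in use, assuming a hypothetical Python project with a requirements.txt and a src/ directory, and a user named appuser created for the example:

```dockerfile
FROM python:3.12-slim
# On Debian-based images useradd also creates a matching "appuser" group
RUN useradd --create-home appuser
WORKDIR /app
# Ownership is set as the files are written, so no follow-up RUN chown layer is needed
COPY --chown=appuser:appuser requirements.txt ./
COPY --chown=appuser:appuser src/ ./src/
USER appuser
```

A separate RUN chown -R after the COPY would produce the same ownership, but it rewrites every copied file into a second layer and roughly doubles the space those files take in the image.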

Always maintain a .dockerignore file alongside your Dockerfile. Without it, the entire working directory becomes the build context, and COPY . . copies all of it into the image, including .git/, node_modules/, test fixtures, and local .env files. A well-crafted .dockerignore can cut build-context transfer from gigabytes to kilobytes.

RUN — Executing Build Steps

RUN executes a shell command during the build and commits the result as a new layer. Every RUN instruction is a cache key. The most important production rule: chain related commands into a single RUN instruction, especially when they install, modify, and then clean up packages — because if you split them across multiple RUN instructions, intermediate files are frozen in earlier layers even after you delete them.

# Bad — creates 3 layers; apt cache is frozen in layer 1 even after rm in layer 3
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# Correct — single layer, net result is a smaller image
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*

Shell form vs exec form: RUN apt-get update is shell form (runs via /bin/sh -c). You can also use exec form: RUN ["apt-get", "update"]. Shell form is more readable for chained commands; exec form avoids shell interpretation and is preferred in CMD and ENTRYPOINT.
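
The practical difference between the two forms is easiest to see with a variable. The sketch below is illustrative only; the GREETING argument and the file path are made up for the example.

```dockerfile
FROM alpine:3.20
ARG GREETING=hello
# Shell form: /bin/sh -c expands $GREETING and handles && and redirection
RUN echo "$GREETING" > /greeting.txt && cat /greeting.txt
# Exec form: no shell is involved, so the string $GREETING reaches echo unexpanded
RUN ["echo", "$GREETING"]
```

If an exec-form instruction needs shell features, invoke the shell explicitly, for example RUN ["/bin/sh", "-c", "echo $GREETING"].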

CMD and ENTRYPOINT — Defining Runtime Behaviour

These two instructions are a source of constant confusion. The rule is simple once you internalise it:

ENTRYPOINT — the fixed executable that always runs. It defines what the container is.
CMD — default arguments that are passed to ENTRYPOINT (or, if there is no ENTRYPOINT, the default command to run). It defines sensible defaults that can be overridden at docker run time.

Both accept shell form and exec form. Always use exec form (["executable", "arg1"]) for CMD and ENTRYPOINT. Shell form wraps your process in /bin/sh -c, so the shell becomes PID 1 and your process runs as its child. The shell does not forward signals, which means SIGTERM and SIGINT from Docker or Kubernetes typically never reach your process; it is killed with SIGKILL once the grace period expires, causing unclean shutdowns and slow rolling deploys.

# Shell form — BAD for production; your app never receives SIGTERM
ENTRYPOINT node server.js

# Exec form — CORRECT; node is PID 1 and receives signals directly
ENTRYPOINT ["node", "server.js"]

# Typical production pattern: fixed executable + overridable default args
ENTRYPOINT ["python", "-m", "gunicorn"]
CMD ["--workers", "4", "--bind", "0.0.0.0:8000", "myapp.wsgi:application"]
# Override at runtime (the whole CMD is replaced, so repeat the app module):
#   docker run myimage --workers 8 --bind 0.0.0.0:9000 myapp.wsgi:application

A common production pitfall: if you define both ENTRYPOINT and CMD, CMD supplies default arguments to ENTRYPOINT. But if you override CMD at docker run time, your entire CMD is replaced — not merged. Design your argument surface accordingly.

Layer Ordering and Cache-Friendly Dockerfiles

Docker's build cache is keyed on the parent layer plus the instruction itself: for RUN, the literal instruction text; for COPY and ADD, the instruction text together with a checksum of the files being copied. Once a layer is invalidated, every subsequent layer must be rebuilt. This means layer order is a performance-critical decision, not an aesthetic one.

The golden rule: order instructions from least-frequently-changing to most-frequently-changing.

In practice, this means copying dependency manifests before source code, so the npm or pip install layer stays cached across code-only changes.
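
A minimal sketch of that ordering for a Python service, assuming a requirements.txt at the project root and a hypothetical package named myapp:

```dockerfile
FROM python:3.12-slim
WORKDIR /app

# Changes rarely: the dependency manifest. This layer and the install below
# are rebuilt only when requirements.txt itself changes.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Changes on almost every commit: application source. Editing code invalidates
# the cache from here down, leaving the pip install layer above reusable.
COPY . .

CMD ["python", "-m", "myapp"]
```

Swapping the two COPY steps (COPY . . first, then pip install) would still build a working image, but any source edit would invalidate the install layer and force a full dependency reinstall.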

A Production-Grade Dockerfile (Node.js API)

Here is a complete, real-world Dockerfile incorporating every principle above. Study the order and the comments:

# syntax=docker/dockerfile:1.7
FROM node:22.3-alpine3.20 AS builder

# Set working directory
WORKDIR /app

# 1. Copy ONLY dependency manifests first — maximises cache hit rate
COPY package.json package-lock.json ./

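# 2. Install deps; npm ci is reproducible (respects lock file exactly).
#    devDependencies are installed here because the build step typically needs them.
RUN npm ci

# 3. Copy source AFTER deps are installed
COPY . .

# 4. Build (transpile, bundle, etc.), then strip devDependencies so the
#    node_modules copied into the runtime stage holds production deps only
RUN npm run build \
    && npm prune --omit=dev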

# ---- Runtime stage ----
FROM node:22.3-alpine3.20

ENV NODE_ENV=production
WORKDIR /app

# Run as non-root — principle of least privilege
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Copy only production deps + built artefact — nothing from devDependencies
COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist

USER appuser
EXPOSE 3000

# Exec form so node is PID 1 and receives SIGTERM directly;
# the app must still register a SIGTERM handler to shut down gracefully
ENTRYPOINT ["node"]
CMD ["dist/server.js"]

Always run containers as a non-root user. If your container is ever compromised, an attacker running as root inside the container has far more opportunities to escape to the host or exfiltrate secrets from mounted volumes. RUN adduser + USER is a two-line security win.

Other Useful Instructions

WORKDIR /app: sets the working directory for subsequent instructions. Prefer this over RUN cd /app, which does not persist beyond that one RUN instruction.
ENV KEY=value: sets environment variables available at both build time and runtime. Use for NODE_ENV=production, PYTHONUNBUFFERED=1, etc. Do not use ENV for secrets; they are baked into the image and visible via docker history and docker inspect.
ARG: build-time-only variable, not available in the running container. Safe for things like version numbers: ARG APP_VERSION=1.0.0. Not safe for secrets, because values passed with --build-arg can show up in docker history.
EXPOSE 3000: documents which port the container listens on. It does not publish the port; that happens at docker run -p or in Docker Compose. Treat it as metadata for operators.
LABEL: attaches key-value metadata (maintainer, version, git SHA). Useful for docker inspect and automated inventory systems.
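
The following sketch shows how these instructions might fit together. The package name myapp, the port, and the version value are placeholders chosen for the example; the label key follows the OCI image annotation naming.

```dockerfile
FROM python:3.12-slim

# Build-time value; override with: docker build --build-arg APP_VERSION=1.2.3 .
ARG APP_VERSION=1.0.0

# Persisted in the image and present in the running container's environment
ENV PYTHONUNBUFFERED=1

# Metadata readable via docker inspect
LABEL org.opencontainers.image.version="${APP_VERSION}"

WORKDIR /app
COPY . .

# Documentation only; publish the port with docker run -p 8000:8000
EXPOSE 8000
CMD ["python", "-m", "myapp"]
```

APP_VERSION is not set in the container's environment at runtime, but its value survives in the label, which is usually what inventory tooling needs.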

The build cache is local to the builder that produced it. In CI pipelines without a shared cache backend, every build starts cold. Use BuildKit's --cache-from / --cache-to flags (for example, cache-to=type=gha on GitHub Actions) to persist layer caches between pipeline runs; this is often the single biggest CI speed-up available.

Summary

A production-quality Dockerfile is defined by four habits: pin your base image, chain RUN commands to collapse layers, order instructions with least-changing first, and always use exec form for ENTRYPOINT and CMD. Every deviation has a concrete cost — either in image size, build time, or runtime reliability. Make these the default, not the exception.