Continuous Integration Fundamentals

Anatomy of a CI Pipeline

Every time a developer pushes code, a CI pipeline executes a precisely ordered sequence of checks. That sequence is not arbitrary — it is engineered for one purpose: surface failures as early and as cheaply as possible. Understanding the anatomy of a pipeline means understanding why each stage exists, what it catches, what it costs if you skip it, and how to order stages to minimize wasted compute.

The Four Canonical Stages

Regardless of the tool — GitHub Actions, GitLab CI, Jenkins, CircleCI, Buildkite — a well-designed CI pipeline passes through four logical stages in sequence: Lint → Build → Test → Package. Each stage has a distinct failure mode and a different cost profile.

The four canonical CI stages and their typical durations. Earlier stages are cheaper; failing there saves all downstream compute.

Stage 1 — Lint

Linting runs before compilation. Its job is to reject code that is malformed, violates style rules, or contains known anti-patterns — without executing a single line of application logic. Because it operates on source text alone, lint is the fastest possible signal a pipeline can produce.

In large codebases, lint jobs are often restricted to the files a change touches and parallelized across them, which keeps the stage in the range of seconds rather than minutes. Common tools: ESLint for JavaScript/TypeScript, Pylint or Ruff for Python, golangci-lint for Go, shellcheck for shell scripts, and hadolint for Dockerfiles.

# .github/workflows/ci.yml — lint stage
lint:
  runs-on: ubuntu-24.04
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: '22'
        cache: 'npm'
    - run: npm ci --ignore-scripts
    - name: ESLint
      run: npx eslint src/ --max-warnings=0
    - name: TypeScript type-check
      run: npx tsc --noEmit
    - name: Prettier format check
      run: npx prettier --check "src/**/*.{ts,tsx}"

The --max-warnings=0 flag on ESLint is a strict, production-grade choice: a single warning fails the job. Many teams start with warnings allowed and gradually tighten the gate. Large engineering organizations commonly enforce zero lint warnings in CI, on the reasoning that fixing a warning when it first appears costs minutes, while deferring it lets debt accumulate until the cleanup takes far longer.
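Restricting lint to changed files, as described above, might look like the following job. This is a sketch under assumptions rather than a definitive setup: the job name lint-changed is hypothetical, the workflow is assumed to run on pull_request events (so github.base_ref is set), sources are assumed to be .ts/.tsx files, and the runner is assumed to provide GNU xargs, as Ubuntu runners do.

```yaml
# Hypothetical job: lint only the files changed relative to the PR base branch
lint-changed:
  runs-on: ubuntu-24.04
  if: github.event_name == 'pull_request'
  steps:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 0           # full history so the merge base can be found
    - uses: actions/setup-node@v4
      with:
        node-version: '22'
        cache: 'npm'
    - run: npm ci --ignore-scripts
    - name: ESLint on changed files
      env:
        BASE_REF: ${{ github.base_ref }}
      run: |
        # -z and -0 keep file names with spaces intact;
        # -r skips ESLint entirely when no matching file changed
        git diff --name-only -z --diff-filter=ACMR "origin/${BASE_REF}...HEAD" -- '*.ts' '*.tsx' \
          | xargs -0 -r npx eslint --max-warnings=0
```

The --diff-filter=ACMR option excludes deleted files, which ESLint could not open anyway. A full-repository lint on the main branch is still worth keeping, since a change to the lint configuration can affect files the diff does not list.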

Stage 2 — Build

The build stage compiles or transpiles source code into an executable form. It proves that every import resolves, every type is satisfied, and every asset can be bundled. A failed build means the artifact does not exist — there is nothing to test or ship.

Build reproducibility is the key concern. The same commit should produce the same artifact on any runner, on any day. This requires locking dependency versions (package-lock.json, go.sum, Cargo.lock), pinning tool versions, and, for compiled languages, recording the compiler version in the CI output.

# Build stage in the same workflow
build:
  runs-on: ubuntu-24.04
  needs: lint              # only runs if lint passes
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: '22'
        cache: 'npm'
    - run: npm ci --ignore-scripts
    - name: Production build
      run: npm run build
      env:
        NODE_ENV: production
    - name: Upload build artifact
      uses: actions/upload-artifact@v4
      with:
        name: dist-${{ github.sha }}
        path: dist/
        retention-days: 7
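The reproducibility requirements described for the build stage can be sketched as a few extra steps. The .nvmrc file is an assumption (the original workflow hardcodes the Node version instead); node-version-file is an existing input of actions/setup-node.

```yaml
# Hypothetical steps for the build job: pin the toolchain and log what was used
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@v4
    with:
      node-version-file: '.nvmrc'   # assumes the repository pins Node in .nvmrc
      cache: 'npm'
  - name: Record toolchain versions
    run: |
      node --version
      npm --version
  - name: Install exactly what the lockfile specifies
    # npm ci exits with an error if package.json and package-lock.json disagree
    run: npm ci --ignore-scripts
```

Logging the versions costs almost nothing and makes it possible to explain later why two builds of the same commit differed.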

Stage 3 — Test

The test stage is typically the longest and most expensive part of the pipeline. It runs the automated test suite against the build artifact. A well-structured test stage splits tests by speed and scope: unit tests run first (milliseconds each, no I/O), integration tests run next (real DB/cache connections, slower), and end-to-end tests run last (full browser automation, minutes).

At scale, large test suites are sharded: split across N parallel runners so total wall-clock time stays bounded. GitHub Actions supports this through strategy.matrix; Buildkite has native parallelism primitives. Coverage is measured and a minimum threshold (80% is a common choice) is enforced as a quality gate, so a PR that drops coverage below the threshold is rejected.
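As an illustration of sharding with strategy.matrix, the sketch below splits one suite across four runners. It assumes the project uses Jest 28 or later, which provides the --shard flag; the job name and the shard count are arbitrary choices for the example.

```yaml
# Hypothetical sharded job: each of four runners executes a quarter of the suite
integration:
  runs-on: ubuntu-24.04
  needs: build
  strategy:
    fail-fast: true            # cancel the remaining shards once one fails (the default)
    matrix:
      shard: [1, 2, 3, 4]
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: '22'
        cache: 'npm'
    - run: npm ci --ignore-scripts
    - name: Run shard ${{ matrix.shard }} of 4
      run: npx jest --shard=${{ matrix.shard }}/4
```

Each shard sees only part of the suite, so a coverage threshold checked inside a shard would be misleading. One option is to upload per-shard coverage reports as artifacts and enforce the threshold in a follow-up job that merges them.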

Run unit tests first within the test stage. If any unit test fails, abort before spending money on integration or E2E. Many teams separate these into three distinct jobs so the dependency graph is explicit and failures are immediately obvious in the PR checks UI.
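Splitting the test stage into three chained jobs might look like the sketch below. The npm script names (test:unit, test:integration, test:e2e), the Postgres service, and the DATABASE_URL variable are assumptions made for illustration; substitute whatever the project actually defines.

```yaml
# Hypothetical split of the test stage: unit -> integration -> e2e
unit:
  runs-on: ubuntu-24.04
  needs: build
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: '22'
        cache: 'npm'
    - run: npm ci --ignore-scripts
    - run: npm run test:unit            # assumed script name

integration:
  runs-on: ubuntu-24.04
  needs: unit                           # skipped if any unit test fails
  services:
    postgres:
      image: postgres:16
      env:
        POSTGRES_PASSWORD: postgres
      ports:
        - 5432:5432
      options: >-
        --health-cmd pg_isready
        --health-interval 10s
        --health-timeout 5s
        --health-retries 5
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: '22'
        cache: 'npm'
    - run: npm ci --ignore-scripts
    - run: npm run test:integration     # assumed script name
      env:
        DATABASE_URL: postgres://postgres:postgres@localhost:5432/postgres

e2e:
  runs-on: ubuntu-24.04
  needs: integration                    # the most expensive layer runs last
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: '22'
        cache: 'npm'
    - run: npm ci --ignore-scripts
    - run: npm run test:e2e             # assumed script name
```

With this layout the package job would declare needs: e2e, and each layer appears as its own entry in the PR checks list.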

Stage 4 — Package

Packaging runs only after all tests pass. It creates the deployable artifact: a Docker image, a JAR, a compiled binary, a Helm chart tarball. This is the output the CD system will promote through environments. Because packaging is expensive (image layers, cache warming, registry push), it is gated behind a passing test stage — you never want to push an untested image.

Best practice: tag images with the full git SHA, not latest. This makes every artifact traceable back to the exact commit that produced it, which is essential for rollbacks and incident investigation.

# Package stage — build and push Docker image
package:
  runs-on: ubuntu-24.04
  needs: test              # only runs if all test jobs pass
  permissions:
    contents: read
    packages: write
  steps:
    - uses: actions/checkout@v4
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3
    - name: Login to GHCR
      uses: docker/login-action@v3
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
    - name: Download build artifact
      uses: actions/download-artifact@v4
      with:
        name: dist-${{ github.sha }}
        path: dist/
    - name: Build and push image
      uses: docker/build-push-action@v5
      with:
        context: .
        push: true
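        # Tag with the immutable commit SHA only, so every image is traceable
        # to its commit. Image names must be lowercase; adjust if the
        # repository name contains capitals.
        tags: ghcr.io/${{ github.repository }}:${{ github.sha }}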
        cache-from: type=gha
        cache-to: type=gha,mode=max

Fail-Fast Ordering — the Economics of Stage Order

The ordering Lint → Build → Test → Package is not just a convention; it is an economic decision. Each stage is typically more expensive than the one before it. If lint would have caught a problem in 10 seconds, running a 15-minute test suite before the lint check spends the full 15 minutes of compute on code that was always going to be rejected, and delays the failure signal by the same amount. That waste is multiplied across every developer and every push in the organization.

This principle is enforced with the needs keyword in GitHub Actions (or with stages and needs in GitLab CI; GitLab's dependencies keyword only controls which artifacts a job downloads, not when it runs). A job that declares needs: [lint] will not start until lint succeeds, and will be skipped entirely if lint fails. Fail-fast is the default behavior when jobs are properly chained.
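For comparison, the same lint-then-build gate in GitLab CI could be sketched as follows. The node:22 image and the job contents are assumptions that mirror the GitHub Actions examples in this lesson.

```yaml
# .gitlab-ci.yml: hypothetical equivalent of the lint -> build gate
stages:
  - lint
  - build

lint:
  stage: lint
  image: node:22
  script:
    - npm ci --ignore-scripts
    - npx eslint src/ --max-warnings=0

build:
  stage: build
  image: node:22
  needs: [lint]                # starts as soon as lint succeeds, never if it fails
  script:
    - npm ci --ignore-scripts
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 7 days
```

The stages list alone already prevents build from running after a failed lint. Adding needs makes the edge explicit and lets a job start as soon as its own prerequisites finish, rather than waiting for the whole previous stage.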

A common mistake is running all jobs in parallel unconditionally to "save time." This can push a Docker image to the registry even when tests are still running — or when they are about to fail. Always chain stages with explicit dependencies so no artifact is produced from broken code.

Putting It Together — the Complete Dependency Graph

A realistic pipeline dependency graph: lint and build are sequential; unit and integration tests run in parallel after the build; packaging only runs when all test jobs succeed.

Pipelines at large engineering organizations commonly follow this same graph structure. The parallelism in the test layer is where most teams recover significant wall-clock time, while the serial gating at lint and build ensures no compute is wasted on obviously broken code.
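The dependency graph described above reduces to a handful of needs declarations. The skeleton below shows only the edges; the echo steps are placeholders for the real steps of each stage, and the job names unit and integration are assumptions.

```yaml
# .github/workflows/ci.yml: skeleton showing only the dependency edges
name: ci
on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-24.04
    steps:
      - run: echo "lint steps go here"

  build:
    runs-on: ubuntu-24.04
    needs: lint                      # serial gate: no build without clean lint
    steps:
      - run: echo "build steps go here"

  unit:
    runs-on: ubuntu-24.04
    needs: build                     # runs in parallel with integration
    steps:
      - run: echo "unit test steps go here"

  integration:
    runs-on: ubuntu-24.04
    needs: build                     # runs in parallel with unit
    steps:
      - run: echo "integration test steps go here"

  package:
    runs-on: ubuntu-24.04
    needs: [unit, integration]       # runs only when every test job succeeds
    steps:
      - run: echo "package steps go here"
```

Because package lists both test jobs in needs, a failure in either one causes it to be skipped, so no image is pushed from a commit with failing tests.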

Measure your pipeline. Track mean pipeline duration and failure rate per stage as engineering metrics. If lint fails more than 5% of the time, developers are probably not running lint locally before they push. If tests are the top failure cause, invest in flake detection. Instrument before you optimize.
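A starting point for this kind of measurement is a scheduled workflow that queries recent runs with the GitHub CLI, which is preinstalled on GitHub-hosted runners. The sketch below is an assumption-laden example, not a complete metrics system: it presumes the CI workflow file is named ci.yml, it reports only the overall failure count, and a per-stage breakdown would additionally need the jobs of each run.

```yaml
# .github/workflows/pipeline-metrics.yml: hypothetical weekly failure report
name: pipeline-metrics
on:
  schedule:
    - cron: '0 6 * * 1'              # Mondays at 06:00 UTC
permissions:
  actions: read
jobs:
  report:
    runs-on: ubuntu-24.04
    steps:
      - name: Count failures over the last 200 runs of ci.yml
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh run list --repo "$GITHUB_REPOSITORY" --workflow ci.yml --limit 200 \
            --json status,conclusion \
            --jq '[.[] | select(.status == "completed")]
                  | {total: length,
                     failed: (map(select(.conclusion == "failure")) | length)}'
```

The step prints a small JSON object with total and failed counts to the job log. Exporting those numbers to a dashboard, or adding startedAt and updatedAt to compute durations, would be the natural next step.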