GitOps with ArgoCD & Flux

Project: A GitOps Delivery Pipeline

The previous nine lessons covered principles, tools, and individual techniques. Now you build the real thing: a complete, production-grade GitOps delivery pipeline for a multi-environment service. This lesson walks through exactly how to design the repositories and ArgoCD Application resources — the decisions that determine whether the system will scale cleanly or collapse under its own complexity.

The scenario mirrors a common setup at larger engineering organizations: a payments-api service that must flow through dev → staging → production across two Kubernetes clusters (dev and staging share a cluster in this example; production is isolated). A separate platform team owns the GitOps config repo; application teams own their app repos. CI runs on GitHub Actions.

Step 1: Repository Layout Decision

At the start of every GitOps project the first architectural question is: one config repo or many? A common answer at scale is a single config repo combined with a separate source repo per service. A single config repo gives the platform team a global view of cluster state, simplifies RBAC, and makes it easier to keep conventions consistent across services. Here is the target repo structure:

# gitops-platform/  — the single config repo (owned by platform team)
.
├── clusters/
│   ├── staging/
│   │   ├── argocd-apps/
│   │   │   └── payments-api.yaml      # ArgoCD Application for staging
│   │   └── cluster-config/
│   │       ├── namespaces.yaml
│   │       └── network-policies.yaml
│   └── production/
│       ├── argocd-apps/
│       │   └── payments-api.yaml      # ArgoCD Application for production
│       └── cluster-config/
│           ├── namespaces.yaml
│           └── network-policies.yaml
│
└── services/
    └── payments-api/
        ├── base/
        │   ├── kustomization.yaml
        │   ├── deployment.yaml
        │   ├── service.yaml
        │   ├── hpa.yaml
        │   └── serviceaccount.yaml
        └── overlays/
            ├── dev/
            │   ├── kustomization.yaml  # patch: replicas=1, resources small
            │   └── image-tag.yaml      # updated by CI (dev branch builds)
            ├── staging/
            │   ├── kustomization.yaml  # patch: replicas=2, staging secrets ref
            │   └── image-tag.yaml      # updated by CI (main branch builds)
            └── production/
                ├── kustomization.yaml  # patch: replicas=5, PodDisruptionBudget
                └── image-tag.yaml      # updated manually via PR (release tag)

Why separate image-tag.yaml files? Keeping the image tag in its own file, rather than inline in deployment.yaml, means CI can open a one-line PR that is easy for reviewers to approve. It also keeps CI from touching unrelated base manifests. Each image-tag.yaml holds only the image tag override for its overlay, which makes it the smallest, most auditable unit of change per promotion.

Step 2: The Base Manifests

The base/ folder holds the common, environment-agnostic definition of the service. Overlays patch only what differs. Below are a realistic base Deployment, the image-tag file that CI updates, and the staging overlay that ties them together:

# services/payments-api/base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api
  labels:
    app: payments-api
    version: "unknown"          # overlays set real version label
spec:
  replicas: 1                   # overridden in every overlay
  selector:
    matchLabels:
      app: payments-api
  template:
    metadata:
      labels:
        app: payments-api
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
    spec:
      serviceAccountName: payments-api
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
      containers:
        - name: api
          image: ghcr.io/myorg/payments-api:latest   # replaced by image-tag patch
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /healthz/live
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "512Mi"
          envFrom:
            - secretRef:
                name: payments-api-secrets     # External Secrets Operator populates this

---
# services/payments-api/overlays/staging/image-tag.yaml
# Config for Kustomize's built-in ImageTagTransformer, loaded through the
# overlay's transformers: field. CI overwrites only the newTag field via:
#   TAG="$IMAGE_TAG" yq e -i '.imageTag.newTag = strenv(TAG)' overlays/staging/image-tag.yaml
apiVersion: builtin
kind: ImageTagTransformer
metadata:
  name: payments-api-image-tag
imageTag:
  name: ghcr.io/myorg/payments-api
  newTag: "sha-a1b2c3d"       # CI replaces this on each merge to main

---
# services/payments-api/overlays/staging/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: payments-staging
resources:
  - ../../base
patches:
  - patch: |-
      - op: replace
        path: /spec/replicas
        value: 2
    target:
      kind: Deployment
      name: payments-api
# components: only accepts directories, so the single file is loaded as a
# transformer config instead.
transformers:
  - image-tag.yaml
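
The tree in Step 1 also lists a base kustomization.yaml and a production overlay that the listings above do not show. A minimal sketch of both follows. The pdb.yaml file name and the minAvailable value are assumptions for illustration, and the image-tag wiring is left out because it works the same way as in the staging overlay.

```yaml
# services/payments-api/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - hpa.yaml
  - serviceaccount.yaml

---
# services/payments-api/overlays/production/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: payments-production
resources:
  - ../../base
  - pdb.yaml                    # hypothetical file name, shown below
patches:
  - patch: |-
      - op: replace
        path: /spec/replicas
        value: 5
    target:
      kind: Deployment
      name: payments-api
# image-tag.yaml is wired in the same way as in the staging overlay (omitted here)

---
# services/payments-api/overlays/production/pdb.yaml (hypothetical)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-api
spec:
  minAvailable: 3               # assumed value; keeps a majority of the 5 replicas up
  selector:
    matchLabels:
      app: payments-api
```

Running kustomize build against the overlay directory locally is a quick way to confirm that the patch target and the selector labels line up with the base before ArgoCD ever sees the change.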

Step 3: ArgoCD Application Resources

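Each environment gets its own Application manifest stored in the cluster-specific folder of the config repo. ArgoCD does not watch folders on its own; the usual way to get this behavior is a root "app of apps" Application (or an ApplicationSet) per cluster whose source path is that folder. With that in place, adding a new .yaml to clusters/staging/argocd-apps/ creates the Application automatically, without anyone touching the ArgoCD UI.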

# clusters/staging/argocd-apps/payments-api.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-api-staging
  namespace: argocd
  annotations:
    notifications.argoproj.io/subscribe.on-sync-succeeded.slack: devops-deploys
    notifications.argoproj.io/subscribe.on-sync-failed.slack: devops-alerts
  finalizers:
    - resources-finalizer.argocd.argoproj.io   # cascade-delete on app removal
spec:
  project: payments-team                        # AppProject scopes RBAC
  source:
    repoURL: https://github.com/myorg/gitops-platform
    targetRevision: main
    path: services/payments-api/overlays/staging
  destination:
    server: https://kubernetes.default.svc      # same cluster where ArgoCD runs
    namespace: payments-staging
  syncPolicy:
    automated:
      prune: true           # delete resources removed from Git
      selfHeal: true        # revert manual kubectl changes
      allowEmpty: false     # never sync an empty state (safety net)
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - RespectIgnoreDifferences=true
    retry:
      limit: 3
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas     # ignore if HPA manages replica count live

---
# clusters/production/argocd-apps/payments-api.yaml
# Production: NO automated sync — requires human approval via ArgoCD UI or CLI
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-api-production
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: payments-team
  source:
    repoURL: https://github.com/myorg/gitops-platform
    targetRevision: main
    path: services/payments-api/overlays/production
  destination:
    server: https://prod-cluster-api.internal:6443
    namespace: payments-production
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
    retry:
      limit: 2
      backoff:
        duration: 10s
        factor: 2
        maxDuration: 5m
  # No automated: block — production sync is manual or via ArgoCD Sync Windows
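
For the argocd-apps/ folders to be picked up automatically, something has to point ArgoCD at them. One common way is a root "app of apps" Application per cluster, sketched below for staging. The name staging-root and the use of the default project are assumptions, not part of the layout above; a manifest like this is typically applied once during ArgoCD bootstrap.

```yaml
# Hypothetical bootstrap manifest; applied once, after which Git drives the rest
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: staging-root                 # assumed name
  namespace: argocd
spec:
  project: default                   # assumption; a dedicated platform AppProject is stricter
  source:
    repoURL: https://github.com/myorg/gitops-platform
    targetRevision: main
    path: clusters/staging/argocd-apps
    directory:
      recurse: true                  # pick up Application manifests in subfolders too
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd                # child Application resources live in the argocd namespace
  syncPolicy:
    automated:
      prune: true                    # removing a file from Git removes the child Application
      selfHeal: true
```

A matching root Application for production would point at clusters/production/argocd-apps but keep the same destination, because the Application resources themselves live next to ArgoCD even when the workloads they describe run on the production cluster.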

Production best practice: disable auto-sync and use Sync Windows. Never set automated.selfHeal: true on production without at least a change-freeze Sync Window. Configure argocd.argoproj.io/sync-wave annotations to control the order in which resources roll out (e.g., ConfigMap before Deployment, Deployment before HPA). Use ArgoCD Notifications to post a Slack message to your on-call channel on every production sync, including a diff link.
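
As a rough illustration of these two mechanisms, the sketch below adds a deny Sync Window to the payments-team AppProject and shows a sync-wave annotation on a ConfigMap. The cron schedule, duration, time zone, and the ConfigMap name are assumed values, and the AppProject fields shown are only a subset of what a real project would define.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: payments-team
  namespace: argocd
spec:
  sourceRepos:
    - https://github.com/myorg/gitops-platform
  destinations:
    - server: https://kubernetes.default.svc
      namespace: payments-staging
    - server: https://prod-cluster-api.internal:6443
      namespace: payments-production
  syncWindows:
    - kind: deny                     # block syncs while the window is open
      schedule: "0 18 * * 5"         # assumed: opens Fridays at 18:00
      duration: 60h                  # assumed: stays open until Monday 06:00
      timeZone: "UTC"
      applications:
        - payments-api-production
      manualSync: false              # manual syncs are blocked as well

---
# Sync waves: lower numbers are applied first
apiVersion: v1
kind: ConfigMap
metadata:
  name: payments-api-config          # hypothetical resource
  annotations:
    argocd.argoproj.io/sync-wave: "-1"   # before the Deployment, which defaults to wave 0
data: {}
```

Setting manualSync: true on the window would keep the freeze for automated syncs while still allowing an operator to push an emergency fix by hand.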

Step 4: The CI Pipeline — Closing the Loop

The CI pipeline (GitHub Actions) builds the image, pushes it to the container registry, then opens a pull request against the config repo updating image-tag.yaml. This is the only job CI has in a GitOps system with respect to deployment.

# .github/workflows/deploy.yml  (in the payments-api APP repo)
name: Build and Promote to Staging

on:
  push:
    branches: [main]

jobs:
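  build-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write             # lets GITHUB_TOKEN push to ghcr.io
    outputs:
      image_tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/myorg/payments-api
          tags: |
            type=sha,prefix=sha-,format=short

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .              # build from the checked-out tree
          push: true
          tags: ${{ steps.meta.outputs.tags }}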

  promote-staging:
    needs: build-push
    runs-on: ubuntu-latest
    steps:
      - name: Checkout config repo
        uses: actions/checkout@v4
        with:
          repository: myorg/gitops-platform
          token: ${{ secrets.GITOPS_PAT }}
          path: gitops-platform

      - name: Update staging image tag
        working-directory: gitops-platform
        env:
          TAG: ${{ needs.build-push.outputs.image_tag }}
        run: |
          # Assumes image-tag.yaml is an ImageTagTransformer config (tag at
          # .imageTag.newTag); adjust the path if your file is shaped differently.
          yq e -i '.imageTag.newTag = strenv(TAG)' \
            services/payments-api/overlays/staging/image-tag.yaml
          git config user.email "ci-bot@myorg.com"
          git config user.name "CI Bot"
          git checkout -b "promote/staging/$TAG"
          git add services/payments-api/overlays/staging/image-tag.yaml
          git commit -m "chore(staging): promote payments-api to $TAG"
          git push origin "promote/staging/$TAG"

      - name: Open PR to config repo
        working-directory: gitops-platform
        env:
          GH_TOKEN: ${{ secrets.GITOPS_PAT }}
          TAG: ${{ needs.build-push.outputs.image_tag }}
        run: |
          gh pr create \
            --repo myorg/gitops-platform \
            --title "Promote payments-api $TAG to staging" \
            --body "Automated promotion from CI. Merge to deploy." \
            --base main \
            --head "promote/staging/$TAG"

Step 5: The Full Pipeline Architecture Diagram

Figure: End-to-end GitOps pipeline. A developer push triggers CI, which opens a PR to the config repo; ArgoCD reconciles staging automatically and production only on manual approval.

Step 6: Critical Production Failure Modes and How to Prevent Them

This pipeline will fail in predictable ways if you do not design against them from the start. Here are the four most common production GitOps failures and the mitigations:

CI pushes directly to main of the config repo. Someone sets up an auto-merge rule to speed up staging deploys. Later a bug in CI auto-promotes a broken tag all the way to production. Fix: require a human-approved PR merge for all environments, even staging. Use GitHub branch protection on main with at least one required approving review (required_approving_review_count: 1 in the branch protection API), and do not let the CI token bypass it.
ArgoCD selfHeal fights the HPA. The HPA scales your Deployment to 8 replicas under load. ArgoCD notices that Git says 2 and reverts it. Fix: add an ignoreDifferences entry for /spec/replicas on Deployments managed by an HPA, together with the RespectIgnoreDifferences=true sync option so that a sync does not reset the field either. Both are shown in the staging Application manifest above.
Secrets committed to the config repo. An engineer adds a Secret manifest with a real password; the base64 in a Secret is encoding, not encryption. Fix: scan for secrets both in a pre-commit hook and in a required CI check on the config repo (a local hook alone can be skipped), and use External Secrets Operator or Sealed Secrets instead of plain Secret manifests in Git.
ArgoCD has no resource limits and OOMs during a mass sync. A large platform with hundreds of Applications can overwhelm the ArgoCD Application Controller. Fix: set resources.requests and resources.limits on all ArgoCD components, use ApplicationSets to generate Applications from templates, and tune the Application Controller's --status-processors and --operation-processors flags for your cluster size. The defaults are 20 and 10; the ArgoCD documentation suggests roughly 50 and 25 for about 1,000 Applications.
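
Two of the mitigations above can be sketched as configuration. The first document is a hypothetical base hpa.yaml (the min, max, and CPU target are assumed values) that shows why live replica counts diverge from Git. The second sets the controller processor counts through the argocd-cmd-params-cm ConfigMap rather than command-line flags; the numbers follow the ArgoCD guidance for roughly 1,000 Applications and should be treated as a starting point.

```yaml
# services/payments-api/base/hpa.yaml (sketch)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payments-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payments-api
  minReplicas: 2                     # assumed
  maxReplicas: 10                    # assumed
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # assumed

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  controller.status.processors: "50"      # default is 20
  controller.operation.processors: "25"   # default is 10
```

The Application Controller reads these values at startup, so it needs a restart after the ConfigMap changes.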

Never store unencrypted secrets in the GitOps config repo. Even in a private repository, any CI system, any collaborator with read access, and any leaked PAT exposes those secrets permanently — git history is forever. The industry-standard solutions are External Secrets Operator (pulls secrets from AWS Secrets Manager / Vault at runtime) or Sealed Secrets (encrypts the secret with a cluster-specific key so only that cluster can decrypt it). Both integrate cleanly with ArgoCD and Flux.
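
As an illustration of the External Secrets Operator route, the sketch below produces the payments-api-secrets Secret that the base Deployment reads through envFrom. The store name, the remote key, and the refresh interval are assumptions, and the apiVersion depends on the operator version installed (newer releases serve external-secrets.io/v1).

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: payments-api-secrets
  namespace: payments-staging
spec:
  refreshInterval: 1h                # assumed
  secretStoreRef:
    kind: ClusterSecretStore
    name: platform-secrets           # hypothetical store backed by Vault or AWS Secrets Manager
  target:
    name: payments-api-secrets       # Secret consumed by envFrom in deployment.yaml
  dataFrom:
    - extract:
        key: staging/payments-api    # hypothetical path in the backing store
```

Only this reference is committed to Git; the secret values themselves stay in the backing store.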

What You Have Built

By assembling the pieces in this project, you now own a complete GitOps delivery pipeline that reflects production-grade practices:

A mono config repo with clear separation between base manifests, per-environment overlays, and cluster-level ArgoCD Application resources.
CI that only writes to Git — no cluster credentials, no direct kubectl calls from pipelines.
Staging auto-syncs with drift correction; production requires a human sync approval.
Image tag updates are isolated to a single file per environment, making PRs minimal and reviewable.
ArgoCD Application manifests live in the config repo — the platform is self-describing and recoverable from Git alone.

This pattern holds up for teams shipping many deployments per day. Congratulations on completing the GitOps tutorial. The foundation you have built here (pull-based reconciliation, declarative configs, automated drift correction, and promotion gates) is the same one that larger GitOps platforms are built on.