GitOps with ArgoCD & Flux

Project: A GitOps Delivery Pipeline

The previous nine lessons covered principles, tools, and individual techniques. Now you build the real thing: a complete, production-grade GitOps delivery pipeline for a multi-environment service. This lesson walks through exactly how to design the repositories and ArgoCD Application resources — the decisions that determine whether the system will scale cleanly or collapse under its own complexity.

The scenario mirrors what you encounter at large engineering organizations: a payments-api service that must flow through dev → staging → production across two Kubernetes clusters (dev and staging share a cluster in this example; production runs on its own isolated cluster). A separate platform team owns the GitOps config repo; application teams own their app repos. CI runs on GitHub Actions.

Step 1: Repository Layout Decision

At the start of every GitOps project the first architectural question is: one config repo or many? The answer at scale is almost always a mono-repo for configs combined with separate repos per service for source code. A single config repo gives the platform team a global view of cluster state, simplifies RBAC, and prevents config drift between services. Here is the target repo structure:

# gitops-platform/ : the single config repo (owned by platform team)
.
├── clusters/
│   ├── staging/
│   │   ├── argocd-apps/
│   │   │   └── payments-api.yaml       # ArgoCD Application for staging
│   │   └── cluster-config/
│   │       ├── namespaces.yaml
│   │       └── network-policies.yaml
│   └── production/
│       ├── argocd-apps/
│       │   └── payments-api.yaml       # ArgoCD Application for production
│       └── cluster-config/
│           ├── namespaces.yaml
│           └── network-policies.yaml
│
└── services/
    └── payments-api/
        ├── base/
        │   ├── kustomization.yaml
        │   ├── deployment.yaml
        │   ├── service.yaml
        │   ├── hpa.yaml
        │   └── serviceaccount.yaml
        └── overlays/
            ├── dev/
            │   ├── kustomization.yaml  # patch: replicas=1, resources small
            │   └── image-tag.yaml      # updated by CI (dev branch builds)
            ├── staging/
            │   ├── kustomization.yaml  # patch: replicas=2, staging secrets ref
            │   └── image-tag.yaml      # updated by CI (main branch builds)
            └── production/
                ├── kustomization.yaml  # patch: replicas=5, PodDisruptionBudget
                └── image-tag.yaml      # updated manually via PR (release tag)

Why separate image-tag.yaml files? Keeping the image tag in its own file instead of inline in deployment.yaml means CI can open a single-line PR that is easy for reviewers to approve. It also prevents CI from touching unrelated base manifests. The image-tag.yaml is a Kustomize images: patch — the smallest, most auditable unit of change per promotion.

Step 2: The Base Manifests

The base/ folder holds the common, environment-agnostic definition of the service. Overlays patch only what differs. Below is the realistic base Deployment and the Kustomize image-tag patch file that CI updates:

# services/payments-api/base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api
  labels:
    app: payments-api
    version: "unknown"            # overlays set real version label
spec:
  replicas: 1                     # overridden in every overlay
  selector:
    matchLabels:
      app: payments-api
  template:
    metadata:
      labels:
        app: payments-api
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
    spec:
      serviceAccountName: payments-api
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
      containers:
        - name: api
          image: ghcr.io/myorg/payments-api:latest   # replaced by image-tag patch
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /healthz/live
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "512Mi"
          envFrom:
            - secretRef:
                name: payments-api-secrets   # External Secrets Operator populates this
---
# services/payments-api/overlays/staging/image-tag.yaml
# CI overwrites only the newTag field via:
#   yq e '.images[0].newTag = "'$IMAGE_TAG'"' -i overlays/staging/image-tag.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
images:
  - name: ghcr.io/myorg/payments-api
    newTag: "sha-a1b2c3d"         # CI replaces this on each merge to main
---
# services/payments-api/overlays/staging/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: payments-staging
resources:
  - ../../base
patches:
  - patch: |-
      - op: replace
        path: /spec/replicas
        value: 2
    target:
      kind: Deployment
      name: payments-api
# NOTE: Kustomize resolves each components: entry as a directory holding a
# kustomization.yaml of kind: Component; a bare file path may be rejected by
# `kustomize build`, so verify this reference against your Kustomize version.
components:
  - image-tag.yaml
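
One caveat on the staging overlay: Kustomize treats every `components:` entry as a directory containing a `kustomization.yaml` of `kind: Component`, so a reference to a single file is likely to fail at build time. The following is a minimal sketch of a directory-based variant that keeps the tag isolated in its own file. The `image-tag/` directory name is an assumption and not part of the original layout, and the CI `yq` target path would have to change to match it.

```yaml
# services/payments-api/overlays/staging/image-tag/kustomization.yaml
# (hypothetical path: a Component directory instead of a single file)
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
images:
  - name: ghcr.io/myorg/payments-api
    newTag: "sha-a1b2c3d"   # CI would still rewrite only this field
---
# services/payments-api/overlays/staging/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: payments-staging
resources:
  - ../../base
components:
  - image-tag               # a directory, not a file
```

Running `kustomize build services/payments-api/overlays/staging` locally is a quick way to confirm that the rendered Deployment carries the expected image tag before ArgoCD ever sees the change.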

Step 3: ArgoCD Application Resources

Each environment gets its own Application manifest stored in the cluster-specific folder of the config repo. For ArgoCD to pick these files up, a root Application (the app-of-apps pattern) must point at that folder. With that in place ArgoCD manages itself: adding a new .yaml to clusters/staging/argocd-apps/ automatically creates the Application without anyone touching the ArgoCD UI.

# clusters/staging/argocd-apps/payments-api.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-api-staging
  namespace: argocd
  annotations:
    notifications.argoproj.io/subscribe.on-sync-succeeded.slack: devops-deploys
    notifications.argoproj.io/subscribe.on-sync-failed.slack: devops-alerts
  finalizers:
    - resources-finalizer.argocd.argoproj.io   # cascade-delete on app removal
spec:
  project: payments-team                       # AppProject scopes RBAC
  source:
    repoURL: https://github.com/myorg/gitops-platform
    targetRevision: main
    path: services/payments-api/overlays/staging
  destination:
    server: https://kubernetes.default.svc     # same cluster where ArgoCD runs
    namespace: payments-staging
  syncPolicy:
    automated:
      prune: true        # delete resources removed from Git
      selfHeal: true     # revert manual kubectl changes
      allowEmpty: false  # never sync an empty state (safety net)
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - RespectIgnoreDifferences=true
    retry:
      limit: 3
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas   # ignore if HPA manages replica count live
---
# clusters/production/argocd-apps/payments-api.yaml
# Production: NO automated sync; requires human approval via ArgoCD UI or CLI
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-api-production
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: payments-team
  source:
    repoURL: https://github.com/myorg/gitops-platform
    targetRevision: main
    path: services/payments-api/overlays/production
  destination:
    server: https://prod-cluster-api.internal:6443
    namespace: payments-production
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
    retry:
      limit: 2
      backoff:
        duration: 10s
        factor: 2
        maxDuration: 5m
    # No automated: block. Production sync is manual or via ArgoCD Sync Windows.

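
ArgoCD only creates Applications from a folder if something tells it to watch that folder. A common way to do this is a root Application, often called app-of-apps, that is applied once and then syncs every manifest under `clusters/staging/argocd-apps/`. The sketch below is one possible shape; the file name, the `staging-apps` name, and the use of the `default` project are assumptions, not part of the original repo layout.

```yaml
# clusters/staging/root-app.yaml (hypothetical file, applied once with kubectl)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: staging-apps            # hypothetical name
  namespace: argocd
spec:
  project: default              # assumption: a dedicated platform project may fit better
  source:
    repoURL: https://github.com/myorg/gitops-platform
    targetRevision: main
    path: clusters/staging/argocd-apps   # every Application manifest in this folder
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd           # child Application resources live in the argocd namespace
  syncPolicy:
    automated:
      prune: true               # removing a file from Git removes the child Application
      selfHeal: true
```

A second root Application pointing at `clusters/production/argocd-apps/` would cover the production manifests in the same way. Note that syncing the root only creates or updates the child Application objects; the production workload itself still waits for a manual sync.
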
Production best practice — disable auto-sync, use Sync Windows: Never set automated: selfHeal: true on production without at least a change-freeze Sync Window. Configure argocd.argoproj.io/sync-wave annotations to control the order resources roll out (e.g., ConfigMap before Deployment, Deployment before HPA). Use ArgoCD Notifications to post a Slack message to your on-call channel on every production sync, including a diff link.
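
As a rough illustration of the two mechanisms named above, the sketch below defines a deny Sync Window on the `payments-team` AppProject and tags a resource with a sync wave. The schedule, the duration, and the ConfigMap name are illustrative assumptions; the AppProject itself is referenced by the Application manifests but never shown in this lesson.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: payments-team
  namespace: argocd
spec:
  sourceRepos:
    - https://github.com/myorg/gitops-platform
  destinations:
    - server: https://kubernetes.default.svc
      namespace: payments-staging
    - server: https://prod-cluster-api.internal:6443
      namespace: payments-production
  clusterResourceWhitelist:
    - group: ""
      kind: Namespace           # permits the Namespace that CreateNamespace=true creates
  syncWindows:
    - kind: deny                # change freeze: block syncs in this window
      schedule: "0 18 * * 5"    # assumption: starts Fridays at 18:00
      duration: 62h             # assumption: ends Monday at 08:00
      timeZone: "UTC"
      applications:
        - payments-api-production
      manualSync: false         # manual syncs are blocked during the freeze as well
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: payments-api-config     # hypothetical resource, for illustration only
  annotations:
    argocd.argoproj.io/sync-wave: "-1"   # lower waves apply first; unannotated resources are wave 0
data:
  LOG_LEVEL: info
```
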

Step 4: The CI Pipeline — Closing the Loop

The CI pipeline (GitHub Actions) builds the image, pushes it to the container registry, then opens a pull request against the config repo updating image-tag.yaml. This is the only job CI has in a GitOps system with respect to deployment.

# .github/workflows/deploy.yml (in the payments-api APP repo)
name: Build and Promote to Staging
on:
  push:
    branches: [main]
jobs:
  build-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write            # lets GITHUB_TOKEN push to ghcr.io
    outputs:
      image_tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4
      - name: Log in to GHCR     # without a login step the push below fails
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/myorg/payments-api
          tags: |
            type=sha,prefix=sha-,format=short
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: ${{ steps.meta.outputs.tags }}
  promote-staging:
    needs: build-push
    runs-on: ubuntu-latest
    env:
      TAG: ${{ needs.build-push.outputs.image_tag }}   # passed via env to avoid inline expression expansion in scripts
    steps:
      - name: Checkout config repo
        uses: actions/checkout@v4
        with:
          repository: myorg/gitops-platform
          token: ${{ secrets.GITOPS_PAT }}
          path: gitops-platform
      - name: Update staging image tag
        working-directory: gitops-platform
        run: |
          # strenv() reads TAG from the environment as a string
          yq e '.images[0].newTag = strenv(TAG)' -i \
            services/payments-api/overlays/staging/image-tag.yaml
          git config user.email "ci-bot@myorg.com"
          git config user.name "CI Bot"
          git checkout -b "promote/staging/$TAG"
          git add services/payments-api/overlays/staging/image-tag.yaml
          git commit -m "chore(staging): promote payments-api to $TAG"
          git push origin "promote/staging/$TAG"
      - name: Open PR to config repo
        env:
          GH_TOKEN: ${{ secrets.GITOPS_PAT }}
        run: |
          gh pr create \
            --repo myorg/gitops-platform \
            --title "Promote payments-api $TAG to staging" \
            --body "Automated promotion from CI. Merge to deploy." \
            --base main \
            --head "promote/staging/$TAG"
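
Because every deployment passes through a pull request on the config repo, a render check on that pull request can catch a broken overlay before ArgoCD tries to sync it. The workflow below is a minimal sketch of such a check. The file name is hypothetical, and it assumes the `kustomize` binary is available on the runner image; add an install step if it is not.

```yaml
# .github/workflows/validate.yml (hypothetical, in the gitops-platform repo)
name: Validate overlays
on:
  pull_request:
    paths:
      - "services/**"
jobs:
  kustomize-build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false           # report every broken overlay, not just the first
      matrix:
        overlay: [dev, staging, production]
    steps:
      - uses: actions/checkout@v4
      - name: Render overlay
        env:
          OVERLAY: ${{ matrix.overlay }}
        run: |
          # A non-zero exit code from kustomize fails the PR check
          kustomize build "services/payments-api/overlays/$OVERLAY" > /dev/null
```

Marking this job as a required status check in branch protection means a promotion PR cannot be merged until all three overlays render.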

Step 5: The Full Pipeline Architecture Diagram

[Figure: End-to-end GitOps delivery pipeline, from app repo to multi-environment clusters. Flow: Developer → git push → App Repo (payments-api) → GitHub Actions (build + push image) → Container Registry (ghcr.io/myorg). GitHub Actions also opens a PR against the Config Repo (gitops-platform: services/, overlays/, clusters/). After PR review and merge, ArgoCD (Apps: staging + production) watches and reconciles: auto-sync to the Staging Cluster (namespace: payments-staging, replicas=2, selfHeal=true) and manual-approval sync to the Production Cluster (namespace: payments-production, replicas=5, PDB enforced). Both clusters pull images from the registry.]
End-to-end GitOps pipeline: developer push triggers CI, which opens a PR to the config repo; ArgoCD reconciles staging automatically and production only on manual approval.

Step 6: Critical Production Failure Modes and How to Prevent Them

This pipeline will fail in predictable ways if you do not design against them from the start. Here are four common production GitOps failures and their mitigations:

  1. CI merges to main of the config repo without review. Someone sets up an auto-merge rule to speed up staging deploys. Later the same automation is extended to production, and a bug in CI promotes a broken tag there. Fix: require a human-approved PR merge for all environments, even staging. Use GitHub branch protection on main with "Require approvals" set to at least 1 (required_approving_review_count: 1 in the API).
  2. ArgoCD selfHeal fights the HPA. HPA scales your Deployment to 8 replicas under load. ArgoCD notices the desired state in Git says 5 and reverts it. Fix: add an ignoreDifferences block for /spec/replicas on Deployments managed by an HPA. This is shown in the staging Application manifest above.
  3. Secrets committed to the config repo. An engineer adds a Secret manifest with a real password; the data field is only base64-encoded, which is not encryption. Fix: enforce this at the repo level with a pre-commit hook and CI check that run a secret scanner (gitleaks is one option), and use External Secrets Operator or Sealed Secrets. Never commit plain Secret manifests to Git.
  4. ArgoCD has no resource limits and OOMs during a mass sync. A large platform with hundreds of Applications can overwhelm the ArgoCD Application Controller. Fix: set resources.requests and resources.limits on all ArgoCD components, use ApplicationSets to generate Applications from a template rather than hand-writing hundreds of manifests, and tune the Application Controller's --status-processors and --operation-processors flags for your cluster size. The defaults are 20 and 10, so tuning only helps once you raise them as the Application count grows.
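
For failure mode 4, the tuning can be expressed declaratively as Kustomize patches over the upstream ArgoCD install manifests, so it lives in Git like everything else. The sketch below shows one way to do it; all numbers are illustrative assumptions and should be sized from observed controller memory and queue depth rather than copied.

```yaml
# Patch for the argocd-cmd-params-cm ConfigMap shipped with ArgoCD
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
data:
  controller.status.processors: "50"      # maps to --status-processors (default 20)
  controller.operation.processors: "25"   # maps to --operation-processors (default 10)
---
# Strategic-merge patch for the Application Controller StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
  namespace: argocd
spec:
  template:
    spec:
      containers:
        - name: argocd-application-controller
          resources:
            requests:
              cpu: "1"          # illustrative values only
              memory: 2Gi
            limits:
              memory: 4Gi
```

The controller reads these parameters at startup, so the StatefulSet needs a restart after the ConfigMap changes.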
Never store unencrypted secrets in the GitOps config repo. Even in a private repository, any CI system, any collaborator with read access, and any leaked PAT exposes those secrets permanently — git history is forever. The industry-standard solutions are External Secrets Operator (pulls secrets from AWS Secrets Manager / Vault at runtime) or Sealed Secrets (encrypts the secret with a cluster-specific key so only that cluster can decrypt it). Both integrate cleanly with ArgoCD and Flux.
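
With External Secrets Operator, what gets committed is a pointer to the secret rather than the secret itself. The manifest below is a minimal sketch of how the `payments-api-secrets` Secret referenced by the base Deployment could be produced. The store name and the remote key are hypothetical, and the `apiVersion` depends on the operator version installed in the cluster.

```yaml
apiVersion: external-secrets.io/v1beta1   # assumption: newer operator releases also serve v1
kind: ExternalSecret
metadata:
  name: payments-api-secrets
  namespace: payments-staging
spec:
  refreshInterval: 1h               # how often the operator re-reads the backing store
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager       # hypothetical store, configured by the platform team
  target:
    name: payments-api-secrets      # the Secret consumed by envFrom in the Deployment
  dataFrom:
    - extract:
        key: staging/payments-api   # hypothetical path in the secret manager
```

This file is safe to commit because it holds no secret material; the operator creates the actual Secret inside the cluster.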

What You Have Built

By assembling the pieces in this project, you now own a complete GitOps delivery pipeline that reflects production-grade practices:

  • A mono config repo with clear separation between base manifests, per-environment overlays, and cluster-level ArgoCD Application resources.
  • CI that never touches the cluster: it pushes images to the registry and writes to Git, with no cluster credentials and no direct kubectl calls from pipelines.
  • Staging auto-syncs with drift correction; production requires a human sync approval.
  • Image tag updates are isolated to a single file per environment, making PRs minimal and reviewable.
  • ArgoCD Application manifests live in the config repo — the platform is self-describing and recoverable from Git alone.

This pull-request-gated pattern scales to teams shipping many deployments per day. Congratulations on completing the GitOps delivery pipeline project. The foundation you have built here (pull-based reconciliation, declarative configs, automated drift correction, and promotion gates) is the same foundation that underpins deployment infrastructure at large engineering organizations.