The Secrets Problem
Every production system has secrets: database passwords, API keys, TLS private keys, OAuth client credentials, SSH keys, cloud provider access tokens. The question is never whether your system has secrets — it does — but whether those secrets are managed deliberately or scattered accidentally across every surface your engineers have ever touched.
The majority of real-world breaches do not start with a zero-day exploit or a sophisticated attack. They start with a secret committed to a Git repository, baked into a Docker image, pasted into a CI environment variable with overly broad access, or left in a shell history file on a shared bastion host. This lesson maps the full attack surface so you understand exactly what you are defending against before you design a solution.
What is Secrets Sprawl?
Secrets sprawl is the condition where credentials exist in many places simultaneously, often without a centralized inventory, rotation schedule, or access audit trail. It happens organically: a developer hard-codes a database password to get something working, copies it to a CI environment variable, pastes it into a Slack DM to a colleague, and then forgets every location it landed.
At scale, sprawl looks like this: a 50-engineer company scans its GitHub history and finds 3,400 secrets across 120 repositories, many of them active credentials for production systems. This is not far-fetched: results of this order are common the first time a tool like trufflehog or git-secrets is run against an unmanaged codebase. Any organisation that has never run one of these scans should assume it is in this state.
The Six Leak Paths
Secrets escape into the wrong hands through a predictable set of vectors. Understanding each one is prerequisite to closing them.
1. Source Code Commits
The most common and most dangerous vector. A developer checks in a .env file, hardcodes a key in a config, or pushes a test file that contains a real production credential. Git remembers everything: even after a git rm and a new commit, the secret lives in the full history, cloneable by anyone with repo access — and if the repo is ever briefly public, indexed by GitHub's secret scanning and by external scrapers within minutes.
The correct approach is to use environment variables or a secrets manager and never let the credential touch the file system in plaintext. But the hard problem is that preventing the commit requires tooling, policy, and culture — not just intent. Even experienced engineers commit secrets under deadline pressure. Pre-commit hooks and CI scanning are not optional at professional scale; they are table stakes.
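As a rough illustration of what a pre-commit or CI scanner does, here is a minimal Python sketch of pattern-based detection. The three patterns and the function name find_secrets are assumptions chosen for this example; real tools such as trufflehog ship far larger, tuned rule sets.

```python
import re

# Illustrative patterns only (assumptions for this sketch), not a complete rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded_password": re.compile(
        r"(?i)\b(?:password|passwd|secret|api_key)\s*[=:]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# Hypothetical file content; the AWS key is the well-known documentation example.
sample = (
    'DB_HOST = "db.internal"\n'
    'password = "hunter2hunter2"\n'
    'key = "AKIAIOSFODNN7EXAMPLE"\n'
)
for lineno, name in find_secrets(sample):
    print(f"line {lineno}: possible {name}")
```

A hook built on this idea would run the function over the staged changes (for example, the output of git diff --cached) and refuse the commit when the list of findings is non-empty.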
2. CI/CD Environment Variables
Environment variables in CI systems (GitHub Actions secrets, GitLab CI variables, Jenkins credentials store) feel safe because they are masked in logs. They are not. The masking is cosmetic: it replaces the literal string in log output, but the value is still available to every pipeline step that receives it, including third-party actions or plugins you have pulled in without auditing. A malicious or compromised GitHub Action can exfiltrate every secret it can reach with a single outbound HTTP request.
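The claim that masking is cosmetic can be shown with a small Python sketch. The function mask_log_line is a simplified stand-in for how a CI runner masks logs (an assumption, not any vendor's actual implementation), and the secret value is hypothetical.

```python
import base64


def mask_log_line(line, secrets):
    """Imitate CI log masking: replace each literal secret value with ***."""
    for value in secrets:
        line = line.replace(value, "***")
    return line


secret = "s3cr3t-deploy-token"  # hypothetical secret value

# The literal value is masked as expected.
plain_line = f"token={secret}"
print(mask_log_line(plain_line, [secret]))  # token=***

# A trivially transformed value no longer matches the literal string,
# so it passes through the masker and can be decoded by whoever reads the log.
encoded_line = "token=" + base64.b64encode(secret.encode()).decode()
leaked_line = mask_log_line(encoded_line, [secret])
print(leaked_line)
```

The second line is printed untouched, and anyone with log access can reverse the encoding. Masking only protects against accidental echoing of the exact string; it does nothing about code that deliberately transforms or transmits the value.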
3. Container Images
Docker build processes are a classic credential sink. A developer adds an ARG or ENV to pull a private package, run a database migration, or authenticate an API call during build time. That value is then baked into the image metadata or one or more image layers and is trivially extractable by anyone who can pull the image, for example by running docker history --no-trunc <image>.
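To make that concrete, here is a minimal Python sketch that scans build-history entries for secret-looking assignments. The sample lines are hypothetical and only shaped like the "created by" column that docker history --no-trunc prints; the exact text varies by builder and Docker version, and the name-suffix heuristic is an assumption for this example.

```python
import re

# Matches NAME=value where NAME ends in a secret-sounding suffix (heuristic).
ASSIGNMENT = re.compile(r"\b([A-Z0-9_]*(?:TOKEN|SECRET|PASSWORD|KEY))=(\S+)")


def secrets_in_history(created_by_lines):
    """Map variable name -> value for every secret-looking assignment found."""
    found = {}
    for line in created_by_lines:
        for name, value in ASSIGNMENT.findall(line):
            found[name] = value
    return found


# Hypothetical history entries: a build ARG consumed by RUN, and an ENV.
sample_history = [
    "RUN |1 NPM_TOKEN=npm_hypothetical123 /bin/sh -c npm ci # buildkit",
    "ENV DB_PASSWORD=hunter2hunter2",
    "COPY . /app # buildkit",
]

for name, value in secrets_in_history(sample_history).items():
    print(f"{name} is recoverable from the image: {value}")
```

Nothing here requires privileged access: the input is metadata that ships with the image itself, which is why build-time credentials need a mechanism that never writes them into the image.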
4. Logs and Monitoring Systems
Application logs are another common exfiltration point. Request logs that include full URLs capture API keys passed as query parameters (?api_key=abc123). Error reports include stack traces that dump environment variables. Distributed tracing headers can carry auth tokens. All of this lands in Elasticsearch, Datadog, Splunk, or CloudWatch, systems that usually have much weaker access control than your production secrets store.
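One mitigation for the query-parameter case is to redact before the record reaches any handler. The following is a minimal sketch using Python's standard logging module; the parameter names in SENSITIVE_PARAM and the class name RedactSecretsFilter are assumptions for illustration, and a real deployment would need a list that matches its own APIs.

```python
import logging
import re

# Parameter names treated as sensitive (an assumed, incomplete list).
SENSITIVE_PARAM = re.compile(r"(?i)(api_key|token|password|secret)=([^&\s]+)")


def redact(text):
    """Replace the value of any sensitive key=value pair with [REDACTED]."""
    return SENSITIVE_PARAM.sub(r"\1=[REDACTED]", text)


class RedactSecretsFilter(logging.Filter):
    """Logging filter that redacts the fully formatted message in place."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = ()  # message is already formatted; drop the raw args
        return True       # keep the record, now redacted


handler = logging.StreamHandler()
handler.addFilter(RedactSecretsFilter())
logger = logging.getLogger("requests_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info("GET %s", "/v1/items?api_key=abc123&page=2")
```

Redaction at the logging layer is a safety net, not a fix: the more durable change is to stop passing credentials in URLs at all, since proxies and load balancers upstream of the application log them too.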
5. Infrastructure State Files
Terraform state files (terraform.tfstate) contain the full plaintext output of every resource created — including generated passwords, private keys, and database connection strings. A state file stored in an S3 bucket without encryption or proper IAM policies is a complete credential dump of your infrastructure. The same applies to Ansible vault files stored with weak passphrases, CloudFormation outputs written to Parameter Store without encryption, and Helm chart values files checked into Git.
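A quick way to see how much a state file exposes is to walk its JSON and list attributes whose names look sensitive. This Python sketch assumes the version 4 state layout (a top-level resources list, each with instances carrying an attributes map); the sample state and the name hints are hypothetical, and only top-level string attributes are checked.

```python
import json

# Attribute-name fragments treated as sensitive (assumed heuristic).
SENSITIVE_HINTS = ("password", "secret", "private_key", "token")


def sensitive_attributes(state_json):
    """Return (resource_address, attribute_name) for non-empty sensitive-looking strings."""
    state = json.loads(state_json)
    hits = []
    for resource in state.get("resources", []):
        address = f'{resource["type"]}.{resource["name"]}'
        for instance in resource.get("instances", []):
            for key, value in instance.get("attributes", {}).items():
                looks_sensitive = any(hint in key.lower() for hint in SENSITIVE_HINTS)
                if looks_sensitive and isinstance(value, str) and value:
                    hits.append((address, key))
    return hits


# Hypothetical, heavily trimmed state file.
sample_state = json.dumps({
    "version": 4,
    "resources": [{
        "type": "aws_db_instance",
        "name": "main",
        "instances": [{
            "attributes": {
                "username": "app",
                "password": "hunter2hunter2",
                "endpoint": "db.internal:5432",
            }
        }],
    }],
})

for address, attribute in sensitive_attributes(sample_state):
    print(f"{address}: '{attribute}' is stored in plaintext")
```

Every hit is a credential that anyone with read access to the state backend already holds, which is why the backend's encryption and access policy matter as much as the secrets store's.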
6. Shared Secrets Between People
Slack, email, Notion, 1Password shared vaults, and shared bastion host accounts all represent the human vector: secrets passed person-to-person leave a copy in every system they transited through. There is no reliable way to contain a secret that has been shared over Slack: you do not know who has it, who exported the conversation, or whether it landed in a third-party Slack app's storage. The only dependable response is to rotate it and treat the old value as compromised.
The Real-World Cost: Three Canonical Incidents
These are not hypotheticals. They are the pattern of real breaches:
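- Uber (2022): An attacker took over a contractor's account by flooding the contractor with MFA approval requests until one was accepted, then reportedly found administrator credentials for the company's privileged access management system hardcoded in a PowerShell script on an internal network share. Those credentials opened up internal systems including the AWS and Google Workspace environments. (The often-quoted figure of 57 million exposed records belongs to Uber's separate 2016 breach, which began with AWS keys found in a private GitHub repository.)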
- Toyota (2022): Part of the T-Connect source code sat in a public GitHub repository for almost five years and contained an access key for a data server. Toyota reported that 296,019 customer email addresses and customer management numbers may have been exposed. The credential was never rotated because no one knew it was in the repo.
- Codecov (2021): Supply chain attack modified the Codecov bash uploader script to exfiltrate all environment variables from any CI pipeline that ran it — capturing secrets from thousands of downstream companies including HashiCorp, Twilio, and Rapid7.
Using git filter-repo or BFG to remove the secret from history is a destructive rewrite that invalidates every existing clone and fork. The operational cost of rewriting history usually exceeds the cost of rotating the credential. Rotate first, rewrite never (or only if the credential cannot be rotated).
Quantifying Your Attack Surface: Detection First
Before building a secrets management system, you must understand the current state of your secrets sprawl. Run a scanner such as trufflehog or git-secrets against your codebase and infrastructure before your next sprint planning session.
Running trufflehog in CI achieves the same coverage as a local scan. Either way, this scan should run on every PR, not as an afterthought but as a blocking check. A PR that introduces a secret should fail CI before it can be merged.
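The blocking-check idea can be sketched in a few lines of Python. This example uses Shannon entropy to flag random-looking strings on added lines of a unified diff and maps the result to an exit code. The 20-character minimum, the 4.0 threshold, and the function names are assumptions for illustration; dedicated scanners combine entropy with pattern matching and live credential verification.

```python
import math
import re
from collections import Counter

# Candidate tokens: runs of 20+ characters drawn from typical key alphabets.
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(s):
    """Bits of entropy per character, from the string's own character frequencies."""
    counts = Counter(s)
    return -sum((n / len(s)) * math.log2(n / len(s)) for n in counts.values())


def scan_added_lines(diff_text, threshold=4.0):
    """Return high-entropy tokens that appear on added lines of a unified diff."""
    findings = []
    for line in diff_text.splitlines():
        # Only lines the PR adds; skip the '+++ b/path' file header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for candidate in TOKEN.findall(line):
            if shannon_entropy(candidate) >= threshold:
                findings.append(candidate)
    return findings


def exit_code(diff_text):
    """Non-zero when the diff introduces a likely secret, so CI marks the job failed."""
    return 1 if scan_added_lines(diff_text) else 0


# Hypothetical diff; the token is random-looking filler, not a real credential.
sample_diff = (
    "+++ b/app/config.py\n"
    '+API_TOKEN = "q7Zx2LmP9vKc4TyB8nRd5WjH3sGf6AeU"\n'
    '-OLD_VALUE = "x"\n'
    '+PADDING = "aaaaaaaaaaaaaaaaaaaaaaaa"\n'
)
print(scan_added_lines(sample_diff))
print("exit code:", exit_code(sample_diff))
```

In a pipeline, the script would end with sys.exit(exit_code(diff)) so that a finding fails the job and blocks the merge. Entropy alone produces false positives on hashes and generated identifiers, so an allowlist is usually needed alongside it.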
Why the Standard DevOps Stack Is Not Enough
A common mistake is believing that environment variables are "secure enough." They are not: they are just less convenient to read than a plaintext file. On Linux, a process running as the same OS user (or as root) can typically read another process's environment through /proc/<pid>/environ. In containers, docker inspect <container> dumps them. Kubernetes Secret resources are stored in etcd base64-encoded, not encrypted, unless encryption at rest has been configured, and they are readable by anyone whose RBAC role grants get on secrets in the relevant namespace. Base64 is not encryption.
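The point about base64 takes only a few lines of Python to demonstrate. The dictionary below stands in for the data field of a Kubernetes Secret; the key name and value are hypothetical, and decode_secret_data is a helper invented for this sketch.

```python
import base64


def decode_secret_data(data):
    """Decode every value of a Secret's `data` map; no key or password is required."""
    return {key: base64.b64decode(value).decode() for key, value in data.items()}


# Hypothetical `data` section, encoded the way it appears in a Secret manifest.
secret_data = {
    "DB_PASSWORD": base64.b64encode(b"prod-db-password").decode(),
}

print(secret_data)                      # what is stored and returned by the API
print(decode_secret_data(secret_data))  # what any reader can recover instantly
```

Decoding needs nothing beyond the encoded value itself, so the only real protections for a Secret are who can read it (RBAC) and whether etcd encrypts it at rest.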
The next lesson establishes the principles of a proper secrets management system: centralisation, dynamic credentials, least-privilege access, full audit logging, and automatic rotation. Every one of these principles is a direct answer to one of the leak paths mapped in this lesson.