Secrets Management & PKI

Vault Dynamic Secrets

Static credentials, such as a database password or an AWS access key that lives in a config file or environment variable, have a fundamental flaw: they do not expire. If a credential leaks through a log line, a git commit, a compromised CI runner, or a nosy shoulder-surfer, it remains valid indefinitely unless someone manually rotates it. At scale, with hundreds of services, "manually rotate" usually means "never rotate." Dynamic secrets fix this at the architectural level: Vault generates a brand-new, time-limited credential on demand and automatically revokes it when the lease expires. There is nothing to leak that has lasting value.

Core concept: A dynamic secret is not retrieved from storage; it is created at request time. Vault calls the target system's API (database, AWS STS, Azure AD, etc.), provisions a new principal or session with the requested permissions, returns the credential, and attaches a lease with a TTL. When the lease expires, Vault revokes the credential; for STS tokens, AWS itself enforces the expiry. The application gets credentials that did not exist before the request and that stop working when the lease ends.

The Static vs. Dynamic Credential Risk Model

Consider a typical microservices deployment. Ten services share one Postgres password stored in a Kubernetes Secret. The blast radius of a single credential compromise is all ten services and every table in the database. The attacker has days or weeks before anyone notices and rotates. With dynamic secrets, each service gets its own credential tied to its own lease. If service A is compromised, only service A's credential is exposed, and it expires within one hour. The attacker's window collapses from weeks to an hour at most.

Static credentials create shared, long-lived blast radii. Dynamic secrets isolate each service with short-lived, auto-revoked credentials.

Enabling the Database Secrets Engine

Vault ships with a database secrets engine that supports PostgreSQL, MySQL, MSSQL, MongoDB, Cassandra, Elasticsearch, and more. The pattern is identical across all of them: configure a connection, define roles with SQL templates, and let services request credentials on demand. Here is the complete setup for PostgreSQL, which is the most common production use case:

# 1. Enable the database secrets engine at a path
vault secrets enable -path=database database

# 2. Configure the connection to your PostgreSQL cluster
#    Vault uses this privileged connection to CREATE and DROP roles
vault write database/config/prod-postgres \
  plugin_name=postgresql-database-plugin \
  connection_url="postgresql://{{username}}:{{password}}@postgres.internal:5432/appdb?sslmode=require" \
  allowed_roles="app-readonly,app-readwrite,app-migration" \
  username="vault_admin" \
  password="$VAULT_ADMIN_PW" \
  root_rotation_statements="ALTER USER \"{{username}}\" WITH PASSWORD '{{password}}';"

# 3. Rotate the root password immediately so Vault is the only entity that knows it
vault write -force database/rotate-root/prod-postgres

# 4. Define a role with a SQL creation template and a TTL
vault write database/roles/app-readonly \
  db_name=prod-postgres \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  revocation_statements="REVOKE ALL ON ALL TABLES IN SCHEMA public FROM \"{{name}}\";
    DROP ROLE IF EXISTS \"{{name}}\";" \
  default_ttl=1h \
  max_ttl=24h

# 5. Request a credential (what an application or CI job does at startup)
vault read database/creds/app-readonly
# Key                Value
# ---                -----
# lease_id           database/creds/app-readonly/8a2f3c1d...
# lease_duration     1h
# lease_renewable    true
# password           A1a-xyz987pqr
# username           v-token-app-rea-abc123-1748000000
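
One detail in step 5 is easy to miss: every read of database/creds/app-readonly creates a new database user, so the username and password must come from the same response. The sketch below, which assumes jq is installed, parses a single JSON response and exports it as the standard libpq variables PGUSER and PGPASSWORD. The function name db_creds_to_env and the variable VAULT_DB_LEASE_ID are hypothetical names chosen for this example.

```shell
# db_creds_to_env: reads ONE `vault read -format=json` response on stdin and
# prints export statements. @sh shell-quotes every value, so the output is
# safe to eval even if a password contains spaces or quotes.
db_creds_to_env() {
  jq -r '@sh "export PGUSER=\(.data.username)",
         @sh "export PGPASSWORD=\(.data.password)",
         @sh "export VAULT_DB_LEASE_ID=\(.lease_id)"'
}

# Usage (requires a logged-in vault CLI):
#   eval "$(vault read -format=json database/creds/app-readonly | db_creds_to_env)"
#   psql -h postgres.internal -d appdb -c 'SELECT 1'
```

Keeping the lease ID alongside the credential lets the same process renew or revoke the lease later.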

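Production practice: Always call rotate-root immediately after configuring a database connection. After rotation, nobody knows the privileged account's password, including the engineer who set it up, and Vault manages it exclusively. Because the rotated password cannot be read back out of Vault, give Vault its own dedicated database account rather than reusing one that people or other systems depend on. Without this step, the bootstrap password stays valid, remains known to whoever configured it, and may linger in shell history, environment variables, or provisioning scripts.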

AWS Dynamic Credentials (IAM)

The AWS secrets engine follows the same pattern but calls the AWS STS or IAM API instead of a database. It can generate assumed-role credentials (temporary STS tokens via sts:AssumeRole), federated user tokens, or full IAM users with programmatic access keys. In production, assumed-role credentials are strongly preferred because they are native to AWS IAM, have built-in expiry enforced by STS, and do not require Vault to call iam:DeleteAccessKey for cleanup.

# Enable the AWS secrets engine
vault secrets enable -path=aws aws

# Configure with a Vault IAM user that has sts:AssumeRole permission
vault write aws/config/root \
  access_key="$AWS_ACCESS_KEY_ID" \
  secret_key="$AWS_SECRET_ACCESS_KEY" \
  region=us-east-1

# Create a role that assumes an existing IAM role via STS
vault write aws/roles/deploy-role \
  credential_type=assumed_role \
  role_arns=arn:aws:iam::123456789012:role/DeployerRole \
  default_sts_ttl=15m \
  max_sts_ttl=1h

# The application requests credentials at runtime:
vault read aws/creds/deploy-role
# Key                Value
# ---                -----
# lease_id           aws/creds/deploy-role/9c4d...
# access_key         ASIAXXX...
# secret_key         wJalrXUt...
# security_token     FQoGZXIvYXdzE...   <-- required for assumed-role creds
# lease_duration     15m
# lease_renewable    false              <-- STS credentials cannot be renewed; request new ones

# In a deploy script: inject directly into the environment.
# @sh shell-quotes each value; the outer double quotes stop eval from word-splitting.
eval "$(vault read -format=json aws/creds/deploy-role | \
  jq -r '.data | @sh "export AWS_ACCESS_KEY_ID=\(.access_key)",
                 @sh "export AWS_SECRET_ACCESS_KEY=\(.secret_key)",
                 @sh "export AWS_SESSION_TOKEN=\(.security_token)"')"

Production pitfall: AWS assumed-role credentials include a security_token (session token) that MUST be set as AWS_SESSION_TOKEN. Applications written for IAM user credentials, which have no session token, often drop it. AWS then rejects their requests with 403 errors that look like permission problems rather than incomplete credentials. Test your application with STS credentials early, not just with long-lived IAM user keys.
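
A cheap guard against this pitfall is a pre-flight check in the deploy script. The sketch below relies on the AWS convention that temporary access key IDs start with ASIA while long-lived IAM user keys start with AKIA; check_aws_env is a hypothetical helper name, and the check only inspects environment variables, it does not call AWS.

```shell
# check_aws_env: returns 0 when the AWS credential variables are consistent,
# 1 (with a message on stderr) when something is missing.
check_aws_env() {
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "missing AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY" >&2
    return 1
  fi
  case "$AWS_ACCESS_KEY_ID" in
    ASIA*)
      # Temporary (STS) key: unusable without the session token.
      if [ -z "${AWS_SESSION_TOKEN:-}" ]; then
        echo "temporary key (ASIA...) set without AWS_SESSION_TOKEN" >&2
        return 1
      fi
      ;;
  esac
  return 0
}

# Usage: fail the deploy early instead of chasing a misleading 403 later.
#   check_aws_env || exit 1
```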

Lease Lifecycle: Renewal and Revocation

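Every dynamic secret comes with a lease: a lease ID, a TTL, and a renewable flag. Vault tracks all active leases. An application should renew a renewable lease before it expires so that its credential is not revoked mid-session. Renewal cannot extend a lease past the role's max_ttl, and a non-renewable lease cannot be extended at all; in both cases the application must request a fresh credential. If an application is being decommissioned, it should revoke its lease immediately rather than waiting for natural expiry, so that no valid credential outlives the workload that used it.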

# Renew a lease before it expires (e.g., from a sidecar or background goroutine)
vault lease renew database/creds/app-readonly/8a2f3c1d...

# Revoke a specific lease immediately (on app shutdown or incident)
vault lease revoke database/creds/app-readonly/8a2f3c1d...

# Revoke ALL leases under a prefix (nuclear option for a compromised service)
vault lease revoke -prefix database/creds/app-readonly/

# List all active leases for a mount (useful during incident response)
vault list sys/leases/lookup/database/creds/app-readonly/

# Inspect a specific lease
vault write sys/leases/lookup lease_id=database/creds/app-readonly/8a2f3c1d...
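
Renewal is bounded: a lease cannot be extended past the max_ttl of the role that issued it (24h for app-readonly above). A renewal loop therefore has to choose between renewing and requesting a new credential. The following sketch shows that decision as plain arithmetic on epoch seconds. next_lease_action is a hypothetical helper; in practice Vault enforces the cap itself and reports the TTL it actually granted.

```shell
# next_lease_action ISSUED_AT MAX_TTL NOW INCREMENT   (all values in seconds)
# Prints "renew N" (N = seconds worth requesting) or "refetch".
next_lease_action() {
  local issued_at="$1" max_ttl="$2" now="$3" increment="$4"
  local remaining
  remaining=$(( issued_at + max_ttl - now ))
  if [ "$remaining" -le 0 ]; then
    # max_ttl reached: the lease can no longer be extended.
    echo "refetch"
  elif [ "$remaining" -lt "$increment" ]; then
    # Close to max_ttl: only the remaining time can be granted.
    echo "renew $remaining"
  else
    echo "renew $increment"
  fi
}

# Usage sketch (lease_id and issued_at are tracked by the caller):
#   action=$(next_lease_action "$issued_at" 86400 "$(date +%s)" 3600)
#   case "$action" in
#     renew\ *) vault lease renew -increment="${action#renew }" "$lease_id" ;;
#     refetch)  vault read -format=json database/creds/app-readonly ;;
#   esac
```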

The Vault Agent sidecar handles the lease lifecycle automatically in most production deployments. It authenticates to Vault on behalf of the application, writes the credential to a shared tmpfs volume or to the process environment, renews leases proactively at 2/3 of the TTL, and re-fetches credentials when leases cannot be renewed. Applications consume credentials from a well-known path and never talk to Vault directly.
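
From the application's side, consuming Agent-rendered credentials is just reading a file. The sketch below assumes the Agent template renders two lines, username=... and password=..., to a file on the shared volume. That format, the path /vault/secrets/db-creds in the usage note, and the helper name load_db_creds are all assumptions for illustration; the real layout is whatever your Agent template writes.

```shell
# load_db_creds FILE: export PGUSER and PGPASSWORD from a KEY=VALUE file.
# Returns non-zero if the file is unreadable or either value is missing.
load_db_creds() {
  local file="$1" key value
  unset PGUSER PGPASSWORD
  if [ ! -r "$file" ]; then
    echo "credentials file $file is not readable" >&2
    return 1
  fi
  # The "|| [ -n "$key" ]" clause also handles a last line without a newline.
  while IFS='=' read -r key value || [ -n "$key" ]; do
    case "$key" in
      username) export PGUSER="$value" ;;
      password) export PGPASSWORD="$value" ;;
    esac
  done < "$file"
  [ -n "${PGUSER:-}" ] && [ -n "${PGPASSWORD:-}" ]
}

# Usage: reload before opening each new connection, because the Agent
# rewrites the file whenever it has to fetch a fresh credential.
#   load_db_creds /vault/secrets/db-creds && psql -h postgres.internal -d appdb
```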

Vault Agent handles authentication and lease renewal as a sidecar; the application container reads credentials from a shared tmpfs volume and has no direct Vault dependency.

Policy Scoping for Dynamic Secret Roles

Each service should have a Vault policy that grants access to only the specific roles it needs. Write the policy in HCL and associate it with the service's authentication method (Kubernetes ServiceAccount, AWS IAM role, AppRole, etc.):

# policy: svc-payments-policy.hcl
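# Grants the payments service credentials for two database roles and nothing else.
# (payments-readwrite must also exist as a role and appear in the connection's allowed_roles.)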
path "database/creds/app-readonly" {
  capabilities = ["read"]
}

path "database/creds/payments-readwrite" {
  capabilities = ["read"]
}

# Allow the service to renew and revoke its own leases
path "sys/leases/renew" {
  capabilities = ["update"]
}

path "sys/leases/revoke" {
  capabilities = ["update"]
}

# Write and bind the policy
vault policy write svc-payments svc-payments-policy.hcl

# Bind to Kubernetes auth: the payments ServiceAccount in the payments namespace
vault write auth/kubernetes/role/payments \
  bound_service_account_names=payments-svc \
  bound_service_account_namespaces=payments \
  token_policies=svc-payments \
  token_ttl=1h
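
Once the role is bound, a pod running as the payments-svc ServiceAccount can exchange its ServiceAccount token for a Vault token. A minimal sketch, assuming the Kubernetes auth method is mounted at auth/kubernetes (as the bind command above implies) and the ServiceAccount token sits at the default in-pod path; k8s_vault_login is a hypothetical helper name:

```shell
# k8s_vault_login ROLE [JWT_PATH]: prints a Vault client token on success.
k8s_vault_login() {
  local role="$1"
  local jwt_path="${2:-/var/run/secrets/kubernetes.io/serviceaccount/token}"
  # key=@file tells the vault CLI to read the value from that file.
  vault write -field=token auth/kubernetes/login role="$role" jwt=@"$jwt_path"
}

# Usage inside the payments pod:
#   VAULT_TOKEN=$(k8s_vault_login payments) && export VAULT_TOKEN
#   vault read database/creds/app-readonly
```

The returned token carries only the svc-payments policy, so the paths it can read are exactly the ones listed in the policy file.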

Principle of least privilege: A service policy should never grant access to the wildcard path database/creds/*. Each policy should list only the specific role paths that the service legitimately uses. When a service is compromised, a narrow policy limits the attacker to the database roles that service was entitled to, not every role in the engine.
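
This rule is easy to enforce mechanically, for example as a CI check on policy files. The sketch below is a simple text match, not an HCL parser: it flags any path stanza under database/ that contains a glob (*) or a single-segment wildcard (+). The helper name has_wildcard_db_path is hypothetical, and the check assumes the engine is mounted at database/ as in this lesson.

```shell
# has_wildcard_db_path FILE: returns 0 if FILE contains a policy path under
# database/ that uses "*" or "+", and 1 otherwise.
has_wildcard_db_path() {
  grep -Eq '^[[:space:]]*path[[:space:]]+"database/[^"]*[*+]' "$1"
}

# Usage in CI:
#   if has_wildcard_db_path svc-payments-policy.hcl; then
#     echo "wildcard database path found in policy" >&2
#     exit 1
#   fi
```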

Why Dynamic Beats Static: Production Evidence

At companies like Lyft, Netflix, and Shopify, the migration from static to dynamic secrets reportedly produced measurable outcomes: the window in which a leaked credential is usable dropped from days to minutes or hours (leaked credentials expire on their own, often before forensics complete), secret rotation went from a quarterly manual process to continuous and automatic, and audit trails became per-request rather than per-service. The TTL is the forcing function: when every credential expires in an hour, your entire secret management posture is permanently in rotation with little ongoing operational overhead.

Start with database secrets. The database engine covers the highest-risk credential type in most organizations. Enable it, configure one PostgreSQL or MySQL connection, migrate one service to dynamic creds, and measure the impact. Once the pattern is proven, roll it to the rest of the fleet. Do not try to migrate all secret types simultaneously — crawl, walk, run.