Advanced Terraform & IaC Patterns

Policy as Code for IaC

Infrastructure-as-Code gives teams the power to provision anything, and that is exactly the problem. Without guardrails, a developer can open port 22 to the world, spin up an unencrypted RDS instance, or deploy a public S3 bucket holding customer PII, all via a clean terraform apply. Policy as Code (PaC) is the practice of encoding those guardrails as machine-readable, version-controlled policy files and enforcing them automatically inside the CI/CD pipeline, before any change is applied to your cloud provider. Large engineering organizations commonly gate infrastructure changes this way. A plan that violates policy is blocked, not just warned; the engineer gets a structured rejection with the exact rule reference and remediation steps.

The Three Major Tools

Three tools dominate the Terraform PaC space, and they occupy different positions in the trust model:

Open Policy Agent (OPA) + Conftest — a general-purpose policy engine from the CNCF. Policies are written in the Rego language. You feed it a Terraform plan JSON and evaluate it against Rego rules. Open source, cloud-agnostic, widely adopted in Kubernetes admission control too.
Sentinel — HashiCorp's proprietary policy framework, built directly into Terraform Cloud/Enterprise. Policies are written in the Sentinel language and run as a native gate between plan and apply. Best-in-class integration with the HCP platform but requires TFC/TFE.
Checkov — a static analysis scanner by Bridgecrew (acquired by Palo Alto). It parses Terraform HCL directly (no plan needed) and checks against 1,000+ built-in policies for AWS, Azure, and GCP misconfigurations. Fast, zero-config, and integrates into any CI pipeline as a linter step.

Use all three, at different stages. Checkov runs in seconds on raw HCL during a developer's local PR check (shift-left). OPA/Conftest evaluates the full plan JSON after terraform plan (pre-apply gate). Sentinel enforces organization-wide hard policy in Terraform Enterprise. Layering tools catches different classes of violation at the cheapest point in the pipeline.

The Gated Pipeline Architecture

The diagram below shows how policy gates sit inside a production-grade Terraform CI pipeline. Each gate is a hard failure — a non-zero exit code fails the pipeline and blocks merge or apply.

Figure: A policy-gated Terraform CI pipeline. Checkov blocks on raw HCL, OPA/Conftest blocks on the plan JSON, and only a clean plan reaches apply.
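The fail-fast behaviour of the gates can be sketched in a few lines of Python. This is an illustrative model, not part of any of the tools: the names Gate, command_gate, run_gates, and pipeline_passed are hypothetical, and the only assumption about the real tools is that each one signals failure through a non-zero exit code.

```python
import subprocess
import sys
from typing import Callable, List, NamedTuple, Sequence, Tuple


class Gate(NamedTuple):
    """One pipeline stage; `run` returns a process-style exit code (0 = pass)."""
    name: str
    run: Callable[[], int]


def command_gate(name: str, argv: Sequence[str]) -> Gate:
    """Wrap a CLI invocation as a gate; its exit code decides pass or fail."""
    return Gate(name, lambda: subprocess.run(list(argv)).returncode)


def run_gates(gates: Sequence[Gate]) -> List[Tuple[str, int]]:
    """Run gates in order and stop at the first non-zero exit code."""
    results: List[Tuple[str, int]] = []
    for gate in gates:
        code = gate.run()
        results.append((gate.name, code))
        if code != 0:
            break  # later, more expensive gates never run
    return results


def pipeline_passed(gates: Sequence[Gate], results: Sequence[Tuple[str, int]]) -> bool:
    """True only if every gate ran and every gate exited with 0."""
    return len(results) == len(gates) and all(code == 0 for _, code in results)
```

In a real pipeline the gates would be built from the commands shown in this lesson, for example command_gate("checkov", ["checkov", "-d", "."]) followed by a Conftest gate. Ordering the cheapest check first means most violations are rejected before terraform plan ever runs.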

Checkov: Shift-Left Static Scanning

Checkov is the fastest feedback loop. Run it locally before you even open a PR, and again in CI on every push. It requires no plan — it reads your .tf files directly.

# Install (pipx keeps it isolated from system Python)
pipx install checkov

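# Scan a directory, running only the two listed checks.
# --hard-fail-on with a severity returns a non-zero exit code for failures at
# that severity or higher; severity data is only available with a platform API key.
checkov -d ./modules/vpc \
  --check CKV_AWS_25,CKV_AWS_23 \
  --compact \
  --output cli \
  --hard-fail-on HIGH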

# In CI (GitHub Actions excerpt):
# - name: Checkov scan
#   uses: bridgecrewio/checkov-action@master   # pin to a release tag or commit SHA in real pipelines
#   with:
#     directory: .
#     framework: terraform
#     soft_fail: false          # non-zero exit on any failed check
#     skip_check: CKV2_AWS_6   # suppress known accepted risk (document WHY)

# Generate a baseline to suppress existing findings while you fix incrementally:
checkov -d . --create-baseline    # writes .checkov.baseline
checkov -d . --baseline .checkov.baseline  # future runs ignore baselined issues

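Commit the .checkov.baseline file. It documents accepted risk and prevents finding-fatigue from inherited debt. The baseline is a JSON file, so it cannot carry comments; record the Jira/GitHub issue reference for each suppression in the commit or pull request that updates the baseline, or in an inline #checkov:skip=<check_id>:<reason> comment next to the resource, so the debt is tracked, not hidden.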
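Checkov also accepts inline suppressions of the form #checkov:skip=<check_id>:<reason> inside a resource block. A small lint can enforce the "document WHY" rule for those. The sketch below is a hypothetical helper, not a Checkov feature: the function name undocumented_skips and the issue-reference pattern (a Jira-style key such as SEC-142 or a GitHub-style #412) are assumptions chosen for illustration.

```python
import re
from typing import List, Tuple

# Inline suppression syntax: checkov:skip=<check_id>[:<reason>]
SKIP_RE = re.compile(r"checkov:skip=(?P<check>[A-Z0-9_]+)(?::(?P<reason>.*))?")
# Assumed issue-reference formats: "SEC-142" (Jira-style) or "#412" (GitHub-style).
ISSUE_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b|#\d+")


def undocumented_skips(tf_source: str) -> List[Tuple[int, str]]:
    """Return (line number, check ID) for each skip whose reason has no issue reference."""
    problems: List[Tuple[int, str]] = []
    for lineno, line in enumerate(tf_source.splitlines(), start=1):
        match = SKIP_RE.search(line)
        if match is None:
            continue
        reason = (match.group("reason") or "").strip()
        if not ISSUE_RE.search(reason):
            problems.append((lineno, match.group("check")))
    return problems
```

Run over every .tf file in CI, a non-empty result can fail the build, which keeps suppressions from accumulating without a tracked rationale.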

OPA + Conftest: Plan-Level Policy

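Checkov's HCL scan can miss decisions that are only resolved in the plan: a security group CIDR that arrives through a variable or module output, an instance type chosen by a lookup, a tag merged in from defaults. OPA evaluates the full terraform show -json output, giving you access to every resource change with its planned values. Values that are only known after apply are listed under after_unknown rather than after, so a policy cannot inspect them at plan time.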

# 1. Generate the plan JSON (standard step in all gated pipelines)
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json

# 2. Write a Rego policy — policy/deny_public_sg.rego
# package main
#
# import future.keywords.contains
# import future.keywords.if
# import future.keywords.in
#
# # "deny contains msg if" builds a set of messages, which Conftest reports as failures.
# deny contains msg if {
#   some rc in input.resource_changes
#   rc.type == "aws_security_group_rule"
#   rc.change.after.type == "ingress"
#   "0.0.0.0/0" in rc.change.after.cidr_blocks
#   msg := sprintf(
#     "DENY: %s opens ingress 0.0.0.0/0; use a specific CIDR",
#     [rc.address]
#   )
# }

# 3. Evaluate with Conftest (wraps OPA for CI-friendly output)
conftest test tfplan.json \
  --policy policy/ \
  --namespace main \
  --output table   # structured output; exits 1 on any deny

# Conftest returns exit code 1 if any deny rule fires.
# In GitHub Actions, that non-zero code fails the step and blocks the workflow.
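To make concrete what the Rego rule inspects, here is the same traversal written as a minimal Python sketch. It is not a replacement for OPA; it only shows which fields of the plan JSON (resource_changes, type, address, change.after) the policy reads. The function name open_ingress_violations is hypothetical.

```python
from typing import Any, Dict, List


def open_ingress_violations(plan: Dict[str, Any]) -> List[str]:
    """Return one message per security group rule that opens ingress to 0.0.0.0/0."""
    messages: List[str] = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_security_group_rule":
            continue
        # "after" is null when the resource is being deleted.
        after = (rc.get("change") or {}).get("after") or {}
        if after.get("type") != "ingress":
            continue
        if "0.0.0.0/0" in (after.get("cidr_blocks") or []):
            messages.append(f"DENY: {rc['address']} opens ingress 0.0.0.0/0")
    return messages
```

Loading tfplan.json with json.load and passing the result to this function yields the same set of addresses the Rego rule would deny, which can be a quick way to sanity-check a policy against a real plan.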

Keep policies in a dedicated policy/ directory inside your infra-live repo, versioned alongside the infrastructure code. Large teams publish policies to a shared OPA bundle server (S3 + OPA bundle API) so all squads consume the same central ruleset without copying files.

Sentinel: Hard Policy in Terraform Enterprise

If your organization uses Terraform Cloud or Enterprise, Sentinel is the native answer. It runs automatically after every plan in a workspace, before operators can approve an apply. Sentinel policies have three enforcement levels: advisory (a failure is logged but never blocks the run), soft-mandatory (a failure blocks the run unless an authorized user overrides it), and hard-mandatory (a failure blocks the run and cannot be overridden). Compliance teams love hard-mandatory for rules like "no resource may be created without an Owner tag" or "all S3 buckets must have versioning enabled."

# sentinel.hcl — policy set manifest (committed to your policy repo)
# policy "require-owner-tag" {
#   source = "./policies/require_owner_tag.sentinel"
#   enforcement_level = "hard-mandatory"
# }
#
# policy "approved-instance-types" {
#   source = "./policies/approved_instance_types.sentinel"
#   enforcement_level = "soft-mandatory"
# }

# require_owner_tag.sentinel
# import "tfplan/v2" as tfplan
#
# all_resources = filter tfplan.resource_changes as _, rc {
#   rc.mode is "managed" and
#   (rc.change.actions contains "create" or
#    rc.change.actions contains "update")
# }
#
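# # "else null" turns a missing tags attribute into null, so the rule
# # fails cleanly instead of evaluating to undefined.
# main = rule {
#   all all_resources as _, rc {
#     (rc.change.after.tags else null) is not null and
#     rc.change.after.tags contains "Owner"
#   }
# }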
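The three enforcement levels reduce to a small decision table. The following Python sketch models that table only; it is not Sentinel code, and the names Level and run_may_proceed are hypothetical.

```python
from enum import Enum


class Level(Enum):
    ADVISORY = "advisory"
    SOFT_MANDATORY = "soft-mandatory"
    HARD_MANDATORY = "hard-mandatory"


def run_may_proceed(level: Level, policy_passed: bool, override_approved: bool = False) -> bool:
    """Decide whether a run may continue to apply under a given enforcement level."""
    if policy_passed:
        return True
    if level is Level.ADVISORY:
        return True  # failure is logged, never blocking
    if level is Level.SOFT_MANDATORY:
        return override_approved  # blocked unless an authorized user overrides
    return False  # hard-mandatory: no override path
```

Note that override_approved has no effect on a hard-mandatory failure, which is why the escape hatch for such policies has to live outside the policy evaluation itself.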

A failing hard-mandatory Sentinel policy blocks the apply in every workspace attached to that policy set, including break-glass and incident-recovery workspaces. Always have a process: either a separate emergency workspace exempt from the policy set, or an on-call engineer with TFE admin rights who can temporarily disable the policy set (logged and audited). Without this escape hatch, a policy bug can block your incident response.

Structuring Policies for Scale

At big-tech scale, policies live in their own repository (infra-policies) with its own CI, versioning, and review process. The key practices:

Test your policies. OPA ships with opa test; Sentinel has a testing framework. A broken policy that fires on valid configs is as damaging as no policy at all — it trains engineers to bypass gates.
Separate WARN from BLOCK. New policies should start as warnings for two weeks. This gives teams time to fix existing violations before the rule becomes a hard failure. Track the warning period in the policy file header as a comment.
Tag every policy with a control ID. Link each Rego/Sentinel rule to a SOC 2, CIS, or internal control number. This makes audit evidence collection a conftest test --output json command rather than a manual spreadsheet exercise.
Version your policy bundles. Consumers (workspaces, pipelines) pin to a specific bundle version. Policy changes go through PRs with CODEOWNERS review from the security team before release.
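The warn-then-block rollout and the control-ID tagging can both be driven from a small piece of policy metadata. The sketch below is one possible shape, assuming a hypothetical PolicyMeta record kept alongside each rule; the field names and the 14-day default mirror the two-week warning period described above and are not a convention of OPA, Conftest, or Sentinel.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class PolicyMeta:
    rule_id: str        # hypothetical internal rule name
    control_id: str     # e.g. a CIS or SOC 2 control reference
    introduced: date    # date the rule was first released
    warn_days: int = 14  # length of the warning period


def enforcement(policy: PolicyMeta, today: date) -> str:
    """Return "warn" during the warning period and "block" from its end onward."""
    enforce_from = policy.introduced + timedelta(days=policy.warn_days)
    return "block" if today >= enforce_from else "warn"
```

With Conftest, the result could decide whether a rule is published as a warn rule or a deny rule in the released bundle, since Conftest treats the two rule names differently when computing its exit code.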

Policy as Code is not a one-time setup. Your infrastructure evolves, your threat model evolves, and new cloud services appear. Assign a policy review cadence — quarterly at minimum — where the platform team audits which policies are firing, which are being suppressed without documented rationale, and which new resources need coverage. Treat it like dependency updates: schedule it, automate the reporting, and own the debt.
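The "which policies are firing" report can be automated from Conftest's JSON output. This is a minimal sketch under two assumptions that should be checked against your setup: that conftest test --output json emits a list of per-file results each carrying a failures list of objects with a msg field, and that your team prefixes every message with a bracketed control ID such as [CIS-5.2] (a hypothetical convention). The function name failures_by_control is also hypothetical.

```python
import json
import re
from collections import Counter

# Assumed message convention: "[<control-id>] <description>"
CONTROL_RE = re.compile(r"^\[(?P<control>[^\]]+)\]")


def failures_by_control(conftest_json: str) -> Counter:
    """Count failures per control ID; messages without a prefix land in UNTAGGED."""
    counts: Counter = Counter()
    for file_result in json.loads(conftest_json):
        for failure in file_result.get("failures") or []:
            match = CONTROL_RE.match(failure.get("msg", ""))
            counts[match.group("control") if match else "UNTAGGED"] += 1
    return counts
```

Aggregating these counts across pipeline runs gives the quarterly review a starting point: controls that fire constantly may need better documentation, and a growing UNTAGGED bucket shows rules that still lack a control reference.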

With Checkov catching misconfigurations at the HCL level, OPA evaluating the full plan, and Sentinel enforcing organization-wide hard rules in the platform, you have defense-in-depth for your infrastructure supply chain. The result: engineers move fast, compliance teams sleep at night, and audit evidence is a git log away.