Compliance for Engineers

Compliance is not a legal department problem dropped on engineering at the last minute. In organizations running on cloud infrastructure, the infrastructure team is the compliance team in practice. Understanding what each framework actually demands of your systems lets you build compliance in from day one — instead of retrofitting controls before an audit and hoping nothing breaks in production.

This lesson covers the four frameworks that come up most in cloud-native environments: SOC 2, ISO 27001, PCI DSS, and GDPR. For each, we cut straight to what it requires of your infrastructure and how that maps to real engineering work.

SOC 2 — The Cloud-Era Trust Standard

SOC 2 (System and Organization Controls 2) is a voluntary US framework defined by the AICPA. Almost every B2B SaaS company pursuing enterprise customers will eventually need a SOC 2 Type II report. An independent auditor evaluates your controls against five Trust Services Criteria: Security, Availability, Processing Integrity, Confidentiality, and Privacy. Security is mandatory; the rest depend on what you offer.

What SOC 2 actually requires of infrastructure:

Access control: every human and service principal must have least-privilege access. Audit logs must prove it. AWS IAM permission boundaries, GCP Workload Identity, and Kubernetes RBAC are the common enforcement points.
Change management: all infrastructure changes must go through a documented, reviewed process. SSHing into a prod server ad hoc and editing configs by hand fails this criterion outright.
Availability monitoring: you must demonstrate that you detect and respond to incidents. Prometheus + Alertmanager + PagerDuty with documented runbooks is a common way to produce that evidence.
Encryption: data in transit (TLS 1.2 minimum, prefer 1.3) and at rest (AES-256). RDS encryption at rest must be enabled at creation; an existing unencrypted instance can only be encrypted by copying a snapshot with encryption enabled and restoring from it.
Vendor management: any third-party service that touches your data should have a current SOC 2 report, or equivalent assurance, on file. This means maintaining an inventory of your SaaS tools.
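The TLS floor in the encryption item can also be enforced in client code rather than left to defaults. A minimal sketch, assuming a Python service that makes outbound TLS connections; the helper name make_client_context is hypothetical, and the calls are from the standard library ssl module:

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Return a client-side TLS context that refuses anything below TLS 1.2."""
    # create_default_context() turns on certificate verification and hostname checks.
    ctx = ssl.create_default_context()
    # Reject handshakes that negotiate TLS 1.0 or 1.1.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
print(ctx.minimum_version.name)
```

Passing a context like this to every outbound connection gives an auditor one place to read the policy instead of a per-call setting.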

Type I vs Type II: Type I is a point-in-time snapshot that attests your controls are suitably designed. Type II covers an observation period, typically 6-12 months, and attests that the controls operated effectively throughout it. Enterprise procurement teams usually require Type II. Build toward it from the start; retroactive evidence collection is painful.

ISO 27001 — The International Standard

ISO 27001 is the global standard for an Information Security Management System (ISMS). It is certification-based rather than audit-report-based, which matters for EU and government customers. The 2022 revision restructured controls into four themes: Organizational, People, Physical, and Technological — the last of which is where DevOps engineers spend most of their time.

Key infrastructure requirements under ISO 27001 Annex A (2022):

A.8.7 — Protection against malware: running workloads must have runtime threat detection. Falco on Kubernetes or GuardDuty on AWS satisfies this at the infrastructure layer.
A.8.9 — Configuration management: systems must be provisioned from defined, version-controlled configurations. This is where Terraform, Ansible, and your GitOps pipeline become compliance artifacts, not just productivity tools.
A.8.15 — Logging: all security-relevant events must be logged and the logs must be retained for a defined period. Ninety days active plus one year cold is a common baseline. CloudWatch Logs with a retention policy, or a centralized Loki/OpenSearch cluster, satisfies this.
A.8.22 — Segregation of networks: workloads of different sensitivity must be isolated. VPC segmentation with explicit allow-list security groups is the standard implementation.
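The segregation requirement in A.8.22 lends itself to an automated check. The sketch below assumes security group rules have already been exported into plain dictionaries; the group, port, and cidr field names are hypothetical and are not an AWS API shape. It flags any ingress rule whose source is the entire internet:

```python
import ipaddress


def open_ingress_rules(rules):
    """Return the rules whose source CIDR covers all of IPv4 or all of IPv6."""
    flagged = []
    for rule in rules:
        network = ipaddress.ip_network(rule["cidr"], strict=False)
        # A prefix length of 0 means 0.0.0.0/0 or ::/0.
        if network.prefixlen == 0:
            flagged.append(rule)
    return flagged


# Hypothetical exported rule set, for illustration only.
rules = [
    {"group": "sg-app", "port": 443, "cidr": "10.0.1.0/24"},
    {"group": "sg-db", "port": 5432, "cidr": "0.0.0.0/0"},
    {"group": "sg-admin", "port": 22, "cidr": "::/0"},
]
flagged = open_ingress_rules(rules)
print([rule["group"] for rule in flagged])
```

Run in CI against the planned rule set, a check of this kind turns the allow-list policy into something that fails a build rather than something found during an audit.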

Figure: The four major compliance frameworks and how their requirements map to specific infrastructure controls.

PCI DSS — Card Data in Infrastructure

PCI DSS (Payment Card Industry Data Security Standard) applies when your infrastructure stores, processes, or transmits cardholder data. Version 4.0 (released 2022, fully effective 2025) significantly raised the bar on authentication and monitoring. The most important concept for engineers is the Cardholder Data Environment (CDE) — the set of systems that touch card data. Everything outside the CDE should be completely isolated from it.

Critical infrastructure requirements:

Requirement 1 (Network controls): the CDE must be in a separate network segment. In AWS this means a dedicated VPC (or at minimum dedicated subnets) with strict security groups and no default-allow egress. Document the network topology, because auditors will ask for it.
Requirement 3 (Stored data): Primary Account Numbers (PANs, the card numbers, typically 16 digits though lengths vary by card brand) must never appear in logs in readable form. Debug logging, error traces, and JSON body dumps all risk leaking PANs. Implement log scrubbing at the application layer and verify it with automated tests.
Requirement 6 (Secure development): put a WAF in front of any CDE-adjacent web surface and run automated vulnerability scanning in your CI pipeline.
Requirement 10 (Logging): all access to CDE systems must be logged with timestamps, user identity, and action. Logs must be tamper-evident. CloudTrail with log file validation enabled, or immutable S3 with Object Lock, satisfies this.
Requirement 12.3 (Risk analysis): v4.0 calls for documented targeted risk analyses (12.3.1 and 12.3.2) that are revisited at least once every 12 months. Infrastructure teams should expect to contribute documented threat models for CDE-touching systems and to refresh them annually.
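The log scrubbing described under Requirement 3 can be sketched as a small filter plus the automated tests that verify it. This is an illustration under stated assumptions, not a complete PAN detector: it assumes candidates are 13 to 19 digits, optionally separated by single spaces or hyphens, and it uses the Luhn checksum to avoid masking arbitrary long numbers. The function names and the [REDACTED-PAN] marker are hypothetical.

```python
import re

# Assumption: a candidate PAN is 13 to 19 digits, optionally split by single spaces or hyphens.
_PAN_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def _luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def scrub_pans(line: str) -> str:
    """Replace Luhn-valid card-number candidates in a log line with a marker."""
    def _mask(match):
        digits = re.sub(r"[ -]", "", match.group())
        return "[REDACTED-PAN]" if _luhn_valid(digits) else match.group()

    return _PAN_CANDIDATE.sub(_mask, line)


# 4111111111111111 is a widely used test card number, not a real account.
print(scrub_pans("charge failed card=4111111111111111"))
```

The Luhn check is a design choice to keep order IDs and timestamps intact; a stricter deployment might mask every candidate and accept the false positives.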

PCI scope creep is a production trap: every system that can reach the CDE — even indirectly — gets pulled into PCI scope. A monitoring agent that polls a CDE host can drag your entire observability stack into scope. Use a dedicated metrics exporter inside the CDE that pushes to an external collector, rather than allowing inbound connections from outside the CDE.

GDPR — Personal Data in Infrastructure

The General Data Protection Regulation applies to any system processing personal data of people in the EU, regardless of where your company is headquartered. For infrastructure engineers, GDPR translates into four concrete categories of work:

Data residency: if your GDPR legal basis or DPA requires EU data to stay in the EU, your Terraform region variables are a compliance control. Restrict RDS, S3, and ElastiCache to eu-west-1 or eu-central-1 and use SCPs to prevent cross-region replication. Drift in your IaC state is a GDPR incident waiting to happen.
Right to erasure (Art. 17): your systems must be able to delete all personal data for a given user on request, without undue delay and at the latest within one month (Art. 12(3)). This is an architecture constraint. If user IDs are embedded in immutable log lines or S3 object keys, you cannot erase them, so design for pseudonymisation from the start.
Breach notification (Art. 33): you have 72 hours from discovery to notify the supervisory authority. Your incident response runbook must include a step for determining whether personal data was exposed. SIEM query capability and data flow maps are the enabling infrastructure.
Data minimisation (Art. 5): only collect and retain what you need. Implement automated log field redaction in your log pipeline.

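# Vector log transform: redact PII before shipping to Loki (GDPR Art. 5)
[transforms.redact_pii]
type = "remap"
inputs = ["app_logs"]
source = '''
  # Drop fields the log pipeline has no need to retain
  del(.email)
  del(.user_ip)
  del(.user_agent)
  if exists(.user_id) {
    # sha2 expects a string; to_string! raises a runtime error if the value cannot be converted
    .user_id = sha2(to_string!(.user_id), variant: "SHA-512/256")
  }
'''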
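The transform above hashes user_id without a key, so anyone who can enumerate user IDs can recompute the hashes and reverse the mapping. One alternative for the pseudonymisation mentioned under the right to erasure is a keyed hash. The sketch below is an assumption about how that could look at the application layer, not part of the Vector configuration; the function name, the key value, and the user ID are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(user_id: str, key: bytes) -> str:
    """Return a deterministic, keyed pseudonym for user_id (HMAC-SHA256, hex-encoded)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration; a real key would live in a secrets manager.
key = b"example-only-key"
token = pseudonymise("user-1842", key)
print(token)
```

Because the same ID and key always give the same token, log lines stay correlatable, while the raw ID never reaches the log store. Whether destroying the key is enough to meet an erasure request is a question for your DPO, not something this sketch settles.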

What All Four Frameworks Share

Despite their different scopes, every framework converges on the same infrastructure baseline:

Encryption everywhere — TLS in transit, AES-256 at rest, key management via a dedicated service (AWS KMS, HashiCorp Vault).
Access control with evidence — who has access to what, and a log proving every time that access was exercised.
Immutable audit logs — shipped off the host, tamper-evident, retained for the required period.
Documented change management — every production change is traceable to an approved ticket, a reviewed PR, and a CI pipeline execution. Ad-hoc changes are prohibited.
Incident response capability — a runbook exists, it has been tested, and the team can execute it under pressure.

# AWS CLI: verify CloudTrail log file validation is enabled (SOC 2 + PCI Req 10)
aws cloudtrail describe-trails \
  --query "trailList[*].{Name:Name,LogValidation:LogFileValidationEnabled}" \
  --output table

# Enable it if not (replace TRAIL-NAME and REGION):
aws cloudtrail update-trail \
  --name TRAIL-NAME \
  --enable-log-file-validation \
  --region REGION

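# Terraform: enforce S3 bucket encryption + Object Lock for immutable audit logs (PCI + SOC 2)
# Assumes AWS provider v4 or later and an aws_kms_key.audit resource defined elsewhere.
resource "aws_s3_bucket" "audit_logs" {
  bucket = "company-audit-logs"

  # The inline object_lock_configuration block is deprecated in provider v4+.
  # Enabling Object Lock at bucket creation also turns on versioning.
  # Objects are only locked once a retention rule is set, for example via
  # an aws_s3_bucket_object_lock_configuration resource.
  object_lock_enabled = true
}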

resource "aws_s3_bucket_server_side_encryption_configuration" "audit_logs" {
  bucket = aws_s3_bucket.audit_logs.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.audit.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "audit_logs" {
  bucket                  = aws_s3_bucket.audit_logs.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

Map controls to multiple frameworks at once: enabling CloudTrail with log file validation produces evidence for SOC 2 CC7.2, ISO 27001 A.8.15, and PCI DSS Requirement 10 at the same time. When you instrument your infrastructure with this mindset, treating controls as shared evidence, you avoid building four separate compliance programs and can work toward your first SOC 2 report and ISO 27001 certification in parallel.
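The shared-evidence idea can be kept as data next to your infrastructure code. The sketch below is one possible shape, assuming a simple in-repo mapping; the control names are hypothetical, and the clause references are only the ones named in this lesson.

```python
from collections import defaultdict

# Control -> (framework, clause) pairs, limited to mappings mentioned in this lesson.
CONTROL_MAP = {
    "cloudtrail_log_file_validation": [
        ("SOC 2", "CC7.2"),
        ("ISO 27001", "A.8.15"),
        ("PCI DSS", "Req 10"),
    ],
    "version_controlled_iac": [("ISO 27001", "A.8.9")],
    "network_segmentation": [("ISO 27001", "A.8.22"), ("PCI DSS", "Req 1")],
}


def controls_by_framework(control_map):
    """Invert the map: framework -> sorted list of (clause, control) pairs."""
    result = defaultdict(list)
    for control, refs in control_map.items():
        for framework, clause in refs:
            result[framework].append((clause, control))
    return {framework: sorted(pairs) for framework, pairs in result.items()}


by_framework = controls_by_framework(CONTROL_MAP)
for framework, pairs in sorted(by_framework.items()):
    print(framework, pairs)
```

Inverting the map per framework gives each auditor a list of clauses with the control that backs them, while engineers maintain a single source of truth.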

Practical First Steps

If you are joining a team that has no compliance program, the most impactful engineering actions in the first 90 days are:

Enable CloudTrail (all regions, all management events) and ship logs to an immutable S3 bucket with KMS encryption.
Run AWS Config with the conformance pack for the CIS AWS Foundations Benchmark. This gives you continuous evaluation of your compliance posture with no custom code.
Audit IAM: remove wildcard policies, rotate access keys older than 90 days, enable MFA for all human accounts, and delete unused roles.
Document your data flows: which services process PII, where card data goes (if anywhere), and which S3 buckets are publicly accessible. You cannot scope a compliance program without this map.
Create a security incident response runbook. It does not need to be perfect — it needs to exist and to have been walked through at least once in a tabletop exercise.
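Two of the IAM audit steps above, finding wildcard policies and finding access keys older than 90 days, can be sketched as pure functions over data you have already exported. This is a rough heuristic under stated assumptions: the policy is a parsed IAM policy document, the key records use hypothetical id and created fields, and a wildcard means an action of "*" or a whole-service "service:*".

```python
from datetime import datetime, timedelta, timezone


def _as_list(value):
    """IAM allows a single string or object where a list is also valid."""
    return value if isinstance(value, list) else [value]


def wildcard_statements(policy: dict) -> list:
    """Return Allow statements that grant "*" or a whole-service "service:*" action."""
    flagged = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        if any(action == "*" or action.endswith(":*") for action in actions):
            flagged.append(statement)
    return flagged


def stale_keys(keys, now, max_age_days=90):
    """Return the ids of access keys created more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [key["id"] for key in keys if key["created"] < cutoff]


# Hypothetical inputs for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "Admin", "Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Sid": "ReadLogs", "Effect": "Allow", "Action": ["logs:GetLogEvents"], "Resource": "*"},
        {"Sid": "S3All", "Effect": "Allow", "Action": ["s3:*"], "Resource": "*"},
        {"Sid": "DenyAll", "Effect": "Deny", "Action": "*", "Resource": "*"},
    ],
}
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
keys = [
    {"id": "key-old", "created": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": "key-new", "created": datetime(2024, 5, 1, tzinfo=timezone.utc)},
]
print([s["Sid"] for s in wildcard_statements(policy)], stale_keys(keys, now))
```

Keeping the checks free of API calls makes them easy to unit test; fetching the policies and key metadata is left to whatever export tooling you already use.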

The next lessons will show how to encode all of this as machine-executable policy — starting with Open Policy Agent — so that compliance becomes a property your CI pipeline enforces rather than a checkbox your team fills out manually before an audit.