Foundation: Accounts, Network & IAM
Before a single container runs or a pipeline fires, three things must be correct: account structure, network topology, and identity baseline. At Amazon, Google, and Microsoft, platform teams spend weeks on these decisions before any workload reaches them. Get them wrong and you spend the next two years working around the consequences. Get them right and every subsequent layer — Kubernetes, CI/CD, observability, security — slots in cleanly.
The Landing Zone: Multi-Account Strategy
A landing zone is the opinionated, pre-configured multi-account environment that enforces guardrails before any team touches it. AWS Control Tower, Google Cloud's landing zone blueprints (such as Cloud Foundation Fabric), and Azure Landing Zones each codify years of enterprise best practice into a reproducible baseline.
The core principle is blast-radius isolation through account separation. A credential leak, a misconfigured S3 bucket, or a runaway cost spike in one account cannot cascade to another. A sensible OU hierarchy for a mid-to-large organisation looks like this:
- Root / Management account — AWS Organizations root, consolidated billing, zero workloads. Service Control Policies (SCPs) attach here as the permission ceiling for every account below.
- Security OU — Log Archive account (all CloudTrail, VPC Flow Logs, Config snapshots from every account centralised here) and a Security Tooling account (GuardDuty delegated admin, Security Hub, SIEM ingestion).
- Infrastructure OU — Shared Services account (Transit Gateway hub, Route 53 Resolver, ECR, shared EKS add-ons) and optionally a Network account for centralised ingress/egress through an inspection VPC.
- Workloads OU — one account per team per environment (prod, staging, dev). A team owning three microservices still shares one account per environment; per-service accounts blow past account limits and fragment cost visibility.
- Sandbox OU: individual developer accounts with a hard spend cap ($200/month, enforced through AWS Budgets actions, since an SCP cannot meter spend on its own), auto-nuked weekly via aws-nuke.
AWS Control Tower automates account vending and applies mandatory guardrails (called Controls). The Account Factory for Terraform (AFT) wraps this in a GitOps pipeline so every new account is a pull request.
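To make "every new account is a pull request" concrete, here is a minimal sketch of what such a request carries. In AFT the request is a Terraform module call; the Python dict below only mirrors its shape. The field names are intended to follow AFT's account-request module and should be checked against the AFT documentation, and every value (emails, OU name, team, baseline name) is hypothetical.

```python
# Parameters Control Tower needs in order to vend an account. Assumed to
# match the AFT account-request module; verify against the AFT docs.
REQUIRED_CT_PARAMS = {
    "AccountEmail",
    "AccountName",
    "ManagedOrganizationalUnit",
    "SSOUserEmail",
    "SSOUserFirstName",
    "SSOUserLastName",
}


def missing_parameters(request):
    """Return the required Control Tower parameters absent from a request."""
    provided = set(request.get("control_tower_parameters", {}))
    return REQUIRED_CT_PARAMS - provided


# Hypothetical request for a payments team's dev account.
payments_dev = {
    "control_tower_parameters": {
        "AccountEmail": "aws-payments-dev@example.com",
        "AccountName": "payments-dev",
        "ManagedOrganizationalUnit": "Workloads",
        "SSOUserEmail": "platform-team@example.com",
        "SSOUserFirstName": "Platform",
        "SSOUserLastName": "Team",
    },
    "account_tags": {"team": "payments", "environment": "dev"},
    "change_management_parameters": {
        "change_requested_by": "platform-team",
        "change_reason": "New dev account for the payments team",
    },
    # Name of the customisation baseline to run after vending (hypothetical).
    "account_customizations_name": "workload-baseline",
}

problems = missing_parameters(payments_dev)
print("ready to merge" if not problems else f"missing: {sorted(problems)}")
```

A check like this could run in CI on the pull request, so a reviewer only sees requests that are structurally complete.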
VPC Design: The Network Foundation
Every account that runs workloads gets at least one custom VPC. The default VPC is deleted on account creation, and an SCP that denies ec2:CreateDefaultVpc keeps it from being recreated. A production-grade VPC for a capstone platform uses a three-tier subnet model across three Availability Zones (never two: losing one of two AZs removes half the capacity and all of the redundancy, whereas losing one of three removes a third).
The tier structure and CIDR plan for the platform VPC in us-east-1 (10.0.0.0/16):
- Public subnets (/24 each, one per AZ): Internet-facing ALBs, NAT Gateways. Nothing else. No application servers, ever.
- Private / Application subnets (/22 each): EKS nodes, EC2 app servers, Lambda in VPC. Outbound internet via NAT Gateway only.
- Data subnets (/24 each): RDS, ElastiCache, MSK. No route to the internet whatsoever, not even NAT.
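The tier sizes above can be turned into a concrete allocation with Python's standard ipaddress module. This is a sketch of one possible layout, not the platform's actual plan: the text gives only the VPC range and the prefix lengths, so the AZ names and the order in which blocks are handed out are assumptions.

```python
import ipaddress


def plan_subnets(vpc_cidr, azs, tiers):
    """Carve one subnet per (tier, AZ) out of the VPC, largest blocks first.

    tiers is a list of (name, prefix_length) pairs. Allocating the largest
    blocks first keeps every subnet aligned on its own size boundary.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    cursor = int(vpc.network_address)
    plan = {}
    for name, prefix in sorted(tiers, key=lambda tier: tier[1]):
        size = 2 ** (32 - prefix)
        for az in azs:
            subnet = ipaddress.ip_network((cursor, prefix))
            if not subnet.subnet_of(vpc):
                raise ValueError(f"{vpc_cidr} is too small for this plan")
            plan[(name, az)] = subnet
            cursor += size
    return plan


# Assumed AZ names; the tier sizes come from the list above.
AZS = ["us-east-1a", "us-east-1b", "us-east-1c"]
TIERS = [("public", 24), ("app", 22), ("data", 24)]

plan = plan_subnets("10.0.0.0/16", AZS, TIERS)
for (tier, az), subnet in plan.items():
    print(f"{tier:<6} {az}  {subnet}")
```

With these inputs the application tier takes 10.0.0.0/22, 10.0.4.0/22 and 10.0.8.0/22, the public tier 10.0.12.0/24 to 10.0.14.0/24, and the data tier 10.0.15.0/24 to 10.0.17.0/24, leaving most of the /16 free for growth.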
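EKS subnet sizing: with the AWS VPC CNI, every pod takes an IP address from the node's subnet. A c5.xlarge node is capped at 58 pods under the default ENI limits, and larger sizes such as c5.4xlarge allow several times that. A cluster that scales to 50 such nodes can consume about 2,900 IPs, close to 1,000 per AZ even when spread evenly across three. A /24 (251 usable) is exhausted almost immediately. Use a /22 (1,019 usable) or larger per AZ for the application tier. This is one of the most common VPC mistakes in EKS deployments.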
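The arithmetic behind this warning is easy to check. The sketch below takes the 50-node and 58-IPs-per-node figures from the text and assumes nodes are spread as evenly as possible across AZs; the helper names are made up for illustration.

```python
import ipaddress
import math

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves five addresses in every subnet


def usable_ips(cidr):
    """Addresses in the subnet that workloads can actually use."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def per_az_demand(nodes, ips_per_node, az_count):
    """IPs needed in the busiest AZ when nodes are spread evenly."""
    return math.ceil(nodes / az_count) * ips_per_node


NODES, IPS_PER_NODE = 50, 58  # figures from the text above

scenarios = [
    ("all nodes in one AZ", per_az_demand(NODES, IPS_PER_NODE, 1)),
    ("spread over three AZs", per_az_demand(NODES, IPS_PER_NODE, 3)),
]
for label, demand in scenarios:
    for cidr in ("10.0.0.0/24", "10.0.0.0/22", "10.0.0.0/20"):
        verdict = "fits" if usable_ips(cidr) >= demand else "exhausted"
        print(f"{label}: need {demand}, {cidr} has {usable_ips(cidr)} -> {verdict}")
```

Under these assumptions a /24 fails in both scenarios, a /22 holds the 986 addresses needed per AZ when nodes are spread across three AZs, and only a /20 survives all 50 nodes landing in a single AZ.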
Connectivity between accounts uses AWS Transit Gateway (TGW), not VPC Peering. Peering is a mesh — O(n²) connections as accounts grow. TGW is a hub; route tables on the TGW control which spoke VPCs can reach each other. Prod VPCs must never have a TGW route to dev VPCs.
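Two short calculations illustrate the paragraph above: how quickly a full peering mesh grows compared with one TGW attachment per VPC, and how TGW route tables keep prod and dev apart. The second part is a deliberately simplified model (each attachment is associated with one route table, and a route table only knows the attachments propagated into it); the attachment and route-table names are hypothetical.

```python
def peering_connections(vpc_count):
    """Connections needed for a full mesh of VPC peerings: n(n-1)/2."""
    return vpc_count * (vpc_count - 1) // 2


def tgw_attachments(vpc_count):
    """A hub needs one attachment per spoke VPC."""
    return vpc_count


for n in (5, 20, 100):
    print(f"{n} VPCs: {peering_connections(n)} peerings vs {tgw_attachments(n)} attachments")

# Simplified TGW model. Which route table handles traffic arriving from
# each attachment (hypothetical names):
ASSOCIATIONS = {
    "prod-payments": "prod-rt",
    "dev-payments": "dev-rt",
    "shared-services": "shared-rt",
}
# Which attachments each route table has routes to:
PROPAGATIONS = {
    "prod-rt": {"shared-services"},
    "dev-rt": {"shared-services"},
    "shared-rt": {"prod-payments", "dev-payments"},
}


def can_route(source, destination):
    """True if the source's route table has a route to the destination."""
    return destination in PROPAGATIONS[ASSOCIATIONS[source]]


print(can_route("prod-payments", "shared-services"))  # True
print(can_route("prod-payments", "dev-payments"))     # False
```

A check like can_route could serve as a policy test in the network account's pipeline, failing any change that would propagate a dev attachment into the prod route table.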
IAM Identity Baseline
On day zero, the identity baseline has two goals: eliminate long-lived credentials and enforce least-privilege role assumption. Every human identity should authenticate through your IdP (Okta, Entra ID, Google Workspace) via AWS IAM Identity Center (SSO), not IAM users. No engineer should keep a long-lived access key for a production account in ~/.aws/credentials.
The key patterns at big-tech scale:
- Permission Sets in IAM Identity Center: map to IAM roles in each account. Define four tiers: ReadOnly, Developer, PowerUser, Administrator. Back them with SCPs on the OUs so even the Administrator permission set cannot disable CloudTrail or modify the landing-zone baseline.
- OIDC trust for CI/CD: GitHub Actions, GitLab CI, and ArgoCD assume IAM roles via OIDC federation. No static secrets in CI, ever. For GitHub Actions, the aws-actions/configure-aws-credentials action handles the token exchange automatically.
- EC2/EKS workload identity: EC2 instance profiles and EKS IRSA (IAM Roles for Service Accounts) give workloads scoped credentials without any key management. Each service account gets its own role with exactly the S3 buckets and DynamoDB tables it needs.
- AWS Organizations SCPs for guardrails: deny iam:CreateUser, deny iam:CreateAccessKey (except a break-glass automation account), deny s3:PutBucketPublicAccessBlock with BlockPublicAcls=false.
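Two of these patterns reduce to short JSON policy documents. The sketch below builds them as Python dicts: an SCP that denies IAM user and access-key creation, and a trust policy that lets one GitHub repository branch assume a role via OIDC. The account ID, role name, and repository are placeholders, and expressing the break-glass exception as a role ARN pattern (rather than by excluding a whole account from the SCP) is an assumption made for illustration.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def deny_long_lived_credentials_scp(break_glass_role_arn):
    """SCP denying IAM user and access-key creation, except for one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyLongLivedIamCredentials",
                "Effect": "Deny",
                "Action": ["iam:CreateUser", "iam:CreateAccessKey"],
                "Resource": "*",
                # Assumption: the break-glass exception is a role ARN pattern.
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": break_glass_role_arn}
                },
            }
        ],
    }


def github_oidc_trust_policy(account_id, repository, branch):
    """Trust policy allowing one repository branch to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com"},
                    "StringLike": {
                        f"{GITHUB_OIDC_HOST}:sub": f"repo:{repository}:ref:refs/heads/{branch}"
                    },
                },
            }
        ],
    }


# Placeholder identifiers, not real resources.
scp = deny_long_lived_credentials_scp("arn:aws:iam::*:role/break-glass-automation")
trust = github_oidc_trust_policy("111122223333", "example-org/platform-infra", "main")
print(json.dumps(scp, indent=2))
print(json.dumps(trust, indent=2))
```

Pinning the sub claim to a single repository and branch is the important design choice here: a trust policy that checks only the audience would let any GitHub repository assume the role.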
Bringing It Together: Foundation Sequence
The correct provisioning order matters because later layers depend on earlier ones:
- Enable AWS Organizations, set root SCPs (region deny, CloudTrail protect, IAM user deny).
- Bootstrap Control Tower with AFT; vend the Security OU accounts first (Log Archive, Security Tooling).
- Set up IAM Identity Center, connect to IdP, define permission sets — before any human logs into a workload account.
- Vend workload accounts via AFT; each customisation baseline runs Terraform that creates the VPC, subnets, TGW attachment, Flow Logs, Config rules, and the OIDC IAM role.
- Establish a CIDR registry: a simple DynamoDB table or Terraform state in a central S3 bucket that records every VPC CIDR allocation. Without this, two teams will eventually pick overlapping ranges and discover it only when they try to connect their VPCs.
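The core of a CIDR registry is a single rule: refuse any allocation that overlaps an existing one. Here is a minimal in-memory sketch of that rule using the standard ipaddress module. The class stands in for the DynamoDB table or state file mentioned above, and the owner names and ranges are hypothetical.

```python
import ipaddress


class CidrRegistry:
    """In-memory stand-in for a central record of VPC CIDR allocations."""

    def __init__(self):
        self._allocations = {}  # owner -> ip_network

    def allocate(self, owner, cidr):
        """Record a CIDR for an owner, rejecting overlaps with existing ones."""
        requested = ipaddress.ip_network(cidr)
        for existing_owner, existing in self._allocations.items():
            if requested.overlaps(existing):
                raise ValueError(
                    f"{cidr} for {owner} overlaps {existing} held by {existing_owner}"
                )
        self._allocations[owner] = requested
        return requested

    def allocations(self):
        """Return a copy of the current owner-to-CIDR mapping."""
        return dict(self._allocations)


registry = CidrRegistry()
registry.allocate("payments-prod", "10.0.0.0/16")
registry.allocate("payments-dev", "10.1.0.0/16")

try:
    # Falls inside payments-prod's range, so it is rejected.
    registry.allocate("search-prod", "10.0.128.0/17")
except ValueError as error:
    print(f"rejected: {error}")
```

A real registry would also need a conditional write or a lock so that two concurrent requests cannot both pass the overlap check; that detail is left out of this sketch.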
With accounts isolated, networks segmented into tiers, and identity routed through the IdP with no long-lived credentials anywhere, you have eliminated the most common root causes of cloud security incidents. Lesson 3 builds the Kubernetes platform on top of this foundation.