EC2: Instances & AMIs
Amazon EC2 (Elastic Compute Cloud) is the compute backbone of AWS. Every container cluster node, every CI runner, every self-managed database, and every legacy app you will ever migrate to cloud will eventually become an EC2 instance. Understanding the instance model at depth — not just "pick a t3.medium and SSH in" — is what separates engineers who fight fires from engineers who architect systems that never catch fire.
Instance Types: Choosing the Right CPU-to-Memory Ratio
EC2 instance types follow a naming convention: family + generation + size, with optional capability suffixes. Example: m7g.2xlarge = general-purpose family (m), 7th generation, Graviton3 ARM (g), 2xlarge size (8 vCPU / 32 GiB RAM).
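The naming convention above can be sketched as a small parser. This is a minimal illustration, not an official grammar: the `InstanceType` tuple and `parse_instance_type` function are hypothetical names, and the regular expression only covers the common family + generation + suffix + size shape (it does not try to handle unusual names such as the high-memory `u-` types).

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    family: str      # e.g. "m", "c", "inf"
    generation: int  # e.g. 7
    suffixes: str    # capability letters, e.g. "g" (Graviton); may be empty
    size: str        # e.g. "2xlarge"


# Assumed shape: letters, digits, optional capability letters, ".", size.
_PATTERN = re.compile(
    r"^(?P<family>[a-z]+)(?P<generation>\d+)(?P<suffixes>[a-z-]*)\.(?P<size>[a-z0-9]+)$"
)


def parse_instance_type(name: str) -> InstanceType:
    """Split a name like 'm7g.2xlarge' into its naming-convention parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")
    return InstanceType(
        family=match["family"],
        generation=int(match["generation"]),
        suffixes=match["suffixes"],
        size=match["size"],
    )


if __name__ == "__main__":
    for example in ("m7g.2xlarge", "t3.medium", "x2idn.16xlarge"):
        print(example, "->", parse_instance_type(example))
```

A parser like this is mostly useful for tagging, cost reports, or policy checks (for example, flagging any `t` family instance in a production account).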
- General Purpose (m, t) — balanced CPU/memory. m7i/m7g for production workloads; t4g/t3 for bursty dev/staging. The t family uses CPU credits: attractive for cost but a silent killer for latency-sensitive production services when credits drain.
- Compute Optimized (c) — high CPU, lower RAM per core. c7g/c7i for encoding, game servers, CI build agents, and CPU-heavy microservices.
- Memory Optimized (r, x, u) — high RAM. r7g for in-memory databases (Redis), large JVM heaps, Elasticsearch; x2idn for SAP HANA-class workloads.
- Storage Optimized (i, d, h) — NVMe local SSDs with very high IOPS/throughput. i4i for Kafka brokers, Cassandra, and high-throughput OLTP where EBS latency is a bottleneck.
- Accelerated Computing (p, g, inf, trn) — GPU/custom silicon. p4de for ML training; g5 for inference and graphics; inf2/trn1 for AWS Inferentia/Trainium chips at lower $/inference than GPUs.
Graviton-based instances (m7g, c7g, r7g) can deliver up to roughly 40% better price/performance than comparable x86 instances for many cloud-native workloads. Migrate ARM-compatible containers and services first: the wins are immediate and the operational overhead is minimal.
Amazon Machine Images (AMIs)
An AMI is an immutable, region-scoped snapshot of a root volume plus launch permissions and block device mappings. Every instance launch references exactly one AMI. There are three AMI sources:
- AWS-managed — Amazon Linux 2023 (AL2023), Ubuntu, Windows Server, etc. AL2023 is the recommended baseline for new builds: it ships with newer packages, SELinux enabled by default (in permissive mode; switch to enforcing in your hardening baseline), and a predictable support window.
- AWS Marketplace — pre-hardened commercial images (CIS benchmarks, security appliances). Subject to software licensing costs on top of EC2 pricing.
- Custom (golden AMI) — your organization's artifact. Built with Packer or EC2 Image Builder, baked with your agents (CloudWatch, SSM, Datadog), your security baselines, and pre-installed dependencies. This is the production standard at every serious company.
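Because golden AMIs are rebuilt regularly, launch tooling usually needs to resolve "the newest available image for this pipeline". A minimal sketch of that selection follows. It assumes records shaped like the entries returned by the EC2 DescribeImages API (`ImageId`, `Name`, `State`, `CreationDate`); the image names, IDs, and the `latest_image` helper are hypothetical.

```python
from typing import Iterable, Optional


def latest_image(images: Iterable[dict], name_prefix: str) -> Optional[dict]:
    """Return the newest 'available' image whose Name starts with name_prefix.

    CreationDate values are assumed to be ISO 8601 UTC strings in one
    consistent format, so plain string comparison orders them by time.
    """
    candidates = [
        image
        for image in images
        if image.get("State") == "available"
        and image.get("Name", "").startswith(name_prefix)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda image: image["CreationDate"])


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [
        {"ImageId": "ami-0aaa", "Name": "golden-al2023-2024-05-01",
         "State": "available", "CreationDate": "2024-05-01T08:00:00.000Z"},
        {"ImageId": "ami-0bbb", "Name": "golden-al2023-2024-06-01",
         "State": "available", "CreationDate": "2024-06-01T08:00:00.000Z"},
        {"ImageId": "ami-0ccc", "Name": "golden-al2023-2024-07-01",
         "State": "pending", "CreationDate": "2024-07-01T08:00:00.000Z"},
    ]
    print(latest_image(sample, "golden-al2023-"))
```

Since AMIs are region-scoped, the lookup has to run once per region; an image ID resolved in one region is not valid in another.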
User Data: Bootstrap Without SSH
User Data is a bash script (or cloud-init YAML) that EC2 runs as root on first boot, before your application starts. It is the bridge between your golden AMI and a fully configured running instance. A well-designed golden AMI minimizes user data to environment-specific configuration only — pulling secrets, writing runtime env files, starting the application systemd unit. User data that installs packages from scratch on every boot is a sign that the AMI pipeline is underdeveloped.
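By default, user data runs only on the first boot. If you need configuration to run on every boot, use cloud-init with always frequency or an AWS Systems Manager Run Command document instead.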
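To illustrate the "environment-specific configuration only" idea, here is a rough sketch that renders a deliberately small user data script and encodes it for launch. Everything application-specific is an assumption made for the example: the `/etc/myapp/runtime.env` path, the `myapp.service` unit, and the helper names. The 16 KB cap reflects the EC2 limit on raw user data, and the raw RunInstances API expects the payload base64-encoded.

```python
import base64

# EC2 limits user data to 16 KB, measured on the raw (pre-base64) bytes.
MAX_USER_DATA_BYTES = 16 * 1024


def render_user_data(env: dict[str, str], unit: str) -> str:
    """Render a bash script that writes a runtime env file and starts one unit.

    No package installs here: those are assumed to be baked into the AMI.
    """
    for key, value in env.items():
        if "\n" in key or "\n" in value:
            raise ValueError(f"newline not allowed in env entry {key!r}")
    env_lines = [f"{key}={value}" for key, value in sorted(env.items())]
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        "install -d -m 0750 /etc/myapp",          # hypothetical config directory
        "cat > /etc/myapp/runtime.env <<'EOF'",   # quoted EOF: no shell expansion
        *env_lines,
        "EOF",
        f"systemctl enable --now {unit}",
    ]
    return "\n".join(lines) + "\n"


def encode_user_data(script: str) -> str:
    """Check the size limit and return the base64 form used by the raw API."""
    raw = script.encode("utf-8")
    if len(raw) > MAX_USER_DATA_BYTES:
        raise ValueError(f"user data is {len(raw)} bytes; limit is {MAX_USER_DATA_BYTES}")
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    script = render_user_data({"APP_ENV": "staging", "AWS_REGION": "eu-west-1"}, "myapp.service")
    print(script)
    print(encode_user_data(script)[:40] + "...")
```

Secrets are intentionally absent from the rendered script: user data is readable from the instance metadata endpoint, so the script should fetch secrets at runtime rather than embed them.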
Instance Lifecycle
EC2 instances move through a well-defined state machine. Every state transition has billing and operational implications that matter at scale:
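- pending — instance is being provisioned. EC2 is acquiring capacity and preparing the instance from the AMI. Billing has not started. User data runs later, during the first boot, once the instance is in the running state.
- running — instance is operational. Billing is active per second (with a 60-second minimum) for most Linux distributions and for Windows; a few commercial Linux distributions are still billed per full hour.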
- stopping / stopped — instance is shut down. The root EBS volume and all attached EBS volumes persist. You pay for EBS storage but not for instance compute hours. The instance keeps its instance ID, private IP, and Elastic IP (if attached). Useful for saving cost on non-production instances overnight.
- shutting-down / terminated — instance is permanently deleted. By default, the root EBS volume is also deleted (DeleteOnTermination=true). Additional attached volumes are NOT deleted by default, so watch for orphaned EBS cost. The instance ID is gone forever.
- rebooting — OS-level restart. Does not change the underlying host, does not incur a new billing period, and does not change IP addresses.
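The lifecycle above can be modeled as a small transition table, which is handy for validating automation (for example, a scheduler that stops non-production instances overnight). This is a simplified sketch under stated assumptions: reboot is treated as staying in `running`, terminating a stopped instance is modeled as a direct jump to `terminated`, and only the `running` state is treated as billing for compute.

```python
# Allowed state transitions in this simplified model.
TRANSITIONS: dict[str, set[str]] = {
    "pending": {"running"},
    "running": {"stopping", "shutting-down"},  # reboot stays in "running" here
    "stopping": {"stopped"},
    "stopped": {"pending", "terminated"},      # start again, or terminate (assumed direct)
    "shutting-down": {"terminated"},
    "terminated": set(),                       # terminal: the instance ID never comes back
}

# States in which instance compute is billed in this model.
COMPUTE_BILLED = {"running"}


def is_valid_path(states: list[str]) -> bool:
    """True if every consecutive pair of states is an allowed transition."""
    return all(
        nxt in TRANSITIONS.get(cur, set())
        for cur, nxt in zip(states, states[1:])
    )


def compute_billed(state: str) -> bool:
    """True if instance compute charges accrue in this state (EBS is separate)."""
    return state in COMPUTE_BILLED


if __name__ == "__main__":
    overnight = ["pending", "running", "stopping", "stopped", "pending", "running"]
    print(is_valid_path(overnight))
    print([s for s in overnight if compute_billed(s)])
```

EBS storage is deliberately left out of `compute_billed`: as the list notes, volumes keep accruing charges while the instance is stopped.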
Purchasing Models
How you pay for EC2 capacity is as consequential as what you run on it. Three models dominate production workloads:
- On-Demand — pay per second with no commitment. Correct for unpredictable or short-lived workloads and for new workloads whose usage pattern is unknown.
- Reserved Instances / Savings Plans — commit to 1 or 3 years in exchange for 30–72% discounts. Compute Savings Plans are the modern preference — they apply automatically to any EC2 family, region, or OS, unlike old-style RIs which are instance-type-specific. At steady-state, your baseline capacity should always be covered by a Savings Plan.
- Spot Instances — run on spare EC2 capacity at up to 90% discount. You pay the current Spot price rather than bidding (the old bidding model has been retired). AWS can reclaim the capacity with a 2-minute warning. Production-appropriate for stateless, fault-tolerant workloads: CI/CD runners, batch ML training, rendering farms, and (with proper Spot diversification) auto-scaling groups behind a load balancer. Never use Spot for stateful singletons.
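A quick way to reason about the three models is to price the same fleet under each. The arithmetic below is a sketch with made-up inputs: the $0.10 hourly On-Demand rate, the 40% Savings Plan discount, and the 70% Spot discount are illustrative assumptions (chosen inside the ranges quoted above), not published prices.

```python
HOURS_PER_MONTH = 730  # common approximation: 8760 hours / 12 months


def monthly_cost(hourly_rate: float, instances: int, discount: float = 0.0) -> float:
    """Monthly cost for a group of identical instances at a fractional discount."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return hourly_rate * (1.0 - discount) * HOURS_PER_MONTH * instances


if __name__ == "__main__":
    on_demand_rate = 0.10  # assumed $/hour, illustration only
    baseline, burst = 10, 6

    all_on_demand = monthly_cost(on_demand_rate, baseline + burst)
    blended = (
        monthly_cost(on_demand_rate, baseline, discount=0.40)  # Savings Plan baseline
        + monthly_cost(on_demand_rate, burst, discount=0.70)   # Spot overflow
    )
    print(f"all On-Demand:      ${all_on_demand:,.2f}/month")
    print(f"Savings Plan + Spot: ${blended:,.2f}/month")
    print(f"saving:              {1 - blended / all_on_demand:.0%}")
```

The shape of the result matters more than the numbers: covering the steady baseline with a commitment and pushing interruptible overflow to Spot typically moves the bill far more than resizing individual instances does.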
For Spot in Auto Scaling groups, use the capacity-optimized allocation strategy, at least three instance families, and at least two AZs. Use a Mixed Instances Policy to blend an On-Demand base with Spot overflow.
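The diversification guidance above lends itself to an automated check. The sketch below lints a dictionary shaped like an Auto Scaling `MixedInstancesPolicy` (`LaunchTemplate.Overrides`, `InstancesDistribution.SpotAllocationStrategy`). The helper name, the launch template name, and the choice to treat the part before the dot (such as `m7g`) as the "family" are assumptions for illustration.

```python
def spot_diversification_problems(policy: dict, availability_zones: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []

    overrides = policy.get("LaunchTemplate", {}).get("Overrides", [])
    # Assumption: "family" means the part before the dot, e.g. "m7g" in "m7g.large".
    families = {o["InstanceType"].split(".")[0] for o in overrides}
    if len(families) < 3:
        problems.append(f"only {len(families)} instance families; want at least 3")

    if len(set(availability_zones)) < 2:
        problems.append("fewer than 2 Availability Zones")

    strategy = policy.get("InstancesDistribution", {}).get("SpotAllocationStrategy")
    if strategy != "capacity-optimized":
        problems.append(f"SpotAllocationStrategy is {strategy!r}, not 'capacity-optimized'")

    return problems


if __name__ == "__main__":
    # Hypothetical policy: On-Demand base of 2, everything above that on Spot.
    policy = {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": "web-golden",  # hypothetical name
                "Version": "$Latest",
            },
            "Overrides": [
                {"InstanceType": "m7g.large"},
                {"InstanceType": "c7g.large"},
                {"InstanceType": "r7g.large"},
            ],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": 2,
            "OnDemandPercentageAboveBaseCapacity": 0,
            "SpotAllocationStrategy": "capacity-optimized",
        },
    }
    print(spot_diversification_problems(policy, ["eu-west-1a", "eu-west-1b"]))
```

A check like this fits naturally in CI for infrastructure code, so an undiversified Spot group is caught before it reaches production.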
Instance Metadata Service (IMDS)
Every running EC2 instance has a local HTTP endpoint at 169.254.169.254 that serves metadata about itself: instance ID, region, AZ, IAM role credentials, user data, and more. Applications running on EC2 use this to self-identify without any external configuration. Always use IMDSv2 (token-gated) in production — IMDSv1 is exploitable via SSRF and is the root cause of numerous high-profile cloud credential leaks.
Audit for IMDSv1 usage with the AWS Config managed rule (ec2-imdsv2-check) and block IMDSv1 at the Launch Template level by setting HttpTokens: required. If you are running EKS, also set HttpPutResponseHopLimit: 2 so that pod-level IMDS calls work correctly through the extra network hop introduced by the container network namespace.
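The IMDSv2 flow is two HTTP calls: a PUT to obtain a session token, then a GET that presents the token. A minimal standard-library sketch follows. The endpoint paths and header names are the documented IMDSv2 ones; the function names, the 2-second timeout, and the split into request builders are choices made for this example. `fetch_metadata` only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"
DEFAULT_TOKEN_TTL_SECONDS = 21600  # 6 hours, the maximum IMDSv2 allows


def build_token_request(ttl_seconds: int = DEFAULT_TOKEN_TTL_SECONDS) -> urllib.request.Request:
    """Step 1: PUT request that asks IMDS for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token: str, path: str) -> urllib.request.Request:
    """Step 2: GET request for a metadata path, carrying the session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Fetch one metadata value via IMDSv2 (requires running on EC2)."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode("utf-8")
    with urllib.request.urlopen(build_metadata_request(token, path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Off EC2 this will time out or fail to connect, which is expected.
    try:
        print(fetch_metadata("instance-id"))
    except OSError as exc:
        print(f"IMDS not reachable (not on EC2?): {exc}")
```

The PUT requirement is what blunts most SSRF attacks: a typical SSRF primitive can only make the vulnerable server issue GET requests, so it cannot obtain a token in the first place.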