Cloud Fundamentals: AWS Core Services

EC2: Instances & AMIs

Amazon EC2 (Elastic Compute Cloud) is the compute backbone of AWS. Most container cluster nodes, CI runners, self-managed databases, and legacy apps you migrate to the cloud end up running on EC2 instances. Understanding the instance model in depth, beyond "pick a t3.medium and SSH in", is what separates engineers who fight fires from engineers who design systems that rarely catch fire.

Instance Types: Choosing the Right CPU-to-Memory Ratio

EC2 instance types follow a naming convention: family + generation + optional attribute letters, then a dot and the size. Example: m7g.2xlarge = general-purpose family (m), 7th generation, Graviton ARM processor (g, which is Graviton3 in this generation), 2xlarge size (8 vCPU / 32 GiB RAM).
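The naming convention can be taken apart mechanically. The sketch below is one way to do it in shell; `parse_instance_type` is a hypothetical helper, not an AWS tool, and it assumes the common `letters + digits + optional attributes` shape, so unusual names such as the high-memory `u-*` types are rejected rather than parsed.

```shell
# parse_instance_type is a hypothetical helper, not part of any AWS tool.
# It splits a name such as "m7g.2xlarge" into family, generation,
# attribute letters, and size.
parse_instance_type() (
  prefix=${1%%.*}   # text before the first dot, e.g. "m7g"
  size=${1#*.}      # text after the first dot, e.g. "2xlarge"

  # sed prints a line only when the prefix matches the expected shape.
  fields=$(printf '%s\n' "$prefix" |
    sed -nE 's/^([a-z]+)([0-9]+)([a-z0-9-]*)$/family=\1 generation=\2 attributes=\3/p')

  # Reject names without a dot or with an unexpected prefix.
  if [ "$prefix" = "$1" ] || [ -z "$fields" ]; then
    echo "unrecognized instance type: $1" >&2
    return 1
  fi
  echo "$fields size=$size"
)

parse_instance_type m7g.2xlarge
# prints: family=m generation=7 attributes=g size=2xlarge
```

An empty `attributes=` field, as for t3.medium, means the type carries no attribute letters.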

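General Purpose (m, t) — balanced CPU/memory. m7i/m7g for production workloads; t4g/t3 for bursty dev/staging. The t family uses CPU credits: attractive for cost, but risky for latency-sensitive production services. In standard mode the instance is throttled to its baseline once credits drain; in unlimited mode (the default for t3 and t4g) it keeps bursting and bills the surplus credits instead.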
Compute Optimized (c) — high CPU, lower RAM per core. c7g/c7i for encoding, game servers, CI build agents, and CPU-heavy microservices.
Memory Optimized (r, x, u) — high RAM. r7g for in-memory databases (Redis), large JVM heaps, Elasticsearch; x2idn for SAP HANA-class workloads.
Storage Optimized (i, d, h) — NVMe local SSDs with very high IOPS/throughput. i4i for Kafka brokers, Cassandra, and high-throughput OLTP where EBS latency is a bottleneck.
Accelerated Computing (p, g, inf, trn) — GPU/custom silicon. p4de for ML training; g5 for inference and graphics; inf2 (AWS Inferentia) for inference and trn1 (AWS Trainium) for training, typically at lower cost than comparable GPU instances.
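The CPU-credit behavior of the t family mentioned in the list above is easy to estimate with back-of-envelope arithmetic. The sketch below assumes that one CPU credit equals one vCPU at 100% for one minute; `minutes_until_drained` is a hypothetical helper, and the t3.medium figures passed to it (2 vCPUs, 24 credits earned per hour, 576 maximum balance) are assumptions to verify against the current AWS burstable-instance table.

```shell
# minutes_until_drained is a hypothetical helper for rough estimates.
# Args: credit balance, credits earned per hour, vCPU count, CPU utilization %
minutes_until_drained() (
  balance=$1 earned_per_hour=$2 vcpus=$3 util_pct=$4

  # Each vCPU burns 60 credits per hour at 100% utilization.
  spent_per_hour=$(( vcpus * 60 * util_pct / 100 ))
  net_drain=$(( spent_per_hour - earned_per_hour ))

  if [ "$net_drain" -le 0 ]; then
    echo "never"   # at or below baseline, the balance does not shrink
  else
    echo $(( balance * 60 / net_drain ))
  fi
)

# Assumed t3.medium figures, full balance, both vCPUs pinned at 100%:
minutes_until_drained 576 24 2 100
# prints: 360
```

Under these assumptions a fully charged instance sustains full load for about six hours, which is why the problem tends to surface long after a deploy rather than during testing.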

AWS reports up to 40% better price/performance for Graviton-based instances (m7g, c7g, r7g) over comparable x86 instances; actual gains depend on the workload, so benchmark your own. Migrate ARM-compatible containers and services first: the wins are quick and the operational overhead is usually small.
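One quick way to find ARM-compatible containers is to check whether an image already publishes an arm64 variant. The sketch below assumes the manifest JSON comes from `docker manifest inspect`; `supports_arm64` is a hypothetical helper and the image name in the usage comment is made up.

```shell
# supports_arm64 is a hypothetical helper: it reads image manifest JSON on
# stdin and succeeds if any platform entry lists the arm64 architecture.
supports_arm64() {
  grep -Eq '"architecture":[[:space:]]*"arm64"'
}

# Typical use (requires Docker and registry access; image name is invented):
#   docker manifest inspect registry.example.com/myapp:1.4.2 | supports_arm64 \
#     && echo "ready for Graviton" || echo "needs an arm64 build"
```

Images that fail the check need a multi-architecture build before their services can move to m7g, c7g, or r7g nodes.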

Amazon Machine Images (AMIs)

An AMI is an immutable, region-scoped snapshot of a root volume plus launch permissions and block device mappings. Every instance launch references exactly one AMI. There are three AMI sources:

AWS-managed — Amazon Linux 2023 (AL2023), Ubuntu, Windows Server, etc. AL2023 is the recommended baseline for new builds: it ships with newer packages, SELinux enabled by default (in permissive mode, so switch it to enforcing in your own image if you need that), and a predictable support window.
AWS Marketplace — pre-hardened commercial images (CIS benchmarks, security appliances). Subject to software licensing costs on top of EC2 pricing.
Custom (golden AMI) — your organization's artifact. Built with Packer or EC2 Image Builder, baked with your agents (CloudWatch, SSM, Datadog), your security baselines, and pre-installed dependencies. This is the common production standard for teams that run EC2 at scale.

Golden AMI pipeline: a base image is hardened by Packer/Image Builder into an immutable versioned AMI used across ASGs, EKS node groups, and CI fleets.
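Because golden AMIs are versioned, consumers of the pipeline need a way to resolve "the newest image". A minimal sketch, assuming AMIs are listed as `CreationDate<TAB>ImageId` lines; `latest_ami` is a hypothetical helper and the `myapp-*` naming scheme in the usage comment is an assumption.

```shell
# latest_ami is a hypothetical helper. It reads "CreationDate<TAB>ImageId"
# lines on stdin and prints the ImageId with the newest creation date.
# ISO 8601 timestamps in a uniform format sort correctly as plain strings.
latest_ami() {
  sort -k1,1 | tail -n 1 | awk '{ print $2 }'
}

# Typical use (assumes AMIs named "myapp-*" owned by this account):
#   aws ec2 describe-images --owners self \
#     --filters "Name=name,Values=myapp-*" \
#     --query 'Images[].[CreationDate,ImageId]' --output text | latest_ami
```

In practice many teams publish the chosen AMI ID to an SSM parameter instead of resolving it at launch time, so that a rollout is an explicit, reviewable change.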

User Data: Bootstrap Without SSH

User Data is a bash script (or cloud-init YAML) that EC2 runs as root on first boot, before your application starts. It is the bridge between your golden AMI and a fully configured running instance. A well-designed golden AMI minimizes user data to environment-specific configuration only — pulling secrets, writing runtime env files, starting the application systemd unit. User data that installs packages from scratch on every boot is a sign that the AMI pipeline is underdeveloped.

#!/bin/bash
# Minimal production user-data — golden AMI already has agents installed
set -euo pipefail

REGION="us-east-1"
ENV="production"

# Pull runtime config from SSM Parameter Store (no credentials needed — instance role)
DB_URL=$(aws ssm get-parameter \
  --name "/myapp/${ENV}/db_url" \
  --with-decryption \
  --query Parameter.Value \
  --output text \
  --region "${REGION}")

# Write env file consumed by the systemd unit (EnvironmentFile=).
# Set a restrictive umask first so the secret is never world-readable.
mkdir -p /etc/myapp
umask 077
cat > /etc/myapp/env <<ENVEOF
DB_URL=${DB_URL}
ENVIRONMENT=${ENV}
REGION=${REGION}
ENVEOF
chmod 600 /etc/myapp/env

# The app binary is pre-baked in the AMI — just enable and start it
systemctl enable --now myapp

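# Signal ASG lifecycle hook that boot succeeded.
# Use the IMDSv2 token flow so this still works when IMDSv1 is disabled.
IMDS_TOKEN=$(curl -sf -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
INSTANCE_ID=$(curl -sf -H "X-aws-ec2-metadata-token: ${IMDS_TOKEN}" \
  http://169.254.169.254/latest/meta-data/instance-id)
# ASG_NAME is not set anywhere in this script; define it earlier, otherwise
# the call below fails and the failure is swallowed by "|| true".
aws autoscaling complete-lifecycle-action \
  --lifecycle-action-result CONTINUE \
  --instance-id "${INSTANCE_ID}" \
  --lifecycle-hook-name "instance-launching" \
  --auto-scaling-group-name "${ASG_NAME:-unknown}" \
  --region "${REGION}" 2>&1 || true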

User Data runs only on the first boot of a new instance. If you stop and start an instance (which is different from a reboot), it usually lands on a new physical host but keeps the same root EBS volume, and user data does NOT re-run by default. If your bootstrap logic must re-run, configure cloud-init to run it on every boot (the "always" frequency or a per-boot script) or use an AWS Systems Manager Run Command document instead.
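One way to get every-boot behavior is to bake a script into cloud-init's per-boot directory during the AMI build. The sketch below assumes a standard cloud-init layout under /var/lib/cloud; `install_per_boot_script` and the file name `10-refresh-config.sh` are hypothetical.

```shell
# install_per_boot_script is a hypothetical helper meant to run during the
# AMI build. It copies a script into cloud-init's per-boot directory, whose
# contents cloud-init executes on every boot rather than only the first.
# Args: image root ("/" during a real build), path of the script to install.
install_per_boot_script() (
  root=${1%/}
  src=$2
  dest_dir="$root/var/lib/cloud/scripts/per-boot"

  mkdir -p "$dest_dir"
  cp "$src" "$dest_dir/10-refresh-config.sh"
  chmod 755 "$dest_dir/10-refresh-config.sh"
  echo "$dest_dir/10-refresh-config.sh"   # print the installed path
)

# Example during an image build (source path is an assumption):
#   install_per_boot_script / ./refresh-config.sh
```

Keep per-boot scripts idempotent, since they also run on the very first boot alongside user data.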

Instance Lifecycle

EC2 instances move through a well-defined state machine. Every state transition has billing and operational implications that matter at scale:

pending — instance is being provisioned. EC2 is acquiring capacity and preparing the instance from the AMI. Billing has not started. User data runs later, during OS boot, once the instance is already in the running state.
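running — instance is operational. Compute billing is active: per second with a 60-second minimum for Linux, Windows, and most other platforms. Some commercial Linux distributions are billed per hour, so check the current pricing page for your OS.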
stopping / stopped — instance is shut down. The root EBS volume and all attached EBS volumes persist. You pay for EBS storage but not for instance compute hours. The instance keeps its instance ID, private IP, and Elastic IP (if attached); an auto-assigned public IPv4 address is released and instance store data is lost. Useful for saving cost on non-production instances overnight.
shutting-down / terminated — instance is permanently deleted. By default, the root EBS volume is also deleted (DeleteOnTermination=true). Additional attached volumes are NOT deleted by default — watch for EBS orphan cost. Instance ID is gone forever.
rebooting — OS-level restart. Does not change the underlying host, does not incur a new billing period, and does not change IP addresses.

EC2 instance lifecycle: running is the billable compute state; pending is not billed; stopped retains EBS volumes (and their storage charges); terminated is irreversible.
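The billing side of the state machine can be written down as a small lookup. This is a sketch, not an official mapping: `compute_billed` is a hypothetical helper, and it covers instance compute charges only (EBS storage and Elastic IP charges continue independently). The state names are the ones the EC2 API reports; a reboot does not appear as a separate state there.

```shell
# compute_billed is a hypothetical helper that maps an EC2 instance state
# name to whether instance compute time is charged in that state.
compute_billed() {
  case "$1" in
    running)
      echo "yes" ;;
    pending|stopped|shutting-down|terminated)
      echo "no" ;;
    stopping)
      # Charged only when the stop is a hibernation in progress.
      echo "only-if-hibernating" ;;
    *)
      echo "unknown state: $1" >&2
      return 1 ;;
  esac
}

# Typical use against a live account:
#   aws ec2 describe-instances \
#     --query 'Reservations[].Instances[].[InstanceId,State.Name]' \
#     --output text |
#     while read -r id state; do
#       echo "$id $state billed=$(compute_billed "$state")"
#     done
```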

Purchasing Models

How you pay for EC2 capacity is as consequential as what you run on it. Three models dominate production workloads:

On-Demand — pay per second with no commitment. Correct for unpredictable or short-lived workloads and for new workloads whose usage pattern is unknown.
Reserved Instances / Savings Plans — commit to 1 or 3 years in exchange for discounts of up to 72% (up to 66% for Compute Savings Plans). Compute Savings Plans are the modern preference: they apply automatically across instance families, sizes, regions, and operating systems, unlike Standard RIs, which are scoped to a specific instance family and region. At steady state, aim to cover your baseline capacity with a Savings Plan.
Spot Instances — run on spare EC2 capacity at up to 90% off On-Demand prices. There is no longer any bidding: you pay the current Spot price and can optionally set a maximum price. AWS can reclaim the capacity with a 2-minute warning. Production-appropriate for stateless, fault-tolerant workloads: CI/CD runners, batch ML training, rendering farms, and (with proper Spot diversification) auto-scaling groups behind a load balancer. Never use Spot for stateful singletons.
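The 2-minute Spot warning is delivered through instance metadata, so a workload can watch for it and drain itself. A minimal sketch, assuming the instance metadata endpoint answers 404 on `spot/instance-action` until an interruption is scheduled and 200 afterwards; both function names and the `drain_and_exit` placeholder are hypothetical.

```shell
# spot_action_status is a hypothetical helper that prints the HTTP status
# of the Spot instance-action metadata path, using a session token.
spot_action_status() (
  token=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
  curl -s -o /dev/null -w '%{http_code}' \
    -H "X-aws-ec2-metadata-token: ${token}" \
    "http://169.254.169.254/latest/meta-data/spot/instance-action"
)

# interruption_scheduled succeeds only for the status that signals a notice.
interruption_scheduled() {
  [ "$1" = "200" ]
}

# Sketch of a watcher loop; drain_and_exit is a placeholder you would supply:
#   while ! interruption_scheduled "$(spot_action_status)"; do sleep 5; done
#   drain_and_exit
```

Polling every few seconds leaves most of the two minutes for deregistering from the load balancer and finishing in-flight work.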

A common production failure: an ASG configured with a single Spot instance type and a single Availability Zone. When AWS reclaims that pool, which will happen sooner or later, the entire capacity disappears at once. Always configure Spot ASGs with a capacity-aware allocation strategy (capacity-optimized, or price-capacity-optimized, which AWS now recommends for most workloads), at least three instance families, and at least two AZs. Use a Mixed Instances Policy to blend an On-Demand base with Spot overflow.
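The diversification advice above translates into a Mixed Instances Policy document. The sketch below prints one; `write_mixed_policy` is a hypothetical helper, and the launch template name, the three Graviton instance types, and the capacity numbers are illustrative assumptions rather than recommendations.

```shell
# write_mixed_policy is a hypothetical helper that prints a Mixed Instances
# Policy: an On-Demand base of 2 instances, everything above that on Spot,
# spread over three instance families.
write_mixed_policy() {
  cat <<'JSON'
{
  "LaunchTemplate": {
    "LaunchTemplateSpecification": {
      "LaunchTemplateName": "myapp-template",
      "Version": "$Latest"
    },
    "Overrides": [
      { "InstanceType": "m7g.large" },
      { "InstanceType": "c7g.large" },
      { "InstanceType": "r7g.large" }
    ]
  },
  "InstancesDistribution": {
    "OnDemandBaseCapacity": 2,
    "OnDemandPercentageAboveBaseCapacity": 0,
    "SpotAllocationStrategy": "capacity-optimized"
  }
}
JSON
}

# Typical use (group name and subnet IDs are placeholders; two subnets in
# different AZs satisfy the multi-AZ requirement):
#   write_mixed_policy > policy.json
#   aws autoscaling create-auto-scaling-group \
#     --auto-scaling-group-name myapp-asg \
#     --mixed-instances-policy file://policy.json \
#     --min-size 2 --max-size 10 \
#     --vpc-zone-identifier "subnet-aaa,subnet-bbb"
```

All three overrides share the arm64 architecture so a single AMI in the launch template works for every pool.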

Instance Metadata Service (IMDS)

Every running EC2 instance has a local HTTP endpoint at 169.254.169.254 that serves metadata about itself: instance ID, region, AZ, IAM role credentials, user data, and more. Applications running on EC2 use this to self-identify without any external configuration. Always use IMDSv2 (token-gated) in production: IMDSv1 can be reached through SSRF vulnerabilities and has been a factor in several high-profile cloud credential leaks.

# IMDSv2 — always use the token-based flow
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Query instance identity
INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: ${TOKEN}" \
  http://169.254.169.254/latest/meta-data/instance-id)

AZ=$(curl -s -H "X-aws-ec2-metadata-token: ${TOKEN}" \
  http://169.254.169.254/latest/meta-data/placement/availability-zone)

INSTANCE_TYPE=$(curl -s -H "X-aws-ec2-metadata-token: ${TOKEN}" \
  http://169.254.169.254/latest/meta-data/instance-type)

echo "Instance ${INSTANCE_ID} | Type ${INSTANCE_TYPE} | AZ ${AZ}"

# Require IMDSv2 on an existing instance (for new instances, set the same
# options in your Launch Template so they apply at launch)
# aws ec2 modify-instance-metadata-options \
#   --instance-id i-0abc123 \
#   --http-tokens required \
#   --http-endpoint enabled

Enforce IMDSv2 organization-wide with an AWS Config rule (ec2-imdsv2-check) and block IMDSv1 at the Launch Template level by setting HttpTokens: required. If you are running EKS, also set HttpPutResponseHopLimit: 2 so that pod-level IMDS calls work correctly through the container network namespace hop.
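The Launch Template settings named above can be expressed as a small data fragment. This is a sketch of only the metadata portion: `write_launch_template_data` is a hypothetical helper, the template name in the usage comment is invented, and a real template would also carry an AMI ID, instance type, and other launch settings.

```shell
# write_launch_template_data is a hypothetical helper that prints the
# metadata options for a Launch Template: IMDSv2 required, with a hop
# limit of 2 so containerized workloads can still reach the endpoint.
write_launch_template_data() {
  cat <<'JSON'
{
  "MetadataOptions": {
    "HttpEndpoint": "enabled",
    "HttpTokens": "required",
    "HttpPutResponseHopLimit": 2
  }
}
JSON
}

# Typical use (template name is a placeholder):
#   write_launch_template_data > lt-data.json
#   aws ec2 create-launch-template \
#     --launch-template-name myapp-template \
#     --launch-template-data file://lt-data.json
```

Outside EKS or other container hosts, a hop limit of 1 is the tighter setting.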