Serverless functions are deceptively simple to ship by hand — a zip and a console click. At scale that model collapses: you need reproducible environments, safe progressive rollouts, and a full audit trail. This lesson covers the three dominant IaC toolchains for serverless workloads, how AWS Lambda versioning and aliases unlock production-safe canary releases, and the deployment patterns that separate mature serverless shops from amateur ones.
The IaC Landscape for Serverless
Three tools dominate the space, each with a different philosophy:
AWS SAM (Serverless Application Model): an open-source extension of CloudFormation. The SAM transform expands AWS::Serverless::Function resources into Lambda functions, IAM roles, and event-source wiring. Closest to CloudFormation; best when your team already owns a CF estate. Supports local invocation via sam local invoke and HTTP emulation via sam local start-api.
Serverless Framework (SLS): a YAML DSL designed to be provider-agnostic. On AWS it compiles down to native CloudFormation but hides much of the boilerplate. Ecosystem of 1,000+ plugins. Historically the de facto choice, and still common in shops with existing serverless.yml services. Note that v4 deprecates non-AWS providers, so Azure/GCP targets depend on older releases.
AWS CDK: infrastructure defined in general-purpose programming languages (TypeScript, Python, Go, and others). The aws-lambda and aws-lambda-nodejs constructs handle bundling (via esbuild), layer management, and event wiring. Best for teams that want type-safety, loops, and reusable constructs.
Which to choose in 2025? AWS shops building net-new services often choose CDK. Existing SAM templates are worth keeping, because SAM and CDK interoperate. Serverless Framework remains a reasonable call when you already run serverless.yml services or want a quick prototype with community plugins; check provider support for your Framework version before counting on multi-cloud portability.
SAM in Production
A minimal but production-realistic SAM template for an API-backed Lambda looks like this:
# template.yaml
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Globals:
  Function:
    Runtime: nodejs20.x
    Architectures: [arm64]          # Graviton2: ~20% lower price per GB-second than x86_64
    MemorySize: 512
    Timeout: 29                     # matches API Gateway's default 29 s integration timeout
    Environment:
      Variables:
        LOG_LEVEL: !Ref LogLevel
    Layers:
      - !Ref CommonUtilsLayer
    Tracing: Active                 # X-Ray
Parameters:
  LogLevel:
    Type: String
    Default: info
    AllowedValues: [debug, info, warn, error]
Resources:
  OrderApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: !Ref AWS::StackName
      TracingEnabled: true
      AccessLogSetting:             # the SAM property is AccessLogSetting, not AccessLogDestination
        DestinationArn: !GetAtt ApiAccessLogGroup.Arn
        Format: '{"requestId":"$context.requestId","status":"$context.status"}'
  OrderHandler:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/order/
      Handler: index.handler
      AutoPublishAlias: live        # publishes a new version and moves the alias when the code changes
      DeploymentPreference:
        Type: Canary10Percent5Minutes   # 10 % canary, promote after 5 min if alarms stay clean
        Alarms:
          - !Ref OrderErrorAlarm
        Hooks:                      # hook functions are defined elsewhere in the template (omitted here)
          PreTraffic: !Ref PreTrafficHook
          PostTraffic: !Ref PostTrafficHook
      Events:
        CreateOrder:
          Type: Api
          Properties:
            Path: /orders
            Method: POST
            RestApiId: !Ref OrderApi
  OrderErrorAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: OrderHandler-Errors
      MetricName: Errors
      Namespace: AWS/Lambda
      Dimensions:
        - Name: FunctionName
          Value: !Ref OrderHandler
        - Name: Resource
          Value: !Sub "${OrderHandler}:live"   # tracks the alias, not $LATEST
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 1
      ComparisonOperator: GreaterThanOrEqualToThreshold
  CommonUtilsLayer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      ContentUri: layers/common/
      CompatibleRuntimes: [nodejs20.x]
      RetentionPolicy: Retain       # never auto-delete old layer versions
    Metadata:
      BuildMethod: nodejs20.x
  ApiAccessLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      RetentionInDays: 30
The AutoPublishAlias: live line is the key production primitive: SAM publishes a new immutable version whenever a deploy changes the function's code, then points the alias at it (set AutoPublishAliasAllProperties: true to also publish on configuration-only changes). Combined with DeploymentPreference, CodeDeploy manages the canary shift without any extra tooling.
Versioning and Aliases: the Production Mental Model
Lambda versions are immutable snapshots: code + configuration + environment variables. Once published, they cannot change. Aliases are mutable pointers. This separation gives you three production capabilities:
Canary releases: an alias can split traffic by weight, for example 10 % to version 42 and 90 % to version 41. CloudWatch alarms watch alias-scoped metrics. If errors spike, CodeDeploy resets the canary's weight to 0. No Lambda redeployment is needed, and there is no cold start surge from a fresh deploy.
Rollback in seconds: aws lambda update-alias --function-name OrderHandler --name live --function-version 41 --routing-config AdditionalVersionWeights={} is a single API call that repoints the alias and clears any canary weight, and it takes effect almost immediately. Compare with a container rollout that must drain pods.
Environment pinning for downstream consumers: event source mappings and API GW integrations reference the alias ARN, not $LATEST, so they keep working across deployments.
API Gateway targets the alias ARN; CodeDeploy shifts weighted traffic to the canary version and auto-rolls back if the CloudWatch alarm fires.
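The weighted split and the rollback described above both come down to the input of Lambda's UpdateAlias API. The sketch below only builds those inputs as plain objects; the helper names canaryUpdate and rollbackUpdate are hypothetical, and in a SAM or CDK setup CodeDeploy makes the equivalent calls for you.

```typescript
// Subset of the input accepted by Lambda's UpdateAlias API.
interface UpdateAliasInput {
  FunctionName: string;
  Name: string;
  FunctionVersion: string;
  RoutingConfig: { AdditionalVersionWeights: Record<string, number> };
}

// Send `canaryWeight` (0..1) of traffic to `canaryVersion`;
// the remainder stays on `stableVersion`, the alias's primary version.
function canaryUpdate(
  functionName: string,
  aliasName: string,
  stableVersion: string,
  canaryVersion: string,
  canaryWeight: number,
): UpdateAliasInput {
  if (canaryWeight < 0 || canaryWeight > 1) {
    throw new RangeError("canaryWeight must be between 0 and 1");
  }
  return {
    FunctionName: functionName,
    Name: aliasName,
    FunctionVersion: stableVersion,
    RoutingConfig: { AdditionalVersionWeights: { [canaryVersion]: canaryWeight } },
  };
}

// Rollback: point the alias at the stable version and clear every extra weight.
function rollbackUpdate(
  functionName: string,
  aliasName: string,
  stableVersion: string,
): UpdateAliasInput {
  return {
    FunctionName: functionName,
    Name: aliasName,
    FunctionVersion: stableVersion,
    RoutingConfig: { AdditionalVersionWeights: {} },
  };
}

// 10 % of "live" traffic to version 42, 90 % stays on version 41.
const canary = canaryUpdate("OrderHandler", "live", "41", "42", 0.1);
// Back to 100 % on version 41.
const rollback = rollbackUpdate("OrderHandler", "live", "41");
console.log(JSON.stringify(canary), JSON.stringify(rollback));
```

Assuming the AWS SDK for JavaScript v3 is in use, either object could be passed to UpdateAliasCommand from @aws-sdk/client-lambda. Keeping the stable version as the alias's primary version means a rollback only has to drop the extra weight.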
CDK Pattern: Canary with TypeScript
The CDK equivalent is more concise and type-safe. The aws-codedeploy module exposes the same CodeDeploy primitives that SAM's DeploymentPreference generates under the hood:
// lib/order-stack.ts
import { Duration, Stack, StackProps } from "aws-cdk-lib";
import * as cloudwatch from "aws-cdk-lib/aws-cloudwatch";
import * as codedeploy from "aws-cdk-lib/aws-codedeploy";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as lambdaNodeJs from "aws-cdk-lib/aws-lambda-nodejs";
import { Construct } from "constructs";

export class OrderStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const fn = new lambdaNodeJs.NodejsFunction(this, "OrderHandler", {
      entry: "src/order/index.ts",
      runtime: lambda.Runtime.NODEJS_20_X,
      architecture: lambda.Architecture.ARM_64,
      memorySize: 512,
      timeout: Duration.seconds(29),
      bundling: { minify: true, sourceMap: true },
    });

    // Publish an immutable version whenever code or configuration changes
    const version = fn.currentVersion;

    // Alias "live" points to the latest version
    const alias = new lambda.Alias(this, "LiveAlias", {
      aliasName: "live",
      version,
    });

    // Error alarm scoped to the alias
    const errorAlarm = new cloudwatch.Alarm(this, "OrderErrors", {
      metric: alias.metricErrors({ period: Duration.minutes(1) }),
      threshold: 1,
      evaluationPeriods: 1,
    });

    // CodeDeploy canary: 10 % for 5 minutes, roll back on alarm
    new codedeploy.LambdaDeploymentGroup(this, "OrderDeploymentGroup", {
      alias,
      deploymentConfig: codedeploy.LambdaDeploymentConfig.CANARY_10PERCENT_5MINUTES,
      alarms: [errorAlarm],
      autoRollback: { deploymentInAlarm: true, failedDeployment: true },
    });
  }
}
Version publishing is a CDK footgun. The fn.currentVersion property creates a new Version resource whenever synthesis detects a change to the function's code or configuration, which is exactly what you want. A hand-named version, such as the deprecated fn.addVersion("v1"), is published only once, so the alias keeps pointing at stale code. Use currentVersion exclusively and let CDK detect the change.
Serverless Framework: Multi-Stage Canary
For teams on Serverless Framework, the serverless-plugin-canary-deployments plugin wires CodeDeploy in the same way, configured per function through a deploymentSettings block.
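As a rough illustration of that plugin configuration, here is a sketch written as a TypeScript object in the style of a serverless.ts file. The service name, handler paths, hook function names, and the alarm's logical ID are all hypothetical, and the alarm resource itself is omitted; check the plugin's README for the options your version supports.

```typescript
// serverless.ts (sketch). In a real project this object would be exported
// so the Serverless Framework can load it.
const serverlessConfig = {
  service: "order-service", // hypothetical service name
  provider: { name: "aws", runtime: "nodejs20.x" },
  plugins: ["serverless-plugin-canary-deployments"],
  functions: {
    orderHandler: {
      handler: "src/order/index.handler",
      events: [{ http: { path: "orders", method: "post" } }],
      deploymentSettings: {
        type: "Canary10Percent5Minutes", // same CodeDeploy config as the SAM example
        alias: "live",
        preTrafficHook: "preTrafficHook", // keys of the hook functions below
        postTrafficHook: "postTrafficHook",
        alarms: ["OrderErrorAlarm"], // logical ID of an alarm defined under resources (omitted)
      },
    },
    // Hypothetical hook functions invoked by CodeDeploy around the traffic shift.
    preTrafficHook: { handler: "src/hooks/pre.handler" },
    postTrafficHook: { handler: "src/hooks/post.handler" },
  },
};

console.log(serverlessConfig.functions.orderHandler.deploymentSettings.type);
```

The same settings can be written in serverless.yml; the object form is used here only to keep the examples in one language.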
Lifecycle Hooks
Both SAM and CDK support lifecycle hooks: Lambda functions that CodeDeploy invokes before and after shifting traffic. This is where you run smoke tests against the new version in isolation:
PreTrafficHook — invoke the new version directly (not via alias) and assert that a known test payload returns HTTP 200 with the expected schema. Call codedeploy:PutLifecycleEventHookExecutionStatus to report pass/fail. A failure here aborts the canary before a single production request hits the new code.
PostTrafficHook — after full promotion, verify downstream side-effects: DynamoDB record shapes, SNS message counts, SQS queue depth. Use this to catch "silent corruption" bugs that don't surface as Lambda errors.
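Do not call the alias in your hook. When the PreTrafficHook runs, the live alias still routes to the old version, and even during the canary window 90 % of requests do, so invoking the alias does not reliably exercise the new code. Always invoke the specific version ARN. CodeDeploy does not supply it: the hook event carries only DeploymentId and LifecycleEventHookExecutionId, so pass the ARN to the hook yourself, for example as an environment variable set in the template (in SAM, !Ref OrderHandler.Version resolves to the newly published version).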
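The PreTrafficHook flow can be sketched as below. This is a minimal illustration, not a definitive implementation: the AWS calls are injected as two functions (invokeVersion and reportStatus, both hypothetical names) so the logic runs without AWS access, and the smoke-test payload and the statusCode check are assumptions about what the handler returns.

```typescript
// Fields CodeDeploy sends to a lifecycle hook function.
interface HookEvent {
  DeploymentId: string;
  LifecycleEventHookExecutionId: string;
}

type HookStatus = "Succeeded" | "Failed";

// Injected dependencies; in production these would wrap the AWS SDK.
interface HookDeps {
  // Invokes one specific version ARN (never the alias) and returns its parsed response.
  invokeVersion(versionArn: string, payload: unknown): Promise<{ statusCode?: number }>;
  // Wraps codedeploy:PutLifecycleEventHookExecutionStatus.
  reportStatus(deploymentId: string, executionId: string, status: HookStatus): Promise<void>;
}

async function runPreTrafficHook(
  event: HookEvent,
  versionArn: string, // supplied by you, e.g. from an environment variable set in the template
  deps: HookDeps,
): Promise<HookStatus> {
  let status: HookStatus = "Failed";
  try {
    // Hypothetical smoke-test payload; use a known-good request for your handler.
    const response = await deps.invokeVersion(versionArn, { smokeTest: true });
    if (response.statusCode === 200) {
      status = "Succeeded";
    }
  } catch {
    // Any invocation error leaves the status as "Failed".
  }
  // Always report back, otherwise the deployment waits until the hook times out.
  await deps.reportStatus(event.DeploymentId, event.LifecycleEventHookExecutionId, status);
  return status;
}
```

Reporting "Failed" makes CodeDeploy abort the deployment before any production traffic reaches the new version. Injecting the two calls also lets the hook logic be unit-tested with fakes.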
Version Retention and Cleanup
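Lambda retains every published version until you delete it. A busy CI/CD pipeline can publish 20–30 versions per day per function. Left alone, versions accumulate into hundreds per function, eat into the regional code-storage quota (75 GB by default), and lengthen ListVersionsByFunction pagination. Best practice: in CDK, set currentVersionOptions.removalPolicy to decide whether superseded versions are deleted or retained. In SAM, RetentionPolicy applies only to layer versions, and versions published through AutoPublishAlias are retained, so run a scheduled cleanup Lambda that calls ListVersionsByFunction and DeleteFunction (with a Qualifier) through the AWS SDK. Keep at least the two most recent promoted versions for rollback, and never delete a version that an alias still references.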
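The selection step of such a cleanup job can be sketched as a pure function. This is a simplified illustration: versionsToDelete is a hypothetical helper, and it keeps the newest published versions rather than tracking which ones were actually promoted.

```typescript
// Pick the versions that are safe to delete: keep anything an alias points at
// (including weighted canary targets) plus the `keepLatest` newest versions.
function versionsToDelete(
  published: string[], // version identifiers as returned by ListVersionsByFunction
  aliased: string[],   // versions referenced by any alias
  keepLatest = 2,
): string[] {
  const keep = new Set(aliased);
  const newestFirst = published
    .filter((v) => v !== "$LATEST") // $LATEST is not a published version
    .map(Number)
    .sort((a, b) => b - a);
  newestFirst.slice(0, keepLatest).forEach((v) => keep.add(String(v)));
  return newestFirst
    .map(String)
    .filter((v) => !keep.has(v))
    .reverse(); // oldest first, a natural deletion order
}

// Versions 41 (aliased) and 42 (newest) are kept; 38, 39, and 40 are selected.
const doomed = versionsToDelete(["$LATEST", "38", "39", "40", "41", "42"], ["41"]);
console.log(doomed);
```

Each selected version would then be removed with a DeleteFunction call that passes the version as the Qualifier.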
Pipeline Integration
A production-grade CI/CD pipeline for serverless looks like this in GitHub Actions:
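# .github/workflows/deploy.yml
name: Deploy Order Service
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # OIDC for AWS auth, no long-lived keys
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GithubActionsDeployRole
          aws-region: us-east-1
      - run: npm ci
      - name: Run unit tests
        run: npm test
      - name: SAM build
        run: sam build --use-container   # reproducible, no local runtime needed
      - name: SAM deploy (canary)
        run: |
          sam deploy \
            --stack-name order-service-prod \
            --s3-bucket my-sam-artifacts-prod \
            --capabilities CAPABILITY_IAM \
            --no-fail-on-empty-changeset \
            --parameter-overrides LogLevel=info \
            --no-confirm-changeset
      - name: Monitor CodeDeploy rollout
        run: |
          # SAM generates the CodeDeploy application and deployment group names,
          # so look them up from the stack instead of hard-coding them.
          APP=$(aws cloudformation describe-stack-resources \
            --stack-name order-service-prod \
            --query "StackResources[?ResourceType=='AWS::CodeDeploy::Application'].PhysicalResourceId | [0]" \
            --output text)
          GROUP=$(aws cloudformation describe-stack-resources \
            --stack-name order-service-prod \
            --query "StackResources[?ResourceType=='AWS::CodeDeploy::DeploymentGroup'].PhysicalResourceId | [0]" \
            --output text)
          # Assumes the first deployment listed is the most recent one.
          DEPLOY_ID=$(aws deploy list-deployments \
            --application-name "$APP" \
            --deployment-group-name "$GROUP" \
            --query "deployments[0]" --output text)
          aws deploy wait deployment-successful --deployment-id "$DEPLOY_ID"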
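The final step blocks the pipeline until CodeDeploy either fully promotes or rolls back; aws deploy wait exits non-zero if the deployment fails or is stopped. In practice sam deploy already waits on the CloudFormation update, which itself waits for the traffic shift, so this step acts as an explicit check on the CodeDeploy result. If a rollback occurs, the pipeline fails, the commit is flagged, and, provided alerting is wired to the alarm or the workflow, the on-call engineer is paged. It is the same SRE loop you already know from Kubernetes rollouts.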
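SAM vs CDK deploy commands. sam deploy packages and uploads artifacts, then delegates to CloudFormation. cdk deploy synthesizes a CloudFormation template, publishes its assets, and does the same. Both are idempotent and diff-aware: with no changes, cdk deploy reports that there is nothing to deploy and exits cleanly, and sam deploy does the same when --no-fail-on-empty-changeset is set (by default it returns a non-zero exit code on an empty changeset).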