Serverless & Event-Driven Operations

Serverless IaC & Deployment


Serverless functions are deceptively simple to ship by hand — a zip and a console click. At scale that model collapses: you need reproducible environments, safe progressive rollouts, and a full audit trail. This lesson covers the three dominant IaC toolchains for serverless workloads, how AWS Lambda versioning and aliases unlock production-safe canary releases, and the deployment patterns that separate mature serverless shops from amateur ones.

The IaC Landscape for Serverless

Three tools dominate the space, each with a different philosophy:

  • AWS SAM (Serverless Application Model) — an open-source extension of CloudFormation. AWS::Serverless::Function transforms expand into Lambda + IAM + event-source-mapping stacks. Closest to CloudFormation; best when your team already owns a CF estate. Supports local invocation via sam local invoke and HTTP emulation via sam local start-api.
  • Serverless Framework (SLS) — provider-agnostic YAML DSL. Compiles down to native CloudFormation (on AWS) but hides much of the boilerplate. Ecosystem of 1,000+ plugins. Historically the de-facto choice; still dominant in polyglot shops that target Azure/GCP as well.
  • AWS CDK — infrastructure as real programming languages (TypeScript/Python/Go). The aws-lambda and aws-lambda-nodejs constructs handle bundling (via esbuild), layer management, and event wiring. Best for teams that want type-safety, loops, and reusable constructs.
Which to choose in 2025? AWS shops building net-new services are converging on CDK. Existing SAM templates are worth keeping — SAM and CDK interoperate. Serverless Framework is still the right call when you need multi-cloud portability or a quick prototype with community plugins.

SAM in Production

A minimal but production-realistic SAM template for an API-backed Lambda looks like this:

    # template.yaml
    AWSTemplateFormatVersion: "2010-09-09"
    Transform: AWS::Serverless-2016-10-31

    Globals:
      Function:
        Runtime: nodejs20.x
        Architectures: [arm64]   # Graviton2: ~20% cheaper, same throughput
        MemorySize: 512
        Timeout: 29              # equals API Gateway's default 29 s integration timeout
        Environment:
          Variables:
            LOG_LEVEL: !Ref LogLevel
        Layers:
          - !Ref CommonUtilsLayer
        Tracing: Active          # X-Ray

    Parameters:
      LogLevel:
        Type: String
        Default: info
        AllowedValues: [debug, info, warn, error]

    Resources:
      OrderApi:
        Type: AWS::Serverless::Api
        Properties:
          StageName: !Ref AWS::StackName
          TracingEnabled: true
          AccessLogSetting:        # the SAM property is AccessLogSetting
            DestinationArn: !GetAtt ApiAccessLogGroup.Arn
            Format: '{"requestId":"$context.requestId","status":"$context.status"}'

      OrderHandler:
        Type: AWS::Serverless::Function
        Properties:
          CodeUri: src/order/
          Handler: index.handler
          AutoPublishAlias: live   # SAM publishes a new version and updates the alias when the code changes
          DeploymentPreference:
            Type: Canary10Percent5Minutes   # 10 % canary, promote after 5 min if alarms are clean
            Alarms:
              - !Ref OrderErrorAlarm
            Hooks:
              # The two hook functions are assumed to be defined elsewhere in this template.
              PreTraffic: !Ref PreTrafficHook
              PostTraffic: !Ref PostTrafficHook
          Events:
            CreateOrder:
              Type: Api
              Properties:
                Path: /orders
                Method: POST
                RestApiId: !Ref OrderApi

      OrderErrorAlarm:
        Type: AWS::CloudWatch::Alarm
        Properties:
          AlarmName: OrderHandler-Errors
          MetricName: Errors
          Namespace: AWS/Lambda
          Dimensions:
            - Name: FunctionName
              Value: !Ref OrderHandler
            - Name: Resource
              Value: !Sub "${OrderHandler}:live"   # tracks the alias, not $LATEST
          Statistic: Sum
          Period: 60
          EvaluationPeriods: 1
          Threshold: 1
          ComparisonOperator: GreaterThanOrEqualToThreshold

      CommonUtilsLayer:
        Type: AWS::Serverless::LayerVersion
        Properties:
          ContentUri: layers/common/
          CompatibleRuntimes: [nodejs20.x]
          RetentionPolicy: Retain   # never auto-delete old layer versions
        Metadata:
          BuildMethod: nodejs20.x

      ApiAccessLogGroup:
        Type: AWS::Logs::LogGroup
        Properties:
          RetentionInDays: 30

The AutoPublishAlias: live line is the key production primitive: each sam deploy that changes the function code publishes a new immutable version and points the alias at it. Combined with DeploymentPreference, CodeDeploy manages the canary shift without any extra tooling.
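To make the preference name concrete, here is a minimal TypeScript sketch of how a name such as Canary10Percent5Minutes maps onto a traffic schedule. The TrafficPlan type and both helper functions are illustrative names, not part of SAM or CodeDeploy, and the linear schedule assumes the first increment shifts at minute zero.

```typescript
// Illustrative model of a CodeDeploy Lambda deployment preference (hypothetical type).
interface TrafficPlan {
  kind: "canary" | "linear" | "all-at-once";
  stepPercent: number;     // share of traffic moved per step
  intervalMinutes: number; // wait between steps
}

// Parse names such as "Canary10Percent5Minutes" or "Linear10PercentEvery1Minute".
function parsePreference(name: string): TrafficPlan {
  if (name === "AllAtOnce") {
    return { kind: "all-at-once", stepPercent: 100, intervalMinutes: 0 };
  }
  const canary = /^Canary(\d+)Percent(\d+)Minutes$/.exec(name);
  if (canary) {
    return { kind: "canary", stepPercent: Number(canary[1]), intervalMinutes: Number(canary[2]) };
  }
  const linear = /^Linear(\d+)PercentEvery(\d+)Minutes?$/.exec(name);
  if (linear) {
    return { kind: "linear", stepPercent: Number(linear[1]), intervalMinutes: Number(linear[2]) };
  }
  throw new Error(`Unrecognized deployment preference: ${name}`);
}

// Share of traffic on the new version `minute` minutes after shifting starts,
// assuming no alarm fires and no hook fails.
function newVersionPercentAt(plan: TrafficPlan, minute: number): number {
  if (plan.kind === "all-at-once") return 100;
  if (plan.kind === "canary") {
    return minute < plan.intervalMinutes ? plan.stepPercent : 100;
  }
  // Assumption: linear rollouts shift one step immediately, then one per interval.
  const stepsTaken = Math.floor(minute / plan.intervalMinutes) + 1;
  return Math.min(100, stepsTaken * plan.stepPercent);
}

const canaryPlan = parsePreference("Canary10Percent5Minutes");
console.log(newVersionPercentAt(canaryPlan, 2)); // 10
console.log(newVersionPercentAt(canaryPlan, 5)); // 100
```

The sketch ignores rollback: if an alarm fires during the five-minute window, the new version's share drops back to zero rather than following this schedule.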

Versioning and Aliases: the Production Mental Model

Lambda versions are immutable snapshots: code + configuration + environment variables. Once published, they cannot change. Aliases are mutable pointers. This separation gives you three production capabilities:

  1. Canary releases — an alias can split traffic by weight. Route 10 % to version 42 and 90 % to version 41. CloudWatch alarms watch alias-scoped metrics. If errors spike, CodeDeploy rolls the alias weight back to 0 on the canary. No Lambda redeployment, no cold start surge from a fresh deploy.
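  2. Rollback in seconds: aws lambda update-alias --function-name OrderHandler --name live --function-version 41 is a single atomic API call that takes effect almost immediately (add --routing-config AdditionalVersionWeights={} if a canary weight is still set on the alias). Compare with a container rollout that must drain pods.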
  3. Environment pinning for downstream consumers — event source mappings and API GW integrations reference the alias ARN, not $LATEST. They never break across deployments.
[Figure: Lambda Alias Traffic Splitting: Canary Release. API Gateway (POST /orders) calls the alias live, which sends 90% of traffic to v41 (stable, immutable snapshot) and 10% to v42 (canary, new code). A CloudWatch alarm on alias-scoped Errors triggers auto-rollback, setting the canary weight to 0%.]
API Gateway targets the alias ARN; CodeDeploy shifts weighted traffic to the canary version and auto-rolls back if the CloudWatch alarm fires.
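The weighted-pointer idea can be modeled in a few lines. The sketch below is a toy model, not the Lambda routing implementation: AliasRouting, pickVersion, and rollback are hypothetical names, and the random roll is passed in so the behavior is deterministic.

```typescript
// Hypothetical model of an alias with weighted routing (not an AWS SDK type).
interface AliasRouting {
  primaryVersion: string;                    // receives whatever traffic is left over
  additionalWeights: Record<string, number>; // version -> weight between 0 and 1
}

// Choose the version that serves one request. `roll` is a number in [0, 1).
function pickVersion(alias: AliasRouting, roll: number): string {
  let cumulative = 0;
  for (const [version, weight] of Object.entries(alias.additionalWeights)) {
    cumulative += weight;
    if (roll < cumulative) return version;
  }
  return alias.primaryVersion;
}

// Rolling back a canary means dropping the extra weights; no code is redeployed.
function rollback(alias: AliasRouting): AliasRouting {
  return { primaryVersion: alias.primaryVersion, additionalWeights: {} };
}

const live: AliasRouting = { primaryVersion: "41", additionalWeights: { "42": 0.1 } };
console.log(pickVersion(live, 0.05));           // "42": the roll falls inside the 10 % canary slice
console.log(pickVersion(live, 0.5));            // "41": everything else goes to stable
console.log(pickVersion(rollback(live), 0.05)); // "41": after rollback the canary gets nothing
```

Because both versions already exist as immutable snapshots, promotion and rollback are only changes to this small routing record.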

CDK Pattern: Canary with TypeScript

The CDK equivalent is more concise and type-safe. The aws-codedeploy module exposes the same CodeDeploy primitives that SAM's DeploymentPreference generates under the hood:

    // lib/order-stack.ts
    import { Duration, Stack, StackProps } from "aws-cdk-lib";
    import { Construct } from "constructs";
    import * as lambda from "aws-cdk-lib/aws-lambda";
    import * as lambdaNodeJs from "aws-cdk-lib/aws-lambda-nodejs";
    import * as codedeploy from "aws-cdk-lib/aws-codedeploy";
    import * as cloudwatch from "aws-cdk-lib/aws-cloudwatch";

    export class OrderStack extends Stack {
      constructor(scope: Construct, id: string, props?: StackProps) {
        super(scope, id, props);

        const fn = new lambdaNodeJs.NodejsFunction(this, "OrderHandler", {
          entry: "src/order/index.ts",
          runtime: lambda.Runtime.NODEJS_20_X,
          architecture: lambda.Architecture.ARM_64,
          memorySize: 512,
          timeout: Duration.seconds(29),
          bundling: { minify: true, sourceMap: true },
        });

        // Publish an immutable version whenever code or config changes
        const version = fn.currentVersion;

        // Alias "live" points to the latest version
        const alias = new lambda.Alias(this, "LiveAlias", {
          aliasName: "live",
          version,
        });

        // Error alarm scoped to the alias
        const errorAlarm = new cloudwatch.Alarm(this, "OrderErrors", {
          metric: alias.metricErrors({ period: Duration.minutes(1) }),
          threshold: 1,
          evaluationPeriods: 1,
        });

        // CodeDeploy canary: 10 % for 5 minutes, roll back on alarm
        new codedeploy.LambdaDeploymentGroup(this, "OrderDeploymentGroup", {
          alias,
          deploymentConfig: codedeploy.LambdaDeploymentConfig.CANARY_10PERCENT_5MINUTES,
          alarms: [errorAlarm],
          autoRollback: { deploymentInAlarm: true, failedDeployment: true },
        });
      }
    }

Version publishing is a CDK footgun. The fn.currentVersion property yields a new Version resource whenever synth detects a code or config change, which is exactly what you want. A hand-built version with a fixed construct ID (new lambda.Version(...), or fn.addVersion("v1") in CDK v1, which v2 no longer provides) publishes only once, because its logical ID never changes. Use currentVersion exclusively and let CDK detect the change.

Serverless Framework: Multi-Stage Canary

For teams on Serverless Framework, the serverless-plugin-canary-deployments plugin wires CodeDeploy in the same way:

    # serverless.yml
    service: order-service
    frameworkVersion: "4"

    provider:
      name: aws
      runtime: nodejs20.x
      region: us-east-1
      architecture: arm64

    plugins:
      # Framework v4 also ships built-in esbuild bundling; check that it does not conflict with this plugin.
      - serverless-esbuild
      - serverless-plugin-canary-deployments

    custom:
      esbuild:
        bundle: true
        minify: true
      canarySettings:
        type: Canary10Percent5Minutes
        alarms:
          # The canary plugin takes alarm references only (a logical ID string or an
          # object with the alarm name). The alarm itself (AWS/Lambda Errors, Sum,
          # 60 s period, 1 evaluation period, threshold 1) must be defined elsewhere.
          - name: OrderHandlerErrors

    functions:
      createOrder:
        handler: src/order/index.handler
        memorySize: 512
        timeout: 29
        deploymentSettings:
          type: ${self:custom.canarySettings.type}
          alias: live
          preTrafficHook: preTrafficHook   # hook function assumed to be defined elsewhere under functions
          alarms: ${self:custom.canarySettings.alarms}
        events:
          - httpApi:
              path: /orders
              method: POST

Pre- and Post-Traffic Hooks

Both SAM and CDK support lifecycle hooks — Lambda functions that CodeDeploy invokes before and after shifting traffic. This is where you run smoke tests against the new version in isolation:

  • PreTrafficHook — invoke the new version directly (not via alias) and assert that a known test payload returns HTTP 200 with the expected schema. Call codedeploy:PutLifecycleEventHookExecutionStatus to report pass/fail. A failure here aborts the canary before a single production request hits the new code.
  • PostTrafficHook — after full promotion, verify downstream side-effects: DynamoDB record shapes, SNS message counts, SQS queue depth. Use this to catch "silent corruption" bugs that don't surface as Lambda errors.
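Do not call the alias in your hook. If your PreTrafficHook invokes the live alias it hits the old version, because no traffic has shifted to the new one yet. Always invoke the specific version ARN. CodeDeploy does not supply it: the hook event carries only DeploymentId and LifecycleEventHookExecutionId, so pass the ARN in yourself, for example as an environment variable on the hook function set to !Ref OrderHandler.Version in SAM.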
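A pre-traffic hook boils down to "invoke the new version, judge the response, report the verdict". The sketch below shows that flow with the AWS calls hidden behind a hypothetical HookPorts interface, so it runs without the SDK. In a real hook the two ports would wrap Lambda Invoke and CodeDeploy PutLifecycleEventHookExecutionStatus. The orderId field in the expected schema is an assumption made for illustration.

```typescript
type HookStatus = "Succeeded" | "Failed";

// Fields CodeDeploy includes in the event it sends to a lifecycle hook.
interface HookEvent {
  DeploymentId: string;
  LifecycleEventHookExecutionId: string;
}

// Hypothetical ports standing in for the AWS SDK clients.
interface HookPorts {
  invokeVersion(versionArn: string, payload: string): Promise<{ statusCode: number; body: string }>;
  reportStatus(hookEvent: HookEvent, result: HookStatus): Promise<void>;
}

// Pure check: HTTP 200 plus a body matching the (assumed) schema { orderId: string }.
function evaluateSmokeResponse(statusCode: number, body: string): HookStatus {
  if (statusCode !== 200) return "Failed";
  try {
    const parsed: unknown = JSON.parse(body);
    const hasOrderId =
      typeof parsed === "object" &&
      parsed !== null &&
      typeof (parsed as { orderId?: unknown }).orderId === "string";
    return hasOrderId ? "Succeeded" : "Failed";
  } catch {
    return "Failed"; // a body that is not JSON fails the check
  }
}

// Invoke the new version by its ARN (never the alias), then always report a status.
async function runPreTrafficHook(
  hookEvent: HookEvent,
  versionArn: string,
  ports: HookPorts,
): Promise<HookStatus> {
  let result: HookStatus = "Failed";
  try {
    const res = await ports.invokeVersion(versionArn, JSON.stringify({ smokeTest: true }));
    result = evaluateSmokeResponse(res.statusCode, res.body);
  } catch {
    result = "Failed"; // an invocation error also aborts the canary
  }
  await ports.reportStatus(hookEvent, result);
  return result;
}
```

Reporting in every code path matters: if the hook never reports a status, the deployment simply waits until the hook times out.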

Version Retention and Cleanup

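Lambda retains every published version until something deletes it. A busy CI/CD pipeline publishes 20–30 versions per day per function. When old versions are kept (published outside CloudFormation, or retained on replacement), they accumulate into hundreds per function, eat into the 75 GB default regional code-storage quota, and slow down ListVersionsByFunction pagination. Best practice: decide retention explicitly. In CDK, currentVersionOptions.removalPolicy controls whether a replaced version is destroyed or retained. In SAM, RetentionPolicy applies to layer versions (AWS::Serverless::LayerVersion), not function versions, so prune function versions with a weekly cleanup Lambda that calls ListVersionsByFunction and DeleteFunction (with a Qualifier) through the AWS SDK. Keep at least the two most recent promoted versions for rollback.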
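The selection logic of such a cleanup job is easy to get wrong, so it helps to keep it as a pure function. The sketch below is one possible policy, under the simplifying assumption that "most recent" means highest version number; versionsToDelete and its parameters are hypothetical names, and the actual listing and deletion calls are left out.

```typescript
// Decide which published versions are safe to delete.
//   published:  version identifiers as returned by a listing call, may include "$LATEST"
//   aliased:    versions that any alias currently points at (never deleted)
//   keepLatest: how many of the newest numbered versions to keep for rollback
function versionsToDelete(published: string[], aliased: string[], keepLatest: number): string[] {
  const numbered = published
    .filter((v) => v !== "$LATEST")
    .map(Number)
    .sort((a, b) => b - a); // newest first

  const protectedVersions = new Set<number>([
    ...numbered.slice(0, keepLatest),
    ...aliased.map(Number),
  ]);

  return numbered.filter((v) => !protectedVersions.has(v)).map(String);
}

const doomed = versionsToDelete(["$LATEST", "38", "39", "40", "41", "42"], ["41"], 2);
console.log(doomed); // ["40", "39", "38"]
```

Protecting aliased versions separately from the newest N guards against deleting a version that is still serving traffic after a rollback to an older release.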

Pipeline Integration

A production-grade CI/CD pipeline for serverless looks like this in GitHub Actions:

    # .github/workflows/deploy.yml
    name: Deploy Order Service
    on:
      push:
        branches: [main]

    jobs:
      deploy:
        runs-on: ubuntu-latest
        permissions:
          id-token: write   # OIDC for AWS auth, no long-lived keys
          contents: read
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: 20
              cache: npm
          - name: Configure AWS credentials (OIDC)
            uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: arn:aws:iam::123456789012:role/GithubActionsDeployRole
              aws-region: us-east-1
          - run: npm ci
          - name: Run unit tests
            run: npm test
          - name: SAM build
            run: sam build --use-container   # reproducible, no local runtime needed
          - name: SAM deploy (canary)
            run: |
              sam deploy \
                --stack-name order-service-prod \
                --s3-bucket my-sam-artifacts-prod \
                --capabilities CAPABILITY_IAM \
                --no-fail-on-empty-changeset \
                --parameter-overrides LogLevel=info \
                --no-confirm-changeset
          - name: Monitor CodeDeploy rollout
            env:
              # Placeholders: CloudFormation generates the real application and
              # deployment group names, so look them up in the deployed stack.
              CODEDEPLOY_APP: order-service-prod-OrderHandler
              CODEDEPLOY_GROUP: "<deployment-group-name>"
            run: |
              # list-deployments requires the deployment group whenever the application is given
              DEPLOY_ID=$(aws deploy list-deployments \
                --application-name "$CODEDEPLOY_APP" \
                --deployment-group-name "$CODEDEPLOY_GROUP" \
                --query "deployments[0]" --output text)
              aws deploy wait deployment-successful --deployment-id "$DEPLOY_ID"

The sam deploy step already blocks while CloudFormation waits for the CodeDeploy deployment, and it fails if the canary rolls back; the final step is an explicit second check on the deployment status. If rollback occurs, the pipeline fails, the commit is flagged, and the on-call engineer gets the PagerDuty alert: the standard SRE loop you already know from Kubernetes rollouts.
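For pipelines that poll CodeDeploy themselves instead of using the CLI waiter, the decision logic can be sketched as below. This is an illustration, not the waiter's actual implementation: classifyDeployment and waitForRollout are hypothetical names, getStatus is a hypothetical port that would wrap a CodeDeploy GetDeployment call, and treating a timeout as a failure is a design assumption.

```typescript
type RolloutOutcome = "pending" | "success" | "failure";

// Map a CodeDeploy deployment status string onto a pipeline decision.
function classifyDeployment(deploymentStatus: string): RolloutOutcome {
  switch (deploymentStatus) {
    case "Succeeded":
      return "success";
    case "Failed":
    case "Stopped":
      return "failure";
    default:
      return "pending"; // e.g. Created, Queued, InProgress, Baking, Ready
  }
}

// Poll until the deployment settles or the attempts run out.
async function waitForRollout(
  getStatus: () => Promise<string>, // hypothetical port wrapping the status lookup
  maxAttempts: number,
  delayMs: number,
): Promise<RolloutOutcome> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const outcome = classifyDeployment(await getStatus());
    if (outcome !== "pending") return outcome;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return "failure"; // assumption: a rollout that never settles fails the pipeline
}
```

A caller would exit non-zero on anything other than "success", which is what makes a rolled-back canary show up as a red build.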

SAM vs CDK deploy commands. sam deploy packages and uploads artifacts then delegates to CloudFormation. cdk deploy synthesizes a CloudFormation template and does the same. Both are idempotent and diff-aware: a cdk deploy with no changes is a no-op, while sam deploy treats an empty changeset as an error unless you pass --no-fail-on-empty-changeset.