Architecture Patterns

Serverless & Functions-as-a-Service


Serverless computing does not mean there are no servers; it means you stop managing them. You write a function, deploy it to a cloud provider (AWS Lambda, Google Cloud Functions, Azure Functions, Cloudflare Workers), and the provider handles provisioning, scaling, patching, and OS maintenance. You are billed per invocation and for execution time (metered per millisecond on AWS Lambda), not for idle capacity. At zero traffic, you pay nothing for compute. As traffic grows, the platform adds instances automatically, up to the account's concurrency limits.

The core abstraction is the Function-as-a-Service (FaaS) unit: a stateless, short-lived handler that is invoked by an event, does its work, and exits. Each invocation is independent: it cannot rely on in-process state left behind by earlier invocations, and there are no long-running threads to manage.

How Event-Triggered Invocation Works

Every FaaS function is wired to an event source. The runtime listens on your behalf and calls your function when an event fires. Common event sources include:

HTTP / API Gateway: an HTTPS request hits an API Gateway endpoint, which maps it to a function. Amazon API Gateway in front of Lambda is a very common pattern.
Message queues / streams: a message lands in an SQS queue or a Kinesis/Kafka stream; the platform delivers a batch of records to your function.
Storage events: a file is uploaded to S3/GCS; the bucket emits an event and your function receives the object metadata.
Scheduled triggers (cron): Amazon EventBridge Scheduler or Google Cloud Scheduler invokes your function on a cron schedule (e.g. nightly report generation).
Database change streams: DynamoDB Streams or Firestore triggers deliver change records in near real time.
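
To make the event shapes concrete, here is a minimal sketch of a Python handler that inspects the incoming payload and branches on its source. The event dictionaries are heavily abbreviated versions of the AWS formats (an API Gateway REST proxy request, an SQS batch, an S3 notification), and `classify_event` is a hypothetical helper. In practice you would usually deploy one function per event source rather than one dispatcher.

```python
import json


def classify_event(event: dict) -> str:
    """Return a coarse label for the kind of event that triggered the function."""
    records = event.get("Records")
    if records:
        # SQS and S3 both deliver a "Records" list; eventSource tells them apart.
        source = records[0].get("eventSource", "")
        if source == "aws:sqs":
            return "queue"
        if source == "aws:s3":
            return "storage"
    if "httpMethod" in event:
        # Shape of an API Gateway REST proxy request.
        return "http"
    return "unknown"


def handler(event: dict, context: object) -> dict:
    kind = classify_event(event)
    if kind == "http":
        body = {"message": f"{event['httpMethod']} {event.get('path', '/')}"}
        return {"statusCode": 200, "body": json.dumps(body)}
    if kind == "queue":
        # One invocation may carry a whole batch of messages.
        return {"processed": [r["messageId"] for r in event["Records"]]}
    if kind == "storage":
        # The function receives object metadata, not the object contents.
        return {"objects": [r["s3"]["object"]["key"] for r in event["Records"]]}
    return {"ignored": True}
```

Calling `handler({"httpMethod": "GET", "path": "/orders"}, None)` locally returns a response with `statusCode` 200, which is enough to unit-test the branching without deploying anything.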

Figure: Serverless event-triggered architecture. Multiple event sources invoke stateless function instances; the platform scales concurrency automatically and bills only for execution time.

The Cold Start Problem

When a function has not been called recently, or when traffic grows and the platform needs additional instances, it must spin up a new container, load the runtime (Node.js, Python, JVM, etc.), and initialize your code before executing the handler. This cold start penalty typically adds 100 ms–2 s of extra latency. Subsequent calls that land on the warm container skip the penalty. Key facts:

Node.js and Python cold starts: typically 100–300 ms on AWS Lambda.
JVM (Java/Kotlin) cold starts: 1–3 s because JVM initialization is heavy. AWS SnapStart can reduce this to ~200 ms by restoring a pre-initialized snapshot.
Cloudflare Workers (V8 isolates, not containers): cold start under 5 ms — a fundamentally different execution model.
Mitigation strategies: keep functions small (less to load), use provisioned concurrency (pre-warmed instances, charged even when idle), or design flows so cold-start paths are non-critical.

Cold starts make on-demand FaaS a poor fit for latency-critical, synchronous paths, e.g. the main checkout flow of an e-commerce site where p99 latency must stay under 100 ms: even a small fraction of cold invocations lands squarely in the tail. Use always-on containers or provisioned (pre-warmed) instances for those paths. Cold starts are acceptable for background jobs, async processing, and webhook handlers, where a few hundred milliseconds of extra latency is tolerable.
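
The effect on tail latency can be illustrated with a small deterministic model. All numbers here are assumptions chosen for illustration: 1,000 requests, a 40 ms warm latency, a 300 ms cold-start penalty, and one cold invocation in every 50 (2%).

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def simulate_latencies(total: int, cold_every: int, warm_ms: float, cold_penalty_ms: float) -> list[float]:
    """Deterministic model: every `cold_every`-th request hits a cold instance."""
    return [
        warm_ms + (cold_penalty_ms if i % cold_every == 0 else 0.0)
        for i in range(total)
    ]


latencies = simulate_latencies(total=1000, cold_every=50, warm_ms=40.0, cold_penalty_ms=300.0)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
print(f"p50={p50:.0f} ms, p99={p99:.0f} ms")  # prints "p50=40 ms, p99=340 ms"
```

The median is untouched, but with 2% of requests cold the p99 is a cold request. A path with a 100 ms p99 budget fails in this model even though 98% of calls are fast.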

Statelessness and the Execution Model

Treat every FaaS invocation as isolated. You cannot store state in a global variable and rely on the next invocation seeing it: that invocation may run on a different instance, or the original instance may have been recycled. The correct pattern is:

State belongs outside the function — in a database (RDS, DynamoDB), a cache (ElastiCache/Redis), or object storage (S3). The function reads what it needs, processes, writes results, and exits.
Expensive one-time setup (DB connection, SDK client initialization) goes at the module level, outside the handler. On a warm container the platform reuses the module, so the setup runs once. On a cold start it runs once before the first invocation. This is a safe optimization, not a stateful pattern.
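
A minimal sketch of the module-level setup pattern follows. `FakeDbClient` is a hypothetical stand-in for a real database or SDK client, and `INIT_COUNT` exists only to make the "runs once per container" behavior observable.

```python
INIT_COUNT = 0  # counts how many times the expensive setup has run


class FakeDbClient:
    """Hypothetical stand-in for a real database or SDK client."""

    def __init__(self) -> None:
        global INIT_COUNT
        INIT_COUNT += 1  # pretend this is a slow connection handshake

    def get_user(self, user_id: str) -> dict:
        return {"id": user_id, "name": f"user-{user_id}"}


# Module level: runs once per execution environment, i.e. once per cold start.
db = FakeDbClient()


def handler(event: dict, context: object) -> dict:
    # Handler level: runs on every invocation and reuses the client above.
    # Nothing request-specific is kept in module scope.
    return db.get_user(event["userId"])
```

Invoking `handler` several times in the same process leaves `INIT_COUNT` at 1, which mirrors what happens on a warm container. The client is a reusable resource, not application state: losing it on a cold start changes latency, never results.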

Key constraint: AWS Lambda enforces a maximum execution timeout of 15 minutes. If your workload runs longer, break it into smaller steps orchestrated by a state machine (AWS Step Functions) or a queue fan-out pattern. Long-running jobs belong in containers or dedicated workers, not FaaS.
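
One way to break up a long job is to have each invocation work until its time budget runs low and then return a cursor for the next invocation. The sketch below assumes an orchestrator (such as a state machine) re-invokes the function until `done` is true. `get_remaining_time_in_millis()` is the method the Lambda Python context object provides; `FakeContext`, `process_item`, `run_to_completion`, and the 10-second safety margin are hypothetical, for local illustration only.

```python
SAFETY_MARGIN_MS = 10_000  # assumed margin: stop well before the platform timeout


def process_item(item: int) -> int:
    """Hypothetical unit of work."""
    return item * 2


def handler(event: dict, context) -> dict:
    items = event["items"]
    cursor = event.get("cursor", 0)
    results = []
    while cursor < len(items):
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break  # out of budget: hand the rest to the next invocation
        results.append(process_item(items[cursor]))
        cursor += 1
    return {"cursor": cursor, "done": cursor >= len(items), "results": results}


class FakeContext:
    """Test double: each time check 'costs' step_ms of the remaining budget."""

    def __init__(self, budget_ms: int, step_ms: int) -> None:
        self.remaining = budget_ms
        self.step_ms = step_ms

    def get_remaining_time_in_millis(self) -> int:
        current = self.remaining
        self.remaining -= self.step_ms
        return current


def run_to_completion(items: list[int]) -> tuple[list[int], int]:
    """Local stand-in for the orchestrator loop."""
    cursor, collected, invocations = 0, [], 0
    while True:
        out = handler({"items": items, "cursor": cursor}, FakeContext(13_000, 1_000))
        invocations += 1
        collected.extend(out["results"])
        if out["done"]:
            return collected, invocations
        cursor = out["cursor"]
```

With the fake budget above, each invocation handles four items, so ten items complete in three invocations. In a real system the item list would normally live in storage or a queue rather than in the event payload, and only the cursor would be passed along.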

When Serverless Fits — and When It Does Not

Figure: Serverless good fit vs. poor fit. The pattern excels at event-driven, variable-traffic workloads but struggles with latency-critical paths and long-running tasks.

Cost Model: Where Serverless Saves and Where It Bites

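AWS Lambda pricing (us-east-1, x86, as of 2024): $0.20 per million invocations plus $0.0000166667 per GB-second of compute (allocated memory × execution time). The monthly free tier covers the first 1 million invocations and 400,000 GB-seconds. A function allocated 512 MB and running for 200 ms uses 0.1 GB-seconds, which costs about $0.0000017 in compute plus $0.0000002 for the request, roughly $0.0000019 per call. One million such calls cost about $1.87 before the free tier is applied. Compare that to a t3.small EC2 instance at ~$15/month that handles the same million calls but sits idle most of the time: a million 200 ms calls add up to about 56 hours of work in a roughly 720-hour month.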
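
The per-call arithmetic can be checked with a few lines of Python. This is a sketch using only the two list prices quoted above. It counts both the compute charge and the per-request charge, and ignores the free tier, data transfer, and any other services.

```python
REQUEST_PRICE_PER_MILLION = 0.20            # USD, as quoted above
COMPUTE_PRICE_PER_GB_SECOND = 0.0000166667  # USD, as quoted above


def lambda_cost(invocations: int, memory_mb: int, duration_ms: float) -> dict:
    """Cost in USD before any free-tier allowance."""
    gb_seconds = invocations * (memory_mb / 1024) * (duration_ms / 1000)
    compute = gb_seconds * COMPUTE_PRICE_PER_GB_SECOND
    requests = invocations / 1_000_000 * REQUEST_PRICE_PER_MILLION
    return {
        "gb_seconds": gb_seconds,
        "compute": compute,
        "requests": requests,
        "total": compute + requests,
    }


cost = lambda_cost(invocations=1_000_000, memory_mb=512, duration_ms=200)
# prints "compute=$1.67 requests=$0.20 total=$1.87"
print(f"compute=${cost['compute']:.2f} requests=${cost['requests']:.2f} total=${cost['total']:.2f}")
```

Because cost scales with memory × duration, halving either one roughly halves the compute line, which is why right-sizing memory is usually the first cost lever.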

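The inversion happens at high sustained load. If your service runs at 10,000 RPS continuously, 24/7, that is roughly 26 billion invocations a month; at the per-call cost above, the Lambda bill runs to tens of thousands of dollars, and a container fleet sized for that load is usually far cheaper. The crossover point (Lambda vs. containers) is often put at around 5,000–20,000 sustained RPS, but treat that as a rough rule of thumb: it shifts with function duration, memory size, and how fully the container fleet is utilized, so estimate it from your own numbers.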
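
A rough way to locate the crossover for a specific workload is to compute both monthly bills at several sustained request rates. The sketch below reuses the Lambda list prices quoted earlier. The fleet side rests on two loudly hypothetical inputs: each instance sustains 50 RPS and costs $15 per month. Real values depend on the workload and instance type and should be measured.

```python
import math

SECONDS_PER_MONTH = 30 * 24 * 3600          # 30-day month
REQUEST_PRICE = 0.20 / 1_000_000            # USD per invocation (quoted earlier)
COMPUTE_PRICE_PER_GB_SECOND = 0.0000166667  # USD (quoted earlier)


def lambda_monthly_cost(rps: float, memory_mb: int = 512, duration_ms: float = 200) -> float:
    invocations = rps * SECONDS_PER_MONTH
    per_call = (memory_mb / 1024) * (duration_ms / 1000) * COMPUTE_PRICE_PER_GB_SECOND + REQUEST_PRICE
    return invocations * per_call


def fleet_monthly_cost(rps: float, rps_per_instance: float, instance_price: float) -> float:
    # At least one instance keeps running even at zero traffic.
    instances = max(1, math.ceil(rps / rps_per_instance))
    return instances * instance_price


for rps in (1, 100, 10_000):
    faas = lambda_monthly_cost(rps)
    fleet = fleet_monthly_cost(rps, rps_per_instance=50, instance_price=15.0)  # assumed inputs
    cheaper = "FaaS" if faas < fleet else "containers"
    print(f"{rps:>6} RPS: FaaS ${faas:,.0f} vs fleet ${fleet:,.0f} -> {cheaper}")
```

The result is very sensitive to the assumed per-instance throughput, and the model leaves out load balancers, operations effort, and headroom for traffic spikes, so it gives a starting estimate rather than a verdict.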

Real-world pattern, hybrid architecture: Many production systems use serverless for the long tail of event-driven work (image processing, notifications, ETL, webhooks) while keeping the hot synchronous API path on always-on containers or Kubernetes pods. The async workloads often account for around 80% of invocations but only around 5% of latency-sensitive requests, a natural fit for FaaS economics.

Observability in Serverless Systems

Debugging serverless is harder than debugging a monolith because requests fan out across ephemeral, individually logged instances. Mandatory practices:

Structured logging: emit JSON logs with a requestId (the cloud provider supplies one per invocation) so you can correlate all log lines for a single invocation.
Distributed tracing: propagate a trace context (W3C traceparent header or AWS X-Ray trace ID) through every async hop. Without this, a chain of five Lambda functions triggered by events is completely opaque.
Dead Letter Queues (DLQ): configure a DLQ or on-failure destination for every asynchronous event source. Events that still fail after retries land there for inspection and replay instead of silently disappearing.
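
The first two practices can be sketched in a few lines of standard-library Python. `aws_request_id` is the attribute the Lambda Python context exposes. The `log` and `child_traceparent` helpers, the `traceparent` key in the event, and the `orderId` field are hypothetical names for illustration. The header layout follows the W3C format `version-traceid-parentid-flags`.

```python
import json
import secrets


def log(level: str, message: str, request_id: str, **fields) -> str:
    """Emit one JSON log line; the line is also returned so tests can inspect it."""
    line = json.dumps({"level": level, "message": message, "requestId": request_id, **fields})
    print(line)
    return line


def child_traceparent(parent: str) -> str:
    """Keep the trace id, mint a new span id for the next hop."""
    version, trace_id, _parent_span, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def handler(event: dict, context) -> dict:
    # Outside Lambda (e.g. in a unit test) there is no request id, so fall back.
    request_id = getattr(context, "aws_request_id", "local")
    # Continue the caller's trace if one was passed in, otherwise start a new one.
    incoming = event.get("traceparent") or f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"
    log("INFO", "order received", request_id, traceparent=incoming, orderId=event.get("orderId"))
    # The outgoing value travels with whatever message this function emits next.
    return {"orderId": event.get("orderId"), "traceparent": child_traceparent(incoming)}
```

Every hop keeps the same 32-character trace id while replacing the span id, so a log query on that trace id returns the whole chain in order. A production setup would more likely rely on a tracing SDK than on hand-rolled header handling.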

Architecture principle: serverless does not eliminate operational complexity — it shifts it from infrastructure management (patching, scaling) to event plumbing and observability. Well-run serverless systems require rigorous tracing, alerting on DLQ depth, and timeout budgeting across the entire invocation chain.