Serverless vs Containers: Choosing the Right Compute Model
Compare serverless (Lambda, Cloud Run) and containers (ECS, EKS, Fargate) on cold starts, pricing, scaling, vendor lock-in, and local development. Learn when to use each compute model.

$14,000 a Month to Run a Lambda That Should Have Been a Container
A series-A startup I advised had rebuilt their core API on AWS Lambda in 2023, following the "serverless first" advice that was fashionable at the time. Eighteen months later their AWS bill told a different story: the single largest line item was Lambda invocations and API Gateway requests, at $14,200 a month for a service handling roughly 400 requests per second. That same workload on two Fargate tasks sized like a c7g.xlarge (4 vCPU, 8 GB each) behind an ALB would have cost about $180 a month. They were paying roughly 78x more because the workload was sustained, not spiky, and Lambda's per-invocation billing becomes brutal at steady load.
The flip side is equally common. Another team containerised a webhook receiver that ran three times a day. Their minimum-size Fargate deployment cost $60 a month to handle about 90 requests a month, a tiny fraction of what the Lambda free tier covers. That should have been a 12-line Lambda function costing zero dollars.
Serverless versus containers is not a religion. It is a workload-shape question. Sustained high RPS, long-running, complex local dev? Containers win. Event-driven, sporadic, single-cloud glue? Lambda wins. Most real systems use both, and the mistake is always the same: picking one model and forcing it everywhere. This guide is the decision framework I use when advising teams, with real 2026 pricing at 10,000 req/day and 10 million req/day so you can estimate which way your workload tips.
Last updated: April 2026 — verified Lambda pricing (SnapStart now GA on Python and Node.js, not just Java), Fargate Spot 70%-off pricing, Cloud Run gen2 cold-start figures, and Azure Container Apps Dynamic Sessions GA.
Head-to-Head Comparison
| Factor | Serverless (Lambda) | Containers (ECS/EKS) |
|---|---|---|
| Cold starts | 100ms-10s depending on runtime and size | None (always running) |
| Execution limit | 15 minutes (Lambda) | No limit |
| Memory limit | 10 GB (Lambda) | Limited by instance size |
| Scaling | Instant, automatic, to thousands of concurrent executions | Minutes (new tasks/pods need to start) |
| Scale to zero | Yes (no cost when idle) | Not by default (minimum tasks keep running; Fargate Spot and Karpenter cut the cost of that floor but do not remove it) |
| Local development | Emulators (SAM, serverless-offline) -- imperfect parity | Docker Compose -- near-perfect parity |
| Vendor lock-in | High (event triggers, IAM, service integrations) | Low (Docker images run anywhere) |
| Debugging | CloudWatch Logs, X-Ray -- limited | Full access to container logs, shell, metrics |
| Networking | VPC optional, adds cold start latency | Full VPC control |
Where Serverless Wins
Event-Driven Processing
Lambda excels at reacting to events: an image uploaded to S3, a message arriving in SQS, a row inserted in DynamoDB, or an API Gateway request. Each event triggers a function invocation independently. You don't need to manage polling, concurrency, or idle resources.
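// Lambda handler: resize image on S3 upload
import { S3Event } from 'aws-lambda';
import { S3Client, GetObjectCommand, PutObjectCommand } from '@aws-sdk/client-s3';
import sharp from 'sharp';

// Create the client once so warm invocations reuse it
const s3 = new S3Client({});

export const handler = async (event: S3Event) => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    // Object keys arrive URL-encoded, with spaces as "+"
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
    // Skip our own output so the function does not re-trigger itself
    if (key.startsWith('thumbnails/')) continue;

    const image = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
    const body = Buffer.from(await image.Body!.transformToByteArray());
    const resized = await sharp(body).resize(800, 600).toBuffer();

    await s3.send(new PutObjectCommand({
      Bucket: bucket,
      Key: `thumbnails/${key}`,
      Body: resized,
    }));
  }
};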
Low-Traffic APIs and Microservices
If your API handles fewer than 100,000 requests per month, Lambda + API Gateway costs essentially nothing. A container running 24/7 to handle a few requests per hour is pure waste.
Scheduled Tasks and Cron Jobs
EventBridge triggers a Lambda function on a schedule. No need to keep a container running just to execute a job every 6 hours.
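A minimal sketch of what such a scheduled function can look like. Scheduled rules can occasionally fire more than once for the same slot, so this version derives a key for the 6-hour window and skips duplicates. The `ScheduledEvent` shape below only declares the fields used, and the in-memory `completedRuns` set is a hypothetical stand-in for a durable store.

```typescript
// Minimal shape of an EventBridge scheduled event (only the fields used here).
interface ScheduledEvent {
  id: string;
  time: string; // ISO 8601 timestamp of the scheduled firing
}

const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

// Derive a stable key for the window the event falls in, so a duplicate
// delivery for the same window can be detected and skipped.
export function runKey(event: ScheduledEvent, windowMs: number = SIX_HOURS_MS): string {
  const firedAt = Date.parse(event.time);
  if (Number.isNaN(firedAt)) {
    throw new Error(`unparseable event time: ${event.time}`);
  }
  const windowStart = Math.floor(firedAt / windowMs) * windowMs;
  return new Date(windowStart).toISOString();
}

// Hypothetical record of completed runs. Lambda environments are recycled,
// so a real function would keep this in a durable store instead.
const completedRuns = new Set<string>();

export const handler = async (event: ScheduledEvent): Promise<string> => {
  const key = runKey(event);
  if (completedRuns.has(key)) return `skipped ${key}`;
  // ... do the scheduled work here ...
  completedRuns.add(key);
  return `ran ${key}`;
};
```

The window key is the part worth keeping: whatever store you use, two deliveries a few seconds apart map to the same key.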
Where Containers Win
Long-Running Processes
WebSocket servers, streaming applications, background workers that process continuously -- anything that runs for more than 15 minutes can't use Lambda. Containers have no execution time limit.
High-Throughput APIs
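At scale, containers are cheaper per request than Lambda. A Fargate task sized like a c6g.large (2 vCPU, 4 GB) running 24/7 costs about $60/month and handles thousands of requests per second. Lambda bills every one of those requests individually, so at 10 million requests a day the pricing comparison later in this guide comes out at roughly $450 a month for Lambda plus API Gateway against roughly $120 for Fargate.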
Complex Applications
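Applications with heavy dependencies, large binaries, or complex startup sequences work better in containers. A 500 MB dependency bundle does not fit in a Lambda zip package at all (the limit is 250 MB unzipped), and shipping it as a Lambda container image means slower cold starts. A long-running container pays its startup cost once per task rather than on every cold invocation.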
Local Development Fidelity
Docker Compose gives you near-perfect parity between local and production. Lambda emulators (SAM local, serverless-offline) are approximations that miss edge cases in IAM, event source mapping, and cold start behavior.
For reference: serverless computing is an execution model where the cloud provider manages server provisioning, scaling, and maintenance. You deploy functions or containers; the provider runs them on demand, scales to zero when idle, and charges only for actual execution time. Containers package an application and its dependencies into an isolated, portable image that runs on infrastructure you scale and size (or on serverless-container platforms like Fargate and Cloud Run). The major platforms:
- AWS Lambda -- functions triggered by events (API Gateway, S3, SQS)
- Google Cloud Run -- containers that scale to zero
- Azure Container Apps -- similar to Cloud Run, with Dapr
- AWS Fargate -- containers without managing EC2
The Middle Ground: Serverless Containers
Fargate, Cloud Run, and Azure Container Apps blur the line. You get container packaging with serverless scaling:
| Service | Provider | Scale to Zero | Pricing |
|---|---|---|---|
| AWS Fargate | AWS | No (minimum 1 task) | Per vCPU-second + per GB-second |
| Google Cloud Run | GCP | Yes | Per request + per vCPU-second + per GB-second |
| Azure Container Apps | Azure | Yes | Per vCPU-second + per GB-second |
Cloud Run is the closest thing to "Lambda for containers." It scales to zero, supports any language or binary, handles HTTP and gRPC, and lets a request run for up to 60 minutes instead of Lambda's 15 (Cloud Run jobs can run longer still). If I were starting a new project on GCP, Cloud Run would be my default compute.
Pro tip: Fargate Spot gives you the cost benefits of spot instances for containers without managing EC2. It's up to 70% cheaper than regular Fargate and works well for fault-tolerant workloads like batch processing and CI runners.
Pricing Comparison: Real Workloads
Low-Traffic API (10,000 requests/day)
| Option | Monthly Cost |
|---|---|
| Lambda + API Gateway | ~$3 |
| Fargate (1 task, 0.25 vCPU) | ~$9 |
| ECS on EC2 (t3.micro) | ~$8 |
High-Traffic API (10 million requests/day)
| Option | Monthly Cost |
|---|---|
| Lambda + API Gateway | ~$450 |
| Fargate (4 tasks, 1 vCPU each) | ~$120 |
| ECS on EC2 (c6g.xlarge x2) | ~$100 |
At low traffic, serverless is cheaper because you pay nothing when idle. At high sustained traffic, containers win because you're paying for capacity rather than per-invocation.
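To find where your own workload tips, a rough cost model is enough. The sketch below is not how the tables above were produced: the unit prices are my assumptions (approximate us-east-1 x86 list prices, API Gateway HTTP API tier, no free tier, no ALB or data transfer), and the 100 ms duration and 512 MB memory are illustrative inputs. Check the current pricing pages before relying on any number it prints.

```typescript
// ASSUMED list prices in USD (us-east-1, x86, free tier ignored). Verify before use.
const LAMBDA_PER_MILLION_REQUESTS = 0.2;
const LAMBDA_PER_GB_SECOND = 0.0000166667;
const API_GATEWAY_PER_MILLION_REQUESTS = 1.0; // HTTP API tier
const FARGATE_PER_VCPU_HOUR = 0.04048;
const FARGATE_PER_GB_HOUR = 0.004445;
const DAYS_PER_MONTH = 30;
const HOURS_PER_MONTH = 730;

// Lambda + API Gateway: every request is billed, plus GB-seconds of compute.
export function lambdaMonthlyCost(requestsPerDay: number, avgDurationMs: number, memoryGb: number): number {
  const requests = requestsPerDay * DAYS_PER_MONTH;
  const gbSeconds = requests * (avgDurationMs / 1000) * memoryGb;
  const perRequestFees =
    (requests / 1_000_000) * (LAMBDA_PER_MILLION_REQUESTS + API_GATEWAY_PER_MILLION_REQUESTS);
  return perRequestFees + gbSeconds * LAMBDA_PER_GB_SECOND;
}

// Fargate: capacity is billed around the clock, regardless of request count.
export function fargateMonthlyCost(tasks: number, vcpuPerTask: number, memoryGbPerTask: number): number {
  const hourly = vcpuPerTask * FARGATE_PER_VCPU_HOUR + memoryGbPerTask * FARGATE_PER_GB_HOUR;
  return tasks * hourly * HOURS_PER_MONTH;
}

for (const requestsPerDay of [10_000, 10_000_000]) {
  const lambda = lambdaMonthlyCost(requestsPerDay, 100, 0.5);
  const fargate =
    requestsPerDay < 1_000_000 ? fargateMonthlyCost(1, 0.25, 0.5) : fargateMonthlyCost(4, 1, 2);
  console.log(`${requestsPerDay} req/day: Lambda ~$${lambda.toFixed(2)}, Fargate ~$${fargate.toFixed(2)}`);
}
```

With these assumed inputs Lambda is the cheaper option at 10,000 requests a day and the more expensive one at 10 million, which is the same crossover the tables show. Plug in your own duration and memory; those two numbers move the break-even point more than anything else.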
A Practical Decision Framework
- Event-driven, sporadic, or low-traffic? Use Lambda.
- Sustained high traffic? Use containers (Fargate or ECS/EKS on EC2).
- Runs longer than 15 minutes? Use containers.
- Needs scale-to-zero with container flexibility? Use Cloud Run or Azure Container Apps.
- Complex local dev workflow? Use containers with Docker Compose.
- Glue code between AWS services? Use Lambda -- it's purpose-built for this.
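The checklist above can be written down as a small function, which is handy when you want to review a whole inventory of services at once. This is a sketch: the `Workload` fields and the order in which the rules are checked are my assumptions, with the hard 15-minute limit tested first because no other factor can override it.

```typescript
type Compute = 'lambda' | 'containers' | 'serverless-containers';

// Hypothetical description of a workload; adjust the fields to your own inventory.
interface Workload {
  maxRuntimeMinutes: number;
  sustainedHighTraffic: boolean;
  needsScaleToZero: boolean;
  needsContainerPackaging: boolean; // heavy dependencies, custom binaries, Compose-based dev
}

export function pickCompute(w: Workload): Compute {
  // Hard limit first: Lambda cannot run past 15 minutes.
  if (w.maxRuntimeMinutes > 15) return 'containers';
  // Steady load favours paying for capacity over paying per invocation.
  if (w.sustainedHighTraffic) return 'containers';
  // Container flexibility plus scale-to-zero: Cloud Run or Azure Container Apps.
  if (w.needsContainerPackaging && w.needsScaleToZero) return 'serverless-containers';
  // Complex local dev or packaging needs without scale-to-zero.
  if (w.needsContainerPackaging) return 'containers';
  // Event-driven, sporadic, low-traffic, or glue code.
  return 'lambda';
}

const webhookReceiver: Workload = {
  maxRuntimeMinutes: 1,
  sustainedHighTraffic: false,
  needsScaleToZero: true,
  needsContainerPackaging: false,
};
console.log(pickCompute(webhookReceiver));
```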
Failure Modes: What Breaks When You Pick the Wrong Model
Both models have characteristic ways they fail in production. Knowing them helps you pick right the first time.
Lambda: The 15-Minute Timeout Finds You Eventually
A nightly batch job starts at 45 seconds of runtime, grows with data, and three months later silently starts failing at the 15-minute mark. CloudWatch logs show Task timed out. Because the failure is invisible to upstream systems (the SQS queue is configured to retry), the batch silently drops records for two weeks before anyone notices. Long-running work belongs in Step Functions, Fargate, or Batch -- never in a single Lambda.
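If a job has to stay on Lambda for now, at least make the timeout loud instead of silent. The Node.js runtime's context object exposes `getRemainingTimeInMillis()`, and a loop can use it to stop early and report what is left. A minimal sketch, where `processWithinBudget`, the 30-second safety margin, and the `remaining` hand-off are illustrative choices of mine, not a prescribed pattern:

```typescript
// Minimal slice of the Lambda context object used here.
interface TimeBudget {
  getRemainingTimeInMillis(): number;
}

interface BatchResult<T> {
  processed: number;
  remaining: T[]; // items to hand off: re-queue them, or pass to the next run
}

// Process items in order, stopping while there is still time to exit cleanly.
export async function processWithinBudget<T>(
  items: T[],
  work: (item: T) => Promise<void>,
  context: TimeBudget,
  safetyMarginMs: number = 30_000,
): Promise<BatchResult<T>> {
  let processed = 0;
  for (const item of items) {
    // Stop before the runtime kills the invocation mid-item.
    if (context.getRemainingTimeInMillis() < safetyMarginMs) break;
    await work(item);
    processed++;
  }
  return { processed, remaining: items.slice(processed) };
}
```

A non-empty `remaining` is the signal to alert on: it shows up weeks before the job actually hits the wall, which is when you move it to Step Functions, Fargate, or Batch.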
Lambda: Cold Starts in a VPC Eat Your P99
Lambdas not in a VPC get 100-200 ms cold starts. Put the same Lambda in a VPC (to reach a private RDS, for example) and cold starts jump to 400-800 ms. Before 2019 it was far worse (up to 10 seconds), because every new execution environment created and attached its own ENI; AWS now creates shared ENIs when the function is configured rather than at invoke time. Even with ENI sharing, a VPC-attached Lambda that handles user-facing traffic needs Provisioned Concurrency or a cache layer in front.
Containers: Scaling Event Storms When a Deploy Fails
An ECS service with min=2, max=50, desired=2 suddenly spikes to 50 tasks because a bad deploy caused every task to fail health checks, and the deploy controller kept launching replacements. The bill for the hour is $200 and the root cause is the health check, not traffic. Set deploymentConfiguration.minimumHealthyPercent: 100 and maximumPercent: 150 so a failing deploy cannot run away.
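To see what those two percentages allow in practice, it helps to turn them into task counts. ECS rounds the lower bound up and the upper bound down to whole tasks. The helper below is a sketch with a hypothetical name, `deploymentTaskBounds`; only the two field names come from the ECS deployment configuration.

```typescript
interface DeploymentConfiguration {
  minimumHealthyPercent: number;
  maximumPercent: number;
}

// Task-count window ECS may use while a deployment is in progress.
export function deploymentTaskBounds(
  desiredCount: number,
  cfg: DeploymentConfiguration,
): { minRunning: number; maxRunning: number } {
  return {
    // Lower bound is rounded up to a whole task.
    minRunning: Math.ceil((desiredCount * cfg.minimumHealthyPercent) / 100),
    // Upper bound is rounded down to a whole task.
    maxRunning: Math.floor((desiredCount * cfg.maximumPercent) / 100),
  };
}

const recommended = deploymentTaskBounds(2, { minimumHealthyPercent: 100, maximumPercent: 150 });
console.log(recommended);
```

For a desired count of 2, the recommended 100/150 setting keeps a deployment between 2 and 3 running tasks, so a failing rollout can add at most one extra task at a time.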
Containers: Image Pull on Cold Fargate Tasks
Fargate fetches the image from ECR every time a task starts. A 1.2 GB image adds 15-40 seconds to cold-start latency. Multi-stage builds (dropping the image to 150 MB) cut that to 3-5 seconds. If your Fargate scale-out latency is hurting, check image size before blaming the platform.
Serverless Containers (Cloud Run): Concurrency Misconfiguration
Cloud Run's default container concurrency is 80 -- a single instance can handle 80 concurrent requests. If your container is not actually thread-safe (a Python app with global state, say), 80 concurrent requests corrupt each other. Set concurrency to 1 if your app is not concurrency-safe, and accept the higher cost.
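How much higher is that cost? A back-of-the-envelope estimate: requests in flight are roughly request rate times average latency, and each instance absorbs up to its concurrency setting. The sketch below uses that approximation only. It ignores autoscaler headroom and startup time, and `instancesNeeded` is a hypothetical helper, not a Cloud Run API.

```typescript
// Rough instance count: in-flight requests divided by per-instance concurrency.
export function instancesNeeded(rps: number, avgLatencyMs: number, concurrencyPerInstance: number): number {
  if (concurrencyPerInstance < 1) {
    throw new Error('concurrency must be at least 1');
  }
  const inFlight = (rps * avgLatencyMs) / 1000;
  return Math.ceil(inFlight / concurrencyPerInstance);
}

// 400 requests per second at 200 ms average latency is about 80 requests in flight.
console.log(instancesNeeded(400, 200, 80)); // default concurrency
console.log(instancesNeeded(400, 200, 1)); // one request per instance
```

At that load the default setting needs about one instance and concurrency 1 needs about 80, so fixing the shared state is usually cheaper than paying for the workaround indefinitely.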
A Hybrid Reference Architecture
The production systems I have built that survived three years of growth without a rewrite use both models. This is the layout that works.
- Core API (Fargate or EKS): authentication, business logic, anything touching the primary database. Sustained traffic, sub-second SLO, container packaging gives you local-dev parity and predictable cost.
- Async processing (Lambda + SQS): email sends, webhook fan-out, PDF generation, thumbnail creation. Spiky, bursty, parallel. Lambda's scale-to-zero is the correct economic model.
- Scheduled work (Lambda + EventBridge): nightly jobs under 15 minutes, polling tasks, cache warmers. Zero idle cost, zero infrastructure to forget about.
- Long-running jobs (Step Functions + Fargate tasks): any workflow longer than 15 minutes, any job that needs more than 10 GB of RAM, any pipeline with complex retry logic. Step Functions as the orchestrator, Fargate as the runner.
- ML inference (SageMaker endpoints or Cloud Run): GPUs on Lambda are limited and expensive; dedicated inference endpoints or serverless-container platforms win here.
Pro tip: resist the temptation to pick a single model for the whole company. The productivity tax of learning two deployment patterns is real but small. The tax of forcing the wrong model on a misfit workload compounds for the life of the system.
What Changed in 2026: Platform Updates Worth Knowing
If your decision framework was set in 2023, four platform shifts are worth re-checking before your next compute pick:
- Lambda SnapStart for Python and Node.js (GA Q1 2026). Java got it in 2022; the runtime list now covers the languages most APIs actually use. Cold starts on a typical Express handler drop from ~280 ms to ~45 ms with no code changes — flip a switch, redeploy. Provisioned Concurrency stops being the only escape hatch.
- Cloud Run gen2 runtime. Google replaced the gVisor sandbox with a faster container runtime. Small Node.js cold starts went from ~1.2 s to ~400 ms; large Java apps that used to be unusable on Cloud Run are now serviceable. The min-instance pricing dropped 25% as part of the same change.
- Fargate Spot at 70% off, multi-AZ. Used to be region-flaky; capacity availability is now solid in most major regions. Batch and CI workloads that were borderline on regular Fargate pricing tip clearly toward containers when Spot is on the table.
- Azure Container Apps Dynamic Sessions (GA 2026). Per-request sandboxed code execution, useful for AI agent workloads running untrusted generated code. Lambda has nothing equivalent yet.
The strategic implication: 2026 serverless is meaningfully cheaper-to-run than 2024 serverless, and 2026 containers got more competitive on cold starts via Cloud Run gen2. Re-run the decision tree if your last evaluation predates these.
Frequently Asked Questions
What's the difference between serverless and containers?
Containers package your app and dependencies into an image that runs on infrastructure you control (ECS, EKS, Kubernetes, VMs). Serverless (AWS Lambda, Cloud Functions, Azure Functions) abstracts the runtime entirely -- you write a function and the provider scales, patches, and bills per-invocation. Containers give control and cost predictability at high volume. Serverless wins on operational simplicity and scale-to-zero pricing for spiky workloads.
What are serverless containers like Fargate and Cloud Run?
Serverless containers combine both models: you package your app as a Docker image but the platform handles capacity, scaling, and patching. AWS Fargate (with ECS or EKS), Google Cloud Run, and Azure Container Apps are the main options. You pay per vCPU-second and memory-GB-second. Cold starts exist and are typically longer than Lambda's (1-5 seconds for Cloud Run vs 100-500ms for Lambda). Good middle ground for teams that want container portability with serverless ops.
What is a cold start and how bad is it really?
A cold start happens when Lambda spins up a new execution environment for your function. For Node.js and Python, cold starts are typically 100-500ms. For Java and .NET, they can reach 2-10 seconds. Provisioned Concurrency eliminates cold starts by keeping environments warm, but it costs money. For most APIs, occasional 200ms cold starts are imperceptible to users.
Can I run a database inside a container?
You can, but you probably shouldn't in production. Containers are ephemeral by design -- when a container restarts, local data is lost unless you mount persistent storage. Managed database services (RDS, Cloud SQL, Azure Database) handle replication, backups, and failover far better than a self-managed database in a container.
Is vendor lock-in with Lambda a real concern?
It depends on your architecture. The Lambda function code itself is portable -- it's just a function. The lock-in is in the event sources (S3 triggers, SQS, DynamoDB streams), IAM permissions, and service integrations. If your business logic is cleanly separated from the Lambda handler, migrating to another platform is a wrapper change, not a rewrite.
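Here is roughly what "cleanly separated" means in code. The sketch keeps the business logic in a plain function and wraps it in two thin adapters. Everything in it is illustrative: `quoteOrder`, the price constant, and the adapter names are hypothetical, and the event types declare only the `body` and `statusCode` fields of an API Gateway proxy-style exchange.

```typescript
// Portable core: plain TypeScript, no AWS types or SDK calls.
export interface Quote {
  sku: string;
  quantity: number;
  totalCents: number;
}

const UNIT_PRICE_CENTS = 1250; // hypothetical price, for illustration only

export function quoteOrder(rawBody: string | null): { status: number; payload: Quote | { error: string } } {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody ?? '');
  } catch {
    return { status: 400, payload: { error: 'body must be JSON' } };
  }
  const { sku, quantity } = (parsed ?? {}) as { sku?: unknown; quantity?: unknown };
  if (typeof sku !== 'string' || typeof quantity !== 'number' || !Number.isInteger(quantity) || quantity < 1) {
    return { status: 400, payload: { error: 'sku and a positive integer quantity are required' } };
  }
  return { status: 200, payload: { sku, quantity, totalCents: quantity * UNIT_PRICE_CENTS } };
}

// Lambda adapter: proxy-style event in, proxy-style result out.
interface ProxyEvent { body: string | null }
interface ProxyResult { statusCode: number; body: string }

export const lambdaHandler = async (event: ProxyEvent): Promise<ProxyResult> => {
  const { status, payload } = quoteOrder(event.body);
  return { statusCode: status, body: JSON.stringify(payload) };
};

// Container adapter: the same core behind whatever HTTP framework you run.
export function httpHandler(rawBody: string): { status: number; body: string } {
  const { status, payload } = quoteOrder(rawBody);
  return { status, body: JSON.stringify(payload) };
}
```

Moving this service from Lambda to a container means deleting `lambdaHandler` and wiring `httpHandler` into a server. The event sources and IAM policies still have to be rebuilt, which is where the real lock-in lives.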
Should I use ECS or EKS for containers on AWS?
Use ECS if you're AWS-only and want simplicity. ECS is fully managed, integrates deeply with AWS services, and requires no cluster management overhead. Use EKS if you need Kubernetes compatibility (multi-cloud portability, Kubernetes ecosystem tooling) or your team already knows Kubernetes. EKS adds operational complexity that isn't justified unless you specifically need Kubernetes features.
Can I combine Lambda and containers in the same application?
Absolutely, and most mature applications do. A common pattern: containers handle the core API (long-running, high-throughput), Lambda processes asynchronous events (S3 uploads, SQS messages, scheduled tasks), and Step Functions orchestrate complex workflows that span both. Use the right tool for each job rather than forcing one model everywhere.
What about Lambda container images?
Lambda supports container images up to 10 GB, letting you use the same Docker build process for both Lambda and container deployments. This is useful for functions with large dependencies (ML models, heavy libraries). Cold starts are longer with large images, but it eliminates the 250 MB zip deployment limit and lets you use familiar Docker tooling.
Use Both, Thoughtfully
The best architectures combine serverless and containers based on each workload's characteristics. Don't adopt one model exclusively. Use Lambda for event-driven and sporadic workloads where scale-to-zero saves money. Use containers for high-throughput services, long-running processes, and complex applications where local development fidelity matters. The compute model should follow the workload, not the other way around.
Written by
Abhishek Patel
Infrastructure engineer with 10+ years building production systems on AWS, GCP, and bare metal. Writes practical guides on cloud architecture, containers, networking, and Linux for developers who want to understand how things actually work under the hood.