Matt S
Platform engineer at Fortem · 10 min read

AWS Fargate vs Lambda: When Does Lambda Stop Being Cheaper?

Lambda is not categorically cheaper than Fargate, and Fargate is not categorically cheaper than Lambda. There is a crossover point, and it is set mostly by how long each invocation runs — not by how much traffic you get. Most comparisons stop at a feature table. This one gives you the breakeven in dollars, the hidden costs that move it, and what the new Lambda MicroVMs (launched June 22, 2026) change — and what they don't.

TL;DR
  • Lambda wins on short, spiky, event-driven work. Fargate wins on long-running, steady services. The line is duration × frequency, not raw traffic.
  • Real breakeven: a 200ms API crosses ~6–8M invocations/mo; a 2s background job crosses ~1M/mo; at 5s+ duration Lambda almost never wins at scale.
  • Per-request charges are only ~20–40% of a serverless bill. API Gateway ($3.50/M), CloudWatch Logs ($50–150/mo), NAT, and provisioned concurrency are the rest.
  • June 22, 2026: Lambda MicroVMs lifted the runtime limit from 15 min to 8 hours (16 vCPU / 32 GB, Firecracker), but they target isolated sandboxes for AI and untrusted code, not always-on web services.
Quick answer

Lambda is cheaper for short, spiky workloads; Fargate is cheaper for long-running, steady ones. The crossover is set by execution duration: a 200ms API endpoint stays cheaper on Lambda up to roughly 6–8M invocations/month (including API Gateway and CloudWatch), while a 2s background job crosses to Fargate at about 1M/month. At 5s+ average duration, Lambda almost never wins at scale. Fargate compute runs $0.04048/vCPU-hr + $0.004445/GB-hr; Fargate Spot is ~68% cheaper. A monthly Lambda bill above ~$1,000 is a strong signal that at least one workload belongs on Fargate.

Ready to use — copy this today

The two cost formulas, side by side. Swap your own numbers in and you get the monthly figure for each service — then compare. Lambda bills GB-seconds plus a per-request fee; Fargate bills allocated vCPU and memory for the hours the task runs.

```text
# ---- Lambda monthly cost ----
#   memory_gb     = function memory / 1024     e.g. 0.5
#   duration_s    = avg execution seconds      e.g. 0.2
#   invocations   = requests per month         e.g. 5_000_000
#
gb_seconds   = memory_gb * duration_s * invocations
compute      = gb_seconds * 0.00001667         # $/GB-second
requests     = invocations * 0.00000020        # $0.20 per 1M requests
api_gateway  = invocations * 0.0000035         # $3.50 per 1M (if used)
lambda_total = compute + requests + api_gateway

# ---- Fargate monthly cost (one always-on task) ----
#   vcpu = 0.5   mem_gb = 1   hours = 730 (24/7) or ~217 (business hrs)
fargate_total = (vcpu * 0.04048 + mem_gb * 0.004445) * hours
# Fargate Spot: multiply the compute rates by ~0.319 (≈68% off)

# ---- Worked example: 0.5 GB, 0.2s, 5M invocations ----
# Lambda  : 500,000 GB-s -> $8.34 compute + $1.00 requests + $17.50 API GW = ~$26.84/mo
# Fargate : (0.5*0.04048 + 1*0.004445) * 730                              = ~$18.02/mo  (24/7)
# At 5M invocations of a 200ms API, one always-on Fargate task already wins.
```

Rates: Lambda $0.00001667/GB-s + $0.20/1M requests; API Gateway $3.50/1M; Fargate $0.04048/vCPU-hr + $0.004445/GB-hr (Linux/x86, us-east-1, verified June 2026). Real systems need more than one Fargate task for redundancy — adjust hours and task count for your setup.
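For anyone who would rather run the numbers than edit a template, here is a minimal Python sketch of the same two formulas. The function and constant names are made up for this example; the rates are the ones quoted above and should be re-checked against current AWS pricing before you rely on them.

```python
# Rates as quoted in this article (Linux/x86, us-east-1). Verify before use.
LAMBDA_GB_SECOND = 0.00001667            # $ per GB-second
LAMBDA_REQUEST = 0.20 / 1_000_000        # $ per request
API_GATEWAY_REQUEST = 3.50 / 1_000_000   # $ per request
FARGATE_VCPU_HOUR = 0.04048              # $ per vCPU-hour
FARGATE_GB_HOUR = 0.004445               # $ per GB-hour


def lambda_monthly(memory_gb, duration_s, invocations, api_gateway=True):
    """Monthly Lambda cost: GB-seconds plus request fee, plus API Gateway if used."""
    compute = memory_gb * duration_s * invocations * LAMBDA_GB_SECOND
    requests = invocations * LAMBDA_REQUEST
    gateway = invocations * API_GATEWAY_REQUEST if api_gateway else 0.0
    return compute + requests + gateway


def fargate_monthly(vcpu, mem_gb, hours=730, tasks=1):
    """Monthly Fargate compute cost for `tasks` identical tasks running `hours` each."""
    return (vcpu * FARGATE_VCPU_HOUR + mem_gb * FARGATE_GB_HOUR) * hours * tasks


# The worked example: 0.5 GB, 0.2 s, 5M invocations vs one 0.5 vCPU / 1 GB task.
print(f"Lambda : ${lambda_monthly(0.5, 0.2, 5_000_000):.2f}/mo")
print(f"Fargate: ${fargate_monthly(0.5, 1):.2f}/mo")
```

Pass `tasks=2` for a redundant pair, or `hours=217` for a business-hours schedule.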

Where Lambda wins, where Fargate wins

Lambda bills per millisecond of execution and fits spiky, event-driven work; Fargate bills for allocated vCPU and memory and wins on long-running, steady services. The split is duration, not app type.

The framing “serverless is cheaper” hides the mechanism underneath. Lambda charges for the time your code runs, rounded to the millisecond, times the memory you assigned. When code runs rarely and briefly, you pay almost nothing between invocations. When it runs constantly, you pay for each of those milliseconds — and there are 2.6 billion of them in a month.

Fargate is the inverse. You pay for a task's vCPU and memory for as long as it exists, whether it serves one request or ten thousand per second. Idle time is wasted money; saturated time is a bargain.
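One way to see the inversion is to look at the effective price per million requests. The sketch below is illustrative only: it assumes a single 0.5 vCPU / 1 GB task can actually serve each volume shown, uses the rates quoted in this article, and the function names are hypothetical.

```python
def lambda_cost_per_million(memory_gb, duration_s):
    """Lambda compute plus request fee per 1M invocations; flat at any volume."""
    return (memory_gb * duration_s * 0.00001667 + 0.20 / 1_000_000) * 1_000_000


def fargate_cost_per_million(monthly_task_cost, requests_per_month):
    """A fixed monthly task price spread over however many requests it served."""
    return monthly_task_cost / requests_per_month * 1_000_000


# One always-on 0.5 vCPU / 1 GB task, 730 hours.
task_cost = (0.5 * 0.04048 + 1 * 0.004445) * 730

for volume in (100_000, 1_000_000, 10_000_000, 50_000_000):
    print(f"{volume:>11,} req/mo  "
          f"Lambda ${lambda_cost_per_million(0.5, 0.2):7.2f}/M  "
          f"Fargate ${fargate_cost_per_million(task_cost, volume):7.2f}/M")
```

The Lambda column never moves; the Fargate column falls as the task gets busier. That falling line is the whole argument for steady workloads.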

Reach for Lambda
  • S3-triggered file processing
  • Webhook and HTTP handlers with bursty traffic
  • Scheduled (cron) jobs that run briefly
  • Queue and stream consumers with variable load
  • Anything that needs to scale from zero instantly
Reach for Fargate
  • Long-lived microservices and APIs
  • Services holding connections (WebSocket, gRPC)
  • Batch and ETL jobs past the 15-minute mark
  • Steady traffic where utilization stays high
  • Workloads needing precise CPU/memory control
Key insight

A “serverless” service that runs 24/7 under steady load is paying Lambda's premium for elasticity it never uses. Elasticity is only free when your traffic is spiky. If your invocation graph is a flat line, you are buying the wrong abstraction.

The real cost breakeven

Breakeven is set by function duration: a 200ms API crosses around 6–8M invocations a month, a 2s background job around 1M. The longer the function runs, the sooner Fargate wins.

Invocation count is the number most teams reach for, but it is the wrong axis on its own. Duration multiplies it. At the same memory setting and invocation count, a 200ms function and a 2s function differ by 10× in GB-seconds, and GB-seconds are what Lambda bills. That is why the breakeven for a longer function lands at a fraction of the invocations.

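| Workload | Lambda config | Breakeven (compute only) | Breakeven (+ API GW & logs) |
| --- | --- | --- | --- |
| API endpoint | 512 MB · 200 ms | ~10M / mo | ~6–8M / mo |
| Background processor | 1024 MB · 2 s | ~1.5M / mo | ~1M / mo |
| Data pipeline | 2048 MB · 500 ms | ~5M / mo | ~4M / mo |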
Breakeven = invocations/month above which Fargate is cheaper. Based on a 2026 third-party analysis of production workloads (LeanOps). The longer the function runs, the lower the breakeven.
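The breakeven itself is one division: the fixed monthly Fargate cost over Lambda's cost per invocation. Here is a rough sketch using the rates quoted in this article. The function name is hypothetical, and because the third-party table does not state the Fargate sizing behind each row, the output will only match the table where the assumed Fargate cost happens to match.

```python
def breakeven_invocations(memory_gb, duration_s, fargate_monthly_cost,
                          extra_per_request=0.0):
    """Invocations/month above which the fixed Fargate cost is the cheaper option."""
    per_invocation = (memory_gb * duration_s * 0.00001667   # GB-seconds
                      + 0.20 / 1_000_000                    # request fee
                      + extra_per_request)                  # e.g. API Gateway
    return fargate_monthly_cost / per_invocation


# Assumption: one always-on 0.5 vCPU / 1 GB task as the Fargate side.
task = (0.5 * 0.04048 + 1 * 0.004445) * 730

short = breakeven_invocations(0.5, 0.2, task)   # 200 ms function
long_ = breakeven_invocations(0.5, 2.0, task)   # 2 s function, same memory

print(f"200 ms: {short / 1e6:.1f}M invocations/mo")
print(f"2 s   : {long_ / 1e6:.1f}M invocations/mo")
```

Holding memory constant and stretching duration tenfold pulls the breakeven down by roughly an order of magnitude, which is the effect the table shows.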

Put a concrete profile through it. At 5 million invocations of a 200ms function with 512 MB, Lambda's compute plus an API Gateway in front already exceeds the cost of a single always-on Fargate task doing the same work. The Fargate bar is stacked: its own compute, plus the slice of the shared NAT Gateway it actually uses.

Figure: Monthly cost, 200ms API at 5M invocations/mo (Fargate −19%)
  • Lambda + API Gateway (5M × 200ms × 0.5 GB): $27/mo
  • Fargate, 1 always-on task (0.5 vCPU + 1 GB · 24/7): $22/mo
  • Fargate Spot (same task · ~68% off): $10/mo
Legend: Fargate compute, shared NAT share.
The NAT Gateway is a per-VPC resource that almost every AWS account already runs and shares across all tasks and environments in the VPC — so a single service carries only its slice (~$4/mo here), not the full ~$66/mo. Loading the whole NAT onto one Fargate task would overstate its cost. Lambda outside a VPC needs no NAT at all; Lambda inside a VPC shares the same gateway, so this overhead is roughly a wash between the two. Fargate Spot ($0.01291/vCPU-hr + $0.001417/GB-hr) is for fault-tolerant, stateless workloads, not strict-uptime prod APIs — shown for the cost picture, not as a drop-in here.

That comparison uses one Fargate task for clarity. Production needs at least two for redundancy, plus the rest of the fixed overhead an ECS environment carries: the ALB and CloudWatch alongside that NAT share. Fold it in and the crossover shifts, but the direction holds: the longer and busier the workload, the more Fargate pulls ahead.
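To see how far the fixed overhead moves the crossover, here is a rough sketch. It uses the ~$22/month ALB and ~$4/month NAT share figures from this article as flat defaults, leaves CloudWatch out entirely, and assumes two identical tasks; all function names are hypothetical.

```python
def lambda_with_gateway(memory_gb, duration_s, invocations):
    """Lambda compute + request fee + API Gateway, per month."""
    per_invocation = (memory_gb * duration_s * 0.00001667
                      + 0.20 / 1_000_000
                      + 3.50 / 1_000_000)
    return per_invocation * invocations


def fargate_environment(tasks, vcpu, mem_gb, alb=22.0, nat_share=4.0, hours=730):
    """Fargate compute for `tasks` tasks plus flat ALB and NAT-share estimates."""
    compute = (vcpu * 0.04048 + mem_gb * 0.004445) * hours * tasks
    return compute + alb + nat_share


single_task = fargate_environment(1, 0.5, 1, alb=0.0, nat_share=0.0)
production = fargate_environment(2, 0.5, 1)   # redundant pair + ALB + NAT share

for volume in (5_000_000, 15_000_000):
    lam = lambda_with_gateway(0.5, 0.2, volume)
    cheaper = "Lambda" if lam < production else "Fargate"
    print(f"{volume:>11,}/mo  Lambda ${lam:6.2f}  "
          f"Fargate env ${production:6.2f}  -> {cheaper}")
```

Under these assumptions the single-task comparison flips at 5M invocations once the redundant pair and the ALB are counted, and flips back as volume climbs. Your own overhead numbers decide where.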

These thresholds come from a 2026 third-party analysis of production workloads, not from AWS. Treat them as a starting estimate and confirm with your own numbers using the formulas above — your memory size, duration, and whether you front Lambda with API Gateway all move the line.

The hidden costs that move the line

Per-request charges are only 20–40% of a serverless bill. API Gateway ($3.50/M, often more than the Lambda itself), CloudWatch Logs ($50–150/mo at 1M+/day), NAT, and provisioned concurrency make up the rest.

The Lambda line item on your bill is the part most teams model. The rest hides in adjacent services that the function can't run without. A fair comparison has to count them, because Fargate either avoids them or pays them differently.

API Gateway — $3.50 per million requests

Most HTTP Lambdas sit behind API Gateway. At high request volume, that per-request fee routinely exceeds the Lambda compute cost itself. A Fargate service behind an Application Load Balancer pays roughly $22/month instead: the ALB's hourly charge is fixed, and its usage-based LCU charge grows slowly with traffic rather than per request.
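A quick sketch of that trade-off, treating the ALB as a flat ~$22/month (a simplification of how ALBs are billed) and using the API Gateway and Lambda rates quoted in this article. The function names are hypothetical.

```python
API_GATEWAY_PER_REQUEST = 3.50 / 1_000_000   # $ per request
ALB_MONTHLY = 22.0                           # article's estimate, treated as flat


def gateway_vs_alb_crossover():
    """Requests/month above which API Gateway fees exceed the flat ALB estimate."""
    return ALB_MONTHLY / API_GATEWAY_PER_REQUEST


def gateway_share_of_bill(memory_gb, duration_s):
    """Fraction of the per-request Lambda + API Gateway cost that is API Gateway."""
    lambda_part = memory_gb * duration_s * 0.00001667 + 0.20 / 1_000_000
    return API_GATEWAY_PER_REQUEST / (API_GATEWAY_PER_REQUEST + lambda_part)


print(f"API Gateway passes the ALB estimate at "
      f"{gateway_vs_alb_crossover() / 1e6:.1f}M requests/mo")
print(f"API Gateway share for a 512 MB / 200 ms function: "
      f"{gateway_share_of_bill(0.5, 0.2):.0%}")
```

For a short, small function the gateway is the larger part of each request's cost, which is why it moves the breakeven as much as it does.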

CloudWatch Logs — $50–150/mo at 1M+ invocations/day

Every invocation writes at least its START, END, and REPORT lines to CloudWatch Logs, on top of whatever the function itself logs. At a million-plus invocations a day, ingestion alone runs $50–150/month. Both platforms log, but Lambda's per-invocation granularity multiplies the line count fast.

Provisioned concurrency — billed whether or not it runs

The standard fix for cold starts keeps warm instances on standby and charges for them around the clock — the always-on cost model you chose Lambda to avoid.

CPU tied to memory

Lambda scales CPU with the memory setting. A CPU-bound function forces you to over-provision memory you don't need to get more cores. Fargate lets you set vCPU and memory independently.
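To put a number on that over-provisioning, here is a rough sketch. It leans on a figure from the AWS Lambda documentation rather than from this article (about one vCPU at 1,769 MB) and assumes CPU scales linearly with memory below the 10,240 MB ceiling, which is an approximation. The function names are hypothetical.

```python
import math

MB_PER_VCPU = 1769       # from AWS Lambda docs, not this article; verify
LAMBDA_MAX_MB = 10_240   # Lambda memory ceiling
GB_SECOND = 0.00001667   # $ per GB-second, as quoted in this article


def memory_mb_for_vcpus(vcpus):
    """Memory setting needed to get roughly `vcpus` of CPU (linear assumption)."""
    needed = math.ceil(vcpus * MB_PER_VCPU)
    if needed > LAMBDA_MAX_MB:
        raise ValueError("beyond what the linear assumption covers")
    return needed


def overprovision_factor(ram_needed_mb, vcpus_needed):
    """How many times more GB-seconds you pay than the RAM alone would require."""
    configured = max(ram_needed_mb, memory_mb_for_vcpus(vcpus_needed))
    return configured / ram_needed_mb


# A function that needs 512 MB of RAM but about 2 vCPUs of compute.
mb = memory_mb_for_vcpus(2)
print(f"configure {mb} MB, paying {overprovision_factor(512, 2):.1f}x the GB-seconds")
print(f"cost per second of runtime: ${mb / 1024 * GB_SECOND:.8f}")
```

On Fargate the same workload could pair 2 vCPU with a small memory allocation, so the RAM you don't need is not on the bill.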

Key insight

If your monthly Lambda-related bill clears ~$1,000, moving the heaviest function group to Fargate is likely your highest-ROI infrastructure task this quarter. The savings rarely come from the function line alone — they come from dropping the API Gateway and CloudWatch surcharges that ride along with it.

The twist: Lambda MicroVMs move the boundary (June 2026)

On June 22, 2026 AWS shipped Lambda MicroVMs: up to 8 hours, 16 vCPU, 32 GB, Firecracker isolation. It removes the 15-minute limit — but it targets isolated sandboxes for AI and untrusted code, not always-on web services.

For years, the 15-minute timeout was the cleanest reason to leave Lambda: if a job ran longer, you moved it to Fargate or Batch. MicroVMs change that specific fact. Each session runs in its own dedicated MicroVM — Firecracker virtualization, no shared kernel, no shared resources with other sessions — and can hold state across user interactions for up to eight hours.

Lambda MicroVM limits (per MicroVM):
  • Max runtime: 8 hours (vs 15 min standard)
  • Max vCPU: 16
  • Max memory: 32 GB
  • Max disk: 32 GB

“Each session runs in its own dedicated MicroVM with no shared kernel and no shared resources between users, so untrusted code supplied by one user is contained to their execution environment.”

AWS News Blog: Lambda MicroVMs, June 2026

The intended use cases tell you who this is for: AI coding assistants, interactive code environments, data analytics platforms, vulnerability scanners, and game servers that run user-supplied scripts. The common thread is running code you don't trust in a hard isolation boundary, with full lifecycle control over each session.

The part most takes will get wrong: MicroVMs are not Lambda becoming Fargate. A MicroVM is a lifecycle-managed session you launch, use, and tear down — not an always-on listener answering a steady stream of HTTP requests. For a long-lived web service or API, the right tool is still Fargate. What MicroVMs displace is the pattern where teams spun up a Fargate task as a sandbox to run untrusted or AI-generated code — that niche now has a purpose-built home.

If your isolation need is bigger than a single sandbox, such as a full copy of a service with its dependencies, that is still a container problem. It is closer to cloning a full environment than to running a MicroVM session.

On cost, AWS prices MicroVMs across three dimensions: compute (per-second, on baseline and peak usage), snapshot operations and storage, and data transfer. AWS has not published a flat per-second rate, so there is no clean number to drop into the breakeven math yet. MicroVMs are available in US East (N. Virginia and Ohio), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland).

Cold start and latency reality

Lambda cold starts run from tens to hundreds of milliseconds, worse inside a VPC or with large packages. Provisioned concurrency removes them — but then you pay around the clock, which erodes Lambda's main cost edge.

A cold start is the time Lambda needs to spin up a new execution environment when no warm one is available. For a small function it's tens of milliseconds; in a VPC, or with a heavy runtime and large deployment package, it climbs into the hundreds. For a user-facing API that occasionally idles, that tail latency is the complaint that shows up in your dashboards.

Fargate has its own startup cost — 30 to 90 seconds to launch a task — but it pays that once. After that the task stays warm and per-request latency is steady, because there is no per-invocation environment to create.

The catch is what fixing Lambda's cold start does to the bill. Provisioned concurrency keeps instances warm and charges for them whether or not a request arrives. That is the always-on cost model — the one Lambda was supposed to let you skip. Once you're paying for warm capacity full time, you've recreated Fargate's economics without Fargate's resource control.
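A rough sketch of what "paying around the clock" looks like. The provisioned-concurrency rate below does not come from this article; it is an assumed us-east-1 x86 figure that you should replace with the number on the current AWS Lambda pricing page. The sketch also ignores the separate duration and request charges that apply when provisioned instances actually serve traffic.

```python
SECONDS_PER_MONTH = 730 * 3600   # 2,628,000

# ASSUMPTION: illustrative rate in $ per GB-second of configured concurrency.
ASSUMED_PC_RATE = 0.0000041667


def provisioned_concurrency_monthly(memory_gb, instances, rate_per_gb_second,
                                    seconds=SECONDS_PER_MONTH):
    """Standing charge for keeping `instances` warm, with zero requests served."""
    return memory_gb * instances * seconds * rate_per_gb_second


for instances in (1, 10, 50):
    cost = provisioned_concurrency_monthly(1.0, instances, ASSUMED_PC_RATE)
    print(f"{instances:>3} warm x 1 GB: ${cost:8.2f}/mo before any traffic")
```

The point is the shape, not the exact dollars: the charge scales with warm instances and hours, not with requests, which is the same shape as a Fargate bill.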

The decision: a checklist

Pick Lambda for spiky or unpredictable traffic, short tasks under ~1s, and low volume. Pick Fargate for steady load, functions 5s+, fine CPU/memory control, or a Lambda bill above ~$300–1,000/mo.

| Dimension | AWS Lambda | AWS Fargate |
| --- | --- | --- |
| Billing unit | Per ms of execution (GB-seconds) | Per second of allocated vCPU + GB |
| Best traffic shape | Spiky, unpredictable, event-driven | Steady, sustained, always-on |
| Short tasks (<1s) | Wins on cost | Overkill |
| Long tasks (5s+) | Loses fast at scale | Wins on cost |
| Runtime ceiling | 15 min (8 hr via MicroVMs) | Unbounded |
| Resource control | CPU tied to memory | vCPU + memory set independently |
| Cold start | Tens–hundreds of ms | 30–90s once, then steady |
| Scale to zero | Native, instant | Manual (scheduling / desiredCount 0) |

Reduced to a few if/then rules:

LAMBDA: Traffic is spiky or unpredictable, and idle periods are real.
LAMBDA: Each invocation is short (under ~1s) and total volume is modest.
FARGATE: Load is steady, or invocations average 5 seconds or more.
FARGATE: You need independent CPU and memory control, or runtime past 15 minutes for a service (not a sandbox).
FARGATE: Your Lambda-related bill (compute + API Gateway + logs) is past ~$300–1,000/mo for one workload.
MICROVM: You need a hard isolation boundary to run untrusted or AI-generated code for up to 8 hours.
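The same rules can be written as a small function. This is a sketch, not a policy engine: the thresholds are the ones in the checklist above, while the field names, the order in which the rules are checked, and the treatment of "steady" as "not spiky" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    spiky: bool = False                      # real idle periods between bursts
    avg_duration_s: float = 0.2              # average seconds per invocation
    needs_independent_cpu_mem: bool = False
    service_runtime_min: float = 0.0         # longest single run of a service
    untrusted_sandbox: bool = False          # untrusted or AI-generated code
    lambda_bill_usd: float = 0.0             # compute + API Gateway + logs, monthly


def recommend(w: Workload, bill_threshold_usd: float = 300.0) -> str:
    """Apply the checklist; sandbox rule first, then Fargate rules, then Lambda."""
    if w.untrusted_sandbox:
        return "MICROVM"   # hard isolation boundary, sessions up to 8 hours
    if (not w.spiky
            or w.avg_duration_s >= 5
            or w.needs_independent_cpu_mem
            or w.service_runtime_min > 15
            or w.lambda_bill_usd > bill_threshold_usd):
        return "FARGATE"
    if w.avg_duration_s < 1:
        return "LAMBDA"
    return "MEASURE"       # spiky but 1-5 s: run the cost formulas first


print(recommend(Workload(spiky=True, avg_duration_s=0.2)))
print(recommend(Workload(spiky=False, avg_duration_s=0.2)))
print(recommend(Workload(spiky=True, untrusted_sandbox=True)))
```

The "MEASURE" branch is deliberate: the checklist leaves a gap between 1 and 5 seconds, and that gap is where the breakeven math earns its keep.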

Most teams don't pick one and stop. A typical setup keeps spiky glue on Lambda, runs the steady services on Fargate, and now has MicroVMs for the sandbox case. The mistake isn't mixing them — it's leaving a workload on the wrong one after its traffic shape has changed.

If you read this, you might also want to know

What if my workload is partly spiky and partly steady?

Split it. Run the steady core — the part that handles baseline traffic 24/7 — on Fargate, and keep the spiky overflow or event-driven glue on Lambda. The mistake is forcing one runtime onto a workload with two traffic shapes. A Fargate service for the baseline plus Lambda for bursts is usually cheaper than either alone scaled to cover both.

Where does EC2 fit between Lambda and Fargate on cost?

EC2 sits below Fargate on raw compute price at high, steady utilization, because Reserved Instances and Savings Plans cut 30–50% — but you take on AMI patching, scaling, and capacity planning. The order at steady state is roughly EC2 < Fargate < Lambda on cost, and roughly Lambda < Fargate < EC2 on operational burden. Fargate is the middle: more expensive than tuned EC2, far less to operate.

Are MicroVMs cheaper than running a Fargate task as a sandbox?

There's no clean answer yet — AWS prices MicroVMs on per-second compute plus snapshot storage and data transfer, with no published flat rate. The likely advantage is operational, not only dollar: MicroVMs give per-session isolation and lifecycle control out of the box, where a Fargate sandbox makes you build task launch, teardown, and isolation yourself. Compare on total effort, not the compute line alone.
