How to Optimize AWS ECS Costs Beyond Reserved Instances
- Fargate compute is only part of your ECS bill. ALBs, NAT Gateway, and Container Insights add 30–52% on top of compute-only estimates in verified fleet benchmarks.
- The S3 gateway endpoint is free, so add it today. Without one, every container image pull routes through NAT at $0.045/GB.
- ARM64/Graviton Fargate is $0.03238 vs $0.04048/vCPU-hr: a flat 20% reduction on vCPU charges, no architectural change required.
- AWS Compute Optimizer covers Fargate (free since Dec 2022). One CLI command returns right-sizing recommendations for every service in your fleet.
Three commands you can run right now — before changing a single task definition.
aws ec2 create-vpc-endpoint \
  --vpc-id "$VPC_ID" \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids "$RTB_ID" \
  --vpc-endpoint-type Gateway

aws compute-optimizer get-ecs-service-recommendations \
  --query 'ecsServiceRecommendations[*].{
    Service:serviceArn,
    Finding:finding,
    vCPU:serviceRecommendationOptions[0].cpu
  }' \
  --output table

aws ecs list-clusters | jq -r '.clusterArns[]' | \
  xargs -I{} aws ecs describe-clusters \
    --clusters {} \
    --include SETTINGS \
    --query 'clusters[0].{
      Name:clusterName,
      Insights:settings[?name==`containerInsights`].value|[0]
    }' \
    --output table

You've done Spot and scheduling. Here's where the next 30% hides.
Fargate compute is only part of the ECS bill. Teams that stop at Spot and scheduling still pay $0.045/GB in NAT data processing, $0.07/metric/month for Container Insights they forgot was on, and 25% more per vCPU than Graviton would cost (Graviton is 20% cheaper than x86). Verified average: compute-only estimates undercount total spend by 30–52%.
CloudBurn ran the numbers on a real Fargate fleet and found compute-only estimates of $181.77 against an actual bill of $276.27 — a 52% gap driven entirely by ALB base charges, NAT data processing, and CloudWatch metrics. The gap compounds across environments: 10 environments that look like a $1,800/month Fargate bill are actually closer to $2,700.
This article picks up where Spot and scheduling leave off. If you haven't covered those yet, start with how to cut Fargate compute costs with Spot and scheduling — those two moves alone cut 60–70% before touching anything else. Come back here for the second layer.
The five levers below — NAT/VPC endpoints, Graviton, Container Insights, ALB consolidation, and Compute Optimizer — are independent. You can apply any one of them this week without touching the others. Each section includes the dollar math for a 10-service fleet so you can rank them by impact before you start.
NAT Gateway is quietly billing $0.045/GB on every image pull
Every container image pull and AWS API call from a private subnet runs through NAT at $0.045/GB data processing. A 403 MB image pulled 32,000 times (a crash-looping health check) costs ~$567 via NAT, versus $0.35 with an S3 gateway endpoint. The gateway endpoint is free. Add it before anything else.
The crash-loop story is instructive. One team deployed a container with a misconfigured health check: the task started, failed, restarted, and repeated 32,000 times over several days. Each restart pulled the 403 MB image through NAT at $0.045/GB: 403 MB × 32,000 pulls ÷ 1,024 = 12,594 GB × $0.045 = ~$567. With the free S3 gateway endpoint routing ECR layer pulls over the AWS backbone instead of NAT, the same traffic costs $0.35. The endpoint takes 90 seconds to add.
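The crash-loop arithmetic can be reproduced with a small shell helper. This is a minimal sketch: the function name `nat_pull_cost` is hypothetical, and the $0.045/GB default is the NAT data-processing rate quoted in this article, so pass your own region's rate as the third argument if it differs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: NAT data-processing cost of repeated image pulls.
# Args: image size in MB, number of pulls, optional $/GB rate (default 0.045).
nat_pull_cost() {
  LC_ALL=C awk -v mb="$1" -v pulls="$2" -v rate="${3:-0.045}" \
    'BEGIN { printf "%.2f\n", mb * pulls / 1024 * rate }'
}

# The crash-loop case from the text: 403 MB pulled 32,000 times.
nat_pull_cost 403 32000
```

Run it with your own image size and restart count to see what a bad deploy would cost before the gateway endpoint is in place.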
| Option | Hourly | Data processing |
|---|---|---|
| NAT Gateway | $0.045/hr | $0.045/GB |
| Interface endpoint (ECR, CW, etc.) | $0.01/hr/AZ | $0.01/GB |
| Gateway endpoint (S3, DynamoDB) | Free | Free |
The nuance that trips teams up: interface endpoints are not always cheaper than NAT. One interface endpoint costs $0.01/hr/AZ = ~$7.20/month per AZ. Add the five endpoints typically required for private ECS tasks (ECR API, ECR DKR, CloudWatch Logs, Secrets Manager, STS) across 3 AZs: that's 5 × 3 × $7.20 = $108/month in endpoint hourly charges alone — before counting data processing. The fourtheorem team documented exactly this: a setup that looked cheaper with endpoints ($43.84) flipped to $197/month once all required endpoints were counted, versus $100/month with NAT.
“Each service that is not deployable to a VPC requires a new VPC Endpoint… the bills stack up quickly!”
— fourtheorem, Amazon ECS Hidden Costs
The S3 gateway endpoint is always free and takes 90 seconds to add — do it unconditionally. Interface endpoints require explicit break-even math: calculate monthly NAT data charges vs. (number of endpoints × AZs × $7.20). At low data volumes, NAT is cheaper.
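The break-even math can be sketched as two shell functions. Both function names are hypothetical. The sketch assumes the prices quoted above ($7.20 per endpoint per AZ per month, $0.045/GB through NAT, $0.01/GB through an interface endpoint), and it assumes the NAT gateway keeps running for other traffic, so only data-processing charges are compared.

```shell
#!/usr/bin/env bash
# Fixed monthly cost of interface endpoints: endpoints x AZs x $7.20.
endpoint_fixed_cost() {
  LC_ALL=C awk -v n="$1" -v az="$2" \
    'BEGIN { printf "%.2f\n", n * az * 7.20 }'
}

# Monthly GB at which the endpoints' fixed cost is repaid by the lower
# per-GB rate ($0.045 via NAT vs $0.01 via an interface endpoint).
endpoint_break_even_gb() {
  LC_ALL=C awk -v n="$1" -v az="$2" \
    'BEGIN { printf "%.0f\n", (n * az * 7.20) / (0.045 - 0.01) }'
}

endpoint_fixed_cost 5 3      # five endpoints across three AZs
endpoint_break_even_gb 5 3   # GB/month needed before endpoints win
```

For the five-endpoint, three-AZ setup this works out to roughly 3 TB a month of traffic to those services before the endpoints pay for themselves.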
Check whether you already have the S3 gateway endpoint in place:
aws ec2 describe-vpc-endpoints \
  --filters \
    "Name=service-name,Values=com.amazonaws.us-east-1.s3" \
    "Name=vpc-endpoint-type,Values=Gateway" \
  --query 'VpcEndpoints[*].{State:State,VPC:VpcId}'

Empty output means you don't have one. The create command is in the Ready-to-use block above.
Switch to Graviton and take 20% off every Fargate task
ARM64 Fargate costs $0.03238/vCPU-hr vs $0.04048 for x86, 20% less. For 10 services × 3 tasks × 2 vCPU running 730 hours, that's about $355/month saved on vCPU charges with no infrastructure change: just rebuild images for linux/arm64.
The math is clean. For a 10-service fleet where each service runs 3 tasks at 2 vCPU: 10 × 3 × 2 = 60 vCPU, and 60 vCPU × 730 hr × ($0.04048 − $0.03238) = ~$355/month.
That's at 2 vCPU per task. At 0.5 vCPU (smaller tasks), the same formula yields ~$89/month, still worth it for zero architectural work. These figures count vCPU charges only. AWS also lists ARM64 memory about 20% below x86 in us-east-1 ($0.00356 vs $0.004445/GB-hr), so the real saving is somewhat higher.
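The fleet formula can be checked with a short shell function. The name `graviton_vcpu_savings` is hypothetical. The sketch uses the two vCPU-hour prices quoted above and 730 hours per month, and it counts vCPU charges only.

```shell
#!/usr/bin/env bash
# Monthly vCPU savings from moving a fleet from x86 to ARM64 Fargate.
# Args: services, tasks per service, vCPU per task.
graviton_vcpu_savings() {
  LC_ALL=C awk -v s="$1" -v t="$2" -v v="$3" \
    'BEGIN { printf "%.0f\n", s * t * v * 730 * (0.04048 - 0.03238) }'
}

graviton_vcpu_savings 10 3 2     # 10 services x 3 tasks x 2 vCPU
graviton_vcpu_savings 10 3 0.5   # same fleet with 0.5 vCPU tasks
```

With the stated prices, the 2 vCPU fleet comes to about $355/month and the 0.5 vCPU fleet to about $89/month.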
Graviton + Spot is the highest-impact combination on Fargate. The 20% Graviton discount stacks with the ~70% Spot discount for a combined ~76% reduction versus x86 on-demand. For dev environments that already run on Spot, switching to ARM64 pulls another 20% out of the remaining compute spend.
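The ~76% figure comes from multiplying what remains after each discount, not from adding the percentages. A minimal sketch, with a hypothetical function name:

```shell
#!/usr/bin/env bash
# Combined reduction from two stacked percentage discounts:
# 1 - (1 - a) * (1 - b), expressed in percent.
stacked_discount() {
  LC_ALL=C awk -v a="$1" -v b="$2" \
    'BEGIN { printf "%.0f\n", (1 - (1 - a / 100) * (1 - b / 100)) * 100 }'
}

stacked_discount 20 70   # Graviton 20% stacked with Spot ~70%
```

Paying 80% of the price, then 30% of that, leaves 24% of the original x86 on-demand cost.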
What needs to change: rebuild Docker images with docker buildx build --platform linux/arm64, then update the ECS task definition's runtimePlatform.cpuArchitecture to ARM64. Most Python, Node.js, Go, Java, and Ruby stacks rebuild without any code changes. Test before flipping production — native code compiled for x86, some C-extension Python packages, and kernel-module-dependent workloads need verification.
For the full Spot setup and capacity provider strategy, see how to cut ECS Fargate costs by 65%. Graviton and Spot configure independently — you can add Graviton to an existing Spot setup today.
Container Insights Enhanced: $0.07/metric/month, and it multiplies with fleet size
Enhanced Container Insights for ECS charges $0.07 per metric per month. AWS's own example: 1 cluster, 5 services, 20 tasks, 50 containers = 2,264 metrics = $158.48/month. At 10 environments that's over $1,500/month for observability you may not be actively using.
The metric count compounds with fleet size because each container generates multiple metrics: CPU utilization, memory utilization, network bytes in/out, storage read/write, task count. AWS's CloudWatch pricing page gives the worked example directly: 1 cluster with 5 services, 20 tasks, 50 containers generates 2,264 metrics. At $0.07 each: $158.48/month per cluster.
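To see how that bill scales across environments, here is a small sketch. The function name is hypothetical, the $0.07 rate is the one quoted above, and the metric count should come from your own CloudWatch bill rather than from AWS's example.

```shell
#!/usr/bin/env bash
# Monthly Enhanced Container Insights cost.
# Args: metrics per cluster, optional number of clusters (default 1).
insights_monthly_cost() {
  LC_ALL=C awk -v m="$1" -v c="${2:-1}" \
    'BEGIN { printf "%.2f\n", m * 0.07 * c }'
}

insights_monthly_cost 2264      # AWS's worked example, one cluster
insights_monthly_cost 2264 10   # the same cluster shape in 10 environments
```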
Enhanced Container Insights is opt-in per cluster — it is not on by default for all clusters. The problem is that teams often enable it when debugging a production issue and never disable it on the dev/staging clusters where they also turned it on. Standard CloudWatch metrics (CPU, memory at task level) still work without Enhanced — they're the base tier and are billed differently. Enhanced adds per-container-level metrics, ECS-specific dimensions, and storage metrics.
Check which clusters have Enhanced enabled, then disable it on non-production clusters:
# Check current status per cluster
aws ecs describe-clusters \
  --clusters YOUR_CLUSTER_NAME \
  --include SETTINGS \
  --query 'clusters[0].settings'

# Turn Container Insights off on a cluster
# (use value=enabled instead to fall back to the standard tier)
aws ecs update-cluster-settings \
  --cluster YOUR_CLUSTER_NAME \
  --settings name=containerInsights,value=disabled

For controlling CloudWatch log costs across ECS at fleet scale, see controlling CloudWatch log costs for ECS, which covers log group retention, FireLens vs awslogs, and the log-volume math by service type.
One ALB per environment, not one per service
Each ALB costs $16–20/month in base hourly charges before LCUs. Teams that provision one ALB per microservice per environment run 50–100+ ALBs. One team reduced from 270 ALBs to 9 by switching to host-based routing — one ALB per environment, listener rules route to services.
The Signiant engineering team documented their ALB consolidation in detail: they had 270 ALBs across their infrastructure, running at roughly $16–20 each per month. Switching to a shared ALB model with host-based routing (*.service.env.internal → listener rules → target groups) reduced that to 9 ALBs, eliminating 261 base charges. At $18/month average: $4,698/month removed from the bill.
The anti-pattern is common: a Terraform module creates one ECS service and one ALB together, so teams end up with an ALB per service per environment by default. Shared ALBs require slightly more routing configuration but the cost argument is clear past 3–4 services per environment.
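ALB consolidation also reduces IPv4 address charges. Since February 2024, AWS charges $0.005/hr per public IPv4 address, and an internet-facing ALB holds at least one per enabled Availability Zone. Counting just one address per ALB, 270 ALBs cost 270 × $0.005 × 730 hr = ~$985/month in IPv4 charges alone; the real figure scales with the number of AZs.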
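Both ALB numbers can be reproduced with a short sketch. The function names are hypothetical. The $18/month base charge is the average used above, and the addresses-per-ALB argument is an assumption you should set to the number of AZs your load balancers span.

```shell
#!/usr/bin/env bash
# Base-charge savings from consolidating ALBs.
# Args: ALBs before, ALBs after, optional monthly base charge (default 18).
alb_base_savings() {
  LC_ALL=C awk -v before="$1" -v after="$2" -v price="${3:-18}" \
    'BEGIN { printf "%.2f\n", (before - after) * price }'
}

# Monthly public IPv4 charge at $0.005/hr over 730 hours.
# Args: number of ALBs, optional public addresses per ALB (default 1).
alb_ipv4_cost() {
  LC_ALL=C awk -v n="$1" -v per="${2:-1}" \
    'BEGIN { printf "%.2f\n", n * per * 0.005 * 730 }'
}

alb_base_savings 270 9   # the Signiant consolidation
alb_ipv4_cost 270        # IPv4 charges before consolidation, one address each
alb_ipv4_cost 9          # IPv4 charges after consolidation
```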
A shared ALB setup in Terraform:
# Shared ALB: one per environment
resource "aws_lb" "env" {
  name               = "${var.env_name}-alb"
  internal           = false
  load_balancer_type = "application"
  subnets            = var.public_subnet_ids
  security_groups    = [aws_security_group.alb.id]
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.env.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.acm_cert_arn

  default_action {
    type = "fixed-response"
    fixed_response {
      content_type = "text/plain"
      message_body = "no route"
      status_code  = "404"
    }
  }
}

# Per-service listener rule: host-based routing
resource "aws_lb_listener_rule" "api" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 100

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }

  condition {
    host_header {
      values = ["api.${var.env_name}.example.com"]
    }
  }
}

resource "aws_lb_listener_rule" "worker" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 110

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.worker.arn
  }

  condition {
    host_header {
      values = ["worker.${var.env_name}.example.com"]
    }
  }
}

Caveat: WebSocket services and services that require conflicting port bindings may need their own ALB. Otherwise, one ALB per environment handles up to 100 listener rules (the default soft limit, extendable via quota request).
Compute Optimizer runs a free fleet right-sizing pass — and it's scriptable
AWS Compute Optimizer has supported Fargate since December 2022 at no charge. get-ecs-service-recommendations returns CPU and memory recommendations at both task and container level. Script it across all services and diff against current task definitions to find over-provisioned tasks fleet-wide.
Compute Optimizer launched ECS Fargate support on December 23, 2022 — it's free, and most teams haven't used it. The tool analyzes CloudWatch utilization metrics from the trailing 14 days and returns recommendations at two levels: the task definition (overall CPU/memory) and the individual container (container-level CPU/memory shares within the task). For over-provisioned long-running services, the claimed savings are 30–70% of compute spend.
The gotcha that wastes time: Compute Optimizer won't generate recommendations for a service if a target-tracking Auto Scaling policy is attached to CPU or memory for that service. If you check a service and get no recommendations, verify whether it has an Application Auto Scaling policy attached. Recommendations also require at least 24 hours of CloudWatch and ECS utilization data in the trailing 14-day window.
Fleet script — loop over all clusters and return recommendations for every service:
#!/bin/bash
# Get Compute Optimizer recommendations for all ECS services
# Requires: AWS CLI v2, jq
REGION="${AWS_DEFAULT_REGION:-us-east-1}"

echo "Fetching all clusters..."
CLUSTERS=$(aws ecs list-clusters --region "$REGION" --query 'clusterArns[]' --output text)

for CLUSTER in $CLUSTERS; do
  CLUSTER_NAME=$(basename "$CLUSTER")
  echo ""
  echo "=== $CLUSTER_NAME ==="
  SERVICES=$(aws ecs list-services --cluster "$CLUSTER" --region "$REGION" --query 'serviceArns[]' --output text)

  for SERVICE_ARN in $SERVICES; do
    # Task-level current vs recommended CPU and memory for this service
    RESULT=$(aws compute-optimizer get-ecs-service-recommendations --service-arns "$SERVICE_ARN" --region "$REGION" --query 'ecsServiceRecommendations[0].{
      Finding: finding,
      CurrentCPU: currentServiceConfiguration.cpu,
      CurrentMem: currentServiceConfiguration.memory,
      RecommendedCPU: serviceRecommendationOptions[0].cpu,
      RecommendedMem: serviceRecommendationOptions[0].memory
    }' --output json 2>/dev/null)

    if [ -n "$RESULT" ] && [ "$RESULT" != "null" ]; then
      SERVICE_NAME=$(basename "$SERVICE_ARN")
      # Emit one compact JSON object per service, tagged with its name
      echo "$RESULT" | jq -c --arg service "$SERVICE_NAME" '{Service: $service} + .'
    fi
  done
done

To use this in CI: run the script, parse the JSON output, and compare recommended vs current CPU/memory in each task definition. Flag as a PR comment or Slack alert when drift exceeds 20%. Teams that do this catch over-provisioning at deploy time before it accumulates.
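The 20% drift check could look something like the sketch below. The function name `exceeds_drift` is hypothetical, and drift is assumed here to mean the absolute difference between current and recommended, as a percentage of the current value. Adjust the definition if your team measures it differently.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when current vs recommended differ by more than the
# threshold percentage. Args: current, recommended, optional threshold (20).
exceeds_drift() {
  LC_ALL=C awk -v cur="$1" -v rec="$2" -v thr="${3:-20}" 'BEGIN {
    if (cur == 0) exit 2               # no current value to compare against
    d = (cur - rec) / cur * 100
    if (d < 0) d = -d                  # treat under-provisioning the same way
    exit ((d > thr) ? 0 : 1)
  }'
}

# Example: a task defined with 1024 CPU units but recommended at 512.
if exceeds_drift 1024 512; then
  echo "drift above 20%: review this task definition"
fi
```

In a pipeline, feed it the CurrentCPU and RecommendedCPU fields from each service's JSON line and fail or comment on the build when it succeeds.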
“If you've estimated your ECS costs based only on Fargate compute pricing, you're probably underestimating by 30–50%.”
— CloudBurn, AWS Fargate Pricing: Real Costs
If you read this, you might also want to know
How do I know if my ECS tasks are right-sized without Compute Optimizer?
Check per-service CPU and memory utilization in CloudWatch. Look at p95 over a 2-week window. If p95 CPU stays below 30% of the task allocation, drop to the next Fargate size (e.g. 1 vCPU → 0.5 vCPU). If p95 memory stays below 40%, halve the memory allocation. The service-level CPUUtilization and MemoryUtilization metrics in the AWS/ECS namespace are published at no charge and don't need Container Insights. Standard Container Insights (not Enhanced) adds task-level detail, billed as custom metrics.
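The two thresholds can be turned into a quick manual check. This is a sketch of the rule of thumb above, not an AWS tool: the function name is hypothetical, and the inputs are p95 utilization percentages you read from CloudWatch yourself.

```shell
#!/usr/bin/env bash
# Apply the 30% CPU / 40% memory rule of thumb to p95 utilization figures.
# Args: p95 CPU percent, p95 memory percent.
rightsize_hint() {
  LC_ALL=C awk -v cpu="$1" -v mem="$2" 'BEGIN {
    if (cpu < 30) print "cpu: try the next smaller Fargate size"
    else          print "cpu: keep"
    if (mem < 40) print "memory: try halving the allocation"
    else          print "memory: keep"
  }'
}

rightsize_hint 22 65   # low CPU, healthy memory
```

Check the result against the valid Fargate CPU and memory combinations before editing the task definition, since not every halved value is an allowed size.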
Can I mix x86 and ARM64 tasks in the same ECS cluster?
Yes — ECS clusters are architecture-agnostic. Each task definition specifies its own runtimePlatform.cpuArchitecture. You can migrate services one at a time: flip a dev task definition to ARM64, test for a week, then move staging and production. No cluster changes required.
How do I set up AZ-aware task placement to avoid cross-AZ charges?
On the EC2 launch type, use the spread placement strategy with field=attribute:ecs.availability-zone to distribute tasks across AZs. Fargate does not accept placement strategies and spreads tasks across the AZs of the subnets you give the service. Either way, ensure your target groups use the same AZ as the tasks routing through them. Cross-AZ traffic in the same region costs $0.01/GB each direction. For high-throughput services, co-locating the ALB target group and the task in the same AZ eliminates this charge.
What happens to in-flight requests when a Fargate Spot task is interrupted?
ECS sends SIGTERM to the container and fires an EventBridge task state change event with stopCode SpotInterruption. You have up to 2 minutes before the task is forcibly stopped. Set stopTimeout to 120 seconds in your task definition to use the full window. Make sure the ALB deregisters the target and drains connections within that window (the deregistration delay on the target group controls this). In-flight requests that complete within the draining window are served normally.