How to Debug AWS Fargate Containers with ECS Exec?
You moved to Fargate. No more SSH. No more docker exec. Your container is failing and you can't get inside. ECS Exec, AWS's answer to docker exec for Fargate, has been available since March 2021. Here's how to set it up, the 5 errors that catch everyone, and the commands that actually work.
- ECS Exec uses SSM Session Manager: ECS bind-mounts the SSM agent into your container, so no sidecar is needed
- It requires 3 things: --enable-execute-command on the service, an IAM task role with ssmmessages permissions, and the Session Manager plugin installed alongside your local AWS CLI
- The #1 failure point is IAM: the task role needs the ssmmessages permissions, and ecs:ExecuteCommand on the caller alone is not enough
- 20-minute idle timeout, 1 session per container, root user: know the limits before you rely on it in production
- CloudTrail logs every ExecuteCommand call. S3 and CloudWatch Logs can capture command output for compliance
Why ECS Exec exists — the Fargate debugging gap
Before ECS Exec (launched March 2021), debugging a Fargate container meant you couldn't get a shell at all: Fargate runs your tasks on AWS-managed infrastructure, so there are no EC2 instances to SSH into. ECS Exec was one of the most requested features on the AWS Containers Roadmap for good reason.
The 5 errors that catch everyone
Every team hits these. The error messages are cryptic. The fixes are specific.
# Update the service to enable ECS Exec
aws ecs update-service \
--cluster your-cluster \
--service your-service \
--enable-execute-command \
--force-new-deployment
# Or for a standalone task:
aws ecs run-task \
--cluster your-cluster \
--task-definition your-task \
--enable-execute-command

{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": [
"ssmmessages:CreateControlChannel",
"ssmmessages:CreateDataChannel",
"ssmmessages:OpenControlChannel",
"ssmmessages:OpenDataChannel"
],
"Resource": "*"
}]
}

# macOS
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "session.zip" && \
unzip session.zip && sudo ./sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin
# Linux
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/linux_64bit/session-manager-plugin.rpm" -o "plugin.rpm" && \
sudo yum install -y plugin.rpm

# Option A: Add NAT Gateway to route traffic to internet
# Option B: Create VPC endpoints for SSM (recommended for private subnets)
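# ssmmessages is an interface endpoint; create-vpc-endpoint defaults to Gateway.
# Replace us-east-1 with your region.
aws ec2 create-vpc-endpoint \
--vpc-id vpc-xxx \
--vpc-endpoint-type Interface \
--service-name com.amazonaws.us-east-1.ssmmessages \
--subnet-ids subnet-xxx

# In your task definition, set: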
"linuxParameters": {
"initProcessEnabled": true
}
# And remove or set to false:
"readonlyRootFilesystem": false

The happy path: step by step
# macOS
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "session.zip"
unzip session.zip
sudo ./sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin
# Verify
session-manager-plugin --version

This is the policy the container needs to call SSM. Attach it to your ECS task role (NOT the execution role, which is for pulling images and writing logs).
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ssmmessages:CreateControlChannel",
"ssmmessages:CreateDataChannel",
"ssmmessages:OpenControlChannel",
"ssmmessages:OpenDataChannel"
],
"Resource": "*"
}
]
}

aws ecs update-service \
--cluster my-cluster \
--service my-service \
--enable-execute-command \
--force-new-deployment

aws ecs describe-tasks \
--cluster my-cluster \
--tasks $(aws ecs list-tasks --cluster my-cluster --service-name my-service --query 'taskArns[0]' --output text)
# Look for:
# "enableExecuteCommand": true
# "lastStatus": "RUNNING" under ExecuteCommandAgent

# Interactive shell
aws ecs execute-command \
--cluster my-cluster \
--task YOUR_TASK_ID \
--container nginx \
--command "/bin/bash" \
--interactive
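
Looking up YOUR_TASK_ID by hand gets tedious. The following is a minimal sketch of a wrapper, not an AWS command: `ecs_shell` is a hypothetical helper name, and the cluster, service, and container names are the placeholders used in this article. It resolves the first running task of a service, confirms that exec is enabled on that task, and only then opens the session.

```shell
# Hypothetical helper: open a shell in the first running task of a service.
# Usage: ecs_shell CLUSTER SERVICE CONTAINER [SHELL]
ecs_shell() {
  local cluster="$1" service="$2" container="$3" shell_cmd="${4:-/bin/sh}"
  local task_arn enabled

  # First running task of the service ("None" means the list was empty).
  task_arn="$(aws ecs list-tasks \
    --cluster "$cluster" \
    --service-name "$service" \
    --desired-status RUNNING \
    --query 'taskArns[0]' \
    --output text)"
  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "no running task found for $service" >&2
    return 1
  fi

  # Tasks started before --enable-execute-command was set report false here.
  enabled="$(aws ecs describe-tasks \
    --cluster "$cluster" \
    --tasks "$task_arn" \
    --query 'tasks[0].enableExecuteCommand' \
    --output text)"
  case "$enabled" in
    True|true) ;;
    *)
      echo "ECS Exec is not enabled on $task_arn" >&2
      return 1
      ;;
  esac

  aws ecs execute-command \
    --cluster "$cluster" \
    --task "$task_arn" \
    --container "$container" \
    --command "$shell_cmd" \
    --interactive
}
```

With this in your shell profile, `ecs_shell my-cluster my-service nginx /bin/bash` would replace the lookup plus the command above. The default of /bin/sh is a design choice for images that do not ship bash.
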
# Single command
# ECS Exec does not run the command through a shell, so wrap pipes in sh -c
aws ecs execute-command \
--cluster my-cluster \
--task YOUR_TASK_ID \
--container nginx \
--command "sh -c 'env | grep DATABASE'" \
--interactive

Production setup: logging, audit, security
ECS Exec is powerful — and you need controls around it in production. Three layers: logging (what commands ran), auditing (who ran them), and access control (who CAN run them).
Configure at the cluster level. Two destinations: S3 for durable retention, CloudWatch for real-time search. CloudTrail separately logs the ExecuteCommand API call (who and when). Together they give you full visibility: CloudTrail = who executed. S3/CloudWatch = what they ran.
# Pass the whole configuration as JSON rather than mixing shorthand and JSON
aws ecs update-cluster \
--cluster my-cluster \
--configuration '{
"executeCommandConfiguration": {
"logging": "OVERRIDE",
"logConfiguration": {
"cloudWatchLogGroupName": "/aws/ecs/my-cluster-exec",
"s3BucketName": "my-exec-logs",
"s3KeyPrefix": "exec-output"
}
}
}'

Use IAM condition keys on ecs:ExecuteCommand. This policy allows exec only on tasks tagged environment=development in a specific cluster. Production tasks are blocked, even if someone has the right IAM role.
{
"Effect": "Allow",
"Action": "ecs:ExecuteCommand",
"Resource": [
"arn:aws:ecs:us-east-1:123456789012:cluster/my-cluster",
"arn:aws:ecs:us-east-1:123456789012:task/my-cluster/*"
],
"Condition": {
"StringEquals": {
"ecs:ResourceTag/environment": "development"
}
}
}

Add a Deny policy that blocks exec on any container named production-app, regardless of IAM role. This is the safety net. Even if someone tags a task wrong, the container name catches it.
{
"Effect": "Deny",
"Action": "ecs:ExecuteCommand",
"Resource": "*",
"Condition": {
"StringEquals": {
"ecs:container-name": "production-app"
}
}
}

What ECS Exec can't do
FAQ
If you read this, you might also want to know
Can I use ECS Exec on EC2 launch type?
Yes. ECS Exec works on both the EC2 and Fargate launch types, with the same IAM setup. On EC2, the container instance needs a recent ECS container agent (AWS documents version 1.50.2 or later), which current ECS-optimized AMIs include. On a custom AMI, update the ECS agent yourself. The session-manager-plugin goes on the machine where you run the AWS CLI, not on the instance.
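
To check the EC2 side, one option is to read the container agent version that each instance reports. This is a minimal sketch, assuming the 1.50.2 minimum that AWS documents for ECS Exec (worth re-checking against the current docs). `version_ge` and `check_agent_versions` are hypothetical helper names.

```shell
# Return 0 when dotted version $1 is >= dotted version $2 (numeric fields only).
version_ge() {
  local have="$1" want="$2" h w field
  for field in 1 2 3; do
    h="${have%%.*}"
    w="${want%%.*}"
    if [ "${h:-0}" -gt "${w:-0}" ]; then return 0; fi
    if [ "${h:-0}" -lt "${w:-0}" ]; then return 1; fi
    # Drop the field just compared; an exhausted version becomes empty (0).
    case "$have" in *.*) have="${have#*.}" ;; *) have="" ;; esac
    case "$want" in *.*) want="${want#*.}" ;; *) want="" ;; esac
  done
  return 0
}

# Print one line per container instance and fail if any agent is too old.
# Usage: check_agent_versions CLUSTER [MIN_VERSION]
check_agent_versions() {
  local cluster="$1" min="${2:-1.50.2}" arns version status=0
  arns="$(aws ecs list-container-instances \
    --cluster "$cluster" \
    --query 'containerInstanceArns' \
    --output text)"
  if [ -z "$arns" ] || [ "$arns" = "None" ]; then
    echo "no container instances registered in $cluster" >&2
    return 1
  fi
  # Word splitting is intended: text output is a whitespace-separated list.
  for version in $(aws ecs describe-container-instances \
      --cluster "$cluster" \
      --container-instances $arns \
      --query 'containerInstances[].versionInfo.agentVersion' \
      --output text); do
    if version_ge "$version" "$min"; then
      echo "ok $version"
    else
      echo "too old $version"
      status=1
    fi
  done
  return "$status"
}
```

Running `check_agent_versions my-cluster` before enabling exec on an EC2-backed service gives a quick signal on whether any instance needs an agent update first.
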
How do I log ECS Exec sessions for compliance?
Three pieces: (1) CloudTrail captures the ExecuteCommand API call: who, when, which task; (2) S3 or CloudWatch Logs capture command output, configured at the cluster level with logging=OVERRIDE; (3) KMS encryption for the data channel: add kmsKeyId to your executeCommandConfiguration. Together they provide the audit trail that compliance reviews such as SOC 2 typically ask for.
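
The logging and encryption pieces might be combined into a single cluster configuration. This is a minimal sketch: `exec_audit_config` is a hypothetical helper that only prints the JSON for the `--configuration` argument of `aws ecs update-cluster`, and the key, log group, and bucket values are placeholders you would replace.

```shell
# Hypothetical helper: print an executeCommandConfiguration with KMS + logging.
# Usage: exec_audit_config KMS_KEY_ID LOG_GROUP S3_BUCKET
exec_audit_config() {
  local kms_key="$1" log_group="$2" bucket="$3"
  cat <<EOF
{
  "executeCommandConfiguration": {
    "kmsKeyId": "$kms_key",
    "logging": "OVERRIDE",
    "logConfiguration": {
      "cloudWatchLogGroupName": "$log_group",
      "s3BucketName": "$bucket",
      "s3KeyPrefix": "exec-output"
    }
  }
}
EOF
}

# Example call (placeholder values, not run here):
#   aws ecs update-cluster --cluster my-cluster \
#     --configuration "$(exec_audit_config YOUR_KMS_KEY_ARN /aws/ecs/my-cluster-exec my-exec-logs)"
```

Keeping the JSON in one function makes it easy to review the audit settings in version control before they are applied to a cluster.
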
Is there a way to debug without ECS Exec — like a sidecar approach?
Some teams run a debug sidecar container (e.g. alpine with curl/netcat) in the same task for diagnostics. This works without enabling ECS Exec but requires modifying your task definition. ECS Exec is simpler because it doesn't change your task definition — it's a runtime feature, not a deployment change.
Does ECS Exec work with ECS Anywhere?
Yes — ECS Exec is supported on external instances (ECS Anywhere) for both Linux and Windows containers. The SSM agent must be installed on the external instance alongside the ECS agent. The IAM and setup requirements are the same.
Operating the fleet is the rest.
ECS Exec solves one problem — getting a shell. Fleet scheduling, cost visibility, environment cloning, and developer self-service are the next ones. Fortem handles the fleet so you don't have to debug at 2am.