Use Case · June 11, 2026 · 7 min read

How to Debug AWS Fargate Containers with ECS Exec

You moved to Fargate. No more SSH. No more docker exec. Your container is failing and you can't get inside. ECS Exec, AWS's answer to docker exec for Fargate, has been available since March 2021. Here's how to set it up, the 5 errors that catch everyone, and the commands that actually work.

Matt S
Platform engineer · Fortem
TL;DR
  • 01 ECS Exec uses SSM Session Manager: it bind-mounts an agent into your container, no sidecar needed
  • 02 Requires 3 things: --enable-execute-command on the service, an IAM task role with SSM permissions, and the SSM Session Manager plugin installed alongside your local CLI
  • 03 The #1 failure point is IAM: the task role needs the ssmmessages permissions, and ecs:ExecuteCommand for the caller is not enough on its own
  • 04 20-minute idle timeout, 1 session per PID namespace, root user: know the limits before you rely on it in production
  • 05 CloudTrail logs every ExecuteCommand call. S3 and CloudWatch can capture command output for compliance

Why ECS Exec exists — the Fargate debugging gap

ECS on EC2 (before) | ECS on Fargate (with ECS Exec)
SSH into EC2 instance | aws ecs execute-command (no SSH needed)
docker exec -it container bash | /bin/bash inside container via SSM
Open ports, manage SSH keys | No ports, no keys: IAM controls access
Locate instance in ASG first | Direct to task ID, always routable
Security: instance-level access | Security: per-task, per-container IAM

Before ECS Exec (launched March 2021), debugging a Fargate container meant you couldn't get a shell at all — there are no EC2 instances to SSH into. Fargate runs your tasks on AWS-managed infrastructure. ECS Exec was the #1 most requested feature on the AWS Containers Roadmap for good reason.

Key insight
ECS Exec is not a sidecar container or a separate service. It bind-mounts the SSM agent binaries into your existing container at runtime. Your task definition doesn't change — the ECS agent handles the plumbing transparently.

The 5 errors that catch everyone

Every team hits these. The error messages are cryptic. The fixes are specific.

01 ExecuteCommandAgent not RUNNING
Cause: You forgot --enable-execute-command when creating or updating the service. ECS Exec must be explicitly turned on per service or per standalone task.
Fix
bash
# Update the service to enable ECS Exec
aws ecs update-service \
    --cluster your-cluster \
    --service your-service \
    --enable-execute-command \
    --force-new-deployment

# Or for a standalone task:
aws ecs run-task \
    --cluster your-cluster \
    --task-definition your-task \
    --enable-execute-command
Verify: aws ecs describe-tasks --cluster your-cluster --tasks task-id — check that enableExecuteCommand is true and ExecuteCommandAgent status is RUNNING
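The verification above can be narrowed to just the agent status with a JMESPath query. This is a minimal sketch: the function name exec_agent_status is hypothetical, and it assumes the task ID or ARN you pass belongs to the named cluster.

```shell
# Hypothetical helper: print the ExecuteCommandAgent status of every container in a task.
# Usage: exec_agent_status <cluster> <task-id-or-arn>
exec_agent_status() {
  local cluster="$1" task="$2"
  # Flatten the managed agents of all containers, then keep only the exec agent.
  aws ecs describe-tasks \
    --cluster "$cluster" \
    --tasks "$task" \
    --query "tasks[0].containers[].managedAgents[] | [?name=='ExecuteCommandAgent'].lastStatus" \
    --output text
}
```

Calling exec_agent_status your-cluster task-id should print one status per container. Empty output suggests ECS Exec was not enabled when the task started.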
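02 AccessDeniedException: User is not authorized
Cause: An IAM permission is missing. If the message names ecs:ExecuteCommand, the IAM principal running the CLI is not allowed to call that action. The task IAM role also needs the SSM permissions below so the agent can open a session.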
Fix
json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "ssmmessages:CreateControlChannel",
      "ssmmessages:CreateDataChannel",
      "ssmmessages:OpenControlChannel",
      "ssmmessages:OpenDataChannel"
    ],
    "Resource": "*"
  }]
}
Verify: Attach this policy to the task role (NOT the execution role). The SSM agent runs inside the container — it's the task that needs the permissions, not the service launching it.
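One way to confirm the task role really carries these permissions is the IAM policy simulator. The sketch below is a rough check under assumptions: check_task_role is a hypothetical name, the caller needs permission to run the simulation, and the simulator may not reflect every guardrail in your organization.

```shell
# Hypothetical helper: ask IAM whether a role is allowed the four ssmmessages actions.
# Usage: check_task_role <task-role-arn>
check_task_role() {
  local role_arn="$1"
  # Prints one line per action: the action name and the simulated decision.
  aws iam simulate-principal-policy \
    --policy-source-arn "$role_arn" \
    --action-names \
      ssmmessages:CreateControlChannel \
      ssmmessages:CreateDataChannel \
      ssmmessages:OpenControlChannel \
      ssmmessages:OpenDataChannel \
    --query 'EvaluationResults[].[EvalActionName,EvalDecision]' \
    --output text
}
```

Any action whose decision is not allowed points at a policy that is missing or attached to the wrong role.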
03 SessionManagerPlugin is not found
Cause: The SSM Session Manager plugin is not installed on your local machine.
Fix
bash
# macOS
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "session.zip" && \
unzip session.zip && sudo ./sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin

# Linux (RPM-based distributions, 64-bit x86)
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/linux_64bit/session-manager-plugin.rpm" -o "plugin.rpm" && \
sudo yum install -y plugin.rpm
Verify: session-manager-plugin --version
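A small preflight check can catch a missing plugin before you attempt a session. This is a sketch in plain shell; require_tools is a hypothetical helper name, not part of the AWS CLI.

```shell
# Hypothetical preflight: report which required local tools are missing.
# Prints one line per missing tool and returns non-zero if any are missing.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    # command -v succeeds only when the tool is on PATH (or is a shell builtin).
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For ECS Exec you would call it as require_tools aws session-manager-plugin and stop if it returns non-zero.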
04 Timeout: session never connects
Cause: Your Fargate task has no route to the SSM service endpoint. Either the task is in a private subnet with no NAT gateway, or VPC endpoints for SSM are missing.
Fix
bash
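# Option A: Add a NAT gateway so the task can reach the public SSM endpoints
# Option B: Create an interface VPC endpoint for ssmmessages (recommended for private subnets)
# Replace "region" with your AWS Region; sg-xxx must allow inbound HTTPS (443) from the task

aws ec2 create-vpc-endpoint \
    --vpc-id vpc-xxx \
    --vpc-endpoint-type Interface \
    --service-name com.amazonaws.region.ssmmessages \
    --subnet-ids subnet-xxx \
    --security-group-ids sg-xxx \
    --private-dns-enabled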
Verify: Check task networking: aws ecs describe-tasks — the task must be able to reach ssmmessages.region.amazonaws.com. If you're in a private subnet with no NAT, you MUST have the VPC endpoint.
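To see whether a VPC already has the endpoint, you can filter the existing endpoints by service name. A minimal sketch, assuming a hypothetical helper name ssmmessages_endpoint_state and that you pass the VPC ID and Region yourself:

```shell
# Hypothetical helper: print the state of any ssmmessages endpoints in a VPC.
# Empty output means no matching endpoint was found.
# Usage: ssmmessages_endpoint_state <vpc-id> <region>
ssmmessages_endpoint_state() {
  local vpc_id="$1" region="$2"
  aws ec2 describe-vpc-endpoints \
    --region "$region" \
    --filters \
      "Name=vpc-id,Values=$vpc_id" \
      "Name=service-name,Values=com.amazonaws.$region.ssmmessages" \
    --query 'VpcEndpoints[].State' \
    --output text
}
```

If this prints nothing and the subnet has no NAT route, the session timeout described above is the likely outcome.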
05 Session starts but commands fail: 'cannot create directory'
Cause: Your container's root filesystem is read-only (readonlyRootFilesystem: true). The SSM agent needs to create directories and files inside the container to function.
Fix
bash
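# In each container definition, remove this setting or set it to false:
"readonlyRootFilesystem": false
# Separately recommended: run an init process so the SSM agent's
# child processes are cleaned up when a session ends
"linuxParameters": {
  "initProcessEnabled": true
}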
Verify: SSM agent writes to /var/lib/amazon/ssm/. If the root FS is read-only, ECS Exec won't work. There's no workaround — the agent needs writable storage.
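Before redeploying, it can help to list which containers in a task definition have the read-only flag set. A minimal sketch; readonly_containers is a hypothetical name and the task definition argument is whatever family or ARN you already use.

```shell
# Hypothetical helper: print the names of containers whose root filesystem is read-only.
# Usage: readonly_containers <task-definition-family-or-arn>
readonly_containers() {
  local task_def="$1"
  # The filter keeps only container definitions where readonlyRootFilesystem is true.
  aws ecs describe-task-definition \
    --task-definition "$task_def" \
    --query 'taskDefinition.containerDefinitions[?readonlyRootFilesystem].name' \
    --output text
}
```

Any container name printed here is one that ECS Exec will not be able to enter.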

The happy path — step by step

Step 1 — Install the Session Manager plugin
bash
# macOS
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "session.zip"
unzip session.zip
sudo ./sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin

# Verify
session-manager-plugin --version
Step 2 — Create the task IAM role

This is the policy the container needs to call SSM. Attach it to your ECS task role (NOT the execution role — that's for pulling images and writing logs).

json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}
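If the policy above is saved to a local file, it can be attached as an inline policy from the CLI. This is a sketch under assumptions: attach_exec_policy and the policy name ecs-exec-ssmmessages are hypothetical, and your team may prefer managed policies or infrastructure-as-code instead.

```shell
# Hypothetical helper: attach the ssmmessages policy inline to an existing task role.
# Usage: attach_exec_policy <task-role-name> <policy-file.json>
attach_exec_policy() {
  local role_name="$1" policy_file="$2"
  # The policy name below is an illustrative choice, not a required value.
  aws iam put-role-policy \
    --role-name "$role_name" \
    --policy-name ecs-exec-ssmmessages \
    --policy-document "file://$policy_file"
}
```

For example, attach_exec_policy my-task-role ./ecs-exec-policy.json, where both names are placeholders.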
Step 3 — Enable ECS Exec on your service
bash
aws ecs update-service \
    --cluster my-cluster \
    --service my-service \
    --enable-execute-command \
    --force-new-deployment
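Because only newly started tasks pick up the setting, it can be useful to block until the forced deployment settles before trying to connect. A minimal sketch, assuming a hypothetical wrapper name enable_exec_and_wait:

```shell
# Hypothetical wrapper: enable ECS Exec on a service, then wait for the new deployment.
# Usage: enable_exec_and_wait <cluster> <service>
enable_exec_and_wait() {
  local cluster="$1" service="$2"
  # Discard the large JSON response; stop here if the update itself fails.
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --enable-execute-command \
    --force-new-deployment >/dev/null || return 1
  # Polls the service until it reports a stable state, or gives up with an error.
  aws ecs wait services-stable \
    --cluster "$cluster" \
    --services "$service"
}
```

Once it returns successfully, the replacement tasks should be running and ready for the agent check in the next step.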
Step 4 — Verify the agent is ready
bash
aws ecs describe-tasks \
    --cluster my-cluster \
    --tasks $(aws ecs list-tasks --cluster my-cluster --service-name my-service --query 'taskArns[0]' --output text)

# Look for:
# "enableExecuteCommand": true
# "lastStatus": "RUNNING" under ExecuteCommandAgent
Step 5 — Execute
bash
# Interactive shell
aws ecs execute-command \
    --cluster my-cluster \
    --task YOUR_TASK_ID \
    --container nginx \
    --command "/bin/bash" \
    --interactive

# Single command. ECS Exec does not run the command through a shell,
# so wrap pipes and redirects in sh -c
aws ecs execute-command \
    --cluster my-cluster \
    --task YOUR_TASK_ID \
    --container nginx \
    --command "sh -c 'env | grep DATABASE'" \
    --interactive
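Looking up the task ID by hand gets tedious, so the two calls can be combined. The sketch below makes some assumptions: ecs_shell is a hypothetical name, it picks the first running task of the service, and it defaults to /bin/sh because not every image ships bash.

```shell
# Hypothetical helper: open a command in the first running task of a service.
# Usage: ecs_shell <cluster> <service> <container> [command]
ecs_shell() {
  local cluster="$1" service="$2" container="$3" cmd="${4:-/bin/sh}"
  local task_arn
  task_arn=$(aws ecs list-tasks \
    --cluster "$cluster" \
    --service-name "$service" \
    --desired-status RUNNING \
    --query 'taskArns[0]' \
    --output text)
  # With --output text, a missing value is printed as the literal string "None".
  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "no running task found for $service" >&2
    return 1
  fi
  aws ecs execute-command \
    --cluster "$cluster" \
    --task "$task_arn" \
    --container "$container" \
    --command "$cmd" \
    --interactive
}
```

For example, ecs_shell my-cluster my-service nginx /bin/bash opens the same interactive shell as the first command above without copying a task ID.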

Production setup — logging, audit, security

ECS Exec is powerful — and you need controls around it in production. Three layers: logging (what commands ran), auditing (who ran them), and access control (who CAN run them).

1. Layer 1: Log command output to S3 and CloudWatch

Configure at the cluster level. Two destinations: S3 for durable retention, CloudWatch for real-time search. CloudTrail separately logs the ExecuteCommand API call (who and when). Together they give you full visibility: CloudTrail = who executed. S3/CloudWatch = what they ran.

bash
aws ecs update-cluster \
    --cluster my-cluster \
    --configuration '{
      "executeCommandConfiguration": {
        "logging": "OVERRIDE",
        "logConfiguration": {
          "cloudWatchLogGroupName": "/aws/ecs/my-cluster-exec",
          "s3BucketName": "my-exec-logs",
          "s3KeyPrefix": "exec-output"
        }
      }
    }'
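After updating the cluster, you can read the configuration back to confirm the logging settings were stored. A minimal sketch; show_exec_config is a hypothetical helper name.

```shell
# Hypothetical helper: print the ECS Exec configuration stored on a cluster.
# Usage: show_exec_config <cluster>
show_exec_config() {
  local cluster="$1"
  # Cluster configuration is only returned when CONFIGURATIONS is requested explicitly.
  aws ecs describe-clusters \
    --clusters "$cluster" \
    --include CONFIGURATIONS \
    --query 'clusters[0].configuration.executeCommandConfiguration' \
    --output json
}
```

The output should echo back the log group and bucket you configured. A null result suggests the update did not apply to this cluster.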
2. Layer 2: Restrict who can exec

Use IAM condition keys on ecs:ExecuteCommand. This policy allows exec only on tasks tagged environment=development in a specific cluster. Tasks without that tag, including production tasks, are not matched by this Allow, so exec is refused unless another policy grants it.

json
{
  "Effect": "Allow",
  "Action": "ecs:ExecuteCommand",
  "Resource": [
    "arn:aws:ecs:us-east-1:123456789012:cluster/my-cluster",
    "arn:aws:ecs:us-east-1:123456789012:task/my-cluster/*"
  ],
  "Condition": {
    "StringEquals": {
      "ecs:ResourceTag/environment": "development"
    }
  }
}
3. Layer 3: Block production by container name

Add a Deny policy that blocks exec on any container named production-app — regardless of IAM role. This is the safety net. Even if someone tags a task wrong, the container name catches it.

json
{
  "Effect": "Deny",
  "Action": "ecs:ExecuteCommand",
  "Resource": "*",
  "Condition": {
    "StringEquals": {
      "ecs:container-name": "production-app"
    }
  }
}

What ECS Exec can't do

Limitation | Why
20-minute idle timeout | The SSM session drops after 20 minutes of inactivity, and the timeout is not configurable. Active commands keep it alive, but an idle shell will disconnect. Plan for reconnection.
1 session per PID namespace | If you share a PID namespace across containers in a task, you can only exec into one at a time. The second session will fail until the first exits.
Must be enabled at launch | You can't retroactively enable ECS Exec on an already-running task. If you forgot --enable-execute-command, you need to redeploy the task.
Read-only root FS breaks it | The SSM agent writes to /var/lib/amazon/ssm/ inside the container. readonlyRootFilesystem: true makes this impossible. No workaround.
Commands run as root | Even if your container runs as a non-root user, commands executed through ECS Exec run as root. The SSM agent and its children ignore the container's USER directive.
No AWS Console support | ECS Exec is CLI/SDK only. You can't click a button in the Console to get a shell. AWS Copilot supports it (copilot svc exec), but the web Console doesn't.
Only tools in the image | If curl, netstat, or jq aren't in your container image, you can't use them during an exec session. ECS Exec doesn't inject tools; it only gives you access to what's already there.

FAQ

If you read this, you might also want to know

Can I use ECS Exec on EC2 launch type?

Yes. It works on both EC2 and Fargate. On EC2, the container instance needs a recent ECS container agent (version 1.50.2 or later, which current ECS-optimized AMIs include). If you're using a custom AMI, make sure it runs a supported agent version. The session-manager-plugin is only needed on the machine where you run the AWS CLI. The IAM setup is the same for both launch types.

How do I log ECS Exec sessions for compliance?

Three pieces: (1) CloudTrail captures the ExecuteCommand API call: who, when, which task; (2) S3 or CloudWatch Logs capture command output, configured at cluster level with logging=OVERRIDE; (3) KMS encryption for the data channel, enabled by adding kmsKeyId to your executeCommandConfiguration. Together they provide the kind of audit trail a SOC 2 review typically asks for, though your auditor decides what is sufficient.
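For the CloudTrail piece, recent ExecuteCommand calls can be pulled straight from the event history. This is a sketch, assuming a hypothetical helper name recent_exec_calls and that the events you need are still within CloudTrail's event history for the current Region.

```shell
# Hypothetical helper: list recent ExecuteCommand API calls (time and caller).
# Usage: recent_exec_calls [max-items]
recent_exec_calls() {
  local max="${1:-10}"
  # Each output line holds the event time and the user name CloudTrail recorded.
  aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=ExecuteCommand \
    --max-items "$max" \
    --query 'Events[].[EventTime,Username]' \
    --output text
}
```

Running recent_exec_calls 25 gives a quick who-and-when list. The full event record, including the target task, is in each event's CloudTrailEvent field.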

Is there a way to debug without ECS Exec — like a sidecar approach?

Some teams run a debug sidecar container (e.g. alpine with curl/netcat) in the same task for diagnostics. This works without enabling ECS Exec but requires modifying your task definition. ECS Exec is simpler because it doesn't change your task definition — it's a runtime feature, not a deployment change.

Does ECS Exec work with ECS Anywhere?

Yes — ECS Exec is supported on external instances (ECS Anywhere) for both Linux and Windows containers. The SSM agent must be installed on the external instance alongside the ECS agent. The IAM and setup requirements are the same.

Debugging is step one. Operating the fleet is the rest.

ECS Exec solves one problem — getting a shell. Fleet scheduling, cost visibility, environment cloning, and developer self-service are the next ones. Fortem handles the fleet so you don't have to debug at 2am.