What is the difference between platform engineering and DevOps for ECS teams?

DevOps focuses on the deployment pipeline — CI/CD, testing, releases. Platform engineering focuses on the operational layer that enables developers to self-serve: environment scheduling, access controls, fleet visibility, cost attribution. For ECS teams, the difference is concrete: DevOps is GitHub Actions deploying a new task definition; platform engineering is making sure the dev environment isn't running at 3am on a Saturday.

Do I need Backstage or a developer portal for ECS platform engineering?

Probably not. Backstage and similar developer portals solve developer-experience problems across a 50+ engineer org with multiple platforms. Most ECS teams at 10–50 environments have an operations problem: environments running when nobody uses them, developers blocked on platform engineers for basic actions, no cost visibility per environment. A developer portal doesn't schedule your environments or give developers scoped access to restart their own services.

At how many ECS environments does platform engineering pay off?

The inflection point is around 10 environments. Below 10, a few Lambda functions and some Terraform modules cover the gaps. Above 10, the DIY approach starts costing more in engineering time than it saves: you have 10+ separate scheduling stacks to maintain, 10+ sets of IAM policies to update, and no single view of what's running. The math: at 12 environments each costing $200/month in idle compute, scheduling them saves up to ~$2,400/month (12 × $200), which is typically more than an off-the-shelf tool costs.

What happened to AWS Proton?

AWS closed Proton to new customers in October 2025. Existing deployments continue to run, but the service is no longer receiving new features or accepting new accounts. If you were evaluating Proton as your ECS platform engineering layer, it's off the table. See the AWS Proton deprecation guide for migration options.

What does a 2-person ECS platform team realistically own?

A realistic scope for a 2-person ECS platform team: (1) the Terraform modules developers use to provision new environments, (2) the scheduling system that starts/stops non-prod environments, (3) the IAM setup that lets developers self-serve without touching production, (4) the cost attribution system that shows per-environment spend. Everything else — CI/CD pipelines, application architecture, monitoring dashboards — is the application team's responsibility.

Guide

Matt S

June 14, 2026 · 9 min read

platform-engineering-ecs · ecs-internal-developer-platform · ecs-developer-self-service

Platform Engineering for ECS Teams: What It Actually Means at 10+ Environments

"Platform engineering" gets used to mean everything from Backstage portals to golden paths to internal tooling teams. For ECS Fargate teams, it means something more specific: closing the gap between what Terraform provisions and what your environments need to operate at scale. At 10 environments the gap is annoying. At 30 it's a full-time job. Here's what that gap looks like and what to do about it.

TL;DR

- Terraform provisions ECS environments. It doesn't operate them: no scheduling, no self-service, no fleet visibility, no cost attribution per environment.
- The "operations gap" opens at ~10 environments and gets worse with every new environment you add.
- Platform engineering for ECS = closing that gap. It doesn't require Backstage, a portal, or a 5-person dedicated team.
- Three things every ECS platform team needs: environment scheduling, developer self-service with scoped access, fleet visibility with cost attribution.
- Build vs buy: custom Lambda + EventBridge scheduling works at 3 environments. At 20 it's a maintenance burden.

What "platform engineering" actually means for an ECS team

Platform engineering solves problems that recur across every service and environment — so developers stop solving them individually. For ECS teams, those problems are operational, not organizational.

The best one-sentence definition comes from a Hacker News thread on the topic: "common problems that your software engineers are having to solve that aren't about the unique value of the system they're building — solved once, for everybody, in a coherent and managed way." That's it. The label doesn't matter.

For an ECS Fargate team, those recurring problems are almost always operational. You have 15 environments. Each one needs to be started in the morning, stopped at night, cloneable for QA, and visible as a fleet. Each developer needs to be able to restart their own environment without asking you. Each environment needs a cost number attached to it.
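One way to picture the "restart their own environment without asking you" requirement is a tag-scoped IAM policy. The sketch below only builds the policy document; it does not call AWS. The tag key `Environment`, the function name, and the choice of `ecs:UpdateService` plus `ecs:DescribeServices` as the "restart" actions are assumptions for illustration, and the condition key should be checked against the IAM service authorization reference for ECS before use.

```python
import json


def scoped_restart_policy(environment: str, tag_key: str = "Environment") -> dict:
    """Build an IAM policy document that allows restarts in one environment.

    Assumptions (not from the article): ECS services carry a resource tag
    such as Environment=<name>, and a "restart" is a forced new deployment
    issued through ecs:UpdateService.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RestartOwnEnvironment",
                "Effect": "Allow",
                "Action": ["ecs:UpdateService", "ecs:DescribeServices"],
                "Resource": "*",
                # Only resources tagged with the developer's environment match.
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": environment}
                },
            }
        ],
    }


# "staging-team-a" is a hypothetical environment name.
print(json.dumps(scoped_restart_policy("staging-team-a"), indent=2))
```

Scoping by tag rather than by ARN means a new environment is covered as soon as it is tagged, without editing one policy per developer.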

What platform engineering does not mean for most ECS teams: a Backstage portal, Score language, landing zones, or cloud account governance. Those are enterprise IDP problems — appropriate for a 200+ engineer org running workloads across AWS, GCP, and Azure with five dedicated platform engineers. For a 50-person company running 20 ECS Fargate environments on Terraform, that's the wrong solution to the wrong problem.

The reframe that makes this concrete: platform engineering for ECS = the operational layer that sits on top of Terraform. Terraform provisions. The platform layer operates.

The operations gap — what Terraform can't do

Terraform provisions ECS infrastructure. It has no concept of "stop this environment at 7pm" or "show me which environments are idle right now." That gap widens with every new environment.

This isn't a criticism of Terraform. IaC is the right tool for provisioning. The problem is that provisioning is only half of the job. Once an environment exists, someone has to operate it — and Terraform has no primitives for that.
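As an illustration of the kind of operating primitive Terraform lacks, here is a minimal sketch of a schedule check. The function name, the 8am start, and the Monday to Friday window are assumptions; the 7pm stop comes from the example above. It only decides whether an environment should be up, and leaves the actual start/stop action (for example, changing a service's desired count) out of scope.

```python
from datetime import datetime, time


def should_be_running(
    now: datetime,
    start: time = time(8, 0),
    stop: time = time(19, 0),
    workdays: frozenset = frozenset({0, 1, 2, 3, 4}),
) -> bool:
    """Return True if a non-prod environment should be up at `now`.

    `now` is assumed to already be in the team's local timezone.
    Weekdays follow datetime.weekday(): Monday=0 ... Sunday=6.
    """
    return now.weekday() in workdays and start <= now.time() < stop


print(should_be_running(datetime(2026, 6, 13, 3, 0)))   # False: 3am on a Saturday
print(should_be_running(datetime(2026, 6, 15, 10, 0)))  # True: 10am on a Monday
```

With these defaults the window is 11 hours on 5 days, which lines up with a roughly 55-hour working week.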

Here's what the operations gap looks like concretely at 10+ ECS environments:

Gap	What it costs	DIY fix (and its price)
Scheduling	Environments run 168 hrs/week; team works ~55	Lambda + EventBridge + CloudWatch cron per environment; 20 separate stacks to maintain at 20 envs
Self-service	Developers open Slack tickets to restart staging on Friday at 6pm	Per-developer IAM policies — updated manually every time a new environment or developer is added
Visibility	No single view of which environments are running, drifted, or healthy	CloudWatch dashboards per environment — manually created, quickly stale
Cost attribution	Cost Explorer shows total Fargate spend, not per-environment cost	Custom cost allocation tags + Cost Explorer grouping — requires consistent tagging across all resources from day one
Orphan detection	$200–$400/month per dead environment nobody shut down	Manual audit — someone opens the console and checks last-used timestamps quarterly
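The scheduling row can be worked through as simple arithmetic. The sketch below assumes cost scales linearly with running hours, which is a simplification: it ignores load balancers, NAT gateways, and anything else that keeps billing while tasks are stopped. The $300/month input is a hypothetical figure, not one from this article.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def idle_share(working_hours_per_week: float) -> float:
    """Fraction of the week an always-on environment runs with nobody using it."""
    return (HOURS_PER_WEEK - working_hours_per_week) / HOURS_PER_WEEK


def monthly_saving(always_on_cost: float, working_hours_per_week: float) -> float:
    """Estimated saving from stopping an environment outside working hours.

    Assumes cost is proportional to running hours (a simplification).
    """
    return always_on_cost * idle_share(working_hours_per_week)


print(f"{idle_share(55):.0%}")            # 67%
print(f"{monthly_saving(300, 55):.2f}")   # 201.79 (hypothetical $300/month environment)
```

At 55 working hours out of 168, about two thirds of an always-on environment's running time is idle.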

The state sprawl problem compounds all of this. At 50 environments, you're looking at roughly 1,500 Terraform resources. A terraform plan across the full fleet takes 4+ minutes. Adding a new environment means working through a checklist of steps rather than running a single command.

None of this is Terraform's fault. These are operations problems. Terraform was never designed to solve them.

For more on the Terraform state sprawl problem at ECS scale, see Managing ECS Fargate with Terraform: What Works and What Doesn't.

What the operational layer actually contains

Closing the operations gap comes down to three capabilities every ECS team at 10+ environments independently discovers it needs: environment scheduling, developer self-service with scoped access, and fleet visibility with cost attribution. Everything else is optional until those three are solved.

We don't re-derive each one here — the full decision framework, the build-vs-buy economics, and what you can skip (you don't need a Backstage-style portal) live in the canonical guide: Do You Need an Internal Developer Platform for AWS ECS?

1. Scheduling: stop non-prod outside business hours; the single highest-leverage move. Mechanics in the scheduling guide, and in the broader best practices for a 10+ environment fleet.
2. Self-service with scoped access: per-environment RBAC so a developer restarts their own staging without you as the bottleneck. Full pattern in staging self-service, the kind of DevOps automation that lives beyond the CI/CD pipeline.
3. Fleet visibility & cost attribution: the per-environment cost that Cost Explorer can't show by default. See why you can't see per-environment costs. Visibility gets harder once the fleet spans a multi-environment strategy or runs across multiple AWS accounts.
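To make the cost attribution item concrete, the sketch below rolls up a Cost Explorer style response into one number per environment. It assumes resources carry an `Environment` cost allocation tag and that the response was grouped by that tag, in which case group keys take the form `Environment$<value>`. The function name and the sample figures are hypothetical; the response here is a hand-written dictionary, not real billing data, and no AWS call is made.

```python
from collections import defaultdict
from decimal import Decimal


def cost_per_environment(response: dict, tag_key: str = "Environment") -> dict:
    """Sum UnblendedCost per tag value across all time periods in the response."""
    totals = defaultdict(Decimal)
    prefix = f"{tag_key}$"
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            key = group["Keys"][0]
            env = key[len(prefix):] if key.startswith(prefix) else key
            amount = Decimal(group["Metrics"]["UnblendedCost"]["Amount"])
            # An empty tag value means the resource was never tagged.
            totals[env or "(untagged)"] += amount
    return dict(totals)


# Hypothetical response covering two periods.
sample = {
    "ResultsByTime": [
        {"Groups": [
            {"Keys": ["Environment$staging"],
             "Metrics": {"UnblendedCost": {"Amount": "10.25", "Unit": "USD"}}},
            {"Keys": ["Environment$"],
             "Metrics": {"UnblendedCost": {"Amount": "4.00", "Unit": "USD"}}},
        ]},
        {"Groups": [
            {"Keys": ["Environment$staging"],
             "Metrics": {"UnblendedCost": {"Amount": "20.25", "Unit": "USD"}}},
        ]},
    ]
}

for env, total in sorted(cost_per_environment(sample).items()):
    print(env, total)
```

The "(untagged)" bucket is worth watching: anything that lands there is spend that cannot be attributed to an environment until tagging is made consistent.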
