The Fleet Audit skill, in full

This is the exact file you download. One markdown file your CLI agent reads. It makes read-only AWS calls (6 ECS reads, plus sts:GetCallerIdentity and ec2:DescribeRegions; an optional ~$0.01 Cost Explorer cross-check) and uploads zero bytes — the IAM policy below is the whole permission surface. Read it, then run it.

1 file · read-only AWS calls · 0 bytes uploaded · 1518 lines

name: fortem-fleet-audit
description: |
  Audits your AWS ECS Fargate fleet: maps all environments, computes modeled monthly cost (compute + shared overhead, from task definitions), and produces a savings estimate with ROI calculation vs. Fortem pricing. Generates three local files:
  - HTML fleet report with ROI callout and conversion path
  - YAML config for Fortem onboarding
  - DIY scheduler with honest limitations breakdown

Read-only AWS access. Nothing transmitted. Runs in 5–15 minutes.

Fortem Fleet Audit Skill

Universal AI agent skill. Read-only AWS access. Nothing leaves your machine. Works with Claude Code, Codex, OpenCode, Gemini CLI, or any CLI agent. Inspired by the Agent Skills open format.

What this skill produces

Three files land on your machine. Open them in this order:

File	What it is	When to read
`fortem-discovery-report.html`	Visual fleet report — your real cluster names, costs, savings estimate	Open in browser first
`fortem-discovery.yaml`	Your fleet mapped to Fortem config format	Bring to a Fortem call
`diy-scheduler.yaml`	A CloudFormation snippet that turns off your dev/staging envs after 7pm weekdays — the limited local build that captures the savings without Fortem	Read to see what "doing it yourself" looks like

Total time: 5–15 minutes. The agent asks 5 questions up front, runs the read-only AWS CLI calls, then asks 5 more about shared infrastructure (Phase 3.5).

Required permissions: 6 read-only ECS calls, plus two auxiliary read-only lookups (sts:GetCallerIdentity, ec2:DescribeRegions). No Pricing API access needed (the skill uses a curated rates table). Nothing is created, modified, or deleted. The only call that costs anything is the optional Cost Explorer cross-check (ce:GetCostAndUsage, ~$0.01/call) — it runs only if your credentials already permit it, and is skipped silently otherwise.

AWS access — how it works (read this if you're not sure)

This skill reads your ECS fleet by calling the AWS CLI on your own machine, using your own credentials. It makes 6 read-only ECS calls (ecs:List* / ecs:Describe* — listed below), plus two auxiliary read-only lookups (sts:GetCallerIdentity to confirm the account, ec2:DescribeRegions to find active regions). It never creates, updates, or deletes anything, and your credentials never leave your machine — Fortem never sees them. The skill talks only to the AWS API, never to Fortem.

You give the skill access in one of two ways. Use the first.

1. Recommended: a named AWS profile

If you already run AWS locally you almost certainly have a profile — use it. If not, set one up once:

aws configure --profile fortem-audit
# paste your AWS Access Key ID, Secret Access Key, and default region when prompted

This writes the credentials to ~/.aws/credentials on your machine (persists across sessions, reusable, nothing transmitted). Then point the skill at it:

export AWS_PROFILE=fortem-audit

The skill also asks which profile to use in Phase 2, so you can just name it there. If you have several accounts, run the skill once per profile.

2. Fallback: paste keys for this session only

If you'd rather not configure a profile, export the credentials as environment variables before running. They live only in the current shell session and are used only to call the AWS API:

export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...        # only if you're using temporary STS credentials; otherwise omit
export AWS_DEFAULT_REGION=us-east-1

The AWS CLI picks these up automatically — no profile needed. Close the terminal and they're gone.

Permissions

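Either way, the credentials need only the read-only access in the policy below: a single statement with the six ECS actions (ecs:List* / ecs:Describe*) plus the two auxiliary lookups (sts:GetCallerIdentity, ec2:DescribeRegions). No write access of any kind. Review it before applying.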

No credentials at all?

If neither a profile nor keys are configured, the skill does NOT dead-end — it automatically runs in test mode (Phase 0) on example data so you can see the full output first, then come back with real access.

Read-only IAM policy (copy-paste)

{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": ["ecs:ListClusters", "ecs:DescribeClusters", "ecs:ListServices", "ecs:DescribeServices", "ecs:DescribeTaskDefinition", "ecs:ListTagsForResource", "ec2:DescribeRegions", "sts:GetCallerIdentity"], "Resource": "*" }
  ]
}

Optional — real billed cost cross-check. If you also want the audit to show your actual AWS-billed ECS spend (from Cost Explorer) next to the modeled number, add one more read-only action: ce:GetCostAndUsage. The skill probes for it and silently skips if it's absent — it's never required, and Cost Explorer bills $0.01 per call. Most read-only ECS roles don't include it; add it only if you want the real-vs-modeled comparison.

{ "Effect": "Allow", "Action": ["ce:GetCostAndUsage"], "Resource": "*" }

Terraform / CloudFormation to create the role

If you don't already have a read-only ECS role, here's the smallest setup. Save as iam-readonly-ecs.tf and apply:

resource "aws_iam_role" "fortem_discovery" {
  name = "fortem-discovery-readonly"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = { AWS = "arn:aws:iam::YOUR_ACCOUNT_ID:user/YOUR_IAM_USER" }
      Action = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy_attachment" "readonly_ecs" {
  role       = aws_iam_role.fortem_discovery.name
  policy_arn = aws_iam_policy.readonly_ecs.arn
}

resource "aws_iam_policy" "readonly_ecs" {
  name = "fortem-discovery-readonly-ecs"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "ecs:ListClusters", "ecs:DescribeClusters",
        "ecs:ListServices", "ecs:DescribeServices",
        "ecs:DescribeTaskDefinition",
        "ecs:ListTagsForResource",
        "ec2:DescribeRegions", "sts:GetCallerIdentity"
      ]
      Resource = "*"
    }]
  })
}

CloudFormation equivalent (save as iam-readonly-ecs.yaml):

AWSTemplateFormatVersion: "2010-09-09"
Resources:
  FortemDiscoveryRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: fortem-discovery-readonly
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal: { AWS: "arn:aws:iam::YOUR_ACCOUNT_ID:user/YOUR_IAM_USER" }
            Action: sts:AssumeRole
      Policies:
        - PolicyName: readonly-ecs
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - ecs:ListClusters
                  - ecs:DescribeClusters
                  - ecs:ListServices
                  - ecs:DescribeServices
                  - ecs:DescribeTaskDefinition
                  - ecs:ListTagsForResource
                  - ec2:DescribeRegions
                  - sts:GetCallerIdentity
                Resource: "*"

Multi-account via multiple profiles (optional)

If you want to scan more than one AWS account, run the skill once per account with a different AWS_PROFILE each time. The skill writes fortem-discovery.yaml and fortem-discovery-report.html to the current directory — if you run from the same directory each time, the second run overwrites the first. Workaround: run each scan from a separate subdirectory (mkdir fortem-prod && cd fortem-prod && AWS_PROFILE=prod ...), then merge the YAMLs manually.

This skill does not support sts:AssumeRole across accounts in a single run. Use the multi-profile workflow above.

Step 0 — Cold-start preflight (run this first, always)

You were probably invoked by a pasted prompt that pointed at this URL, with no other setup. Before anything else, run a silent preflight so the first run never dead-ends:

Is the AWS CLI installed? (command -v aws)
Are credentials resolvable? (aws sts get-caller-identity succeeds — via a profile or env vars)

If both pass: continue to Phase 1, then ask the Phase 2 questions, then run the real scan.
If either fails (no aws CLI, or no credentials): do NOT stop and do NOT block on questions. Automatically set FORTEM_TEST_MODE=1 and produce the full example report so the user sees exactly what they'd get. Tell them clearly: "No AWS credentials found — showing example output. To scan your real fleet, set up access (see 'AWS access' above) and run me again."

This guarantees a pasted, zero-setup run always ends in a report, never an error.
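
The preflight decision above can be sketched as a small function. This is a minimal illustration, not the skill's actual implementation: the name `choose_mode` and the injectable `which` / `run` parameters are assumptions made here so the logic can be exercised without a real AWS CLI.

```python
import shutil
import subprocess


def choose_mode(which=shutil.which, run=subprocess.run) -> str:
    """Return 'real' when the AWS CLI and credentials both resolve, else 'test'."""
    if which("aws") is None:
        return "test"  # no AWS CLI on PATH
    try:
        # Same check as the preflight: does sts get-caller-identity succeed?
        result = run(
            ["aws", "sts", "get-caller-identity"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return "test"  # CLI could not be executed or timed out
    return "real" if result.returncode == 0 else "test"
```

A caller would set FORTEM_TEST_MODE=1 whenever this returns "test", so the run still ends in a report.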

Phase 0 — Test mode (no AWS account required)

Test mode runs the skill against example data, with no AWS calls at all. It's triggered two ways: automatically by Step 0 when no credentials are found, or manually when a user wants to preview:

export FORTEM_TEST_MODE=1

When set, the skill skips AWS entirely and uses 3 example environments (14 total envs across 2 regions, ~$4,200/mo). The HTML report and YAML still generate normally. The DIY scheduler is identical to what a real run produces.

Use test mode for:

First-time exploration of what the skill does
Demos in a sales conversation
Validating the HTML template renders correctly
When a user just wants to see "what would this look like for a typical fleet"

When FORTEM_TEST_MODE=1 is not set but credentials also can't be resolved, Step 0 has already switched to test mode automatically — generate the example report and tell the user how to come back with real access. Never fail, never silently stop.

Phase 1 — Research (silent, no user interaction)

Before asking the user anything, briefly ground yourself. Use your training data or one web search if needed:

AWS ECS data model: clusters contain services, services have task definitions, task definitions define containers with vCPU + memory
Common naming conventions: {region}-{account}-{env} (e.g. use1-prod-main), {env}-{name}, {name}-{env}
Fargate pricing is per-region: in the Phase 3a table, us-east-1, us-east-2, and us-west-2 are the cheapest, and the other listed regions run roughly 10–32% higher. Phase 3a looks up the right rates for the current region in a curated table sourced from https://aws.amazon.com/fargate/pricing/ (verified May 2026). Don't hardcode us-east-1 constants.
Fargate Spot: modeled as a flat ~70% discount off on-demand in every region (AWS advertises up to 70%; there is no spot price API for Fargate, unlike EC2). The 0.30 multiplier used later is that discount, not a magic number.
Tagging conventions: Environment, Stage, Team, Project, Tier, Namespace
Shared services overhead: Fargate compute is only ~60% of real env cost. ALB (~$22/mo), NAT Gateway (~$33-66/mo per AZ), CloudWatch Logs ($0.50/GB), ECR storage ($0.10/GB), and EFS ($0.30/GB) add a fixed overhead that tags cannot attribute. Phase 3.5 estimates this and adds it to each env's total cost. At typical Fortem customer defaults (NAT per env × 2 AZs, mixed ALB, 50 GB CW, shared ECR), overhead is ~$84/env.

You do not need to read this back to the user. Just have it in mind.

Phase 2 — Ask the user (5 questions, all at once)

Ask these together. Do not ask one at a time. The user can answer them in any order.

Before I scan your AWS account, five quick questions:

1. Which AWS profile should I use?
   (default: $AWS_PROFILE env var, then 'default' profile)
   If you don't know which one, run: aws configure list-profiles

2. How many AWS accounts should I scan?
   (default: 1, using the profile from Q1; for multiple accounts,
   run this skill once per account with a different profile)

3. Which regions? (default: every region where ECS is enabled)

4. Do you have Terraform locally?
   If yes, what's the path? (optional, but it enriches the output)

5. Which environments are production? Anything not in this list
   gets scheduling suggestions.

What you infer without asking:

Default region from aws configure get region --profile "$PROFILE"
Default timezone from the user's machine

Do not ask:

AWS permissions (they're documented in the "AWS access — how it works" section above — the user reviews them)
File format preferences (YAML + HTML are standard)
Whether to use the skill at all (they already chose)

Storing the profile for the rest of the skill: Once the user answers Q1, store the value in a shell variable and use it in every AWS CLI call:

PROFILE="<answer from Q1>"   # e.g. "dimas-fortem-prod", "default", or "" for default
[[ -z "$PROFILE" ]] && PROFILE_ARG="" || PROFILE_ARG="--profile $PROFILE"

Every subsequent aws command in this skill uses $PROFILE_ARG. The HTML report records which profile was used (so the report is auditable).

Phase 3 — AWS discovery

Every aws command in this phase includes $PROFILE_ARG from Phase 2. If $PROFILE_ARG is empty (default profile), omit it. Commands are shown with $PROFILE_ARG as a literal — substitute the value when running.

3a. Fetch Fargate pricing for the current region (real rates, not hardcoded us-east-1)

Fargate rates vary by region (in the table below, up to roughly 32% above the cheapest US regions). This skill uses per-region rates from the AWS pricing page, verified monthly. The table below is the source of truth; update it when AWS publishes a new rate change (rare).

# Detect the region we're scanning
PROFILE_REGION=$(aws configure get region $PROFILE_ARG)

# Per-region Fargate on-demand rates (Linux/x86, USD per hour)
# Source: https://aws.amazon.com/fargate/pricing/ — verified May 2026
# Format: "region": "vCPU_RATE GB_RATE"
# Note: associative arrays need bash 4+ (the stock macOS /bin/bash is 3.2).
declare -A FARGATE_RATES=(
  ["us-east-1"]="0.04048 0.004445"
  ["us-east-2"]="0.04048 0.004445"
  ["us-west-1"]="0.04576 0.005013"
  ["us-west-2"]="0.04048 0.004445"
  ["eu-west-1"]="0.04576 0.005013"
  ["eu-west-2"]="0.04576 0.005013"
  ["eu-west-3"]="0.04656 0.005101"
  ["eu-central-1"]="0.04656 0.005101"
  ["eu-central-2"]="0.04656 0.005101"
  ["eu-north-1"]="0.04516 0.004948"
  ["eu-south-1"]="0.04576 0.005013"
  ["ca-central-1"]="0.04576 0.005013"
  ["ca-west-1"]="0.04576 0.005013"
  ["ap-southeast-1"]="0.04456 0.004880"
  ["ap-southeast-2"]="0.04864 0.005330"
  ["ap-southeast-3"]="0.04576 0.005013"
  ["ap-southeast-4"]="0.04576 0.005013"
  ["ap-southeast-5"]="0.04656 0.005101"
  ["ap-northeast-1"]="0.04576 0.005013"
  ["ap-northeast-2"]="0.04864 0.005330"
  ["ap-northeast-3"]="0.04864 0.005330"
  ["ap-south-1"]="0.04456 0.004880"
  ["ap-south-2"]="0.04576 0.005013"
  ["ap-east-1"]="0.04864 0.005330"
  ["ap-east-2"]="0.04456 0.004880"
  ["sa-east-1"]="0.05328 0.005837"
  ["me-south-1"]="0.04864 0.005330"
  ["me-central-1"]="0.04576 0.005013"
  ["il-central-1"]="0.04864 0.005330"
  ["af-south-1"]="0.04932 0.005404"
  ["mx-central-1"]="0.04576 0.005013"
)

# Look up the rates for the current region
# (the :-unset guard avoids a "bad array subscript" error when no region is configured)
RATES="${FARGATE_RATES[${PROFILE_REGION:-unset}]:-}"

if [ -n "$RATES" ]; then
  VCPU_RATE=$(echo "$RATES" | awk '{print $1}')
  MEM_RATE=$(echo "$RATES" | awk '{print $2}')
  REGION_HAS_KNOWN_RATES=true
else
  # Fallback: us-east-1 with a warning
  VCPU_RATE=0.04048
  MEM_RATE=0.004445
  REGION_HAS_KNOWN_RATES=false
  echo "⚠ WARNING: region '$PROFILE_REGION' not in Fargate rates table. Using us-east-1 rates as fallback. Update skill to add this region." >&2
fi

Why a curated table instead of Pricing API? The Pricing API works for ECS Compute products but its Fargate usagetype filter is inconsistent across regions (some have regional prefixes like USE2-, others don't), and pagination across thousands of products is awkward in a bash script. The table above is sourced from the same place (https://aws.amazon.com/fargate/pricing/) — it's just pre-parsed. Verified monthly.

If AWS changes Fargate pricing (rare): update the values in the array. Add a new region by appending a new line.

3c. List active ECS regions

REGIONS=$(aws ec2 describe-regions $PROFILE_ARG --query 'Regions[?!starts_with(RegionName, `cn-`)].RegionName' --output text)

If FORTEM_TEST_MODE=1, skip this — use us-east-1 and us-west-2 as the test regions.

3d. List clusters per region

for region in $REGIONS; do
  aws ecs list-clusters \
    $PROFILE_ARG \
    --region "$region" \
    --query 'clusterArns[]' \
    --output text
done

If a region returns throttling (ThrottlingException):

Wait 2 seconds and retry
If it persists, skip the region and note: Region <region> skipped due to throttling. Retry later.
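
The retry-then-skip rule above could look like the following sketch. The `Throttled` exception and the `call` parameter are hypothetical stand-ins for whatever wrapper detects a ThrottlingException from the CLI; only the 2-second wait and the skip message come from this skill.

```python
import time


class Throttled(Exception):
    """Hypothetical marker raised by a wrapper when AWS returns ThrottlingException."""


def list_with_retry(call, region, retries=1, wait_seconds=2.0, sleep=time.sleep):
    """Return (result, note). On persistent throttling, result is None and note explains the skip."""
    for attempt in range(retries + 1):
        try:
            return call(region), None
        except Throttled:
            if attempt < retries:
                sleep(wait_seconds)  # wait 2 seconds, then retry
    # Still throttled after the retry: skip the region and record why
    return None, f"Region {region} skipped due to throttling. Retry later."
```

Returning the note instead of raising keeps the scan moving through the remaining regions, and the note can be carried into the report.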

3e. Describe clusters (metadata + tags)

aws ecs describe-clusters \
  $PROFILE_ARG \
  --region "$region" \
  --clusters <cluster-arn-1> <cluster-arn-2> ... \
  --include TAGS \
  --query 'clusters[].{name:clusterName,arn:clusterArn,status:status,registeredAt:registeredAt,tags:tags}'

3f. List services per cluster

aws ecs list-services \
  $PROFILE_ARG \
  --region "$region" \
  --cluster <cluster-name> \
  --max-items 100 \
  --query 'serviceArns[]' \
  --output text

Paginate with --starting-token (pass the NextToken value returned by the previous call) if there are more than 100 services (rare for non-prod).

3g. Describe services (capacity, task def, load balancers)

aws ecs describe-services \
  $PROFILE_ARG \
  --region "$region" \
  --cluster <cluster-name> \
  --services <svc1> <svc2> ... \
  --query 'services[].{name:serviceName,desired:desiredCount,running:runningCount,taskDef:taskDefinition,launchType:launchType,platformVersion:platformVersion,deployments:deployments[].{status:status,rolloutState:rolloutState}}'
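
One practical detail when running the command above: the ECS DescribeServices API accepts at most 10 services per call, so a cluster with more services needs several calls. A minimal batching sketch (the helper name `batches` is illustrative, not part of the skill):

```python
from typing import Iterator, List


def batches(items: List[str], size: int = 10) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items (10 = DescribeServices limit)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Example: 23 service names become three describe-services calls
service_names = [f"svc-{i}" for i in range(23)]
for group in batches(service_names):
    print(len(group), group[0], "...", group[-1])
```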

3h. Describe task definitions (CPU + memory, the cost drivers)

aws ecs describe-task-definition \
  $PROFILE_ARG \
  --region "$region" \
  --task-definition <family>:<revision> \
  --query 'taskDefinition.{family:family,cpu:cpu,memory:memory,containerDefs:containerDefinitions[].{name:name,cpu:cpu,memory:memory,image:image,envKeys:environment[].name}}'

Critical: only extract environment variable NAMES, never values. Secret leakage is the #1 risk in this skill.
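
As a defensive second layer on top of the --query filter, any raw task-definition JSON that reaches memory can be reduced to names only before it is stored or written. A minimal sketch, assuming the container definitions arrive as the list of dicts the ECS API returns (the function name `env_names_only` is illustrative):

```python
def env_names_only(container_defs):
    """Keep sizing fields and environment variable NAMES; drop every value."""
    cleaned = []
    for container in container_defs:
        env = container.get("environment") or []
        cleaned.append({
            "name": container.get("name"),
            "cpu": container.get("cpu"),
            "memory": container.get("memory"),
            "image": container.get("image"),
            # Only the variable name survives; the value is never copied
            "envKeys": [entry["name"] for entry in env if "name" in entry],
        })
    return cleaned
```

Because the output is built from an allow-list of fields, anything not named here (including the `secrets` block) is dropped by construction.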

3i. Cost calculation per service

Use the per-region rates fetched in Phase 3a ($VCPU_RATE and $MEM_RATE shell variables), not the hardcoded us-east-1 constants. After computing per-service compute cost, add the per-env shared overhead from Phase 3.5 to get the env's total.

def monthly_cost(cpu_units: int, memory_mib: int, schedule: str = "24-7") -> float:
    """schedule: '24-7', 'weekdays-9-19', 'weekdays-8-20', 'weekends-off'
    $VCPU_RATE and $MEM_RATE are substituted from Phase 3a — they vary by region.
    """
    vcpu = cpu_units / 1024  # ECS expresses CPU in units where 1024 = 1 vCPU
    gb = memory_mib / 1024
    # Rates are substituted from shell vars at skill-invocation time
    base_per_hour = vcpu * float("${VCPU_RATE}") + gb * float("${MEM_RATE}")

    hours_per_month = 730
    if schedule == "weekdays-9-19":
        hours_per_month = 50 * 4.345  # 50 work-hours per week
    elif schedule == "weekdays-8-20":
        hours_per_month = 60 * 4.345
    elif schedule == "weekends-off":
        hours_per_month = 730 * 5 / 7  # ~29% off (2 of 7 days)
    # else 24-7

    return base_per_hour * hours_per_month

Per service, sum across all running tasks. Per environment, sum across services. Then add per-env shared overhead:

# $SHARED_OVERHEAD_PER_ENV is computed in Phase 3.5b and substituted
env_compute_cost = sum(services_costs)
env_total_cost = env_compute_cost + float("${SHARED_OVERHEAD_PER_ENV}")

The YAML estimated_cost_mo field = compute only. The YAML estimated_total_cost_mo field = compute + shared overhead. The HTML report's summary cards show total (compute + overhead), not just compute.

Be precise about what this number is. Per-environment cost is modeled — computed from each service's task-definition vCPU/memory × the current Fargate rates, plus an estimate of shared overhead. It is NOT pulled from your AWS bill. It's typically within ~10–15% on the Fargate compute line, but it cannot see Savings Plans, RIs, data transfer, or anything cost allocation tags don't capture. Always label it "modeled" in the report and YAML — never "your AWS bill."
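
A worked example of the modeled number may help. The sketch below restates the formula with the rates passed as plain arguments (rather than substituted from shell variables) and multiplies by the running task count; the function name and the example service are illustrative assumptions.

```python
# Hours per month for each schedule, using the same constants as the formula above
HOURS_PER_MONTH = {
    "24-7": 730.0,
    "weekdays-9-19": 50 * 4.345,
    "weekdays-8-20": 60 * 4.345,
    "weekends-off": 730 * 5 / 7,
}


def service_monthly_cost(cpu_units, memory_mib, running_tasks,
                         vcpu_rate, mem_rate, schedule="24-7"):
    """Modeled monthly Fargate compute cost for one service (USD)."""
    vcpu = int(cpu_units) / 1024   # task definitions report cpu/memory as strings
    gb = int(memory_mib) / 1024
    per_task_hour = vcpu * vcpu_rate + gb * mem_rate
    return per_task_hour * HOURS_PER_MONTH[schedule] * running_tasks


# Hypothetical service: 2 tasks of 0.5 vCPU / 1 GB at the us-east-1 table rates
always_on = service_monthly_cost("512", "1024", 2, 0.04048, 0.004445)
scheduled = service_monthly_cost("512", "1024", 2, 0.04048, 0.004445, "weekdays-9-19")
print(round(always_on, 2), round(scheduled, 2))  # about 36.04 and 10.73
```

The gap between the two figures is the modeled scheduling saving for that service, before any shared overhead is added.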

3i-bis. Real billed cost cross-check via Cost Explorer (optional, only if access exists)

This is the one place we can show a real number from AWS billing — but only if the user's credentials happen to have Cost Explorer read access. Treat it as a best-effort enrichment, never a requirement.

Make exactly ONE call (this same call is both the access probe and the data fetch — there is no separate free probe; if it succeeds you have the data, if it 403s you skip). Most read-only ECS roles do NOT include CE, so expect this to be skipped often:

# Cost Explorer is a billing API: $0.01 per GetCostAndUsage request, us-east-1 endpoint only.
# Portable 30-days-ago date — BSD/macOS (-v) first, GNU/Linux (-d) fallback.
START=$(date -u -v-30d +%Y-%m-%d 2>/dev/null || date -u -d '30 days ago' +%Y-%m-%d)
END=$(date -u +%Y-%m-%d)
aws ce get-cost-and-usage \
  $PROFILE_ARG \
  --time-period "Start=$START,End=$END" \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --filter '{"Dimensions":{"Key":"SERVICE","Values":["Amazon Elastic Container Service"]}}' \
  --region us-east-1 2>/dev/null

If it succeeds: show the real billed ECS total for the last 30 days alongside the modeled total in the report — labelled "Actual (AWS Cost Explorer)" vs "Modeled (this audit)". (If the account mixes Fargate and EC2 launch types, this CE total covers both, so note it as the full ECS service total.) The gap between actual and modeled is itself a useful signal (Savings Plans, data transfer, untagged spend). Note CE data lags up to 24h and excludes today.
If per-environment attribution is wanted AND cost allocation tags are active: add --group-by Type=TAG,Key=<env tag> to the SAME call (don't make a second one) to break the real total down per environment. If tags aren't activated (the common case — it's the very pain this audit exists for), say so plainly: "Cost Explorer can show your real total, but not per-environment, until you activate cost allocation tags. The per-env split below is modeled."
If access is denied or ce isn't permitted: skip silently. Do NOT error, do NOT ask the user to grant billing access. The modeled number stands on its own; CE is a bonus when it's already there.

Cost & honesty note: the single GetCostAndUsage call bills $0.01 to the user's account. Make exactly one call (probe and fetch are the same request). Because this is the only non-free, non-ECS call in the skill, never run it in FORTEM_TEST_MODE and never run it without the access being already present.
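
When the call succeeds, the billed total still has to be summed out of the response. With MONTHLY granularity, a 30-day window usually spans two calendar months, so expect more than one entry in ResultsByTime. A minimal parsing sketch over the standard GetCostAndUsage response shape (the helper name `billed_total` is an assumption):

```python
def billed_total(ce_response: dict, metric: str = "UnblendedCost") -> float:
    """Sum a Cost Explorer metric across every time bucket in the response."""
    total = 0.0
    for period in ce_response.get("ResultsByTime", []):
        groups = period.get("Groups") or []
        if groups:
            # Grouped call (--group-by): amounts live under each group's Metrics
            total += sum(float(g["Metrics"][metric]["Amount"]) for g in groups)
        else:
            # Ungrouped call: the amount lives under Total
            total += float(period["Total"][metric]["Amount"])
    return total
```

Amounts arrive as strings in the response, hence the float() conversion.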

3j. Fargate Spot (optional but valuable)

If any service uses capacityProviderStrategy with FARGATE_SPOT, apply the Spot discount. This skill models Fargate Spot as a flat ~70% discount off the same region's on-demand rate (AWS advertises up to 70%), because there is no endpoint to query a live Fargate Spot price as there is for EC2 Spot. So cost = on_demand_cost × 0.30 (i.e., 70% off).

aws ecs describe-services \
  $PROFILE_ARG \
  --region "$region" \
  --cluster <cluster-name> \
  --services <svc> \
  --query 'services[].capacityProviderStrategy'
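
Applying the discount to one service's result could be sketched as below. The function names are illustrative; the rule itself ("any FARGATE_SPOT entry means the whole service is priced at 0.30 × on-demand") is the simplification stated above, so a service that mixes FARGATE and FARGATE_SPOT weights would be under-costed by this sketch.

```python
def uses_fargate_spot(strategy) -> bool:
    """True if any capacity provider entry for the service is FARGATE_SPOT."""
    return any(item.get("capacityProvider") == "FARGATE_SPOT"
               for item in (strategy or []))


def apply_spot(on_demand_cost: float, strategy, spot_multiplier: float = 0.30) -> float:
    """Scale the modeled on-demand cost when the service runs on Fargate Spot."""
    if uses_fargate_spot(strategy):
        return on_demand_cost * spot_multiplier
    return on_demand_cost
```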

3k. Tag enrichment

For each cluster and service, list tags and prioritize:

Environment / Stage / Tier — for stage inference
Team / Owner / Project — for grouping
CostCenter / Namespace — for cost allocation

aws ecs list-tags-for-resource $PROFILE_ARG --resource-arn <arn> --region "$region"

3l. Fleet health signals (compute from data already collected — no extra AWS calls)

The cost number is the hook. Fleet sprawl is the part a cron job can't see — and it's why this audit exists. From the cluster/service/tag data you already pulled in 3c–3k, compute three signals. No new IAM permissions, no new API calls — these are derived from what's already in memory.

Owner-less environments — count environments with no Owner / Team / Project tag (the keys from 3k). Report X of N environments have no owner tag. This is the "whose staging is this?" pain made countable.
Sprawl / recent growth — using cluster registeredAt (from 3e), count how many environments were created in the last 90 days. Report Y environments created in the last quarter. Flag clusters with registeredAt older than 90 days AND runningCount 0 across all services as likely stale (Z environments registered long ago, nothing running).
Idle-but-running zombies — count environments where every service has runningCount > 0 but the env is a non-prod stage (from Phase 5 mapping) with no recent deployment activity (all deployments[].rolloutState == COMPLETED, none IN_PROGRESS). These are paid-for, on, and untouched.

Keep the framing honest and dry: these are signals derived from tags and metadata, not a verdict. If tags are clean and the fleet is small, say so — No owner-less environments found. Tagging is in good shape. Never invent a problem that isn't in the data.

Store the three counts for the report (OWNERLESS_COUNT, SPRAWL_COUNT, STALE_COUNT) and the Phase 10 terminal summary.
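
The signals above can be derived in a single pass over the data already in memory. The sketch below assumes a hypothetical per-environment dict (keys `stage`, `tags`, `registered_at`, `services`) and adds a `ZOMBIE_COUNT` key that is not named in this skill; treat both as illustration.

```python
from datetime import datetime, timedelta, timezone

OWNER_KEYS = {"Owner", "Team", "Project"}  # ownership tag keys from 3k


def fleet_signals(envs, now=None, window_days=90):
    """Count owner-less, recently created, stale, and idle-but-running environments."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=window_days)
    ownerless = sprawl = stale = zombies = 0
    for env in envs:
        services = env.get("services", [])
        registered = env.get("registered_at")
        if not OWNER_KEYS & set(env.get("tags", {})):
            ownerless += 1
        if registered is not None and registered >= cutoff:
            sprawl += 1  # created within the window
        if (registered is not None and registered < cutoff
                and all(s["running"] == 0 for s in services)):
            stale += 1  # registered long ago, nothing running
        settled = all(state == "COMPLETED"
                      for s in services for state in s["rollout_states"])
        if (env.get("stage") != "prod" and services
                and all(s["running"] > 0 for s in services) and settled):
            zombies += 1  # non-prod, fully running, no rollout in progress
    return {"OWNERLESS_COUNT": ownerless, "SPRAWL_COUNT": sprawl,
            "STALE_COUNT": stale, "ZOMBIE_COUNT": zombies}
```

If every count comes back zero, report that plainly rather than stretching the thresholds.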

Phase 3.5 — Shared services cost (asks user, computes fixed overhead)

Fargate compute is only ~60% of a real environment's monthly cost. The rest is shared infrastructure that cost allocation tags cannot attribute to a specific environment: Application Load Balancers, NAT Gateways, CloudWatch Logs, ECR storage, EFS/FSx.

The skill now estimates this fixed overhead and adds it to each environment's total. The estimate is rough by design (it's better than ignoring 40% of the cost) and configurable per AWS account architecture.

3.5a. Ask 5 questions about shared infrastructure

Ask these after Phase 3 discovery is complete, so the user can see the env list and answer more accurately. The skill has just printed how many envs it found — reference that count.

Now that I've found N envs, a few more questions about shared infra
so I can include the overhead costs (ALB, NAT, CloudWatch, ECR).
Defaults assume "I don't track this" if you skip.

1. ALB pattern:
   (1) One shared ALB for all envs       (~$22 / N envs per month)
   (2) One ALB per env                    (~$22 per env, typical prod-grade)
   (3) Mixed: shared for non-prod, dedicated for prod
   (4) I don't track this — use defaults

2. NAT Gateway pattern:
   (1) One shared NAT, all envs route through it   (~$33 × AZs / N envs)
   (2) One NAT per env (typical for VPC-per-env)   (~$66 per env at 2 AZs)
   (3) Mixed
   (4) I don't track this — use defaults
   (Default: "per env × 2 AZs" — the most common pattern for teams
   with 10+ envs. This is the biggest line item.)

3. CloudWatch Logs ingest per month, rough estimate (in GB):
   (Free text, or "unknown" → use 50 GB/mo default, ~$25 / N envs)

4. ECR repos:
   (1) All shared (one repo per service, all envs pull same image)  (~$5/env)
   (2) Per-env (each env has its own image)                          (~$10/env)
   (3) I don't track this — use defaults

5. EFS or FSx file systems?
   (1) None
   (2) Yes, shared (~$10-30 / N envs, depends on size)
   (3) I don't track this — use defaults

3.5b. Compute per-env overhead

After the user answers, compute a per-env fixed overhead. Use this in the per-env cost calculation (Phase 3i) and in the HTML report's "Shared services" section.

# Per-component rates (us-east-1, May 2026)
ALB_BASE=22                  # $ per ALB per month (base + LCU variable, ignore LCU)
NAT_PER_AZ=33               # $ per NAT Gateway per AZ
CW_LOGS_PRICE=0.50           # $ per GB ingested
ECR_PER_ENV_SHARED=5         # $ per env for shared ECR
ECR_PER_ENV_DEDICATED=10     # $ per env for dedicated ECR
EFS_PER_ENV=15               # $ rough per env for shared EFS/FSx

# Compute per-env overhead for each component
case "$ALB_PATTERN" in
  shared)     ALB_PER_ENV=$(echo "scale=2; $ALB_BASE / $N_ENVS" | bc) ;;
  per_env)    ALB_PER_ENV=$ALB_BASE ;;
  mixed)      ALB_PER_ENV=$(echo "scale=2; $ALB_BASE * 0.5 / $N_ENVS + $ALB_BASE * 0.5" | bc) ;;
  *)          ALB_PER_ENV=$(echo "scale=2; $ALB_BASE * 0.5 / $N_ENVS + $ALB_BASE * 0.5" | bc) ;;
esac

case "$NAT_PATTERN" in
  shared)     NAT_PER_ENV=$(echo "scale=2; $NAT_PER_AZ * 2 / $N_ENVS" | bc) ;;
  per_env)    NAT_PER_ENV=$(echo "scale=2; $NAT_PER_AZ * 2" | bc) ;;
  mixed)      NAT_PER_ENV=$NAT_PER_AZ ;;  # 1 AZ average
  *)          NAT_PER_ENV=$(echo "scale=2; $NAT_PER_AZ * 2" | bc) ;;  # default per_env
esac

CW_PER_ENV=$(echo "scale=2; $CW_LOGS_GB * $CW_LOGS_PRICE / $N_ENVS" | bc)

case "$ECR_PATTERN" in
  shared)         ECR_PER_ENV=$ECR_PER_ENV_SHARED ;;
  per_env)        ECR_PER_ENV=$ECR_PER_ENV_DEDICATED ;;
  *)              ECR_PER_ENV=$ECR_PER_ENV_SHARED ;;
esac

case "$EFS_USED" in
  true)   EFS_PER_ENV=$EFS_PER_ENV ;;
  false)  EFS_PER_ENV=0 ;;
  *)      EFS_PER_ENV=0 ;;
esac

# Sum the per-env overhead
SHARED_OVERHEAD_PER_ENV=$(echo "scale=2; $ALB_PER_ENV + $NAT_PER_ENV + $CW_PER_ENV + $ECR_PER_ENV + $EFS_PER_ENV" | bc)
TOTAL_SHARED_OVERHEAD_MO=$(echo "scale=2; $SHARED_OVERHEAD_PER_ENV * $N_ENVS" | bc)

Default overhead for a typical Fortem customer (one NAT per env across 2 AZs, mixed ALB, 50 GB of CloudWatch Logs, shared ECR, no EFS):

ALB: ~$11/env (half shared, half dedicated)
NAT: $66/env (per env, 2 AZs — the biggest line)
CloudWatch: ~$2/env (50 GB / 14 envs × $0.50)
ECR: $5/env
Total: ~$84/env shared overhead

This is added to each env's estimated_cost_mo to produce estimated_total_cost_mo (compute + shared overhead). The HTML report's "Shared services" section shows the breakdown.
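
To make the arithmetic concrete, here is the same per-env overhead computation as a Python sketch, using the component rates and pattern names from the bash above. The function name and return shape are assumptions. Unrounded, the defaults for 14 envs come to about $84.57, in line with the ~$84/env quoted above (the bash version truncates each division to two decimals).

```python
ALB_BASE = 22.0      # $ per ALB per month
NAT_PER_AZ = 33.0    # $ per NAT Gateway per AZ
CW_LOGS_PRICE = 0.50 # $ per GB ingested


def per_env_overhead(n_envs, alb="mixed", nat="per_env",
                     cw_logs_gb=50.0, ecr="shared", efs_used=False):
    """Return the per-env shared overhead breakdown in USD per month."""
    if alb == "shared":
        alb_cost = ALB_BASE / n_envs
    elif alb == "per_env":
        alb_cost = ALB_BASE
    else:  # mixed, or "I don't track this"
        alb_cost = ALB_BASE * 0.5 / n_envs + ALB_BASE * 0.5

    if nat == "shared":
        nat_cost = NAT_PER_AZ * 2 / n_envs
    elif nat == "mixed":
        nat_cost = NAT_PER_AZ  # 1 AZ average
    else:  # per_env, or default
        nat_cost = NAT_PER_AZ * 2

    parts = {
        "alb": alb_cost,
        "nat": nat_cost,
        "cloudwatch": cw_logs_gb * CW_LOGS_PRICE / n_envs,
        "ecr": 10.0 if ecr == "per_env" else 5.0,
        "efs": 15.0 if efs_used else 0.0,
    }
    parts["total"] = sum(parts.values())
    return parts


breakdown = per_env_overhead(14)
print({k: round(v, 2) for k, v in breakdown.items()})
```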

Phase 4 — Terraform enrichment (optional)

If the user provided a Terraform path:

Find all aws_ecs_cluster resources — map name to cluster ARN
Find all aws_ecs_service resources — extract cluster reference and name for cross-reference
Find locals or variables.tf with environment / stage definitions — these are gold for environment mapping
Look for any module "ecs_environment" or module "service" patterns

Use Terraform to enrich, not replace. The AWS API is the source of truth; Terraform tells you what the env should be.

If the path is invalid or unreadable, skip silently and note in the report: "Terraform path not readable — using AWS-only discovery."
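
A rough way to locate the ECS resources in .tf files is a regular-expression scan over the file text. This is only a sketch: it matches resource declarations, does not parse HCL, and will miss resources created inside modules or via for_each expansion. The helper name is illustrative.

```python
import re

# Matches lines such as: resource "aws_ecs_cluster" "main" {
RESOURCE_RE = re.compile(
    r'resource\s+"(aws_ecs_cluster|aws_ecs_service)"\s+"([^"]+)"'
)


def find_ecs_resources(tf_text: str) -> dict:
    """Return the Terraform labels of ECS cluster and service resources."""
    found = {"aws_ecs_cluster": [], "aws_ecs_service": []}
    for resource_type, label in RESOURCE_RE.findall(tf_text):
        found[resource_type].append(label)
    return found


sample = '''
resource "aws_ecs_cluster" "dev_main" {
  name = "dev-main"
}
resource "aws_ecs_service" "api" {
  cluster = aws_ecs_cluster.dev_main.id
}
'''
print(find_ecs_resources(sample))
```

Whatever this finds is enrichment only; cross-check every name against what the AWS API returned.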

Phase 5 — Environment mapping

Goal: group every cluster/service into a named "environment" with a stage (prod / staging / dev / qa / unknown) and a region.

Strategy order (try each, stop when confident):

Terraform locals/variables — if you found env = "prod" in TF, use that
Tags — Environment=production or Stage=staging → map to stage
Name patterns — parse use1-prod-main, dev-cluster-1, etc.

Name parsing rules (in priority order):

Contains prod, production, prd (as a token) → prod
Contains stag, staging, stg, uat → staging
Contains qa, test → qa
Contains dev, develop, sandbox → dev
Else → unknown (ask user)

If everything is "unknown": ask the user once to confirm stages. Don't ask per-environment.

Region inference: parse the cluster name (use1-* → us-east-1, usw2-* → us-west-2, euw2-* → eu-west-2). The use1 / usw2 / euw2 / apse2 / cac1 / sae1 short codes are well-known.
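
The name-parsing rules and the region short codes can be sketched together. Two assumptions to flag: matching is done on whole tokens split at non-alphanumeric characters (the rules above only say "as a token" for prod), and the expansions for apse2, cac1, and sae1 follow the same pattern as the three codes spelled out above.

```python
import re

# Checked in priority order, mirroring the parsing rules
STAGE_TOKENS = [
    ("prod", {"prod", "production", "prd"}),
    ("staging", {"stag", "staging", "stg", "uat"}),
    ("qa", {"qa", "test"}),
    ("dev", {"dev", "develop", "sandbox"}),
]

REGION_CODES = {
    "use1": "us-east-1", "usw2": "us-west-2", "euw2": "eu-west-2",
    "apse2": "ap-southeast-2", "cac1": "ca-central-1", "sae1": "sa-east-1",
}


def name_tokens(name: str):
    """Split a cluster or service name into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]


def infer_stage(name: str) -> str:
    tokens = set(name_tokens(name))
    for stage, words in STAGE_TOKENS:
        if tokens & words:
            return stage
    return "unknown"  # caller asks the user once for all unknowns


def infer_region(name: str):
    """Map a leading short code to a region, or None when there is none."""
    tokens = name_tokens(name)
    return REGION_CODES.get(tokens[0]) if tokens else None


print(infer_stage("use1-prod-main"), infer_region("use1-prod-main"))
```

Token matching avoids false hits such as "product-catalog" being read as prod; names are only the third strategy, so Terraform and tags still take precedence.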

Phase 6 — Schedule recommendation per environment

Default rule:

prod → null (never schedule)
staging → weekdays-9-19 in primary timezone
dev → weekdays-9-19 in primary timezone
qa → weekdays-9-19 in primary timezone
unknown → ask user

If the user has Fargate Spot usage, mention it: "Your dev envs already use Fargate Spot. You can stack Spot + scheduling: turning them off outside working hours removes roughly another 70% of what's left."
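
The default rule and the saving it implies can be sketched as two small functions. The "ask-user" sentinel and both function names are illustrative assumptions; the hours come from the weekdays-9-19 schedule (50 on-hours per week × 4.345 weeks against 730 hours per month).

```python
def recommend_schedule(stage: str):
    """Default schedule per stage: prod is never scheduled."""
    if stage == "prod":
        return None
    if stage in {"staging", "dev", "qa"}:
        return "weekdays-9-19"
    return "ask-user"  # hypothetical sentinel for unknown stages


def scheduling_savings(monthly_compute_cost: float, schedule) -> float:
    """Modeled monthly saving from running only during the scheduled hours."""
    if schedule != "weekdays-9-19":
        return 0.0
    on_hours = 50 * 4.345  # about 217 of 730 hours stay on
    return monthly_compute_cost * (1 - on_hours / 730)


# Example: a dev environment modeled at $890/mo of compute
print(round(scheduling_savings(890, recommend_schedule("dev")), 2))  # about 625.13
```

Only the compute portion is scaled here; shared overhead such as NAT and ALB keeps billing while the tasks are off.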

Phase 7 — Output 1: `fortem-discovery.yaml`

Write this file. Schema:

# Generated by Fortem Fleet Audit skill
# Review before importing to Fortem

workspace:
  name: <company-name>     # From tags, AWS account alias, or ask user
  primary_region: <region> # Most-used region
  primary_timezone: <tz>   # From user or system

accounts_scanned: 1
regions_scanned: [us-east-1, us-west-2]
total_environments: 14
total_compute_cost_mo: 3500       # Fargate compute only
total_shared_overhead_mo: 1180     # ALB, NAT, CloudWatch, ECR, EFS (Phase 3.5)
total_monthly_cost: 4680           # compute + overhead
total_savings_with_scheduling: 2840

# Shared services architecture (from Phase 3.5)
shared_services:
  alb_pattern: mixed               # one of: shared | per_env | mixed
  nat_pattern: per_env             # biggest line item
  cw_logs_gb_per_month: 50
  ecr_pattern: shared
  efs_used: false
  per_env_overhead_mo: 84.29       # applied uniformly to every env
  pricing_source: "curated table (May 2026)"  # or "us-east-1 fallback ..."

environments:
  - id: use1-prod-main
    name: "Production (us-east-1)"
    stage: prod
    cluster_arn: "arn:aws:ecs:us-east-1:123456789012:cluster/main"
    region: us-east-1
    schedule: null
    schedule_savings_mo: 0
    services_count: 12
    estimated_cost_mo: 2400                  # Fargate compute only
    shared_overhead_mo: 84.29                 # Phase 3.5 attribution
    estimated_total_cost_mo: 2484.29          # compute + overhead
    uses_spot: false
    tags:
      Environment: production
      Team: platform

  - id: use1-dev-dev1
    name: "Dev (us-east-1)"
    stage: dev
    cluster_arn: "arn:aws:ecs:us-east-1:123456789012:cluster/dev-main"
    region: us-east-1
    schedule:
      suggested: weekdays-9-19
      timezone: America/New_York
    schedule_savings_mo: 623
    services_count: 8
    estimated_cost_mo: 890
    shared_overhead_mo: 84.29
    estimated_total_cost_mo: 974.29
    uses_spot: false
    tags:
      Environment: dev
      Team: platform

  - id: usw2-dev-ml1
    name: "Dev ML (us-west-2)"
    stage: dev
    cluster_arn: "arn:aws:ecs:us-west-2:123456789012:cluster/ml1"
    region: us-west-2
    schedule:
      suggested: weekdays-9-19
      timezone: America/Los_Angeles
    schedule_savings_mo: 156
    services_count: 4
    estimated_cost_mo: 220
    shared_overhead_mo: 84.29
    estimated_total_cost_mo: 304.29
    uses_spot: true
    spot_savings_mo: 506
    tags:
      Environment: dev
      Team: ml

Phase 8 — Output 2: `fortem-discovery-report.html`

Write a self-contained HTML file. Use the HTML template at the bottom of this skill (it starts at <!DOCTYPE html>): copy it verbatim, then replace the placeholders marked {{LIKE_THIS}}.

Design tokens (Fortem brand):

Background: #FAF9F5 (warm off-white)
Text primary: #1A1A1A
Text muted: #6B6B6B
Accent (savings): #1C4A2E (deep forest green)
Critical / prod badge: #C5391B
Border: #E5E0D5
Fonts (via Google Fonts CDN): IBM Plex Sans (body), IBM Plex Mono (data/numbers), Fraunces (headings)

Required sections (in order):

Header — "Your Fortem Fleet Report" + timestamp + accounts scanned
Summary cards — Total envs, Total cost (compute + shared overhead), With scheduling, Savings
Shared services breakdown — what patterns were assumed (ALB / NAT / CW / ECR / EFS), per-env overhead
Environment table — name, region, stage badge, services, compute cost, overhead, total, suggested schedule, savings
Fleet health — the three signals from Phase 3l (owner-less, sprawl, idle zombies), rendered as {{FLEET_HEALTH_ROWS}} cards. This is the "fleet is the product" section, the part a cron job can't fix. See build spec below.
Cost chart — HTML/CSS bar chart sorted descending (no JS, no chart lib)
Scheduling candidates — list of non-prod envs with savings amount
Fargate Spot section — if any service uses Spot, show how much extra
ROI Callout Block — compute ROI values and render per spec below (after savings-callout)
DIY path cost table — honest breakdown of what the DIY scheduler misses
Limitations banner — "What the DIY scheduler doesn't cover"
Security notice — "Generated entirely on your machine. No data transmitted."
Next steps block — rewritten with "What happens on the call" format
Feedback — link to https://t.me/fortemdev_bot?start=feedback

No external analytics, no phone-home, no CDN beyond Google Fonts.
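For the cost chart in the section list above (HTML/CSS bars, sorted descending, no JS), one way to generate the rows is sketched below. The element nesting is an assumption built from the `.bar-row`, `.bar-label`, `.bar-track`, `.bar-fill`, and `.bar-amount` classes in the template stylesheet; if the template body defines its own chart markup, follow that instead. `render_bar_rows` is a hypothetical helper.

```python
from html import escape


def render_bar_rows(env_costs: dict) -> str:
    """Render one .bar-row per environment, most expensive first."""
    if not env_costs:
        return ""
    ranked = sorted(env_costs.items(), key=lambda item: item[1], reverse=True)
    top_cost = ranked[0][1] or 1  # avoid dividing by zero when every cost is 0
    rows = []
    for name, cost in ranked:
        # Bar width is relative to the most expensive environment.
        width = round(100 * cost / top_cost, 1)
        rows.append(
            '<div class="bar-row">'
            f'<div class="bar-label">{escape(name)}</div>'
            f'<div class="bar-track"><div class="bar-fill" style="width: {width}%"></div></div>'
            f'<div class="bar-amount">${cost:,.0f}</div>'
            "</div>"
        )
    return "\n".join(rows)


print(render_bar_rows({"usw2-dev-ml1": 220, "use1-prod-main": 2400, "use1-dev-dev1": 890}))
```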

Building `{{FLEET_HEALTH_ROWS}}`

Render one .fh-card per signal from Phase 3l. Use the counts you stored (OWNERLESS_COUNT, SPRAWL_COUNT, STALE_COUNT). Each card is <div class="fh-card flag"><div class="fh-num">N</div><div class="fh-label">…</div></div> when the count is > 0, or <div class="fh-card clean"> when the signal is clean (count 0). Keep labels honest and specific:

Owner-less: {{OWNERLESS_COUNT}} of {{TOTAL_ENVS}} environments have no owner tag (flag) / Every environment has an owner tag (clean).
Sprawl: {{SPRAWL_COUNT}} environments created in the last 90 days (shown when > 0; this signal is neutral-to-informational, so use the flag style only if several environments are also stale) / No recent sprawl (clean).
Idle zombies: {{STALE_COUNT}} non-prod environments running with no recent activity (flag) / No idle non-prod environments running (clean).

If ALL three are clean, still render all three clean cards and add one line under the grid: Your fleet is tidy today. The job is keeping it that way as it grows — that's what Fortem does after the audit. Never fabricate a flag the data doesn't support. These counts also print in the Phase 10 terminal summary.
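The card rules above could be rendered along these lines. Two things here are assumptions rather than part of the spec: a sprawl count above zero that is not flagged falls back to the bare `fh-card` class, and "several" stale environments is read as two or more (the `several` parameter). `fh_card` and `render_fleet_health` are hypothetical names.

```python
from html import escape


def fh_card(count: int, flag_label: str, clean_label: str, flagged: bool = True) -> str:
    """Render one fleet-health card: clean at 0, otherwise flag or neutral."""
    if count == 0:
        css, label = "fh-card clean", clean_label
    else:
        # Assumption: a non-flagged, non-zero card uses the bare fh-card class.
        css, label = ("fh-card flag" if flagged else "fh-card"), flag_label
    return (
        f'<div class="{css}"><div class="fh-num">{count}</div>'
        f'<div class="fh-label">{escape(label)}</div></div>'
    )


def render_fleet_health(ownerless: int, sprawl: int, stale: int,
                        total_envs: int, several: int = 2) -> str:
    """Build the three cards for {{FLEET_HEALTH_ROWS}} from the stored counts."""
    cards = [
        fh_card(ownerless,
                f"{ownerless} of {total_envs} environments have no owner tag",
                "Every environment has an owner tag"),
        # Sprawl is informational; flag it only when several envs are also stale.
        fh_card(sprawl,
                f"{sprawl} environments created in the last 90 days",
                "No recent sprawl",
                flagged=stale >= several),
        fh_card(stale,
                f"{stale} non-prod environments running with no recent activity",
                "No idle non-prod environments running"),
    ]
    return "\n".join(cards)


print(render_fleet_health(ownerless=3, sprawl=4, stale=0, total_envs=14))
```

The caller still adds the "fleet is tidy" line under the grid when all three counts are zero. The function never invents a flag, because every class is derived from the counts alone.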

After rendering the savings-callout div, compute ROI values and render the ROI Callout Block:

Variables to compute:

FORTEM_PRICE = 790
PAYBACK_DAYS = ceil(790 / (TOTAL_SAVINGS / 30))
NET_ANNUAL = (TOTAL_SAVINGS - 790) * 12
MULTIPLE = round(TOTAL_SAVINGS / 790, 1)

Select variant:

TOTAL_SAVINGS < 790 → Variant A (no booking CTA)
790 ≤ TOTAL_SAVINGS < 1500 → Variant B (standard)
TOTAL_SAVINGS ≥ 1500 → Variant C (with multiplier)

Frame savings as a modeled estimate, not a billed fact — e.g. "modeled idle spend" / "projected savings", not "you are wasting exactly $X". The number comes from task-definition modeling, so keep the language honest; round figures read as more credible to engineers than false precision. Pass savings={{TOTAL_SAVINGS}} in booking URL param.
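The ROI variables and variant thresholds above work out as in this sketch. It treats 790 as a monthly figure (the NET_ANNUAL formula multiplies the difference by 12) and adds one guard that the formulas do not state: when TOTAL_SAVINGS is zero or negative, the payback period is reported as `None` instead of dividing by zero. `roi_values` is a hypothetical name.

```python
import math

FORTEM_PRICE = 790  # same constant as in the ROI formulas


def roi_values(total_savings: float) -> dict:
    """Compute PAYBACK_DAYS, NET_ANNUAL, MULTIPLE and the callout variant."""
    if total_savings > 0:
        # ceil(790 / (S / 30)) rewritten as ceil(790 * 30 / S): algebraically the
        # same, but with one division so exact quotients are less likely to pick
        # up floating-point error before the ceil.
        payback_days = math.ceil(FORTEM_PRICE * 30 / total_savings)
    else:
        payback_days = None  # assumption: no savings means no payback date

    if total_savings < FORTEM_PRICE:
        variant = "A"  # no booking CTA
    elif total_savings < 1500:
        variant = "B"  # standard
    else:
        variant = "C"  # with multiplier

    return {
        "payback_days": payback_days,
        "net_annual": (total_savings - FORTEM_PRICE) * 12,
        "multiple": round(total_savings / FORTEM_PRICE, 1),
        "variant": variant,
    }


print(roi_values(2840))
```

With the example fleet's 2840 in modeled monthly savings this gives a 9-day payback, 24600 net annual, a 3.6x multiple, and Variant C.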

Phase 9 — Output 3: `diy-scheduler.yaml`

When presenting diy-scheduler.yaml, frame it explicitly as follows:

"Here's your DIY scheduler — it works and captures most of the savings. It also has 8 limitations that Fortem handles automatically. The report includes a cost breakdown of each limitation in engineering time. This is the honest picture of what doing this yourself looks like."

Do NOT describe it as "a free alternative to Fortem." Frame it as "the cost of doing this yourself."

Note before deploying: this is a working baseline, not a production-grade scheduler. Review the report's DIY path cost table, particularly the UTC-only cron, the lack of error handling, and the single-invocation 120s Lambda timeout. For teams running > 200 services, working across multiple timezones, or needing production-grade reliability, this DIY is the bridge to evaluating Fortem, not a long-term solution.

Write to diy-scheduler.yaml. Template:

# DIY Scheduler — limited local build
# This captures scheduling savings without Fortem. It works, but it has
# limitations (see the report's Limitations section).
#
# Recent fixes (vs original):
#   - desiredCount is now stored as a service tag on stop and restored on
#     start (was hardcoded to 1, which broke envs with replica > 1)
#   - Tag key is configurable via TagKey parameter (was hardcoded to
#     "Environment"; many teams use "Stage" or "Tier")
# Remaining limitations (intentional, see Phase 11):
#   - Cron is in UTC only — edit for non-UTC timezones
#   - No CloudWatch alarm on Lambda errors — add manually for production
#   - Lambda has 120s timeout; for 200+ services it may time out
#
# Deploy: aws cloudformation deploy --template-file diy-scheduler.yaml \
#   --stack-name fortem-diy-scheduler --capabilities CAPABILITY_IAM

AWSTemplateFormatVersion: "2010-09-09"
Description: "Stop non-prod ECS services at 7pm weekdays, start at 8am. Limited local build."

Parameters:
  EnvTag:
    Type: String
    Default: "dev"
    Description: "Tag value that marks an environment as schedulable (dev/staging/qa)"

  TagKey:
    Type: String
    Default: "Environment"
    Description: "Tag key used to identify schedulable services. Default: Environment. Some teams use 'Stage' or 'Tier' — set accordingly."

  # EventBridge cron is ALWAYS UTC — there is no timezone field. These defaults are
  # picked to land in the US evening / early morning so they never fire mid-workday:
  # stop at 02:00 UTC (late evening across US time zones, the previous local weekday)
  # and start at 13:00 UTC (early morning, US-Eastern through US-Pacific). DST shifts
  # the local wall-clock time by an hour but both stay safely off-hours. Set both to
  # YOUR team's hours, in UTC.
  StopTime:
    Type: String
    Default: "cron(0 2 ? * TUE-SAT *)"    # 02:00 UTC — US evening (TUE-SAT UTC = Mon-Fri local evenings)
    Description: "EventBridge cron in UTC — when to stop. Default = US evening. Edit for your timezone."

  StartTime:
    Type: String
    Default: "cron(0 13 ? * MON-FRI *)"   # 13:00 UTC — US early morning, weekdays
    Description: "EventBridge cron in UTC — when to start. Edit for your timezone."

Resources:
  SchedulerRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal: { Service: lambda.amazonaws.com }
            Action: sts:AssumeRole
      Policies:
        - PolicyName: ToggleServices
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - ecs:ListClusters            # handler paginates clusters
                  - ecs:ListServices
                  - ecs:UpdateService
                  - ecs:DescribeServices
                  - ecs:ListTagsForResource     # read tags to find schedulable services
                  - ecs:TagResource             # store original desiredCount on stop
                  - ecs:UntagResource           # clean up state tag after start
                Resource: "*"
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: "*"

  ToggleFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt SchedulerRole.Arn
      Timeout: 120
      Code:
        ZipFile: |
          import boto3, os, json
          ecs = boto3.client("ecs")
          env_tag = os.environ["ENV_TAG"]
          tag_key = os.environ["TAG_KEY"]   # configurable via CFN parameter
          state_tag = "fortem_scheduler_original_desired"  # tag we set on stop, read on start

          def toggle(action):   # "stop" or "start", passed in from the event
              paginator = ecs.get_paginator("list_clusters")
              for page in paginator.paginate():
                  for cluster_arn in page["clusterArns"]:
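                      svc_paginator = ecs.get_paginator("list_services")
                      # describe_services accepts at most 10 services per call, so pin
                      # the list_services page size to 10 rather than rely on the default
                      for svc_page in svc_paginator.paginate(
                              cluster=cluster_arn, PaginationConfig={"PageSize": 10}):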
                          if not svc_page["serviceArns"]:
                              continue
                          services = ecs.describe_services(
                              cluster=cluster_arn, services=svc_page["serviceArns"],
                              include=["TAGS"]   # describe_services omits tags unless asked
                          )["services"]
                          for svc in services:
                              svc_arn = svc["serviceArn"]
                              svc_name = svc["serviceName"]
                              current_tags = {t["key"]: t["value"] for t in svc.get("tags", [])}
                              if current_tags.get(tag_key, "").lower() != env_tag.lower():
                                  continue
                              desired = svc.get("desiredCount", 0)

                              if action == "stop" and desired > 0:
                                  # Store the original desiredCount as a service tag so
                                  # we can restore it on start (was hardcoded to 1 before)
                                  ecs.tag_resource(
                                      resourceArn=svc_arn,
                                      tags=[{"key": state_tag, "value": str(desired)}]
                                  )
                                  ecs.update_service(cluster=cluster_arn, service=svc_name,
                                                     desiredCount=0)
                              elif action == "start" and state_tag in current_tags:
                                  # Only restore services WE stopped (they carry the state tag).
                                  # If the tag is absent the service was at its own desiredCount
                                  # already (e.g. legitimately 0) — never force-start it.
                                  original = int(current_tags[state_tag])
                                  ecs.update_service(cluster=cluster_arn, service=svc_name,
                                                     desiredCount=original)
                                  # Clean up the state tag. untag_resource takes tagKeys
                                  # (a list of key strings), not a list of tag dicts.
                                  ecs.untag_resource(
                                      resourceArn=svc_arn,
                                      tagKeys=[state_tag]
                                  )

          def handler(event, context):
              # The EventBridge rule passes {"action": "stop"} or {"action": "start"}.
              # Read it from the event — do NOT hardcode, or StartRule would also stop.
              action = (event or {}).get("action", "stop")
              toggle(action)
              return {"statusCode": 200, "action": action}

      Environment:
        Variables:
          ENV_TAG: !Ref EnvTag
          TAG_KEY: !Ref TagKey

  StopRule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: !Ref StopTime
      Targets:
        - Id: stop-toggle
          Arn: !GetAtt ToggleFunction.Arn
          Input: '{"action":"stop"}'

  StartRule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: !Ref StartTime
      Targets:
        - Id: start-toggle
          Arn: !GetAtt ToggleFunction.Arn
          Input: '{"action":"start"}'

  StopPermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref ToggleFunction
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt StopRule.Arn

  StartPermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref ToggleFunction
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt StartRule.Arn

This is what Fortem does for you automatically. The report's Limitations section explains what this snippet can't do (drift detection, multi-timezone, AI diagnostics on failure, etc.).

Phase 10 — Summary output (print to terminal)

✓ Fleet Audit complete.

  Files generated:
  → fortem-discovery-report.html   Open in browser first
  → fortem-discovery.yaml          Bring to a Fortem call
  → diy-scheduler.yaml             Read limitations before deploying

  Your fleet:  {{TOTAL_ENVS}} environments
  Total cost:  ${{TOTAL_COST}}/mo
  With Fortem: ${{TOTAL_SAVINGS}}/mo savings
  Break-even:  {{PAYBACK_DAYS}} days

  Fleet health (the part a cron job can't fix):
  → {{OWNERLESS_COUNT}} environments with no owner tag
  → {{SPRAWL_COUNT}} created in the last 90 days
  → {{STALE_COUNT}} non-prod running with no recent activity

  Open fortem-discovery-report.html for the full breakdown and next steps.

Compute PAYBACK_DAYS = ceil(790 / (TOTAL_SAVINGS / 30)) using the same formula as the ROI Callout Block in Phase 8.

Phase 11 — What this skill does NOT do (the "Limitations" section)

This is the most important section for the user to understand the value of Fortem. Add this verbatim to the HTML report under a "What this report doesn't tell you" banner.

What this skill gives you (free, today, no signup):
  ✓ Map of every ECS environment you have
  ✓ Per-environment monthly cost (compute + estimated shared overhead)
  ✓ Savings estimate from business-hours scheduling
  ✓ A DIY scheduler you can deploy in 30 minutes
  ✓ Your fleet in Fortem's import format

What Fortem does on top (you'd need to build all of this yourself otherwise):
  ✗ Drift detection — someone scales a service back up at 2am, your "stopped"
    env is actually running. Fortem catches this and re-stops it. DIY: you'd
    have to add another Lambda that polls every 5 minutes.
  ✗ Multi-timezone scheduling — your Berlin team works 9-19 CET, your SF team
    works 9-19 PST. Fortem schedules per-env. DIY: edit the cron expressions
    for non-UTC timezones (DIY is hardcoded to UTC, no per-env timezone yet).
  ✗ Per-environment safety rails — "this env can be stopped, this one can't,
    this one only on weekends". Fortem has UI for this. DIY: you maintain
    YAML by hand and grep production.
  ✗ AI diagnostics — when a service fails to start at 8am because the IAM
    role is missing ecr:GetAuthorizationToken, Fortem reads CloudWatch, walks
    the task definition, checks IAM, and proposes the fix in 8 seconds.
    DIY: add a CloudWatch alarm on the Lambda yourself (DIY has no error
    handling — a single failed update_service errors the whole run).
  ✗ Developer self-service — your developer needs to restart staging at 6pm
    on a Friday. Fortem gives them a scoped UI. DIY: they Slack you, you
    context-switch, you fix it.
  ✗ Multi-account orchestration — you have prod in one AWS account, staging
    in another, dev in a third. Fortem shows all of them in one screen.
    DIY: three browser tabs and a spreadsheet.
  ✗ Cost drift alerts — "this dev env was $400/mo last month, now it's
    $620/mo, what changed?" Fortem tells you. DIY: you check Cost Explorer
    next quarter.
  ✗ Audit log — who stopped what env, when, why. Fortem logs everything.
    DIY: you grep CloudWatch logs.
  ✗ Fleet > 200 services — DIY Lambda has 120s timeout. For larger fleets
    it times out partway. Fortem scales to 10k+ services per account.

DIY scheduler specifics — what was fixed and what remains:

The DIY scheduler CFN has been improved over time. As of the current skill version:

✅ Fixed:

desiredCount=1 bug — was hardcoded to 1 on start, breaking envs with replica > 1. Now stores the original count as a service tag (fortem_scheduler_original_desired) on stop, restores on start.
Hardcoded Environment tag key — was unconfigurable. Now a CFN parameter TagKey (default Environment). Teams using Stage or Tier can set accordingly.

⚠️ Remaining limitations (intentional, not fixed):

Cron is in UTC only. For non-UTC timezones, edit the StopTime / StartTime parameters post-deploy.
No CloudWatch alarm on Lambda errors. Add manually: aws cloudwatch put-metric-alarm --metric-name Errors --namespace AWS/Lambda ...
Lambda has 120s timeout. The handler paginates clusters and services, but it processes them sequentially in a single invocation, so for fleets > 200 services it may time out partway.
Single-region only. Multi-region requires deploying separate stacks per region.
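Because the cron parameters are UTC only, choosing StopTime / StartTime for another timezone means converting the local wall-clock time by hand, including the weekday shift when UTC falls on the next or previous day. The sketch below does that conversion for a fixed UTC offset. It is an illustration, not part of the template: `local_to_utc_cron` is a hypothetical helper, and the offset has to be updated when DST changes.

```python
from datetime import datetime, timedelta, timezone

DAY_NAMES = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]


def local_to_utc_cron(hour, minute, utc_offset_hours, local_days=(0, 1, 2, 3, 4)):
    """Build an EventBridge cron string in UTC for a local wall-clock time.

    utc_offset_hours is the zone's current offset (e.g. -5 for EST, -4 for EDT);
    local_days are weekday indexes into DAY_NAMES (0 = MON).
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # Any date works here: it is only used to see whether the UTC time lands
    # on the previous or the next calendar day relative to local time.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=local_tz)
    utc = local.astimezone(timezone.utc)
    day_shift = (utc.date() - local.date()).days  # -1, 0 or +1
    days = ",".join(DAY_NAMES[(d + day_shift) % 7] for d in local_days)
    return f"cron({utc.minute} {utc.hour} ? * {days} *)"


print(local_to_utc_cron(19, 0, -5))  # 7pm Mon-Fri at UTC-5
print(local_to_utc_cron(19, 0, -4))  # same wall-clock time at UTC-4
```

The two calls show why a fixed UTC cron drifts by an hour across DST: the same 7pm local stop maps to 00:00 UTC on TUE through SAT in winter and to 23:00 UTC on MON through FRI in summer.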

This section is the bridge from "free skill" to "Fortem." Be honest. Be specific. Don't oversell.

Phase 12 — Edge cases

Scenario	Handling
`FORTEM_TEST_MODE=1` set	Use example data (3 clusters, 14 envs, $4,200/mo). Print "TEST MODE" in the report header.
No AWS credentials	Step 0 auto-switches to `FORTEM_TEST_MODE=1` and produces the example report, with a notice on how to come back with real access. Never ask-and-block, never fail.
`ecs:ListClusters` throttled	Wait 2s, retry once. If still throttled, skip region and note in report.
100+ clusters	Paginate with `--max-items 100` + `--starting-token`. Process in batches. Note that large fleets may need dedicated onboarding.
Cluster with 0 services	Note in the report: "Cluster has no services — empty or recently created."
Service with no task definition	Skip the service, note: "Service has no active task definition."
Mixed Fargate + EC2 launch type	Note in cost: "EC2 launch type — cost not included in this estimate. EC2 has its own pricing model."
Untagged resources	Fall back to name-based grouping. If all unknown, ask user once for stage confirmation.
Multi-account scan requested	This skill does NOT do cross-account `sts:AssumeRole` in a single run. The user runs the skill once per account with a different `AWS_PROFILE`. Each run writes to its own subdirectory. See "AWS access — how it works" section above.
Service in STOPPED state	Include in the report but mark with $0 cost and "stopped" badge. Don't suggest a schedule for already-stopped services.
Terraform with multiple workspaces	Process the current workspace only. Note in the report which workspace was scanned.
`pricing:GetProducts` permission missing	NOT APPLICABLE — this skill uses a curated rates table (Phase 3a), not the Pricing API. No `pricing:*` IAM permission is required.
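The throttling row in the table above (wait 2s, retry once, then skip the region) can be sketched as a small wrapper. It is deliberately library-agnostic: the caller supplies the call and a predicate that recognizes a throttling error, so nothing here assumes a particular SDK. `call_with_one_retry` is a hypothetical name.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_one_retry(call: Callable[[], T],
                        is_throttled: Callable[[Exception], bool],
                        wait_seconds: float = 2.0,
                        sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Run call(); on throttling wait and retry once; None means "skip region"."""
    for attempt in range(2):
        try:
            return call()
        except Exception as exc:
            if not is_throttled(exc):
                raise  # unrelated errors are not swallowed
            if attempt == 0:
                sleep(wait_seconds)
    return None  # still throttled after the retry: caller notes it in the report


class FakeThrottle(Exception):
    """Stand-in error type used only for this demonstration."""


def always_throttled():
    raise FakeThrottle()


result = call_with_one_retry(always_throttled,
                             lambda exc: isinstance(exc, FakeThrottle),
                             wait_seconds=0)
print("skip region" if result is None else result)
```

With boto3, the predicate would typically inspect the error code carried by the client error raised for the call; treat the exact code to match as something to confirm against the SDK documentation.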

Security Requirements

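Use only read-only permissions: ecs:List* and ecs:Describe*, plus sts:GetCallerIdentity and ec2:DescribeRegions (and ce:GetCostAndUsage only for the optional Cost Explorer cross-check)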
Never write AWS credentials to any file
Never extract or log secret values (only environment variable KEYS, never values)
Never make network calls other than the read-only AWS API calls: no phone-home, no analytics
HTML report declares all IAM permissions used in a visible section
HTML report includes the security notice verbatim
Feedback link uses public Telegram bot handle only — no tokens in any output

If you encounter a situation requiring write access (e.g., the user wants to deploy the DIY scheduler), make a separate, clearly-labeled action and ask before executing.

Definition of Done

Before printing the success summary, verify:

fortem-discovery-report.html opens correctly in a browser (test by checking the file has matching </html> and no syntax errors)
fortem-discovery.yaml is valid YAML (parse it with the agent's YAML tool)
diy-scheduler.yaml is valid CloudFormation (optional: validate with aws cloudformation validate-template)
Every environment has a stage value (no unknown after the confirmation round)
Every environment with stage != "prod" has a schedule.suggested
total_compute_cost_mo matches the sum of per-environment estimated_cost_mo
total_shared_overhead_mo matches per_env_overhead_mo × total_environments (to within rounding)
total_monthly_cost equals total_compute_cost_mo + total_shared_overhead_mo
total_savings_with_scheduling matches the sum of per-environment schedule_savings_mo
HTML report has all 14 required sections (header, summary, shared services, table, fleet health, chart, candidates, spot, ROI callout, DIY cost table, limitations, security, next, feedback)
HTML report header shows which {{PROFILE}} and {{ACCOUNT_ID}} were scanned (auditable)
No AWS credentials, secret values, or access keys in any output file (account ID is OK to show)
If FORTEM_TEST_MODE=1, the report clearly says "Test mode — example data" in the header
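The arithmetic and stage checks in this list can be automated once fortem-discovery.yaml has been parsed into a dict. The sketch below assumes the field names from the Phase 7 schema and a tolerance of one dollar for rounded totals (the tolerance is an assumption, not part of the spec). `check_totals` is a hypothetical name.

```python
def check_totals(doc: dict, tolerance: float = 1.0) -> list:
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    envs = doc["environments"]
    per_env = doc["shared_services"]["per_env_overhead_mo"]

    def expect(label, actual, expected):
        # Totals in the example are whole dollars, so allow rounding slack.
        if abs(actual - expected) > tolerance:
            problems.append(f"{label}: got {actual}, expected {expected:.2f}")

    expect("total_compute_cost_mo", doc["total_compute_cost_mo"],
           sum(e["estimated_cost_mo"] for e in envs))
    expect("total_shared_overhead_mo", doc["total_shared_overhead_mo"],
           per_env * doc["total_environments"])
    expect("total_monthly_cost", doc["total_monthly_cost"],
           doc["total_compute_cost_mo"] + doc["total_shared_overhead_mo"])
    expect("total_savings_with_scheduling", doc["total_savings_with_scheduling"],
           sum(e["schedule_savings_mo"] for e in envs))

    for env in envs:
        if env["stage"] == "unknown":
            problems.append(f"{env['id']}: stage is still unknown")
        elif env["stage"] != "prod" and not (env.get("schedule") or {}).get("suggested"):
            problems.append(f"{env['id']}: non-prod env has no schedule.suggested")
    return problems
```

Call it right before printing the Phase 10 summary and stop if the returned list is not empty.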

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Your Fortem Fleet Report</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Fraunces:opsz,wght@9..144,400;9..144,500;9..144,600&family=IBM+Plex+Mono:wght@400;500&family=IBM+Plex+Sans:wght@400;500;600&display=swap" rel="stylesheet">
<style>
  :root {
    --bg: #FAF9F5;
    --bg-panel: #F0EBE0;
    --bg-elevated: #FFFFFF;
    --ink: #1A1A1A;
    --ink-soft: #4B4B4B;
    --ink-muted: #6B6B6B;
    --ink-dim: #9B9B9B;
    --border: #E5E0D5;
    --border-soft: #EFEBE0;
    --green: #1C4A2E;
    --green-soft: #E8F0EA;
    --amber: #B6892C;
    --amber-soft: #F5EBD7;
    --crimson: #C5391B;
    --crimson-soft: #FAE7E1;
  }
  * { box-sizing: border-box; }
  body {
    background: var(--bg);
    color: var(--ink);
    font-family: "IBM Plex Sans", -apple-system, sans-serif;
    margin: 0;
    padding: 32px 24px;
    line-height: 1.5;
  }
  .wrap { max-width: 920px; margin: 0 auto; }
  .test-banner {
    background: var(--amber-soft);
    border: 1px solid var(--amber);
    color: var(--amber);
    padding: 10px 16px;
    border-radius: 6px;
    font-size: 13px;
    font-weight: 500;
    margin-bottom: 20px;
  }
  .header { text-align: center; padding: 24px 0 32px; border-bottom: 1px solid var(--border); margin-bottom: 32px; }
  .header h1 { font-family: Fraunces, serif; font-weight: 500; font-size: 32px; margin: 0 0 8px; letter-spacing: -0.01em; }
  .header .meta { color: var(--ink-muted); font-size: 12px; font-family: "IBM Plex Mono", monospace; }
  .summary { display: grid; grid-template-columns: repeat(4, 1fr); gap: 12px; margin-bottom: 32px; }
  @media (max-width: 640px) { .summary { grid-template-columns: repeat(2, 1fr); } }
  .card { background: var(--bg-elevated); border: 1px solid var(--border); border-radius: 8px; padding: 16px; text-align: center; }
  .card .label { font-size: 10px; text-transform: uppercase; letter-spacing: 0.08em; color: var(--ink-muted); font-family: "IBM Plex Mono", monospace; }
  .card .value { font-size: 22px; font-weight: 500; margin-top: 6px; font-family: "IBM Plex Mono", monospace; }
  .card .value.green { color: var(--green); }
  .card .value.muted { color: var(--ink-muted); font-size: 16px; }
  h2 { font-family: Fraunces, serif; font-weight: 500; font-size: 22px; margin: 32px 0 16px; letter-spacing: -0.01em; }
  table { width: 100%; border-collapse: collapse; font-size: 13px; margin-bottom: 24px; }
  th { text-align: left; padding: 10px 12px; background: var(--bg-panel); color: var(--ink-muted); font-size: 10px; text-transform: uppercase; letter-spacing: 0.06em; font-weight: 500; border-bottom: 1px solid var(--border); font-family: "IBM Plex Mono", monospace; }
  td { padding: 10px 12px; border-bottom: 1px solid var(--border-soft); vertical-align: middle; }
  td.num, th.num { text-align: right; font-family: "IBM Plex Mono", monospace; }
  td.mono, th.mono { font-family: "IBM Plex Mono", monospace; }
  .badge { display: inline-block; padding: 2px 8px; border-radius: 3px; font-size: 10px; font-weight: 500; text-transform: uppercase; letter-spacing: 0.06em; font-family: "IBM Plex Mono", monospace; }
  .badge.prod { background: var(--crimson-soft); color: var(--crimson); }
  .badge.staging { background: var(--amber-soft); color: var(--amber); }
  .badge.dev, .badge.qa { background: var(--green-soft); color: var(--green); }
  .badge.unknown { background: #EEE; color: var(--ink-muted); }
  .chart { margin: 24px 0; }
  .bar-row { display: grid; grid-template-columns: 180px 1fr 90px; gap: 12px; align-items: center; margin-bottom: 6px; font-size: 12px; }
  .bar-track { background: var(--border-soft); height: 24px; border-radius: 3px; overflow: hidden; position: relative; }
  .bar-fill { background: var(--green); height: 100%; }
  .bar-fill.savings { background: var(--amber); }
  .bar-label { font-family: "IBM Plex Mono", monospace; font-size: 11px; color: var(--ink-muted); }
  .bar-amount { font-family: "IBM Plex Mono", monospace; font-size: 12px; text-align: right; }
  .fleet-health { display: grid; grid-template-columns: repeat(3, 1fr); gap: 12px; margin: 16px 0 8px; }
  @media (max-width: 640px) { .fleet-health { grid-template-columns: 1fr; } }
  .fh-card { border: 1px solid var(--border); border-radius: 8px; padding: 16px 18px; background: var(--bg-elevated); }
  .fh-card.flag { border-color: var(--amber); background: #FBF3E6; }
  .fh-card.clean { border-color: var(--green); background: var(--green-soft); }
  .fh-card .fh-num { font-family: "IBM Plex Mono", monospace; font-size: 28px; font-weight: 500; color: var(--ink); line-height: 1; }
  .fh-card.flag .fh-num { color: var(--amber); }
  .fh-card.clean .fh-num { color: var(--green); }
  .fh-card .fh-label { font-size: 13px; color: var(--ink-soft); margin-top: 8px; line-height: 1.4; }
  .savings-callout { background: var(--green-soft); border: 1px solid var(--green); border-radius: 8px; padding: 16px 20px; margin: 24px 0; display: flex; gap: 16px; align-items: center; }
  .savings-callout .accent-bar { width: 4px; align-self: stretch; background: var(--green); border-radius: 2px; }
  .savings-callout .label { font-size: 11px; color: var(--ink-muted); text-transform: uppercase; letter-spacing: 0.06em; font-family: "IBM Plex Mono", monospace; }
  .savings-callout .amount { font-size: 24px; color: var(--green); font-family: "IBM Plex Mono", monospace; font-weight: 500; }
  .roi-block { background: #1C4A2E; color: #FFFFFF; border-radius: 8px; padding: 28px 32px; margin: 32px 0; width: 100%; }
  .roi-block--low { background: var(--bg-panel); color: var(--ink); border: 1px solid var(--border); }
  .roi-block__headline { font-family: Fraunces, serif; font-weight: 500; font-size: 24px; margin-bottom: 12px; }
  .roi-block__headline .roi-number { font-size: 38px; font-family: "IBM Plex Mono", monospace; font-weight: 500; display: block; margin-top: 4px; }
  .roi-block__math { font-size: 15px; opacity: 0.92; margin-bottom: 16px; line-height: 1.5; }
  .roi-block__multiplier { font-size: 14px; background: rgba(255,255,255,0.15); border-radius: 6px; padding: 12px 16px; margin-bottom: 16px; }
  .roi-cta { display: inline-block; background: #FFFFFF; color: #1C4A2E; padding: 12px 24px; border-radius: 6px; text-decoration: none; font-weight: 600; font-size: 14px; font-family: "IBM Plex Sans", -apple-system, sans-serif; }
  .roi-cta:hover { background: #E8F0EA; }
  .roi-block--low .roi-cta { background: var(--ink); color: var(--bg); }
  .roi-block--low .roi-cta:hover { background: var(--green); color: #FFF; }
  .diy-table { width: 100%; border-collapse: collapse; font-size: 13px; margin: 16px 0 24px; }
  .diy-table th { text-align: left; padding: 8px 12px; background: var(--bg-panel); color: var(--ink-muted); font-size: 10px; text-transform: uppercase; letter-spacing: 0.06em; font-weight: 500; border-bottom: 1px solid var(--border); font-family: "IBM Plex Mono", monospace; }
  .diy-table td { padding: 10px 12px; border-bottom: 1px solid var(--border-soft); vertical-align: top; }
  .diy-table td:first-child { font-weight: 500; color: var(--ink); }
  .diy-table td:last-child { color: var(--ink-muted); }
  .limitations { background: #FAE7E1; border: 1px solid var(--crimson); border-radius: 8px; padding: 20px 24px; margin: 32px 0; }
  .limitations h3 { font-family: Fraunces, serif; font-weight: 500; font-size: 18px; color: var(--crimson); margin: 0 0 12px; }
  .limitations p { margin: 0 0 16px; font-size: 14px; color: var(--ink-soft); }
  .limitations ul { margin: 0; padding-left: 20px; font-size: 13px; color: var(--ink-soft); }
  .limitations li { margin-bottom: 8px; }
  .limitations li strong { color: var(--ink); }
  .next-steps { background: var(--bg-elevated); border: 1px solid var(--border); border-radius: 8px; padding: 32px 24px; text-align: center; margin: 32px 0; }
  .next-steps h3 { font-family: Fraunces, serif; font-weight: 500; font-size: 20px; margin: 0 0 12px; color: var(--ink); }
  .next-steps p { font-size: 14px; color: var(--ink-soft); margin: 0 0 16px; line-height: 1.6; }
  .next-steps .btn { display: inline-block; background: #1C4A2E; color: #FFFFFF; padding: 12px 24px; border-radius: 6px; text-decoration: none; font-weight: 500; font-size: 14px; }
  .next-steps .btn:hover { background: #145232; }
  .next-steps .secondary-note { font-size: 12px; color: var(--ink-muted); margin-top: 12px; }
  .next-steps .secondary-note a { color: var(--ink-soft); text-decoration: underline; }
  .next-steps .secondary-note a:hover { color: var(--ink); }
  .feedback { text-align: center; font-size: 12px; color: var(--ink-muted); margin: 16px 0; }
  .feedback a { color: var(--ink-soft); }
  .footer { text-align: center; font-size: 11px; color: var(--ink-dim); margin-top: 32px; padding-top: 16px; border-top: 1px solid var(--border); }
</style>
</head>
<body>
<div class="wrap">

{{IF_TEST_MODE}}<div class="test-banner">⚠ Test mode — example data. Run without <code>FORTEM_TEST_MODE=1</code> to scan a real account.</div>{{ENDIF_TEST_MODE}}

<div class="header">
  <h1>Your Fortem Fleet Report</h1>
  <div class="meta">Generated {{TIMESTAMP}} · Profile: <code>{{PROFILE}}</code> · Account: <code>{{ACCOUNT_ID}}</code> · Region: <code>{{PRIMARY_REGION}}</code> · Pricing: <code>{{PRICING_SOURCE}}</code> · {{ACCOUNTS}} account(s) · {{REGIONS}} region(s)</div>
</div>

<div class="summary">
  <div class="card"><div class="label">Environments</div><div class="value">{{TOTAL_ENVS}}</div></div>
  <div class="card"><div class="label">Monthly cost</div><div class="value">${{TOTAL_COST}}</div><div class="value muted" style="font-size: 11px; margin-top: 2px;">compute + shared overhead</div></div>
  <div class="card"><div class="label">With scheduling</div><div class="value green">${{WITH_SCHEDULING}}</div></div>
  <div class="card"><div class="label">Savings</div><div class="value green">−${{SAVINGS}}/mo</div></div>
</div>

<h2>Shared services overhead</h2>
<p style="font-size: 13px; color: var(--ink-muted); margin-bottom: 12px;">
  Fargate compute is only part of the per-env cost. ALB, NAT Gateway, CloudWatch Logs, ECR storage, and EFS (if any) add fixed overhead that cost allocation tags can't see. This overhead is estimated from your answers to the Phase 3.5 questions.
</p>
<table>
  <thead>
    <tr>
      <th>Component</th>
      <th>Pattern</th>
      <th class="num">Per-env $/mo</th>
    </tr>
  </thead>
  <tbody>
    {{SHARED_SERVICES_TABLE_ROWS}}
  </tbody>
  <tfoot>
    <tr style="border-top: 2px solid var(--border);">
      <td><strong>Total per-env overhead</strong></td>
      <td></td>
      <td class="num"><strong>${{SHARED_OVERHEAD_PER_ENV}}/mo</strong></td>
    </tr>
  </tfoot>
</table>
<p style="font-size: 12px; color: var(--ink-dim); margin-top: 8px;">
  Total shared overhead across {{TOTAL_ENVS}} envs: ${{TOTAL_SHARED_OVERHEAD}}/mo.
  Pricing source: <code>{{PRICING_SOURCE}}</code>.
</p>

<h2>Environment breakdown</h2>
<table>
  <thead>
    <tr>
      <th>Environment</th>
      <th>Region</th>
      <th>Stage</th>
      <th class="num">Svcs</th>
      <th class="num">Compute $/mo</th>
      <th class="num">Overhead</th>
      <th class="num">Total $/mo</th>
      <th class="num">With sched.</th>
    </tr>
  </thead>
  <tbody>
    {{ENV_TABLE_ROWS}}
  </tbody>
</table>

<h2>Fleet health</h2>
<p style="font-size: 13px; color: var(--ink-muted); margin-bottom: 12px;">
  The bill above is the part a cron job can fix. These are the parts it can't — the fleet sprawl that grows back every time someone spins up an env. Derived from your tags and metadata; read-only, nothing here left your machine.
</p>
<div class="fleet-health">
  {{FLEET_HEALTH_ROWS}}
</div>

<h2>Cost by environment</h2>
<div class="chart">
  {{COST_BAR_CHART_ROWS}}
</div>

{{IF_SPOT_DETECTED}}
<h2>Fargate Spot usage</h2>
<p style="font-size: 13px; color: var(--ink-soft);">You use Fargate Spot on {{SPOT_ENV_COUNT}} environment(s). That's an additional <strong style="color: var(--green);">${{SPOT_SAVINGS}}/mo</strong> beyond the scheduling savings above.</p>
{{ENDIF_SPOT_DETECTED}}

<div class="savings-callout">
  <div class="accent-bar"></div>
  <div>
    <div class="label">Total potential savings</div>
    <div class="amount">−${{TOTAL_SAVINGS}}/mo</div>
  </div>
</div>

{{ROI_BLOCK_LOW}}
<div class="roi-block roi-block--low">
  <p>Your current savings opportunity (<strong>${{TOTAL_SAVINGS}}/mo</strong>) is below Fortem's base price. You may not be a fit right now — but here's what to do when your fleet grows.</p>
  <a href="https://fortem.dev/blog/ecs-fargate-cost-optimization" class="roi-cta">Read: ECS Fargate cost optimization guide →</a>
</div>
{{ENDIF_ROI_BLOCK_LOW}}

{{ROI_BLOCK_STANDARD}}
<div class="roi-block">
  <div class="roi-block__headline">
    Modeled idle spend: <span class="roi-number">~${{TOTAL_SAVINGS}}/mo</span>
  </div>
  <div class="roi-block__math">
    Fortem costs $790/mo — break even in roughly <strong>{{PAYBACK_DAYS}} days</strong>.
    After that, projected net savings of <strong>~${{NET_ANNUAL}}</strong>/year.
    These figures are modeled from your task definitions, not pulled from your AWS bill.
  </div>
  <a href="https://fortem.dev/book?ref=report&savings={{TOTAL_SAVINGS}}" class="roi-cta">
    Book a 20-min call →
  </a>
  {{ROI_BLOCK_MULTIPLIER}}
  <div class="roi-block__multiplier">
    That's about {{MULTIPLE}}× Fortem's cost.
    Modeled idle spend continues at ~${{TOTAL_SAVINGS}}/mo until you act.
  </div>
  {{ENDIF_ROI_BLOCK_MULTIPLIER}}
</div>
{{ENDIF_ROI_BLOCK_STANDARD}}

<h2>The DIY path — what it actually costs</h2>
<p style="font-size: 13px; color: var(--ink-soft); margin-bottom: 16px;">
  We generated <code>diy-scheduler.yaml</code> — a working CloudFormation scheduler. Deploy it and it saves money. Here's what it doesn't do, and what each gap costs in engineering time:
</p>
<table class="diy-table">
  <thead>
    <tr>
      <th>What it misses</th>
      <th>Engineering cost</th>
    </tr>
  </thead>
  <tbody>
    <tr><td>Drift detection (service scaled back up)</td><td>1–4 hrs/month debugging "why is this running"</td></tr>
    <tr><td>Per-timezone scheduling</td><td>Manual cron edits every time work hours change</td></tr>
    <tr><td>Safety rails (which envs can stop)</td><td>One accidental prod stop = incident</td></tr>
    <tr><td>AI diagnostics on startup failure</td><td>20–40 min per incident vs. 8-sec Fortem diagnosis</td></tr>
    <tr><td>Developer self-service</td><td>2–5 Slack interrupts/week: "restart my staging env"</td></tr>
    <tr><td>Multi-account view</td><td>3 browser tabs, 2 spreadsheets, 1 headache</td></tr>
    <tr><td>Cost drift alerts</td><td>You find out next quarter in Cost Explorer</td></tr>
    <tr><td>Audit log</td><td><code>grep</code> CloudWatch at 11pm</td></tr>
  </tbody>
</table>
<p style="font-size: 13px; color: var(--ink-soft); margin-bottom: 16px;">
  The DIY scheduler is a bridge, not a destination. Most teams run it for 2–3 months, hit one of the above, and switch. The question is whether you pay with engineering time or with $790/mo.
</p>

<div class="limitations">
  <h3>What the DIY scheduler doesn't cover</h3>
  <p>The CloudFormation snippet captures the headline savings. These are the limitations you accept when deploying it:</p>
  <ul>
    <li><strong>UTC-only cron.</strong> For non-UTC timezones, edit the <code>StopTime</code> / <code>StartTime</code> parameters post-deploy.</li>
    <li><strong>No CloudWatch alarm on Lambda errors.</strong> Add manually for production use.</li>
    <li><strong>Lambda has a 120s timeout.</strong> It does not paginate, and for fleets with more than 200 services it may time out.</li>
    <li><strong>Single-region only.</strong> Multi-region requires deploying separate stacks per region.</li>
  </ul>
</div>

<div class="security">
  <strong>Security.</strong> This report was generated entirely on your machine using read-only AWS API calls. No data was sent to Fortem, Anthropic, OpenAI, or any other third party. The IAM permissions used: <code>ecs:ListClusters, ecs:DescribeClusters, ecs:ListServices, ecs:DescribeServices, ecs:DescribeTaskDefinition, ecs:ListTagsForResource</code>, plus the auxiliary lookups <code>sts:GetCallerIdentity</code> and <code>ec2:DescribeRegions</code>. Treat this file like internal documentation.
</div>

<div class="next-steps">
  <h3>What happens on the call</h3>
  <p>
    A Fortem engineer opens this report with you. Not a sales pitch —
    we check: does your tagging map cleanly, are there edge cases in your setup,
    what's the realistic savings vs. the estimate above.
    If Fortem isn't the right fit, we'll tell you in the first 5 minutes.
  </p>
  <p>Bring this file. Bring the YAML. 20 minutes.</p>
  <a href="https://fortem.dev/book?ref=fleet-audit-report&savings={{TOTAL_SAVINGS}}" class="btn">
    Book a 20-min call →
  </a>
  <p class="secondary-note">
    Not ready for a call?
    <a href="https://fortem.dev/audit">See how Fleet Audit works →</a>
  </p>
</div>

<div class="feedback">
  Found a bug or have suggestions? <a href="https://t.me/fortemdev_bot?start=feedback">Send feedback via Telegram</a>
</div>

<div class="footer">
  Generated by the Fortem Fleet Audit skill · Review the YAML before importing it into Fortem
</div>

</div>
</body>
</html>

How to use the template:

Copy the HTML between the two  markers
Replace each {{PLACEHOLDER}} with actual data:
- {{IF_TEST_MODE}} ... {{ENDIF_TEST_MODE}} — include the enclosed test banner only if running in test mode
- {{TIMESTAMP}} — current ISO 8601 datetime
- {{PROFILE}} — AWS profile name used (or "default" if none)
- {{ACCOUNT_ID}} — 12-digit AWS account ID (from aws sts get-caller-identity $PROFILE_ARG --query Account --output text)
- {{PRIMARY_REGION}} — most-used region in the scanned fleet
- {{PRICING_SOURCE}} — either "curated table (May 2026)" if region is in Phase 3a table, or "us-east-1 fallback — region not in table" if it fell back
- {{ACCOUNTS}} — number of accounts scanned
- {{REGIONS}} — number of regions scanned (the template renders it as "{{REGIONS}} region(s)")
- {{TOTAL_ENVS}} — total environment count
- {{TOTAL_COST}} — total monthly cost (compute + shared overhead)
- {{WITH_SCHEDULING}} — total cost with scheduling
- {{SAVINGS}} — total savings (without formatting commas)
- {{SHARED_OVERHEAD_PER_ENV}} — per-env fixed overhead from Phase 3.5
- {{TOTAL_SHARED_OVERHEAD}} — total shared overhead across all envs
- {{SHARED_SERVICES_TABLE_ROWS}} — for each component (ALB/NAT/CW/ECR/EFS), generate <tr><td>ALB</td><td>shared (1×$22)</td><td class="num">$1.57</td></tr>
- {{ENV_TABLE_ROWS}} — for each env, generate <tr><td>name</td><td>region</td><td><span class="badge dev">dev</span></td><td class="num">8</td><td class="num">$890</td><td class="num">$84</td><td class="num">$974</td><td class="num">$267</td></tr> (now 8 columns, including overhead and total, matching the table header)
- {{COST_BAR_CHART_ROWS}} — for each env, generate a .bar-row with name, filled bar, and amount. Sort descending by cost. The bar width = cost / max_cost * 100%.
- {{IF_SPOT_DETECTED}} ... {{ENDIF_SPOT_DETECTED}} — include the enclosed Fargate Spot section only if any env uses Spot
- {{SPOT_ENV_COUNT}}, {{SPOT_SAVINGS}} — Fargate Spot stats
- {{TOTAL_SAVINGS}} — total savings including Spot
- {{ROI_BLOCK_LOW}} ... {{ENDIF_ROI_BLOCK_LOW}} — include the enclosed block (Variant A, no booking CTA) only if TOTAL_SAVINGS < 790
- {{ROI_BLOCK_STANDARD}} ... {{ENDIF_ROI_BLOCK_STANDARD}} — include the enclosed block (Variant B/C) only if TOTAL_SAVINGS ≥ 790
- {{ROI_BLOCK_MULTIPLIER}} ... {{ENDIF_ROI_BLOCK_MULTIPLIER}} — include the enclosed multiplier line only if TOTAL_SAVINGS ≥ 1500 (Variant C)
- {{PAYBACK_DAYS}} — ceil(790 / (TOTAL_SAVINGS / 30))
- {{NET_ANNUAL}} — (TOTAL_SAVINGS - 790) * 12
- {{MULTIPLE}} — round(TOTAL_SAVINGS / 790, 1)
Save to fortem-discovery-report.html in the user's current directory
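
The derived values in the list above (the ROI figures, the ROI variant choice, and the bar widths) can be sketched in Python. This is a minimal illustration, not part of the skill: the function names `roi_values` and `bar_widths` and the sample numbers are hypothetical, while the $790 price and the 1500 threshold come from the list above. The payback formula is written as 790 * 30 / TOTAL_SAVINGS, which is algebraically the same as 790 / (TOTAL_SAVINGS / 30) and avoids a floating-point round-up at exact boundaries.

```python
import math

FORTEM_MONTHLY_PRICE = 790   # from the template: "Fortem costs $790/mo"
MULTIPLIER_THRESHOLD = 1500  # Variant C cutoff from the placeholder list


def roi_values(total_savings):
    """Decide which ROI blocks to keep and compute their placeholders.

    total_savings is the modeled monthly savings in dollars (Spot included).
    """
    standard = total_savings >= FORTEM_MONTHLY_PRICE
    values = {
        "ROI_BLOCK_LOW": not standard,   # Variant A
        "ROI_BLOCK_STANDARD": standard,  # Variant B/C
        # Variant C only: the multiplier line sits inside the standard block.
        "ROI_BLOCK_MULTIPLIER": total_savings >= MULTIPLIER_THRESHOLD,
    }
    if standard:
        values["PAYBACK_DAYS"] = math.ceil(FORTEM_MONTHLY_PRICE * 30 / total_savings)
        values["NET_ANNUAL"] = (total_savings - FORTEM_MONTHLY_PRICE) * 12
        values["MULTIPLE"] = round(total_savings / FORTEM_MONTHLY_PRICE, 1)
    return values


def bar_widths(costs):
    """Return (env, cost, width_percent) rows sorted by cost, descending.

    costs maps env name to monthly cost; the most expensive env gets 100%.
    """
    max_cost = max(costs.values(), default=0)
    rows = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, cost, round(cost / max_cost * 100, 1) if max_cost else 0.0)
        for name, cost in rows
    ]
```

For example, with modeled savings of $2,370/mo, `roi_values(2370)` yields a 10-day payback, 18,960 in net annual savings, and a 3.0 multiple. With $500/mo, only the Variant A block is kept and no payback figures are produced.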

What this skill is NOT

Not a real-time dashboard. It's a one-time discovery. Fortem is the dashboard.
Not a deployment tool. It doesn't change your infrastructure (except for the optional DIY scheduler you can deploy yourself).
Not a Terraform executor. It reads Terraform files for context but doesn't plan or apply them.
Not a billing integration. Cost estimates are based on Fargate pricing pages, not your actual Cost & Usage Reports.

For real-time cost tracking, drift detection, and the rest, the report ends with a clear path to Fortem.

That's the whole thing.

No telemetry, no callback, no account. Download it, hand it to your agent, and get your fleet cost report in 15 minutes.
