Triggered by: AWS Budget Notification, "$90/day All Accounts Budget"

Account: 806877424398 (MilliononMars management)

Budget: $2,700.00/month | Alert threshold: $2,295.00 (85%) | Actual (Apr 1–9): $2,480.72

Investigated: April 10, 2026 at ~12:30 PM

Two overlapping problems:
- A real spike: Polaris Dev Bedrock usage jumped roughly 40x on April 6 (from a typical ~$3/day to $118/day) and is still running at $46–73/day. This needs immediate investigation.
- A structural problem — the org actually spends ~$7,000–8,000/month. The $2,700 budget is ~3x underallocated. March full month actual: $7,334.
Current burn rate: ~$286/day → Month-end forecast: ~$8,600 (3.2x the budget)
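
The forecast above is a straight-line projection. A minimal sketch of the arithmetic, assuming the 9-day org-wide total of $2,578.05 from the account breakdown and a 30-day April (the function name is illustrative):

```python
def month_end_forecast(spend_to_date: float, days_elapsed: int, days_in_month: int) -> tuple[float, float]:
    """Return (daily burn rate, projected month-end spend) by linear extrapolation."""
    daily_burn = spend_to_date / days_elapsed
    return daily_burn, daily_burn * days_in_month


BUDGET = 2700.00  # monthly budget in USD

daily, forecast = month_end_forecast(2578.05, days_elapsed=9, days_in_month=30)
print(f"burn ${daily:,.2f}/day, forecast ${forecast:,.0f}, {forecast / BUDGET:.1f}x budget")
```

Linear extrapolation treats the April 6–9 spike as part of the run rate, so the forecast would drop if the Polaris Dev usage is brought back to baseline.
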
| Rank | Account | Name | Spend | % of Total |
|---|---|---|---|---|
| 1 | 209418081336 | Polaris Dev | $819.01 | 31.8% |
| 2 | 806206016804 | Polaris Prod | $410.98 | 15.9% |
| 3 | 924932513137 | Bike4Mind Dev | $389.74 | 15.1% |
| 4 | 806877424398 | MilliononMars (mgmt) | $307.58 | 11.9% |
| 5 | 191072691096 | Q Portal Prod | $169.35 | 6.6% |
| 6 | 115229011504 | Bike4Mind Prod | $154.57 | 6.0% |
| 7 | 137805459015 | AcmePrivateCo | $125.41 | 4.9% |
| 8 | 614006037970 | Q Portal Dev | $102.61 | 4.0% |
| 9 | 446072083824 | ErikBethkeMoM | $35.06 | 1.4% |
| 10 | Others (18 accts) | Various | $63.74 | 2.5% |
| | Total | | $2,578.05 | 100% |
| Rank | Service | 9-Day Spend | Annualized Pace | Status |
|---|---|---|---|---|
| 1 | Amazon Bedrock | $652 | ~$26,000/yr | 🔴 Surging |
| 2 | AWS Lambda | $401 | ~$16,000/yr | 🟡 Elevated |
| 3 | Amazon S3 | ~$220 | ~$8,800/yr | 🟡 Growing |
| 4 | Amazon ECS | ~$180 | ~$7,200/yr | 🟢 Stable |
| 5 | EC2 + EC2-Other | ~$175 | ~$7,000/yr | 🟡 Recurring spikes |
| 6 | Amazon CloudWatch | ~$110 | ~$4,400/yr | 🟡 Elevated |
| 7 | AWS Support (Business) | $100 | $1,200/yr | 🟢 Fixed |
| 8 | Amazon SQS | ~$90 | ~$3,600/yr | 🟡 Elevated |
| 9 | Amazon OpenSearch | ~$75 | ~$3,000/yr | 🟡 Watch |
| 10 | AWS WAF | ~$25 | ~$1,000/yr | 🟢 Normal |
Daily Bedrock spend on Polaris Dev (account 209418081336) shows that something changed on or around April 5–6:
| Date | Spend | Primary Driver |
|---|---|---|
| Apr 1 | $10.69 | Sonnet 4.5 + Haiku |
| Apr 2 | $3.84 | Mixed |
| Apr 3 | $25.65 | Opus 4.6 spike ($17.30) |
| Apr 4 | $2.18 | Light |
| Apr 5 | $0.38 | Minimal |
| Apr 6 | $118.62 | Opus 4.6: $65.52 + Sonnet 4.5: $46.78 |
| Apr 7 | $91.22 | Opus 4.6: $53.46 + Sonnet 4.5: $36.44 |
| Apr 8 | $73.01 | Sonnet 4.5: $40.90 + Opus 4.6: $30.99 |
| Apr 9 | $46.26 | Sonnet 4.5 + Sonnet 4.6 |
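
The step change in the daily figures can be confirmed programmatically. The sketch below flags any day whose spend exceeds a multiple of the median of all preceding days. The 5x multiplier and the three-day minimum history are illustrative assumptions, not thresholds used in this investigation.

```python
from statistics import median

# Polaris Dev daily Bedrock spend (USD), April 1-9, copied from the table above.
daily_spend = {
    "Apr 1": 10.69, "Apr 2": 3.84, "Apr 3": 25.65, "Apr 4": 2.18, "Apr 5": 0.38,
    "Apr 6": 118.62, "Apr 7": 91.22, "Apr 8": 73.01, "Apr 9": 46.26,
}


def flag_spikes(spend: dict[str, float], multiplier: float = 5.0, min_history: int = 3) -> list[str]:
    """Flag days whose spend exceeds `multiplier` times the median of all earlier days."""
    days = list(spend)
    flagged = []
    for i, day in enumerate(days):
        history = [spend[d] for d in days[:i]]
        # Skip days without enough history to form a baseline.
        if len(history) >= min_history and spend[day] > multiplier * median(history):
            flagged.append(day)
    return flagged


print(flag_spikes(daily_spend))
```

With these settings April 6, 7, and 8 are flagged. April 9 is not, because the earlier spike days have already raised the rolling baseline; comparing against a fixed pre-spike baseline would keep flagging it.
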
| Model | Spend | % of Bedrock |
|---|---|---|
| Claude Opus 4.6 | $419.45 | 64% |
| Claude Sonnet 4.5 | $170.22 | 26% |
| Claude Sonnet 4.6 | $46.49 | 7% |
| Claude Haiku 4.5 | $9.00 | 1% |
| Claude Opus 4.5 | $7.33 | 1% |
| Total | $652.49 | 100% |
| Anomaly | Dates | Impact | Service | Account | Root Cause |
|---|---|---|---|---|---|
| Polaris Dev Bedrock spike | Apr 6–present | ~$283 | Bedrock | Polaris Dev | Likely runaway agent loop or batch job |
| S3 Storage growth | Mar 3–present (ongoing 6+ wks) | $45+ | S3 TimedStorage | Multiple | Growing data, no lifecycle policies |
| EC2 c6a.2xlarge spikes | Recurring (6+ times in 45 days) | $22/event | EC2 | mgmt acct | Periodic high-CPU workload |
| Lambda spike | Apr 7 | $24 single day | Lambda | Multiple | Traffic spike or runaway invocation |
| CloudFront blow-up | Mar 30–31 | $243 | CloudFront | ErikBethkeMoM | CA-region HTTPS proxy, likely bot |
| SQS volume | Recurring (7+ times) | $4–8/event | SQS | b4m-dev + Polaris Dev | High polling frequency |
| Aurora RDS spike | Apr 6–9 | $8 | Aurora ServerlessV2 | Polaris | Scaling events |
- Lambda Provisioned Concurrency alone: $74.67 in 9 days = ~$249/month, charged 24/7 regardless of traffic
- Accounts running Provisioned Concurrency: b4m-prod, b4m-dev, AcmePrivateCo
- Lambda GB-Second execution ($324.54) implies high-duration or memory-heavy functions
- b4m-dev: 4,428 GB-months of S3 storage usage (TimedStorage) at $101.86 in 9 days
- Anomaly has been ongoing since March 3 — no cleanup policies active
- Affected: Polaris Dev, b4m-dev, b4m-prod, AcmePrivateCo, Polaris Prod
- Q Portal Prod (191072691096): $77.96 on CloudWatch in 9 days — 46% of its total bill
- Root cause: `USE2-Application-Signals-Bytes`
  - Application Signals enabled with no byte-limit controls
  - Likely indexing verbose logs with no sampling rate
- 82.7M standard + 1.9M FIFO SQS requests on b4m-dev
- At $0.40 per million requests, this is a ~$110/month pace
- Likely Lambda-based queue polling with short polling intervals or high-frequency processing loops
- Recurring anomaly flagged 7+ times since March 1
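
The SQS pace quoted above can be reproduced as follows. This is a sketch under two assumptions: that the request counts cover the same 9-day window as the rest of the report, and that the single $0.40/million rate applies to both standard and FIFO requests, as the figures above do.

```python
PRICE_PER_MILLION = 0.40  # USD per million requests, the rate quoted above


def sqs_monthly_pace(requests_millions: float, window_days: int, days_in_month: int = 30) -> float:
    """Scale the request cost observed over `window_days` to a full month."""
    window_cost = requests_millions * PRICE_PER_MILLION
    return window_cost / window_days * days_in_month


# 82.7M standard + 1.9M FIFO requests on b4m-dev.
pace = sqs_monthly_pace(82.7 + 1.9, window_days=9)
print(f"~${pace:,.0f}/month")
```
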
| Issue | Classification | Financial Impact |
|---|---|---|
| Polaris Dev Bedrock spike (Apr 6+) | New infra / runaway agent process | ~$90–120/day at peak (Apr 6–7), $46–73/day ongoing |
| Bedrock Opus 4.6 org-wide | Traffic growth / model selection | $419 in 9 days |
| Lambda Provisioned Concurrency | Configuration — always-on charge | ~$249/month structural |
| S3 storage growth | No lifecycle cleanup | Growing ~$50+/month |
| Q Portal CloudWatch App Signals | Misconfiguration | $78 in 9 days on one account |
| EC2 c6a.2xlarge recurring | Scheduled/periodic, unoptimized | $10–12/event × 6+ times |
| SQS high volume | Architecture — polling frequency | ~$110/month recurring |
| Budget itself | Structural — 3x underallocated vs actual | $7,334 actual in March |
P0-1: Identify what changed on Polaris Dev on/around April 6
Check deployments, agent runs, or batch jobs started on account 209418081336 around April 5–6. Bedrock spend jumped from $2–3/day to $118/day and is still running at $46–73/day.
- Estimated savings if resolved: ~$1,500–2,100/month
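
One way to start narrowing this down is a daily Cost Explorer breakdown for the account, grouped by service, so the day the curve bends and the line items responsible are visible side by side. The sketch below builds the request for boto3's `get_cost_and_usage`. It assumes credentials with Cost Explorer access in the management account, and the helper names are illustrative.

```python
def build_daily_cost_query(account_id: str, start: str, end: str) -> dict:
    """Build get_cost_and_usage parameters: daily unblended cost per service for one account.

    `end` is exclusive, as Cost Explorer requires.
    """
    return {
        "TimePeriod": {"Start": start, "End": end},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Dimensions": {"Key": "LINKED_ACCOUNT", "Values": [account_id]}},
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }


def fetch_daily_costs(params: dict) -> list[tuple[str, str, float]]:
    """Run the query and return (date, service, cost) rows. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the query builder works without boto3 installed

    ce = boto3.client("ce")
    rows = []
    while True:
        resp = ce.get_cost_and_usage(**params)
        for day in resp["ResultsByTime"]:
            for group in day["Groups"]:
                cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
                rows.append((day["TimePeriod"]["Start"], group["Keys"][0], cost))
        token = resp.get("NextPageToken")
        if not token:
            return rows
        params = {**params, "NextPageToken": token}


# Polaris Dev, April 1 through April 9 inclusive.
query = build_daily_cost_query("209418081336", "2026-04-01", "2026-04-10")
```

Matching the first spike day against deployment history or CloudTrail events for the same window is the natural next step.
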
P0-2: Review Bedrock model selection (Opus 4.6 is 64% of Bedrock spend)
Evaluate whether Claude Sonnet 4.5/4.6 can handle most requests (3–5x cheaper than Opus 4.6). A model routing strategy (Haiku for classification, Sonnet for generation, Opus only for complex reasoning) could reduce Bedrock costs 40–60%.
- Estimated savings: $700–1,200/month
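
A tiered routing policy can be as simple as a lookup from task type to model tier. The sketch below is purely illustrative: the task category names, the default tier, and the `route` helper are assumptions, and real model identifiers would need to be looked up per account and region.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Hypothetical task categories, mapped to the cheapest tier expected to handle them.
ROUTING_TABLE = {
    "classification": Tier.HAIKU,
    "generation": Tier.SONNET,
    "complex_reasoning": Tier.OPUS,
}


def route(task_type: str, default: Tier = Tier.SONNET) -> Tier:
    """Pick a model tier for a task; unknown task types fall back to the default tier."""
    return ROUTING_TABLE.get(task_type, default)


print(route("classification").value, route("complex_reasoning").value, route("other").value)
```

Defaulting unknown tasks to the middle tier keeps Opus opt-in, which is the behavior the recommendation is aiming for.
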
P1-1: Disable Lambda Provisioned Concurrency on dev
Provisioned Concurrency is charged 24/7 regardless of traffic, including in non-production environments. Unless hard cold-start SLAs are required there, remove it from dev.
- Estimated savings: $60–90/month
P1-2: Fix Q Portal Prod CloudWatch Application Signals
Disable Application Signals or add sampling/filtering on account 191072691096.
- Estimated savings: $200–250/month
P1-3: Implement S3 Lifecycle Policies across all accounts
Transition objects older than 30–90 days to Intelligent-Tiering or Glacier. Delete old versions and incomplete multipart uploads.
- Accounts to prioritize: b4m-dev, Polaris Dev, b4m-prod, AcmePrivateCo
- Estimated savings: $50–150/month
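
A lifecycle rule covering those three actions might look like the following sketch. The day counts (30, 30, 7), the rule ID, and the helper names are assumptions to adjust per bucket. Note that `put_bucket_lifecycle_configuration` replaces any lifecycle configuration already on the bucket, so existing rules should be merged in first.

```python
def build_lifecycle_config(transition_days: int = 30, noncurrent_days: int = 30, abort_mpu_days: int = 7) -> dict:
    """Build a one-rule lifecycle configuration: tier objects, expire old versions, abort stale uploads."""
    return {
        "Rules": [
            {
                "ID": "tier-and-clean",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix applies the rule to every object
                "Transitions": [{"Days": transition_days, "StorageClass": "INTELLIGENT_TIERING"}],
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_mpu_days},
            }
        ]
    }


def apply_lifecycle(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    boto3.client("s3").put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=config)


config = build_lifecycle_config()
```
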
P1-4: Add per-account Bedrock anomaly alerts
Create Cost Anomaly Detection monitors for Bedrock with daily thresholds: $20 on dev accounts, $50 on prod accounts.
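
Before configuring the monitors, the proposed thresholds can be sanity-checked against recent history. This sketch applies the $20 dev limit to the Polaris Dev daily Bedrock figures reported earlier. It is plain threshold logic with illustrative names and does not call the Cost Anomaly Detection API.

```python
DAILY_LIMITS = {"dev": 20.00, "prod": 50.00}  # USD/day, from the recommendation above

# Polaris Dev daily Bedrock spend (USD), April 1-9, as reported in this investigation.
polaris_dev_spend = {
    "Apr 1": 10.69, "Apr 2": 3.84, "Apr 3": 25.65, "Apr 4": 2.18, "Apr 5": 0.38,
    "Apr 6": 118.62, "Apr 7": 91.22, "Apr 8": 73.01, "Apr 9": 46.26,
}


def days_over_limit(daily_spend: dict[str, float], environment: str) -> list[str]:
    """Return the days on which spend exceeded the daily limit for the given environment."""
    limit = DAILY_LIMITS[environment]
    return [day for day, spend in daily_spend.items() if spend > limit]


print(days_over_limit(polaris_dev_spend, "dev"))
```

At $20/day the alert would have fired on April 3 and on every day from April 6 onward, so the spike would have surfaced several days before the org-wide budget alert did.
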
P2-1: Investigate SQS polling frequency on b4m-dev and Polaris Dev
Review Lambda event source mappings. Consider longer polling intervals, larger batch sizes, and DLQ configurations to prevent retry storms.
- Estimated savings: $30–60/month
P2-2: Investigate recurring EC2 c6a.2xlarge on management account
There were 6 anomaly events in 45 days. Convert to Spot Instances or right-size if the workload allows.
- Estimated savings: $40–60/month
P2-3: Review OpenSearch sizing
Polaris Dev ($17) and Polaris Prod ($58): confirm clusters are appropriately sized and not running idle indexes.
P3-1: Right-size the budget
Actual spend is ~$7,000–8,000/month. The $2,700 budget triggers alerts constantly and creates noise. Either raise the budget to reflect reality or create per-account/per-project budgets for meaningful attribution.
P3-2: Implement Bedrock cost allocation tags
Tag all Bedrock API calls with project, team, and environment to understand model usage by feature/product.
P3-3: Consider Bedrock Provisioned Throughput for consistent base load
If Sonnet-tier usage is consistent (which it appears to be for b4m-prod), explore Bedrock Provisioned Throughput for baseline workloads.
Investigation performed: April 10, 2026 | No changes made (verification only)

Tools: AWS Cost Explorer, Cost Anomaly Detection, CloudWatch Metrics