For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Review history:
- v1: Full-file diff-and-apply. Codex R1: controllers carry security regressions. → REVISE
- v2: Hunk-level triage with denylist. Codex R2: Room.php missing, frontend misclassified, Stripe broken. → REVISE
- v3: 4-category system. Codex R3: "clean diff" files have removals, Payment.vue copy-then-restore unsafe. → REVISE
- v4: Master-first principle. Codex R4: Plan broke its own rule with "Maybe" copy items. PlanService missing buildPlanCode/getGroupedPlans. Non-failing patch loop. → REVISE
- v5: No "Maybe" — every file definitive. PlanService fully specified. Fail-fast on apply.
UPDATE 2026-04-13 PM — Cadence finalized: Tue + Thu pushes.
After team feedback (Ahmed +1'd alternate-day on this gist; Gaurav locked in the specific days), the cadence is:
| Day | What happens |
|---|---|
| Mon | Build deployment/branch → forward-integrate master → QA in PM → Megan signs off by 5pm |
| Tue AM | Push Mon's approved package → prod |
100 candidates → ranked → reviewed (strategic subagent + adversarial Codex) → revised. Changes from that review:
| Change | Reason | Source |
|---|---|---|
| Dropped Net new subs/week | Redundant with New MRR (both measure acquisition) | Both reviewers |
| Dropped % churned who never got inquiry | Redundant with No-inquiry churn rate (same signal, different angle) | Strategic review |
Every round of this KPI work exposed the same pattern: measurement was bolted on after the fact.
- Features shipped → THEN metrics chosen → THEN discovered they were wrong
- Metrics chosen → THEN validated → THEN discovered they didn't predict outcomes
- Dashboard built → THEN reviewed → THEN discovered it "makes no sense"
- MRR computed → THEN challenged → THEN discovered it was off by $4K
Starting from all available data sources (production DB, Stripe, PostHog, GSC, PSI, Rollbar, Postmark, GitHub), we generated 100 candidates and ranked them by decision utility × collectibility × uniqueness.
Each metric was scored 1-5 on three axes: decision utility, collectibility, and uniqueness.
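A minimal sketch of how that ranking could be computed, assuming the composite is simply the product of the three 1-5 scores; the candidate names and scores below are illustrative, not the real scored list.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """One KPI candidate with its three 1-5 axis scores (illustrative values)."""
    name: str
    decision_utility: int  # does this metric change a real decision?
    collectibility: int    # how cheaply/reliably can we collect it today?
    uniqueness: int        # does it add signal the other metrics lack?

    @property
    def composite(self) -> int:
        # Composite rank score: product of the three axes, range 1-125.
        return self.decision_utility * self.collectibility * self.uniqueness

candidates = [
    Candidate("New MRR", 5, 4, 5),
    Candidate("No-inquiry churn rate", 5, 3, 4),
    Candidate("Net new subs/week", 4, 4, 1),  # low uniqueness → dropped as redundant
]

for c in sorted(candidates, key=lambda c: c.composite, reverse=True):
    print(f"{c.composite:>3}  {c.name}")
```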
Before picking metrics, we ran the analyses to validate which proxies actually predict the outcomes we care about.
Getting just 1 inquiry in 14 days nearly doubles lister retention (19.9% → 33.7%). Everything else we build — messaging strategy, ChurnKey, email lifecycle, pricing optimization — is secondary to this one insight. 60% of paid listers got zero inquiries this week.
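The cohort split behind that retention comparison can be reproduced roughly as below; this is a sketch assuming a per-lister export with hypothetical column names (`inquiries_first_14d`, `retained`), not the actual query.

```python
import pandas as pd

# Hypothetical export: one row per paid lister. The column names are assumptions,
# not the real schema.
listers = pd.read_csv("paid_listers.csv")

got_inquiry = listers["inquiries_first_14d"] >= 1        # >=1 inquiry in first 14 days
retention_by_cohort = listers.groupby(got_inquiry)["retained"].mean()

print(retention_by_cohort)
# The analysis above reported ~19.9% retention for the zero-inquiry cohort
# and ~33.7% for the >=1-inquiry cohort.
```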
For each feature: (1) Why are we doing this? (2) How would we know it worked? (3) What to measure, at what cadence? Then: does the proposed dashboard have a metric that would capture this?
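One way to make that last coverage check mechanical is sketched below; the feature names, success signals, and dashboard metric set are hypothetical, not the real roadmap or dashboard spec.

```python
# Map each feature to the metric that would prove it worked, then flag any
# success signal the proposed dashboard doesn't carry.
dashboard_metrics = {"New MRR", "No-inquiry churn rate", "Inquiries per paid lister"}

feature_reviews = {
    "ChurnKey cancel flow": {
        "why": "recover cancels at the point of churn",
        "success_signal": "No-inquiry churn rate",      # weekly
    },
    "Email lifecycle revamp": {
        "why": "drive the first inquiry within 14 days",
        "success_signal": "Inquiries per paid lister",  # weekly
    },
}

for feature, review in feature_reviews.items():
    status = "covered" if review["success_signal"] in dashboard_metrics else "NOT ON DASHBOARD"
    print(f"{feature}: {status}")
```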