@savarin
Last active June 16, 2026 03:58
Weekly Experiment Readout v10 — Friday (4-skill routine): instruction, skills, and output

Skills (v10 — 4-skill routine)

The weekly_experiment_readout orchestrator calls these 4 sub-skills:

  1. readout_partner (parameterized: company=suno, company=coterie, company=hexclad)
  2. readout_other
  3. readout_topline
  4. readout_next_steps

weekly_experiment_readout

---
name: weekly_experiment_readout
description: Produces the weekly experiment readout by orchestrating partner, other, top-line, and next-step sections. Use when asked for `weekly_experiment_readout` or the final weekly experiment readout routine instruction.
---

# weekly_experiment_readout

Produce the weekly experiment readout as the final Slack post. The visible output must be the readout itself, not setup/status text, routine metadata, tool logs, file paths, QA notes, or implementation details.

Evidence boundary: do not reference prior Slack conversation for context unless the current run explicitly provides a source link or message. Use current connected experiment/reporting sources and the skills named below. If current-week rows are empty but an older verified source is usable, include it as carry-forward evidence and state freshness plainly.

Version boundary: when the user asks for `weekly_experiment_readout` and explicitly says "not v11" (or otherwise excludes v11), do not load or follow `weekly_readout_draft_v11`, `weekly_readout_routine_v11`, or `readout_polish_v11`. Run this legacy orchestration only, while still honoring one-off sources supplied in the current request.

Orchestration:

Phase 1 — independent and parallel-safe:
1. Run `readout_partner company=suno` and store its output exactly as returned.
2. Run `readout_partner company=coterie` and store its output exactly as returned.
3. Run `readout_partner company=hexclad` and store its output exactly as returned.
4. Run `readout_other` and store its output exactly as returned.

Phase 2 — after Phase 1 completes:
5. Run `readout_next_steps`, passing the stored outputs from steps 1-4 as input, and store its output exactly as returned.
6. Run `readout_topline`, passing the stored outputs from steps 1-4 as input, and store its output exactly as returned.

Phase 3 — assemble and post:
Concatenate all stored outputs verbatim in this exact order:
1. Header: `## Weekly Experiment Readout — {current date}`
2. `readout_topline`
3. `readout_partner company=suno`
4. `readout_partner company=coterie`
5. `readout_partner company=hexclad`
6. `readout_other`
7. `readout_next_steps`

Do not rewrite, summarize, re-rank, or add analyst/debug commentary during assembly. If a section needs fixing, the section skill should produce the fixed section; the orchestrator should not patch content ad hoc.
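
The three phases above can be sketched in Python as follows. This is a minimal illustration, not the routine's actual implementation: `run_skill` is a hypothetical callable standing in for however a skill is really invoked, and the ISO date format is assumed.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date
from typing import Callable, Dict, Optional

# Hypothetical signature: takes a skill call string plus optional upstream
# outputs and returns that section's text.
SkillRunner = Callable[[str, Optional[Dict[str, str]]], str]

PHASE_1 = [
    "readout_partner company=suno",
    "readout_partner company=coterie",
    "readout_partner company=hexclad",
    "readout_other",
]


def assemble_readout(run_skill: SkillRunner, today: date) -> str:
    # Phase 1: the four calls are independent, so they may run in parallel.
    with ThreadPoolExecutor(max_workers=len(PHASE_1)) as pool:
        results = pool.map(lambda call: run_skill(call, None), PHASE_1)
        stored = dict(zip(PHASE_1, results))

    # Phase 2: both synthesis skills receive the stored Phase 1 outputs.
    phase_1_outputs = dict(stored)
    stored["readout_next_steps"] = run_skill("readout_next_steps", phase_1_outputs)
    stored["readout_topline"] = run_skill("readout_topline", phase_1_outputs)

    # Phase 3: concatenate verbatim in the fixed order, with no rewriting.
    order = ["readout_topline", *PHASE_1, "readout_next_steps"]
    header = f"## Weekly Experiment Readout — {today.isoformat()}"
    return "\n\n".join([header, *(stored[name] for name in order)])
```

Keeping assembly to a plain join makes the "no ad hoc patching" rule structural: the orchestrator has no code path that edits section text.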

Visible-output rules:
- Stakeholder-grade and forwardable.
- Start partner sections by saying what was tested.
- Include dollar/value estimate where the partner section provides one, with assumptions/caveats.
- Give specific next actions and owners in the next-steps section.
- Omit p-values, confidence intervals, raw metric names, raw IDs, tool/API names, evidence blocks, and backend/process details.
- Every per-day or per-flow bullet must include the actual winning variant name from the report data, never a generic label such as `leading treatment`. Use the format `- Day N: Variant Name — +~X% lift (~Y users)`.
- Use total evaluated users (T+C) from the report's headline sample sizes for population context. Use treatment-only exposure counts for dollar math.
- If the first draft looks like a setup/status note, discard it and produce the requested readout.
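
As an illustration of the per-day bullet format in the rules above, here is a small Python sketch. The helper names and the rounding thresholds are assumptions made for this example; only the output format string comes from the rules.

```python
def readable_count(n: int) -> str:
    """Round a user count to a readable magnitude, e.g. 1_400_000 -> '~1.4M'."""
    if n >= 999_500:  # anything that would round to 1000K is shown in millions
        return f"~{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"~{round(n / 1_000)}K"
    return f"~{n}"


def day_bullet(day: str, variant: str, lift_pct: float, evaluated_users: int) -> str:
    """Format one winning-variant bullet; intended for positive lifts only."""
    if not variant.strip():
        raise ValueError("variant name is required; generic labels are not allowed")
    users = readable_count(evaluated_users)
    return f"- Day {day}: {variant} — +~{lift_pct:.1f}% lift ({users} users)"


# Hypothetical values, for illustration only.
print(day_bullet("1", "Example Variant A", 8.2, 987_000))
```

Here `evaluated_users` is meant to be the T+C total from the report's headline sample sizes, per the population-context rule.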

readout_partner

---
name: readout_partner
description: Produces one parameterized partner section of the weekly experiment readout for Suno, Coterie, or HexClad. Use when asked for `readout_partner company=suno`, `readout_partner company=coterie`, `readout_partner company=hexclad`, or consolidated partner readout sections.
---

# readout_partner — Skill Prompt

You are producing one **partner section** of the weekly experiment readout. The caller will specify which company: `suno`, `coterie`, or `hexclad`. This section will be concatenated verbatim into a larger readout message.

## What to do

1. Pull the latest experiment variant reports for the specified company. Use the primary outcome metric for that company's experiments.

2. For each experiment, rank variants by highest `metric_lift` vs control — NOT by the report's `rank` field. The `rank` field uses lower-bound statistical ordering, which differs from actual lift.

3. Attempt to pull current-week data. If current-week data returns empty or errors, note the most recent report date and flag as carry-forward evidence.

4. If the caller supplies a one-off source for this partner, treat that source as authoritative for this partner only. Verify it is reachable and parseable during the run. If it is unavailable, fail closed for that partner: do not fall back to prior Slack/session context, stale local files, or standard connected data; produce a short unavailable-source section with no scale/kill/dollar call and add a next-step item to restore or replace the source. See `references/one_off_source_failure_handling_2026_06_15.md`. If a Coterie one-off source is reachable, parse the exact current URL's `Headline read per experiment` table and use `references/coterie_one_off_source_success_2026_06_15.md` for the verified row interpretation and dollar-math pattern.

5. When normal partner data must be fetched, first list available organizations and use the partner-specific `org_slug` (`suno`, `coterie`, `hexclad`) rather than defaulting to the workspace org. For readouts, inspect active experiments first; if a partner has none, also check staging and concluded before saying there is no readable portfolio.

6. If no experiments exist or reports error out, say so plainly. Do not speculate or pad.
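
Step 2's ranking rule can be sketched as below. The row shape is an assumption for illustration; only the `metric_lift` and `rank` field names come from the text above, and the values are made up.

```python
from typing import Dict, List


def rank_by_lift(variants: List[Dict]) -> List[Dict]:
    """Order variants by observed lift vs control, ignoring the report's `rank`."""
    return sorted(variants, key=lambda v: v["metric_lift"], reverse=True)


# Hypothetical report rows: `rank` follows lower-bound ordering, so it can
# disagree with the ordering by actual lift.
report_rows = [
    {"name": "Variant A", "metric_lift": 0.041, "rank": 1},
    {"name": "Variant B", "metric_lift": 0.082, "rank": 2},
    {"name": "Variant C", "metric_lift": -0.010, "rank": 3},
]

winner = rank_by_lift(report_rows)[0]
print(winner["name"])  # Variant B, even though its `rank` is 2
```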

## Output format

Start with the company name as a header (e.g., `### Suno`) and go directly into content.

**Opening line:** One sentence describing what the company tested — the strategic hypothesis, not metric names.

**Winners:** Compact one-line bullets per experiment/variant. Include: variant name, rounded lift (business precision like ~8%, ~4.8%), and population (rounded to readable magnitude like ~1.4M, ~156K).

**Dollar impact:** ONE combined estimate per company. Use the math appropriate to the data:
- For subscription metrics: `absolute_delta × exposure_count × price_per_unit`
- For revenue metrics: use revenue-per-exposure-day deltas from the report
- For dollar math, use the exposure count from the metric row treatment sample size (`treatment.stats.n`), not the evaluation evidence sample size.
- State what caveats apply (holdout needed, finance-grade inputs missing, etc.)
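
A worked sketch of the subscription formula above, in Python. The row layout and all numbers are hypothetical; the only details taken from the text are the formula itself and the `treatment.stats.n` path for the exposure count.

```python
def subscription_dollar_impact(
    absolute_delta: float, treatment_n: int, price_per_unit: float
) -> float:
    """absolute_delta x exposure_count x price_per_unit."""
    return absolute_delta * treatment_n * price_per_unit


# Hypothetical metric row: a 0.1 percentage-point absolute lift in
# subscription rate across 500,000 treatment exposures.
row = {"absolute_delta": 0.001, "treatment": {"stats": {"n": 500_000}}}

estimate = subscription_dollar_impact(
    row["absolute_delta"],
    row["treatment"]["stats"]["n"],  # treatment-only count, not T+C
    price_per_unit=10.0,             # assumed monthly price for the example
)
print(f"~${estimate:,.0f}/month")  # ~$5,000/month
```

Using the T+C total here instead of the treatment count would roughly double the estimate in a balanced test, which is why the exposure-count rule is stated separately.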

**Data freshness:** One line with the specific report date and current-week availability.

**Next action:** One concrete recommendation — what to scale, cut, or test next.

## Company-specific rules

### When company = suno
- Suno's subscription price is ~$10/month. Use this for dollar math on subscription lift metrics.

### When company = coterie
- Display the T+C total (treatment + control) as the evaluated population for context. Use treatment-only counts for dollar math.

## General rules

- Group experiments by strategic intent, not by individual flow or variant ID
- If a partner has no live experiments, keep the section to 3-4 lines: say what is happening instead, give no dollar estimate, and recommend converting current work into a measured test with a revenue metric and holdout
- If the report flags data corrections or taxonomy changes, note what changed and how it affects the read
- If an experiment's conversion window hasn't matured (e.g., 60-day window with 30-day cohorts), flag it as too early to call
- For every stat-sig result, include the specific lift percentage and evaluated population — never reduce a measured result to just "positive" or "negative"
- For experiments that are not yet decision-ready, group by strategic intent and state why each group isn't readable yet (sample size, conversion window, data correction, etc.)
- No p-values, confidence intervals, or statistical jargon
- No raw metric names, UUIDs, or experiment IDs
- No tool names or API references
- Do not restate data that belongs in other sections
- Keep each section stakeholder-grade and forwardable
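
The conversion-window rule above reduces to a simple comparison. A minimal Python sketch, with a hypothetical function name and wording:

```python
def maturity_flag(cohort_age_days: int, conversion_window_days: int) -> str:
    """Flag an experiment whose cohorts are younger than its conversion window."""
    if cohort_age_days < conversion_window_days:
        remaining = conversion_window_days - cohort_age_days
        return f"too early to call: {remaining} more days until the window matures"
    return "window matured"


# The example from the rule: a 60-day window read with 30-day cohorts.
print(maturity_flag(cohort_age_days=30, conversion_window_days=60))
```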

readout_topline

---
name: readout_topline
description: Produces the Top-line section of the weekly experiment readout. Use when asked for `readout_topline`, weekly experiment readout executive summary, or decisive top-line bullets synthesized from runtime partner section outputs.
---

# readout_topline

You are producing the **Top-line** section of the weekly experiment readout. This section will be concatenated verbatim into a larger readout message. Start with `### Top-line` and go directly into the content.

## What to do

1. The routine will pass you the outputs from `readout_partner company=suno`, `readout_partner company=coterie`, `readout_partner company=hexclad`, and `readout_other`. Synthesize them into 3-4 decisive top-line bullets.
2. Lead with what is decision-ready. Include the headline dollar number from the partner sections where one exists.

## Output format

A bulleted list of 3-4 items, drawn from these categories in order:

1. **What's decision-ready** — name the partner, the key result, and the dollar headline. Lead with the call, not the hedge.
2. **What's positive but needs more** — name the partner, what is promising, and what is blocking a decision.
3. **What's not readable** — partners with no data, errored reports, or no live experiments.
4. **Data gap** — include when there is an operational blocker affecting multiple partners.

Each bullet should be 1-2 sentences maximum.

## Rules

- Lead with the decision, not the caveat.
- Include dollar numbers from partner sections; do not recalculate, use the numbers they produced.
- Do not repeat full partner analysis; this is the executive summary.
- No p-values, statistical jargon, metric names, UUIDs, or tool names.
- Maximum 4 bullets.
- Synthesize from the partner outputs you receive at runtime; do not hardcode specific results.

## Quality checks

Before sending, verify:

1. The section starts with `### Top-line`.
2. There are 3-4 bullets.
3. The first bullet leads with the strongest decision-ready call from the runtime partner outputs.
4. Dollar numbers are copied from the runtime partner sections and are not recomputed.
5. The section contains no detailed variant-analysis paragraphs or operational/tool details.
6. No specific partner result, dollar number, variant, or incident appears unless it came from the runtime partner outputs.
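
Some of these checks are mechanical and could be automated. The sketch below covers checks 1, 2, and 4 under stated assumptions: the function name is hypothetical, bullets are assumed to start with `-`, `*`, or `•`, and "copied" is approximated as a substring match on the dollar figure.

```python
import re
from typing import List

DOLLAR_RE = re.compile(r"\$\d+(?:[.,]\d+)*[KM]?")
BULLET_RE = re.compile(r"^\s*[-*•]\s+")


def check_topline(section: str, partner_text: str) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    lines = [ln for ln in section.strip().splitlines() if ln.strip()]
    if not lines or lines[0].strip() != "### Top-line":
        problems.append("section must start with '### Top-line'")
    bullets = [ln for ln in lines[1:] if BULLET_RE.match(ln)]
    if not 3 <= len(bullets) <= 4:
        problems.append(f"expected 3-4 bullets, found {len(bullets)}")
    # Dollar figures should be copied from partner sections, not recomputed.
    for figure in DOLLAR_RE.findall(section):
        if figure not in partner_text:
            problems.append(f"dollar figure {figure} not found in partner outputs")
    return problems
```

The remaining checks (strongest call first, no detailed analysis) need judgment and are left to the skill itself.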

readout_other

---
name: readout_other
description: Produces the Other section of the weekly experiment readout. Use when asked for `readout_other`, a non-experiment weekly readout section, or a concise summary of recent non-experiment partner work not already covered by Suno, Coterie, or HexClad experiment sections.
---

# readout_other

You are producing the **Other** section of the weekly experiment readout. This section will be concatenated verbatim into a larger readout message. Start with `### Other` and go directly into the content.

## What to do

1. Look at recent Slack activity and any non-experiment work across partners this week, such as reporting QA, website/funnel analytics, GTM setup, creative strategy, and instrumentation reads.
2. If the user explicitly says not to reference prior conversation/thread context, do not use Slack history, session history, prior readout text, or local files derived from prior Slack work as evidence for this section. Use only current connected non-conversation sources; if none are available, say no separate non-experiment activity is being used as evidence for this readout.
3. Only include items that are not already covered by the Suno, Coterie, or HexClad experiment sections.
4. Exclude experiment-result summaries, winners, lift calls, and partner experiment readout content that belongs in Suno, Coterie, or HexClad.

## Output format

- Start with `### Other`.
- Write one short paragraph of 2-4 sentences summarizing non-experiment activity.
- If a finding is notable, such as funnel instrumentation gaps that affect future experiments, call it out briefly.
- Include no dollar estimates for non-experiment work.
- Keep this section short; it is context, not the main story.

## Rules

- No p-values, confidence intervals, or statistical jargon.
- No raw metric names, UUIDs, or experiment IDs.
- No tool names or API references.
- Do not repeat anything from Suno, Coterie, or HexClad sections.
- Maximum 3-4 lines.

## Quality checks

Before sending, verify:

1. The section starts with `### Other`.
2. The content is non-experiment activity only.
3. No Suno, Coterie, or HexClad experiment results are repeated.
4. No dollar estimate appears.
5. The section is one short paragraph after the heading.

readout_next_steps

---
name: readout_next_steps
description: Produces the Next steps & owners section of the weekly experiment readout. Use when asked for `readout_next_steps`, weekly experiment readout action items, or owner-assigned next steps synthesized from runtime partner section outputs.
---

# readout_next_steps

You are producing the **Next steps & owners** section of the weekly experiment readout. This section will be concatenated verbatim into a larger readout message. Start with `### Next steps & owners` and go directly into the content.

## What to do

1. The routine will pass you the outputs from `readout_partner company=suno`, `readout_partner company=coterie`, `readout_partner company=hexclad`, and `readout_other`. Extract the recommended next actions from each.
2. Add any operational blockers as action items, including data gaps, pipeline errors, and report failures.
3. Every item must have a specific action and dual owners.

## Output format

A numbered list of 4-6 action items. Each item is one line:

```markdown
### Next steps & owners
1. Action description (Owner A / Owner B)
2. Action description (Owner A / Owner B)
```

## Owner assignments

Use these owner pairings based on the type of work:

- Data pipeline / reporting issues -> Ezzeri / Jasper
- Suno experiment design / scaling -> Ezzeri / Steve
- Coterie client communication / scaling -> Antoine / Ezzeri
- HexClad experiment setup -> Antoine / TBD
- Coterie report/variant issues -> Jasper / Ezzeri
- Dollar estimate validation -> Steve / Ezzeri
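
The owner pairings can be treated as a lookup table. In this Python sketch the dictionary keys are hypothetical labels invented for the example; the owner names and pairings are the ones listed above.

```python
OWNER_PAIRS = {
    "data_pipeline_reporting": ("Ezzeri", "Jasper"),
    "suno_experiment_design_scaling": ("Ezzeri", "Steve"),
    "coterie_client_communication_scaling": ("Antoine", "Ezzeri"),
    "hexclad_experiment_setup": ("Antoine", "TBD"),
    "coterie_report_variant_issues": ("Jasper", "Ezzeri"),
    "dollar_estimate_validation": ("Steve", "Ezzeri"),
}


def action_line(index: int, action: str, work_type: str) -> str:
    """Format one action item; raises KeyError for an unmapped work type."""
    owner_a, owner_b = OWNER_PAIRS[work_type]
    return f"{index}. {action} ({owner_a} / {owner_b})"


# Hypothetical action text, for illustration only.
print(action_line(1, "Investigate why X returned empty", "data_pipeline_reporting"))
```

Failing on an unmapped work type, rather than defaulting, keeps the owner field from ever being left blank silently.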

## Rules

- Every item gets dual owners; never leave owners blank.
- Be specific: write `Investigate why X returned empty` instead of `Follow up on data issues`.
- Do not repeat full analysis; just the action and who owns it.
- Include data gap / pipeline items as operational blockers.
- Maximum 6 items.
- Synthesize from the partner outputs you receive at runtime; do not hardcode specific results.
- No p-values, confidence intervals, raw metric names, UUIDs, tool names, or API references.

## Quality checks

Before sending, verify:

1. The section starts with `### Next steps & owners`.
2. There are 4-6 numbered items.
3. Each item is one line and ends with `(Owner A / Owner B)`.
4. Every action is specific and mapped to the correct owner pairing.
5. Operational blockers are included when partner outputs mention missing or stale data.
6. No specific partner result, dollar number, variant, or incident appears unless it came from the runtime partner outputs.
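
Checks 1 through 3 are structural and could be verified mechanically. A minimal Python sketch, assuming one item per non-empty line after the heading; the function name and regular expression are illustrative, not part of the skill.

```python
import re
from typing import List

# A numbered item that ends with a parenthesized "Owner A / Owner B" pair.
ITEM_RE = re.compile(r"^\d+\.\s+.+\(\s*[^()/]+\s*/\s*[^()/]+\s*\)$")


def check_next_steps(section: str) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    lines = [ln.strip() for ln in section.strip().splitlines() if ln.strip()]
    if not lines or lines[0] != "### Next steps & owners":
        problems.append("section must start with '### Next steps & owners'")
    items = lines[1:]
    if not 4 <= len(items) <= 6:
        problems.append(f"expected 4-6 items, found {len(items)}")
    for item in items:
        if not ITEM_RE.match(item):
            problems.append(f"item lacks a trailing owner pair: {item!r}")
    return problems
```

Checks 4 through 6 depend on the content of the partner outputs and are left to the skill's own review.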

Output

## Weekly Experiment Readout — 2026-06-15

### Top-line

  • Suno has the clearest subscription wins: Day 1, Day 2, and Day 3-5 all show decision-ready lifts, worth roughly $20K/month in new subscription value at ~$10/month per new subscriber.
  • Coterie’s wipe-upsell pilots are decision-ready: cart and browse abandonment both lifted wipe revenue, with a combined d30-equivalent planning impact of about $91.5K before finance-grade validation.
  • Coterie’s smaller OTP/trial/retention flows are not ready to scale: Flow 1 is promising but tiny, several flows are flat or negative, and retention browse is directionally negative.
  • HexClad has no readable experiment portfolio in the current source set, so there is no scale/kill/dollar call for this readout.

### Suno

Suno tested whether lifecycle messages tied to user intent — first action, music taste, and continuation prompts — would lift new subscriptions across the welcome flow.

  • Day 1: Behavior: Discover & Listen — +~8.2% lift (~987K evaluated users).
  • Day 2: Taste: Chill & Electronic (Simple HTML) — +~9.5% lift (~1.4M evaluated users).
  • Day 3-5: Continue: Share Your Music — +~12.4% lift (~965K evaluated users).

Dollar impact: roughly $20K/month in new subscription value across the three leading arms, assuming ~$10/month per new subscriber and using treatment exposure counts; plan mix, retention, and downstream LTV still need finance validation.

Data freshness: latest usable report date is 2026-05-21; this is carry-forward evidence rather than a fresh current-week read.

Next action: scale the three winning lifecycle directions while holding out enough traffic to validate subscription value and downstream retention.

### Coterie

Coterie tested whether Neon Blue-rendered wipe-upsell creative could increase wipe revenue in abandonment flows, while smaller lifecycle flows tested whether OTP, trial, and retention nudges could move recurring or follow-on purchase behavior.

  • Cart Abandonment Wipe Upsell: +~4.8% lift across ~156K evaluated users.
  • Browse Abandonment Wipe Upsell: +~4.3% lift across ~107K evaluated users.
  • Flow 1 — First Purchase OTP, Diaper NB/1/2: +~42.7% lift across ~501 evaluated users, promising but too small for a scale call.
  • Flow 2, Flow 4, and Flow 6: negative and underpowered; do not scale.
  • Flow 3: essentially flat after the product-category taxonomy correction.
  • Reclaim Browse Abandonment Retention: directionally negative and not decision-ready.

Dollar impact: the two wipe-upsell pilots imply about $91.5K in d30-equivalent planning impact — roughly $57.0K from cart abandonment and $34.6K from browse abandonment — before net-revenue holdout and finance-grade inputs.

Data freshness: one-off source report date is 2026-05-12 and was reachable for this run.

Next action: scale the wipe-upsell pilots into a controlled revenue holdout, keep Flow 1 in learning mode until it has more volume, and stop expanding the negative/flat small-flow treatments.

### HexClad

HexClad did not have active, staging, or concluded experiment results available in the current source set for this readout.

Data freshness: no readable current experiment source was available.

Next action: convert the next HexClad lifecycle or revenue opportunity into a measured experiment with a clear revenue metric and holdout before making performance claims.

### Other

No separate non-experiment activity is being used as evidence for this readout because this run was requested without prior conversation or thread context, and no current non-conversation source was provided for an “Other” section.

### Next steps & owners

  1. Scale Suno’s Day 1, Day 2, and Day 3-5 winning lifecycle directions with a retained holdout for subscription and retention validation. (Ezzeri / Steve)
  2. Validate Suno’s ~$20K/month planning estimate against plan mix, first-month retention, and downstream LTV. (Steve / Ezzeri)
  3. Move Coterie’s cart and browse wipe-upsell pilots into a controlled revenue holdout and prepare client-facing scale guidance. (Antoine / Ezzeri)
  4. Keep Coterie Flow 1 running until it has enough volume for a reliable recurring-order call; do not scale the negative or flat small-flow treatments. (Antoine / Ezzeri)
  5. Set up HexClad’s next measurable experiment with a revenue outcome, holdout, and launch-ready reporting path. (Antoine / TBD)