We tracked the product edge conclusion for American Terawatt across 10 earlier packet generations plus the latest run (11 results in total) to understand how consistent it is.
History:
| Generation | Conclusion |
|---|---|
| 7831b0f (oldest) | Mixed differentiation; some interesting aspects but likely not decisive |
| ccf59cb | Mostly surface-level differentiation |
| c5aaa97 | Mixed differentiation; some interesting aspects but likely not decisive |
| c974d12 | Mixed differentiation; some interesting aspects but likely not decisive |
| 8dc96a0 | Mostly surface-level differentiation |
| 184142d | Mixed differentiation; some interesting aspects but likely not decisive |
| a16688a | The product is too early or there is not enough public information to confidently conclude how differentiated the product is |
| 1c40732 | Mixed differentiation; some interesting aspects but likely not decisive |
| 8390c52 | Mixed differentiation; some interesting aspects but likely not decisive |
| 6cfefd4 | The product is too early or there is not enough public information to confidently conclude how differentiated the product is |
| latest | Mixed differentiation; some interesting aspects but likely not decisive |
Findings: The conclusion oscillates between "mixed differentiation" (7/11), "not enough info" (2/11), and "mostly surface-level" (2/11). The "not enough info" results may be due to a prompt rule that said to "strongly prefer" that conclusion for companies with limited public info. American Terawatt has a sparse website but does explicitly market "12 months to power" as a differentiator. After softening the "not enough info" guidance to only apply when there is limited or minimal product information, the latest run returned to "mixed differentiation" and correctly cited the 12-month deployment claim.
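The counts in the findings can be reproduced with a short tally. This is a minimal sketch, not part of the evaluation tooling. The short labels (`mixed`, `surface`, `not_enough_info`) and the `summarize` helper are shorthand introduced here for illustration. The commit IDs and their order are copied from the history above.

```python
from collections import Counter

# Conclusions copied from the history above, oldest first.
# The short labels are illustrative shorthand, not the real conclusion strings.
history = [
    ("7831b0f", "mixed"),
    ("ccf59cb", "surface"),
    ("c5aaa97", "mixed"),
    ("c974d12", "mixed"),
    ("8dc96a0", "surface"),
    ("184142d", "mixed"),
    ("a16688a", "not_enough_info"),
    ("1c40732", "mixed"),
    ("8390c52", "mixed"),
    ("6cfefd4", "not_enough_info"),
    ("latest", "mixed"),
]


def summarize(history):
    """Return label counts, the modal label's share, and the number of flips."""
    labels = [label for _, label in history]
    counts = Counter(labels)
    modal_label, modal_count = counts.most_common(1)[0]
    # A "flip" is any pair of consecutive generations with different conclusions.
    flips = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return {
        "counts": dict(counts),
        "total": len(labels),
        "modal": modal_label,
        "agreement": modal_count / len(labels),
        "flips": flips,
    }


summary = summarize(history)
print(summary)
```

Counting flips between consecutive generations, alongside the share of the modal conclusion, shows whether the disagreement is clustered in a few runs or spread across the whole history.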
Feedback indicated that market potential sections tend to follow the same formulaic pattern — "narrow wedge... could expand... key question is..." — with excessive equivocation.
Change: Added instruction: "Keep it concise and punchy — only a few sentences. Be direct and specific, avoid equivocating or hedging both ways in the same paragraph."
Results on 3 test companies (Deeptune, Blacksmith, Vesence):
The outputs are more substantive and direct, with real data points (GitHub Actions 71M jobs/day, ABA survey findings, Anthropic $1B RL environment spend). The old pattern of hedging both ways in every paragraph is reduced. Sections are 4-5 dense sentences each.
However, the underlying structure is similar — the model still tends toward "real but narrow wedge, can expand, but risk of X." The tone improved more than the structure.
Deeptune example:
Deeptune is selling into a real but currently narrow budget pool: frontier labs first, and possibly large enterprises doing serious agent post-training later. The spend per account could be large—labs are explicitly increasing demand for RL environments, and Anthropic reportedly discussed spending more than $1 billion on them over a year—but the buyer universe is tiny, concentrated, technically sophisticated, and inclined to build internally, which makes acquisition hard and account concentration high. Deeptune's wedge is specific: "training gyms" that simulate digital work in tools like Slack and Salesforce for reinforcement learning; that can expand into adjacent evaluation, synthetic task generation, and enterprise agent testing, but only if the company proves it is more than a bespoke vendor to a handful of labs. This can become a meaningful AI infrastructure business if agent post-training becomes a durable layer of the stack, but near term it looks like a narrow, high-value infrastructure wedge rather than a broad software market.
After adding a consistency rule (requiring conclusions to match analysis text), there was concern the model may be oversteering toward "timing is especially favorable" even for companies where the answer should be "packaging improvement."
Test: Ran 5 consumer/CPG companies where "why now" should genuinely be execution/packaging, not timing:
| Company | Conclusion |
|---|---|
| KUDO Snacks (protein popcorn) | The product was possible before, but timing is especially favorable now |
| BetterBrand (low-carb bagels) | The product was possible before, but timing is especially favorable now |
| Chip City Cookies | This is mostly execution, packaging, or product improvement rather than timing |
| Arriba Foods (Mexican salsas) | The product was possible before, but timing is especially favorable now |
| BelliWelli (gut health snack bars) | The product was possible before, but timing is especially favorable now |
Result: 4/5 said "timing is especially favorable."
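A small regression check makes this kind of test repeatable after each prompt change. The sketch below assumes, per the test design above, that every company in this set should come back as execution/packaging. The `oversteer_report` helper is hypothetical, and the conclusions are copied from the table.

```python
EXECUTION = "This is mostly execution, packaging, or product improvement rather than timing"
TIMING = "The product was possible before, but timing is especially favorable now"

# Conclusions as reported in the table above.
actual = {
    "KUDO Snacks": TIMING,
    "BetterBrand": TIMING,
    "Chip City Cookies": EXECUTION,
    "Arriba Foods": TIMING,
    "BelliWelli": TIMING,
}


def oversteer_report(actual, expected=EXECUTION):
    """List the companies whose conclusion differs from the expected one."""
    misses = sorted(name for name, label in actual.items() if label != expected)
    return {"misses": misses, "miss_rate": len(misses) / len(actual)}


report = oversteer_report(actual)
print(f"{len(report['misses'])}/{len(actual)} missed: {', '.join(report['misses'])}")
```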
The model finds generic consumer trends (protein demand growing, Hispanic food market expanding, gut health going mainstream) and treats those as timing tailwinds. The analysis text acknowledges these products "were plainly sellable years ago" and "nothing required a recent breakthrough" — but the conclusion still picks "timing is favorable," which may be due to the consistency rule locking onto the analysis's mention of market trends.
KUDO Snacks:
Nothing about KUDO's current product required a recent breakthrough. The company was already selling and promoting protein popcorn years ago, including older 1 oz/5g-per-serving pages and a 2023 UFC partnership around the same basic concept. The more credible "why now" is market-side: protein has become a stronger mainstream snack attribute, retailers are entertaining more functional snack hybrids, and cleaner-oil / cleaner-label messaging has gained salience.
Why Now: The product was possible before, but timing is especially favorable now
BetterBrand:
This product was possible before; BetterBrand itself launched in 2021 and hit Whole Foods in 2022, which is evidence enough that the core experience is not a post-2023 technological breakthrough. The better explanation is favorable timing: protein has become the nutrient consumers most actively seek, better-for-you claims are spreading across bread, and big food is increasingly reformulating around protein/fiber and GLP-1-adjacent habits. That helps a product like BetterBrand, but it does not make it newly possible.
Why Now: The product was possible before, but timing is especially favorable now
Chip City Cookies:
Nothing about Chip City's current product required a recent technological breakthrough. A strong operator could have sold warm, rotating premium cookies with delivery and shipping three to five years ago. The only real timing help is softer: social-media-driven dessert virality, mature off-the-shelf commerce/loyalty tooling, and a now-proven premium-cookie playbook from chains like Crumbl, Insomnia, and Levain that has trained consumers to pay for oversized indulgent cookies. That matters, but this still reads as execution on an existing format, not a timing-dependent invention.
Why Now: This is mostly execution, packaging, or product improvement rather than timing
Arriba Foods:
No recent technological breakthrough made Arriba's current product newly possible; jarred salsa, enchilada sauces, seasoning sauces, and barbecue sauces were plainly sellable years ago. The more credible timing argument is market-side: Hispanic households are over-indexing on CPG growth, Mexican foods already have meaningful grocery penetration, and retailers are receptive to culturally rooted, cleaner-label pantry brands. Arriba's own messaging reflects that reality: it is explicitly reintroducing the brand with new packaging, expanding into cooking sauces, and framing itself as a "modern Mexican pantry" for today's home cook.
Why Now: The product was possible before, but timing is especially favorable now
BelliWelli:
The product itself is not newly enabled by technology. A flavored fiber powder/gummy with probiotics and collagen could have been sold several years ago in roughly the same form. What changed is the commercial backdrop: gut health is more mainstream in retail, Walmart is actively expanding wellness assortments, TikTok Shop and creator-led commerce now offer a real acquisition path, and GLP-1 adoption created a fresh, highly legible use case around constipation, bloating, hydration, and appetite support that BelliWelli now markets against directly. This is favorable timing, not a breakthrough-tech story.
Why Now: The product was possible before, but timing is especially favorable now
Assessment: The consistency rule may have fixed the original problem (analysis says timing is favorable but conclusion says packaging improvement) but introduced a new one: the model may be latching onto any market trend as a "timing tailwind," and the consistency rule then locks in that conclusion. The prompt may need further tuning to distinguish genuine timing shifts (new technology, regulatory change, platform shift) from normal market evolution (growing consumer demand for a category).
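One possible way to encode that distinction is to have each cited driver tagged with a category and to derive the conclusion from the tags rather than from free text. The sketch below is only an illustration under that assumption. The `Driver` type, the category names, and the `why_now_conclusion` helper are hypothetical, and the tags in the usage example were assigned by hand from the KUDO Snacks analysis above.

```python
from dataclasses import dataclass

# Categories named in the assessment as genuine timing shifts (hypothetical labels).
GENUINE_SHIFTS = {"new_technology", "regulatory_change", "platform_shift"}

TIMING = "The product was possible before, but timing is especially favorable now"
EXECUTION = "This is mostly execution, packaging, or product improvement rather than timing"


@dataclass(frozen=True)
class Driver:
    description: str
    kind: str  # e.g. "new_technology" or "consumer_demand_growth"


def why_now_conclusion(drivers):
    """Pick the timing conclusion only if at least one driver is a genuine shift."""
    if any(d.kind in GENUINE_SHIFTS for d in drivers):
        return TIMING
    return EXECUTION


# Hand-tagged drivers from the KUDO Snacks analysis (the tags are assumptions).
kudo_drivers = [
    Driver("protein is a stronger mainstream snack attribute", "consumer_demand_growth"),
    Driver("retailers entertain more functional snack hybrids", "retailer_receptivity"),
    Driver("cleaner-label messaging has gained salience", "consumer_demand_growth"),
]
print(why_now_conclusion(kudo_drivers))
```

Under these tags the KUDO Snacks case resolves to the execution conclusion. The hard part is the tagging itself: a driver such as TikTok Shop in the BelliWelli analysis could plausibly be tagged as a platform shift, so the category definitions would need to be pinned down before a rule like this is reliable.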