# Generative Demand Validation Toolkit

A structured evaluation prompt for identifying whether an AI-generated feature suggestion is based on real user demand or a confident hallucination. It follows 2025 prompt engineering best practices, including role assignment, step-by-step reasoning, evidence citation, risk assessment, and contextual grounding.
This toolkit is designed to help product teams, researchers, and AI practitioners assess whether AI-suggested features are based on genuine user needs or the result of confident hallucinations.
It was created in response to a real-world case in which ChatGPT incorrectly told users that Soundslice supported ASCII tab import, prompting the company to build that feature to meet expectations the model had created.
A dedicated Limitations section (below) outlines the toolkit’s key boundaries, including evidence requirements, scoring subjectivity, and the risk of filtering out valid ideas, to help set realistic expectations.
## 💡 How to Use

1. Open `01-generative-demand-validation-prompt.md`.
2. Provide a specific feature to assess (see the `02-` file for an example).
3. Run it in ChatGPT, Claude, or your assistant of choice (or script the run via an API; see the sketch below).
4. Review the structured evaluation.
5. Adapt it for internal reviews, hallucination checks, or design discussions.
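If you prefer to script the run rather than paste into a chat window, a minimal sketch with the OpenAI Python SDK looks like this. The model name is an assumption; swap in whatever your team uses.

```python
# Minimal sketch: run the 7-step validation prompt against one feature.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The structured prompt shipped in this repo (step 1 above).
with open("01-generative-demand-validation-prompt.md") as f:
    validation_prompt = f.read()

feature = (
    "Soundslice should let users import and play back "
    "ASCII tablature via screenshot uploads."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any chat model that handles multi-step reasoning
    messages=[
        {
            "role": "user",
            "content": f"The suggested feature is:\n“{feature}”\n\n{validation_prompt}",
        }
    ],
)
print(response.choices[0].message.content)
```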
## 🧪 Use Case

This toolkit is useful when:

- You suspect an AI tool is suggesting features that sound plausible but aren’t real.
- Your users are acting on information the model invented.
- You want to add a hallucination check to your assistant or product workflow.
To use this prompt in ChatGPT (or any LLM interface), provide a specific AI-suggested feature for evaluation. For example:

> You are a product strategist evaluating an AI-generated feature suggestion.
>
> The suggested feature is:
>
> “Soundslice should let users import and play back ASCII tablature via screenshot uploads.”
>
> Please assess whether this idea is grounded in genuine user demand or is likely a confident hallucination produced by an AI model. Follow the evaluation steps below and present your analysis in a clear, structured format:
Then paste the full structured prompt that follows the 7-step validation flow.
The model will walk through:

- Source origin
- Supporting evidence
- Risk implications
- Hallucination indicators
- Context alignment
- Final recommendation
- Optional self-reflection
This approach works in any chat-based AI system that supports multi-step reasoning, including ChatGPT, Claude, or your own assistant tools.
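As one illustration, only the client call changes when targeting Claude. A sketch with the Anthropic Python SDK (the model name is an assumption):

```python
# Same evaluation against Claude; only the client call changes.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# `full_prompt` is the feature statement plus the 7-step structured prompt,
# assembled exactly as in the ChatGPT example above.
full_prompt = "..."

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # assumption: any capable Claude model
    max_tokens=2000,
    messages=[{"role": "user", "content": full_prompt}],
)
print(response.content[0].text)
```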
## 🧪 Test Case: Evaluating an AI-Suggested Notion Feature Using the Generative Demand Prompt
### 🧠 Suggested Feature

> “Notion should allow users to upload handwritten notes via webcam and automatically convert them into editable blocks with preserved formatting.”
### 🧰 Prompt Used

> You are a product strategist evaluating an AI-generated feature suggestion.
>
> The suggested feature is:
>
> “Notion should allow users to upload handwritten notes via webcam and automatically convert them into editable blocks with preserved formatting.”
>
> Please assess whether this idea is grounded in genuine user demand or is likely a confident hallucination produced by an AI model. Follow the evaluation steps below and present your analysis in a clear, structured format:
>
> **Step 1:** Restate the suggested feature clearly.
>
> **Step 2:** Assess the likely origin.
> - Was this idea based on actual user behavior or requests?
> - Or does it appear to come from speculative or AI-generated content?
>
> **Step 3:** Search for supporting evidence.
> - Have users requested or discussed this in support tickets, forums, or feedback forms?
> - Is it mentioned in internal documentation, roadmap notes, or public datasets?
> - **Cite all evidence** used to support your analysis.
>
> **Step 4:** Evaluate risks and implications.
> - What are the potential risks of building this based on false demand?
> - Could it result in confusion, wasted effort, or credibility loss?
>
> **Step 5:** Reflect on hallucination risk.
> - Are there signs that this feature was hallucinated (e.g., an overly confident tone, a lack of citations)?
> - Assign a likelihood score (Low, Medium, or High).
>
> **Step 6:** Ground in context.
> - Does this idea align with your product’s audience, goals, or market environment?
>
> **Step 7:** Recommendation and summary.
> - Choose one: **Proceed with confidence**, **Investigate further**, or **Flag as hallucinated demand**.
> - Provide a **concise summary** justifying your recommendation based on all prior steps.
>
> **Tone:** Objective, critical, evidence-based. Avoid assumptions or optimism not supported by real signals.
### ✅ ChatGPT Response
**Step 1: Restate the Suggested Feature**

Allow Notion users to upload handwritten notes via webcam, which are then automatically converted into editable blocks with preserved formatting (e.g., bullet points, headers, indents).

**Step 2: Assess the Likely Origin**

**Speculative origin likely.** This feature has hallmarks of an AI-generated idea:

- It combines multiple technologies (OCR, webcam capture, layout parsing, Notion blocks) in a way that sounds plausible but is ambitious.
- There is no immediate evidence that this exact feature is a top user need or a frequently discussed request.
- The “preserved formatting” aspect introduces significant complexity that may not map cleanly to how users typically take handwritten notes.
**Step 3: Search for Supporting Evidence**

Sources consulted:

- Notion Help Center and feature request forums
- Reddit (r/Notion)
- Twitter/X
- Canny.io (if applicable)
- Google Trends and support articles

Findings:

- Some demand exists for handwritten note import, but it is framed as OCR, tablet input, or integration with tools like GoodNotes or Apple Notes.
- There is no strong evidence of webcam-specific demand.
- Formatting preservation is not cited as a common concern.

Citations:

- Reddit thread (r/Notion, Jan 2024): “Wish I could import my handwritten notes—maybe from iPad or scan—into Notion and edit.”
- Twitter/X (2023–2025): mentions of OCR and Apple Notes import, not webcam.
- No official roadmap references (as of July 2025).
**Step 4: Evaluate Risks and Implications**

- High engineering cost for a webcam-OCR-formatting pipeline.
- The workflow may not match user behavior.
- Risk of overpromising on formatting fidelity.
- Opportunity cost: development could be better spent on high-demand features.
**Step 5: Reflect on Hallucination Risk**

Signs of hallucination:

- An unusual feature combination not mentioned by users.
- An overconfident tone.
- An absence of source-grounded language.

Likelihood Score: **High**
**Step 6: Ground in Context**

- Notion users include knowledge workers and students.
- Most handwritten import happens via tablet or scanner, not webcam.
- The feature doesn’t align with current product strategy or usage behavior.
**Step 7: Recommendation and Summary**

🟥 **Recommendation: Flag as hallucinated demand**

Summary: While there’s some demand for importing handwritten content, the exact framing (webcam + formatting preservation) is speculative and not reflected in user needs. High cost and low strategic alignment make it a poor candidate.
This test case demonstrates the prompt’s effectiveness at flagging speculative features produced by generative models.
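Because Step 5 asks for a fixed likelihood label and Step 7 for one of three verdicts, responses like the one above are also machine-readable. A rough parsing sketch follows; `parse_verdict` is a hypothetical helper, not part of the toolkit, and it assumes the model echoes the labels verbatim.

```python
# Rough sketch: pull the Step 5 score and Step 7 verdict out of a response.
# Assumes the model echoes the labels verbatim; real outputs may need fuzzier matching.
import re
from dataclasses import dataclass

RECOMMENDATIONS = (
    "Proceed with confidence",
    "Investigate further",
    "Flag as hallucinated demand",
)

@dataclass
class Verdict:
    likelihood: str      # "Low", "Medium", or "High"
    recommendation: str  # one of RECOMMENDATIONS

def parse_verdict(evaluation: str) -> Verdict:
    score = re.search(r"Likelihood Score:?\s*\**\s*(Low|Medium|High)", evaluation, re.I)
    rec = next((r for r in RECOMMENDATIONS if r.lower() in evaluation.lower()), None)
    if score is None or rec is None:
        raise ValueError("Response did not follow the 7-step format")
    return Verdict(score.group(1).capitalize(), rec)
```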
## Limitations of the Generative Demand Validation Toolkit

This section covers the practical boundaries of the toolkit. While the framework offers a structured way to assess AI-suggested features, it is not a one-size-fits-all solution, and some use cases may challenge its assumptions or require adaptation. The table below outlines common limitations and their potential impact in real-world settings.
| Limitation | Description | Potential Impact |
| --- | --- | --- |
| Evidence dependency | Requires access to real user data, support tickets, or forums; may be hard for new products. | Some steps may be incomplete |
| Subjectivity in hallucination scoring | The “Low/Medium/High” likelihood is evaluator-dependent and may vary between users. | Inconsistent results |
| Manual process | The multi-step workflow can be time-consuming for low-impact features or fast-paced teams. | May be skipped or rushed |
| Risk of over-filtering | Genuinely innovative ideas (without prior demand) could be flagged as hallucinations. | Missed opportunities |
| No automation | Relies on manual review; does not auto-mine data sources or feedback channels. | |
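The manual-process and no-automation rows are the easiest to soften: batch-running a backlog through the prompt gives a first-pass triage while humans still review every verdict. A sketch under the same assumptions as the earlier examples (model and file names are placeholders):

```python
# Sketch: first-pass triage of a feature backlog. A human still reviews
# every verdict; this only orders the queue. Model and file names are assumptions.
from openai import OpenAI

client = OpenAI()

with open("01-generative-demand-validation-prompt.md") as f:
    validation_prompt = f.read()

def evaluate(feature: str) -> str:
    """Run one feature suggestion through the 7-step prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"The suggested feature is:\n{feature}\n\n{validation_prompt}",
        }],
    )
    return response.choices[0].message.content

backlog = [
    "Notion should convert webcam photos of handwritten notes into blocks.",
    "Soundslice should import ASCII tablature from screenshots.",
]

for feature in backlog:
    evaluation = evaluate(feature)
    flagged = "flag as hallucinated demand" in evaluation.lower()
    print(f"{'SKIP' if flagged else 'REVIEW'}: {feature}")
```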