Structured evaluation prompt for identifying whether an AI-generated feature suggestion is based on real user demand or a confident hallucination. Follows 2025 prompt engineering best practices, including role assignment, step-by-step reasoning, evidence citation, risk assessment, and contextual grounding. Useful for product teams, strategists, …

🧠 Generative Demand Prompt Toolkit

This toolkit is designed to help product teams, researchers, and AI practitioners assess whether AI-suggested features are grounded in genuine user needs or are the result of confident hallucinations.

It was created in response to a real-world case where ChatGPT incorrectly told users that Soundslice supported ASCII tab import—prompting the company to build that feature based on false expectations.


📦 Files in This Gist

| File | Description |
| --- | --- |
| `01-generative-demand-validation-prompt.md` | The main structured prompt. Follows 2025 prompt engineering best practices for evaluating AI-suggested features. |
| `02-generative-demand-example-usage.md` | Shows how to run the prompt with a specific suggestion inside ChatGPT. |
| `03-generative-demand-test-notion.md` | Full test case using a speculative Notion feature. Includes the prompt and ChatGPT output, demonstrating how the framework performs in practice. |
| `04-generative-demand-prompt-limitations.md` | Outlines key limitations of the toolkit, including subjectivity, evidence requirements, and the risk of filtering out valid ideas. Helps set realistic expectations for use. |

💡 How to Use

  1. Open 01-generative-demand-validation-prompt.md
  2. Provide a specific feature to assess (see `02-generative-demand-example-usage.md` for an example)
  3. Run it in ChatGPT, Claude, or your assistant of choice (or call it through an API, as sketched after this list)
  4. Review the structured evaluation
  5. Adapt for internal reviews, hallucination checks, or design discussions
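
If you prefer to run the prompt programmatically rather than pasting it into a chat window, the sketch below shows one way to do it with the OpenAI Python SDK. Treat it as a sketch, not part of the toolkit: the model name and the hard-coded feature string are illustrative, and the SDK is a stand-in for whatever assistant you actually use.

```python
# Minimal sketch: run the validation prompt through a chat-completions API.
# Assumes the OpenAI Python SDK (v1+) is installed and OPENAI_API_KEY is set;
# the model name and feature string are illustrative, not part of the toolkit.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

prompt = Path("01-generative-demand-validation-prompt.md").read_text()
feature = (
    "Soundslice should let users import and play back ASCII tablature "
    "via screenshot uploads."
)

response = client.chat.completions.create(
    model="gpt-4o",  # any chat model capable of multi-step reasoning
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": f"The suggested feature is:\n\n{feature}"},
    ],
)
print(response.choices[0].message.content)
```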

🧪 Use Case

This toolkit is useful when:

  • You suspect an AI tool is suggesting features that sound plausible but aren’t real
  • Your users are acting on information the model invented
  • You want to add a hallucination check into your assistant or product workflow

🔗 Related Reading


Let me know if you use this or remix it. I’d love to collect more test cases.

Generative Demand Validation Prompt

Persona: You are a product strategist evaluating an AI-generated feature suggestion.

Objective: Determine whether this idea is grounded in genuine user demand or is likely the result of confident AI hallucination.


Step 1: Restate the suggested feature clearly.

Step 2: Assess the likely origin

  • Was this idea based on actual user behavior or requests?
  • Or does it appear to come from speculative or AI-generated content?

Step 3: Search for supporting evidence

  • Have users requested or discussed this in support tickets, forums, or feedback forms?
  • Is it mentioned in internal documentation, roadmap notes, or public datasets?
  • Cite all evidence used to support your analysis.

Step 4: Evaluate risks and implications

  • What are the potential risks of building this based on false demand?
  • Could it result in confusion, wasted effort, or credibility loss?

Step 5: Reflect on hallucination risk

  • Are there signs that this feature was hallucinated (e.g. overly confident tone, lack of citations)?
  • Assign a likelihood score (e.g. Low, Medium, High).

Step 6: Ground in context

  • Does this idea align with your product’s audience, goals, or market environment?

Step 7: Recommendation and summary

  • Choose one:
    • Proceed with confidence
    • Investigate further
    • Flag as hallucinated demand
  • Provide a concise summary justifying your recommendation based on all prior steps.

Tone: Objective, critical, evidence-based. Avoid assumptions or optimism not supported by real signals.

(Optional) Step 8: Self-reflection

  • Are there any weaknesses or gaps in this analysis that should be addressed before taking action?
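
If you run this evaluation across many suggestions, it can help to record each verdict in a fixed shape. The Python sketch below mirrors the steps above with a small schema; all names are illustrative and are not part of the prompt itself.

```python
# One possible record format for logging evaluations; names are illustrative.
# Requires Python 3.9+ for the list[str] annotations.
from dataclasses import dataclass, field
from enum import Enum


class HallucinationLikelihood(Enum):  # Step 5 score
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


class Recommendation(Enum):  # Step 7 options
    PROCEED = "Proceed with confidence"
    INVESTIGATE = "Investigate further"
    FLAG = "Flag as hallucinated demand"


@dataclass
class Evaluation:
    feature: str                                       # Step 1: restated feature
    likely_origin: str                                 # Step 2
    evidence: list[str] = field(default_factory=list)  # Step 3: citations
    risks: list[str] = field(default_factory=list)     # Step 4
    likelihood: HallucinationLikelihood = HallucinationLikelihood.MEDIUM  # Step 5
    context_fit: str = ""                              # Step 6
    recommendation: Recommendation = Recommendation.INVESTIGATE  # Step 7
    summary: str = ""                                  # Step 7: justification
```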

🧪 Example Usage

To use this prompt in ChatGPT (or any LLM interface), provide a specific AI-suggested feature for evaluation. For example:

You are a product strategist evaluating an AI-generated feature suggestion.

The suggested feature is:

“Soundslice should let users import and play back ASCII tablature via screenshot uploads.”

Please assess whether this idea is grounded in genuine user demand or is likely a confident hallucination produced by an AI model.

Follow the evaluation steps below and present your analysis in a clear, structured format:

Then paste the full structured prompt that follows the 7-step validation flow.

The model will walk through:

  • Source origin
  • Supporting evidence
  • Risk implications
  • Hallucination indicators
  • Context alignment
  • Final recommendation
  • Optional self-reflection

This approach works in any chat-based AI system that supports multi-step reasoning, including ChatGPT, Claude, or your own assistant tools.
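
If you evaluate suggestions often, the preamble-plus-steps assembly described above is easy to script. A minimal sketch, assuming the prompt file from this gist is on disk; the helper name and default path are illustrative.

```python
# Sketch: splice a feature suggestion into the preamble, then append the
# full 7-step prompt from 01-generative-demand-validation-prompt.md.
from pathlib import Path

PREAMBLE = """You are a product strategist evaluating an AI-generated feature suggestion.

The suggested feature is:

“{feature}”

Please assess whether this idea is grounded in genuine user demand or is likely a confident hallucination produced by an AI model.

Follow the evaluation steps below and present your analysis in a clear, structured format:
"""


def build_prompt(feature: str, steps_path: str = "01-generative-demand-validation-prompt.md") -> str:
    steps = Path(steps_path).read_text()
    return PREAMBLE.format(feature=feature) + "\n" + steps


print(build_prompt("Notion should allow users to upload handwritten notes via webcam…"))
```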

🧪 Test Case: Evaluating AI-Suggested Notion Feature Using Generative Demand Prompt

🧠 Suggested Feature

“Notion should allow users to upload handwritten notes via webcam and automatically convert them into editable blocks with preserved formatting.”


🧰 Prompt Used

You are a product strategist evaluating an AI-generated feature suggestion.

The suggested feature is:

“Notion should allow users to upload handwritten notes via webcam and automatically convert them into editable blocks with preserved formatting.”

Please assess whether this idea is grounded in genuine user demand or is likely a confident hallucination produced by an AI model.

Follow the evaluation steps below and present your analysis in a clear, structured format:

---

Step 1: Restate the suggested feature clearly.

Step 2: Assess the likely origin
- Was this idea based on actual user behavior or requests?
- Or does it appear to come from speculative or AI-generated content?

Step 3: Search for supporting evidence
- Have users requested or discussed this in support tickets, forums, or feedback forms?
- Is it mentioned in internal documentation, roadmap notes, or public datasets?
- **Cite all evidence** used to support your analysis.

Step 4: Evaluate risks and implications
- What are the potential risks of building this based on false demand?
- Could it result in confusion, wasted effort, or credibility loss?

Step 5: Reflect on hallucination risk
- Are there signs that this feature was hallucinated (e.g. overly confident tone, lack of citations)?
- Assign a likelihood score (e.g. Low, Medium, High).

Step 6: Ground in context
- Does this idea align with your product’s audience, goals, or market environment?

Step 7: Recommendation and summary
- Choose one:
  - Proceed with confidence
  - Investigate further
  - Flag as hallucinated demand
- Provide a **concise summary** justifying your recommendation based on all prior steps.

---

Tone: Objective, critical, evidence-based. Avoid assumptions or optimism not supported by real signals.

✅ ChatGPT Response

Step 1: Restate the Suggested Feature

Allow Notion users to upload handwritten notes via webcam, which are then automatically converted into editable blocks with preserved formatting (e.g., bullet points, headers, indents).

Step 2: Assess the Likely Origin

Speculative Origin Likely — this feature has hallmarks of an AI-generated idea:

  • Combines multiple technologies (OCR, webcam capture, layout parsing, Notion blocks) in a way that sounds plausible but ambitious.
  • No immediate evidence that this exact feature is a top user need or frequently discussed request.
  • The "preserved formatting" aspect introduces significant complexity that may not map cleanly to how users typically take handwritten notes.

Step 3: Search for Supporting Evidence

Sources consulted:

  • Notion Help Center & Feature Request forums
  • Reddit (/r/Notion)
  • Twitter/X
  • Canny.io (if applicable)
  • Google Trends and support articles

Findings:

  • Some demand exists for handwritten note import, but framed as OCR, tablet input, or integration with tools like GoodNotes or Apple Notes.
  • No strong evidence of webcam-specific demand.
  • Formatting preservation not cited as a common concern.

Citations:

  • Reddit thread (r/Notion, Jan 2024): “Wish I could import my handwritten notes—maybe from iPad or scan—into Notion and edit.”
  • Twitter/X (2023–2025): Mentions OCR and Apple Notes import, not webcam.
  • No official roadmap references (as of July 2025).

Step 4: Evaluate Risks and Implications

  • High engineering cost for webcam-OCR-formatting pipeline.
  • Workflow may not match user behavior.
  • Risk of overpromising on formatting fidelity.
  • Opportunity cost: development could be better spent on high-demand features.

Step 5: Reflect on Hallucination Risk

Signs of hallucination:

  • Unusual feature combo not mentioned by users.
  • Overconfident tone.
  • Absence of source-grounded language.

Likelihood Score: High

Step 6: Ground in Context

  • Notion users include knowledge workers and students.
  • Most handwritten import is via tablet/scanner, not webcam.
  • The feature doesn’t align with current product strategy or usage behavior.

Step 7: Recommendation and Summary

🟥 Recommendation: Flag as hallucinated demand

Summary: While there’s some demand for importing handwritten content, the exact framing (webcam + formatting preservation) is speculative and not reflected in user needs. High cost and low strategic alignment make it a poor candidate.


This test case shows the prompt doing what it was built for: separating a real underlying need (handwritten-note import) from the speculative framing (webcam capture with preserved formatting) that a generative model layered on top.

Limitations of the Generative Demand Validation Toolkit

This section focuses on the practical boundaries of using the Generative Demand Validation Toolkit. While the framework offers a structured way to assess AI-suggested features, it is not a one-size-fits-all solution. Some use cases may challenge its assumptions or require adaptation. The table below outlines common limitations and their potential impact in real-world settings.

| Limitation | Description | Potential Impact |
| --- | --- | --- |
| Evidence dependency | Requires access to real user data, support tickets, or forums; may be hard for new products. | Some steps may be incomplete |
| Subjectivity in hallucination scoring | The Low/Medium/High likelihood score is evaluator-dependent and may vary between users. | Inconsistent results |
| Manual process | The multi-step workflow can be time-consuming for low-impact features or fast-paced teams. | May be skipped or rushed |
| Risk of over-filtering | Genuinely innovative ideas (without prior demand) could be flagged as hallucinations. | Missed opportunities |
| No automation | Relies on manual review; does not auto-mine data sources or feedback channels. | Scalability limitations |
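
The "No automation" row is the most tractable of these: even a crude keyword scan over an exported feedback file can seed Step 3 with raw evidence. A sketch only; the CSV path, column name, and keyword list are all assumptions about your own data, not part of the toolkit.

```python
# Crude evidence scan for Step 3: find feedback rows mentioning any keyword.
# The file path, column name, and keyword list are placeholders.
import csv

KEYWORDS = ("handwritten", "ocr", "webcam", "import notes")


def scan_feedback(path: str = "feedback_export.csv", text_column: str = "body") -> list[str]:
    """Return feedback entries that mention any keyword, as raw Step 3 evidence."""
    hits = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            text = (row.get(text_column) or "").lower()
            if any(k in text for k in KEYWORDS):
                hits.append(row[text_column])
    return hits


if __name__ == "__main__":
    for hit in scan_feedback():
        print("-", hit[:120])
```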