Skip to content

Instantly share code, notes, and snippets.

@DroopyTersen
Created December 19, 2025 20:25
Show Gist options
  • Select an option

  • Save DroopyTersen/c6c3fb9ae3fc1eaf6334d27f4c59aac6 to your computer and use it in GitHub Desktop.

Select an option

Save DroopyTersen/c6c3fb9ae3fc1eaf6334d27f4c59aac6 to your computer and use it in GitHub Desktop.
Claude Data Science Prompts

Prompt Template for Thorough Exploratory Data Analysis:

I want you to do a comprehensive, exploratory data analysis on [this data/topic].

Process requirements:

  1. Create a notes.md file to document findings as you go
  2. Maintain a running TODO list of questions to investigate - and ADD to it when you discover interesting threads worth pulling
  3. Use Python scripts heavily to explore the data - don't just answer questions, look for patterns, anomalies, and surprises
  4. When you find something unexpected, pursue it. The goal is discovery, not just answering a checklist.
  5. Do NOT produce any final report or summary until you've exhausted your exploration

Mindset:

  • Be a detective, not a report-writer
  • "That's weird" is a signal to dig deeper, not move on
  • Breadth first, then depth on the interesting stuff
  • Minimum 7-10 phases of analysis before synthesizing

Only when you're truly done exploring, summarize what you found most insightful. Then we'll discuss how to structure a final deliverable.

The key phrases that shift your behavior:

  1. "Add to your TODO list when you discover threads" — prevents you from treating it as a fixed checklist
  2. "Do NOT produce any final report until..." — explicitly blocks your tendency to prematurely deliver
  3. "The goal is discovery, not just answering" — reframes success criteria
  4. "'That's weird' is a signal to dig deeper" — legitimizes exploratory tangents
  5. "Minimum 7-10 phases" — sets an expectation of depth

Reusable Prompt for Data Analysis Report Generation

DOCUMENT STRUCTURE

1. Title Page

  • Project/client name prominently displayed
  • Descriptive subtitle (what was analyzed)
  • Date
  • Summary metrics table (5-7 key numbers that frame the scope)

2. Top Insights Section (60% of document focus)

This is the heart of the report. For each major insight:

Format each insight as:

  • Numbered heading with a punchy, conclusive title (not a question—state the finding)
  • 1-2 paragraph explanation in plain language (no jargon, no statistical notation)
  • A chart OR simple table that visually proves the point
  • Brief methodology nod (1-2 sentences, non-technical): "This was measured by comparing X across Y" or "Based on analysis of N projects over Z time period"
  • So-what statement: why this matters or what action it implies

Chart/Table Guidelines for Top Insights:

  • Prefer charts over tables when showing comparisons or trends
  • Tables should be 2-4 columns max, no more than 6-8 rows
  • Every visual should have an obvious takeaway within 3 seconds
  • Label notable data points directly on charts
  • Use color meaningfully (red=bad, green=good, blue=neutral)

Methodology Nods (Top Section):

  • Keep it to one sentence
  • Use plain English: "We compared..." not "A Mann-Whitney U test was performed..."
  • Focus on what was measured, not how
  • Example: "Based on grouping the 34 projects by their naming patterns and comparing average productivity"

3. Detailed Analysis Section (30% of document focus)

This section supports the top insights with technical depth. Include:

For each analysis area:

  • Section heading matching a theme (e.g., "Tier Analysis," "Predictive Modeling")
  • Brief context paragraph
  • Detailed tables with full statistics
  • Methodology boxes (technical) with:
    • Exact formulas or calculations used
    • Statistical tests performed with test statistics and p-values
    • Sample sizes and data filters applied
    • Assumptions and limitations

Methodology Box Format: Use a visually distinct shaded box labeled "📊 Methodology: [Topic]"

  • Bullet points with technical details
  • Include actual numbers: "n=34, p=0.006, R²=0.262"
  • Note any caveats: "3 projects excluded due to missing data"

4. Recommendations Section (10% of document focus)

  • Numbered list of actionable recommendations
  • Each item: bold action statement + supporting rationale in regular text
  • Quick reference table for practical application (estimation lookup, risk checklist, etc.)

STYLISTIC PREFERENCES

Tone:

  • Confident and conclusive (not hedging)
  • Insight-first, evidence-second
  • Written for a smart non-technical reader

Formatting:

  • Professional blue color scheme
  • Consistent heading hierarchy
  • Generous white space
  • Alternating row shading in tables
  • Page breaks before major sections

Charts:

  • Clean, minimal style (no 3D, no excessive gridlines)
  • Direct labels on data points for key callouts
  • Legends only when necessary
  • Title states the conclusion, not just the topic
    • Good: "Rules Projects Are 15x Slower Than Average"
    • Bad: "Velocity by Project Family"

Tables:

  • Header row with dark background, white text
  • Right-align numbers, left-align text
  • Include units in headers, not cells
  • Bold key figures that support the insight

Language:

  • Avoid: "It should be noted that...", "It is interesting that...", "As can be seen..."
  • Prefer: Direct statements. "X is Y." "The data shows Z."
  • Quantify everything: "significantly faster" → "3.2x faster"

WHAT TO AVOID

  • Dense paragraphs without visual breaks
  • Tables with more than 5-6 columns
  • Methodology in the top section that reads like a statistics textbook
  • Insights that bury the conclusion at the end of a paragraph
  • Charts that require explanation to understand
  • Repeating the same data in both chart and table form
  • Executive summary that's just a list of section headings

EXAMPLE INSIGHT BLOCK (for reference)

3. Small Projects Take MORE Time, Not Less

Counter-intuitively, the smallest projects took longer than the largest ones. Projects in the bottom 25% by size averaged 71 days, while projects in the top 25% averaged just 55 days—despite being 29x larger.

[CHART: Side-by-side bar comparison of Bottom 25% vs Top 25% showing size and days]

This pattern reflects a fixed overhead of approximately 50 days per project (environment setup, testing, deployment, documentation) that dominates small project timelines. Based on comparing the smallest and largest project quartiles.

Implication: Consider consolidating small APIs into larger migration units where architecturally feasible.


Apply this structure to generate the report for: [DESCRIBE YOUR ANALYSIS AND KEY FINDINGS]


Quick Reference: The Formula

Section Audience Visuals Methodology Depth Length
Title Page Everyone Summary table None 1 page
Top Insights Executives Charts + simple tables 1 sentence plain English 60%
Detailed Analysis Technical readers Detailed tables Full methodology boxes 30%
Recommendations Decision makers Lookup tables None 10%

Optional Add-On Prompts

If you want more charts:

For each insight, prefer a chart over a table. Use bar charts for comparisons, scatter plots for relationships, and line charts for trends. Every chart title should state the conclusion.

If you want a specific number of insights:

Identify the top [N] most surprising or actionable findings. Prioritize findings that challenge conventional assumptions or have clear business implications.

If you have a specific audience:

The primary audience is [ROLE]. Tailor the language complexity and focus areas accordingly. [ROLE] cares most about [SPECIFIC CONCERNS].

If you want anomaly callouts:

Include a section on "Anomalies Requiring Investigation" that lists data points that don't fit the patterns, with specific questions to explore.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment