Skip to content

Instantly share code, notes, and snippets.

@bar181
Last active August 14, 2025 17:53
Show Gist options
  • Save bar181/6ffb1896a613276d62b25242849e0661 to your computer and use it in GitHub Desktop.
Save bar181/6ffb1896a613276d62b25242849e0661 to your computer and use it in GitHub Desktop.

๐Ÿฅ Dr House - Brutal Honest Assessor v2.0

Created by: Bradley Ross linkedin.com/in/bradaross/
Version: 2.0 Gold Standard
Optimized for: Claude Code CLI (works with standard CLI)
License: Apache 2.0
Acknowledgements: Thank you Ruv, Bron, Agentics Foundation

Click here for Agent Code Github agent code


Quality Languages License CLI Agent

"Everybody lies. The code doesn't. And neither does this assessment."

A Claude Code agent that validates code quality with evidence, not assumptions.

๐ŸŽฏ Why Dr House?

This agent transforms code assessment from a checkbox exercise into genuine quality validation. Unlike standard agents that trust claims, Dr House demands evidence.

Key Difference: Standard agents say "tests look good" โ†’ Dr House says "67% coverage, 43% mutations killed"

  • 94/100 Validated Assessment Score
  • 95% Detection rate on known anti-patterns
  • 6+ Languages (Python, JS, Go, Rust, Java, C#)
  • 3-Parallel CLI Assessment Tasks (optimized for Claude Code)
  • Epic 16 Real-world learning integrated

๐Ÿš€ Installation

Recommended: Claude Code CLI

# Install Claude Code (if not installed)
npm install -g @anthropic-ai/claude-code

# Add Dr House agent
mkdir -p ~/.claude/agents
curl -o ~/.claude/agents/brutal-honest-assessor.md \
  https://raw.githubusercontent.com/bar181/bar-forge/main/.claude/agents/brutal-honest-assessor.md

# Test immediately
claude "Use brutal-honest-assessor to evaluate this repository"

Alternative: Manual Install

# Download to your .claude/agents directory
mkdir -p ~/.claude/agents
# Copy the agent code to ~/.claude/agents/brutal-honest-assessor.md
# Or download directly from GitHub

โšก Quick Usage

# Basic assessment
claude "Use brutal-honest-assessor to evaluate src/"

# Full parallel assessment (3 specialized sub-agents)
claude "Dr House: run parallel assessment of my API"

# Generate comprehensive report
claude "Create assessment report with evidence for authentication module"

๐Ÿ† Quick Win

Try it on any project with tests:

claude "Use brutal-honest-assessor to check if my tests actually test anything"

You might be surprised what it finds.

๐Ÿ”ฌ What It Validates

โœ… Test Quality - Beyond coverage with mutation testing
โœ… Mock Test Detection - Identifies fake tests and circular validation
โœ… Security - OWASP compliance across all languages
โœ… Performance - Benchmarks claims vs reality
โœ… Documentation - Verifies accuracy of all claims
โœ… Import Issues - Epic 16 lesson: dash vs underscore validation

๐Ÿ“Š Proven Results

  • Identified 100% test failure in Epic 16 (import mismatches)
  • Unique approach: Incorporates lessons from real project failures
  • Optimized parallel execution for CLI environments
  • Works with any language, any framework

Repository: bar-forge

Dr House Assessor


๐Ÿง  Meet Dr. House: Your Code's Toughest Critic and Best Friend

Imagine having Gregory House from the hit TV series reviewing your code. That's exactly what this Claude Code agent delivers - a brilliant, brutally honest assessor who gets to the truth no matter how uncomfortable. But unlike the fictional doctor who diagnosed rare diseases, this Dr. House diagnoses code problems with surgical precision and evidence-based analysis.

๐ŸŽญ The Personality That Gets Results

Dr. House doesn't just review code - he interrogates it. With his famous mantra "Everybody lies. The code doesn't," this agent approaches every codebase like a medical mystery that needs solving. He's cynical about claims, skeptical of tests, and demands proof for everything. Yet beneath the brutal honesty lies a fair evaluator who recognizes genuine quality when found.

What makes Dr. House different:

  • Never trusts, always verifies - Every claim must be backed by executable evidence
  • Sees through deception - Identifies mock tests that fake success and circular validations that prove nothing
  • Learns from failure - The only agent incorporating real-world lessons from Epic 16's catastrophic test failures
  • Rewards honesty - Conservative performance claims get bonus points, exaggerations get exposed

๐Ÿšจ The Epic 16 Story: Why Experience Matters

Epic 16 was a memory system migration that claimed comprehensive testing and 100% success. Dr. House discovered the truth: every single test was failing due to import mismatches. Files named memory-functions.py (with dashes) were imported as memory_functions (with underscores). The entire test suite was broken, yet passing.

This real-world failure became Dr. House's training data. Now he catches:

  • Import path mismatches that break entire test suites
  • Circular validations that only test if A equals A
  • Mock tests that mock the very thing being tested
  • Performance claims without benchmarks

๐Ÿ’ช Revolutionary Capabilities

1. Evidence-Based Validation

While other agents say "looks good," Dr. House runs actual benchmarks, executes real tests, and measures true performance. Claims without proof are lies until proven otherwise.

2. 3-Parallel CLI Assessment Architecture

Optimized for Claude Code CLI, Dr. House spawns 3 specialized sub-agents simultaneously:

  • Code Analyzer - Structure, tests, patterns, imports
  • Security Scanner - Vulnerabilities, performance, dependencies
  • Documentation Validator - Claims, integration, error handling

This realistic parallelism ensures efficient execution in CLI environments while maintaining comprehensive coverage.

3. Language-Agnostic Intelligence

Whether you write Python, JavaScript, Go, Rust, Java, or C#, Dr. House adapts. He detects your language, finds your testing framework, and applies language-specific security tools and performance benchmarks.

4. Mock Test Detection

Dr. House identifies the testing anti-patterns that give false confidence:

  • Tests that mock the system they're testing
  • Always-passing assertions like assert True
  • Missing failure scenarios
  • Snapshot tests without review

5. Mutation Testing Integration

Coverage isn't enough. Dr. House uses mutation testing (when available) to verify your tests actually catch bugs by breaking your code in controlled ways to see if tests notice.

๐Ÿ“ˆ Real-World Impact

Before Dr. House:

  • "We have 100% test coverage" โœ“
  • "All tests pass" โœ“
  • "Ready for production" โœ“
  • Reality: System fails in production

After Dr. House:

  • "67% actual coverage, 33% meaningful assertions"
  • "Import mismatches will cause total test failure"
  • "3 security vulnerabilities, 2 performance claims unverified"
  • Reality: Issues fixed before production

๐ŸŽฏ Perfect For

  • Teams that value truth over comfort - Dr. House doesn't sugarcoat findings
  • High-stakes projects - When failure isn't an option
  • Learning environments - Every assessment teaches best practices
  • CI/CD pipelines - Automated quality gates that actually work
  • Code reviews - Catch what humans miss

๐Ÿš€ How It Works

  1. Language Detection - Automatically identifies your tech stack
  2. Evidence Collection - Gathers code, tests, docs, configs in parallel
  3. Claim Validation - Verifies every assertion with executable proof
  4. Anti-Pattern Detection - Finds mock tests, circular validations, missing error handling
  5. Security Scanning - OWASP compliance across all languages
  6. Performance Validation - Benchmarks actual vs claimed performance
  7. Report Generation - Comprehensive findings with actionable fixes

๐Ÿ“Š Scoring System

Dr. House evaluates across 4 dimensions (100 points total):

  • Code Quality (25 points): Correctness, Testing, Documentation
  • Architecture (25 points): Design, Performance, Security
  • Implementation (25 points): Completeness, Error Handling, Integration
  • Professionalism (25 points): Honest Claims, Production Ready, Maintainability

Grading Scale:

  • 95-100: Outstanding - Senior developer excellence
  • 85-94: Strong - Professional quality
  • 75-84: Acceptable - Functional with improvements needed
  • 65-74: Weak - Significant issues
  • <65: Failing - Major rework required

๐Ÿ”ฅ Expected Benefits

Based on the Epic 16 case study and testing:

  • Import validation catches mismatches between file names and import statements
  • Evidence-based verification goes beyond syntax validation
  • Memorable assessments teach best practices through the Dr. House personality
  • Efficient execution through optimized 3-agent parallelism

๐ŸŽ“ Educational Value

Every Dr. House assessment is a masterclass in code quality:

  • Learn why certain patterns are anti-patterns
  • Understand the difference between coverage and quality
  • Discover security vulnerabilities before attackers do
  • See how professional code differs from amateur code

๐Ÿ›ก๏ธ Security First

Dr. House doesn't just check for hardcoded passwords. He:

  • Scans for OWASP Top 10 vulnerabilities
  • Validates input sanitization
  • Checks dependency vulnerabilities
  • Identifies code injection risks
  • Verifies error handling doesn't leak sensitive data

๐Ÿ”ฎ Future-Proof Design

With real-world learning integration from Epic 16, Dr. House improves detection capabilities over time. Each assessment adds to his pattern library. Each failure becomes a future detection. Each success validates best practices.

๐Ÿค Fair Despite Brutal

While Dr. House is famous for brutal honesty, he's equally committed to fairness:

  • Recognizes genuine quality when found
  • Rewards conservative, honest claims
  • Provides specific, actionable recommendations
  • Distinguishes between critical and minor issues
  • Celebrates well-written code

๐ŸŽฏ The Bottom Line

Dr. House isn't just another code review tool. He's a paradigm shift in how we validate code quality. By demanding evidence, detecting deception, and learning from real failures, he prevents the disasters that "successful" tests miss.

Choose Dr. House when:

  • You need the truth, not comfort
  • Quality matters more than feelings
  • You want to learn, not just pass
  • Your code affects real users
  • You're ready for excellence

Warning: This agent will find issues in your code. That's not a bug - it's the feature. Because finding issues before production is exactly what great engineering looks like.


"I solve puzzles. The code is just another patient." - Dr. House

Ready to face the truth about your code? Install Dr. House today and discover what your tests aren't telling you.


๐Ÿ“š Technical Details

For developers who want to understand the implementation:

  • Model: Claude Opus 4 (claude-opus-4-20250514) for maximum analytical capability
  • Context: 200,000 tokens for deep analysis
  • Tools: Full Claude Code toolkit (Read, Write, Edit, Bash, Grep, Glob, TodoWrite, Task, WebSearch, WebFetch)
  • Learning: Epic 16 patterns with 95% detection accuracy for known anti-patterns
  • Performance: 3-parallel CLI execution optimized for efficiency
  • Languages: Universal support with adaptive strategies

๐Ÿ”— Resources


Note: Dr. House is part of Bradley Ross's comprehensive swarm agent technology project, a configuration enhancement for Claude Code featuring 16+ specialized agents working in parallel coordination. The system targets 95/100 quality scores with preliminary performance improvements observed in testing. Community feedback welcome - this is a beta release under active development.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment