This benchmark compares two approaches to AI-powered codebase research:
- Standard (`/cl:research_codebase`) - Traditional file exploration using grep, glob, and file reads
- Noodlbox (`/cl:research_codebase_noodl`) - Knowledge graph-based exploration using Noodlbox

Both approaches were given identical questions about the Flick codebase (a Cloudflare-native error tracking system). The outputs were then evaluated by Claude on accuracy, completeness, actionability, and structure.