| description |
|---|
| Analyze codebase for over-engineering patterns and unnecessary complexity |
You are tasked with analyzing a codebase to identify signs of over-engineering, unnecessary abstractions, and architectural complexity that doesn't add business value.
- Analyze the Project Structure:
  - Count total files and directory depth
  - Identify abstract classes and their implementations
  - Map inheritance hierarchies
  - Detect wrapper patterns around external APIs
  - Calculate test-to-code ratios
- Identify Over-Engineering Patterns:
  - Premature Abstraction: Abstract classes with ≤2 implementations
  - File Explosion: Multiple files for single logical units
  - Deep Hierarchies: Inheritance chains >2 levels
  - Wrapper Mania: API wrappers without added value
  - Copy-Paste Inheritance: Nearly identical classes
- Calculate Metrics:
  - Lines of code (production vs test)
  - Test-to-code ratio
  - Files per thousand lines of code
  - Average file size
  - Directory nesting depth
  - Abstraction-to-implementation ratio
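Most of these metrics can be gathered in one pass over the file tree. The sketch below is one way to do it under stated assumptions: it only looks at files with a given suffix, counts non-blank lines (comments included), and treats anything under a `tests` directory or named `test_*.py` / `*_test.py` as test code. The function names and the returned keys are illustrative, and the abstraction-to-implementation ratio is left out because it needs class-level analysis.

```python
from pathlib import Path


def is_test_file(rel_path: Path) -> bool:
    """Assumed convention: tests live under tests/ or are named test_*.py / *_test.py."""
    return (
        "tests" in rel_path.parts
        or rel_path.name.startswith("test_")
        or rel_path.name.endswith("_test.py")
    )


def count_lines(path: Path) -> int:
    """Count non-blank lines; comment lines are included for simplicity."""
    with path.open(encoding="utf-8", errors="replace") as fh:
        return sum(1 for line in fh if line.strip())


def collect_metrics(root: str, suffix: str = ".py") -> dict:
    """Walk `root` and return line counts, file counts, ratios, and nesting depth."""
    root_path = Path(root)
    prod_lines = test_lines = prod_files = test_files = max_depth = 0
    for path in root_path.rglob(f"*{suffix}"):
        if not path.is_file():
            continue
        rel = path.relative_to(root_path)
        # Depth is the number of directories between root and the file.
        max_depth = max(max_depth, len(rel.parts) - 1)
        lines = count_lines(path)
        if is_test_file(rel):
            test_lines += lines
            test_files += 1
        else:
            prod_lines += lines
            prod_files += 1
    total_lines = prod_lines + test_lines
    total_files = prod_files + test_files
    return {
        "production_lines": prod_lines,
        "production_files": prod_files,
        "test_lines": test_lines,
        "test_files": test_files,
        "test_to_code_ratio": test_lines / prod_lines if prod_lines else 0.0,
        "files_per_kloc": total_files / (total_lines / 1000) if total_lines else 0.0,
        "average_file_size": total_lines / total_files if total_files else 0.0,
        "max_directory_depth": max_depth,
    }
```

Calling `collect_metrics("path/to/project")` returns a plain dictionary, which keeps the numbers easy to drop into the report.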
- Severity Assessment:
  - 🟢 Healthy (0-30): Clean, maintainable code
  - 🟡 Warning (31-60): Some unnecessary complexity
  - 🔴 Critical (61-100): Severe over-engineering
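The bands above fix the 0-100 scale but not how a score is produced. One possible approach, sketched below, is to add weighted penalty points per finding and cap the total at 100. The weights in `PATTERN_WEIGHTS` are hypothetical placeholders and should be tuned to the codebase; only the band boundaries come from the assessment itself.

```python
# Hypothetical weights: only the 0-100 bands are given, not a scoring formula.
PATTERN_WEIGHTS = {
    "premature_abstraction": 10,
    "file_explosion": 8,
    "deep_hierarchy": 12,
    "wrapper_mania": 6,
    "copy_paste_inheritance": 10,
}


def severity_score(findings: dict) -> int:
    """Sum weight * occurrence count over all findings, capped at 100."""
    raw = sum(PATTERN_WEIGHTS[pattern] * count for pattern, count in findings.items())
    return min(100, raw)


def severity_band(score: int) -> str:
    """Map a 0-100 score onto the Healthy / Warning / Critical bands."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 30:
        return "Healthy"
    if score <= 60:
        return "Warning"
    return "Critical"
```

For example, two premature abstractions and one pass-through wrapper would score 26 under these weights, which falls in the Healthy band.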
- First, check the target directory structure using `list_files`
- Count and categorize files by type and purpose
- Analyze inheritance patterns and class relationships with `list_code_definition_names`
- Search for abstract base classes and count implementations
- Calculate code metrics and duplication
- Generate severity score and recommendations
Present findings in this structure:
- Production Code: X lines across Y files
- Test Code: Z lines across W files
- Test-to-Code Ratio: X:1
- Average File Size: X lines
- Max Directory Depth: X levels
- Description of the pattern
- Specific examples from codebase
- Files affected: List key files
- Recommendation: Actionable fix
- Lines of Code: ~X lines (Y% reduction possible)
- File Count: Current → Suggested
- Maintenance Time: Estimated hours/year saved
- [Highest priority issue]
- [Second priority issue]
- [Third priority issue]
Found: BaseDocumentParser with only PDFParser and DocxParser implementations
Issue: All parsers 95% identical, just calling the same API
Fix: Single configurable parser class
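To make the suggested fix concrete, a single configurable parser might look something like the sketch below. Everything here is illustrative: `extract_text` is a stand-in for whatever shared API the real parsers call, and `DocumentParser`, `strip_whitespace`, and `PARSERS` are hypothetical names rather than anything taken from an analyzed codebase.

```python
from dataclasses import dataclass


def extract_text(data: bytes, file_type: str) -> str:
    """Stand-in for the shared external API that every parser delegates to (hypothetical)."""
    return f"[{file_type}] {data.decode('utf-8', errors='replace')}"


@dataclass(frozen=True)
class DocumentParser:
    """One parser configured per format, replacing one subclass per format."""

    file_type: str
    strip_whitespace: bool = True

    def parse(self, data: bytes) -> str:
        text = extract_text(data, self.file_type)
        return text.strip() if self.strip_whitespace else text


# Supporting a new format becomes a new entry here, not a new class.
PARSERS = {ext: DocumentParser(file_type=ext) for ext in ("pdf", "docx")}
```

The differences between formats live in configuration, so the abstract base class and its near-identical subclasses are no longer needed.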
Found: 6+ files for single node implementation
├── node.py
├── params.py
├── schemas.py
├── ports/input.py
├── ports/output.py
└── __init__.py
Issue: Unnecessary file separation
Fix: Consolidate into 1-2 files
Found: ServiceWrapper → ExternalAPI
Issue: Just passes through calls
Fix: Use ExternalAPI directly
- Start simple, iterate when needed
- Use composition over inheritance
- Keep related code together
- Write for current requirements
- Prefer configuration over new classes
- Don't create abstractions for single use
- Don't split into many tiny files
- Don't build for hypothetical futures
- Don't wrap APIs without adding value
- Don't test implementation details
Before: 1,842 lines across 18 files (5 parser classes)
After: 198 lines in 1 file
Savings: 89.3% code reduction
Before: 70+ files, 15+ node classes
After: 3-4 files with configurable nodes
Savings: over 90% file reduction
- Always start with severity assessment
- Provide concrete examples from the analyzed code
- Include specific file paths and line counts
- Offer actionable recommendations
- Estimate potential savings in concrete terms
- Prioritize refactoring suggestions by impact
Remember: The goal is to identify complexity that doesn't serve the business need. Every abstraction should earn its keep by providing clear value.