This skill instructs an agent how to evaluate a React SPA codebase with the judgment of a seasoned engineer. It defines what to look for, how to look for it, what representation to use, and where automation has hard limits.
Never read raw source to answer structural questions. Transform source into denser representations before reasoning:
- Parse to AST (tree-sitter, ts-morph, babel parser) — extract call graphs, branch counts, effect locations, and complexity metrics as structured data
- Extract interface surfaces — reason about module contracts from type signatures alone; do not read implementations unless a specific anomaly demands it
- Build a dependency graph — use Madge or dependency-cruiser; feed the agent the adjacency list, not the source files (see the sketch after this list)
- Summarize by module boundary — when full source must be read, summarize each module independently first, then reason over summaries
This collapses effective context size by 5–10x and increases signal density.
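A minimal sketch of the dependency-graph step, assuming madge is installed; "src/" and the extension list are placeholders for the real project. The agent gets the printed adjacency list and circular chains, never the files themselves:

```ts
// dep-graph.ts — a minimal sketch, assuming madge is installed
// and "src/" is the real source root (both are placeholders).
const madge = require("madge"); // madge ships without bundled types

async function main() {
  const graph = await madge("src/", { fileExtensions: ["ts", "tsx"] });
  // Adjacency list, e.g. { "app.tsx": ["api/client.ts", ...], ... }
  console.log(JSON.stringify(graph.obj(), null, 2));
  // Circular chains are a structural red flag on their own
  console.log("circular:", graph.circular());
}

main();
```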
Evaluate these mechanically from AST and graph data:
| Quality | What to measure | Red flag |
|---|---|---|
| Cohesion | Responsibilities per module | A module that fetches, formats, and controls auth |
| Coupling | Inbound/outbound edges in dependency graph | High fan-in on non-utility modules; circular edges |
| Cyclomatic complexity | Branch node count per function | > 10 demands scrutiny |
| Type coverage | Density of any in TypeScript | High any density means the type system is decorative |
| Effect discipline | Count and dep-array size of useEffect hooks | Large dep arrays signal tangled concerns |
| Bundle composition | Dead code, duplicate deps, unshaken tree | Anything the user downloads but doesn't need |
| Test branch coverage | Branches hit, not just lines | Uncovered branches on critical paths outweigh high line coverage |
| Render efficiency | Re-render frequency per interaction | Unnecessary re-renders on stable input |
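Two of these rows sketched with ts-morph: a rough branch count per function and any density per file. The globs and tsconfig path are assumptions to adapt; the threshold of 10 comes from the table above.

```ts
// metrics.ts — a sketch of two table rows: branch count per
// function (a cyclomatic proxy) and any-keyword density per file.
// Globs and tsconfig path are placeholder assumptions.
import { Project, SyntaxKind } from "ts-morph";

const project = new Project({ tsConfigFilePath: "tsconfig.json" });

for (const file of project.getSourceFiles(["src/**/*.ts", "src/**/*.tsx"])) {
  const anys = file.getDescendantsOfKind(SyntaxKind.AnyKeyword).length;
  if (anys > 0) console.log(`${file.getBaseName()}: ${anys} uses of any`);

  for (const fn of file.getFunctions()) {
    // Rough cyclomatic proxy: 1 + branch-introducing nodes
    const branches =
      fn.getDescendantsOfKind(SyntaxKind.IfStatement).length +
      fn.getDescendantsOfKind(SyntaxKind.ConditionalExpression).length +
      fn.getDescendantsOfKind(SyntaxKind.CaseClause).length;
    if (branches + 1 > 10) {
      console.log(
        `${file.getBaseName()}:${fn.getName() ?? "<anonymous>"} complexity ~${branches + 1}`
      );
    }
  }
}
```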
Look for these by reading interfaces, call sites, and module structure:
- Separation of concerns — data fetching, business logic, and rendering must live in separate layers; hooks own behavior, components own presentation (see the sketch after this list)
- Colocation — related things live near each other; tests adjacent to modules, styles scoped to components; penalize layouts that force cross-directory jumps to understand a single feature
- Encapsulation — consumers must not need to know implementation details to use a module correctly; a leaking abstraction is a coupling debt
- Composability — components compose without fighting their API; look for compound component patterns, explicit slot structures, and intentional prop surfaces
- Consistency — count how many distinct patterns exist for the same concern (data fetching, error handling, naming conventions); more than one pattern without documented rationale is a governance failure
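As a reference shape for the separation-of-concerns bullet, a hypothetical hook-plus-component pair (the names, endpoint, and types are illustrative, not from any codebase under review): the hook owns fetching and state, the component owns presentation.

```tsx
// Hypothetical shape only: useUser owns behavior, UserCard owns
// presentation. Endpoint and types are illustrative.
import { useEffect, useState } from "react";

type User = { id: string; name: string };

function useUser(id: string): User | null {
  const [user, setUser] = useState<User | null>(null);
  useEffect(() => {
    let cancelled = false;
    fetch(`/api/users/${id}`)
      .then((res) => res.json())
      .then((data: User) => {
        if (!cancelled) setUser(data); // ignore responses after unmount
      });
    return () => {
      cancelled = true;
    };
  }, [id]); // one dependency: a small, honest dep array
  return user;
}

export function UserCard({ id }: { id: string }) {
  const user = useUser(id);
  return user ? <h2>{user.name}</h2> : <p>Loading…</p>;
}
```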
Structure tells you what code does mechanically. Meaning is why it exists and whether it belongs. Bridge the gap with these moves:
1. Callsite-motivated reading: Always locate where a function is called before judging what it does. The same logic can be correct in one callsite and wrong in another. Meaning is relational to usage context.
2. Invariant checking: Look for comments, ADRs, or annotations that encode architectural decisions — then check whether new code violates them. A function that is structurally clean but contradicts a documented invariant is a meaning failure. Flag code that encodes assumptions without documenting them. Example: client-side discount calculation in a codebase where discount authority is documented as server-side only. The structure passes; the meaning violates the architecture.
3. Split source-of-truth detection: When the same domain concept (price, discount, user state) is computed or stored in more than one place, flag it. The bug hasn't happened yet, but the conditions are present. (A detection sketch follows this list.)
4. Counterfactual simulation: For flagged code, simulate one realistic future change: "If this value changes in one location and not the other, what breaks?" If the answer is "a user-facing inconsistency with no failing test," escalate.
5. Naming as claim verification: Treat every name as a falsifiable claim and check whether the behavior matches it. A function named applyDiscount that only conditionally applies the discount should be named applyDiscountIfEligible. Stale or misleading names are meaning debt.
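A sketch of the split source-of-truth scan from move 3, again with ts-morph. The concept regex and globs are placeholder assumptions, and name matching is only a cheap first pass; the agent still confirms by reading the flagged sites.

```ts
// sot-scan.ts — flag a domain concept declared in more than one
// module. Concept regex and globs are placeholder assumptions;
// a name match is a lead, not proof of duplication.
import { Project, SyntaxKind } from "ts-morph";

const CONCEPT = /discount/i; // assumed domain concept under audit

const project = new Project({ tsConfigFilePath: "tsconfig.json" });
const sites = new Map<string, Set<string>>();

for (const file of project.getSourceFiles(["src/**/*.ts", "src/**/*.tsx"])) {
  for (const decl of file.getDescendantsOfKind(SyntaxKind.VariableDeclaration)) {
    const name = decl.getName();
    if (!CONCEPT.test(name)) continue;
    if (!sites.has(name)) sites.set(name, new Set());
    sites.get(name)!.add(file.getFilePath());
  }
}

for (const [name, files] of sites) {
  if (files.size > 1) {
    console.log(`possible split source of truth: ${name}`, [...files]);
  }
}
```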
Automation cannot recover meaning that was never recorded. To make meaning checkable:
- Encode architectural decisions as inline invariant annotations (collected mechanically in the sketch after this list):
  // @invariant: X is authoritative here — do not compute elsewhere
- Write ADRs and link them from the modules they govern
- Enforce naming conventions with lint rules, not style guides
- Make module contracts include semantic claims, not just type signatures
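A sketch that turns the annotation convention above into agent-readable data; it only collects claims, and checking flagged code against them remains the agent's job. Paths are placeholders.

```ts
// invariants.ts — collect @invariant annotations into structured
// data an agent can check flagged code against. The annotation
// format follows the example above; globs are placeholders.
import { Project } from "ts-morph";

const ANNOTATION = /\/\/\s*@invariant:\s*(.+)/g;

const project = new Project({ tsConfigFilePath: "tsconfig.json" });
const inventory: { file: string; line: number; claim: string }[] = [];

for (const file of project.getSourceFiles(["src/**/*.ts", "src/**/*.tsx"])) {
  const text = file.getFullText();
  for (const match of text.matchAll(ANNOTATION)) {
    inventory.push({
      file: file.getFilePath(),
      line: text.slice(0, match.index ?? 0).split("\n").length,
      claim: match[1].trim(),
    });
  }
}

// Hand this inventory to the agent alongside the dependency graph
console.log(JSON.stringify(inventory, null, 2));
```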
If a decision was made but not recorded in a machine-readable or agent-readable form, no amount of analysis will surface violations reliably. The human process of recording decisions is the prerequisite for automation of meaning.
Within automation's reach:
- All measurable structural quality checks
- Consistency analysis across the codebase
- Callsite graph construction and anomaly flagging
- Invariant violation detection when invariants are annotated
- Split source-of-truth detection for named domain concepts
- Naming claim verification against observable behavior
Beyond automation's reach:
- Judgment on whether the right problem is being solved
- Evaluation of patterns against team context and trajectory
- Detection of meaning violations when intent was never recorded
- Architectural counterfactual reasoning at system scale
- Recognizing when structurally correct code encodes a false domain assumption
When in doubt, ask: does this codebase read as if it has a coherent author? Inconsistency — in patterns, naming, file structure, error handling approach — signals absent governance. Governance failures compound. Flag them early.