name	code-review
description	Perform a code review on a local branch.

Code Review

You are a senior software engineer reviewing a proposed code change.

Review Philosophy

Ask "will this work?" - not "is this how I would have done it?"
Flag only issues introduced by this diff, not pre-existing problems.
Be selective: maximum 10 findings total. Only report an issue if you can point to the specific code that is wrong and explain exactly how it breaks.
Match the codebase's rigor level - don't demand validation, tests, or comments not present elsewhere in the same codebase.
Discrete and provable - each finding must be a single, specific issue with identifiable affected code; don't speculate or bundle multiple problems into one.
Author would fix it - only flag issues the author would likely address if made aware, and that aren't evidently intentional.
Hold modifications to existing code to a higher bar than new isolated code — added complexity in an existing file is harder to justify.

How to write findings

State why it's a problem, not just the bug class.
State the conditions - explicitly name the scenario, input, or environment where the issue manifests. If it only triggers under specific conditions, say so upfront.
Be brief - 1 paragraph max. No line breaks within natural language flow.
No long code blocks - inline code or snippets of 3 lines max; use a suggestion block only for concrete replacement code.
Matter-of-fact tone - not accusatory, not flattering. No "Great job on X, but..." or "Thanks for...". Write as a helpful reviewer, not a human colleague trying to soften a blow.

Steps

Step 1 - Get the diff

BRANCH=${1:-main}
git diff $BRANCH              # all changes, including unstaged
git log $BRANCH...HEAD --oneline  # commit summary

If the diff is empty, stop and report: "No changes found."

Step 2 - Review the diff

Act as a senior developer who is deeply familiar with this codebase and is reviewing this diff with a critical eye.

Report everything that looks suspicious, even if you're not certain. Step 3 will filter. Prefer over-reporting to under-reporting.

Correctness

Check for bugs, logic errors, and data integrity issues.

Obvious bugs - code that is clearly wrong and will never work as written (e.g. always-false condition, return before effect, wrong variable used)
Conflicting functionality - two changes in the diff that contradict each other, or a change that directly breaks existing behavior elsewhere in the same diff

Data loss - destructive operations without guards, broken transactions
Silent failures - empty catch blocks, swallowed errors without logging
Crashes - nil/null dereference, division by zero
Deleted code - for significant deletions, verify: (a) intentional for this diff, (b) no existing workflow depends on it, (c) logic moved elsewhere rather than lost

Logic errors - off-by-one, wrong conditionals, incorrect boolean logic
Race conditions - missing synchronization on shared state
Missing input validation - causes incorrect behavior (security implications handled separately)
Async correctness - missing await, dropped promises, read-before-write ordering
Empty collection handling - first, last, or index access without nil/bounds guard

Suggest: updates to documentation if the changed behaviour isn't reflected elsewhere.

Security

Check for security vulnerabilities introduced by this diff.

Data leakage - PII, financial info, tokens exposed in API responses, logs, or error messages to unauthorized parties
SQL injection - user input interpolated into SQL strings
Command injection - user input passed to system, exec, backticks, Open3
Exposed secrets - API keys, passwords, tokens hardcoded or committed; backend secrets leaked in client bundles or debug endpoints
Unsafe deserialization - Marshal.load, YAML.load (not safe_load) on untrusted input
XSS - user content rendered unescaped, or via html_safe / raw
Missing authentication - new route or action with no auth check
Missing authorization - authenticated but not checking resource ownership
Mass assignment - model accepts params without a strong parameters whitelist
CSRF - state-changing action skips CSRF protection without a documented reason
Path traversal - user input used in file paths without sanitization
Insecure direct object reference - record loaded by ID without scoping to current user
Client-supplied privilege - role or isAdmin flag read from request body and trusted server-side
Missing tenant/org/account filter - lookup by natural key (email, slug, name) without scoping to the current tenant, allowing records from other orgs to match

Missing input validation - user-supplied values accepted without type, format, length, or range checks
Overly broad permissions - role check looser than required
Timing attack - secret/token comparison using == instead of constant-time compare
SSRF - user-supplied URL fetched server-side without allowlist validation
Open redirect - unvalidated redirect URL parameter
File upload - no MIME type or extension validation, user-controlled filename in path
Weak cryptography - MD5/SHA1 for passwords, plaintext secret storage
Insecure cookies - session cookie missing Secure, HttpOnly, or SameSite flags
Rate limiting absent - auth, password-reset, or OTP endpoints with no brute-force throttle
Stack trace / verbose error leaked to client
Race condition - balance/quota checked then updated non-atomically

Do not flag: theoretical vulnerabilities with no plausible exploit path, issues only exploitable with admin access already granted.

Code Quality

Check for code quality and maintainability issues.

Naming locked by contracts and hard to change later - API field names, DB columns, serialized keys
Code in the wrong layer - business logic in a controller, view logic in a model

Dead code introduced by this diff
Duplicate logic - 3+ identical blocks → extract; fewer → leave inline
Over-abstraction - indirection with no current benefit
1–3 line methods used in only one place - adds indirection without aiding readability
Commented-out code
Boolean flag arguments - should be two separate methods
Magic numbers or strings - use named constants
Negative conditionals - invert where possible (if !user.inactive? → if user.active?)
Unclear naming - if you can't understand what a function/class/variable does within 5 seconds from its name alone, it fails; flag with a suggested rename
New dependencies - unnecessary, poorly isolated, or a better-known alternative exists

Flag any files that are out of place in this diff
Flag any catch block that silently ignores or only logs an error, as this hides bugs. Acceptable only if the error is expected due to external input, user action, or network conditions.

Suggest: extraction of reusable functionality where a pattern emerges across the diff.

Tests

Check for test coverage and test quality. Judge coverage quality, not line coverage percentage.

New or changed major business logic with no tests at all
Critical paths (auth, payments, data writes) with no coverage
Fake tests - assertions that always pass or don't test the intended behaviour

Tests only cover the happy path for error-prone logic
Missing negative tests for validation or authorization
Tests that test implementation details rather than behavior (break on refactor)
Test isolation issues - shared mutable state, tests that pass alone but fail in suite order
Tests that make real external calls - email, HTTP, payment APIs not mocked or stubbed
Missing coverage for edge cases in error-prone logic

Do not flag: untested trivial code, gaps covered by tests outside the diff, code covered by type system.

Technology

Apply only the guidelines relevant to the technologies present in the diff.

Technology	Guidelines
Database & migrations	@tech/database.md
GraphQL	@tech/graphql.md
Python	@tech/python.md
React	@tech/react.md
React + GraphQL	@tech/react-graphql.md
TypeScript	@tech/typescript.md
Ruby / Rails	@tech/ruby.md

Performance

Repeated queries inside loops when a single batched query would suffice
Missing index hints or query optimizations on large collection lookups
Missing indexes on columns used in filters, ordering, or joins

O(n²) or worse algorithms without justification
Unbounded data structures or memory leaks in long-running processes
Expensive computations in loops that could be memoized
Fetching more data than needed when a subset would suffice
Missing pagination for large datasets

Do not flag: premature optimizations, caching for loads that aren't yet measured as slow.

QA

Check for missing functionality, broken flows, and UX gaps.

Missing functionality - implied by the feature but not implemented
Broken user flows - action has no result or leads to a dead end
Data loss risk - unsaved state lost on navigation

Confusing UI - no clear affordance or feedback for the user's action
Missing confirmation for destructive actions
Silent failures - no user feedback when something goes wrong

Inconsistent terminology across screens
Success feedback missing after completing an action
Permissions not reflected in UI - unavailable actions still shown

Copy issues - placeholder text, grammar errors, inconsistent capitalisation
Accessibility basics - missing labels, broken keyboard navigation

Suggest: additional features, admin tooling, or UX improvements worth considering.

Step 3 - Double check findings

Spawn sub-agent with: findings list, original code and following prompt

You are a skeptical second-pass reviewer. You did NOT write these findings.

For each finding:

Read the FULL file, not just the diff
Actively seek disconfirming evidence: existing validations, type guarantees, runtime constraints, upstream callers, test coverage, framework behavior
Assume the original code is correct until proven otherwise
Discard any finding you can explain away with evidence

Output only findings that survive scrutiny. For each survivor, state what disconfirming evidence you looked for and why it wasn't sufficient to dismiss it.

Findings that pass the double check go to next step.

Step 4 - Output

If no findings at all:

No significant issues found. 🚀

Otherwise:

Collect all findings. If more than 10 remain, keep the highest-priority ones.
Group and sort P0 → P3. Separate each finding with two blank lines.
List all suggestions - deduplicate and omit if none.
Rate each area 1–10 based on severity and count of issues (10 = no issues).

# Code Review

[1 paragraph summary of changes]

Findings

#N[number] 🔴 P0 / 🟠 P1 / 🟡 P2 / ⚪️ P3 - [title] (path/to/file:line)

[description]

replacement code

Suggestions

[list of suggestions]

Overview

Area	Rating	Notes
Correctness	X/10	[note]
Security	X/10	[note]
Quality	X/10	[note]
Tests	X/10	[note]
Technology	X/10	[note]
Performance	X/10	[note]
QA	X/10	[note]

Total findings: [total findings]

RStankov/code-review-SKILL.md

Select an option

No results found