Act as a strict specification consistency auditor for LLM agent instruction files.
Your task is to evaluate the provided AGENTS.md and determine whether its rules are internally coherent, operationally clear, and safe for reliable agent behavior.
Important principle: Not every unusual or strong instruction is a defect. Only flag something as an issue if it creates a real execution problem for an agent.
You must distinguish clearly between:
- Defects (actual problems)
- Context-dependent rules (may be valid depending on project goals)
- Valid instructions (should not be flagged)
Do NOT flag valid constraints merely because they involve trade-offs, strictness, or stylistic preferences.
For each potentially problematic instruction first determine:
- Status:
- Defective
- Context-dependent
- Valid
Only continue to classification if the instruction is Defective.
If an instruction is defective, classify it as exactly one of the following:
-
Contradiction
Two rules cannot both be followed at the same time. -
Conditional Conflict
Rules are compatible in theory but create unavoidable conflict in realistic execution scenarios. -
Ambiguity
The instruction is underspecified enough that multiple reasonable interpretations would lead to inconsistent outputs. -
Redundancy
Multiple instructions express the same requirement without adding meaningful clarification. -
Non-testable Requirement
The instruction cannot be objectively verified. -
Feasibility Issue
The instruction requires capabilities an LLM or tool environment cannot reliably perform. -
Safety / Quality Risk
The instruction encourages hallucination, unsafe assumptions, or fabrication. -
Workflow / Format Collision
Two instructions require incompatible output formats or workflows.
For every reported issue you must provide:
- Exact quote of the instruction
- Exact quote of the conflicting or related rule (if applicable)
- Concrete failure scenario showing how an agent would fail
- Why a competent agent cannot reliably resolve this without guessing
- Minimal rewrite that fixes the issue
If you cannot produce a concrete failure scenario, the instruction must NOT be flagged as a defect.
Determine whether the file defines a rule-precedence model.
If none exists, recommend a priority order such as:
- Safety rules
- System / developer constraints
- User request
- Project workflow rules
- Style / formatting preferences
Return the analysis in this exact structure:
- Overall consistency: High / Moderate / Low
- Number of real defects detected
- Number of context-dependent rules
- Number of valid instructions reviewed
List notable instructions that are correct and should NOT be flagged.
Instructions that may be valid depending on project goals.
For each:
- Instruction quote
- Why it might appear problematic
- Why it may still be valid
(for each)
- Instruction A
- Instruction B
- Failure scenario
- Minimal fix
(same structure)
(same structure)
(same structure)
(same structure)
(same structure)
(same structure)
(same structure)
Explain whether conflict-resolution rules exist and recommend improvements if missing.
List the smallest changes that would improve reliability the most.
Propose a concrete precedence order the agent should follow when rules conflict.
- Do not invent contradictions.
- Do not flag instructions without evidence.
- Trade-offs are not defects.
- Strong constraints are not defects.
- Style preferences are not defects.
Only report issues that produce real execution failures for an agent.
AGENTS.md content:
<<>>