March 2026. Informed by 5 rounds of structured exploration and 23 concrete GitHub failure cases.
Issue #960: [L2] Issue clustering and category detection — recognize when individual issues are symptoms of a systemic gap requiring meta-level prevention
- Open PRs matching "cluster category detect": none found.
- Closed issues matching "cluster category detect": #463 (Workflow task expectations, closed 2026-03-22) and #504 (Batch auto-dent harness, closed 2026-03-23) — neither overlaps with clustering.
- No active work in flight on this feature area.
Target files: kaizen-evaluate/SKILL.md (new phase), kaizen-implement/SKILL.md (plan schema)
The new phase, Phase 4.5: Plan Formation, slots between Phase 3.7 (Architecture & Tooling Fitness) and Phase 5 (Ask the Admin).
Round 2 — March 2026
Round 1 produced three complementary lenses on a single question: when should an agent pause before implementing? Decision Theory gave a scoring formula (DPS = R×I×S). Cognitive Ethnography gave a cognitive science grounding (RPD, tacit knowledge, Design Stance Protocol). Signal Archaeology (not completed in R1) would presumably mine git history for risk signals.
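Round 1's scoring formula is stated but not expanded in this excerpt. A minimal sketch, assuming the three factors are numeric ratings from an upstream rubric and that a threshold gates the "pause before implementing" decision (the threshold value here is hypothetical, not from the source):

```python
def decision_pause_score(r: float, i: float, s: float) -> float:
    # DPS = R * I * S; the excerpt does not expand the factor names,
    # so treat them as ratings produced by an upstream rubric.
    return r * i * s

PAUSE_THRESHOLD = 27  # hypothetical cut-off, not from the source
dps = decision_pause_score(3, 3, 4)
if dps >= PAUSE_THRESHOLD:
    print(f"DPS={dps}: pause and form a plan before implementing")
```

Because the score is a product, any single factor near zero collapses it, which matches the intent: a change that is trivially reversible never warrants a pause, no matter how large it is.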
A cognitive science exploration for the kaizen project. Round 1 — March 2026
The framing of "checklist vs. intuition" is a false dichotomy, and it points to the exact cognitive science literature we need. Gary Klein's Recognition-Primed Decision (RPD) model describes how experienced practitioners make decisions under time pressure and uncertainty: they don't evaluate options against criteria. They pattern-match a situation to a prototype, which immediately surfaces a candidate action, which they then mentally simulate. If the simulation runs without catastrophic failure, they act. They don't compare alternatives — they evaluate one option.
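The RPD loop described above (match a prototype, surface one candidate action, mentally simulate it, act if the simulation passes) can be sketched in code. The `Prototype` type and the `simulate` hook are illustrative scaffolding, not from Klein or the source:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Prototype:
    matches: Callable[[str], bool]  # pattern recognition against the situation
    action: str                     # the single candidate action it surfaces

def rpd_decide(situation: str, prototypes: List[Prototype],
               simulate: Callable[[str, str], bool]) -> Optional[str]:
    # Serial, satisficing evaluation: the first recognized prototype's
    # action is taken if its mental simulation runs without catastrophic
    # failure. Alternatives are never compared side by side.
    for proto in prototypes:
        if proto.matches(situation) and simulate(proto.action, situation):
            return proto.action
    return None  # no recognized pattern: fall back to deliberate analysis

protos = [Prototype(lambda s: "flaky test" in s, "quarantine and rerun")]
print(rpd_decide("flaky test in CI", protos, lambda action, sit: True))
```

The `return None` branch is the interesting one for kaizen: it is exactly the "no prototype recognized" condition under which a checklist-style plan formation phase should take over from intuition.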
# Engineering Output & Quality Audit

You are auditing a codebase to answer: **What was actually built, how complex was it really, how long did it take, and how stable is it?**
Ignore commit counts; they measure activity, not output. Focus on deliverables shipped in the last 90 days.
## PHASE 1: Identify What Was Actually Built

### 1.1 Discover Distinct Deliverables

```bash
# Find feature areas by looking at what directories changed
git log --since="YYYY-MM-DD" --name-only --pretty=format: | grep -E "^[a-z]" | cut -d'/' -f1-3 | sort | uniq -c | sort -rn | head -30

# Find ticket/feature references in commit messages
git log --since="YYYY-MM-DD" --pretty=format:"%s" | grep -oE "[A-Z]+-[0-9]+" | sort | uniq -c | sort -rn
```
| """ | |
| File: gv/ai/common/llm/json_schema.py | |
| Author: Aviad Rozenhek | |
| OpenAI Structured Outputs (`response_format={"type":"json_schema"}`) supports only a subset of JSON Schema. | |
| Many perfectly valid Pydantic constructs won't fly as-is. Use these patterns: | |
| ------------------------------------------------------------------------- | |
| 1) Optional / nullable / Default fields | |
| ------------------------------------------------------------------------- |