esz135888 · May 23, 2026 19:25
diff --git a/acceptance-tests.md b/acceptance-tests.md
diff --git a/ai-prediction-calibration-gate.html b/ai-prediction-calibration-gate.html
 <!doctype html>
 <html lang="en">
 <head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>AI Prediction Verification Calibration Gate</title>
  <style>
    :root {
      --ink: #171717;
      --muted: #66615b;
      --line: #d8d1c7;
      --paper: #faf8f3;
      --panel: #ffffff;
      --amber: #d28a16;
      --blue: #1e5d8f;
      --green: #19705f;
      --red: #b14d42;
      --shadow: 0 16px 40px rgba(38, 33, 26, 0.10);
    }
    * { box-sizing: border-box; }
    body {
      margin: 0;
      background: var(--paper);
      color: var(--ink);
      font-family: ui-serif, Georgia, "Times New Roman", serif;
      line-height: 1.45;
    }
    header {
      padding: 42px 6vw 30px;
      border-bottom: 1px solid var(--line);
      background: linear-gradient(90deg, #fffdf8 0%, #f5efe4 100%);
    }
    .eyebrow {
      color: var(--blue);
      font: 700 12px/1.2 ui-monospace, SFMono-Regular, Menlo, monospace;
      letter-spacing: .08em;
      text-transform: uppercase;
    }
    h1 {
      max-width: 980px;
      margin: 12px 0 12px;
      font-size: clamp(36px, 6vw, 76px);
      line-height: .95;
      letter-spacing: 0;
    }
    .lede {
      max-width: 900px;
      color: var(--muted);
      font-size: 20px;
    }
    main {
      padding: 28px 6vw 56px;
      display: grid;
      gap: 22px;
    }
    section {
      background: var(--panel);
      border: 1px solid var(--line);
      border-radius: 8px;
      box-shadow: var(--shadow);
      padding: 24px;
    }
    h2 {
      margin: 0 0 16px;
      font-size: 24px;
    }
    .grid {
      display: grid;
      grid-template-columns: repeat(4, minmax(0, 1fr));
      gap: 14px;
    }
    .two {
      display: grid;
      grid-template-columns: minmax(0, 1fr) minmax(0, 1fr);
      gap: 16px;
    }
    .card {
      border: 1px solid var(--line);
      border-radius: 8px;
      padding: 16px;
      background: #fffdf8;
    }
    .tag {
      display: inline-block;
      border: 1px solid var(--line);
      border-radius: 999px;
      padding: 3px 9px;
      margin-bottom: 10px;
      color: var(--muted);
      font: 700 11px/1.2 ui-monospace, SFMono-Regular, Menlo, monospace;
    }
    ul, ol { margin: 0; padding-left: 20px; }
    li { margin: 7px 0; }
    table {
      width: 100%;
      border-collapse: collapse;
      font-size: 15px;
    }
    th, td {
      border-bottom: 1px solid var(--line);
      padding: 10px 8px;
      text-align: left;
      vertical-align: top;
    }
    th {
      color: var(--blue);
      font: 700 12px/1.2 ui-monospace, SFMono-Regular, Menlo, monospace;
      text-transform: uppercase;
      letter-spacing: .04em;
    }
    .status-pass { color: var(--green); font-weight: 700; }
    .status-watch { color: var(--amber); font-weight: 700; }
    .status-stop { color: var(--red); font-weight: 700; }
    code {
      background: #f2ece1;
      border: 1px solid var(--line);
      border-radius: 5px;
      padding: 1px 5px;
      font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
      font-size: .92em;
    }
    @media (max-width: 920px) {
      .grid, .two { grid-template-columns: 1fr; }
      header, main { padding-left: 18px; padding-right: 18px; }
    }
  </style>
 </head>
 <body>
  <header>
    <div class="eyebrow">PLS purpose_e2e_toolbox_v2 / primary_artifact</div>
    <h1>AI Prediction Verification Calibration Gate</h1>
    <p class="lede">This pack prevents another vague rebuild of the prediction verification module. It defines the gate that decides whether the team should accept the label policy, seed evidence, run a batch trial, or reopen data-source work before any new engineering cycle is dispatched.</p>
  </header>

  <main>
    <section>
      <h2>Thirty Day Path</h2>
      <div class="grid">
        <div class="card"><span class="tag">D1</span><strong>Policy lock</strong><br>Louis accepts the hit/miss/unknown label policy and selects 10 prior AI review predictions as seed cases.</div>
        <div class="card"><span class="tag">D7</span><strong>Batch trial</strong><br>50 predictions are auto-labeled from signals, action items, commits, worker logs, and review notes. Unknown rate must be below 25%.</div>
        <div class="card"><span class="tag">D14</span><strong>Correction routing</strong><br>Miss reasons route to owners: direction gap, evidence gap, resource gap, authorization gap, or execution drift.</div>
        <div class="card"><span class="tag">D30</span><strong>Review operating loop</strong><br>Calibration score becomes part of the weekly company AI review, with trend, owner, risk, and next action visible in PLS.</div>
      </div>
    </section>

    <section>
      <h2>Purpose To Purpose E2E</h2>
      <div class="two">
        <div class="card">
          <span class="tag">Flow</span>
          <ol>
            <li>AI review produces a prediction with owner, due date, confidence, and expected evidence.</li>
            <li>PLS collects signals, action items, commits, deployment logs, worker completions, and human review notes.</li>
            <li>The matcher assigns <code>hit</code>, <code>miss</code>, or <code>unknown</code> plus evidence links.</li>
            <li>Human reviewer samples labels and records override reasons.</li>
            <li>PLS opens correction tasks for repeated miss reasons.</li>
            <li>Subsequent reviews use calibrated confidence, contain fewer vague predictions, and route work to owners more accurately.</li>
          </ol>
        </div>
        <div class="card">
          <span class="tag">Measurable end state</span>
          <ul>
            <li>Every reviewed prediction has evidence provenance or an explicit data gap.</li>
            <li>Unknown labels fall below 25% by D7 and below 15% by D30.</li>
            <li>Miss reasons create action items instead of passive commentary.</li>
            <li>Weekly review decisions show whether AI predictions improved project, money, or risk indicators.</li>
          </ul>
        </div>
      </div>
    </section>

    <section>
      <h2>Preflight Decision Gate</h2>
      <table>
        <thead><tr><th>Condition</th><th>Decision</th><th>Owner</th><th>Acceptance</th></tr></thead>
        <tbody>
          <tr><td>Prior production pack exists but label policy is not accepted.</td><td class="status-stop">Stop rebuild</td><td>Louis</td><td>Accept label policy and seed set before engineering.</td></tr>
          <tr><td>Policy accepted but fewer than 10 seed predictions exist.</td><td class="status-watch">Seed first</td><td>Louis + zihrou</td><td>10 predictions with expected evidence and review date.</td></tr>
          <tr><td>Seed set exists but D7 batch has not run.</td><td class="status-watch">Run batch trial</td><td>iron</td><td>50 labels, unknown below 25%, reviewer sample completed.</td></tr>
          <tr><td>Unknown rate at or above 25%.</td><td class="status-stop">Open data-source gap</td><td>iron</td><td>Missing source mapped to API/sync owner and due date.</td></tr>
          <tr><td>Hit/miss labels stable and correction routes active.</td><td class="status-pass">Proceed to productization</td><td>Louis</td><td>D14 correction routing and D30 review dashboard accepted.</td></tr>
        </tbody>
      </table>
    </section>

    <section>
      <h2>Value And Money Path</h2>
      <div class="two">
        <div class="card">
          <span class="tag">Economic logic</span>
          <ul>
            <li>Revenue: better AI review predictions identify accounts and internal projects that are ready for monetizable delivery.</li>
            <li>Cost: fewer repeated AI work dispatches when the real blocker is policy, seed data, or adoption.</li>
            <li>Risk: false confidence is exposed before it shapes staffing, roadmap, or client commitments.</li>
            <li>Conversion: prediction evidence gives teams a clearer reason to adopt AI operating routines.</li>
          </ul>
        </div>
        <div class="card">
          <span class="tag">Human capability lift</span>
          <ul>
            <li>Louis gets a calibration view instead of another text-only status report.</li>
            <li>zihrou can separate direction, resource, and authorization misses.</li>
            <li>iron can see exact evidence gaps for worker, repo, and signal ingestion.</li>
            <li>Project owners learn to write predictions that are testable, not performative.</li>
          </ul>
        </div>
      </div>
    </section>

    <section>
      <h2>Solution Stack</h2>
      <table>
        <thead><tr><th>Layer</th><th>Production decision</th><th>Artifact in pack</th></tr></thead>
        <tbody>
          <tr><td>Context framework</td><td>Prediction is a claim with expected evidence, owner, due date, confidence, and review window.</td><td><code>production-brief.md</code></td></tr>
          <tr><td>Workflow</td><td>Policy accept -&gt; seed -&gt; batch match -&gt; reviewer sample -&gt; correction route -&gt; review dashboard.</td><td>This HTML gate</td></tr>
          <tr><td>Data model</td><td>Prediction, evidence event, match result, reviewer override, correction task, calibration summary.</td><td><code>data-model.md</code></td></tr>
          <tr><td>Tool/app</td><td>PLS dispatcher preflight that blocks duplicate build jobs until acceptance gates are met.</td><td>This HTML gate</td></tr>
          <tr><td>Acceptance</td><td>Unknown threshold, reviewer sample rate, owner/due, and correction routing are testable.</td><td><code>acceptance-tests.md</code></td></tr>
          <tr><td>Adoption upgrade</td><td>Weekly scorecard and learning memory tell the next worker what must happen next.</td><td><code>learning-memory.json</code></td></tr>
        </tbody>
      </table>
    </section>

    <section>
      <h2>Owner, Due, Acceptance</h2>
      <ul>
        <li>Owner: Louis. Reviewers: zihrou and iron.</li>
        <li>Due: 2026-05-27 for policy acceptance and 10 seed predictions; 2026-05-31 for the first 50-case batch trial.</li>
        <li>Acceptance: label policy accepted, seed set complete, D7 unknown rate below 25%, reviewer sample finished, decision record present, and next correction task opened.</li>
        <li>People sync: send only a short LINE summary; the primary durable artifact is this pack and its Gist URL.</li>
      </ul>
    </section>
  </main>
 </body>
 </html>
diff --git a/artifact-url-or-pr.md b/artifact-url-or-pr.md
diff --git a/data-model.md b/data-model.md
diff --git a/decision-record.md b/decision-record.md
diff --git a/learning-memory.json b/learning-memory.json
 {
  "job_id": "a2b47d9e-e9b0-4895-9316-70205f378c54",
  "project": "AI native project: company AI maximization",
  "topic": "AI prediction verification",
  "memory_type": "learning_memory",
  "next_worker_instruction": "Before building another AI prediction verification module, check the calibration gate. If label policy is not accepted, route to policy_acceptance. If fewer than 10 seed predictions exist, route to seed_predictions. If no 50-case trial exists, route to batch_trial. If unknown rate is above 25%, route to data_source_gap. Only productize dashboard/backend after correction routing is active.",
  "owners": {
    "primary": "Louis",
    "reviewers": ["zihrou", "iron"]
  },
  "acceptance": [
    "Label policy accepted",
    "10 seed predictions selected",
    "50-case batch trial completed",
    "Unknown rate below 25% by D7",
    "Reviewer sample completed",
    "Repeated miss reasons create correction tasks",
    "Decision record remains attached"
  ],
  "artifact_files": [
    "ai-prediction-calibration-gate.html",
    "production-brief.md",
    "data-model.md",
    "acceptance-tests.md",
    "decision-record.md",
    "sources.md",
    "artifact-url-or-pr.md"
  ]
 }
diff --git a/production-brief.md b/production-brief.md
diff --git a/sources.md b/sources.md
| Test | Method | Pass Criteria |
| --- | --- | --- |
| Label policy accepted | Review decision record and owner sync. | Louis has accepted `hit/miss/unknown` policy and miss taxonomy. |
| Seed set complete | Query `prediction_claim` seed list. | At least 10 seed predictions have owner, due, confidence, expected evidence, impact metric. |
| Batch trial complete | Run matcher on 50 claims. | 50 labels created with evidence provenance or explicit unknown reason. |
| Unknown rate threshold | Compute `unknown / total`. | Unknown rate below 25% by D7; below 15% by D30. |
| Reviewer sample | Sample match records. | At least 10% reviewed by Louis, zihrou, iron, or delegate. |
| Correction routing | Inspect misses. | Every repeated miss reason has a correction task with owner, due, and acceptance. |
| Duplicate dispatch prevention | Simulate repeat job request. | Dispatcher routes to next missing gate instead of creating another generic build. |
| Audit evidence | Inspect evidence events. | Every label references source refs or an explicit source gap. |
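
The "Duplicate dispatch prevention" test expects the dispatcher to route to the next missing gate. A minimal sketch of that routing follows, assuming a hypothetical `GateState` snapshot; the class, its field names, and `next_gate` are illustrative and not part of the pack. Gate names follow the `next_gate` enum in the data model (the learning memory spells the data gap route `data_source_gap`). Because acceptance requires an unknown rate below 25%, the sketch treats exactly 25% as failing.

```python
from dataclasses import dataclass
from typing import Optional

UNKNOWN_LIMIT = 0.25   # D7 threshold from the acceptance tests
MIN_SEEDS = 10         # seed predictions required before a batch trial
BATCH_SIZE = 50        # labels required for the first batch trial


@dataclass
class GateState:
    """Hypothetical snapshot of the inputs the preflight gate inspects."""
    policy_accepted: bool
    seed_count: int
    batch_labels: int                # labels produced by the batch trial so far
    unknown_rate: Optional[float]    # None until a batch trial has run
    correction_routing_active: bool


def next_gate(state: GateState) -> str:
    """Return the first gate that is not yet satisfied, checked in pack order."""
    if not state.policy_accepted:
        return "policy_acceptance"
    if state.seed_count < MIN_SEEDS:
        return "seed_predictions"
    if state.batch_labels < BATCH_SIZE or state.unknown_rate is None:
        return "batch_trial"
    if state.unknown_rate >= UNKNOWN_LIMIT:
        return "data_gap"
    if not state.correction_routing_active:
        return "correction_routing"
    return "dashboard_rollout"
```

A repeat job request would call `next_gate` first and dispatch only the returned gate's work, so a generic rebuild is never created while an earlier gate is still open.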

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| id | uuid | yes | Stable prediction id. |
| project_id | uuid | yes | PLS project or AI-native project id. |
| review_id | uuid | yes | Source AI review. |
| claim_text | text | yes | Testable prediction statement. |
| owner_person_id | text | yes | Human accountable owner. |
| due_at | timestamptz | yes | Date when evidence should exist. |
| confidence | numeric | yes | 0 to 1 model confidence. |
| expected_evidence | jsonb | yes | Source types and matching hints. |
| impact_metric | text | yes | Revenue, cost, risk, conversion, labor, or delivery metric. |
| status | enum | yes | active, ready_for_match, matched, archived. |
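
The "Seed set complete" acceptance test requires owner, due date, confidence, expected evidence, and impact metric on each seed. The check below is a sketch of that test over plain dictionaries keyed by the field names above; `seed_problems` and `seed_set_complete` are hypothetical helpers, and treating empty strings and empty containers as missing is an assumption.

```python
from __future__ import annotations

# Fields the acceptance test names for a complete seed prediction.
REQUIRED_SEED_FIELDS = (
    "owner_person_id",
    "due_at",
    "confidence",
    "expected_evidence",
    "impact_metric",
)


def seed_problems(claim: dict) -> list[str]:
    """List the reasons a claim cannot serve as a seed; empty means usable."""
    problems = [
        f"missing {name}"
        for name in REQUIRED_SEED_FIELDS
        if claim.get(name) in (None, "", {}, [])
    ]
    confidence = claim.get("confidence")
    # Confidence is documented as a 0 to 1 value.
    if isinstance(confidence, (int, float)) and not 0 <= confidence <= 1:
        problems.append("confidence outside 0..1")
    return problems


def seed_set_complete(claims: list[dict], minimum: int = 10) -> bool:
    """True when at least `minimum` claims have no problems."""
    return sum(1 for claim in claims if not seed_problems(claim)) >= minimum
```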

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| id | uuid | yes | Evidence event id. |
| source_type | enum | yes | signal, action_item, github_commit, deployment, worker_completion, line_note, drive_doc, review_note. |
| source_ref | text | yes | URL or source id. |
| event_at | timestamptz | yes | When evidence happened. |
| actor_person_id | text | no | Human or worker actor. |
| project_id | uuid | no | Joined project if known. |
| payload | jsonb | yes | Raw normalized evidence. |
| audit_hash | text | yes | Tamper-evident hash of normalized payload. |
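
The pack does not say how `audit_hash` is computed. One plausible approach, shown here as an assumption rather than the pack's method, is SHA-256 over a canonical JSON serialization (sorted keys, compact separators) so that key order does not change the hash.

```python
import hashlib
import json


def audit_hash(payload: dict) -> str:
    """Hash a normalized payload; SHA-256 and the JSON canonical form are assumed."""
    canonical = json.dumps(
        payload,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace differences
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Recomputing the hash at audit time and comparing it with the stored value reveals whether the payload changed after ingestion.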

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| id | uuid | yes | Match id. |
| prediction_claim_id | uuid | yes | Linked claim. |
| label | enum | yes | hit, miss, unknown. |
| label_confidence | numeric | yes | 0 to 1. |
| evidence_event_ids | uuid[] | no | Supporting events. |
| miss_reason | enum | no | direction_gap, evidence_gap, resource_gap, authorization_gap, execution_drift, timing_gap. |
| matcher_version | text | yes | Model/rule version. |
| created_at | timestamptz | yes | Match time. |

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| id | uuid | yes | Override id. |
| prediction_match_id | uuid | yes | Match reviewed. |
| reviewer_person_id | text | yes | Louis, zihrou, iron, or delegate. |
| override_label | enum | no | hit, miss, unknown. |
| override_reason | text | yes | Why the machine label was accepted or changed. |
| created_at | timestamptz | yes | Review time. |
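
The acceptance tests ask for at least 10% of match records to be reviewed. The sketch below draws that sample; `reviewer_sample` is a hypothetical helper, and the fixed seed (for a reproducible audit trail) and rounding up are assumptions.

```python
import random
from typing import List, Sequence


def reviewer_sample(match_ids: Sequence[str], percent: int = 10, seed: int = 0) -> List[str]:
    """Pick at least `percent` percent of the match ids for human review."""
    if not match_ids:
        return []
    # Integer ceiling of len * percent / 100, so 51 records yield 6, not 5.
    k = (len(match_ids) * percent + 99) // 100
    # A seeded generator makes the same batch produce the same sample.
    return random.Random(seed).sample(list(match_ids), k)
```

For the 50-label batch trial this selects 5 records, each of which would then receive a reviewer override row with an `override_reason`.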

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| id | uuid | yes | PLS action item id. |
| prediction_match_id | uuid | yes | Source miss. |
| owner_person_id | text | yes | Responsible owner. |
| due_at | timestamptz | yes | Correction due date. |
| task_type | enum | yes | revise_prediction, add_source_adapter, clarify_direction, add_resource, authorize_decision, fix_execution. |
| acceptance | text | yes | Completion condition. |
| status | enum | yes | open, blocked, done, cancelled. |
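
The pack requires a correction task for every repeated miss reason but does not define which `task_type` each `miss_reason` maps to, or how many occurrences count as "repeated". The sketch below assumes a one-to-one mapping and a threshold of two; both are illustrative guesses, and owner, due date, and acceptance still have to be filled in by a person.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Assumed mapping from miss_reason to task_type; the pack does not specify one.
TASK_TYPE_FOR_REASON = {
    "direction_gap": "clarify_direction",
    "evidence_gap": "add_source_adapter",
    "resource_gap": "add_resource",
    "authorization_gap": "authorize_decision",
    "execution_drift": "fix_execution",
    "timing_gap": "revise_prediction",
}


def correction_tasks(misses: Iterable[Dict], min_repeats: int = 2) -> List[Dict]:
    """Propose one task per miss reason seen at least `min_repeats` times."""
    counts = Counter(m["miss_reason"] for m in misses if m.get("miss_reason"))
    return [
        {
            "miss_reason": reason,
            "task_type": TASK_TYPE_FOR_REASON[reason],  # raises KeyError on unknown reasons
            "occurrences": seen,
        }
        for reason, seen in sorted(counts.items())
        if seen >= min_repeats
    ]
```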

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| id | uuid | yes | Summary id. |
| window_start | date | yes | Reporting window. |
| window_end | date | yes | Reporting window. |
| total_predictions | integer | yes | Count. |
| hit_rate | numeric | yes | Hits / matched. |
| miss_rate | numeric | yes | Misses / matched. |
| unknown_rate | numeric | yes | Unknown / total. |
| top_miss_reason | text | no | Repeated issue. |
| next_gate | enum | yes | policy_acceptance, seed_predictions, batch_trial, data_gap, correction_routing, dashboard_rollout. |
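
The three rates use different denominators: hit and miss rates divide by matched predictions, while the unknown rate divides by all predictions. The sketch below assumes "matched" means labels that are `hit` or `miss`, which the pack does not state explicitly; `calibration_rates` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, Sequence


def calibration_rates(labels: Sequence[str]) -> Dict[str, float]:
    """Compute summary rates from a list of hit/miss/unknown labels."""
    counts = Counter(labels)
    total = len(labels)
    matched = counts["hit"] + counts["miss"]  # assumed meaning of "matched"
    return {
        "total_predictions": total,
        "hit_rate": counts["hit"] / matched if matched else 0.0,
        "miss_rate": counts["miss"] / matched if matched else 0.0,
        "unknown_rate": counts["unknown"] / total if total else 0.0,
    }
```

For a 50-label batch with 30 hits, 10 misses, and 10 unknowns, the hit rate is 30/40 = 0.75 and the unknown rate is 10/50 = 0.20, which passes the D7 threshold (below 25%) but not the D30 threshold (below 15%).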

| Horizon | Outcome | Owner | Acceptance |
| --- | --- | --- | --- |
| D1 | Label policy accepted and 10 prior predictions selected as seed cases. | Louis | Each seed has owner, due date, confidence, expected evidence, and review window. |
| D7 | 50 predictions are batch-labeled from available PLS evidence. | iron | Unknown rate below 25%; 10% reviewer sample completed. |
| D14 | Miss reasons create correction tasks. | zihrou + iron | Misses are categorized as direction gap, evidence gap, resource gap, authorization gap, or execution drift. |
| D30 | Calibration becomes part of weekly company AI review. | Louis | Dashboard shows hit rate, unknown rate, repeated miss reason, owner, due date, and money/risk impact. |