@esz135888
Created May 23, 2026 21:23
PLS Capability Repair Dispatch Breaker - job 3bc391bc

E2E Acceptance Tests

A. Dispatch Breaker

  1. Given the same ai_native_project_id has 2+ completed durable artifacts and no target_repo_url,
  2. When a new production-delivery job is requested,
  3. Then PLS must create a dispatch_repair_record and block generic project_runner completion as the next lane.

Expected evidence: next job payload contains missing_capability=target_repo_url and next_worker_kind=repo_change/github_pr.
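The breaker condition in test A can be sketched as a small pure function. This is only an illustration under stated assumptions: the function name `check_dispatch_breaker`, the `threshold` parameter, and the record class are hypothetical, and only the field names are taken from the dispatch_repair_records table in this spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DispatchRepairRecord:
    # Field names follow the dispatch_repair_records table in this spec;
    # the class itself is a hypothetical in-memory stand-in for a DB row.
    session_loop_key: str
    repeat_count: int
    missing_capability: str
    next_worker_kind: str
    status: str = "proposed"


def check_dispatch_breaker(
    session_loop_key: str,
    completed_artifact_count: int,
    target_repo_url: Optional[str],
    threshold: int = 2,  # "2+ completed durable artifacts" from test A
) -> Optional[DispatchRepairRecord]:
    """Return a repair record when the breaker trips, otherwise None."""
    if completed_artifact_count >= threshold and not target_repo_url:
        return DispatchRepairRecord(
            session_loop_key=session_loop_key,
            repeat_count=completed_artifact_count,
            missing_capability="target_repo_url",
            next_worker_kind="repo_change/github_pr",
        )
    return None
```

A non-None result means the caller should block the generic project_runner lane and attach the record to the next job payload.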

B. Repo Route

  1. Given target_repo_url exists,
  2. When POST /api/dispatch/repair-route runs,
  3. Then the job is routed to repo_change or github_pr, not project_runner.

Expected evidence: the job's requires.worker_kinds includes repo_change or github_pr.

C. Prediction Verification

  1. Given ai_review_predictions has predictions for a review,
  2. And GitHub/LINE/action items contain signals within the evidence window,
  3. When verification runs,
  4. Then each prediction receives hit/miss/partial/unknown with linked evidence.

Expected evidence: scorecard contains hit rate, miss reasons, and correction actions.
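One possible way to turn per-prediction verdicts into the scorecard evidence is sketched below. The hit-rate definition (hits divided by hit + miss + partial, with unknown excluded) is an assumption, since the spec does not define it; the function name, the `reason` key, and the correction-action text are also hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional

VERDICTS = ("hit", "miss", "partial", "unknown")


def build_scorecard(predictions: List[Dict[str, str]]) -> Dict[str, object]:
    """Aggregate per-prediction verdicts into a scorecard dict."""
    counts = Counter(p["verdict"] for p in predictions)
    unexpected = set(counts) - set(VERDICTS)
    if unexpected:
        raise ValueError(f"unexpected verdicts: {sorted(unexpected)}")

    # Assumption: "unknown" verdicts are excluded from the denominator.
    decided = counts["hit"] + counts["miss"] + counts["partial"]
    hit_rate: Optional[float] = counts["hit"] / decided if decided else None

    return {
        "hit_rate": hit_rate,
        "misses": [
            {"prediction_id": p["id"], "reason": p.get("reason", "unspecified")}
            for p in predictions
            if p["verdict"] == "miss"
        ],
        # Hypothetical action format: one follow-up per miss or partial.
        "correction_actions": [
            f"review prediction {p['id']}"
            for p in predictions
            if p["verdict"] in ("miss", "partial")
        ],
    }
```

Returning `None` for the hit rate when nothing has been decided keeps an empty evidence window from reading as a 0% hit rate.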

D. People Adoption

  1. Given a scorecard has been generated,
  2. When PLS syncs people,
  3. Then Louis receives a decision summary; zihrou and iron receive an action-oriented scorecard and the expected reply signal.

Expected evidence: people_sync artifact records owner, due date, and expected signal.

E. Anti-Fake Completion

  1. Given no verified GitHub PR URL or deployment URL exists,
  2. Then artifacts_json must not claim github_pr or deployment.

Expected evidence: artifact-url-or-pr.md explicitly states no PR/deployment and explains blocker.
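The guard in test E could look roughly like the sketch below. It assumes each artifact claim is a dict with `kind`, `url`, and a `verified` flag; the `verified` flag and both function names are hypothetical, and the URL check only confirms the shape of a GitHub pull request URL, not that the PR actually exists.

```python
import re
from typing import Dict, List, Tuple
from urllib.parse import urlparse

# Kinds that must not be claimed without verification, per test E.
GUARDED_KINDS = {"github_pr", "deployment"}
_PR_PATH = re.compile(r"^/[^/]+/[^/]+/pull/\d+/?$")


def looks_like_github_pr(url: str) -> bool:
    """Shape check only: https://github.com/<owner>/<repo>/pull/<number>."""
    parsed = urlparse(url)
    return (
        parsed.scheme == "https"
        and parsed.netloc == "github.com"
        and _PR_PATH.match(parsed.path) is not None
    )


def split_artifact_claims(artifacts: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split claims into (accepted, rejected) without mutating the input."""
    accepted, rejected = [], []
    for artifact in artifacts:
        kind = artifact.get("kind")
        url = artifact.get("url") or ""
        ok = True
        if kind in GUARDED_KINDS:
            # "verified" is a hypothetical flag set by an upstream check
            # (for example an HTTP request or a deployment log lookup).
            ok = bool(url) and artifact.get("verified") is True
            if ok and kind == "github_pr":
                ok = looks_like_github_pr(url)
        (accepted if ok else rejected).append(artifact)
    return accepted, rejected
```

Only the accepted list would be written to artifacts_json; rejected claims can feed the blocker explanation in artifact-url-or-pr.md.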

Current Verification

  • Gist URL must return HTTP 200.
  • Gist files must include HTML console, production brief, data model, acceptance tests, decision record, learning memory, sources, and artifact URL/PR note.
  • PLS upload-files must report uploaded: 8.

Artifact URL / PR Status

Primary artifact URL: https://gist.github.com/esz135888/7985708bb3790886c55073c5d050d41d

No GitHub PR was created in this job.

Reason: the PLS context does not provide a backend repository URL, local repo path, branch, deployment target, or permission boundary for the AI prediction verification module. Claiming github_pr or deployment would be fake success.

Next valid step: provide target_repo_url and dispatch a repo_change or github_pr worker to implement the dispatch guard and prediction verification scorecard.

<!doctype html>
<html lang="zh-Hant">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>PLS Capability Repair Dispatch Console</title>
<style>
:root { color-scheme: light; --ink:#18212f; --muted:#647084; --line:#d9e0ea; --bg:#f7f9fc; --panel:#ffffff; --ok:#0f7b45; --warn:#aa5a00; --bad:#b42318; --blue:#1f5fbf; }
body { margin:0; font:14px/1.55 -apple-system,BlinkMacSystemFont,"Segoe UI",sans-serif; color:var(--ink); background:var(--bg); }
header { padding:28px 32px 18px; background:var(--panel); border-bottom:1px solid var(--line); }
h1 { margin:0 0 6px; font-size:26px; letter-spacing:0; }
h2 { margin:0 0 12px; font-size:17px; }
main { max-width:1180px; margin:0 auto; padding:24px 20px 40px; display:grid; gap:16px; }
section { background:var(--panel); border:1px solid var(--line); border-radius:8px; padding:18px; }
.grid { display:grid; grid-template-columns:repeat(4,minmax(0,1fr)); gap:12px; }
.card { border:1px solid var(--line); border-radius:8px; padding:14px; min-height:96px; }
.label { color:var(--muted); font-size:12px; text-transform:uppercase; }
.value { font-size:20px; font-weight:700; margin-top:4px; }
.ok { color:var(--ok); } .warn { color:var(--warn); } .bad { color:var(--bad); } .blue { color:var(--blue); }
table { width:100%; border-collapse:collapse; }
th,td { text-align:left; vertical-align:top; border-bottom:1px solid var(--line); padding:10px 8px; }
th { color:var(--muted); font-size:12px; }
code { background:#eef2f7; padding:1px 5px; border-radius:4px; }
.flow { display:grid; grid-template-columns:repeat(5,1fr); gap:10px; }
.step { border:1px solid var(--line); border-radius:8px; padding:12px; background:#fbfcff; }
.small { color:var(--muted); font-size:12px; }
@media (max-width:900px){ .grid,.flow{grid-template-columns:1fr;} header{padding:22px 18px;} }
</style>
</head>
<body>
<header>
<h1>AI Prediction Verification Module: Capability Repair Dispatch Console</h1>
<div class="small">Job 3bc391bc-a6cd-459b-ad29-ccb2afdeb6b1 · deliverable 7308e028-d7e2-4912-b666-5839e9dba97d · owner Louis · due 2026-05-25</div>
</header>
<main>
<section>
<h2>This Round's Assessment</h2>
<div class="grid">
<div class="card"><div class="label">Current status</div><div class="value warn">Dispatch stuck</div><div class="small">PLS keeps requesting static production packs through project_runner.</div></div>
<div class="card"><div class="label">Next valid lane</div><div class="value blue">repo_change / github_pr</div><div class="small">Requires target_repo_url or a repo path.</div></div>
<div class="card"><div class="label">No fake completion</div><div class="value bad">No fake PR</div><div class="small">Without a repo, do not claim a GitHub PR or a successful deployment.</div></div>
<div class="card"><div class="label">E2E goal</div><div class="value ok">Dispatch breaker</div><div class="small">Have the next round actually change the PLS backend dispatch conditions.</div></div>
</div>
</section>
<section>
<h2>D1 / D7 / D14 / D30 Path</h2>
<div class="flow">
<div class="step"><b>D1</b><br>Stop the project_runner static loop for this project; supply <code>target_repo_url</code>, an owner, and acceptance criteria.</div>
<div class="step"><b>D7</b><br>The repo_change worker implements the dispatch guard: after 2 repeated completions, reroute to repo_change/github_pr.</div>
<div class="step"><b>D14</b><br>Open a staging PR and verify the signals/action_items/reviews data chain for prediction hits.</div>
<div class="step"><b>D30</b><br>Weekly cadence goes live: generate a prediction scorecard every week and automatically open a correction action when the hit rate is low.</div>
<div class="step"><b>Adoption</b><br>Louis decides the repo; zihrou/iron verify adoption signals; PLS writes back the weekly scorecard.</div>
</div>
</section>
<section>
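<h2>Goal-to-Goal E2E</h2>
<table>
<tr><th>Layer</th><th>Content</th><th>Testable evidence</th></tr>
<tr><td>Original goal</td><td>The AI verifies for itself whether the predictions from the last review came true.</td><td>review_predictions and evidence_signals can be joined.</td></tr>
<tr><td>Deliverables</td><td>Repo dispatch spec, data model, acceptance tests, decision record.</td><td>This Gist + PLS uploaded files.</td></tr>
<tr><td>People adoption</td><td>Louis designates the repo; PLS reroutes to repo_change; zihrou/iron use the scorecard to judge AI tool choices.</td><td>next job type = repo_change/github_pr, and it includes target_repo_url.</td></tr>
<tr><td>Metric improvement</td><td>Fewer duplicate static packs, a higher PR/deployment output rate, lower fake-success risk.</td><td>No new generic project_runner job is added to the same loop within 7 days.</td></tr>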
</table>
</section>
<section>
<h2>Dispatch Guard Spec</h2>
<table>
<tr><th>Field/Rule</th><th>Value</th></tr>
<tr><td>trigger</td><td><code>ai_native_project_id=a8befe83-b818-482a-a6bb-3df58f50c3a2</code> and the same lane already has ≥ 2 durable artifacts</td></tr>
<tr><td>block</td><td>Refuse to dispatch another <code>project_runner</code> static pack; require capability repair or a repo target.</td></tr>
<tr><td>route</td><td>If <code>target_repo_url</code> exists, dispatch <code>repo_change</code>; if a PR is needed, dispatch <code>github_pr</code>.</td></tr>
<tr><td>audit</td><td>Record the previous round's artifact URL, decision-record, reason for incompletion, and next-step owner/due.</td></tr>
</table>
</section>
</main>
</body>
</html>

Data Model / API / Sync / Permission Spec

Core Tables

ai_review_predictions

| column | type | note |
|---|---|---|
| id | uuid | prediction id |
| project_id | uuid | source project |
| review_id | uuid | last review source |
| prediction_text | text | what AI predicted |
| expected_signal_type | text | github_commit, line_reply, action_item_done, deployment |
| expected_by | timestamptz | due time |
| owner_profile_id | uuid | accountable person |
| created_by_worker_id | text | audit |

prediction_evidence_links

| column | type | note |
|---|---|---|
| id | uuid | link id |
| prediction_id | uuid | FK |
| signal_source | text | signals/action_items/github/line |
| signal_ref | text | URL/id |
| evidence_score | numeric | 0-1 |
| matched_terms | jsonb | why matched |
| verdict | text | hit/miss/partial/unknown |

dispatch_repair_records

| column | type | note |
|---|---|---|
| id | uuid | repair id |
| session_loop_key | text | loop bucket |
| repeat_count | int | same generic dispatch count |
| last_artifact_url | text | prior durable artifact |
| missing_capability | text | target_repo_url, repo_change, github_pr |
| next_worker_kind | text | repo_change/github_pr/project_review |
| owner | text | Louis/PLS |
| due_at | timestamptz | repair deadline |
| status | text | proposed/accepted/implemented/rejected |

API Contract

  • POST /api/prediction-verification/run
    • input: { project_id, review_id, evidence_window_days }
    • output: { scorecard_id, hit_rate, misses, correction_actions }
  • POST /api/dispatch/repair-route
    • input: { session_loop_key, ai_native_project_id, target_repo_url? }
    • output: { route: "repo_change" | "github_pr" | "project_runner_blocked", reason }
  • GET /api/prediction-verification/scorecards/:id
    • returns weekly scorecard with verdicts and owner actions.
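The POST /api/dispatch/repair-route contract can be illustrated with a plain function that returns the documented output shape. This is a sketch, not the PLS implementation: the `pr_needed` flag is a hypothetical input that is not part of the contract above, added only because the spec does not say how repo_change and github_pr are told apart.

```python
from typing import Dict, Optional


def repair_route(
    session_loop_key: str,
    ai_native_project_id: str,
    target_repo_url: Optional[str] = None,
    pr_needed: bool = False,  # hypothetical flag, not in the documented input
) -> Dict[str, str]:
    """Return {"route": ..., "reason": ...} as in the API contract."""
    if not target_repo_url:
        # Without a repo target the generic lane is blocked, not re-dispatched.
        return {
            "route": "project_runner_blocked",
            "reason": f"loop {session_loop_key} has no target_repo_url",
        }
    route = "github_pr" if pr_needed else "repo_change"
    return {
        "route": route,
        "reason": f"target_repo_url present for project {ai_native_project_id}",
    }
```

For example, calling it without `target_repo_url` yields the `project_runner_blocked` route, which is the case where a dispatch_repair_record should be written.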

Sync

  • Ingest from GitHub commits, LINE signals, action items, strategy deliverables, deployment logs.
  • Join by project_id, profile_id, time window, matched terms, and explicit artifact refs.
  • Store immutable evidence refs; update verdicts by appending new evidence, not overwriting old evidence.
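The append-only rule in the last bullet might be modelled as in the sketch below. Field names follow the prediction_evidence_links table; the `EvidenceLog` class and the choice that the current verdict is the verdict of the most recently appended link are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)  # frozen: a stored evidence ref cannot be edited
class EvidenceLink:
    prediction_id: str
    signal_source: str
    signal_ref: str
    evidence_score: float
    verdict: str


class EvidenceLog:
    """Hypothetical in-memory stand-in for prediction_evidence_links."""

    def __init__(self) -> None:
        self._links: List[EvidenceLink] = []

    def append(self, link: EvidenceLink) -> None:
        if not 0.0 <= link.evidence_score <= 1.0:
            raise ValueError("evidence_score must be within 0-1")
        self._links.append(link)  # never overwrite or delete older links

    def history(self, prediction_id: str) -> Tuple[EvidenceLink, ...]:
        return tuple(l for l in self._links if l.prediction_id == prediction_id)

    def current_verdict(self, prediction_id: str) -> str:
        # Assumption: the latest appended link decides the verdict.
        links = self.history(prediction_id)
        return links[-1].verdict if links else "unknown"
```

Because older links stay in the history, a verdict that moves from partial to hit still shows which evidence supported each step.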

Permissions

  • Louis: approve repo target, force route override, accept scorecard.
  • PLS admin/worker: create repair records, route jobs, write audit logs.
  • Project members: read scorecard, comment on verdict, attach evidence.
  • External/shared artifacts: read-only URL only; no secrets or .env.

Audit / Rollback

  • Every route decision writes dispatch_repair_records.
  • If repo task fails, fall back to project_review with failure reason, not another generic project_runner.
  • Fake success guard: artifact kind github_pr requires verified PR URL; deployment requires verified URL/log.

Decision Record

Decision

Adopt Capability Repair Dispatch Breaker for this round.

Problem

The project repeatedly receives project_runner production-pack jobs even after durable artifacts already exist. The next useful step is not another document; it is a PLS backend/repo change that routes this loop to repo_change/github_pr once repo target information exists.

Options Considered

  1. Produce another D7/D14/D30 production pack.
    • Rejected: duplicates previous artifacts and does not move the system toward PR/deployment.
  2. Claim GitHub PR/deployment success.
    • Rejected: no verified repo URL or deploy target was provided.
  3. Produce a capability repair dispatch pack with data model, API route, acceptance tests, and adoption path.
    • Recommended: turns the blocker into an actionable backend/repo task and leaves an audit trail.

Recommendation

PLS should route the next job to repo_change when target_repo_url is present. If absent, PLS should ask Louis/PLS admin for that repo target instead of dispatching another generic project_runner task.

Adoption Status

Proposed in this deliverable. Acceptance requires the next PLS job for this loop to include target_repo_url and repo_change/github_pr, or an explicit rejection reason from Louis/PLS admin.

Landing Path

Owner: Louis / PLS capability owner. Due: 2026-05-25. Acceptance: no more same-loop static production packs; next execution is PR/staging or explicit repo blocker.

Feedback If Not Adopted

If PLS still needs a project_runner pack, return the missing production decision: which person will use it, which repo cannot be touched, and why a code change is not allowed yet.

{
"job_id": "3bc391bc-a6cd-459b-ad29-ccb2afdeb6b1",
"project": "AI 自建專案:公司AI化 的最大化推進",
"learned_at": "2026-05-24T05:25:00+08:00",
"market_context": [
{
"source": "OpenTelemetry Signals",
"lesson": "Mature observability treats traces, metrics, and logs/events as correlatable signals; PLS prediction verification should use multi-source evidence links rather than a single text summary."
},
{
"source": "LangSmith evaluation docs",
"lesson": "The mature practice for LLM/agent evaluation is to connect regression tests through datasets, experiments, and traces; PLS should solidify review predictions into a re-runnable dataset."
},
{
"source": "OpenAI Evals",
"lesson": "Evaluations should be repeatable, comparable, and versionable; every PLS review prediction should retain its eval version and evidence window."
}
],
"pls_next_checks": [
"If the same session_loop_key has already completed multiple production artifacts, first check whether target_repo_url exists.",
"If target_repo_url does not exist, do not dispatch project_runner again; create a dispatch_repair_record and ask the owner to supply the repo.",
"If an artifact claim includes github_pr/deployment, the URL or command result must be verified."
],
"assumptions_overturned": [
"The gap is not a missing production pack; several rounds of packs already exist.",
"The gap is not insufficient market research; what is actually missing is a modifiable repo and worker lane routing.",
"project_runner alone cannot push this project to production; it must hand off to repo_change/github_pr."
],
"next_iteration_condition": "The next round's job must include target_repo_url or an explicit repo blocker; otherwise mark capability repair as unresolved."
}

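Production Brief: AI Prediction Verification Module Dispatch Breaker

Scenario

PLS has repeatedly dispatched the project "AI 自建專案:公司AI化 的最大化推進" (AI self-built project: maximizing the company's AI adoption) as a project_runner production pack. Earlier rounds already produced D7, D14, D30, a dashboard, a repo handoff, and a capability repair pack. This round's context still provides no target_repo_url or modifiable repo path, so the real blocker is not missing documents; it is that the PLS dispatch lane has not switched to repo_change/github_pr.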

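30-Day Path

  • D1: Louis or the PLS backend supplies target_repo_url; stop generic project_runner static packs for this loop.
  • D7: The repo_change worker implements the dispatch guard, which detects repeated artifacts for the same project and reroutes to a repo task.
  • D14: Open a PR to staging that implements the prediction review/evidence join, the weekly scorecard, and correction actions.
  • D30: The weekly cadence goes live; AI prediction hit rate, correction action completion rate, and repeat dispatch rate are added to the PLS scorecard.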

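Goal-to-Goal E2E

  • Original goal: have the AI automatically check whether the predictions from the last review came true.
  • Deliverables: this round delivers the dispatch breaker console, data model, acceptance tests, decision record, and learning memory.
  • People adoption: Louis designates the repo; PLS dispatches repo_change/github_pr; zihrou/iron use the scorecard to see whether AI tool choices are effective.
  • Metric improvement: repeated project_runner jobs drop to zero, the PR/deployment output rate rises, fake PR/deployment risk falls, and worker cost decreases.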

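Value / Money Path

  • Revenue: AI tool choices can be ranked by hit evidence, avoiding a 2-week delay from people working on low-hit-rate tools.
  • Cost: stop repeatedly producing static artifacts, reducing session worker and manual review costs.
  • Risk: use e2e evidence and the decision record to prevent work that looks complete but has no repo or deployment.
  • Conversion: turn signals/action items/reviews into a trustworthy scorecard that helps Louis decide faster.
  • Freed capacity: PLS automatically moves stuck tasks to the correct worker lane; people only supply authorization or the repo.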

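Strengthening People's Capability

This is not a pile of text; it lets the owner judge:

  • Which AI predictions actually came true.
  • Which tool choices should be kept, merged, or retired.
  • Which projects should not receive more document work because they lack a repo, owner, or due date.
  • Whether the next round needs a person to supply authorization or a repo worker to change the code directly.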

Solution Stack

  • Context framework: AI review prediction → evidence signals → verdict → correction action → scorecard.
  • Workflow: if project_runner detects repeated deliveries, it must create a capability repair record and reroute to the repo lane.
  • Data/DB model: see data-model.md
  • Operable tool: capability-repair-dispatch-console.html
  • Acceptance metrics: see acceptance-tests.md
  • Adoption and escalation: the next round needs target_repo_url + repo_change/github_pr; it must not be completed as a generic project_runner job again.

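People Sync / LINE Draft

Louis: This round confirmed that AI prediction verification is not missing a production pack; the PLS dispatch lane is stuck. Please supply target_repo_url, or have the PLS backend reroute this loop to repo_change/github_pr. Acceptance is a PR/staging within 7 days and no new static pack on the same topic.

zihrou / iron: In the next round, please check whether the scorecard can converge the disagreement over AI tool choices into hit rates and correction actions, so that tools are no longer argued on subjective preference.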

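Missing Data

  • Missing backend repo URL/path.
  • Missing an executable deployment environment.
  • Missing the real schema of the current prediction tables.

Conclusion for This Round

This round's deliverable is the dispatch breaker; it does not pretend a PR was opened. The next valid action is for the PLS backend or a repo worker to change the dispatch rules.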

Market Context Sources

Current practice check date: 2026-05-24 Asia/Taipei.

  1. OpenTelemetry Signals documentation, last modified 2026-03-10, accessed via web search on 2026-05-24. URL: https://opentelemetry.io/docs/concepts/signals/ Use: confirms mature observability treats telemetry as traces, metrics, logs/events that can be collected and correlated.

  2. OpenTelemetry Logs specification, accessed via web search on 2026-05-24. URL: https://opentelemetry.io/docs/specs/otel/logs/ Use: supports log correlation with traces/metrics and source attribution, relevant to PLS evidence linking.

  3. LangSmith evaluation guide, crawled recently by search index, accessed 2026-05-24. URL: https://docs.langchain.com/langsmith/evaluate-llm-application Use: supports dataset/experiment based LLM application evaluation.

  4. OpenAI Evals API reference, accessed 2026-05-24. URL: https://platform.openai.com/docs/api-reference/evals Use: supports treating evals as structured, repeatable tests rather than ad hoc summaries.
