<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>AI Prediction Verification: Evidence Trial Runner</title>
<style>
:root {
  --bg: #f7f4ee;
  --ink: #1f2320;
  --muted: #666b63;
  --line: #d7d0c5;
  --panel: #fffef9;
  --green: #17715f;
  --blue: #245d8f;
  --red: #a94e43;
  --gold: #bb7a16;
}
* { box-sizing: border-box; }
body {
  margin: 0;
  background: var(--bg);
  color: var(--ink);
  font-family: "PingFang TC", "Noto Sans TC", ui-sans-serif, system-ui, sans-serif;
  line-height: 1.55;
}
header {
  padding: 40px 6vw 28px;
  background: #fffaf0;
  border-bottom: 1px solid var(--line);
}
.kicker {
  color: var(--blue);
  font-size: 12px;
  font-weight: 800;
  letter-spacing: .06em;
  text-transform: uppercase;
}
h1 {
  max-width: 1040px;
  margin: 10px 0;
  font-size: clamp(34px, 5vw, 68px);
  line-height: 1.04;
  letter-spacing: 0;
}
.lede {
  max-width: 920px;
  color: var(--muted);
  font-size: 19px;
}
main {
  padding: 28px 6vw 60px;
  display: grid;
  gap: 18px;
}
section {
  background: var(--panel);
  border: 1px solid var(--line);
  border-radius: 8px;
  padding: 22px;
  box-shadow: 0 12px 34px rgba(40, 34, 24, .08);
}
h2 { margin: 0 0 14px; font-size: 24px; }
h3 { margin: 0 0 8px; font-size: 17px; }
.grid {
  display: grid;
  grid-template-columns: repeat(4, minmax(0, 1fr));
  gap: 12px;
}
.two {
  display: grid;
  grid-template-columns: repeat(2, minmax(0, 1fr));
  gap: 14px;
}
.card {
  border: 1px solid var(--line);
  border-radius: 8px;
  padding: 15px;
  background: #fffdf8;
  min-height: 120px;
}
.tag {
  display: inline-flex;
  align-items: center;
  border: 1px solid var(--line);
  border-radius: 99px;
  padding: 2px 9px;
  margin-bottom: 9px;
  color: var(--muted);
  font-size: 12px;
  font-weight: 800;
}
table {
  width: 100%;
  border-collapse: collapse;
  font-size: 14px;
}
th, td {
  padding: 10px 8px;
  border-bottom: 1px solid var(--line);
  text-align: left;
  vertical-align: top;
}
th {
  color: var(--blue);
  font-size: 12px;
  text-transform: uppercase;
  letter-spacing: .04em;
}
ul, ol { margin: 0; padding-left: 20px; }
li { margin: 6px 0; }
code {
  padding: 1px 5px;
  border: 1px solid var(--line);
  border-radius: 5px;
  background: #f1eadf;
  font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
  font-size: .92em;
}
.pass { color: var(--green); font-weight: 800; }
.warn { color: var(--gold); font-weight: 800; }
.stop { color: var(--red); font-weight: 800; }
@media (max-width: 920px) {
  header, main { padding-left: 18px; padding-right: 18px; }
  .grid, .two { grid-template-columns: 1fr; }
}
</style>
</head>
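<body>
<header>
<div class="kicker">primary_artifact / production_readiness / e2e_verification</div>
<h1>AI Prediction Verification: Evidence Trial Runner</h1>
<p class="lede">This round takes "checking prediction hits against multi-source evidence" from concept to an executable D7 trial run. It defines admissible evidence, synced fields, permission audits, acceptance of a 50-item batch, and LINE adoption signals, so that the next worker can run the first round of data directly instead of rewriting the explanation.</p>
</header>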
<main>
<section>
<h2>D1 / D7 / D14 / D30 Roadmap</h2>
<div class="grid">
<div class="card"><span class="tag">D1</span><h3>Evidence Whitelist</h3>Louis signs off on the 6 evidence sources and the hit rules; zihrou confirms the miss taxonomy; iron confirms which fields can be synced.</div>
<div class="card"><span class="tag">D7</span><h3>50-Item Trial Run</h3>Sample 50 review predictions and label each one hit / miss / unknown; the unknown rate must be below 25%.</div>
<div class="card"><span class="tag">D14</span><h3>Correction Loop</h3>Create a correction action item for each recurring miss reason, and turn each source gap into an adapter work order.</div>
<div class="card"><span class="tag">D30</span><h3>Weekly-Meeting Metrics</h3>Add prediction hit rate, unknown rate, and top miss reason to the scorecard for the company's weekly AI-adoption meeting.</div>
</div>
</section>

<section>
<h2>Purpose-to-Purpose E2E</h2>
<div class="two">
<div class="card">
<h3>From Original Purpose to Deliverable</h3>
<ol>
<li>The previous review produces prediction claims.</li>
<li>Evidence adapters pull signals, action items, GitHub activity, worker completions, deployments, and human notes.</li>
<li>The normalizer converts them into a single evidence ledger.</li>
<li>The matcher produces label, confidence, source_ref, and miss_reason.</li>
<li>Reviewers correct the labels on a sample, which yields a calibration summary.</li>
</ol>
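<p>Steps 3 and 4 above can be sketched in Python as follows. This is only an illustration under stated assumptions: the <code>LedgerEntry</code> shape, the <code>normalize</code> helper, the raw keys <code>ref</code> and <code>date</code>, the confidence values, and the matching rule (same project, observed on or before the due date) are all hypothetical and are not the project's actual normalizer or matcher.</p>
<pre>
```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LedgerEntry:
    """One normalized row of the evidence ledger (hypothetical shape)."""
    source: str        # e.g. "signals", "GitHub", "deployment"
    source_ref: str    # pointer back to the raw record
    project_id: str
    observed_on: date


@dataclass(frozen=True)
class MatchResult:
    label: str                  # "hit", "miss", or "unknown"
    confidence: float
    source_ref: Optional[str]
    miss_reason: Optional[str]


def normalize(source: str, raw: dict) -> LedgerEntry:
    """Map a raw adapter record onto the shared ledger shape.

    Assumes every raw record carries "ref", "project_id", and an ISO
    "date" key; real adapters would each need their own mapping.
    """
    return LedgerEntry(
        source=source,
        source_ref=raw["ref"],
        project_id=raw["project_id"],
        observed_on=date.fromisoformat(raw["date"]),
    )


def match(project_id: str, due: date, today: date, ledger: list) -> MatchResult:
    """Toy rule: evidence for the same project by the due date counts as a hit."""
    for entry in ledger:
        if entry.project_id == project_id and due >= entry.observed_on:
            # Placeholder confidence value, not a calibrated score.
            return MatchResult("hit", 0.8, entry.source_ref, None)
    if today > due:
        # Past due with no supporting evidence: count it as a miss.
        return MatchResult("miss", 0.6, None, "no_evidence_by_due")
    # Not yet due and nothing observed: leave it undecided.
    return MatchResult("unknown", 0.0, None, None)
```
</pre>
<p>Keeping <code>source_ref</code> on every result means a reviewer can trace each label back to the raw record when correcting the sample in step 5.</p>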
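</div>
<div class="card">
<h3>From Human Adoption to Metric Improvement</h3>
<ul>
<li>Louis uses the calibration summary to decide whether the AI review can be trusted.</li>
<li>zihrou uses the miss reason to judge whether the blocker is direction, resources, authorization, or execution.</li>
<li>iron uses source gaps to decide the next adapter or API sync work order.</li>
<li>Improvement metrics: less duplicate task assignment, less misplaced confidence, and a shorter time from review to correction.</li>
</ul>
</div>
</div>
</section>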

<section>
<h2>Evidence Adapter Whitelist</h2>
<table>
<thead><tr><th>Source</th><th>Hit evidence</th><th>Required fields</th><th>Audit</th></tr></thead>
<tbody>
<tr><td>signals</td><td>A new signal matches the prediction's topic, owner, and date</td><td><code>signal_type</code>, <code>summary</code>, <code>project_id</code>, <code>created_at</code></td><td>Keep the raw payload hash</td></tr>
<tr><td>action items</td><td>Task completion, postponement, or a status change that can verify the prediction</td><td><code>title</code>, <code>status</code>, <code>due_date</code>, <code>assignee_id</code></td><td>Append-only status snapshot</td></tr>
<tr><td>GitHub</td><td>A commit, PR, or workflow run supports the technical progress</td><td><code>sha</code>, <code>author</code>, <code>message</code>, <code>url</code></td><td>URL must be verifiable</td></tr>
<tr><td>worker completions</td><td>PLS job completion, failure, or artifact upload</td><td><code>job_id</code>, <code>summary</code>, <code>artifacts</code>, <code>completed_at</code></td><td>A summary alone is not accepted; an artifact ref is required</td></tr>
<tr><td>deployment</td><td>A production URL or deploy log supports the go-live prediction</td><td><code>environment</code>, <code>url</code>, <code>status</code>, <code>run_id</code></td><td>Verify the HTTP or platform response</td></tr>
<tr><td>human review notes</td><td>Louis, zihrou, or iron explicitly accepts, rejects, or corrects</td><td><code>person_id</code>, <code>decision</code>, <code>note</code>, <code>created_at</code></td><td>Reviewer overrides are append-only</td></tr>
</tbody>
</table>
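<p>The required-fields column above could be enforced with a small check like the one below. The field names are copied from the table; the <code>missing_fields</code> helper and the choice to treat empty values as missing are assumptions made for illustration.</p>
<pre>
```python
# Required fields per whitelisted source, copied from the table above.
REQUIRED_FIELDS = {
    "signals": ("signal_type", "summary", "project_id", "created_at"),
    "action items": ("title", "status", "due_date", "assignee_id"),
    "GitHub": ("sha", "author", "message", "url"),
    "worker completions": ("job_id", "summary", "artifacts", "completed_at"),
    "deployment": ("environment", "url", "status", "run_id"),
    "human review notes": ("person_id", "decision", "note", "created_at"),
}


def missing_fields(source: str, record: dict) -> list:
    """Return the required fields that are absent or empty in a record.

    Raises ValueError for a source that is not on the whitelist.
    """
    if source not in REQUIRED_FIELDS:
        raise ValueError(f"source not on whitelist: {source!r}")
    # Assumption: None, "" and [] all count as missing, so a worker
    # completion with a summary but no artifacts is rejected.
    return [f for f in REQUIRED_FIELDS[source] if record.get(f) in (None, "", [])]
```
</pre>
<p>For example, a worker completion whose <code>artifacts</code> list is empty comes back as <code>["artifacts"]</code>, which matches the audit rule that a summary alone is not enough.</p>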
</section>

<section>
<h2>D7 Trial Run Procedure</h2>
<div class="two">
<div class="card">
<h3>Batch Rules</h3>
<ul>
<li>Sample 50 items: P0/P1 first, from the last 30 days, covering at least 3 owners.</li>
<li>Every prediction must have an owner, a due date, expected evidence, and an impact metric.</li>
<li>The label must be one of <code>hit</code>, <code>miss</code>, or <code>unknown</code>.</li>
<li>If the unknown rate is 25% or higher, the batch must not proceed to dashboard productization.</li>
</ul>
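<p>A rough sketch of how the batch rules above might be checked before a run. The dict keys (<code>owner</code>, <code>due</code>, <code>expected_evidence</code>, <code>impact_metric</code>, <code>created</code>) and the <code>batch_problems</code> helper are hypothetical names, and the P0/P1 priority ordering is left out because it depends on how the sample is drawn.</p>
<pre>
```python
from datetime import date, timedelta

# Hypothetical key names for the four fields every prediction must carry.
REQUIRED_KEYS = ("owner", "due", "expected_evidence", "impact_metric")


def batch_problems(predictions: list, today: date) -> list:
    """Return human-readable rule violations for a candidate batch.

    An empty list means the batch satisfies the size, owner-coverage,
    30-day-window, and required-field rules.
    """
    problems = []
    if len(predictions) != 50:
        problems.append(f"expected 50 predictions, got {len(predictions)}")
    owners = {p.get("owner") for p in predictions if p.get("owner")}
    if 3 > len(owners):
        problems.append(f"need at least 3 owners, got {len(owners)}")
    cutoff = today - timedelta(days=30)
    for i, p in enumerate(predictions):
        missing = [k for k in REQUIRED_KEYS if not p.get(k)]
        if missing:
            problems.append(f"prediction {i} is missing {', '.join(missing)}")
        created = p.get("created")
        if created is None or cutoff > created:
            problems.append(f"prediction {i} is outside the 30-day window")
    return problems
```
</pre>
<p>Returning a list of problems rather than a single boolean lets the runner report every violation in one pass, which is useful when the seed list is being assembled by hand.</p>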
</div>
<div class="card">
<h3>Go / No-Go</h3>
<ul>
<li><span class="pass">Go:</span> unknown rate &lt; 25%, the reviewer sample is complete, and the top miss reason can be routed.</li>
<li><span class="warn">Hold:</span> unknown rate 25-40%; fill in the missing source adapters first.</li>
<li><span class="stop">Stop:</span> a prediction lacks an owner, due date, or expected evidence; go back to the D1 policy.</li>
</ul>
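<p>The three outcomes above can be expressed as a small decision function. This is a sketch, not the project's gate: the function names and boolean inputs are hypothetical, and because the rules above do not say what happens when the unknown rate exceeds 40%, the sketch assumes such a batch is held rather than stopped.</p>
<pre>
```python
from collections import Counter


def unknown_rate(labels: list) -> float:
    """Fraction of labels equal to "unknown"; rejects labels outside the allowed set."""
    allowed = {"hit", "miss", "unknown"}
    bad = set(labels) - allowed
    if bad:
        raise ValueError(f"unexpected labels: {sorted(bad)}")
    if not labels:
        raise ValueError("no labels to score")
    return Counter(labels)["unknown"] / len(labels)


def decide(labels: list, fields_complete: bool, sample_reviewed: bool,
           top_miss_routable: bool) -> str:
    """Return "go", "hold", or "stop" for a labeled batch."""
    if not fields_complete:
        # A prediction lacks owner, due date, or expected evidence.
        return "stop"
    rate = unknown_rate(labels)
    if 0.25 > rate and sample_reviewed and top_miss_routable:
        return "go"
    # Assumption: anything that is not a clear Go is held, including an
    # unknown rate above 40%, which the rules above do not cover.
    return "hold"
```
</pre>
<p>A batch with exactly 25% unknown labels is held, consistent with the Go condition being strictly below 25%.</p>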
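</div>
</div>
</section>

<section>
<h2>Adoption Path and LINE Draft</h2>
<div class="card">
<p><strong>Recipients:</strong> Louis (owner), zihrou (supervisor), iron (owner).</p>
<p><strong>LINE draft:</strong> "This round is not another write-up of AI prediction verification. It produces the D7 trial-run rules: a whitelist of 6 evidence adapters, the 50-item sampling rule, the unknown &lt; 25% acceptance criterion, and correction work orders mapped to miss reasons. Louis, please sign off on the label policy today; zihrou, please confirm the miss taxonomy; iron, please confirm which sources can currently be synced, so that the first batch of 50 can run before 5/31."</p>
<p><strong>Expected reply signals:</strong> "label policy approved", "which source adapter is missing", "the 50-item seed list has been created", and "unknown rate".</p>
</div>
</section>
</main>
</body>
</html>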