<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Company AI Transformation: Prediction Verification Console</title>
<style>
:root{--ink:#17202a;--muted:#607080;--line:#d8e0e7;--paper:#f6f8fb;--card:#fff;--blue:#1d4ed8;--green:#0f7f5c;--amber:#a16207;--red:#b3361d;--purple:#6d28d9}
*{box-sizing:border-box} body{margin:0;background:var(--paper);color:var(--ink);font-family:Inter,ui-sans-serif,system-ui,-apple-system,BlinkMacSystemFont,"Segoe UI",sans-serif;line-height:1.5}
header{background:#fff;border-bottom:1px solid var(--line);padding:28px clamp(20px,4vw,56px)} main{padding:24px clamp(20px,4vw,56px) 48px}
h1{margin:0 0 12px;font-size:clamp(30px,4vw,52px);line-height:1.05;max-width:1050px} h2{margin:0 0 12px;font-size:22px} h3{margin:0 0 6px;font-size:16px} p{margin-top:0} code{background:#eef3f8;padding:1px 5px;border-radius:4px}
.sub{max-width:1060px;color:var(--muted);font-size:17px}.grid{display:grid;gap:16px}.kpis{grid-template-columns:repeat(4,minmax(0,1fr));margin-top:22px}.two{grid-template-columns:1.1fr .9fr}.three{grid-template-columns:repeat(3,minmax(0,1fr))}.timeline{grid-template-columns:repeat(4,minmax(0,1fr))}.flow{grid-template-columns:repeat(5,minmax(0,1fr))}
.card{background:var(--card);border:1px solid var(--line);border-radius:8px;padding:18px;box-shadow:0 1px 2px rgba(23,32,42,.04)}.metric{font-size:34px;font-weight:780}.label{color:var(--muted);font-size:13px}
.pill{display:inline-flex;border:1px solid var(--line);border-radius:999px;padding:4px 10px;font-size:12px;background:#fff;margin:0 6px 8px 0;white-space:nowrap}.ok{color:var(--green)}.warn{color:var(--amber)}.bad{color:var(--red)}.info{color:var(--blue)}
table{width:100%;border-collapse:collapse;font-size:14px} th,td{text-align:left;padding:10px;border-bottom:1px solid var(--line);vertical-align:top} th{color:var(--muted);font-size:12px;text-transform:uppercase}
.day{border-left:4px solid var(--purple)}.step{border:1px solid var(--line);border-radius:8px;padding:12px;min-height:126px;background:#fbfdff}.step strong{display:block;color:var(--purple);margin-bottom:6px}.source a{color:var(--blue);word-break:break-word}
@media(max-width:920px){.kpis,.two,.three,.timeline,.flow{grid-template-columns:1fr}h1{font-size:34px}}
</style>
</head>
<body>
<header>
<span class="pill info">PLS production delivery pack</span><span class="pill ok">Solution: eval / system</span>
<h1>Company AI Transformation: AI Prediction Verification Console</h1>
<p class="sub">Turn the "add an AI prediction verification module" request into a production-grade acceptance system: record the predictions from the last review in a bets ledger, then use signals, action items, GitHub commits, worker health, and LINE/Drive signals to check hits, deviations, and data gaps automatically and to drive the next round of calibration.</p>
<section class="grid kpis">
<div class="card"><div class="metric">99%</div><div class="label">The Company AI Transformation progress signal previously advanced from 98% to 99%</div></div>
<div class="card"><div class="metric ok">Bets</div><div class="label">Strategic predictions must be entered in the ledger before they can be verified</div></div>
<div class="card"><div class="metric warn">2 weeks</div><div class="label">Tool-selection alignment risk across zihrou / iron / Louis</div></div>
<div class="card"><div class="metric">D30</div><div class="label">Prediction calibration loop for the AI management layer takes shape</div></div>
</section>
</header>
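<main class="grid">
<section class="grid two">
<div class="card">
<h2>Problem for This Round</h2>
<p>Company AI Transformation has already accumulated commits for persona reasoning maps, heartbeat nodes, worker isolation, and prediction-depth vital signs. But if the predictions in an AI review are not checked automatically against the next round's signals and action items, the AI ends up "good at talking" rather than "good at self-correcting".</p>
<span class="pill">Owner: Louis</span><span class="pill">Stakeholders: zihrou / iron</span><span class="pill">Due: D7 first scorecard</span><span class="pill">Acceptance: hit/deviation checks are rerunnable</span>
</div>
<div class="card">
<h2>Solution Type</h2>
<p><strong>eval / system</strong>. This is not a one-off report or reminder but a regression-test system for the AI management layer: predictions, evidence, hit verdicts, calibration, permissions, and audits must all be traceable.</p>
</div>
</section>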
<section class="card">
<h2>D1 / D7 / D14 / D30</h2>
<div class="grid timeline">
<div class="card day"><h3>D1</h3><p>Define the prediction bet schema, the evidence mapping, and the hit/miss/partial/unknown rubric, and publish this console.</p></div>
<div class="card day"><h3>D7</h3><p>Load 20 bets from the last review and use signals, action items, and GitHub commits to compute a first-pass hit rate automatically.</p></div>
<div class="card day"><h3>D14</h3><p>Add a prediction calibration section to the Company AI Transformation weekly review, listing the causes of deviation and the model adjustments for the next round.</p></div>
<div class="card day"><h3>D30</h3><p>Build the AI management layer scorecard: prediction quality, tool-selection consistency, worker quality, and project-progress hit rate governed in a single table.</p></div>
</div>
</section>
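<section class="card">
<h2>Purpose-to-Purpose E2E</h2>
<div class="grid flow">
<div class="step"><strong>Original purpose</strong>Company AI Transformation aims to make AI part of the decision layer, not just a source of suggestions.</div>
<div class="step"><strong>Prediction</strong>Each review's expected outcome, risk, owner, and deadline are written to the bets ledger.</div>
<div class="step"><strong>Evidence</strong>Signals, action items, commits, worker health, and LINE/Drive signals are matched against each bet automatically.</div>
<div class="step"><strong>Calibration</strong>Each bet is judged hit, partial, miss, or unknown (no data); the output is the next round's route and tool-selection corrections.</div>
<div class="step"><strong>Value</strong>Fewer wrong decisions carried forward, more unified tooling, less time spent aligning on AI adoption, and more trustworthy worker output.</div>
</div>
</section>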
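<section class="card">
<h2>Sketch: Bets Ledger Record</h2>
<p>The prediction and evidence steps of the flow could be modeled roughly as follows. This is a minimal sketch, not the project's actual schema: the class names <code>PredictionBet</code> and <code>EvidenceLink</code>, the <code>bet_id</code>, <code>source</code>, <code>reference</code>, and <code>observed_on</code> fields, and all sample values are hypothetical. Only the four bet fields (expected outcome, risk, owner, deadline) come from the flow itself.</p>
<pre><code>
```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class EvidenceLink:
    """One piece of evidence attached to a bet (hypothetical shape)."""
    source: str        # e.g. "signal", "action_item", "commit", "worker_health"
    reference: str     # free-form pointer such as a commit hash or task id
    observed_on: date  # when the evidence was observed


@dataclass
class PredictionBet:
    """One ledger row holding the four fields named in the flow."""
    bet_id: str        # hypothetical identifier
    expected_outcome: str
    risk: str
    owner: str
    deadline: date
    evidence: list = field(default_factory=list)

    def attach(self, link):
        """Record an evidence link against this bet."""
        self.evidence.append(link)

    def evidence_in_time(self):
        """Return only the evidence observed on or before the deadline."""
        return [e for e in self.evidence if self.deadline >= e.observed_on]


# Placeholder data for illustration only.
bet = PredictionBet(
    bet_id="bet-001",
    expected_outcome="First scorecard published",
    risk="Tool choices diverge",
    owner="Louis",
    deadline=date(2025, 1, 7),
)
bet.attach(EvidenceLink("commit", "abc123", date(2025, 1, 5)))
bet.attach(EvidenceLink("signal", "sig-9", date(2025, 1, 9)))
print(len(bet.evidence_in_time()))  # prints 1: only the commit is on time
```
</code></pre>
<p>Keeping late evidence on the bet, rather than discarding it, leaves room to explain a timing deviation later instead of losing the record.</p>
</section>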
<section class="grid two">
<div class="card">
<h2>Verification Rubric</h2>
<table>
<thead><tr><th>Result</th><th>Criterion</th><th>Next step</th></tr></thead>
<tbody>
<tr><td><strong>hit</strong></td><td>Direct evidence supports the prediction within the deadline.</td><td>Increase the weight of that route / signal.</td></tr>
<tr><td><strong>partial</strong></td><td>The direction was right, but the scope, timing, or owner deviated.</td><td>Record the cause of the deviation and adjust the next round's ask.</td></tr>
<tr><td><strong>miss</strong></td><td>Evidence shows the prediction did not happen or pointed in the wrong direction.</td><td>Require a counterfactual analysis and a decision record.</td></tr>
<tr><td><strong>unknown</strong></td><td>Data is missing, so no verdict is possible.</td><td>Create a data_gap task; it must not be counted as a hit.</td></tr>
</tbody>
</table>
</div>
<div class="card">
<h2>Data / API / Permissions</h2>
<p><strong>Tables:</strong> <code>prediction_bets</code>, <code>prediction_evidence_links</code>, <code>prediction_verdicts</code>, <code>calibration_runs</code>, <code>tool_alignment_risks</code>.</p>
<p><strong>APIs:</strong> <code>POST /ai/reviews/:id/bets</code>, <code>POST /ai/predictions/verify</code>, <code>GET /ai/predictions/scorecard</code>.</p>
<p><strong>Permissions:</strong> AI workers can propose verdicts; Louis can override them; zihrou/iron can add evidence; every override requires an audit reason.</p>
</div>
</section>
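<section class="card">
<h2>Sketch: Verdicts and Overrides</h2>
<p>The rubric and the override rule could be expressed as below. This is a sketch under stated assumptions rather than the real verifier: the three boolean inputs to <code>judge</code>, the <code>Verdict</code> class, and the <code>override</code> function are hypothetical names. The four results, their next steps, and the rule that only Louis overrides and must give an audit reason are taken from the rubric and permissions on this page.</p>
<pre><code>
```python
from dataclasses import dataclass
from typing import Optional

# Next steps as listed in the rubric, keyed by the four results.
NEXT_STEP = {
    "hit": "Increase the weight of that route / signal.",
    "partial": "Record the cause of the deviation and adjust the next round's ask.",
    "miss": "Require a counterfactual analysis and a decision record.",
    "unknown": "Create a data_gap task; do not count as a hit.",
}


def judge(has_evidence, supports_prediction, on_scope_time_owner):
    """Map three hypothetical boolean checks onto the four rubric results."""
    if not has_evidence:
        return "unknown"  # missing data never becomes a hit
    if not supports_prediction:
        return "miss"
    if not on_scope_time_owner:
        return "partial"
    return "hit"


@dataclass
class Verdict:
    bet_id: str
    result: str
    proposed_by: str
    override_reason: Optional[str] = None


def override(verdict, new_result, actor, reason):
    """Return an overridden verdict; reject other actors and empty reasons."""
    if actor != "Louis":
        raise PermissionError("only Louis may override a verdict")
    if not reason or not reason.strip():
        raise ValueError("every override requires an audit reason")
    if new_result not in NEXT_STEP:
        raise ValueError("unknown result: " + new_result)
    return Verdict(verdict.bet_id, new_result, actor, reason.strip())
```
</code></pre>
<p>Returning a new <code>Verdict</code> instead of mutating the proposed one keeps the AI worker's original judgment available for audit alongside the override.</p>
</section>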
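<section class="grid three">
<div class="card"><h2>Value / Money Path</h2><p>AI progress stops relying on gut feeling: cut losses on wrong predictions early, double down on accurate routes, and reduce the cost of tool divergence and rework.</p></div>
<div class="card"><h2>Human Capability Gains</h2><p>Louis sees when the AI is accurate and when it drifts; zihrou/iron can align on tool selection using evidence rather than pulling in different directions based on personal experience.</p></div>
<div class="card"><h2>Next Upgrade</h2><p>Connect the real review bet ledger and generate a weekly calibration report plus trust scores per worker/route.</p></div>
</section>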
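<section class="card">
<h2>Sketch: Route Trust Score</h2>
<p>One possible way to derive a trust score per route from verdicts is sketched below. The scoring weights (hit = 1.0, partial = 0.5, miss = 0.0) and the function name <code>trust_scores</code> are assumptions made for illustration; this page does not define a formula. The sketch keeps <code>unknown</code> verdicts out of the score and reports them as data gaps, so that missing data is never counted as a hit.</p>
<pre><code>
```python
from collections import defaultdict


def trust_scores(verdicts):
    """Compute a per-route score from (route, result) pairs.

    Assumed weights: hit = 1.0, partial = 0.5, miss = 0.0.
    "unknown" results are excluded from the score and counted as data gaps.
    """
    weights = {"hit": 1.0, "partial": 0.5, "miss": 0.0}
    totals = defaultdict(float)
    decided = defaultdict(int)
    gaps = defaultdict(int)
    for route, result in verdicts:
        if result == "unknown":
            gaps[route] += 1
            continue
        totals[route] += weights[result]
        decided[route] += 1
    report = {}
    for route in sorted(set(decided) | set(gaps)):
        n = decided[route]
        # A route with no decided bets has no score yet.
        score = totals[route] / n if n else None
        report[route] = {"score": score, "decided": n, "data_gaps": gaps[route]}
    return report
```
</code></pre>
<p>For example, a route with one hit, one partial, and one unknown would score 0.75 over two decided bets with one data gap, while a route with only unknown verdicts would have no score at all.</p>
</section>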
<section class="card source">
<h2>Market Maturity Inputs</h2>
<p>Evidently documents data and prediction drift monitoring for production AI quality checks: <a href="https://docs.evidentlyai.com/metrics/preset_data_drift">Evidently data and prediction drift</a>.</p>
<p>Google's ML Test Score offers an actionable production-readiness rubric for ML systems: <a href="https://research.google/pubs/whats-your-ml-test-score-a-rubric-for-ml-production-systems/">Google ML Test Score</a>.</p>
<p>The Evidently monitoring overview emphasizes batch evaluation and continuous collaboration around AI quality: <a href="https://docs.evidentlyai.com/docs/platform/monitoring_overview">Evidently monitoring overview</a>.</p>
</section>
</main>
</body>
</html>