@esz135888
Created May 24, 2026 01:25
PLS job e5a8a4ef project completion heartbeat operating production pack

Production Acceptance Tests

Pass / Fail Gates

  1. Primary artifact URL returns HTTP 200.
  2. Main deliverable is not plain text; it is an openable console artifact.
  3. Required artifacts exist: production brief, data model, acceptance tests, decision record, and artifact URL record.
  4. D1 / D7 / D14 / D30 path is explicit.
  5. Purpose-to-purpose E2E path is explicit and measurable.
  6. At least two current or comparable external practices are cited.
  7. Owner, due, and acceptance are present.
  8. Data model includes schema, API, sync, permissions, and audit boundary.
  9. People sync draft exists.
  10. Learning memory exists.

E2E Scenario Test

Given a project has no verified primary artifact and a heartbeat older than two intervals, when the watchdog runs, then completion_gates.primary_artifact_openable fails, completion_gates.heartbeat_fresh warns, and a repair_proposals row is created with owner, due, and acceptance.
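
The scenario above can be sketched as a small check. This is a minimal Python sketch under stated assumptions, not the PLS watchdog itself: `Project`, `run_watchdog`, the `failed_gates` field, and the `route="watchdog"` default are hypothetical names introduced for illustration. Only the gate keys, the statuses, and the two-interval rule come from this pack.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Project:
    # Hypothetical in-memory stand-in for the rows a watchdog would read.
    project_id: str
    last_heartbeat_at: Optional[datetime]
    heartbeat_interval: timedelta
    primary_artifact_verified: bool
    owner: str
    due_at: datetime
    acceptance: str


def run_watchdog(project: Project, now: datetime) -> dict:
    """Evaluate the two gates from the scenario and draft a repair proposal."""
    gates = {}

    # Gate 1: the primary artifact needs a passing verification.
    gates["primary_artifact_openable"] = (
        "pass" if project.primary_artifact_verified else "fail"
    )

    # Gate 2: a heartbeat is fresh while it is younger than two intervals.
    stale = (
        project.last_heartbeat_at is None
        or now - project.last_heartbeat_at >= 2 * project.heartbeat_interval
    )
    gates["heartbeat_fresh"] = "warn" if stale else "pass"

    proposals = []
    failed = sorted(key for key, status in gates.items() if status != "pass")
    if failed:
        proposals.append({
            "project_id": project.project_id,
            "route": "watchdog",  # assumed default route
            "status": "draft",
            "owner": project.owner,
            "due_at": project.due_at,
            "acceptance": project.acceptance,
            "failed_gates": failed,  # hypothetical field, not in the schema
        })
    return {"gates": gates, "repair_proposals": proposals}


# Example: no verified artifact and a heartbeat three intervals old.
now = datetime(2026, 5, 24, 1, 25, tzinfo=timezone.utc)
stalled = Project(
    project_id="demo-project",
    last_heartbeat_at=now - timedelta(hours=3),
    heartbeat_interval=timedelta(hours=1),
    primary_artifact_verified=False,
    owner="Louis",
    due_at=now + timedelta(days=7),
    acceptance="primary artifact opens",
)
result = run_watchdog(stalled, now)
print(result["gates"])
```

Passing `now` in explicitly keeps the check rerunnable against recorded timestamps.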

Next Integration Test

Connect one real system_settings heartbeat row and one strategy-map heartbeat node. The scorecard must display the same freshness state and a timeline event with the same source reference.

Data Model

Tables

project_heartbeats

  • id: uuid primary key.
  • project_id: uuid.
  • source: enum system_settings, worker, github, line, artifact, review.
  • heartbeat_type: enum cron, progress, commit, reply, upload, decision.
  • status: enum healthy, late, missing, failed, recovered.
  • observed_at: timestamptz.
  • source_ref: text.
  • evidence_json: jsonb.

completion_gates

  • id: uuid primary key.
  • project_id: uuid.
  • gate_key: text.
  • status: enum pass, warn, fail, blocked.
  • owner_id: uuid nullable.
  • due_at: timestamptz nullable.
  • acceptance: text.
  • last_checked_at: timestamptz.
  • failure_reason: text nullable.
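
As an illustration of how the nullable `owner_id` and `due_at` columns interact with the owner/due/acceptance requirement, the following is a minimal Python sketch. The `CompletionGate` dataclass mirrors the columns listed above; the helper `missing_accountability` is a hypothetical name, and treating a blank `acceptance` string as missing is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class GateStatus(str, Enum):
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"
    BLOCKED = "blocked"


@dataclass
class CompletionGate:
    id: str
    project_id: str
    gate_key: str
    status: GateStatus
    acceptance: str
    last_checked_at: datetime
    owner_id: Optional[str] = None
    due_at: Optional[datetime] = None
    failure_reason: Optional[str] = None


def missing_accountability(gate: CompletionGate) -> list[str]:
    """Return the owner/due/acceptance fields that are still empty."""
    missing = []
    if gate.owner_id is None:
        missing.append("owner_id")
    if gate.due_at is None:
        missing.append("due_at")
    if not gate.acceptance.strip():  # assumption: blank text counts as missing
        missing.append("acceptance")
    return missing


# Example: a gate row that has acceptance text but no owner or due date yet.
gate = CompletionGate(
    id="gate-1",
    project_id="demo-project",
    gate_key="owner_due_acceptance",
    status=GateStatus.FAIL,
    acceptance="primary artifact opens",
    last_checked_at=datetime(2026, 5, 24, 1, 25, tzinfo=timezone.utc),
)
print(missing_accountability(gate))
```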

repair_proposals

  • id: uuid primary key.
  • project_id: uuid.
  • trigger_gate_id: uuid.
  • route: enum communication, doc, project, tool, system, agent, watchdog, eval, governance.
  • proposal: text.
  • status: enum draft, sent, accepted, rejected, done.
  • created_by_worker_id: text.
  • audit_ref: text.

artifact_verifications

  • id: uuid primary key.
  • job_id: uuid.
  • artifact_url: text.
  • http_status: integer.
  • verified_at: timestamptz.
  • verified_by: text.
  • result: enum pass, fail.
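
A verification record like the one above could be produced as follows. This is a hedged Python sketch using only the standard library: `fetch_status` and `verify_artifact` are hypothetical names, and returning `0` for an unreachable host is an assumption. The fetcher is injectable so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Callable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a GET request; 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code.
        return err.code
    except urllib.error.URLError:
        return 0  # assumption: 0 stands for "no HTTP response"


def verify_artifact(
    job_id: str,
    artifact_url: str,
    verified_by: str,
    fetch: Callable[[str], int] = fetch_status,
) -> dict:
    """Build an artifact_verifications-shaped record for one URL."""
    status = fetch(artifact_url)
    return {
        "job_id": job_id,
        "artifact_url": artifact_url,
        "http_status": status,
        "verified_at": datetime.now(timezone.utc),
        "verified_by": verified_by,
        "result": "pass" if status == 200 else "fail",
    }


# Example with a stubbed fetcher, so no request is sent.
record = verify_artifact(
    job_id="e5a8a4ef-6539-4369-a350-bc47106f9e5b",
    artifact_url="https://example.invalid/artifact",
    verified_by="watchdog",
    fetch=lambda url: 200,
)
print(record["result"])
```

An HTTP 200 only covers the first gate; checking that the deliverable is not plain text would need a separate content check.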

API Surface

  • GET /projects/:project_id/heartbeat-status
  • GET /projects/:project_id/completion-gates
  • POST /projects/:project_id/repair-proposals
  • POST /artifacts/verify
  • POST /project-reviews/:review_id/decision

Sync Boundary

Read from system_settings, GitHub commit memory, worker job logs, artifact upload events, LINE response events, and project review decisions. Write only gate state, repair proposals, verification records, and audit events.

Permissions / Audit

Workers may append evidence and propose repairs. Project owners may accept or reject repairs. Supervisors may approve escalations. Audit records are append-only and include actor, timestamp, previous state, next state, and source evidence.
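
The roles and the append-only rule can be illustrated with a short Python sketch. The action names in `PERMISSIONS`, the `AuditLog` class, and `authorize` are hypothetical; a real audit boundary would be enforced in the backend and the database rather than in application memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumed action names, derived from the role descriptions above.
PERMISSIONS = {
    "worker": {"append_evidence", "propose_repair"},
    "project_owner": {"accept_repair", "reject_repair"},
    "supervisor": {"approve_escalation"},
}


def authorize(role: str, action: str) -> bool:
    """Return True when the role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, set())


@dataclass(frozen=True)
class AuditRecord:
    actor: str
    timestamp: datetime
    previous_state: str
    next_state: str
    source_evidence: str


class AuditLog:
    """Append-only in this sketch: no update or delete method is exposed."""

    def __init__(self) -> None:
        self._records: list[AuditRecord] = []

    def append(self, record: AuditRecord) -> None:
        self._records.append(record)

    @property
    def records(self) -> tuple[AuditRecord, ...]:
        # Return an immutable copy so callers cannot edit history.
        return tuple(self._records)


# Example: an owner accepts a repair and the change is recorded.
log = AuditLog()
if authorize("project_owner", "accept_repair"):
    log.append(AuditRecord(
        actor="project_owner",
        timestamp=datetime(2026, 5, 24, 1, 25, tzinfo=timezone.utc),
        previous_state="sent",
        next_state="accepted",
        source_evidence="demo-source-ref",
    ))
print(len(log.records))
```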

Decision Record

Decision

Use watchdog / watchdog_config as the primary solution route for the project completion operating system.

Options Considered

  • General dashboard: easy to read, but weak on self-heal and verification.
  • Project SOP: useful for human cadence, but does not catch broken worker, queue, artifact, or deployment states.
  • Watchdog operating console: best fit because the core risk is system, queue, project, and artifact drift.

Recommendation

Ship a watchdog-backed operating console with completion gates, heartbeat status, repair proposals, and project review decision cadence.

Adoption Status

Recommended for D1 production pack; next round should connect real system_settings heartbeat rows and strategy map node events.

Feedback Needed If Not Adopted

If rejected, provide the missing production constraint: whether the blocker is data access, owner workflow, UI location in PLS, or governance authority.

{
  "project": "AI 共同訊號專案:專案完成度與 AI 推進節奏",
  "job_id": "e5a8a4ef-6539-4369-a350-bc47106f9e5b",
  "learned_signal": "Strategy map now supports heartbeat node type from system_settings cron heartbeat data and timeline visualization.",
  "selected_solution": "watchdog/watchdog_config",
  "next_run_bias": "Prefer production watchdog, completion gate, repair proposal, artifact verification, and project review cadence over generic summaries.",
  "must_preserve": [
    "primary artifact URL must be openable",
    "owner/due/acceptance must exist",
    "D1/D7/D14/D30 path must be explicit",
    "data model and audit boundary must be included"
  ]
}

Market Maturity

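Comparable Practices

  • Google SRE monitoring practice: service health visibility and alerting that keeps signal high and noise low.
  • Atlassian incident management metrics: MTTA/MTTR-style measures that expose response and resolution bottlenecks.
  • DORA software delivery metrics: throughput, stability, change failure, and restore time connected into an operating cadence.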

PLS Gap

PLS already has strategy map heartbeat nodes, but completion governance still needs gates, owner/due/acceptance, repair proposals, artifact URL checks, and review decisions.

This Round Upgrade

This pack turns heartbeat data into a production operating model with data schema, API surface, self-heal rules, acceptance tests, and an openable console.

People Sync

LINE Draft

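Louis, this round I consolidated "Project Completion and AI Progress Cadence" into a heartbeat operating console. It turns the system_settings cron heartbeat, the strategy map heartbeat node, whether the artifact opens, and whether owner/due/acceptance are complete into completion gates and repair proposals. For D7, I suggest connecting six related projects into a scorecard; if a project has a stale heartbeat, no primary deliverable, no owner, or no acceptance criteria, the console automatically generates the next follow-up or escalation message.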

Ask

Please confirm which six projects to connect first for D7, and whether workers may automatically generate LINE unblock drafts when a completion gate fails.

Escalation

If data-source access is not granted before D7, start with a minimal scorecard built from GitHub commit memory, worker logs, and artifact upload events.

Production Brief

Scene

Project: AI Shared Signals Project: Project Completion and AI Progress Cadence (AI 共同訊號專案:專案完成度與 AI 推進節奏).

Latest signal: GitHub commits added heartbeat node type to the strategy map, read cron heartbeat data from system_settings, and visualized status as graph nodes and timeline events.

Productized Outcome

Build a Project Completion Heartbeat Operating Console that turns heartbeat events into completion gates, repair proposals, and project review cadence.

D1 / D7 / D14 / D30

  • D1: publish openable console and define heartbeat, gate, artifact verification, repair proposal, and audit schema.
  • D7: connect six related projects into scorecard with owner, due, acceptance, heartbeat freshness, and verified artifact state.
  • D14: connect strategy map timeline events to weekly review decisions and repair queue.
  • D30: operate portfolio decision cadence for keep, merge, pause, kill, or fund choices.

Purpose-to-Purpose E2E

Original purpose -> system and human signals -> heartbeat console -> owner action -> project completion, cost reduction, risk reduction, and higher verified delivery throughput.

Owner / Due / Acceptance

  • Owner: Louis / PLS project owner.
  • Due: D7 for real scorecard connection.
  • Acceptance: primary artifact opens, required docs exist, completion gates are pass/fail, and e2e verification references a checked URL.

Production Readiness

Ready Now

  • Openable console artifact.
  • D1/D7/D14/D30 path.
  • Data model and API surface.
  • Completion gates and acceptance tests.
  • Decision record and people sync.

Integration Required

  • Read real heartbeat state from system_settings.
  • Map strategy heartbeat nodes to project_heartbeats.
  • Store gate state in PLS backend.
  • Trigger repair proposals as session worker jobs or LINE unblock drafts.

Risk Controls

  • Never mark complete without verified primary artifact URL or uploaded artifact.
  • Repair proposals require owner and due date.
  • Audit log is append-only.
  • High-risk repair actions require supervisor approval.
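
The risk controls above can be read as guard functions. This is a minimal Python sketch; `can_mark_complete` and `repair_violations` are hypothetical helpers, and how a repair action is classified as high-risk is left as an input because this pack does not define it.

```python
from datetime import datetime, timezone
from typing import Optional


def can_mark_complete(verified_primary_url: bool, uploaded_artifact: bool) -> bool:
    """A job may complete only with a verified primary URL or an uploaded artifact."""
    return verified_primary_url or uploaded_artifact


def repair_violations(
    owner: Optional[str],
    due_at: Optional[datetime],
    high_risk: bool,
    supervisor_approved: bool,
) -> list[str]:
    """List the risk controls a repair proposal currently violates."""
    violations = []
    if not owner:
        violations.append("missing owner")
    if due_at is None:
        violations.append("missing due date")
    if high_risk and not supervisor_approved:
        violations.append("high-risk action lacks supervisor approval")
    return violations


# Example: a high-risk proposal with an owner and due date but no approval yet.
issues = repair_violations(
    owner="Louis",
    due_at=datetime(2026, 5, 31, tzinfo=timezone.utc),
    high_risk=True,
    supervisor_approved=False,
)
print(issues)
```

Returning a list of violations rather than a boolean lets a repair message state exactly what is missing.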
<!doctype html>
<html lang="zh-Hant">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Project Completion Heartbeat Operating Console</title>
<style>
:root {
--ink: #17202a;
--muted: #5d6b78;
--line: #d8e0e7;
--paper: #f7f9fb;
--card: #ffffff;
--blue: #2563eb;
--green: #0f8a5f;
--amber: #b7791f;
--red: #c2410c;
--violet: #6d28d9;
}
* { box-sizing: border-box; }
body {
margin: 0;
font-family: Inter, ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif;
background: var(--paper);
color: var(--ink);
line-height: 1.5;
}
header {
background: #ffffff;
border-bottom: 1px solid var(--line);
padding: 28px clamp(20px, 4vw, 56px);
}
main { padding: 24px clamp(20px, 4vw, 56px) 48px; }
h1, h2, h3, p { margin-top: 0; }
h1 { font-size: clamp(30px, 4vw, 52px); line-height: 1.05; max-width: 980px; }
h2 { font-size: 22px; margin-bottom: 12px; }
h3 { font-size: 16px; margin-bottom: 8px; }
.sub { color: var(--muted); max-width: 980px; font-size: 17px; }
.grid { display: grid; gap: 16px; }
.kpis { grid-template-columns: repeat(4, minmax(0, 1fr)); margin-top: 22px; }
.two { grid-template-columns: 1.15fr .85fr; }
.three { grid-template-columns: repeat(3, minmax(0, 1fr)); }
.card {
background: var(--card);
border: 1px solid var(--line);
border-radius: 8px;
padding: 18px;
box-shadow: 0 1px 2px rgba(23, 32, 42, .04);
}
.metric { font-size: 34px; font-weight: 760; }
.label { color: var(--muted); font-size: 13px; }
.pill {
display: inline-flex;
align-items: center;
border: 1px solid var(--line);
border-radius: 999px;
padding: 4px 10px;
font-size: 12px;
background: #fff;
margin: 0 6px 8px 0;
white-space: nowrap;
}
.ok { color: var(--green); }
.warn { color: var(--amber); }
.bad { color: var(--red); }
.info { color: var(--blue); }
table { width: 100%; border-collapse: collapse; font-size: 14px; }
th, td { text-align: left; padding: 10px; border-bottom: 1px solid var(--line); vertical-align: top; }
th { color: var(--muted); font-size: 12px; font-weight: 700; text-transform: uppercase; }
.flow {
display: grid;
grid-template-columns: repeat(5, minmax(0, 1fr));
gap: 10px;
}
.step {
border: 1px solid var(--line);
border-radius: 8px;
padding: 12px;
background: #fbfdff;
min-height: 124px;
}
.step strong { display: block; margin-bottom: 6px; color: var(--violet); }
.timeline {
display: grid;
grid-template-columns: repeat(4, minmax(0, 1fr));
gap: 12px;
}
.day { border-left: 4px solid var(--blue); }
.source a { color: var(--blue); word-break: break-word; }
@media (max-width: 900px) {
.kpis, .two, .three, .flow, .timeline { grid-template-columns: 1fr; }
h1 { font-size: 34px; }
}
</style>
</head>
<body>
<header>
<p class="pill info">PLS production delivery pack</p>
<p class="pill ok">Route: watchdog / watchdog_config</p>
<h1>Project Completion Heartbeat Operating Console</h1>
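<p class="sub">Upgrades "Project Completion and AI Progress Cadence" from chat-style tracking to a verifiable operating system: from the <code>system_settings</code> cron heartbeat, strategy map heartbeat node, and timeline event through completion gate and self-heal proposal, every link in the chain carries an owner, due date, acceptance criteria, and audit evidence.</p>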
<section class="grid kpis">
<div class="card"><div class="metric">18</div><div class="label">Recent shared signals factored into the cadence assessment</div></div>
<div class="card"><div class="metric">6</div><div class="label">Related projects entering the heartbeat scorecard</div></div>
<div class="card"><div class="metric ok">200</div><div class="label">Verification target for the primary artifact Gist URL</div></div>
<div class="card"><div class="metric">D30</div><div class="label">Portfolio decision cadence takes shape</div></div>
</section>
</header>
<main class="grid">
<section class="grid two">
<div class="card">
<h2>Operating Problem</h2>
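<p>The latest GitHub signal has wired the heartbeat node type into the strategy map and reads cron heartbeat data from <code>system_settings</code>. The next production gap is not another summary; it is turning heartbeats into a reliable command layer for project completion: who is blocked, which artifact cannot be verified, which owner has not replied, and which worker has made no progress for a long time must all be visible, escalatable, and traceable.</p>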
<span class="pill">Owner: Louis / PLS project owner</span>
<span class="pill">Due: D7, six-project scorecard connected</span>
<span class="pill">Acceptance: gate pass/fail is rerunnable</span>
</div>
<div class="card">
<h2>Selected Solution</h2>
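<p><strong>watchdog / watchdog_config</strong>. This round delivers the watchdog operating console and its spec, because the core problem is drift across systems, queues, project cadence, and artifact verification; building a general app directly would miss the most important self-heal and escalation rules.</p>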
<p class="label">Next upgrade: watchdog emits repair proposal; accepted proposal becomes worker task, PR, deployment, or LINE unblock message.</p>
</div>
</section>
<section class="card">
<h2>D1 / D7 / D14 / D30 Path</h2>
<div class="timeline">
<div class="card day"><h3>D1</h3><p>Define the heartbeat schema, completion gates, and repair proposal schema; use this round's Gist plus upload evidence to prove the artifact opens.</p></div>
<div class="card day"><h3>D7</h3><p>Connect six related projects to the scorecard and automatically generate repair proposals for no heartbeat, no owner, no due date, or no verified artifact.</p></div>
<div class="card day"><h3>D14</h3><p>The strategy map timeline shows heartbeat, review, and repair events; the weekly review reads gate failure reasons directly.</p></div>
<div class="card day"><h3>D30</h3><p>Turn the completion OS into a portfolio decision cadence: keep, merge, pause, fund, and kill decisions are all backed by data evidence.</p></div>
</div>
</section>
<section class="card">
<h2>Purpose-to-Purpose E2E</h2>
<div class="flow">
<div class="step"><strong>Original Purpose</strong>Project completion and AI progress cadence must be visible, with no more reliance on manual follow-up.</div>
<div class="step"><strong>Signals</strong>GitHub commit, worker heartbeat, artifact upload, LINE response, and project review events.</div>
<div class="step"><strong>Operating Console</strong>Status graph, timeline, completion gates, repair proposals, and owner/due/acceptance.</div>
<div class="step"><strong>Human Adoption</strong>Louis sees the next blocker to handle; owners receive a clear ask for missing material or a decision.</div>
<div class="step"><strong>Outcome</strong>Fewer false completions, less project stall time, a higher verified artifact rate, and lower management tracking cost.</div>
</div>
</section>
<section class="grid two">
<div class="card">
<h2>Completion Gates</h2>
<table>
<thead><tr><th>Gate</th><th>Pass Rule</th><th>Self-Heal</th></tr></thead>
<tbody>
<tr><td>heartbeat_fresh</td><td>Latest heartbeat is less than 2 intervals old.</td><td>Create a worker health task with claim/context/progress output attached.</td></tr>
<tr><td>primary_artifact_openable</td><td>Primary URL returns HTTP 200 and is not a plain-text summary.</td><td>Require upload-files or republish the Gist/PR/deployment.</td></tr>
<tr><td>owner_due_acceptance</td><td>Every active project has an owner, due date, and acceptance criteria.</td><td>Generate a LINE unblock message and a D1 follow-up task.</td></tr>
<tr><td>e2e_evidence</td><td>Traceable evidence exists from signal to artifact to adoption metric.</td><td>Add e2e-verification.md; the job cannot complete until it is added.</td></tr>
<tr><td>portfolio_decision</td><td>After D14, one of keep/merge/pause/kill/fund has been chosen.</td><td>Escalate to a project review decision memo.</td></tr>
</tbody>
</table>
</div>
<div class="card">
<h2>Data / API / Permissions</h2>
<p><strong>Tables:</strong> <code>project_heartbeats</code>, <code>completion_gates</code>, <code>repair_proposals</code>, <code>artifact_verifications</code>, <code>project_review_events</code>.</p>
<p><strong>APIs:</strong> <code>GET /projects/:project_id/heartbeat-status</code>, <code>POST /projects/:project_id/repair-proposals</code>, <code>POST /artifacts/verify</code>.</p>
<p><strong>Permissions:</strong> owners can resolve gates; workers can append evidence; supervisors can approve escalation; audit log is append-only.</p>
</div>
</section>
<section class="grid three">
<div class="card">
<h2>Value / Money Path</h2>
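<p>Replacing manual progress chasing with rerunnable gates saves PM tracking time, reduces rework caused by false completions, surfaces deployment and worker failures earlier, and moves sellable projects into proposal, payment collection, or delivery sooner.</p>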
</div>
<div class="card">
<h2>Human Capability Upgrade</h2>
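<p>Users see more than red/yellow/green: they know whom to ask next, which evidence is missing, the latest date by which each decision must be made, and which solution route to use to close the gap.</p>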
</div>
<div class="card">
<h2>Next-Round Upgrade</h2>
<p>Connect real <code>system_settings</code> heartbeat and strategy map node data to the scorecard so that each project review automatically generates a repair queue.</p>
</div>
</section>
<section class="card source">
<h2>Market Maturity Inputs</h2>
<p>Google SRE monitoring practice emphasizes service health visibility and alerting that keeps signal high and noise low: <a href="https://sre.google/resources/book-update/monitoring-distributed-systems/">Google SRE Monitoring Distributed Systems</a>.</p>
<p>Atlassian incident management metrics use MTTA/MTTR-style measures to expose response and resolution bottlenecks: <a href="https://www.atlassian.com/incident-management/kpis/common-metrics">Atlassian Common Incident Metrics</a>.</p>
<p>DORA software delivery metrics connect throughput, stability, change failure, and restore time into an operating cadence: <a href="https://dora.dev/guides/dora-metrics/">DORA metrics guide</a>.</p>
</section>
</main>
</body>
</html>

Skill / Tool Usage

Selected Tools

  • PLS session helper: doctor, touch, claim, context, progress, upload-files, complete.
  • Web search: checked comparable practices from Google SRE, Atlassian incident metrics, and DORA metrics.
  • GitHub CLI: publishes the production artifact pack as a public Gist.
  • curl: verifies that the primary artifact URL is openable.

Evidence

The job was claimed through the fixed helper, context was read, progress was written, and external practices were checked. The final artifact pack is verified with an HTTP status check before completion.

Solution Selection

Selected route: watchdog / watchdog_config.

Reason: The latest evidence is a heartbeat node implementation in the strategy map. The project risk is not lack of writing; it is unreliable completion, stale project state, missing verification, and unclear repair ownership. A watchdog route can detect, escalate, and trigger the next action.

Production stack:

  • Context framework: project completion operating system.
  • Workflow: heartbeat -> gate check -> repair proposal -> owner resolution -> review decision.
  • Data model: heartbeat, gate, repair proposal, artifact verification, review event.
  • Tool: openable operating console.
  • Acceptance: pass/fail gates and HTTP-verified artifact.
  • Upgrade: real PLS backend integration and repair queue.