Skip to content

Instantly share code, notes, and snippets.

@arajkumar
Created May 22, 2026 10:58
Show Gist options
  • Select an option

  • Save arajkumar/1c2d105d80215b7328af344e71971a9d to your computer and use it in GitHub Desktop.

Select an option

Save arajkumar/1c2d105d80215b7328af344e71971a9d to your computer and use it in GitHub Desktop.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>Tigerlake vs live-sync — starting with 4 tables</title>
<style>
:root {
--bg: #0f1320;
--panel: #161c2e;
--panel-2: #1d2540;
--accent: #6cb9ff;
--accent-2: #ff9f6c;
--accent-3: #7be3a4;
--accent-4: #f17ad6;
--warn: #ffd166;
--text: #e7ecf5;
--muted: #9aa4bd;
--line: #2b3656;
}
html, body { background: var(--bg); color: var(--text); font: 14px/1.5 -apple-system, "SF Pro Text", "Inter", sans-serif; margin: 0; }
main { max-width: 1240px; margin: 28px auto; padding: 0 20px 64px; }
h1 { font-size: 22px; margin: 0 0 4px; }
h2 { font-size: 18px; margin: 28px 0 6px; color: var(--accent); }
h3 { font-size: 15px; margin: 16px 0 6px; color: var(--accent-2); }
h4 { font-size: 13px; margin: 12px 0 4px; color: var(--accent-3); letter-spacing: .2px; text-transform: uppercase; }
p { margin: 6px 0; }
code, .mono { font: 12.5px/1.4 ui-monospace, "JetBrains Mono", Menlo, Consolas, monospace; color: #d6e0ff; background: #0c1020; padding: 1px 5px; border-radius: 4px; border: 1px solid #1d2540; }
.lede { color: var(--muted); margin: 0 0 8px; }
.meta { color: var(--muted); font-size: 12.5px; }
ol, ul { margin: 6px 0 0 18px; padding: 0; }
li { margin: 4px 0; }
svg { width: 100%; height: auto; background: linear-gradient(180deg, #0c1020, #10162a); border-radius: 12px; border: 1px solid var(--line); }
.card { background: var(--panel); border: 1px solid var(--line); border-radius: 10px; padding: 14px 16px; margin: 10px 0; }
.twocol { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; }
.tag { display: inline-block; padding: 1px 7px; border-radius: 999px; font-size: 11.5px; border: 1px solid var(--line); background: #0c1020; color: #b3c2e0; }
.tag.ok { color: var(--accent-3); border-color: #1c5a3a; background: #08231a; }
.tag.warn { color: var(--warn); border-color: #5a4a1c; background: #1d1a08; }
.tag.bad { color: #ff7e7e; border-color: #5a1c1c; background: #230808; }
table { border-collapse: collapse; width: 100%; margin-top: 8px; font-size: 13px; }
th, td { border: 1px solid var(--line); padding: 7px 10px; text-align: left; vertical-align: top; }
th { background: #0c1020; color: var(--muted); font-weight: 500; white-space: nowrap; }
.legend { display: flex; flex-wrap: wrap; gap: 14px; margin: 6px 0 8px; }
.legend span { display: inline-flex; align-items: center; gap: 6px; color: var(--muted); font-size: 12.5px; }
.swatch { width: 12px; height: 10px; border-radius: 2px; display: inline-block; }
.verdict { background: #131a30; border: 1px solid #2b3656; border-left: 3px solid var(--accent-3); padding: 12px 14px; border-radius: 8px; }
</style>
</head>
<body>
<main>
<h1>Starting with 4 tables: Tigerlake vs live-sync</h1>
<p class="lede">Same starting state (T1..T4, none synced yet); fundamentally different shape on the wall clock because of single-snapshotter vs worker-pool design.</p>
<h2>Setup assumption</h2>
<div class="twocol">
<div class="card">
<h3>Tigerlake</h3>
<ul>
<li>4 rows in <code>_timescaledb_lake_catalog.sync_tables</code>, none have a matching row in <code>_timescaledb_lake_tables.snapshot</code> → all 4 are <code>SnapshotStatus.Needed</code>.</li>
<li>1 logical replication slot <code>timescaledb_lake_slot</code> shared by everything.</li>
<li>1 snapshotter thread, 1 CDC thread.</li>
</ul>
</div>
<div class="card">
<h3>live-sync</h3>
<ul>
<li>4 rows in <code>_ts_live_sync.subscription_rel</code> with <code>state ∈ {INIT, DATASYNC}</code> (anything but <code>READY</code>/<code>NONE</code>).</li>
<li>1 leader slot for the subscription + 1 per-table replication slot per active tablesync worker.</li>
<li>Worker pool of size <code>tableSyncWorkers</code> (let's say <b>4</b> to match table count; can be smaller).</li>
</ul>
</div>
</div>
<h2>Wall-clock timeline (all 4 fresh, never snapshotted)</h2>
<div class="legend">
<span><span class="swatch" style="background:#6cb9ff"></span>Bulk copy / SELECT</span>
<span><span class="swatch" style="background:#7be3a4"></span>Catchup / branch merge</span>
<span><span class="swatch" style="background:#ff9f6c"></span>Steady-state CDC on main</span>
<span><span class="swatch" style="background:#9aa4bd"></span>Waiting (Needed; CDC discarded)</span>
</div>
<svg viewBox="0 0 1200 500" role="img" aria-label="Wall-clock timeline for 4 fresh tables">
<defs>
<style>
.lane { stroke: #2b3656; stroke-dasharray: 4 4; }
.title { font: 600 13px -apple-system, sans-serif; fill: #e7ecf5; }
.sub { font: 11.5px -apple-system, sans-serif; fill: #cfd9f5; }
.axis { stroke: #2b3656; stroke-width: 1; }
.axisLabel { font: 11px ui-monospace, Menlo, monospace; fill: #9aa4bd; }
.barWait { fill: #2b3656; stroke: #3a4566; }
.barCopy { fill: #1d3a55; stroke: #6cb9ff; }
.barCatch { fill: #15402a; stroke: #7be3a4; }
.barSteady { fill: #3a2614; stroke: #ff9f6c; }
.barLabel { font: 11.5px -apple-system, sans-serif; fill: #e7ecf5; }
.tagS { font: 11px ui-monospace, Menlo, monospace; fill: #9aa4bd; }
</style>
</defs>
<!-- Tigerlake half -->
<text class="title" x="20" y="28">Tigerlake — single snapshotter (serial)</text>
<text class="sub" x="20" y="46">CDC thread runs throughout; only one table is actively being copied at any moment.</text>
<line class="axis" x1="80" y1="240" x2="1180" y2="240"/>
<g class="axisLabel">
<text x="80" y="258">t0</text>
<text x="280" y="258">copy(T1)</text>
<text x="480" y="258">copy(T2)</text>
<text x="680" y="258">copy(T3)</text>
<text x="880" y="258">copy(T4)</text>
<text x="1100" y="258">→ t</text>
</g>
<!-- T1 row -->
<text class="sub" x="20" y="74">T1</text>
<rect class="barCopy" x="80" y="60" width="200" height="22" rx="3"/>
<text class="barLabel" x="92" y="76">snapshot</text>
<rect class="barCatch" x="280" y="60" width="20" height="22" rx="3"/>
<rect class="barSteady" x="300" y="60" width="880" height="22" rx="3"/>
<text class="barLabel" x="312" y="76">CDC on main</text>
<!-- T2 row -->
<text class="sub" x="20" y="104">T2</text>
<rect class="barWait" x="80" y="90" width="200" height="22" rx="3"/>
<text class="barLabel" x="92" y="106">wait (CDC discarded)</text>
<rect class="barCopy" x="280" y="90" width="200" height="22" rx="3"/>
<text class="barLabel" x="292" y="106">snapshot</text>
<rect class="barCatch" x="480" y="90" width="20" height="22" rx="3"/>
<rect class="barSteady" x="500" y="90" width="680" height="22" rx="3"/>
<text class="barLabel" x="512" y="106">CDC on main</text>
<!-- T3 row -->
<text class="sub" x="20" y="134">T3</text>
<rect class="barWait" x="80" y="120" width="400" height="22" rx="3"/>
<text class="barLabel" x="92" y="136">wait (CDC discarded)</text>
<rect class="barCopy" x="480" y="120" width="200" height="22" rx="3"/>
<text class="barLabel" x="492" y="136">snapshot</text>
<rect class="barCatch" x="680" y="120" width="20" height="22" rx="3"/>
<rect class="barSteady" x="700" y="120" width="480" height="22" rx="3"/>
<text class="barLabel" x="712" y="136">CDC on main</text>
<!-- T4 row -->
<text class="sub" x="20" y="164">T4</text>
<rect class="barWait" x="80" y="150" width="600" height="22" rx="3"/>
<text class="barLabel" x="92" y="166">wait (CDC discarded)</text>
<rect class="barCopy" x="680" y="150" width="200" height="22" rx="3"/>
<text class="barLabel" x="692" y="166">snapshot</text>
<rect class="barCatch" x="880" y="150" width="20" height="22" rx="3"/>
<rect class="barSteady" x="900" y="150" width="280" height="22" rx="3"/>
<text class="barLabel" x="912" y="166">CDC on main</text>
<!-- Snapshotter status -->
<text class="sub" x="20" y="200">snap thread</text>
<rect class="barCopy" x="80" y="186" width="800" height="22" rx="3"/>
<text class="barLabel" x="92" y="202">busy: T1 → T2 → T3 → T4 (serial, single thread)</text>
<!-- CDC thread status -->
<text class="sub" x="20" y="226">cdc thread</text>
<rect class="barSteady" x="80" y="212" width="1100" height="22" rx="3"/>
<text class="barLabel" x="92" y="228">decoding+routing throughout (events for Needed tables decoded then DISCARDED)</text>
<!-- live-sync half -->
<text class="title" x="20" y="296">live-sync — worker pool (parallel up to tableSyncWorkers)</text>
<text class="sub" x="20" y="314">All 4 workers can start concurrently; bounded by source bandwidth and target apply throughput.</text>
<line class="axis" x1="80" y1="490" x2="1180" y2="490"/>
<g class="axisLabel">
<text x="80" y="508">t0</text>
<text x="600" y="508">all 4 sync in parallel</text>
<text x="1100" y="508">→ t</text>
</g>
<text class="sub" x="20" y="344">T1</text>
<rect class="barCopy" x="80" y="330" width="220" height="22" rx="3"/>
<text class="barLabel" x="92" y="346">COPY (slot_T1, USE_SNAPSHOT)</text>
<rect class="barCatch" x="300" y="330" width="60" height="22" rx="3"/>
<text class="barLabel" x="305" y="346">catchup</text>
<rect class="barSteady" x="360" y="330" width="820" height="22" rx="3"/>
<text class="barLabel" x="372" y="346">leader applies (READY)</text>
<text class="sub" x="20" y="374">T2</text>
<rect class="barCopy" x="80" y="360" width="180" height="22" rx="3"/>
<text class="barLabel" x="92" y="376">COPY (slot_T2)</text>
<rect class="barCatch" x="260" y="360" width="60" height="22" rx="3"/>
<rect class="barSteady" x="320" y="360" width="860" height="22" rx="3"/>
<text class="barLabel" x="332" y="376">leader applies (READY)</text>
<text class="sub" x="20" y="404">T3</text>
<rect class="barCopy" x="80" y="390" width="260" height="22" rx="3"/>
<text class="barLabel" x="92" y="406">COPY (slot_T3)</text>
<rect class="barCatch" x="340" y="390" width="60" height="22" rx="3"/>
<rect class="barSteady" x="400" y="390" width="780" height="22" rx="3"/>
<text class="barLabel" x="412" y="406">leader applies (READY)</text>
<text class="sub" x="20" y="434">T4</text>
<rect class="barCopy" x="80" y="420" width="160" height="22" rx="3"/>
<text class="barLabel" x="92" y="436">COPY (slot_T4)</text>
<rect class="barCatch" x="240" y="420" width="60" height="22" rx="3"/>
<rect class="barSteady" x="300" y="420" width="880" height="22" rx="3"/>
<text class="barLabel" x="312" y="436">leader applies (READY)</text>
<text class="sub" x="20" y="464">leader</text>
<rect class="barSteady" x="80" y="450" width="1100" height="22" rx="3"/>
<text class="barLabel" x="92" y="466">streaming main slot; skips DML for non-READY rels, takes them over as each goes SYNCDONE</text>
</svg>
<h2>Step-by-step: Tigerlake</h2>
<ol>
<li><b>Startup:</b> <code>SyncTableSnapshotter.snapshotSyncTables()</code> opens REPEATABLE READ, captures startup horizon <code>S0</code>, joins <code>sync_tables</code> with <code>snapshot</code>. All 4 rows have <code>snapshot_state IS NULL</code> → <code>tableAlreadySnapshotted() = false</code>.</li>
<li><b>Registry seeded</b> with 4 <code>TigerlakeIcebergTable</code>s, each <code>status = Needed</code>, <code>postgresSnapshot = null</code>, <code>icebergTable = null</code>. Plus <code>TigerlakeSyncTables(S0)</code> and <code>HeartbeatTable</code>.</li>
<li><b>Threads spawn:</b> replication / cdc / snapshotter / heartbeat. CDC thread immediately starts pulling from the queue; snapshotter calls <code>nextSnapshot()</code>.</li>
<li><b>nextSnapshot iteration order</b> is <code>ConcurrentHashMap</code>-iteration (hash-bucket, effectively undefined) — <code>findFirst</code> of those with <code>requestSnapshot()</code> non-empty. <b>Whichever bucket comes first wins.</b> Call it T<sub>π(1)</sub>.</li>
<li><b>Snapshot of T<sub>π(1)</sub>:</b> new RR tx, <code>PostgresSnapshot.read</code> gives horizon S<sub>π(1)</sub>, <code>initiateSnapshot</code> flips to <code>InSnapshot</code>, <code>createTable</code> drops+recreates Iceberg table, <code>SELECT *</code>, append files.</li>
<li><b>During that snapshot, CDC for the other three:</b>
<ul>
<li>T<sub>π(1)</sub> CDC: <code>shouldDiscardEvent</code> uses S<sub>π(1)</sub> → discard pre-horizon, write post-horizon to T<sub>π(1)</sub>.pending-branch.</li>
<li>T<sub>π(2..4)</sub> CDC: <code>postgresSnapshot == null</code> → <span class="tag warn">discarded</span>. LSN still advances per batch.</li>
<li>CDC thread is doing real work (decode + lookup + filter) but throwing the data away.</li>
</ul>
</li>
<li><b>T<sub>π(1)</sub> completes:</b> atomic Iceberg transaction appends bulk to <code>main</code>, replays pending branch, drops branch, expires snapshots. T<sub>π(1)</sub>.status = <code>SnapshotComplete</code>.</li>
<li><b>Snapshotter loops</b> → picks the next <code>Needed</code> table. Repeats steps 5–7.</li>
<li><b>After all 4 complete:</b> all four flow to <code>main</code>. Wall-clock total ≈ Σ snapshot times. Iceberg storage holds 4× (bulk + pending branch lifetimes), but pending branches were tiny because most of the CDC was discarded during the wait.</li>
</ol>
<h2>Step-by-step: live-sync</h2>
<ol>
<li><b>Startup:</b> <code>Subscriber.Sync</code> brings up the applier leader on the subscription's main slot, and <code>startTableSync</code> spawns <code>tableSyncWorkers</code> goroutines (each calling <code>copyTable</code> in a loop reading from <code>syncCh</code>).</li>
<li><b>Feeder</b> calls <code>FetchForTableSync</code> → returns all 4 rels where <code>state NOT IN ('r','N')</code>, sends them on <code>syncCh</code>.</li>
<li><b>Workers grab tables off the channel</b> — up to <code>tableSyncWorkers</code> in flight at once. With 4 workers + 4 tables → all 4 start simultaneously.</li>
<li><b>Each worker independently:</b>
<ol>
<li>State <code>DATASYNC</code>.</li>
<li>Pre-create chunks on target (if hypertable).</li>
<li>Open RR tx on source connection; <code>CREATE_REPLICATION_SLOT slot_T<sub>i</sub> LOGICAL pgoutput USE_SNAPSHOT</code> → gives <code>consistent_point</code> = <code>startLSN<sub>i</sub></code>, binds tx to that snapshot atomically.</li>
<li>Begin target tx; <code>OriginCreate(origin_T<sub>i</sub>)</code>; <code>OriginAdvance(startLSN<sub>i</sub>)</code>.</li>
<li><code>COPY T<sub>i</sub> TO STDOUT</code> → <code>COPY T<sub>i</sub> FROM STDIN</code>. State <code>FINISHEDCOPY</code>.</li>
<li><code>SendFinishedCopy</code> to leader (channel). Leader responds with <code>Catchup(currentLeaderLSN)</code>.</li>
<li>Start replication on <code>slot_T<sub>i</sub></code> from <code>startLSN<sub>i</sub></code> → apply T<sub>i</sub> DML only → stop at <code>Catchup.LSN</code>.</li>
<li>State <code>SYNCDONE</code>; <code>DROP slot_T<sub>i</sub></code>; <code>DROP origin_T<sub>i</sub></code>; <code>SendSyncDone</code>.</li>
<li>Leader marks T<sub>i</sub> <code>READY</code> and starts applying T<sub>i</sub> DML on the main slot, skipping LSNs ≤ <code>SYNCDONE.LSN</code>.</li>
</ol>
</li>
<li><b>Source PG load during sync:</b> 4 simultaneous <code>COPY TO STDOUT</code> readers + 4 logical slots holding WAL + 1 main slot. <span class="tag warn">Peak source-side pressure.</span></li>
<li><b>Target PG load:</b> 4 simultaneous <code>COPY FROM STDIN</code> writers + leader applying skipped/passed DML.</li>
<li><b>Wall-clock total</b> ≈ <code>max(individual sync times)</code> if <code>tableSyncWorkers ≥ N_tables</code>. If <code>tableSyncWorkers &lt; N_tables</code>, the slowest table sets the lower bound but others queue.</li>
</ol>
<h2>Where time and resources go</h2>
<table>
<tr><th>Dimension</th><th>Tigerlake (4 fresh tables)</th><th>live-sync (4 fresh tables, 4 workers)</th></tr>
<tr><td>Bulk-copy concurrency</td><td><span class="tag bad">1</span> (serial)</td><td><span class="tag ok">4</span> (parallel up to pool size)</td></tr>
<tr><td>Wall-clock to all-READY/Complete</td><td>Σ snapshot times</td><td>max(snapshot times)</td></tr>
<tr><td>Source PG replication slots</td><td>1 (always)</td><td>1 + 4 (peak; one per active worker)</td></tr>
<tr><td>Source PG concurrent COPY readers</td><td>1</td><td>4</td></tr>
<tr><td>Source PG WAL retention pressure</td><td>Bounded by single slot's confirmed_flush_lsn</td><td>Held by 4 tablesync slots until each catches up; can be large for long-running copies</td></tr>
<tr><td>Target write concurrency</td><td>1 Iceberg writer (snapshotter) + 1 CDC writer (serial through queue)</td><td>4 simultaneous <code>COPY FROM STDIN</code> + leader's apply</td></tr>
<tr><td>CDC events for not-yet-snapshotted tables</td><td>Decoded then <span class="tag bad">discarded</span>; recovered later via bulk SELECT</td><td>Held in per-table slot WAL; drained during catchup. No discard.</td></tr>
<tr><td>Wasted decode/route work</td><td>3×duration of T<sub>π(1)</sub> + 2×T<sub>π(2)</sub> + 1×T<sub>π(3)</sub> worth of discarded CDC</td><td>None — every decoded message is applied somewhere</td></tr>
<tr><td>Recovery on restart mid-copy</td><td>Per-table catalog row in <code>_timescaledb_lake_tables.snapshot</code>; mid-snapshot crash → table back to Needed, bulk redone</td><td>Per-table state machine; <code>FINISHEDCOPY</code> resumes catchup from origin; <code>DATASYNC</code> recopies (slot+origin recreated)</td></tr>
<tr><td>Failure isolation between tables</td><td><span class="tag bad">Low</span> — one snapshotter, one CDC thread; any uncaught exception in a long-lived thread → whole connector exits via <code>Quarkus.asyncExit(1)</code></td><td><span class="tag ok">High</span> — per-table worker; one worker's failure is isolated (recorded in <code>last_error</code>); retry on next 30s tick</td></tr>
<tr><td>Iceberg/target snapshot count</td><td>Per-table commit count: bulk + final merge + ongoing CDC commits (SnapshotCleaner expires excess)</td><td>n/a (target is PG)</td></tr>
</table>
<h2>What about restart with 4 already-complete tables?</h2>
<div class="twocol">
<div class="card">
<h3>Tigerlake</h3>
<ul>
<li><code>SyncTableSnapshotter</code> finds <code>snapshot_state = FULL</code> for all 4 → <code>tableAlreadySnapshotted() = true</code>.</li>
<li><code>fromRecordedSnapshotState</code> loads each existing Iceberg table; status set to <code>SnapshotComplete</code>, <code>postgresSnapshot = null</code>.</li>
<li><code>shouldDiscardEvent</code> branch <code>(SnapshotComplete &amp;&amp; postgresSnapshot == null) → false</code> → CDC flows directly to <code>main</code> for all 4.</li>
<li>Snapshotter blocks forever on <code>snapshotReady.await()</code> — nothing to do.</li>
<li>Restart is cheap; resumes from the slot's <code>confirmed_flush_lsn</code>.</li>
</ul>
</div>
<div class="card">
<h3>live-sync</h3>
<ul>
<li>Catalog has 4 rows in <code>READY</code>. <code>FetchForTableSync</code> excludes them (<code>state NOT IN ('r','N')</code>) → returns empty.</li>
<li>Tablesync workers idle on <code>syncCh</code> waiting (no work).</li>
<li>Applier leader resumes from main slot's <code>confirmed_flush_lsn</code>; applies DML for all 4 directly.</li>
<li>Symmetric to Tigerlake's restart — both are cheap.</li>
</ul>
</div>
</div>
<h2>Mixed case — 2 complete, 2 fresh</h2>
<p>Behavior is the same as above, applied per-table:</p>
<ul>
<li><b>Tigerlake:</b> the 2 complete tables flow to <code>main</code> immediately; the 2 fresh tables are snapshotted serially, each going through the <code>Needed → InSnapshot → SnapshotComplete</code> dance with their own pending branch. CDC for the 2 fresh tables is dropped while they wait.</li>
<li><b>live-sync:</b> the 2 complete tables (<code>READY</code>) ride the leader's main slot directly; the 2 fresh tables are picked up by workers in parallel (up to pool size). Their pre-sync CDC is held in their respective per-table slot WAL, never dropped.</li>
</ul>
<h2>The big picture</h2>
<div class="verdict">
<p><b>Tigerlake's 4-table behavior is dominated by serialisation</b>: one snapshotter thread, one CDC thread, no per-table parallelism. The single-slot model means it can't easily do parallel bulk reads without coordinating multiple consistent points itself; today it sidesteps that by doing one table at a time and discarding the CDC for tables still in <code>Needed</code>.</p>
<p><b>live-sync's 4-table behavior is dominated by parallelism</b>: every table gets its own consistent point via <code>USE_SNAPSHOT</code>, its own slot, its own worker, its own origin. The wall-clock saving is large; the cost is paid in source-side WAL retention (peak: 1 + N slots) and source/target IO concurrency.</p>
<p>If your operational priority is <b>fast onboarding of many tables at once</b>, live-sync's design wins decisively. If your priority is <b>predictable, single-knob source pressure with cheap object-storage buffering of in-flight CDC</b> (the lakehouse case), Tigerlake's design is appropriate — but it would still benefit from running the snapshotter in a small pool keyed by table, and from grabbing each table's horizon via <code>USE_SNAPSHOT</code> on a temp slot so the seam is a clean LSN comparison rather than the current txid filter.</p>
</div>
</main>
</body>
</html>
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>Tigerlake PR #325 vs live-sync — snapshot/CDC handoff compared</title>
<style>
:root {
--bg: #0f1320;
--panel: #161c2e;
--panel-2: #1d2540;
--accent: #6cb9ff;
--accent-2: #ff9f6c;
--accent-3: #7be3a4;
--accent-4: #f17ad6;
--warn: #ffd166;
--text: #e7ecf5;
--muted: #9aa4bd;
--line: #2b3656;
}
html, body { background: var(--bg); color: var(--text); font: 14px/1.5 -apple-system, "SF Pro Text", "Inter", sans-serif; margin: 0; }
main { max-width: 1240px; margin: 28px auto; padding: 0 20px 64px; }
h1 { font-size: 22px; margin: 0 0 6px; }
h2 { font-size: 17px; margin: 28px 0 8px; color: var(--accent); letter-spacing: .2px; }
h3 { font-size: 14px; margin: 14px 0 6px; color: var(--accent-2); }
p { margin: 6px 0; }
code, .mono { font: 12.5px/1.4 ui-monospace, "JetBrains Mono", Menlo, Consolas, monospace; color: #d6e0ff; background: #0c1020; padding: 1px 5px; border-radius: 4px; border: 1px solid #1d2540; }
.lede { color: var(--muted); margin: 0 0 8px; }
.meta { color: var(--muted); font-size: 12.5px; }
.grid { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; }
.card { background: var(--panel); border: 1px solid var(--line); border-radius: 10px; padding: 14px 16px; }
.card h3:first-child { margin-top: 0; }
.card.alt { background: var(--panel-2); }
ul, ol { margin: 6px 0 0 18px; padding: 0; }
li { margin: 4px 0; }
svg { width: 100%; height: auto; background: linear-gradient(180deg, #0c1020, #10162a); border-radius: 12px; border: 1px solid var(--line); }
table { border-collapse: collapse; width: 100%; margin-top: 8px; font-size: 13px; }
th, td { border: 1px solid var(--line); padding: 7px 10px; text-align: left; vertical-align: top; }
th { background: #0c1020; color: var(--muted); font-weight: 500; white-space: nowrap; }
th.col1 { width: 220px; }
td b { color: #d6e0ff; }
.tag { display: inline-block; padding: 1px 7px; border-radius: 999px; font-size: 11.5px; border: 1px solid var(--line); background: #0c1020; color: #b3c2e0; }
.tag.ok { color: var(--accent-3); border-color: #1c5a3a; background: #08231a; }
.tag.warn { color: var(--warn); border-color: #5a4a1c; background: #1d1a08; }
.tag.bad { color: #ff7e7e; border-color: #5a1c1c; background: #230808; }
.verdict { background: #131a30; border: 1px solid #2b3656; border-left: 3px solid var(--accent-3); padding: 12px 14px; border-radius: 8px; }
</style>
</head>
<body>
<main>
<h1>Tigerlake (PR #325) vs live-sync — how the snapshot ↔ CDC seam is sewn</h1>
<p class="lede">Same problem (lossless initial copy + continuous CDC); different targets force different designs. Here's the side-by-side and where each one wins.</p>
<p class="meta">Compares <code>timescaledb-tigerlake</code> PR #325 (Iceberg sink, Java/Quarkus) with <code>live-sync</code> (PG→PG/TimescaleDB, Go) — <code>pkg/subscription/{tablesync,applier}</code>.</p>
<h2>Side-by-side architecture</h2>
<svg viewBox="0 0 1240 540" role="img" aria-label="Tigerlake vs live-sync architecture comparison">
<defs>
<marker id="a" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><path d="M0,0 L10,5 L0,10 z" fill="#9aa4bd"/></marker>
<marker id="aB" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><path d="M0,0 L10,5 L0,10 z" fill="#6cb9ff"/></marker>
<marker id="aG" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><path d="M0,0 L10,5 L0,10 z" fill="#7be3a4"/></marker>
<marker id="aO" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><path d="M0,0 L10,5 L0,10 z" fill="#ff9f6c"/></marker>
<style>
.title { font: 600 13px -apple-system, sans-serif; fill: #e7ecf5; }
.sub { font: 11.5px -apple-system, sans-serif; fill: #cfd9f5; }
.sml { font: 11px ui-monospace, Menlo, monospace; fill: #9aa4bd; }
.box { fill: #1d2540; stroke: #2b3656; }
.boxA { fill: #102035; stroke: #6cb9ff; }
.boxB { fill: #2a1c14; stroke: #ff9f6c; }
.boxC { fill: #102a1c; stroke: #7be3a4; }
.boxD { fill: #2a1429; stroke: #f17ad6; }
.boxX { fill: #161c2e; stroke: #2b3656; stroke-dasharray: 4 3; }
.e { stroke: #9aa4bd; stroke-width: 1.4; fill: none; }
.eB { stroke: #6cb9ff; stroke-width: 1.5; fill: none; }
.eG { stroke: #7be3a4; stroke-width: 1.5; fill: none; }
.eO { stroke: #ff9f6c; stroke-width: 1.5; fill: none; }
</style>
</defs>
<!-- ===== LEFT: Tigerlake ===== -->
<g>
<rect class="boxX" x="20" y="20" width="600" height="500" rx="10"/>
<text class="title" x="34" y="42">Tigerlake (PR #325) — single slot, Iceberg branches</text>
<!-- PG -->
<rect class="boxB" x="40" y="60" width="160" height="440" rx="8"/>
<text class="title" x="52" y="80">PostgreSQL</text>
<rect class="box" x="56" y="96" width="128" height="58" rx="6"/>
<text class="sub" x="64" y="116">1 logical slot</text>
<text class="sml" x="64" y="134">timescaledb_lake_slot</text>
<text class="sml" x="64" y="148">pub timescaledb_lake</text>
<rect class="box" x="56" y="166" width="128" height="58" rx="6"/>
<text class="sub" x="64" y="186">pg_current_snapshot</text>
<text class="sml" x="64" y="202">xmin / xmax / xips</text>
<text class="sml" x="64" y="216">REPEATABLE READ tx</text>
<rect class="box" x="56" y="236" width="128" height="50" rx="6"/>
<text class="sub" x="64" y="256">SELECT * (bulk)</text>
<text class="sml" x="64" y="274">in snapshot tx</text>
<rect class="box" x="56" y="298" width="128" height="50" rx="6"/>
<text class="sub" x="64" y="318">sync_tables</text>
<text class="sml" x="64" y="336">drives registration</text>
<rect class="box" x="56" y="360" width="128" height="58" rx="6"/>
<text class="sub" x="64" y="380">dbz_heartbeat</text>
<text class="sml" x="64" y="398">keeps LSN moving</text>
<text class="sml" x="64" y="412">when idle</text>
<!-- Connector -->
<rect class="boxA" x="220" y="60" width="200" height="440" rx="8"/>
<text class="title" x="232" y="80">Quarkus connector</text>
<rect class="box" x="234" y="96" width="172" height="80" rx="6"/>
<text class="sub" x="244" y="116">ReplicationCDCReader</text>
<text class="sml" x="244" y="134">pgoutput decode</text>
<text class="sml" x="244" y="150">→ BlockingQueue</text>
<text class="sml" x="244" y="166">single producer</text>
<rect class="box" x="234" y="188" width="172" height="92" rx="6"/>
<text class="sub" x="244" y="208">TigerlakeSink (1 thread)</text>
<text class="sml" x="244" y="226">resolve table → registry</text>
<text class="sml" x="244" y="242">writeCDC → branch</text>
<text class="sml" x="244" y="258">commitCDC + advance LSN</text>
<text class="sml" x="244" y="274">txidIsBefore filter</text>
<rect class="boxC" x="234" y="292" width="172" height="78" rx="6"/>
<text class="sub" x="244" y="312">Snapshotter (1 thread)</text>
<text class="sml" x="244" y="330">bulk SELECT once per table</text>
<text class="sml" x="244" y="346">writes via SnapshotWriter</text>
<text class="sml" x="244" y="362">serial across tables</text>
<rect class="box" x="234" y="382" width="172" height="74" rx="6"/>
<text class="sub" x="244" y="402">TigerlakeRegistry</text>
<text class="sml" x="244" y="420">name → TigerlakeTable</text>
<text class="sml" x="244" y="436">snapshotPendingLock,</text>
<text class="sml" x="244" y="452">icebergWriteLock</text>
<!-- Iceberg -->
<rect class="boxD" x="440" y="60" width="160" height="440" rx="8"/>
<text class="title" x="452" y="80">Iceberg / S3</text>
<rect class="box" x="456" y="96" width="128" height="100" rx="6"/>
<text class="sub" x="466" y="116">Per table:</text>
<text class="sml" x="466" y="134">branch main</text>
<text class="sml" x="466" y="150"> ← bulk + steady CDC</text>
<text class="sml" x="466" y="170">branch pending</text>
<text class="sml" x="466" y="186"> ← CDC during snapshot</text>
<rect class="box" x="456" y="208" width="128" height="78" rx="6"/>
<text class="sub" x="466" y="228">Atomic merge</text>
<text class="sml" x="466" y="246">newTransaction:</text>
<text class="sml" x="466" y="262"> bulk → main,</text>
<text class="sml" x="466" y="278"> replay pending → main</text>
<rect class="box" x="456" y="298" width="128" height="58" rx="6"/>
<text class="sub" x="466" y="318">commit ordering</text>
<text class="sml" x="466" y="336">via icebergWriteLock</text>
<rect class="box" x="456" y="368" width="128" height="100" rx="6"/>
<text class="sub" x="466" y="388">SnapshotCleaner</text>
<text class="sml" x="466" y="406">expire snapshots</text>
<text class="sml" x="466" y="422">keep maxSnapshots</text>
<!-- edges -->
<path class="eO" d="M 184,124 L 234,124" marker-end="url(#aO)"/>
<path class="eO" d="M 184,260 L 234,310" marker-end="url(#aO)"/>
<path class="eB" d="M 406,224 L 456,160" marker-end="url(#aB)"/>
<path class="eG" d="M 406,320 L 456,250" marker-end="url(#aG)"/>
</g>
<!-- ===== RIGHT: live-sync ===== -->
<g transform="translate(0,0)">
<rect class="boxX" x="640" y="20" width="580" height="500" rx="10"/>
<text class="title" x="654" y="42">live-sync — per-table slot + LSN watermark (PG-native)</text>
<!-- PG source -->
<rect class="boxB" x="660" y="60" width="160" height="440" rx="8"/>
<text class="title" x="672" y="80">Source PG</text>
<rect class="box" x="676" y="96" width="128" height="70" rx="6"/>
<text class="sub" x="684" y="116">Leader slot</text>
<text class="sml" x="684" y="134">main pgoutput stream</text>
<text class="sml" x="684" y="150">drives applier leader</text>
<rect class="box" x="676" y="178" width="128" height="86" rx="6"/>
<text class="sub" x="684" y="198">Tablesync slot</text>
<text class="sml" x="684" y="216">CREATE_REPLICATION_SLOT</text>
<text class="sml" x="684" y="232">USE_SNAPSHOT</text>
<text class="sml" x="684" y="248">one per table</text>
<text class="sml" x="684" y="262">→ consistent point LSN</text>
<rect class="box" x="676" y="276" width="128" height="58" rx="6"/>
<text class="sub" x="684" y="296">COPY TO STDOUT</text>
<text class="sml" x="684" y="314">same REPEATABLE READ</text>
<text class="sml" x="684" y="328">tx as the slot</text>
<rect class="box" x="676" y="346" width="128" height="78" rx="6"/>
<text class="sub" x="684" y="366">WAL retention</text>
<text class="sml" x="684" y="384">N slots hold WAL</text>
<text class="sml" x="684" y="400">until each catches up</text>
<!-- Connector -->
<rect class="boxA" x="840" y="60" width="200" height="440" rx="8"/>
<text class="title" x="852" y="80">Connector (Go)</text>
<rect class="box" x="854" y="96" width="172" height="92" rx="6"/>
<text class="sub" x="864" y="116">Applier leader (1)</text>
<text class="sml" x="864" y="134">decodes pgoutput</text>
<text class="sml" x="864" y="150">applies SQL on target</text>
<text class="sml" x="864" y="166">skips DML for non-READY</text>
<text class="sml" x="864" y="182">relations</text>
<rect class="boxC" x="854" y="200" width="172" height="102" rx="6"/>
<text class="sub" x="864" y="220">TableSync workers (N)</text>
<text class="sml" x="864" y="238">COPY → catchup → SYNCDONE</text>
<text class="sml" x="864" y="254">own slot, own connection</text>
<text class="sml" x="864" y="270">parallel across tables</text>
<text class="sml" x="864" y="288">re-uses applier for catchup</text>
<rect class="box" x="854" y="314" width="172" height="98" rx="6"/>
<text class="sub" x="864" y="334">SyncState channels</text>
<text class="sml" x="864" y="352">FinishedCopy →</text>
<text class="sml" x="864" y="368">Catchup(LSN) ←</text>
<text class="sml" x="864" y="384">SyncDone →</text>
<text class="sml" x="864" y="400">leader picks up rel</text>
<rect class="box" x="854" y="422" width="172" height="64" rx="6"/>
<text class="sub" x="864" y="442">Catalog state machine</text>
<text class="sml" x="864" y="460">DATASYNC → FINISHEDCOPY</text>
<text class="sml" x="864" y="476">→ SYNCDONE → READY</text>
<!-- Target -->
<rect class="boxD" x="1060" y="60" width="140" height="440" rx="8"/>
<text class="title" x="1072" y="80">Target PG/TS</text>
<rect class="box" x="1076" y="96" width="108" height="80" rx="6"/>
<text class="sub" x="1084" y="116">User table</text>
<text class="sml" x="1084" y="134">COPY FROM</text>
<text class="sml" x="1084" y="150">STDIN</text>
<text class="sml" x="1084" y="166">(mutable)</text>
<rect class="box" x="1076" y="188" width="108" height="78" rx="6"/>
<text class="sub" x="1084" y="208">_ts_live_sync</text>
<text class="sml" x="1084" y="226">catalog: per-rel</text>
<text class="sml" x="1084" y="242">state + LSN</text>
<rect class="box" x="1076" y="278" width="108" height="78" rx="6"/>
<text class="sub" x="1084" y="298">Replication</text>
<text class="sml" x="1084" y="316">origin per rel</text>
<text class="sml" x="1084" y="332">tracks LSN</text>
<rect class="box" x="1076" y="368" width="108" height="78" rx="6"/>
<text class="sub" x="1084" y="388">Trigger model</text>
<text class="sml" x="1084" y="406">ALWAYS triggers</text>
<text class="sml" x="1084" y="422">session_replication</text>
<text class="sml" x="1084" y="438">_role=replica</text>
<!-- edges -->
<path class="eO" d="M 804,130 L 854,140" marker-end="url(#aO)"/>
<path class="eO" d="M 804,220 L 854,250" marker-end="url(#aO)"/>
<path class="eO" d="M 804,300 L 1076,130" marker-end="url(#aO)"/>
<path class="eB" d="M 1026,150 L 1076,150" marker-end="url(#aB)"/>
<path class="eG" d="M 1026,260 L 1076,225" marker-end="url(#aG)"/>
</g>
</svg>
<h2>The same problem, two different primitives</h2>
<div class="grid">
<div class="card">
<h3>Tigerlake: txid filter + Iceberg branches</h3>
<ol>
<li>Snapshotter captures <code>pg_current_snapshot()</code> in REPEATABLE READ → <code>{xmin, xmax, xips, epoch}</code>.</li>
<li>Bulk <code>SELECT *</code> under same tx → snapshot files appended (no commit yet).</li>
<li>CDC events arriving in parallel: <code>txidIsBefore</code> drops everything the bulk already saw; the rest goes to <code>snapshot-pending-branch</code>.</li>
<li>One Iceberg transaction appends bulk to <code>main</code>, replays pending branch onto <code>main</code>, drops branch.</li>
</ol>
<p class="meta">Watermark unit: <b>transaction id</b>. Buffering medium: <b>Iceberg branch</b>.</p>
</div>
<div class="card">
<h3>live-sync: USE_SNAPSHOT + LSN watermark</h3>
<ol>
<li>Tablesync worker opens REPEATABLE READ tx, then <code>CREATE_REPLICATION_SLOT … USE_SNAPSHOT</code> — slot's <em>consistent point LSN</em> and tx snapshot are atomically aligned by PG.</li>
<li>Same tx does <code>COPY TO STDOUT</code> → <code>COPY FROM STDIN</code> on the target. State <code>DATASYNC → FINISHEDCOPY</code>.</li>
<li>Worker signals applier leader; leader returns its current LSN as <code>CatchupMsg</code>.</li>
<li>Worker starts replication on its own slot from <code>consistentPointLSN</code> up to <code>catchupLSN</code>, applying only DML for this relation. State <code>FINISHEDCOPY → SYNCDONE</code>.</li>
<li>Leader sees <code>SYNCDONE</code>, takes over that relation on the main stream from <code>SYNCDONE.lsn</code>. State <code>SYNCDONE → READY</code>.</li>
</ol>
<p class="meta">Watermark unit: <b>LSN</b>. Buffering medium: <b>WAL on the source (per-table slot)</b>.</p>
</div>
</div>
<h2>Comparison matrix</h2>
<table>
<tr><th class="col1">Dimension</th><th>Tigerlake (PR #325)</th><th>live-sync</th></tr>
<tr><td>Source slots</td><td>1 (single global)</td><td>1 leader + 1 per-table tablesync (dropped on READY)</td></tr>
<tr><td>Consistency primitive</td><td>Custom <code>txidIsBefore(xmin,xmax,xips,epoch)</code></td><td>PG-native <code>USE_SNAPSHOT</code> at slot creation; LSN watermark thereafter</td></tr>
<tr><td>Snapshot ↔ stream join</td><td>Iceberg <code>snapshot-pending-branch</code> buffers in-flight CDC, atomically merged to <code>main</code> on completion</td><td>Per-table slot replays WAL from consistent point to leader's current LSN; then leader takes over</td></tr>
<tr><td>Where in-flight CDC sits</td><td>Iceberg pending branch (object storage; cheap)</td><td>WAL on source (slot retention; expensive if many large tables)</td></tr>
<tr><td>Initial-copy parallelism</td><td>Serial — one snapshotter thread</td><td>Parallel — N tablesync workers, configurable concurrency</td></tr>
<tr><td>CDC apply parallelism</td><td>Serial — one CDC thread drains the queue</td><td>Serial leader for steady state; parallel during catchup</td></tr>
<tr><td>Pre-snapshot CDC</td><td><b>Discarded</b> by the table's filter (bulk SELECT covers them later)</td><td>Leader <b>skips DML</b> for relations whose state ≠ READY; rows arrive via tablesync slot</td></tr>
<tr><td>Restartability</td><td>Catalog row in <code>_timescaledb_lake_tables.snapshot</code>; Iceberg branch survives. Mid-snapshot crash → re-snap from scratch</td><td>Catalog state machine + slot survives. <code>FINISHEDCOPY</code> resumes catchup from origin's LSN; <code>DATASYNC</code> recopies</td></tr>
<tr><td>WAL pressure on source</td><td>Low — single slot, heartbeat keeps it advancing</td><td>Can spike — every tablesync slot pins WAL until it reaches READY</td></tr>
<tr><td>Target write cost</td><td>File append + occasional Iceberg commit (batched)</td><td>Per-row SQL <code>INSERT/UPDATE/DELETE</code> through pgx (mutable target)</td></tr>
<tr><td>Target mutability</td><td>Append/equality-delete on Iceberg; eventually-consistent via row-keyed deletes</td><td>Native MVCC; reads always see consistent target state</td></tr>
<tr><td>TOAST</td><td>Requires <code>REPLICA IDENTITY FULL</code>; <code>'u'</code> values throw</td><td>Standard pgoutput handling; full row when needed</td></tr>
<tr><td>DDL during sync</td><td>Not yet handled in PR #325 (logical-decoding messages dropped)</td><td>Catalog tracks columns; schema changes propagated</td></tr>
<tr><td>Code size of bridge</td><td>Small — ~400 LOC reader + ~150 LOC snapshot task + ~250 LOC table state</td><td>Larger — ~1000 LOC tablesync + ~1600 LOC applier + state machine + catalog migrations</td></tr>
<tr><td>Operational ergonomics</td><td>1 slot to monitor; Iceberg snapshot history is the audit trail</td><td>N slots to monitor; catalog table is the audit trail; needs slot/origin cleanup logic</td></tr>
<tr><td>Failure blast radius</td><td>One thread dies → whole connector restarts (intentional, via <code>Quarkus.asyncExit</code>)</td><td>Failures isolated per worker; leader survives single-table failures</td></tr>
</table>
<h2>Which is better?</h2>
<p>They're optimizing for different targets, so "better" depends on what you're sinking into.</p>
<div class="grid">
<div class="card">
<h3>Tigerlake's design wins when…</h3>
<ul>
<li><b>Target is a lakehouse</b>. Iceberg branches are <em>the</em> natural way to stage in-flight writes; you get an atomic ref-swap for free.</li>
<li><b>Source WAL retention is tight</b>. One slot, one watermark — easy to monitor and protect with a heartbeat.</li>
<li><b>Iceberg commits are expensive</b>, so you want few large, batched commits — exactly what the single CDC consumer gives you.</li>
<li><b>You don't need to query during snapshot</b> — the pending branch keeps <code>main</code> stable but invisible. Iceberg query engines see "the old world" until the merge lands.</li>
</ul>
</div>
<div class="card">
<h3>live-sync's design wins when…</h3>
<ul>
<li><b>Target is mutable (PG/TimescaleDB)</b>. Direct SQL apply with origin tracking gives strong, queryable consistency throughout.</li>
<li><b>Per-table parallelism matters</b> — many independent tables, want to copy them concurrently with isolated failure domains.</li>
<li><b>You can spend source WAL</b> to avoid carrying a custom txid filter. <code>USE_SNAPSHOT</code> hands you a free, exact, LSN-aligned consistency point — no <code>xmin/xmax/xips</code> arithmetic and no epoch wraparound to worry about.</li>
<li><b>Standard pgoutput semantics (TOAST, DDL, types)</b> are needed and you don't want to reimplement them.</li>
</ul>
</div>
</div>
<h2>Honest tradeoffs (the parts each design pays for)</h2>
<div class="grid">
<div class="card">
<h3>What PR #325 trades away</h3>
<ul>
<li><b>Hand-rolled txid filter</b>. <code>PostgresSnapshot.txidIsBefore</code> reimplements PG's MVCC visibility — including epoch arithmetic. Subtle and not free to maintain. <code>USE_SNAPSHOT</code> would avoid this entirely, but Iceberg doesn't need LSN alignment as strongly as a PG target does.</li>
<li><b>Single CDC consumer</b> is a throughput ceiling. Once the queue fills, all tables stall together.</li>
<li><b>Serial snapshotter</b>. One big table blocks every other pending snapshot.</li>
<li><b>Pre-snapshot CDC is silently discarded</b> for new tables. Correct because the bulk read recovers it, but the invariant — "discarded only when bulk SELECT will see it" — depends on the right txid filter being installed before any non-discardable event arrives. Subtle.</li>
<li><b>Iceberg-table-create race</b> (<code>waitUntilTigerlakeTableIsAvailable</code> spin) — PR notes this as known.</li>
<li><b>DDL not handled</b>; <code>'M'</code> logical-decoding messages dropped.</li>
</ul>
</div>
<div class="card">
<h3>What live-sync trades away</h3>
<ul>
<li><b>WAL retention per table</b>. A stuck or slow tablesync worker can pin many GB of WAL on the source until the slot is dropped. Operational risk grows with table count.</li>
<li><b>More moving parts</b>. Two replication slots per table during sync, an origin, a per-table catalog row, three channel types coordinating leader/worker — more states, more races to reason about.</li>
<li><b>Target schema overhead</b>. Needs <code>_ts_live_sync</code> schema, ALWAYS triggers, migrations. Not zero-touch on the target.</li>
<li><b>SQL apply is row-by-row through pgx</b> — slower per record than a parquet file append, even with batching.</li>
<li><b>Permissions on source</b>. Needs to create/drop slots — not always allowed by managed providers.</li>
</ul>
</div>
</div>
<h2>My read</h2>
<div class="verdict">
<p>Neither is universally "better" — they're targeting fundamentally different sinks. <b>For a lakehouse sink, Tigerlake's design is the right shape</b>: one slot, branch-as-buffer, batched commits. The Iceberg branch primitive is genuinely cleaner than holding WAL on the source, and the atomic merge is easier to reason about than a per-table catchup state machine.</p>
<p>But the <em>internals</em> would be sturdier if Tigerlake adopted live-sync's <b><code>USE_SNAPSHOT</code> trick</b> instead of the custom <code>txidIsBefore</code> filter: open the replication slot with <code>USE_SNAPSHOT</code> (or use <code>pg_export_snapshot()</code> from inside the slot's bootstrap), do the bulk <code>SELECT</code> against that exported snapshot, and the seam becomes a pure LSN comparison — no MVCC arithmetic, no epoch handling, no race between "snapshot installed" and "first CDC event past the horizon." That's the one durable lesson from live-sync that PR #325 could lift wholesale without changing its branching strategy.</p>
<p>The other live-sync idea worth borrowing — <b>per-table parallelism for the initial snapshot</b> — would help with large user catalogs. Today the single snapshotter thread is the bottleneck. Multiple snapshotter threads each writing to their own table's <code>snapshot-pending-branch</code> would parallelise cleanly because the Iceberg writes are already keyed by table.</p>
</div>
</main>
</body>
</html>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment