@adubovikov
Created February 15, 2026 10:47
hepic-lake Sharding Benchmark — 55.8M rec/s with 3 shards (+24% vs no sharding, +88% vs baseline)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>hepic-lake Sharding Benchmark — 55.8M rec/s with 3 Shards</title>
<style>
:root {
--bg: #0f1117;
--card: #1a1d2e;
--border: #2a2d3e;
--text: #e1e4ed;
--muted: #8b8fa3;
--accent: #6c5ce7;
--green: #00b894;
--green-bg: rgba(0,184,148,.12);
--blue: #74b9ff;
--orange: #fdcb6e;
--red: #ff7675;
}
* { margin:0; padding:0; box-sizing:border-box; }
body {
font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
background: var(--bg); color: var(--text); padding: 2rem; line-height: 1.6;
}
.container { max-width: 1200px; margin: 0 auto; }
header { text-align:center; margin-bottom:2.5rem; padding-bottom:1.5rem; border-bottom:1px solid var(--border); }
header h1 { font-size:1.8rem; font-weight:700; letter-spacing:-.02em; margin-bottom:.4rem; }
header .subtitle { color:var(--muted); font-size:.95rem; }
.badge { display:inline-block; padding:.25rem .75rem; border-radius:20px; font-size:.8rem; font-weight:600; text-transform:uppercase; letter-spacing:.04em; margin-top:.8rem; }
.badge-pass { background:var(--green-bg); color:var(--green); border:1px solid rgba(0,184,148,.3); }
.badge-new { background:rgba(108,92,231,.15); color:var(--accent); border:1px solid rgba(108,92,231,.3); }
.kpi-row { display:grid; grid-template-columns:repeat(auto-fit,minmax(170px,1fr)); gap:1rem; margin-bottom:2rem; }
.kpi { background:var(--card); border:1px solid var(--border); border-radius:12px; padding:1.25rem 1.5rem; text-align:center; }
.kpi .label { font-size:.75rem; text-transform:uppercase; letter-spacing:.06em; color:var(--muted); margin-bottom:.4rem; }
.kpi .value { font-size:1.5rem; font-weight:700; letter-spacing:-.02em; }
.kpi .unit { font-size:.8rem; color:var(--muted); font-weight:400; }
.kpi .delta { font-size:.75rem; margin-top:.3rem; }
.kpi .value.green { color:var(--green); }
.kpi .value.blue { color:var(--blue); }
.kpi .value.orange { color:var(--orange); }
.kpi .value.accent { color:var(--accent); }
.delta-up { color:var(--green); }
.delta-down { color:var(--red); }
section { margin-bottom:2rem; }
section h2 { font-size:1.15rem; font-weight:600; margin-bottom:1rem; padding-bottom:.5rem; border-bottom:1px solid var(--border); }
section h3 { font-size:.95rem; font-weight:600; margin:1rem 0 .5rem; color:var(--blue); }
table { width:100%; border-collapse:collapse; background:var(--card); border-radius:12px; overflow:hidden; border:1px solid var(--border); font-size:.9rem; margin-bottom:1.5rem; }
thead th { background:rgba(108,92,231,.15); text-align:left; padding:.75rem 1rem; font-weight:600; font-size:.8rem; text-transform:uppercase; letter-spacing:.04em; color:var(--muted); }
tbody td { padding:.65rem 1rem; border-top:1px solid var(--border); }
tbody tr:hover { background:rgba(255,255,255,.02); }
.mono { font-family:'JetBrains Mono','Fira Code',monospace; font-size:.85rem; }
.text-right { text-align:right; }
.text-center { text-align:center; }
.tag { display:inline-block; padding:.15rem .5rem; border-radius:6px; font-size:.75rem; font-weight:600; }
.tag-pass { background:var(--green-bg); color:var(--green); }
.tag-warn { background:rgba(253,203,110,.12); color:var(--orange); }
.tag-best { background:rgba(108,92,231,.15); color:var(--accent); }
.bar-chart { margin:1rem 0; }
.bar-row { display:flex; align-items:center; margin-bottom:.5rem; }
.bar-label { width:110px; font-size:.8rem; color:var(--muted); text-align:right; padding-right:.75rem; flex-shrink:0; }
.bar-track { flex:1; height:28px; background:rgba(255,255,255,.04); border-radius:6px; overflow:hidden; position:relative; }
.bar-fill { height:100%; border-radius:6px; display:flex; align-items:center; padding-left:.6rem; font-size:.75rem; font-weight:600; color:#fff; white-space:nowrap; }
.bar-noshard { background:var(--orange); }
.bar-sharded { background:var(--green); }
.bar-baseline { background:var(--accent); }
.callout { background:rgba(108,92,231,.08); border:1px solid rgba(108,92,231,.25); border-radius:10px; padding:1rem 1.25rem; margin:1rem 0; font-size:.9rem; }
.callout strong { color:var(--accent); }
.callout-green { background:rgba(0,184,148,.08); border-color:rgba(0,184,148,.25); }
.callout-green strong { color:var(--green); }
.pipeline { display:flex; align-items:center; justify-content:center; gap:0; flex-wrap:wrap; margin:1.5rem 0; }
.pipeline-step { background:var(--card); border:1px solid var(--border); border-radius:10px; padding:.75rem 1rem; text-align:center; min-width:110px; }
.pipeline-step .step-name { font-weight:600; font-size:.85rem; }
.pipeline-step .step-detail { font-size:.7rem; color:var(--muted); margin-top:.2rem; }
.pipeline-step.step-new { border-color:var(--green); box-shadow:0 0 8px rgba(0,184,148,.2); }
.pipeline-step.step-new .step-name { color:var(--green); }
.pipeline-arrow { font-size:1.2rem; color:var(--accent); padding:0 .3rem; flex-shrink:0; }
.two-col { display:grid; grid-template-columns:1fr 1fr; gap:1.5rem; }
@media (max-width:768px) { .two-col { grid-template-columns:1fr; } }
footer { text-align:center; color:var(--muted); font-size:.8rem; margin-top:2rem; padding-top:1rem; border-top:1px solid var(--border); }
footer a { color:var(--accent); }
</style>
</head>
<body>
<div class="container">
<header>
<h1>Sharding Benchmark Report</h1>
<div class="subtitle">hepic-lake v5.0.2 &mdash; No Sharding vs 3 Shards &mdash; 2026-02-15</div>
<div>
<span class="badge badge-new">&#9650; 55.8M rec/s peak (8 tables, sharded)</span>
<span class="badge badge-pass">&#9650; +24% vs no sharding</span>
</div>
</header>
<!-- KPI -->
<div class="kpi-row">
<div class="kpi">
<div class="label">8 Tables (Sharded)</div>
<div class="value green">55.8M <span class="unit">rec/s</span></div>
<div class="delta delta-up">&#9650; +24% vs no sharding</div>
</div>
<div class="kpi">
<div class="label">8 Tables (No Shard)</div>
<div class="value orange">45.0M <span class="unit">rec/s</span></div>
<div class="delta" style="color:var(--muted)">parallel flush only</div>
</div>
<div class="kpi">
<div class="label">4 Tables (Sharded)</div>
<div class="value green">47.9M <span class="unit">rec/s</span></div>
<div class="delta delta-up">&#9650; +14% vs no sharding</div>
</div>
<div class="kpi">
<div class="label">1 Table (No Shard)</div>
<div class="value accent">36.3M <span class="unit">rec/s</span></div>
<div class="delta" style="color:var(--muted)">best for single table</div>
</div>
<div class="kpi">
<div class="label">Per-Table INSERT</div>
<div class="value green">7.7M <span class="unit">rec/s</span></div>
<div class="delta delta-up">&#9650; +133% vs no sharding</div>
</div>
<div class="kpi">
<div class="label">Original Baseline</div>
<div class="value" style="color:var(--muted)">29.7M <span class="unit">rec/s</span></div>
<div class="delta" style="color:var(--muted)">Feb 13 gist</div>
</div>
</div>
<!-- Architecture -->
<section>
<h2>Sharding Architecture</h2>
<div class="pipeline">
<div class="pipeline-step">
<div class="step-name">IPC Client</div>
<div class="step-detail">8 workers &times; 10k</div>
</div>
<div class="pipeline-arrow">&#8594;</div>
<div class="pipeline-step">
<div class="step-name">Coalescer</div>
<div class="step-detail">100ms adaptive</div>
</div>
<div class="pipeline-arrow">&#8594;</div>
<div class="pipeline-step step-new">
<div class="step-name">Shard Router</div>
<div class="step-detail">table &rarr; shard DB</div>
</div>
<div class="pipeline-arrow">&#8594;</div>
<div class="pipeline-step step-new">
<div class="step-name">DuckDB #1</div>
<div class="step-detail">sip: call, reg, default</div>
</div>
<div class="pipeline-arrow" style="color:var(--green)">&#8593;</div>
<div class="pipeline-step step-new">
<div class="step-name">DuckDB #2</div>
<div class="step-detail">media: 5, 34, 35</div>
</div>
<div class="pipeline-arrow" style="color:var(--green)">&#8593;</div>
<div class="pipeline-step step-new">
<div class="step-name">DuckDB #3</div>
<div class="step-detail">logs: 53, 100</div>
</div>
</div>
<div class="callout-green callout">
<strong>3 shards</strong> &mdash; each with its own DuckDB in-memory instance.
All shards share the same DuckLake catalog (SQLite) and Parquet data path.
Tables are routed by suffix: <code>1_call</code> &rarr; sip shard, <code>5_default</code> &rarr; media shard, etc.
</div>
</section>
<!-- Main comparison -->
<section>
<h2>Throughput Comparison &mdash; Client-Side</h2>
<table>
<thead>
<tr><th>Tables</th><th class="text-right">No Sharding (best)</th><th class="text-right">With 3 Shards (best)</th><th class="text-right">Delta</th></tr>
</thead>
<tbody>
<tr>
<td>1 table</td>
<td class="text-right mono"><strong>36,331,248</strong></td>
<td class="text-right mono">29,603,775</td>
<td class="text-right"><span class="tag tag-warn">&minus;18%</span></td>
</tr>
<tr>
<td>4 tables</td>
<td class="text-right mono">41,853,382</td>
<td class="text-right mono" style="color:var(--green)"><strong>47,921,161</strong></td>
<td class="text-right"><span class="tag tag-pass">&#9650; +14%</span></td>
</tr>
<tr>
<td>8 tables</td>
<td class="text-right mono">45,047,294</td>
<td class="text-right mono" style="color:var(--green)"><strong>55,779,272</strong></td>
<td class="text-right"><span class="tag tag-pass">&#9650; +24%</span></td>
</tr>
</tbody>
</table>
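<p>The Delta column can be re-derived from the best-run figures in the table above. The following is a minimal sketch, assuming the delta is the sharded best divided by the unsharded best, minus one. The name <code>pct_delta</code> is illustrative and not part of hepic-lake.</p>

```python
# Best client-side throughput (rec/s), copied from the comparison table.
# Key: number of tables. Value: (no sharding, 3 shards).
BEST = {
    1: (36_331_248, 29_603_775),
    4: (41_853_382, 47_921_161),
    8: (45_047_294, 55_779_272),
}

def pct_delta(no_shard: int, sharded: int) -> float:
    """Relative change of sharded vs. unsharded throughput, in percent."""
    return (sharded / no_shard - 1.0) * 100.0

for tables, (no_shard, sharded) in BEST.items():
    print(f"{tables} table(s): {pct_delta(no_shard, sharded):+.1f}%")
```

<p>Printing one decimal place shows how close each value sits to a rounding boundary, which the whole-percent tags in the table hide.</p>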
<h3>Visual: 8 Tables Throughput</h3>
<div class="bar-chart">
<div class="bar-row">
<div class="bar-label">Baseline (gist)</div>
<div class="bar-track"><div class="bar-fill bar-baseline" style="width:53.2%">29.7M rec/s</div></div>
</div>
<div class="bar-row">
<div class="bar-label">No Sharding</div>
<div class="bar-track"><div class="bar-fill bar-noshard" style="width:80.7%">45.0M rec/s</div></div>
</div>
<div class="bar-row">
<div class="bar-label">3 Shards</div>
<div class="bar-track"><div class="bar-fill bar-sharded" style="width:100%">55.8M rec/s &#9733;</div></div>
</div>
</div>
<h3>Visual: All Configurations</h3>
<div class="bar-chart">
<div class="bar-row">
<div class="bar-label">1T no shard</div>
<div class="bar-track"><div class="bar-fill bar-noshard" style="width:65.1%">36.3M</div></div>
</div>
<div class="bar-row">
<div class="bar-label">1T sharded</div>
<div class="bar-track"><div class="bar-fill bar-sharded" style="width:53.1%">29.6M</div></div>
</div>
<div style="height:.4rem"></div>
<div class="bar-row">
<div class="bar-label">4T no shard</div>
<div class="bar-track"><div class="bar-fill bar-noshard" style="width:75.0%">41.9M</div></div>
</div>
<div class="bar-row">
<div class="bar-label">4T sharded</div>
<div class="bar-track"><div class="bar-fill bar-sharded" style="width:85.9%">47.9M</div></div>
</div>
<div style="height:.4rem"></div>
<div class="bar-row">
<div class="bar-label">8T no shard</div>
<div class="bar-track"><div class="bar-fill bar-noshard" style="width:80.7%">45.0M</div></div>
</div>
<div class="bar-row">
<div class="bar-label">8T sharded</div>
<div class="bar-track"><div class="bar-fill bar-sharded" style="width:100%">55.8M &#9733;</div></div>
</div>
</div>
</section>
<!-- All runs -->
<section>
<h2>All Runs (raw data)</h2>
<div class="two-col">
<div>
<h3>No Sharding</h3>
<table>
<thead><tr><th>Tables</th><th>Run</th><th class="text-right">rec/s</th></tr></thead>
<tbody>
<tr><td>1</td><td>R1</td><td class="text-right mono">31,551,154</td></tr>
<tr><td>1</td><td>R2</td><td class="text-right mono">30,197,979</td></tr>
<tr><td>1</td><td>R3</td><td class="text-right mono"><strong>36,331,248</strong></td></tr>
<tr><td>4</td><td>R1</td><td class="text-right mono">27,405,706</td></tr>
<tr><td>4</td><td>R2</td><td class="text-right mono"><strong>41,853,382</strong></td></tr>
<tr><td>8</td><td>R1</td><td class="text-right mono">32,140,652</td></tr>
<tr><td>8</td><td>R2</td><td class="text-right mono"><strong>45,047,294</strong></td></tr>
</tbody>
</table>
</div>
<div>
<h3>With 3 Shards</h3>
<table>
<thead><tr><th>Tables</th><th>Run</th><th class="text-right">rec/s</th></tr></thead>
<tbody>
<tr><td>1</td><td>R1</td><td class="text-right mono">28,864,077</td></tr>
<tr><td>1</td><td>R2</td><td class="text-right mono"><strong>29,603,775</strong></td></tr>
<tr><td>1</td><td>R3</td><td class="text-right mono">28,774,577</td></tr>
<tr><td>4</td><td>R1</td><td class="text-right mono">28,290,034</td></tr>
<tr><td>4</td><td>R2</td><td class="text-right mono"><strong>47,921,161</strong></td></tr>
<tr><td>8</td><td>R1</td><td class="text-right mono"><strong>55,779,272</strong></td></tr>
<tr><td>8</td><td>R2</td><td class="text-right mono">38,528,544</td></tr>
</tbody>
</table>
</div>
</div>
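<p>The headline numbers are best-of-run values, and the raw runs above vary considerably within a configuration. The sketch below summarizes each configuration as best, mean, and relative spread. The <code>summarize</code> helper and the spread definition, (max - min) / max, are assumptions made for illustration and not part of the benchmark tooling.</p>

```python
from statistics import mean

# Raw client-side runs (rec/s) copied from the two tables above,
# keyed by (mode, table count).
RUNS = {
    ("no_shard", 1): [31_551_154, 30_197_979, 36_331_248],
    ("no_shard", 4): [27_405_706, 41_853_382],
    ("no_shard", 8): [32_140_652, 45_047_294],
    ("sharded", 1): [28_864_077, 29_603_775, 28_774_577],
    ("sharded", 4): [28_290_034, 47_921_161],
    ("sharded", 8): [55_779_272, 38_528_544],
}

def summarize(runs):
    """Return (best, mean, spread), where spread = (max - min) / max."""
    best, worst = max(runs), min(runs)
    return best, mean(runs), (best - worst) / best

for (mode, tables), runs in RUNS.items():
    best, avg, spread = summarize(runs)
    print(f"{mode:8s} {tables}T  best={best / 1e6:5.1f}M  "
          f"mean={avg / 1e6:5.1f}M  spread={spread:.0%}")
```

<p>With only two or three runs per configuration, the 8-table sharded runs alone differ by about 31%, so best-run comparisons are better treated as indicative than as precise.</p>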
</section>
<!-- Server-side -->
<section>
<h2>Server-Side DuckDB INSERT (Coalescer)</h2>
<div class="two-col">
<div>
<h3>No Sharding &mdash; 8 tables</h3>
<table>
<thead><tr><th>Table</th><th class="text-right">Records</th><th class="text-right">INSERT rec/s</th></tr></thead>
<tbody>
<tr><td class="mono">53_default</td><td class="text-right mono">3,510,000</td><td class="text-right mono">3,309,847</td></tr>
<tr><td class="mono">5_default</td><td class="text-right mono">3,950,000</td><td class="text-right mono">3,076,908</td></tr>
<tr><td class="mono">34_default</td><td class="text-right mono">4,060,000</td><td class="text-right mono">2,717,538</td></tr>
<tr><td class="mono">1_default</td><td class="text-right mono">4,130,000</td><td class="text-right mono">2,491,551</td></tr>
<tr><td class="mono">1_registration</td><td class="text-right mono">4,260,000</td><td class="text-right mono">2,199,131</td></tr>
</tbody>
</table>
<div class="callout">
<strong>Best per-table INSERT:</strong> 3.3M rec/s &mdash; all tables compete for one DuckDB instance.
</div>
</div>
<div>
<h3>With 3 Shards &mdash; 8 tables</h3>
<table>
<thead><tr><th>Table</th><th class="text-right">Records</th><th class="text-right">INSERT rec/s</th></tr></thead>
<tbody>
<tr><td class="mono">100_default</td><td class="text-right mono">4,900,000</td><td class="text-right mono" style="color:var(--green)"><strong>7,680,859</strong></td></tr>
<tr><td class="mono">35_default</td><td class="text-right mono">4,930,000</td><td class="text-right mono" style="color:var(--green)">6,247,032</td></tr>
<tr><td class="mono">1_registration</td><td class="text-right mono">4,940,000</td><td class="text-right mono">4,697,848</td></tr>
<tr><td class="mono">53_default</td><td class="text-right mono">4,910,000</td><td class="text-right mono">4,275,131</td></tr>
<tr><td class="mono">5_default</td><td class="text-right mono">4,910,000</td><td class="text-right mono">3,180,740</td></tr>
<tr><td class="mono">34_default</td><td class="text-right mono">4,910,000</td><td class="text-right mono">3,142,393</td></tr>
<tr><td class="mono">1_call</td><td class="text-right mono">4,940,000</td><td class="text-right mono">2,930,274</td></tr>
<tr><td class="mono">1_default</td><td class="text-right mono">4,930,000</td><td class="text-right mono">2,782,623</td></tr>
</tbody>
</table>
<div class="callout-green callout">
<strong>Best per-table INSERT:</strong> 7.7M rec/s &mdash; logs shard (2 tables) gets near-exclusive DuckDB access.
</div>
</div>
</div>
</section>
<!-- Shard config -->
<section>
<h2>Shard Configuration</h2>
<table>
<thead><tr><th>Shard</th><th>Tables</th><th>DuckDB Instance</th></tr></thead>
<tbody>
<tr><td class="mono" style="color:var(--blue)">sip</td><td class="mono">1_call, 1_registration, 1_default</td><td>DuckDB #1 (in-memory)</td></tr>
<tr><td class="mono" style="color:var(--green)">media</td><td class="mono">5_default, 34_default, 35_default</td><td>DuckDB #2 (in-memory)</td></tr>
<tr><td class="mono" style="color:var(--orange)">logs</td><td class="mono">53_default, 100_default</td><td>DuckDB #3 (in-memory)</td></tr>
</tbody>
</table>
<p style="color:var(--muted);font-size:.85rem;margin-top:.5rem">All shards share the same DuckLake catalog (SQLite) and Parquet data path. Unmapped tables fall back to the main DuckDB instance.</p>
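<p>The routing rule in the table above amounts to a lookup with a fallback. The following is a minimal sketch, not the hepic-lake implementation. The shard layout comes from the table, while the names <code>route</code>, <code>group_by_shard</code>, the fallback label <code>main</code>, and the table <code>999_default</code> are hypothetical.</p>

```python
# Shard layout copied from the configuration table. "main" is an assumed
# label for the fallback instance that receives unmapped tables.
SHARDS = {
    "sip": ["1_call", "1_registration", "1_default"],
    "media": ["5_default", "34_default", "35_default"],
    "logs": ["53_default", "100_default"],
}
FALLBACK = "main"

# Invert the layout once so each lookup is a single dict access.
TABLE_TO_SHARD = {
    table: shard for shard, tables in SHARDS.items() for table in tables
}

def route(table: str) -> str:
    """Return the shard for a table, or the fallback if it is unmapped."""
    return TABLE_TO_SHARD.get(table, FALLBACK)

def group_by_shard(tables):
    """Group table names by target shard, preserving input order."""
    groups = {}
    for table in tables:
        groups.setdefault(route(table), []).append(table)
    return groups

print(route("1_call"))       # sip
print(route("999_default"))  # main (hypothetical unmapped table)
```

<p>Grouping a coalesced batch with <code>group_by_shard</code> before flushing would let each shard's DuckDB instance receive only its own tables.</p>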
</section>
<!-- Evolution -->
<section>
<h2>Performance Evolution (peak throughput per version)</h2>
<table>
<thead>
<tr><th>Version</th><th class="text-right">Throughput</th><th class="text-right">vs Baseline</th><th>Key Changes</th></tr>
</thead>
<tbody>
<tr>
<td>Baseline (Feb 13)</td>
<td class="text-right mono">29.7M rec/s</td>
<td class="text-right">&mdash;</td>
<td>1 table, coalesce 50ms/200</td>
</tr>
<tr>
<td>+ DuckDB tuning + adaptive coalescer</td>
<td class="text-right mono">34.9M rec/s</td>
<td class="text-right"><span class="tag tag-pass">+18%</span></td>
<td>object_cache, checkpoint 1GB, 100ms/1000</td>
</tr>
<tr>
<td>+ Parallel flush (no sharding)</td>
<td class="text-right mono">45.0M rec/s</td>
<td class="text-right"><span class="tag tag-pass">+52%</span></td>
<td>Per-table concurrent INSERT + flush</td>
</tr>
<tr style="background:rgba(0,184,148,.06)">
<td><strong>+ Table sharding (3 shards)</strong></td>
<td class="text-right mono" style="color:var(--green)"><strong>55.8M rec/s</strong></td>
<td class="text-right"><span class="tag tag-pass">&#9650; +88%</span></td>
<td>3 DuckDB instances, sip/media/logs</td>
</tr>
</tbody>
</table>
<div class="bar-chart">
<div class="bar-row">
<div class="bar-label">Baseline</div>
<div class="bar-track"><div class="bar-fill bar-baseline" style="width:53.2%">29.7M</div></div>
</div>
<div class="bar-row">
<div class="bar-label">+ Tuning</div>
<div class="bar-track"><div class="bar-fill bar-baseline" style="width:62.5%">34.9M</div></div>
</div>
<div class="bar-row">
<div class="bar-label">+ Par. flush</div>
<div class="bar-track"><div class="bar-fill bar-noshard" style="width:80.7%">45.0M</div></div>
</div>
<div class="bar-row">
<div class="bar-label">+ Sharding</div>
<div class="bar-track"><div class="bar-fill bar-sharded" style="width:100%">55.8M &#9733;</div></div>
</div>
</div>
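<p>The "vs Baseline" column is cumulative, so it does not show how much each individual change contributed. The sketch below derives both the cumulative and the step-over-step gain from the rounded throughput figures in the table. The <code>gains</code> helper is illustrative only.</p>

```python
# Peak throughput per version (million rec/s), from the table above.
STEPS = [
    ("Baseline", 29.7),
    ("+ Tuning", 34.9),
    ("+ Parallel flush", 45.0),
    ("+ Sharding", 55.8),
]

def gains(steps):
    """Yield (name, gain vs baseline, gain vs previous step) in percent."""
    base = prev = steps[0][1]
    for name, value in steps[1:]:
        yield name, (value / base - 1) * 100, (value / prev - 1) * 100
        prev = value

for name, vs_base, vs_prev in gains(STEPS):
    print(f"{name:18s} {vs_base:+.0f}% vs baseline, {vs_prev:+.0f}% vs previous")
```

<p>Read this way, sharding adds roughly 24% on top of parallel flush, and the remainder of the +88% comes from the earlier steps.</p>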
</section>
<!-- Test config -->
<section>
<h2>Test Environment</h2>
<table>
<thead><tr><th>Parameter</th><th>Value</th></tr></thead>
<tbody>
<tr><td>hepic-lake version</td><td class="mono">5.0.2</td></tr>
<tr><td>DuckDB version</td><td class="mono">v1.4.4</td></tr>
<tr><td>Go version</td><td class="mono">go1.25.0</td></tr>
<tr><td>CPU</td><td class="mono">22 cores</td></tr>
<tr><td>RAM</td><td class="mono">64 GB</td></tr>
<tr><td>DuckDB memory_limit</td><td class="mono">8 GB (per instance)</td></tr>
<tr><td>Workers</td><td class="mono">8</td></tr>
<tr><td>Batch size</td><td class="mono">10,000 records</td></tr>
<tr><td>Multi-table batches</td><td class="mono">500/worker &times; N tables = 5M per table</td></tr>
<tr><td>Single-table records</td><td class="mono">8,080,000 (808 batches &times; 10,000 records, sent by 8 workers)</td></tr>
<tr><td>coalesce_ms / max</td><td class="mono">100 / 1000 (adaptive)</td></tr>
<tr><td>Write mode</td><td class="mono">memory_arrow</td></tr>
<tr><td>IPC transport</td><td class="mono">TCP localhost:50062</td></tr>
<tr><td>flush_interval_sec</td><td class="mono">30</td></tr>
</tbody>
</table>
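<p>The record counts in this table follow from batch count times batch size. The sketch below checks that arithmetic and estimates the wall time implied by a throughput figure. It rests on two assumptions: that "5M per table" means 500 batches of 10,000 records reach each table, and that throughput is total records divided by elapsed time. The helper names are hypothetical.</p>

```python
BATCH_SIZE = 10_000          # records per batch (from the table above)
SINGLE_TABLE_BATCHES = 808   # total batches in the single-table run
BATCHES_PER_TABLE = 500      # assumed reading of "500 ... = 5M per table"

def total_records(batches: int, batch_size: int = BATCH_SIZE) -> int:
    """Records sent for a given number of fixed-size batches."""
    return batches * batch_size

def elapsed_seconds(records: int, rec_per_s: float) -> float:
    """Implied wall time if throughput is records divided by elapsed time."""
    return records / rec_per_s

single = total_records(SINGLE_TABLE_BATCHES)
per_table = total_records(BATCHES_PER_TABLE)
print(single, per_table)
# Best single-table run without sharding: 36,331,248 rec/s.
print(f"{elapsed_seconds(single, 36_331_248):.2f} s")
```

<p>Under these assumptions the single-table run lasts a fraction of a second, which may help explain the run-to-run variation in the raw data.</p>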
</section>
<!-- Recommendations -->
<section>
<h2>When to Enable Sharding</h2>
<table>
<thead><tr><th>Scenario</th><th>Recommendation</th></tr></thead>
<tbody>
<tr><td>Low traffic (&lt; 100K rec/s)</td><td>Not needed &mdash; single DuckDB is sufficient</td></tr>
<tr><td>1&ndash;3 tables</td><td>Not needed: the 1-table benchmark ran 18% slower with sharding enabled (2 and 3 tables were not measured)</td></tr>
<tr><td>4+ tables, high traffic</td><td><span class="tag tag-pass">Recommended</span> &mdash; +14% throughput</td></tr>
<tr><td>8+ tables, high traffic</td><td><span class="tag tag-pass">Strongly recommended</span> &mdash; +24% throughput, +133% per-table INSERT</td></tr>
</tbody>
</table>
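<p>The recommendation table can be expressed as a small decision function. This is a sketch only. The report does not define "high traffic", so the code assumes it means at or above the 100K rec/s threshold from the first row. The name <code>sharding_advice</code> is hypothetical.</p>

```python
LOW_TRAFFIC = 100_000  # rec/s threshold from the first row of the table

def sharding_advice(tables: int, rec_per_s: float) -> str:
    """Map a workload to the recommendation rows above.

    Assumption: "high traffic" means at or above LOW_TRAFFIC, because
    the report does not define it separately.
    """
    if rec_per_s >= LOW_TRAFFIC and tables >= 8:
        return "strongly recommended"
    if rec_per_s >= LOW_TRAFFIC and tables >= 4:
        return "recommended"
    # Covers low traffic at any table count and 1-3 tables at any rate.
    return "not needed"

print(sharding_advice(8, 40_000_000))
print(sharding_advice(2, 40_000_000))
```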
</section>
<footer>
hepic-lake Sharding Benchmark &mdash; Generated 2026-02-15 11:45 CET<br>
<a href="https://gist.github.com/adubovikov/0b79fd8915595f410bb67dbe35e79899">Previous report: Performance Optimizations (34.9M, Feb 14)</a> &bull;
<a href="https://gist.github.com/adubovikov/fcf21585935ba327e13721a8a2ac5ccf">Original baseline (29.7M, Feb 13)</a>
</footer>
</div>
</body>
</html>