Created
May 22, 2026 10:53
| <!doctype html> | |
| <html lang="en"> | |
| <head> | |
| <meta charset="utf-8" /> | |
| <title>commitLSN vs endLSN in logical replication</title> | |
| <style> | |
| body { font: 14px/1.5 -apple-system, system-ui, sans-serif; max-width: 980px; margin: 32px auto; padding: 0 16px; color: #1e293b; } | |
| h1 { font-size: 22px; } | |
| h2 { font-size: 17px; margin-top: 32px; border-bottom: 1px solid #e2e8f0; padding-bottom: 4px; } | |
| code, pre { font-family: "SF Mono", Menlo, monospace; font-size: 13px; } | |
| pre { background: #0f172a; color: #e2e8f0; padding: 12px 14px; border-radius: 6px; overflow-x: auto; } | |
| .ref { color: #64748b; font-size: 12px; } | |
| svg { display: block; margin: 12px 0 18px; } | |
| .legend { font-size: 12px; } | |
| .legend span { display: inline-block; padding: 2px 8px; border-radius: 3px; margin-right: 10px; } | |
| .lc { background: #fee2e2; color: #b91c1c; } | |
| .le { background: #dcfce7; color: #166534; } | |
| .lf { background: #dbeafe; color: #1e40af; } | |
| .note { background: #fef3c7; border-left: 3px solid #f59e0b; padding: 10px 14px; margin: 12px 0; font-size: 13px; } | |
| </style> | |
| </head> | |
| <body> | |
| <h1>commitLSN vs endLSN in logical replication</h1> | |
| <p class="ref">References: | |
| <code>src/backend/replication/logical/decode.c:763</code> | |
| (<code>ReorderBufferCommit(ctx, xid, buf->origptr, buf->endptr, ...)</code>), | |
| <code>src/backend/replication/logical/proto.c:78</code> | |
| (<code>logicalrep_write_commit</code> wires <code>commit_lsn</code> + <code>txn->end_lsn</code>).</p> | |
| <h2>1. Where each LSN points inside the WAL</h2> | |
| <p>A COMMIT WAL record occupies a contiguous byte range on disk. The two LSNs identify the two ends of that range:</p> | |
| <svg viewBox="0 0 920 200" width="920" height="200"> | |
| <!-- WAL stream baseline --> | |
| <line x1="20" y1="120" x2="900" y2="120" stroke="#94a3b8" stroke-width="1" /> | |
| <text x="20" y="138" font-size="11" fill="#64748b">WAL stream (bytes →)</text> | |
| <!-- Records as boxes --> | |
| <g font-size="12"> | |
| <rect x="50" y="95" width="100" height="50" fill="#f1f5f9" stroke="#94a3b8" /> | |
| <text x="100" y="125" text-anchor="middle">INSERT</text> | |
| <rect x="160" y="95" width="100" height="50" fill="#f1f5f9" stroke="#94a3b8" /> | |
| <text x="210" y="125" text-anchor="middle">UPDATE</text> | |
| <rect x="270" y="95" width="90" height="50" fill="#f1f5f9" stroke="#94a3b8" /> | |
| <text x="315" y="125" text-anchor="middle">DELETE</text> | |
| <rect x="370" y="80" width="160" height="80" fill="#fee2e2" stroke="#b91c1c" stroke-width="2" /> | |
| <text x="450" y="116" text-anchor="middle" font-weight="600" fill="#7f1d1d">COMMIT record</text> | |
| <text x="450" y="132" text-anchor="middle" fill="#7f1d1d">(xid, time, …)</text> | |
| <rect x="540" y="95" width="100" height="50" fill="#f1f5f9" stroke="#94a3b8" /> | |
| <text x="590" y="125" text-anchor="middle">next txn…</text> | |
| </g> | |
| <!-- commitLSN marker --> | |
| <g> | |
| <line x1="370" y1="40" x2="370" y2="80" stroke="#b91c1c" stroke-width="2" /> | |
| <polygon points="370,80 366,74 374,74" fill="#b91c1c" /> | |
| <text x="370" y="32" text-anchor="middle" font-size="13" font-weight="600" fill="#b91c1c">commitLSN</text> | |
| <text x="370" y="18" text-anchor="middle" font-size="11" fill="#64748b">= start of COMMIT record (xlog buf->origptr)</text> | |
| </g> | |
| <!-- endLSN marker --> | |
| <g> | |
| <line x1="530" y1="40" x2="530" y2="80" stroke="#166534" stroke-width="2" /> | |
| <polygon points="530,80 526,74 534,74" fill="#166534" /> | |
| <text x="530" y="32" text-anchor="middle" font-size="13" font-weight="600" fill="#166534">endLSN</text> | |
| <text x="530" y="18" text-anchor="middle" font-size="11" fill="#64748b">= byte after COMMIT (xlog buf->endptr)</text> | |
| </g> | |
| <!-- delta --> | |
| <g> | |
| <line x1="370" y1="170" x2="530" y2="170" stroke="#64748b" stroke-width="1" stroke-dasharray="3 3" /> | |
| <text x="450" y="186" text-anchor="middle" font-size="11" fill="#64748b"> | |
| Δ = size of the COMMIT WAL record (always > 0) | |
| </text> | |
| </g> | |
| </svg> | |
| <p>The invariant <code>commitLSN < endLSN</code> therefore always holds: <code>endLSN</code> is the first byte past the COMMIT record, the LSN at which the <em>next</em> WAL record begins.</p> | |
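<p>As a minimal sketch of that relation, the Go snippet below parses two LSNs in PostgreSQL's textual <code>hi/lo</code> form and computes Δ. The <code>LSN</code> type, <code>parseLSN</code>, <code>commitRecordSize</code>, and the two sample values are hypothetical and exist only for illustration; they are not taken from the applier or from a real WAL.</p>

```go
package main

import "fmt"

// LSN is a byte offset into the WAL stream (hypothetical local type).
type LSN uint64

// parseLSN parses the textual "hi/lo" form, e.g. "0/16B3748".
func parseLSN(s string) (LSN, error) {
	var hi, lo uint32
	if _, err := fmt.Sscanf(s, "%X/%X", &hi, &lo); err != nil {
		return 0, fmt.Errorf("parse LSN %q: %w", s, err)
	}
	return LSN(hi)<<32 | LSN(lo), nil
}

// commitRecordSize returns endLSN - commitLSN and rejects pairs that
// violate the invariant commitLSN < endLSN.
func commitRecordSize(commitLSN, endLSN LSN) (uint64, error) {
	if endLSN <= commitLSN {
		return 0, fmt.Errorf("invariant violated: commitLSN=%X endLSN=%X", uint64(commitLSN), uint64(endLSN))
	}
	return uint64(endLSN - commitLSN), nil
}

func main() {
	// Made-up sample positions, not read from a real server.
	commitLSN, err := parseLSN("0/16B3748")
	if err != nil {
		panic(err)
	}
	endLSN, err := parseLSN("0/16B3778")
	if err != nil {
		panic(err)
	}
	size, err := commitRecordSize(commitLSN, endLSN)
	if err != nil {
		panic(err)
	}
	fmt.Printf("COMMIT record spans %d bytes\n", size) // prints "COMMIT record spans 48 bytes"
}
```

<p>A real commit should never trigger the error branch. A keepalive-style call that passes the same value for both arguments would, which is one way to tell the two callers apart in a test.</p>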
| <h2>2. Where each LSN ends up in the protocol message</h2> | |
| <p>From <code>proto.c:78</code> in <code>logicalrep_write_commit()</code>:</p> | |
| <pre>pq_sendint64(out, commit_lsn); /* commitLSN — start of the COMMIT record */ | |
| pq_sendint64(out, txn->end_lsn); /* endLSN — byte after the COMMIT record */ | |
| pq_sendint64(out, txn->commit_time);</pre> | |
| <p>On the client side (pglogrepl):</p> | |
| <pre>type CommitMessage struct { | |
| CommitLSN pglogrepl.LSN // ← what Postgres calls commit_lsn | |
| TransactionEndLSN pglogrepl.LSN // ← what Postgres calls end_lsn | |
| CommitTime time.Time | |
| }</pre> | |
| <h2>3. Where this matters for our applier</h2> | |
| <p>The applier exits when it reaches <code>a.end</code>. The interesting case is when <code>a.end</code> was captured via <code>RunUntil(WalFlushLSN(ctx))</code> — a physical flush position taken <em>after</em> the last data commit:</p> | |
| <svg viewBox="0 0 920 230" width="920" height="230"> | |
| <line x1="20" y1="120" x2="900" y2="120" stroke="#94a3b8" stroke-width="1" /> | |
| <text x="20" y="138" font-size="11" fill="#64748b">WAL stream (bytes →)</text> | |
| <g font-size="12"> | |
| <rect x="80" y="95" width="180" height="50" fill="#f1f5f9" stroke="#94a3b8" /> | |
| <text x="170" y="125" text-anchor="middle">last data txn (INSERTs)</text> | |
| <rect x="270" y="80" width="160" height="80" fill="#fee2e2" stroke="#b91c1c" stroke-width="2" /> | |
| <text x="350" y="125" text-anchor="middle" font-weight="600" fill="#7f1d1d">COMMIT</text> | |
| <rect x="450" y="95" width="220" height="50" fill="#e2e8f0" stroke="#94a3b8" stroke-dasharray="4 3" /> | |
| <text x="560" y="118" text-anchor="middle" fill="#475569">unpublished WAL</text> | |
| <text x="560" y="134" text-anchor="middle" fill="#475569" font-size="11">(autovacuum, internal xlog, …)</text> | |
| </g> | |
| <!-- commitLSN --> | |
| <g> | |
| <line x1="270" y1="50" x2="270" y2="80" stroke="#b91c1c" stroke-width="2" /> | |
| <polygon points="270,80 266,74 274,74" fill="#b91c1c" /> | |
| <text x="270" y="42" text-anchor="middle" font-size="12" font-weight="600" fill="#b91c1c">commitLSN</text> | |
| </g> | |
| <!-- endLSN --> | |
| <g> | |
| <line x1="430" y1="50" x2="430" y2="80" stroke="#166534" stroke-width="2" /> | |
| <polygon points="430,80 426,74 434,74" fill="#166534" /> | |
| <text x="430" y="42" text-anchor="middle" font-size="12" font-weight="600" fill="#166534">endLSN</text> | |
| </g> | |
| <!-- WalFlushLSN --> | |
| <g> | |
| <line x1="670" y1="160" x2="670" y2="190" stroke="#1e40af" stroke-width="2" /> | |
| <polygon points="670,160 666,166 674,166" fill="#1e40af" /> | |
| <text x="670" y="208" text-anchor="middle" font-size="12" font-weight="600" fill="#1e40af">a.end = WalFlushLSN</text> | |
| <text x="670" y="222" text-anchor="middle" font-size="11" fill="#64748b">captured by RunUntil(…) after the commit</text> | |
| </g> | |
| <!-- relations --> | |
| <text x="350" y="172" text-anchor="middle" font-size="11" fill="#64748b">commitLSN < endLSN ≤ WalFlushLSN</text> | |
| </svg> | |
| <p class="legend"> | |
| <span class="lc">commitLSN</span> | |
| <span class="le">endLSN</span> | |
| <span class="lf">a.end (WalFlushLSN)</span> | |
| </p> | |
| <h2>4. Why the check has to be on endLSN</h2> | |
| <table cellspacing="0" cellpadding="8" style="border-collapse: collapse; width: 100%; font-size: 13px;"> | |
| <tr style="background: #f1f5f9;"> | |
| <th align="left" style="border: 1px solid #cbd5e1;">condition</th> | |
| <th align="left" style="border: 1px solid #cbd5e1;">real commit</th> | |
| <th align="left" style="border: 1px solid #cbd5e1;">keepalive (synthetic commit)</th> | |
| </tr> | |
| <tr> | |
| <td style="border: 1px solid #cbd5e1;"><code>commitLSN >= a.end</code></td> | |
| <td style="border: 1px solid #cbd5e1; color: #b91c1c;">~never true (commitLSN < endLSN ≤ WalFlushLSN)</td> | |
| <td style="border: 1px solid #cbd5e1; color: #166534;">true (we pass walEndLSN as both args)</td> | |
| </tr> | |
| <tr> | |
| <td style="border: 1px solid #cbd5e1;"><code>endLSN >= a.end</code></td> | |
| <td style="border: 1px solid #cbd5e1; color: #166534;">true at the last commit, provided no other WAL was flushed after it (endLSN = WalFlushLSN)</td> | |
| <td style="border: 1px solid #cbd5e1; color: #166534;">same result as the commitLSN row (both arguments are equal)</td> | |
| </tr> | |
| </table> | |
| <div class="note"> | |
| <strong>The bug:</strong> the old code checked <code>commitLSN >= a.end</code>. For real commits that's almost never true, so the <em>only</em> thing that could trigger applier exit was a keepalive whose <code>walEndLSN</code> caught up to <code>a.end</code>. That keepalive could fire while a real commit's error path (conflict → audit row insert) was still in flight, masking the error. | |
| <br><br> | |
| <strong>The fix:</strong> check <code>endLSN >= a.end</code>. When <code>a.end</code> is captured right after the last real commit, that commit's <code>endLSN</code> reaches <code>a.end</code>, so the apply loop exits via the real-commit path, which has already finished applying (and recording any error). The keepalive path remains a fallback for truly idle sources, with the same guarantee, since for it <code>commitLSN == endLSN == walEndLSN</code>. | |
| </div> | |
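<p>The difference between the two checks can be replayed with a toy model. The sketch below is not the applier: <code>event</code>, <code>firstExit</code>, and the LSN values are hypothetical, and it assumes <code>a.end</code> was captured immediately after the last commit, so that the commit's <code>endLSN</code> equals <code>a.end</code>.</p>

```go
package main

import "fmt"

// LSN is a byte offset into the WAL stream (hypothetical local type).
type LSN uint64

// event models one call into a commit handler: a real commit carries two
// distinct LSNs, a keepalive passes walEndLSN for both.
type event struct {
	name      string
	commitLSN LSN
	endLSN    LSN
}

// firstExit returns the name of the first event that satisfies the exit
// check against end, or "" if none does. useEndLSN selects the fixed check.
func firstExit(events []event, end LSN, useEndLSN bool) string {
	for _, ev := range events {
		probe := ev.commitLSN
		if useEndLSN {
			probe = ev.endLSN
		}
		if probe >= end {
			return ev.name
		}
	}
	return ""
}

func main() {
	const end LSN = 148 // a.end, captured right after the last commit
	events := []event{
		{"real commit", 100, 148}, // commitLSN < endLSN == a.end
		{"keepalive", 148, 148},   // commitLSN == endLSN == walEndLSN
	}
	fmt.Println("old check exits on:", firstExit(events, end, false)) // old check exits on: keepalive
	fmt.Println("new check exits on:", firstExit(events, end, true))  // new check exits on: real commit
}
```

<p>If <code>end</code> is set above 148 in this model, neither check fires on the real commit and only a later keepalive can end the loop, which is the fallback case described in the note.</p>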
| <h2>5. Mapping to the Go code</h2> | |
| <pre>// pkg/subscription/applier/applier.go — single end-of-apply check inside commit() | |
| if endLSN >= a.end { | |
| a.endLSNReached = true | |
| a.timer.Stop() | |
| return a.flush(ctx) | |
| } | |
| // real-commit caller (applyDMLV1): | |
| a.commit(ctx, logicalMsg.CommitLSN, logicalMsg.TransactionEndLSN, logicalMsg.CommitTime) | |
| // ^^^ commitLSN ^^^ endLSN (TransactionEndLSN) | |
| // keepalive caller (sendKeepAlive): commitLSN == endLSN == walEndLSN | |
| a.commit(ctx, walEndLSN, walEndLSN, serverTime)</pre> | |
| </body> | |
| </html> |