Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save jlevy/ed975222723c6764aee9e39f020ba87e to your computer and use it in GitHub Desktop.

Select an option

Save jlevy/ed975222723c6764aee9e39f020ba87e to your computer and use it in GitHub Desktop.
Research doc from codex-gpt-5.5: SQLite File Coordination for Portable Server-Free Databases [codex-gpt-5.5]
title Research: SQLite File Coordination for Portable Server-Free Databases
description Consolidated research on using ordinary SQLite database files as durable, portable, server-free archives with cooperative WAL-free concurrency, safe backup, and cross-platform file coordination.
author Levy + Codex
status Complete: background consolidated; recommendations tentative
related
../specs/active/plan-2026-06-06-sqlite-session-safe-package.md
research-2026-05-15-secure-file-backed-tool-access.md
research-2026-04-19-filestore-mount-alternatives.md
research-2026-04-09-large-tree-and-nfs-browser-monitoring.md
../specs/active/plan-2026-05-15-eia-dataroom-agent-access.md
../specs/done/plan-2026-05-11-metabrowser-dataroom-plugin.md
beads
trading-l0bb
trading-hoqk
source_checkouts
repo path commit
sqlite/sqlite
attic/sqlite
6bf7825ac5afd870fbe69c0eb82cddbb031de171
repo path commit
WiseLibs/better-sqlite3
attic/better-sqlite3
f93f490cdd51731271534ad470d09dfd7388c6f1
repo path commit
moxystudio/node-proper-lockfile
attic/node-proper-lockfile
9f8c303c91998e8404a911dc11c54029812bca69

Research: SQLite File Coordination for Portable Server-Free Databases

Date: 2026-06-06 (last updated 2026-06-06)

Author: Levy + Codex

Status: Complete: background consolidated; recommendations tentative.

Overview

We want a durable, portable, file-based data format that can be accessed directly from local APIs without requiring a database server. SQLite is the strongest candidate because its primary artifact is an ordinary, stable database file, readable by mature tooling in Node, Python, Rust, and many other languages.

The hard part is not the SQLite format. The hard part is safe coordination when the database is on local disks, network filesystems, synced folders, or shared file trees where OS locking, WAL shared memory, or generic file-copy backups can fail in ways that are difficult for application code to observe.

The realistic target is a cooperative wrapper:

  • keep the authoritative data in an ordinary SQLite database file;
  • avoid WAL by default for portability and network-filesystem compatibility;
  • serialize package-controlled writes through a cross-platform directory lock;
  • keep transactions short and explicit;
  • when available, capture SQLite session changesets as the package-level mutation artifact for audit, export, sync, or downstream replay;
  • expose safe backup/snapshot helpers as first-class APIs;
  • publish the lock protocol so other languages can participate;
  • document that uncoordinated raw clients can still corrupt or race the database.

This report supports the active package plan in plan-2026-06-06-sqlite-session-safe-package.md. The plan can stay short because this report carries the background, source facts, constraints, and tentative direction.

Essential Goals

The target is not "a database that does everything." The target is a narrow local-first primitive with these essential properties:

  1. Durable archival format. The authoritative data lives in a well-known, documented, durable, file-based format. SQLite is the leading candidate because ordinary tools can inspect it and because the format is stable enough to serve as an application file format.
  2. Direct local APIs. Applications open a local path directly through normal language APIs, especially Node at first and ideally Python, Rust, and other bindings later. Users should not need a database server just to read or write the file.
  3. Cross-platform operation. The package must work on macOS, Linux, and Windows. The core protocol cannot depend on Unix-only flock or fcntl as its outer coordination mechanism.
  4. Cooperative local concurrency with explicit granularity. The default target is many concurrent readers plus one admitted writer per SQLite database file, not arbitrary parallel writes to different tables inside the same file. In trusted-lock mode, reads can use SQLite's normal shared-lock behavior while package writes are serialized by an outer protocol and short SQLite transactions. In weak-lock mode, all live package access may need a reader/writer package lock: many cooperative readers or one cooperative writer, with raw live access limited to snapshots.
  5. Server-free default. The first-class mode is embedded and file-based. A sidecar may be useful later for high contention, in-memory state, indexing, or sync, but it should be optional and should not make the file stop being the authority.
  6. Weak-lock-filesystem compatibility. The package should not rely on OS byte-range locks as the outer admission protocol. It should be able to run on validated NFS-like or shared filesystems where flock/fcntl cannot be trusted, using a directory-lock protocol plus conservative SQLite modes. This is not a claim that every filesystem is safe; atomic mkdir, metadata updates, writes, renames, and sync behavior still need validation.
  7. Managed backup and crash safety. Package-created backups should be consistent snapshots, and interrupted package transactions should recover to the last committed SQLite state under the configured sync policy. The safety contract should be backup(), VACUUM INTO, or copyUnderLock(), not a folklore promise that arbitrary backup software can copy only the live .sqlite file mid-transaction and always get a valid current snapshot.

These goals deliberately prioritize a cooperative wrapper over arbitrary live writers. Plain SQLite readers remain important, especially for snapshots and supported read modes, but live writers must either use the wrapper or implement the same protocol.

Concurrency Guarantees and Granularity

The package contract should be precise about what it can and cannot improve. For a single ordinary SQLite database file, neither rollback journal mode nor WAL mode provides true simultaneous writers on different tables, pages, or logical sections. SQLite serializes write transactions per database file. WAL improves reader/writer overlap on local filesystems, but it still has one writer at a time.

SQLite's shared-cache mode has table-level locks, but it is not a good foundation for this package. The SQLite docs call shared-cache obsolete for most uses, it is connection/cache scoped, and even there only one write transaction may be open for a shared cache at a time. It does not solve cross-process, weak-filesystem, arbitrary-client coordination.

The useful concurrency contract is therefore:

Mode Read concurrency Write concurrency Read/write overlap Safety boundary
Raw SQLite, rollback journal, trusted locks Many readers via SQLite shared locks One writer per database file Readers can coexist with write intent, but writers need exclusive access to write/commit and can be delayed by readers SQLite VFS and filesystem locks must work
Raw SQLite, WAL, trusted local filesystem Many snapshot readers One writer per database file Strong local overlap: readers do not block the writer and the writer does not block readers; checkpoints can be constrained by readers Local filesystem with working WAL shared memory and sidecar handling
Package default, trusted native locks Many package or raw readers One package writer per database file, serialized by the outer lock plus SQLite transaction Inherits rollback-journal behavior; readers may delay final commit, so busy timeouts and retries matter Writers must use the package/protocol; raw readers are acceptable in documented modes
Package network-exclusive, weak native locks Many package readers under a shared package read lock One package writer under an exclusive package write lock No live reader/writer overlap if SQLite locks are disabled or untrusted All live readers and writers must be cooperative; raw tools use snapshots
Local sidecar or application server Many API readers, depending on server design Server can queue, batch, and retry writes Server can hide contention and keep memory state Direct file writes are no longer arbitrary; clients use the service API
Multiple SQLite files or shards Many readers per file One writer per file, so writers to different files can proceed in parallel Per-file behavior applies Cross-file transactions, queries, backup, and schema management become harder

This means the package can offer better local ergonomics and safer admission, but it cannot turn one SQLite file into a multi-writer database with table-level commit parallelism. If true concurrent writers to different logical areas are required, the realistic options are sharding into multiple SQLite files or introducing a sidecar/server that serializes final commits while doing more work outside the database lock.

Optimistic Concurrency and Retry

Optimistic concurrency is possible and likely valuable, but it should be described as application/change-level optimism with serialized final commits. The package can shorten lock hold time and retry conflicts without allowing two writers to commit to the same SQLite file at the same instant.

A useful API shape is:

await db.optimisticWrite(async (ctx) => {
  const item = await ctx.read("select id, value, version from item where id = ?", id);
  const next = compute(item);

  return ctx.commitIf({
    reads: [{ table: "item", pk: id, version: item.version }],
    write(tx) {
      tx.run(
        "update item set value = ?, version = version + 1 where id = ? and version = ?",
        next.value,
        id,
        item.version,
      );
    },
  });
});

The mechanics would be:

  1. Read a baseline without holding the package write lock.
  2. Compute the candidate change outside the lock.
  3. Acquire the package write lock.
  4. Re-check declared preconditions, such as row versions, table revisions, schema version, or application-defined invariants.
  5. Apply SQL or a SQLite changeset in a short transaction.
  6. On busy/conflict, release the lock, back off, and retry if the operation is idempotent and the policy allows it.

The granularity can be row-level, table-level, or application-defined if the schema supports it. Stable primary keys, version columns, revision tables, and explicit compare-and-swap predicates make conflicts precise. SQLite sessions and changesets can help because changesets include primary-key identity and original values for updates, giving the apply step a way to detect row-level conflicts. Patchsets are less suitable for this because they carry less original-value data.

This pattern works best for independent row updates, append-only records, idempotent upserts, and operations whose invariants can be rechecked at commit time. It is weaker for schema migrations, aggregate invariants, cross-table constraints, or business rules that require a broad serializable read set unless the application declares and validates those preconditions explicitly.

Why SQLite Sessions Matter

The SQLite session extension is one of the more important facts behind this package idea. It does not make SQLite concurrent by itself, and it does not replace file locks. It gives SQLite a built-in way to turn committed row changes into portable binary changesets or patchsets that can be applied to another database with the same schema and compatible starting data.

That is close to the missing application-level layer for this problem. The wrapper still needs an outer lock to decide who may write the live file, but once a write is admitted, a session can record the semantic result of that write:

acquire package lock
  open SQLite transaction
  attach SQLite session
  execute application write
  commit SQLite transaction
  extract changeset or patchset
  append/store/export changeset metadata if requested
release package lock

The design separates the authoritative archive from the optional mutation stream. The SQLite file remains the archive, while changesets can support features that are hard to provide with a lock alone:

  • audit trails for package-mediated writes;
  • compact export of "what changed" since a known baseline;
  • replay into a read replica, local cache, or rebuilt index;
  • merge-oriented workflows when two files share schema and compatible history;
  • safer downstream processing than parsing SQL text or diffing database files;
  • optional undo/invert flows for package-managed changes.

SQLite sessions also explain why the package should stay low-level and schema-aware. The session extension only tracks ordinary table row changes that meet its requirements. It is strongest when the application schema has stable primary keys and when writes are short, explicit, and captured as transactions.

Session History and Availability

SQLite release 3.13.0 on 2016-05-18 merged the session extension into SQLite trunk. The official session documentation says it has been included in the SQLite amalgamation since 3.13.0, but remains disabled by default unless SQLite is compiled with SQLITE_ENABLE_SESSION and SQLITE_ENABLE_PREUPDATE_HOOK, or configured with --enable-session.

That history matters. Sessions are not a new experimental idea inside SQLite itself. They are a mature but opt-in extension that many bindings never exposed because the common SQLite use case is simpler: open a file, run SQL, and let SQLite handle transactions. The session extension lives in a different niche: change capture, apply, invert, conflict handling, and merge-like workflows.

Ecosystem evidence from this review is consistent with that niche status:

  • SQLite upstream ships ext/session/sqlite3session.c, ext/session/sqlite3session.h, many session tests, and a changesetfuzz tool.
  • Node's built-in node:sqlite exposes DatabaseSync#createSession(), Session#changeset(), Session#patchset(), and database.applyChangeset(), but node:sqlite is a recent Node feature and is still release-candidate rather than a long-established stable Node API.
  • The reviewed better-sqlite3 build does not compile with SQLITE_ENABLE_SESSION / SQLITE_ENABLE_PREUPDATE_HOOK and does not expose a public session API.
  • Python's standard sqlite3 module and many simple SQLite bindings focus on SQL execution, not changeset production and application.

This is why sessions may be underused yet still promising. They are too specialized to show up in most "SQLite as local database" wrappers, but they line up directly with the harder requirements here: safe package-mediated writes, portable mutation records, backups, replay, and cross-language protocol design.

What Sessions Do Not Solve

Sessions should not be described as the concurrency solution. They are a change-capture and change-application mechanism. They do not prevent two processes from writing the same live file at the same time, and they cannot stop a raw SQLite writer from bypassing the package.

Important constraints:

  • sessions require tables with declared primary keys;
  • virtual table changes are not captured;
  • rows with NULL primary-key values are ignored by the session module;
  • patchsets are smaller than changesets but carry less conflict-detection information;
  • applying a changeset requires an application conflict policy;
  • changeset blobs are SQLite session artifacts, not a universal human-readable log;
  • schema migrations need careful handling because changesets apply only to compatible table shapes.

The correct design is therefore "outer cooperative lock plus optional sessions," not "sessions instead of locking." Sessions make the package more useful after admission has been serialized. They can support optimistic apply/retry flows with row-level conflict detection, but they still do not make two live writers safe without an admission protocol.

Where Standard Solutions Fall Short

Existing solutions cover parts of the target, but each gives up at least one essential goal.

Standard solution What it does well Where it falls short for this target
Plain SQLite in rollback-journal mode durable ordinary SQLite file, broad local APIs, cross-platform support depends on native file locks for concurrent live access; no package-level backup/session protocol
SQLite WAL mode strong local performance and reader/writer overlap on trustworthy local filesystems WAL shared memory is not suitable for NFS/network filesystems; backup/checkpoint sidecars complicate portability
Raw flock/fcntl wrapper simple on Unix local filesystems not cross-platform enough and not reliable on the filesystems that motivate this work
better-sqlite3 mature Node SQLite API, transactions, backup, serialize, FTS same SQLite file format but not a coordination protocol; no exposed session/changeset API in the reviewed build
proper-lockfile good generic mkdir lock primitive with stale/heartbeat handling not SQLite-specific; no transaction, backup, session, or cross-language database protocol
Application server or sidecar centralizes writes and can admit arbitrary clients through an API no longer pure direct file access for writers; adds lifecycle and deployment surface
rqlite/dqlite/LiteFS-style systems replication, high availability, or distributed coordination around SQLite solve a different server/cluster/sidecar problem rather than a tiny direct local file primitive

The proposed package is useful only if it owns the missing middle: ordinary SQLite files plus cooperative write admission, WAL-free defaults, safe backup helpers, and a published protocol that other languages can implement.

Questions to Answer

  1. Can ordinary SQLite files satisfy the archive, portability, and direct local API requirements?
  2. Why are WAL and normal SQLite file locks insufficient for the desired network-filesystem and "any filesystem" ambitions?
  3. Is there an existing Node package that already provides a complete standalone, cooperative, cross-platform, WAL-free, backup-safe wrapper?
  4. Does better-sqlite3 have the same file format as SQLite, and would wrapping it make sense?
  5. Does proper-lockfile solve enough of the lock problem to use directly?
  6. What experiments have already validated assumptions?
  7. What tentative package design follows from the evidence?

Scope

In scope:

  • SQLite file format, rollback journal mode, WAL mode, locking, backup, sessions, and thread-safety behavior
  • Node-facing SQLite surfaces, especially node:sqlite and better-sqlite3
  • Cross-process lock directory design, including proper-lockfile
  • Network filesystem and weak-lock-filesystem failure modes
  • Dataroom and metabrowser context as motivating consumers, not as package scope
  • Adjacent projects that solve related but different SQLite portability problems

Out of scope:

  • Replacing SQLite with a new database engine
  • Designing a full ORM
  • Distributed SQL, consensus replication, or sync protocols as first-release package features
  • Claiming safety against applications that bypass the cooperative protocol
  • Proving behavior on every filesystem without a dedicated validation matrix

Executive Findings

F1. SQLite Is the Right Archive Substrate, but Not a Complete Coordination Contract

SQLite's file format is stable, documented, and broadly implemented. The official file-format documentation describes the on-disk format used by all SQLite 3 releases since 2004-06-18 and notes that the complete state is usually held in one main database file, with journal or WAL sidecars during transactions or recovery. That makes SQLite a strong durable archive format.

SQLite does not solve every cross-process and cross-filesystem coordination problem. Its own documentation recommends a client/server database when many clients access the same database over a network, especially because network filesystems can have buggy file locking.

Implication: The wrapper should rely on SQLite for page format, transactions, and crash recovery, but it must own a higher-level cooperative admission protocol.

F2. WAL Is a Poor Default for the Stated Requirements

WAL improves local read/write concurrency, but it depends on a WAL-index implemented as shared memory. SQLite's source and official WAL documentation both state that WAL is not supported on network filesystems because all users must be able to share memory.

WAL also introduces multiple files (.db, -wal, -shm) and checkpoint behavior. That is workable, but it is a bad default for a package whose explicit requirements are portability, backup friendliness, and no NFS/WAL issues.

Implication: Default to rollback journal mode (journal_mode=DELETE or a carefully chosen rollback variant) with synchronous=FULL. Offer WAL only as an explicit local-filesystem mode.

F3. "Works on Any Filesystem" Must Mean "Cooperative Clients on Validated Filesystems"

No wrapper can make arbitrary uncoordinated access safe. SQLite explicitly warns that rogue processes can overwrite ordinary database files, that broken or missing lock implementations can corrupt databases, and that enabling nolock=1 is dangerous unless the application guarantees serialized writes.

A wrapper can still be useful if it narrows the contract:

  • all writers must use the package or the published protocol;
  • the lock primitive must be validated on the target filesystem;
  • backup must be performed through package APIs or quiesced copy helpers;
  • raw direct reads are allowed only in modes where SQLite's own locks remain active, or from package-created snapshots.

Implication: Write the docs as an explicit cooperative protocol, not as magical filesystem safety.

F4. A Standalone Package Appears Worthwhile

The reviewed ecosystem has strong pieces but no single common package that owns this specific contract:

  • better-sqlite3 provides mature synchronous SQLite bindings, transactions, backup, and serialize support.
  • node:sqlite now exposes sessions, changesets, and backup, but remains marked experimental in Node 24 and release-candidate in current Node docs.
  • proper-lockfile provides a strong general-purpose mkdir lock implementation.
  • SQLite itself has alternate VFS locking modes, nolock, unix-dotfile, and proxy locking, but those are not a high-level Node API with sessions, backups, and a cross-language protocol.

Implication: A focused package can add value if it is small, explicit, heavily tested, and honest about cooperative enforcement.

F5. SQLite Sessions Are Promising Because They Turn Writes into Portable Deltas

The session extension is not just an audit feature. It is SQLite's built-in mechanism for producing binary changesets and patchsets, combining changesets, inverting changesets, and applying them with conflict handling. That could give the package a compact mutation artifact without inventing a custom diff format.

The feature is real in upstream SQLite, but not universally exposed. It is opt-in at SQLite build time, requires the pre-update hook, and has schema constraints. This explains why it has not become a default part of common Node/Python SQLite usage even though it may be exactly the right primitive for a coordinated wrapper.

Implication: Make sessions a first-class capability when the backend supports them, but do not make the outer lock depend on sessions. The wrapper should work as a safe WAL-free write/backup package without sessions, and become more powerful when changeset support is available.

F6. The Core Tradeoff Is a Coordination Triangle, Not a CAP Theorem

A useful shorthand is a SQLite coordination triangle. This is not the CAP theorem: there is no distributed consensus claim here. It is a statement about where enforcement lives.

The three desirable properties are:

  1. Direct file access: the SQLite file is the primary artifact, and clients open it directly without a database server or broker.
  2. Strong coordinated writes on weak/shared filesystems: concurrent package clients cannot double-write, bypass backup rules, or rely on WAL/shared-memory/file-lock behavior the filesystem cannot provide.
  3. Unmodified arbitrary writers: any SQLite-capable process can write to the live file without knowing about the package protocol.

The package can maximize the first two, but only by giving up the third. All writers must use the wrapper or a compatible implementation of its protocol. Unmodified tools can still be excellent readers in supported modes, especially against snapshots, but unmodified live writers are outside the safety boundary.

The other pairs have known shapes:

Properties Shape Concession
Direct file access + strong coordinated writes This wrapper: ordinary SQLite file plus cooperative lock/session/backup protocol Writers are not arbitrary; they must follow the protocol
Direct file access + arbitrary writers Plain SQLite with normal SQLite locking on trustworthy filesystems No package-level safety on weak filesystems, no nolock=1 safety, weaker live-backup story
Strong coordinated writes + arbitrary clients Application server, local sidecar, rqlite/dqlite-style service, or domain API Clients no longer write the SQLite file directly

This framing should be used carefully. It does not say arbitrary reads are impossible. It says arbitrary uncoordinated live writes are incompatible with a server-free wrapper that also promises strong coordination on filesystems where native locks or WAL cannot be trusted.

F7. Dataroom Is a Motivating Consumer, Not the Package Boundary

Current Dataroom work uses LMDB indexes for URL/blob/file/action metadata. Metabrowser can read those indexes with an LMDB reader, and EIA agents can use native Dataroom lookup instead of duplicating source indexes. That proves the value of file-backed local data stores, but it also shows the cost of a less common binary store for broad human/tool portability.

Implication: The package should be generic. Dataroom can later consume it behind a Dataroom-owned index backend interface.

SQLite Background

File Format and Archival Properties

SQLite is designed as an embedded database library. Its official "Appropriate Uses" page says SQLite competes with fopen(), not with client/server databases, and emphasizes local storage for applications and devices. The database file starts with the magic string SQLite format 3\000, uses pages, and has a documented schema layer, b-tree representation, rollback journal format, and WAL format.

For an archive-oriented package, useful properties are:

  • ordinary files, not a daemon-managed storage directory;
  • broad tooling support;
  • readable from Python's standard sqlite3, Rust crates, Node bindings, SQLite CLI, DB Browser for SQLite, Datasette, and similar tools;
  • stable SQL schema and migration conventions;
  • ability to set application_id and user_version for application-owned metadata;
  • ability to run PRAGMA integrity_check after crash and backup tests.

The package should avoid hiding data behind a custom opaque binary envelope. If the package adds coordination files, those should sit beside the SQLite file and not change the SQLite file format.

Rollback Journal Mode

Rollback journal mode is the safer default for this target because it does not require a shared-memory WAL-index. SQLite's rollback-journal locking model has one writer at a time, shared reader locks, and an exclusive lock during the database-file write phase. When commit reaches the journal-delete step, the change becomes durable under the configured synchronous policy. This provides good many-reader behavior but not table-level or page-level concurrent writers. Readers can coexist with a writer that holds write intent, but a writer eventually needs exclusive access to write pages and commit, so long readers can delay writers.

The wrapper should prefer:

PRAGMA journal_mode=DELETE;
PRAGMA synchronous=FULL;
PRAGMA foreign_keys=ON;
PRAGMA busy_timeout=5000;
BEGIN IMMEDIATE;
-- caller write statements
COMMIT;

BEGIN IMMEDIATE is useful because it obtains write intent up front, so contention surfaces at transaction start rather than halfway through application work.

WAL Mode

WAL mode has real advantages on local disks: readers and writers can overlap, and local write throughput is often better. Those advantages are why better-sqlite3 recommends WAL for many web application workloads. WAL does not change the single-writer fact for one database file. Its concurrency advantage is that readers and a writer can proceed at the same time using snapshot reads and the WAL sidecar, while writes are still serialized.

The requirement here is different. The package is meant to make file-based databases easier to use in backup-friendly, server-free, WAL-free deployments, including network filesystems where WAL is not a safe default.

Use WAL only as an explicit advanced mode with:

  • local filesystem requirement;
  • -wal and -shm backup handling;
  • checkpoint policy;
  • documentation that WAL without shared memory requires exclusive locking and only one connection.

Thread Safety

SQLite supports three thread modes:

  • single-thread: unsafe across threads;
  • multi-thread: multiple threads may use SQLite, but a single connection or derived object must not be used concurrently;
  • serialized: SQLite serializes access to each connection/object with mutexes.

better-sqlite3 builds with SQLITE_THREADSAFE=2, which is SQLite's multi-thread mode. Its source asserts sqlite3_db_mutex(db_handle) == NULL after opening, so the package must not share a single connection concurrently across Worker Threads.

The wrapper needs a process-local mutex even when it also has a cross-process lock. The process-local mutex protects one package instance and one SQLite connection from async/Worker interleavings. The cross-process directory lock protects all cooperative processes.

Backup and Copy Semantics

SQLite's own corruption guide is clear: copying a live database while a transaction is active can produce a corrupt copy. Safe live backup paths include:

  • SQLite Online Backup API;
  • VACUUM INTO;
  • sqlite3_rsync for remote live copies in SQLite 3.47.0 and later;
  • copying the database only while no transactions are active, including the relevant journal or WAL sidecar when needed.

The wrapper should make safe backup the normal path:

await db.backup("state.snapshot.sqlite");
await db.vacuumInto("state.compact.sqlite");
await db.copyUnderLock("state.raw-copy.sqlite");

The documentation should avoid saying users can copy only state.sqlite while writers are live and get a guaranteed current snapshot.

Sessions and Changesets

SQLite's session extension tracks table changes and produces changesets or patchsets. This is attractive for sync, audit, export, or downstream rebuilds, and it deserves more weight in the package design than a generic optional feature would.

The official SQLite introduction frames sessions as a way for two users to start from the same application file, make independent changes, and later exchange/apply changesets. It explicitly compares the feature to patch/merge systems. For this package, the more immediate use is not arbitrary distributed merge. It is controlled write capture under an outer lock: after a package-mediated write commits, the package can return or store the changeset as the durable description of the mutation.

Constraint: Session support is a backend capability, not a property of every Node SQLite binding. The Node node:sqlite module exposes DatabaseSync#createSession(), Session#changeset(), Session#patchset(), and database.applyChangeset(). The reviewed better-sqlite3 build does not compile or expose the session extension in its public API.

Implication: Model changesets as optional capability. Do not make them required for the first generic write/backup wrapper unless the chosen backend is node:sqlite or a custom backend.

SQLite Source-Code Findings

Source checkout: sqlite/sqlite at 6bf7825ac5afd870fbe69c0eb82cddbb031de171.

The checkout was read as source only. No third-party build or test was run.

Session Source Is Substantial but Optional

The source checkout contains the session implementation under ext/session/sqlite3session.c and the public C interface under ext/session/sqlite3session.h. The same directory includes many session regression tests, rebase tests, fault tests, and changesetfuzz.c.

The header is guarded by SQLITE_ENABLE_SESSION, and it documents that the session module uses SQLite's pre-update hook. This matches the official build requirement: sessions require both SQLITE_ENABLE_SESSION and SQLITE_ENABLE_PREUPDATE_HOOK.

This source shape is important. Sessions are not an external ad hoc extension, but they are also not part of every SQLite build by default. The package should therefore expose them through backend capability detection.

WAL Source Confirms the Network-Filesystem Problem

src/wal.c documents the WAL-index as conceptually shared memory. It says SQLite does not support journal_mode=WAL on a network filesystem because all users of the database must be able to share memory.

This is not just a web-doc caveat. It is in the WAL implementation source.

Unix VFS Source Shows Multiple Locking Strategies, None Fully Solving the Package Contract

src/os_unix.c includes VFS method sets for:

  • normal POSIX advisory locks;
  • unix-none / no-lock behavior;
  • unix-dotfile;
  • optional unix-flock;
  • optional Apple-specific proxy locking;
  • other platform styles.

The file also contains comments about network filesystem cache coherency. For OS X network filesystems, SQLite notes that close-to-open cache coherency does not effectively satisfy concurrent database access by multiple readers and writers. Proxy locking addresses this by limiting access to a single host at a time and moving locks to a local proxy file.

These source details show that SQLite already has specialized VFS mechanisms for hard filesystems, but those mechanisms are platform-specific and VFS-level. They do not provide a convenient npm package contract with owner metadata, stale lock recovery, backup APIs, and cross-language protocol docs.

URI and nolock=1

src/sqlite.h.in documents URI filenames and examples such as:

file:/home/fred/data.db?vfs=unix-dotfile

The official URI documentation says nolock=1 disables VFS lock calls and may be used on filesystems without file locking. It also warns that corruption can result if multiple connections access the same database and one uses nolock. It says nolock should only be used if the application can guarantee serialized writes.

This is the key source-backed argument for an outer cooperative wrapper. nolock=1 can be a fallback only when the wrapper has exclusive protocol control over all writers and, in the strictest mode, all readers.

Windows Uses OS Byte-Range Locking Internally

SQLite's Windows VFS uses Windows file-lock APIs such as LockFileEx. The proposed package should not replace SQLite's platform-specific internals. It should add an outer admission protocol that works the same way for package callers across macOS, Linux, and Windows.

Node and npm Package Findings

node:sqlite

Local environment:

node --version -> v24.13.0
node:sqlite exports -> DatabaseSync, Session, StatementSync, backup, constants

Node printed:

ExperimentalWarning: SQLite is an experimental feature and might change at any time

Current Node docs for node:sqlite say:

  • added in Node v22.5.0;
  • no longer behind --experimental-sqlite in v23.4.0 / v22.13.0, but still experimental at that time;
  • currently marked Stability 1.2, release candidate, in the Node v26.3.0 docs;
  • exposes DatabaseSync, Session, changesets, patchsets, applyChangeset, and sqlite.backup.

Conclusion: node:sqlite is the best proof backend for session and backup semantics. It may be too new for a conservative first public npm release if the target includes older Node LTS versions.

better-sqlite3

Source checkout: WiseLibs/better-sqlite3 at f93f490cdd51731271534ad470d09dfd7388c6f1.

The checkout was read as source only. No install, build, test, or package script was run.

better-sqlite3 is a Node native addon around SQLite. It does not define a different file format. Its docs say backup output is a regular SQLite database file, and serialize() returns bytes that can be written to disk to create a regular SQLite database file.

Useful findings:

  • Mature synchronous API.
  • Good transaction wrapper shape: function callbacks, nested savepoints, deferred / immediate / exclusive variants.
  • Explicit warning that async transaction functions do not work because the transaction can commit before awaited code runs.
  • Backup API uses SQLite's online backup mechanism and lets normal use continue while backup is in progress.
  • serialize() can create a regular SQLite database file.
  • Bundled SQLite enables FTS, JSON, RTREE, GEOPOLY, DBSTAT, DESERIALIZE, STAT4, and related features.
  • Builds with SQLITE_THREADSAFE=2.
  • Does not list SQLITE_ENABLE_SESSION or SQLITE_ENABLE_PREUPDATE_HOOK in the checked compile flags.
  • Calls sqlite3_open_v2() with a NULL VFS and ordinary read/write/create flags.
  • Default compile flags include SQLITE_USE_URI=0; URI support can be toggled globally through the SQLITE_USE_URI environment variable before SQLite initialization.
  • Recommends WAL for performance in its docs and compiles SQLITE_DEFAULT_WAL_SYNCHRONOUS=1.
  • Uses native addon install/build machinery (prebuild-install or node-gyp rebuild).

Conclusion: Wrapping better-sqlite3 makes sense for a stable synchronous backend covering read/write/backup/serialize and FTS-heavy local indexes. It is not enough for first-class SQLite sessions/changesets as currently published. Its global/env URI behavior also makes nolock=1 and vfs=unix-dotfile awkward as per-database package options.

proper-lockfile

Source checkout: moxystudio/node-proper-lockfile at 9f8c303c91998e8404a911dc11c54029812bca69.

The checkout was read as source only. No install, build, test, or package script was run.

proper-lockfile implements the same core primitive this package wants:

  • lock by creating <file>.lock with atomic mkdir;
  • treat EEXIST as contention;
  • use mtime to decide stale locks;
  • update mtime periodically as a heartbeat;
  • probe mtime precision for second-resolution filesystems;
  • resolve canonical paths with realpath by default;
  • provide retry settings;
  • detect compromised locks when heartbeat update fails, takes too long, or the lock mtime no longer matches the holder's expected value;
  • remove held locks on process exit where possible;
  • include a stress test intended to guarantee exclusiveness.

Limitations relative to the desired SQLite package:

  • no SQLite transaction protocol;
  • no owner metadata schema with protocol version, host, pid, target, and mode;
  • no changeset or backup integration;
  • documented undetected compromise cases around manual lock removal followed by a new owner and inconsistent stale/update values among clients;
  • older CommonJS dependency tree.

Conclusion: Use proper-lockfile as design and test reference. A package can either depend on it for primitive acquire/release or reimplement the small core with owner metadata and attribution. The cleaner package boundary is to own the lock protocol directly.

Local Spike Results

Spike 1: node:sqlite Sessions and Backup

A local Node spike created a database with DatabaseSync, set:

PRAGMA journal_mode=DELETE;
PRAGMA synchronous=FULL;

It created a session, ran a BEGIN IMMEDIATE write transaction, obtained a changeset, and used sqlite.backup() to copy the database.

Observed result:

{
  "journalMode": { "journal_mode": "delete" },
  "synchronous": { "synchronous": 2 },
  "sourceRows": { "n": 2 },
  "backupRows": { "n": 2 },
  "changesetBytes": 45,
  "files": ["backup.db", "catalog.db"]
}

Conclusion: node:sqlite can validate the proof-of-concept for rollback journal, FULL synchronous, sessions, changesets, and backup.

Additional session-apply spike:

{
  "changesetBytes": 77,
  "applied": true,
  "rows": [
    { "id": 1, "value": "new", "count": 2 },
    { "id": 2, "value": "added", "count": 1 }
  ]
}

This spike created matching source and target databases, attached a session to the source, performed an update plus insert inside BEGIN IMMEDIATE, extracted a changeset, and applied it to the target. It confirms the practical package shape: a write callback can commit the ordinary SQLite transaction and return a compact changeset artifact for replay or audit.

Spike 2: Directory Lock Contention

Eight Node child processes attempted to create the same directory.

Observed result:

{
  "winners": 1,
  "losers": 7,
  "results": [
    { "result": "won" },
    { "result": "EEXIST" },
    { "result": "EEXIST" }
  ]
}

Conclusion: Local mkdir behaves as expected for atomic mutual exclusion. This is necessary evidence, not sufficient evidence. NFS, SMB, cloud-sync folders, Windows, and failure injection still need explicit tests.

Spike 3: URI Options Through node:sqlite

DatabaseSync accepted URI filenames with:

nolock=1
vfs=unix-dotfile

Conclusion: The proof backend can exercise the URI options needed for network-filesystem fallback experiments.

Spike 4: npm Package Scan

Observed package metadata during review:

Package Observed Version License Observed Role
better-sqlite3 12.10.0 MIT mature sync SQLite binding
proper-lockfile 4.1.2 MIT general mkdir lock utility
sqlite3 6.0.1 BSD-3-Clause async SQLite binding
sqlite 5.1.1 MIT TS wrapper/migration API
@sqlite.org/sqlite-wasm 3.53.0-build1 Apache-2.0 official SQLite Wasm packaging

The local npm emitted warnings that the repo's min-release-age / minimum-release-age configuration was unknown to the installed npm 11.6.2. Any package implementation should enforce dependency cool-off in CI or use an npm version that supports the intended setting.

Dataroom and LMDB Context

The current Dataroom and metabrowser work is useful background but should not define the generic package boundary.

Existing facts:

  • Dataroom stores URL/blob/action/file metadata in LMDB-backed indexes under room metadata directories.
  • The EIA Dataroom agent-access plan correctly treats Dataroom as the source of truth and avoids duplicating every source index into another YAML artifact.
  • The metabrowser Dataroom plugin shipped a generic LMDB reader and a format-specific Dataroom plugin layer.
  • The LMDB reader opens environments read-only with lock=False so viewers do not block the JS Dataroom writer.
  • The LMDB plugin is deliberately format-only; Dataroom decoding lives in the Dataroom plugin.

Why SQLite remains attractive:

  • More users and languages have SQLite tooling than LMDB tooling.
  • A zipped project with a SQLite database is easier to inspect with standard tools.
  • SQLite supports SQL indexes, FTS, JSON functions, views, migrations, and backup APIs.
  • A SQLite index can still be paired with an application-owned JSONL record stream for rebuildability.

Why SQLite is not an automatic replacement:

  • LMDB has excellent read performance and simple MVCC read behavior.
  • The existing Dataroom implementation already works.
  • SQLite write coordination must be tested carefully, especially on network filesystems.
  • Dataroom-specific schema and provenance semantics should stay in Dataroom, not in the generic SQLite wrapper.

Tentative Dataroom role: Use the generic package first as a lower-level primitive. If it succeeds, Dataroom can add a SQLite index backend behind a Dataroom-owned interface and compare it against LMDB on representative rooms.

Approach Comparison

Approach Strengths Weaknesses Tentative Fit
Raw SQLite, rollback journal portable, ordinary file, broad tooling, many readers and one writer depends on OS locks, no package-level session/backup guardrails good substrate, insufficient alone
Raw SQLite, WAL strong local reader/writer overlap, high performance, one writer not network-filesystem safe, extra files, checkpoint complexity opt-in local mode only
SQLite plus cooperative directory lock server-free, portable file, avoids flock as outer lock, package can own backups one package writer per database file; only protects cooperative clients; needs real FS validation best first package direction
SQLite plus local sidecar can centralize writes, retry conflicts, and own memory state starts becoming a server, more lifecycle complexity; SQLite still serializes final commits future option for high contention
better-sqlite3 wrapper mature sync Node API, backups, transactions, FTS native addon, no session API, awkward URI mode strong backend candidate
node:sqlite wrapper sessions, backup, no third-party native addon Node-version maturity, current local Node warns experimental best proof backend
proper-lockfile dependency mature mkdir lock utility, retries, mtime precision not SQLite-specific, limited owner protocol design reference or narrow primitive
SQLite unix-dotfile VFS SQLite-owned dotfile locking strategy Unix-only, exclusive-style, stale lock downsides experiment, not package default
SQLite nolock=1 under wrapper lock works when FS lock calls are broken corruption if any writer bypasses protocol; live readers need package coordination strict network-exclusive fallback
LMDB fast read store, already in Dataroom less transparent to ordinary tools, binding requirements keep for current Dataroom until SQLite proves out
DuckDB excellent analytics, local files not an OLTP app DB; multi-process writes need server/protocol not a replacement for this package
LiteFS / Litestream replication and backup around SQLite sidecars, ops model, not generic no-server local file locking useful adjacent patterns
rqlite / dqlite consensus replication, high availability server/cluster model, not plain direct file access out of scope for package
libSQL / Turso SQLite-compatible ecosystem, extensions, remote/local modes fork/rewrite direction, not the same low-level wrapper goal reference only

Tentative Recommendations

R1. Build a Small Standalone Package Spike

The package is viable enough to justify a focused spike. The first release should not be an ORM. It should be a coordination and safety wrapper for ordinary SQLite files.

Working scope:

  • openSqliteFile(path, options);
  • read(fn);
  • write(fn) with a synchronous transaction callback for sync backends;
  • optimisticWrite(fn, policy) for declared preconditions and retryable conflicts;
  • backup(path);
  • vacuumInto(path) if supported;
  • inspectLock();
  • breakStaleLock() with careful owner metadata;
  • backend capability reporting for sessions and backups.

R2. Default to WAL-Free Coordinated Mode

Default mode:

  • external directory lock for writes;
  • SQLite rollback journal;
  • synchronous=FULL;
  • BEGIN IMMEDIATE;
  • SQLite's native locks remain enabled;
  • package reads use ordinary SQLite read behavior and do not take the outer write lock;
  • package writes are serialized per database file, not per table;
  • busy timeout and bounded retry around readers that delay commit;
  • package-created backups use online backup API or VACUUM INTO.

This is the mode most likely to preserve ordinary SQLite behavior and cross-language readability. It is also the fairest WAL-free default: it admits many readers, one writer, and honest rollback-journal read/write overlap, while avoiding WAL sidecar and shared-memory requirements.

R3. Treat network-exclusive as a Separate Stricter Mode

Network-exclusive mode should be explicit. It may require:

  • package lock for reads and writes;
  • nolock=1 only after validation;
  • no raw live readers during writes;
  • no read/write overlap when native SQLite locks are disabled or untrusted;
  • read snapshots for uncoordinated tooling;
  • stronger warnings in docs and runtime diagnostics.

This mode is where the wrapper can target filesystems with missing or broken SQLite file locks, but only under full cooperative control.

R4. Expose Optimistic Retry as an Application-Level Feature

Optimistic concurrency should be a first-class optional API, not an implicit promise. The package can help callers:

  • read without holding the package write lock;
  • declare row, table, schema, or application preconditions;
  • compute candidate updates outside the lock;
  • revalidate inside a short write transaction;
  • apply SQL or a session changeset;
  • retry on SQLITE_BUSY or declared conflicts when the operation is idempotent.

This gives better throughput under moderate contention because expensive computation can happen outside the serialized write section. It does not provide simultaneous writers inside one SQLite file. The final commit is still one admitted writer per database file.

R5. Own the Lock Protocol

Do not make proper-lockfile the whole protocol. The package should define a lock directory like:

state.sqlite.lock/
  owner.json
  heartbeat

owner.json should include:

{
  "protocol": "sqlite-session-safe.lock.v1",
  "ownerId": "host:pid:random",
  "target": "state.sqlite",
  "mode": "write",
  "createdAt": "2026-06-06T00:00:00.000Z",
  "updatedAt": "2026-06-06T00:00:05.000Z",
  "process": {
    "pid": 12345,
    "hostname": "host.example"
  }
}

Take design cues from proper-lockfile:

  • atomic mkdir;
  • canonical path resolution;
  • mtime precision probing;
  • heartbeat updates;
  • stale threshold;
  • compromised-lock detection;
  • stress tests.

R6. Use node:sqlite for Proof, Evaluate better-sqlite3 for Release Stability

Recommended backend path:

  1. Prototype on node:sqlite to validate sessions, changesets, URI modes, and backups.
  2. Implement backend abstraction early so the package can support better-sqlite3 for stable Node versions.
  3. Mark changesets as optional capability until the release backend is chosen.
  4. If sessions become mandatory and node:sqlite remains too new, evaluate a small native binding or a better-sqlite3 fork/extension, but keep that out of v0 unless necessary.

R7. Make Backup APIs Central

The wrapper should be opinionated:

  • users should call backup(), vacuumInto(), or copyUnderLock();
  • raw copy of a live database is not the contract;
  • every backup helper should run PRAGMA integrity_check in tests;
  • if WAL is enabled, helper docs must include the sidecar files and checkpoint behavior.

R8. Publish a Cross-Language Protocol Before Promising Cross-Language Writes

Cross-language reads are easy: open the SQLite file normally. Cross-language writes require protocol compliance.

The package should publish:

  • lock directory layout;
  • owner metadata schema;
  • acquire/release algorithm;
  • stale break algorithm;
  • transaction mode expectations;
  • backup expectations;
  • unsupported cases.

Python and Rust adapters can follow after Node proves the protocol.

What the Package Can and Cannot Promise

Can promise, after validation:

  • ordinary SQLite file remains the data format;
  • package writers are serialized;
  • package write transactions are SQLite transactions;
  • package-created backups are consistent snapshots;
  • package docs explain which modes are tested on which filesystems;
  • raw read access is possible in supported modes.

Cannot honestly promise:

  • safety against raw writers that bypass the package;
  • correctness on a filesystem that lies about atomic mkdir, mtime, writes, rename, or sync;
  • no data loss beyond the current transaction if users copy the live database file with arbitrary backup software during active writes;
  • WAL safety over NFS;
  • thread safety if callers share non-serialized backend connections across Worker Threads.

Validation Matrix Needed

Minimum validation before treating the design as production-ready:

Area Tests
Local contention many Node processes, one writer at a time, no double ownership
Worker Threads process-local mutex prevents shared connection interleaving
Crash safety kill writer during transaction, reopen, integrity_check
Stale locks stale lock reclaimed without two breakers owning the lock
Compromised locks lock deletion/replacement while held aborts current operation
Backup backup during queued writes, backup integrity check, restore read
URI modes normal, nolock=1, vfs=unix-dotfile where supported
Filesystems APFS, ext4, NTFS, NFS, SMB if relevant, cloud-sync folder caveats
Cross-language Node writer plus Python read; later Python/Rust writer adapters
Path handling symlinks, spaces, Unicode paths if supported, Windows drive paths
Dependency policy lockfile, install-script policy, package age, native addon review

Open Questions

  1. Is node:sqlite mature enough for the first npm release target, or should v0 support better-sqlite3 first and make sessions optional?
  2. Should network-exclusive require all reads to acquire the package lock, or can package-managed snapshot reads be the only supported uncoordinated read path?
  3. Which rollback journal mode should be default: DELETE, TRUNCATE, or PERSIST? DELETE is simplest and most compatible; performance tradeoffs need measurement.
  4. Should the lock heartbeat update owner.json, a separate heartbeat file, or only directory mtime?
  5. Is a no-dependency lock core worth more than reusing proper-lockfile?
  6. Should the first public package include a CLI, or should CLI wait until the API is stable?
  7. Which real NFS/SMB/cloud-sync environments matter enough to include in CI or release qualification?

Methodology

Sources reviewed:

  • local repo docs and specs around Dataroom, metabrowser, file-backed access, and NFS failures;
  • official SQLite documentation on file format, locking, WAL, corruption, backup, sessions, pragmas, threading, and appropriate use;
  • local source review of sqlite/sqlite, better-sqlite3, and node-proper-lockfile;
  • local Node spikes against node:sqlite;
  • npm package metadata observed during the prior package scan;
  • adjacent-project review for SQLite replication, backup, forks, and alternative embedded databases.

Third-party source checkouts under attic/ were treated as untrusted source trees. They were read but not installed, built, tested, or executed.

Resource Links

Official SQLite References

Source-Code References Inspected Locally

Node and npm Package References

Adjacent Projects and Alternatives

Local Repo References

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment