Note
An experimental alternative to the Git Smart HTTP protocol, not a replacement. Both protocols can coexist, sharing the same backing object store.
Instead of packfiles, clients sync individual git objects over WebSocket. The server becomes a thin proxy over any content-addressable object store — no git library needed server-side.
| Smart HTTP (current) | WebSocket (proposed) | |
|---|---|---|
| Server memory | O(repo size) — full packfile in RAM | O(1) — one object at a time |
| Server CPU | Delta compression/decompression | Near zero |
| Server code | Full git protocol library | Small handler + WebSocket lib |
| Push latency | 2 HTTP requests + pack assembly | Continuous stream, bounded by wavefront depth |
| Resumability | Retry entire packfile | Resume from where it stopped |
A git remote helper named
git-remote-wsgit teaches git the wsgit:// URL scheme. The URL includes the
hostname so the protocol is not tied to any single provider:
git remote add origin wsgit://git.example.com/my-team/my-project
git push origin main
git clone wsgit://git.example.com/my-team/my-projectThe remote helper resolves the URL to a WebSocket connection:
wsgit://git.example.com/my-team/my-project
→ wss://git.example.com/repos/my-team/my-project/push
→ wss://git.example.com/repos/my-team/my-project/fetch
Auth is passed as a bearer token during the HTTP upgrade handshake. How the token is obtained is provider-specific (environment variable, credential helper, config file, etc.).
Two WebSocket endpoints. Frame types — binary for data, text for JSON control messages.
Binary frames (object):
[type byte][20-byte SHA-1][zstd-compressed body]
Type bytes: 1=commit 2=tree 3=blob 4=tag 5=delta
Body: canonical git object (type 1-4)
or [20-byte base SHA-1][delta instructions] (type 5)
Binary frames (want):
[20-byte SHA-1]×N (concatenated hashes)
Text frames (JSON control):
Client → Server: {"id": 1, "ref": "refs/heads/main", "new": "def..."}
Server → Client: {"id": 1, "status": "done", ...} or {"id": 1, "status": "error", ...}
- Client sends a control message declaring the ref update and target hash
- Server seeds an expect set with the target hash
- Client streams objects as binary frames (depth-first order)
- For each object, the server:
- Validates the 20-byte hash prefix is in the expect set (fast reject if not)
- Verifies content hash matches the declared hash (abort on mismatch)
- Stores the object
- Parses to discover children, checks store for each, adds missing to expect set
- When expect set is empty, update the ref based on mode:
- Default: verify new commit is a descendant of current ref, then conditional put
force: true: skip ancestry check, unconditional putoldprovided: conditional put requiring ref equalsold(force-with-lease)
Important
The server never stores an unexpected object. The expect set is derived entirely from the declared commit graph. No orphans, no garbage collection needed.
Tip
On conflict or non-fast-forward rejection, the client can retry on the same connection with a new control message. All objects are already stored.
Multiple ref updates can be in flight on one connection. Each has a unique
client-assigned id. Objects are routed to the queue containing their hash.
Shared objects (branch + tag pointing to same commit) only need to be sent once.
Example: push branch + tag concurrently
Client Server
────── ──────
{"id":1,"ref":"refs/heads/main",
"new":"def456"} [stream 1 expect: {def456}]
{"id":2,"ref":"refs/tags/v1.0",
"new":"def456"} [stream 2 expect: {def456}]
[def456][commit] ───────────────► [store, parse → tree aaa]
[stream 1: {aaa}]
[stream 2: complete → update tag]
◄──────────────── {"id":2,"status":"done"}
[aaa][tree] ────────────────────► [store, parse → bbb,ccc,ddd]
[ccc exists in store]
[stream 1: {bbb, ddd}]
[bbb][blob] ────────────────────► [store, stream 1: {ddd}]
[ddd][blob] ────────────────────► [store, stream 1: empty]
[update ref main → def456]
◄──────────────── {"id":1,"status":"done"}
Binary frames use the same format as push (see above), with directions reversed:
Client → Server: want frames (concatenated hashes)
Server → Client: object frames ([type][hash][body])
Text frames (JSON control):
Client → Server: {"id": 1, "ref": "refs/heads/main"} (ref prefix filter)
Server → Client: {"id": 1, "status": "refs", "refs": {...}}
Client → Server: {"id": 1, "status": "done"} (client finished)
- Client sends a control message with a ref prefix filter
- Server resolves matching refs → sends
refscontrol message - Client compares against local state, sends binary want frames
- Server pipelines object responses (hash prefix + zstd content)
- Client parses objects, discovers more needs, sends more wants
- Client sends
donewhen finished
Example: pull latest
Client Server
────── ──────
{"id":1,"ref":"refs/heads/main"}
◄──────────────── {"id":1,"status":"refs",
"refs":{"refs/heads/main":"def456..."}}
[def456] (20 bytes) ───────────►
◄──────────────── [def456][commit]
[aaa] (20 bytes) ──────────────►
◄──────────────── [aaa][tree]
[bbb|ddd] (40 bytes) ──────────►
◄──────────────── [bbb][blob]
◄──────────────── [ddd][blob]
{"id":1,"status":"done"} ──────►
The client streams objects as it walks its own graph. The server checks its object store for each child and only requests what's missing — no need to assemble or unpack a complete packfile on either side.
Tip
Optimizing existence checks: At connection start, the server can list all objects for the repo and build an in-memory hash set. Until the listing completes, individual existence checks cover the first few objects. Once the set is ready, all subsequent checks are local. This avoids sending unnecessary "want" frames to clients on slow connections.
Important
Smart HTTP fetch has a hidden latency cost: the server reads ALL objects, computes delta chains, and assembles a complete packfile before sending byte one. For large repos this takes seconds. This protocol starts sending immediately.
Latency is bounded by object graph depth (typically 3-5 levels), not object count. Each level of the tree fans out into parallel store lookups. Exact numbers depend on the client's connection and store response times — worth measuring with a prototype.
The remote helper uses git's import/export capability, which provides raw
uncompressed data — the protocol is free to choose any wire compression. zstd is
faster at both compression and decompression while achieving better ratios than
zlib.
Objects are stored individually as they arrive. If a push fails mid-stream, the client reconnects and re-sends the control message. The server walks the graph, finds most objects already present, and only needs the remainder.
Binary frames may carry a delta instead of a full object, using type byte
5 in the standard frame format. The zstd-compressed body contains the 20-byte
base hash followed by delta instructions (git's REF_DELTA copy/insert format).
All implementations must decode deltas; creating them is optional.
When a receiver gets a delta frame:
- Check if the base object is available locally (object store or local repo)
- Yes → apply delta, store/process the resulting full object
- No → buffer the delta, request the base via a want frame
The base always exists somewhere — it's from a previous commit. On push, the server reads it from its own store. On fetch, the client either has it locally or requests it, adding one extra wavefront generation.
Deltas trade bandwidth for wavefront depth. Encoders should only emit a delta when the savings are substantial — for example, a large tree object where a single entry changed between commits. Heuristics:
- Skip deltas for small objects (the overhead isn't worth the extra round trip)
- Skip deltas where most of the content changed (the delta is barely smaller)
- Good candidates: tree objects at the same path in parent vs child commit, where typically one entry was added, removed, or updated
Evolved from the lit package manager protocol (WebSocket object sync over a content-addressable store). Lit demonstrated faster package transfers than npm in production despite running on a single server vs npm's CDN.
The protocol assumes a server backed by:
- A content-addressable object store — any store that can put/get/check by SHA-1 hash (S3, GCS, Azure Blob, local filesystem, etc.)
- A ref store with conditional writes — any store supporting compare-and-swap on key-value pairs (DynamoDB, etcd, PostgreSQL, Redis, etc.)
The server needs no git library. Object parsing is limited to extracting child hashes from commits (parent + tree) and trees (entry hashes) — roughly 50 lines of code in any language.