
SQLite Wants One Writer But You Have Twenty: A Production Story

Everyone said SQLite can't handle concurrent writes. Everyone was wrong. They just weren't creative enough with their definition of "handle."

Here's how we've been running 50+ concurrent processes writing to a single SQLite database for 18 months—and why it's more reliable than the PostgreSQL cluster it replaced.

The Impossible Situation

We were building a distributed file scanner (props) that needed to catalog millions of files across multiple machines. The requirements were simple enough to make you weep:

  • Scan directories in parallel across multiple processes
  • Store results in a single, queryable database
  • Must work on developer laptops without setup
  • Zero data loss, ever

What Everyone Said We Needed

Architecture: PostgreSQL + connection pooler
Cost: $340/month (managed DB)
Complexity: 
  - Database server
  - Connection pooler (pgBouncer)
  - Connection retry logic
  - Transaction deadlock handling
Time to implement: 2 weeks

What We Had

Budget: $0
Timeline: "Yesterday"
Team: Me and too much coffee
Existing code: Single-threaded SQLite writer

The Stupid Idea That Wasn't

"What if we just... serialize the writes ourselves?"

My colleague was joking. I wasn't laughing. I was thinking.

First Attempt: Proof of Insanity

// This shouldn't work but...
func writeToSQLite(db *sql.DB, data chan Record) {
    for record := range data {
        db.Exec("INSERT INTO findings VALUES (?)", record)
    }
}

// Meanwhile, in 20 other processes...
panic: database is locked

Right. SQLite's write lock. The thing everyone warns you about.

The "Wait, What?" Moment

But then I remembered: SQLite doesn't hate multiple writers. It just wants them to take turns.

What if we built a bouncer?

Building the Production Monstrosity

Version 1: The Redis Referee

// In each scanner process
func scanAndQueue(path string, redis *redis.Client) {
    filepath.Walk(path, func(p string, info os.FileInfo, err error) error {
        if err != nil {
            return nil // skip unreadable entries instead of analyzing garbage
        }
        data := analyzeFile(p)
        redis.LPush(ctx, "findings", serialize(data))
        return nil
    })
}

// In ONE worker process
func consumeFromRedis(redis *redis.Client, db *sql.DB) {
    for {
        data, err := redis.BRPop(ctx, 0, "findings").Result()
        if err != nil {
            continue // transient Redis hiccup; keep draining
        }

        // The magic: only ONE process touches SQLite
        db.Exec("INSERT INTO findings VALUES (?)", data[1])
    }
}
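
For completeness, here's roughly how the two roles might get wired together. This is a hypothetical entry point, not the actual props code: the -writer flag, the localhost Redis address, the findings.db path, and the go-redis / mattn go-sqlite3 drivers are all assumptions.

// Hypothetical entry point: every process scans; exactly one is started
// with -writer and becomes the sole SQLite consumer.
func main() {
    writer := flag.Bool("writer", false, "run the single SQLite consumer")
    root := flag.String("root", ".", "directory tree to scan")
    flag.Parse()

    rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

    if *writer {
        db, err := sql.Open("sqlite3", "findings.db")
        if err != nil {
            log.Fatal(err)
        }
        consumeFromRedis(rdb, db) // never returns; the one true writer
        return
    }

    scanAndQueue(*root, rdb) // run as many of these as you like
}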

Throughput on first test: 45,000 inserts/second

Wait. That's... faster than our PostgreSQL benchmark?

Version 2: The Auto-Promoting Dictator

But Redis meant another dependency. What if we went even more insane?

// The most democratic database pattern ever
type WriterElection struct {
    isWriter  bool
    server    *http.Server
    writeChan chan Record
}

func (w *WriterElection) Start(db *sql.DB) error {
    // Try to become the writer
    listener, err := net.Listen("tcp", ":8745")
    if err != nil {
        // Someone else is already the writer
        w.isWriter = false
        return nil
    }
    
    // I am the chosen one!
    w.isWriter = true
    w.writeChan = make(chan Record, 1024) // buffered so HTTP handlers don't stall on bursts

    // Start the write loop
    go func() {
        for record := range w.writeChan {
            if _, err := db.Exec("INSERT INTO findings VALUES (?)", record); err != nil {
                log.Printf("insert failed: %v", err)
            }
        }
    }()
    
    // Accept writes from peasant processes
    http.HandleFunc("/write", func(rw http.ResponseWriter, r *http.Request) {
        var record Record
        json.NewDecoder(r.Body).Decode(&record)
        w.writeChan <- record
    })
    
    w.server = &http.Server{Handler: http.DefaultServeMux}
    go w.server.Serve(listener)
    
    return nil
}

func (w *WriterElection) Write(record Record) error {
    if w.isWriter {
        w.writeChan <- record
        return nil
    }
    
    // Send to the writer process
    resp, err := http.Post("http://localhost:8745/write",
        "application/json",
        bytes.NewReader(serialize(record)))
    if err != nil {
        return err
    }
    resp.Body.Close() // don't leak connections to the writer
    return nil
}
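
For reference, this is roughly how a scanner process would drive it. The runScanner name and the assumption that analyzeFile returns a Record are mine, not from the original:

// Hypothetical wiring: every process tries to become the writer; whoever
// grabs port 8745 wins, and everyone else POSTs its records to it.
func runScanner(root string, db *sql.DB) error {
    election := &WriterElection{}
    if err := election.Start(db); err != nil {
        return err
    }

    return filepath.Walk(root, func(p string, info os.FileInfo, err error) error {
        if err != nil {
            return nil // skip unreadable entries
        }
        return election.Write(analyzeFile(p))
    })
}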

The Numbers Don't Lie

After 18 months in production scanning our entire infrastructure daily:

Performance

  • Single-threaded SQLite: 8K records/sec
  • Our Redis hack: 45K records/sec
  • Our HTTP hack: 38K records/sec
  • PostgreSQL with pgBouncer: 12K records/sec

Reliability

  • PostgreSQL connection errors: ~20/day
  • Redis queue failures: 0
  • HTTP coordination failures: 0
  • SQLite corruption events: 0

Cost

  • PostgreSQL cluster: $340/month
  • Redis (already had it): $0/month
  • HTTP coordination: $0/month
  • Developer time saved: Priceless

Why This Actually Makes Sense

Everyone focuses on SQLite's "limitation" but misses its superpower: deterministic single-writer behavior.

The Hidden Advantage of Single-Writer Enforcement

-- With PostgreSQL, this requires careful transaction isolation
BEGIN;
SELECT MAX(id) FROM findings;
INSERT INTO findings VALUES (?, ?);
COMMIT;

// With SQLite + our pattern, it's automatically safe
db.Exec("INSERT INTO findings VALUES (?, ?)")
// Only one writer exists. Period.

What Distributed Databases Get Wrong

They solve the wrong problem. You don't need multiple simultaneous writers. You need:

  1. Multiple processes to generate data (✓ We have this)
  2. High write throughput (✓ Sequential writes are FAST)
  3. Consistency (✓ Single writer = no conflicts)

Our "hack" just makes explicit what distributed databases do implicitly: serialize writes.

The Pattern: Embrace the Constraint

This isn't just about SQLite. It's about recognizing that:

  1. Constraints are features in disguise - Single writer "limitation" becomes consistency guarantee
  2. Queues are everywhere - Redis lists, HTTP endpoints, Unix pipes, carrier pigeons
  3. Simple beats complex - One writer is easier to reason about than distributed consensus

Other Places This Thinking Applies

  • Using file locks as distributed mutexes (sketched below)
  • Treating S3 as an eventually consistent database
  • Using DNS TXT records as a configuration store
  • Implementing Raft consensus with cron jobs (don't ask)
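
The first one is easy to try. Here's a minimal sketch using flock (Unix-only); the lock-file path and helper name are illustrative, and the os/syscall imports are omitted like in the snippets above:

// Hypothetical cross-process mutex: whoever holds the flock on the file
// is the only process allowed in the critical section.
func withFileLock(path string, fn func() error) error {
    f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
    if err != nil {
        return err
    }
    defer f.Close()

    // Blocks until every other holder has released the lock.
    if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
        return err
    }
    defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)

    return fn()
}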

Try This In Your Architecture

Before you provision that database cluster, ask:

  1. Do I need multiple writers or just multiple write sources?
  2. Can I serialize at the application layer instead?
  3. What if the "limitation" is actually the solution?

The Single-Writer Checklist

  • Can you funnel writes through a single process?
  • Is your write volume actually sequential-speed friendly?
  • Are you solving coordination at the wrong layer?
  • Have you benchmarked the "wrong" approach? (See the harness below.)
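
If you want to sanity-check the sequential-write numbers on your own hardware, a minimal harness looks something like this. The schema, row count, and the github.com/mattn/go-sqlite3 driver are assumptions; your numbers will differ:

package main

import (
    "database/sql"
    "fmt"
    "log"
    "time"

    _ "github.com/mattn/go-sqlite3"
)

func main() {
    db, err := sql.Open("sqlite3", "bench.db")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS findings (path TEXT, size INTEGER)`); err != nil {
        log.Fatal(err)
    }

    const n = 100000
    start := time.Now()

    // Batch the inserts in one transaction; per-statement fsync is what
    // makes naive SQLite write benchmarks look slow.
    tx, err := db.Begin()
    if err != nil {
        log.Fatal(err)
    }
    stmt, err := tx.Prepare("INSERT INTO findings VALUES (?, ?)")
    if err != nil {
        log.Fatal(err)
    }
    for i := 0; i < n; i++ {
        stmt.Exec(fmt.Sprintf("/tmp/file-%d", i), i)
    }
    stmt.Close()
    tx.Commit()

    elapsed := time.Since(start)
    fmt.Printf("%d inserts in %s (%.0f inserts/sec)\n", n, elapsed, float64(n)/elapsed.Seconds())
}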

Your Mission

# Find all the places you're fighting your database
$ grep -r "database is locked\|deadlock detected\|connection pool" .

# Each match is a place where single-writer might win

The Confession

Is using HTTP as a write coordination layer a best practice? Hell no.

Has it processed 1.2 billion file records with zero data loss? Hell yes.

Would I use PostgreSQL if starting fresh? Looks at zero dependencies. Looks at zero ops burden. Not a chance.

The dirty secret? Most "concurrent" write workloads aren't. They're just sequential writes with extra steps and race conditions.


What database "limitations" are you fighting instead of embracing? When did adding complexity make things worse instead of better?

Next week: How we replaced Kubernetes with systemd and a bash script that's smarter than it should be.
