Everyone said SQLite can't handle concurrent writes. Everyone was wrong. They just weren't creative enough with their definition of "handle."
Here's how we've been running 50+ concurrent processes writing to a single SQLite database for 18 months—and why it's more reliable than the PostgreSQL cluster it replaced.
We were building a distributed file scanner that needed to catalog millions of files across multiple machines. The requirements were simple enough to make you weep:
- Scan directories in parallel across multiple processes
- Store results in a single, queryable database
- Must work on developer laptops without setup
- Zero data loss acceptable
The "right" way:
- Architecture: PostgreSQL + connection pooler
- Cost: $340/month (managed DB)
- Complexity: database server, connection pooler (pgBouncer), connection retry logic, transaction deadlock handling
- Time to implement: 2 weeks

What we actually had:
- Budget: $0
- Timeline: "Yesterday"
- Team: Me and too much coffee
- Existing code: Single-threaded SQLite writer
"What if we just... serialize the writes ourselves?"
My colleague was joking. I wasn't laughing. I was thinking.
```go
// This shouldn't work but...
func writeToSQLite(db *sql.DB, data chan Record) {
	for record := range data {
		db.Exec("INSERT INTO findings VALUES (?)", record)
	}
}
```

Meanwhile, in 20 other processes:

```
panic: database is locked
```
Right. SQLite's write lock. The thing everyone warns you about.
But then I remembered: SQLite doesn't hate multiple writers. It just wants them to take turns.
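SQLite even ships with knobs for exactly this kind of turn-taking, worth knowing before you build anything on top. As an aside (the values below are illustrative, not what we ran): WAL journal mode lets readers proceed while a writer works, and `busy_timeout` makes a blocked writer wait for the lock instead of failing immediately with "database is locked".

```sql
-- Applied once per connection at startup (values are illustrative)
PRAGMA journal_mode = WAL;   -- readers no longer block on the writer
PRAGMA busy_timeout = 5000;  -- wait up to 5s for the write lock instead of erroring
```

These help, but they only make contention politer. We wanted to remove contention entirely.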
What if we built a bouncer?
```go
// In each scanner process
func scanAndQueue(ctx context.Context, path string, rdb *redis.Client) {
	filepath.Walk(path, func(p string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		data := analyzeFile(p)
		rdb.LPush(ctx, "findings", serialize(data))
		return nil
	})
}

// In ONE worker process
func consumeFromRedis(ctx context.Context, rdb *redis.Client, db *sql.DB) {
	for {
		data, err := rdb.BRPop(ctx, 0, "findings").Result()
		if err != nil {
			continue // transient Redis hiccup; try again
		}
		// The magic: only ONE process touches SQLite
		db.Exec("INSERT INTO findings VALUES (?)", data[1])
	}
}
```
Throughput on first test: 45,000 inserts/second
Wait. That's... faster than our PostgreSQL benchmark?
But Redis meant another dependency. What if we went even more insane?
```go
// The most democratic database pattern ever
type WriterElection struct {
	isWriter  bool
	server    *http.Server
	writeChan chan Record
}

func (w *WriterElection) Start(db *sql.DB) error {
	// Try to become the writer
	listener, err := net.Listen("tcp", ":8745")
	if err != nil {
		// Someone else is already the writer
		w.isWriter = false
		return nil
	}

	// I am the chosen one!
	w.isWriter = true
	w.writeChan = make(chan Record, 1024) // buffered so bursts don't block HTTP handlers

	// Start the write loop
	go func() {
		for record := range w.writeChan {
			db.Exec("INSERT INTO findings VALUES (?)", record)
		}
	}()

	// Accept writes from peasant processes
	mux := http.NewServeMux()
	mux.HandleFunc("/write", func(rw http.ResponseWriter, r *http.Request) {
		var record Record
		if err := json.NewDecoder(r.Body).Decode(&record); err != nil {
			http.Error(rw, err.Error(), http.StatusBadRequest)
			return
		}
		w.writeChan <- record
	})
	w.server = &http.Server{Handler: mux}
	go w.server.Serve(listener)
	return nil
}

func (w *WriterElection) Write(record Record) error {
	if w.isWriter {
		w.writeChan <- record
		return nil
	}
	// Send to the writer process as JSON, matching what /write decodes
	body, err := json.Marshal(record)
	if err != nil {
		return err
	}
	resp, err := http.Post("http://localhost:8745/write",
		"application/json",
		bytes.NewReader(body))
	if err != nil {
		return err
	}
	resp.Body.Close()
	return nil
}
```
After 18 months in production scanning our entire infrastructure daily:
Throughput:
- Single-threaded SQLite: 8K records/sec
- Our Redis hack: 45K records/sec
- Our HTTP hack: 38K records/sec
- PostgreSQL with pgBouncer: 12K records/sec

Reliability:
- PostgreSQL connection errors: ~20/day
- Redis queue failures: 0
- HTTP coordination failures: 0
- SQLite corruption events: 0

Cost:
- PostgreSQL cluster: $340/month
- Redis (already had it): $0/month
- HTTP coordination: $0/month
- Developer time saved: Priceless
The Hidden Advantage of Single-Writer Enforcement

Everyone focuses on SQLite's "limitation" but misses its superpower: deterministic single-writer behavior.
```sql
-- With PostgreSQL, this read-then-insert needs careful transaction isolation
BEGIN;
SELECT MAX(id) FROM findings;
INSERT INTO findings VALUES (?, ?);
COMMIT;
```

```go
// With SQLite + our pattern, it's automatically safe:
// only one writer exists. Period.
db.Exec("INSERT INTO findings VALUES (?, ?)")
```
Multi-writer database setups solve the wrong problem. You don't need multiple simultaneous writers. You need:
- Multiple processes to generate data (✓ We have this)
- High write throughput (✓ Sequential writes are FAST)
- Consistency (✓ Single writer = no conflicts)
Our "hack" just makes explicit what distributed databases do implicitly: serialize writes.
This isn't just about SQLite. It's about recognizing that:
- Constraints are features in disguise: the single-writer "limitation" becomes a consistency guarantee
- Queues are everywhere: Redis lists, HTTP endpoints, Unix pipes, carrier pigeons
- Simple beats complex: one writer is easier to reason about than distributed consensus
The same mindset has since taken us to stranger places:
- Using file locks as distributed mutexes
- Treating S3 as an eventually consistent database
- Using DNS TXT records as a configuration store
- Implementing Raft consensus with cron jobs (don't ask)
Before you provision that database cluster, ask:
- Do you need multiple writers, or just multiple write sources?
- Can you serialize at the application layer instead?
- What if the "limitation" is actually the solution?
- Can you funnel writes through a single process?
- Is your write volume actually sequential-speed friendly?
- Are you solving coordination at the wrong layer?
- Have you benchmarked the "wrong" approach?
```shell
# Find all the places you're fighting your database
$ grep -r "database is locked\|deadlock detected\|connection pool" .
# Each match is a place where single-writer might win
```
Is using HTTP as a write coordination layer a best practice? Hell no.
Has it processed 1.2 billion file records with zero data loss? Hell yes.
Would I use PostgreSQL if starting fresh? Looks at zero dependencies. Looks at zero ops burden. Not a chance.
The dirty secret? Most "concurrent" write workloads aren't. They're just sequential writes with extra steps and race conditions.
What database "limitations" are you fighting instead of embracing? When did adding complexity make things worse instead of better?
Next week: How we replaced Kubernetes with systemd and a bash script that's smarter than it should be.