karlhorky/fly-io-2025-newsletter.md

Created January 30, 2025 10:22

Fly.io January 2025 Newsletter - Managed PostgreSQL

Feature story: Managed Postgres and Why We're Building It After Saying We Never Would

If there's one product decision in the history of the company we often wish we could revisit, it's how we've handled databases. In 2025, one of the biggest things we're doing is reversing that decision. So, here in this January newsletter, let's read you in to our plan.

For most of the life of this company, we've offered "unmanaged" (or, more charitably, "automated") Postgres. That's a system in which we help you stand up a Postgres cluster, optimized in some ways by us for the Fly.io platform, and then leave you to operate and scale it. We call that feature "Fly Postgres". Sometime in the relatively near future, we'll be superseding it with a managed database, called Fly MPG.

Real quick, let's recap how we got here.

People who have been with us for a long time might remember, we launched Fly.io without persistent storage of any kind. Back in 2020, we were an "edge compute provider", and our platform was optimized for stateless workloads, like image accelerators; we were a "CDN for code".

All that changed starting in 2021, when two things happened: we launched persistent, directly attached storage, and Heroku launched the surface-to-air missile that blew up their free tier offerings. We quickly became part of an ecosystem of "Heroku alternatives", powered by a sugar rush of Heroku refugees eager to boot up full-stack apps on our platform.

What full-stack developers building on "Platform as a Service" providers want is a turnkey database, a thing they can light up and throw SQL queries at without thinking too much about it. What we had was persistent volumes, with which you could stand up your own database server. There is, obviously, a huge gap between those two things. We filled it with something weird.

The thing about the Fly.io team is, we are ourselves refugees from the managed database problem. Prior to Fly.io, many of us worked on a company called Compose.io (née MongoHQ), which did nothing but managed databases: first MongoDB, and then Postgres. That was, and still is, a whole company on its own. Not the company we were looking to build.

Our users wanted low-drama databases. The up-and-coming managed database companies had their hands full building their own platforms, and we were a bit young to partner with at the time. So instead we built "automated Postgres". Automated Postgres is Postgres that you run, manage, monitor, and scale, with assists from tooling we provide. It feels like a product feature (you create new databases with `flyctl postgres create`), and it sort of is, but under the hood it's not doing anything you couldn't do yourself.

This made sense to us at the time (it still does make some sense). We're building a whole public cloud here, on our own hardware, worldwide; that's a product direction that comes with a metric crapload of hard engineering problems. It did not seem reasonable to add to that all the problems we knew came with shipping a serious managed database.

The trap we set for ourselves is glaringly visible in retrospect.

For a full-stack developer booting up a new app on Fly.io, Fly Postgres, our automated solution, had mostly the same UX/DX affordances as managed Postgres. To know it wasn't managed, you'd have to read the documentation, and for modern developers, documentation might as well be fine print. We set RDS expectations with something that was essentially just a bunch of scripts and flyctl Go code. When we couldn't meet managed database expectations, users were deeply disappointed.

We've been in the process of course-correcting this for a year now. The fact pattern that led us to do "automated" instead of "managed" Postgres is still there: we're a small team building an extraordinarily ambitious platform, and managed databases are a full-time problem of their own. For a time, we expected to solve this by partnering with friends at Supabase. But we're not announcing Managed Supabase here, so: clearly that's not happening.

The Supabase story is pretty straightforward. Getting a database running on Fly.io is a Fun challenge that comes with a bunch of engineering tasks: managing multi-node multi-region clusters, dealing with attached storage, doing coordination. These things significantly complicate how you manage and monitor and recover databases. Supabase "natively" runs, for its own "direct" customers, on simpler public clouds where this work is less Fun.

There's an upside to all that Fun! You get highly-scalable multi-region global Postgres databases! But that's not really the core business Supabase is in. They're a Firebase alternative. Postgres is just a detail for them. Making a really, really good Postgres for Fly.io doesn't move their core business forward. And bridging Fly.io's platform to meet Supabase Postgres where it already is doesn't really move our business forward.

So, for the past several months, we've been building our own managed Postgres, Fly MPG.

Two big things make this tractable for us now, where it wasn't back in 2022.

First: last year, we shipped FKS, the Fly Kubernetes Service. We did FKS because we have customers that need K8s interfaces to manage their app stacks, and we did FKS because we discovered a really slick way to map K8s onto Fly Machines; we were nerdsniped. Either way, it's been a good thing, and so now we have K8s, and much of the K8s ecosystem, in our toolbelt. You'll read more about this when we get around to writing up Pilot, our new init.

Second: we have financial resources we didn't have back then, and can buy some of Fly MPG instead of building it. Specifically: we've built Fly MPG on Percona, with their backing and support. By running on Percona's K8s operator infrastructure, we get to inherit all the work they do on deployment, dynamic scaling, monitoring, backup, and cluster management.

So what are we going for here? If we pull this off, you'll get a Postgres, and it'll just… continue working for you. And if it blows up because of entropy or hardware failures (or your own fumble fingers), we want to notice before you do and recover everything. That's it.

It has been a long time coming, but after biting this bullet, we're finally closing in on a point where we can offer new applications instant access to a Postgres database that doesn't need developer babysitting. And, yes, this is a table-stakes feature for a lot of other platforms. But rolling this out on Fly.io and Fly Machines has us excited for idiosyncratically Fly.io reasons, not just because we can make these databases globally fast, but because we can integrate them deeply into our platform and security logic. You'll probably see it early this year from us, but at some point, we're going to make an LLM pull a database out of its butt, and it's going to just work and not be a horrifying security liability. Fun!

Anyways, if you're interested in Fly MPG, it'll be ready to play with within the next month-ish. We read replies to this newsletter! Hit us up if you're interested in early access.