@jbreckmckye
Last active November 25, 2025 02:42

The Cloudflare outage was a good thing

Cloudflare, the CDN provider, suffered a massive outage today. Some of the world's most popular apps and web services were left inaccessible for several hours whilst the Cloudflare team scrambled to fix a whole swathe of the internet.

And that might be a good thing.

The proximate cause of the outage was pretty mundane: a bad config file triggered a latent bug in one of Cloudflare's services. The file was too large (details still hazy) and this led to a cascading failure across Cloudflare operations. There is probably a useful post-mortem to be had about canary releases and staged rollouts.

But the bigger problem, the ultimate cause behind today's chaos, is the creeping centralisation of the internet and a society that is sleepwalking into assuming the net is always on and always working.

It's not just "trivial" stuff like Twitter and League of Legends that was affected, either. A friend of mine remarked caustically about his experience this morning:

I couldn't get air for my tyres at two garages because of cloudflare going down. Bloody love the lack of resilience that goes into the design when the machine says "cash only" and there's no cash slot. So flat tires for everyone! Brilliant.

We are living in a society where every part of our lives is increasingly mediated through the internet: work, banking, retail, education, entertainment, dating, family, government ID and credit checks. And the internet is increasingly tied up in fewer and fewer points of failure.

It's ironic because the internet was actually designed for decentralisation, a system that governments could use to coordinate their response in the event of nuclear war. But due to the economics of the internet, and the challenges of things like bots and scrapers, more and more web services are holed up in citadels like AWS or behind content distribution networks like Cloudflare.

Outages like today's are a good thing because they're a warning. They can force redundancy and resilience into systems. They can make the pillars of our society - governments, businesses, banks - provide reliable alternatives when things go wrong.

(Ideally ones that are completely offline)

You can draw a parallel to how COVID-19 shook up global supply chains: the logic up until 2020 was that you wanted your system to be as lean and efficient as possible, even if it meant relying totally on international supplies or keeping as little spare inventory as possible. After 2020 businesses realised they needed to diversify and build slack in the system to tolerate shocks.

In the same way that growing one kind of banana nearly resulted in bananas going extinct, we're drifting towards a society that can't survive without digital infrastructure, and a digital infrastructure that can't operate without two or three key players. One day there's going to be an outage, a bug, or a cyberattack from a hostile state that demonstrates how fragile that system is.

Embrace outages, and build redundancy.

@fenix1851

Such outages are like a flu shot: nobody really knows what happens inside huge systems like Cloudflare or AWS. One small detail can cause a chain reaction in several major services that the internet relies on. When you add the lower cost of code generation and the fact that more than 50% of internet traffic already comes from botnets, the future of the internet looks unstable. Incidents like the ones with Cloudflare or AWS need more active preparation.

@bchewy

bchewy commented Nov 24, 2025

W gamer

@pabloko

pabloko commented Nov 24, 2025

GG noobs, every weekend when there's a football match Cloudflare is gone in Spain and nothing happens

@dennisvexnl

Resilience is a matter of adopting the right strategy, but that strategy can be complex and/or cost (a lot of) money.
We're addicted to the ease of use of bugtech (see what I did there), paying a small monthly fee for it. On a larger scale this is decided at C-level: everything gets dumped onto SaaS platforms, which in the end rely on the same pillars.
This trickles down into our (semi-)government and blue-light services as well.
Fact is that if AWS, Azure and Cloudflare were to kick the bucket at the same time, most of western civilization would come to a halt in a matter of hours, but as long as the convenience of these platforms outweighs the downsides nothing is going to change... sadly.
The only thing we can do is work out for ourselves how resilient we are to losing these services. Try not using your phone for a day or two, or simulate not having access to any digital assets in an emergency situation.
If we become more resilient ourselves, we are better able to help others achieve this as well.
