Internet chaos as Cloudflare goes down.

Andy_Ross · Nov 18, 2025

About 20% of websites down, including X and ChatGPT

Cloudflare apologises for outage which took down X and ChatGPT

"We apologise to our customers and the Internet in general" the web infrastructure company said.

www.bbc.co.uk

Wudang · Nov 18, 2025

I bet it's DNS again. It's way more complicated than I could hope to avoid having to explain.

Klimax · Nov 18, 2025

Wudang said:
I bet it's DNS again. It's way more complicated than I could hope to avoid having to explain.

Maybe yes + bad config in their servers. I am seeing CORS failure on their crapchallenge pages. Frankly, good thing very few sites in my country use them. (And AFAIK no eshop, CF's incompetency with their "protection" would be costly as is, with outages like this, it could be downright company-ending)

alfaniner · Nov 18, 2025

X and Chat GPT

Wow, those two are biggies. I have another app or two which are having issues. Here now a company I wasn't aware of is responsible for 20% of internet traffic? That's still why I don't believe in keeping anything personal (like pictures or documents) in storage on the web, unless it's only a backup.

JayUtah · Nov 18, 2025

alfaniner said:
Wow, those two are biggies. I have another app or two which are having issues.

Apollohoax.net is down too. The webmaster there recently added a Cloudflare human authenticator to cut down on bot traffic.

Klimax said:
...CF's incompetency with their "protection" would be costly as is, with outages like this, it could be downright company-ending)

Indeed. My friend runs an ISP that offers boutique hosting. It's where I host my Apollo website. Some of his customers have 15-minute SLAs. That means that the site he hosts for them can be unavailable for at most 15 minutes for pretty much any reason, after which he has to pay them $100 a minute until service is restored. When their service fails, it's an all-hands-on-deck exercise.

Dabop · Nov 18, 2025

Andy_Ross said:
About 20% of websites down, including X and ChatGPT

Cloudflare apologises for outage which took down X and ChatGPT

"We apologise to our customers and the Internet in general" the web infrastructure company said.

www.bbc.co.uk

So no great loss then....

In fact a major improvement on the overall factual accuracy of the internet with those two gone....
Can we make it permanent????

Steve · Nov 18, 2025

Dabop said:
So no great loss then....

In fact a major improvement on the overall factual accuracy of the internet with those two gone....
Can we make it permanent????

No loss at all that I can see. Has zero effect on any of my online activity.

catsmate · Nov 18, 2025

Eggs <-> Basket

arthwollipot · Nov 19, 2025

junkshop · Nov 19, 2025

arthwollipot said:
View attachment 66154

I was going to give this one of those emoji/smiley reaction things, but I couldn't pick between 'ha ha' and 'wow', and what I really wanted was an 'oh, we're ◊◊◊◊◊◊', which we don't have (also: I'm not sure how that could be rendered in emoji form, not without giving poor Otto a stroke, anyway).
In the end I posted this rambling bollocks instead.

arthwollipot · Nov 19, 2025

junkshop said:
I was going to give this one of those emoji/smiley reaction things, but I couldn't pick between 'ha ha' and 'wow', and what I really wanted was an 'oh, we're ◊◊◊◊◊◊', which we don't have (also: I'm not sure how that could be rendered in emoji form, not without giving poor Otto a stroke, anyway).
In the end I posted this rambling bollocks instead.

Worth it.

arthwollipot · Nov 20, 2025

An update:

lauwersw · Nov 20, 2025

It was not DNS this time. Nor BGP.

Cloudflare broke the internet with a bad DB query

Thought it was the victim of a ‘hyper-scale DDoS attack’ before finding the fix

www.theregister.com

catsmate · Nov 20, 2025

lauwersw said:
It was not DNS this time. Nor BGP.

Cloudflare broke the internet with a bad DB query

Thought it was the victim of a ‘hyper-scale DDoS attack’ before finding the fix

www.theregister.com

So incompetence then.

arthwollipot · Nov 20, 2025

catsmate said:
So incompetence then.

Isn't it always?

catsmate · Nov 20, 2025

arthwollipot said:
Isn't it always?

Hmmm, in my experience about 5-10% of technical issues are genuinely hardware induced, excluding hardware problems caused by human stupidity (wrong equipment, stretching cables at ankle height, turning off cooling, et cetera).

Norman Alexander · Nov 20, 2025

From that report:

This time around the company plans to do four things:

Hardening ingestion of Cloudflare-generated configuration files in the same way we would for user-generated input

Enabling more global kill switches for features

Eliminating the ability for core dumps or other error reports to overwhelm system resources

Reviewing failure modes for error conditions across all core proxy modules

And, of course, NONE of those solutions include anything like "TEST ALL UPGRADES IN A CONTROLLED ENVIRONMENT BEFORE RELEASING THEM!!"

Mongrel · Nov 20, 2025

lauwersw said:
It was not DNS this time. Nor BGP.

Cloudflare broke the internet with a bad DB query

Thought it was the victim of a ‘hyper-scale DDoS attack’ before finding the fix

www.theregister.com

Maybe someone broke the lava lamps

Klimax · Nov 20, 2025

Norman Alexander said:
From that report:

And, of course, NONE of those solutions include anything like "TEST ALL UPGRADES IN A CONTROLLED ENVIRONMENT BEFORE RELEASING THEM!!"

Why would they...

Darat · Nov 20, 2025

catsmate said:
So incompetence then.

Yep. They do apologise but from the start it's as if the change to the database was a natural event that just happens.

Darat · Nov 20, 2025

Mongrel said:
Maybe someone broke the lava lamps

Hey - it's not that easy - you need to get hold of an old style mini-spotlight bulb, an LED one doesn't produce enough heat.

catsmate · Nov 20, 2025

Darat said:
Yep. They do apologise but from the start it's as if the change to the database was a natural event that just happens.

AWS did it better back in '17.

Dabop · Nov 21, 2025

Darat said:
Hey - it's not that easy - you need to get hold of an old style mini-spotlight bulb, an LED one doesn't produce enough heat.

I have been asked in the past to make a replacement 'bulb' for lava lamps (a mate actually collects them lol - there's no accounting for some people's tastes...) and it really isn't that hard to make them up for them, or indeed for any application that needs a 'hot' lamp, e.g. some older egg incubators, reptile cages etc. all used bulbs as a heat source in the past...

You can buy 'bulb bases' online readily, and if it's just a heat source needed, an appropriately rated resistor does the job fine. For those cases where both heat and light are needed, a resistor coupled with an LED does it... it's a couple of minutes' job to make a 'replacement' bulb/heat source up.... the parts are readily available online - hell, you can even buy the hand tool to do it, although you can do it without one - or even full-on production line machines to 'make your own' as a 'production line' job...

arthwollipot · Nov 21, 2025

Another update. See if you can spot the subtle difference from last time:

theprestige · Nov 21, 2025

Norman Alexander said:
From that report:

And, of course, NONE of those solutions include anything like "TEST ALL UPGRADES IN A CONTROLLED ENVIRONMENT BEFORE RELEASING THEM!!"

Not every aspect of a global scale production system can be tested outside of that system.

Klimax · Nov 21, 2025

arthwollipot said:
Another update. See if you can spot the subtle difference from last time:

View attachment 66234

No. I don't agree. If I were to place Angry Bird striker there, then I'd place Google. (Or Crowdstrike...)

plague311 · Nov 21, 2025

I generally like Cloudflare. They're easy to use, and they're a perfect upstream for my pi-hole. This is a pretty big blunder though.

a_unique_person · Nov 21, 2025

Darat said:
Yep. They do apologise but from the start it's as if the change to the database was a natural event that just happens.

This has more detail.

The change to the database is actually made often, as new threats are detected and analysed. It's more of a continual process. For reasons of reliability and performance, the memory for the rules is allocated only once, when each monitoring task starts. That means only a certain number of rules can fit in that fixed size.

Their code to detect and react to a larger than permitted number of rules is not well written. It just does a hard fail without providing helpful diagnostics. The logic of how to elegantly deal with a larger and more complex set of rules than permitted was never implemented.
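
To illustrate the alternative to a hard fail, here is a minimal sketch in Python. It is not Cloudflare's code: the names `RuleTable`, `refresh`, `RuleSetTooLarge` and the capacity value are all made up for the example. The idea is simply that an oversized update gets rejected with a useful message while the last known-good rules stay in service.

```python
from dataclasses import dataclass, field

MAX_RULES = 200  # hypothetical fixed capacity, chosen only for illustration


class RuleSetTooLarge(Exception):
    """Raised when a candidate rule set exceeds the preallocated capacity."""


@dataclass
class RuleTable:
    capacity: int = MAX_RULES
    active: list = field(default_factory=list)

    def load(self, candidate):
        """Validate a candidate rule set before swapping it in."""
        if len(candidate) > self.capacity:
            raise RuleSetTooLarge(
                f"got {len(candidate)} rules, capacity is {self.capacity}"
            )
        # Only replace the active rules once the candidate has passed the check.
        self.active = list(candidate)


def refresh(table, candidate, log=print):
    """Try to apply new rules; keep the last known-good set on failure."""
    try:
        table.load(candidate)
        return True
    except RuleSetTooLarge as exc:
        log(f"rejected rule update, keeping {len(table.active)} rules: {exc}")
        return False
```

With a table of capacity 3, `refresh(table, ["a", "b"])` succeeds, and a later `refresh(table, ["a", "b", "c", "d"])` returns `False`, logs the sizes involved, and leaves `table.active` as `["a", "b"]`.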

JayUtah · Nov 21, 2025

Questions such as, "Does the database query return the kind of result the programmer expected?" seem like something that should have been tested in a development environment and could easily be tested in a staging environment. Not the kind of thing that needs to be thrown into production with fingers crossed. Ostensibly this is something my software team would have caught via review. All code in our critical applications must be approved by two senior software engineers before it can be accepted into the version control system. I know hindsight is 20/20, but programming constructs that on their face seem to produce an unrecoverable error are the kinds of red flags our team notices.

Programming that responds to errors in input data (which would include there being too much of that data) by aborting the program doesn't seem well thought out for a critical, ongoing process. I can immediately think of different programming techniques to mitigate this. But it boils down to simply avoiding allowing an unhandled exceptional condition in production code. I agree with the presenter in the video: it largely doesn't matter what language you use or how it models program exceptions.

plague311 · Nov 21, 2025

The hardest part for Cloudflare is that they let the cat out of the bag too. When flaws like these get caught it generally sends signs to hackers of what type of flaws the company is prone to and what their processes are to handle data. Cloudflare better be in panic mode right now.

They always seemed like they had it together. The sheer time involved in managing traffic to 1.1.1.1 when that project took off had to be crazy.

Norman Alexander · Nov 21, 2025

theprestige said:
Not every aspect of a global scale production system can be tested outside of that system.

These are billion-dollar companies. They can build big enough environments to test all their distributable components.

Also, decent testing should involve edge conditions such as query response overloads. "What happens if this query returns the whole database? Is that possible, and how? Is that technically a bad situation? How do we handle that? How might we prevent that?" Etc.
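
As a rough sketch of what such an edge-condition check might look like (Python, with a made-up helper `bounded_rows`; nothing here comes from the incident report), the caller can cap how many rows it accepts from a query and learn whether the source had more, instead of assuming the result is always small:

```python
from itertools import islice

_SENTINEL = object()  # distinguishes "no more rows" from a row that is None


def bounded_rows(rows, limit):
    """Return at most `limit` rows, plus a flag saying whether more existed."""
    it = iter(rows)
    taken = list(islice(it, limit))
    # Peek one row further: if anything comes back, the source overflowed.
    overflow = next(it, _SENTINEL) is not _SENTINEL
    return taken, overflow


def check_edge_cases():
    """The sort of edge conditions a test suite could pin down."""
    assert bounded_rows([], 3) == ([], False)                   # empty result
    assert bounded_rows([1, 2, 3], 3) == ([1, 2, 3], False)     # exactly at limit
    assert bounded_rows([1, 2, 3, 4], 3) == ([1, 2, 3], True)   # one over
    # "The whole database": a huge lazy source must not be read in full.
    assert bounded_rows(iter(range(10**9)), 2) == ([0, 1], True)
```

What to do when the overflow flag is set (reject, truncate, alert) is a policy decision; the point is only that the case is detected and tested rather than left to chance.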

Gord_in_Toronto · Nov 21, 2025

catsmate said:
Eggs <-> Basket

Aka monoculture.

Wudang · Nov 21, 2025

theprestige said:
Not every aspect of a global scale production system can be tested outside of that system.

No, but code analysis tools like SonarQube should be highlighting things like unhandled exceptions.
And this is the sort of error I was telling badly trained COBOL programmers about 20 years ago.

Steve · Nov 21, 2025

Still haven't noticed any internet chaos. Seems to be just chugging along as always.

rjh01 · Nov 23, 2025

One thing the organisation could do is implement any changes for one organisation and see what happens. If it works then do a few more. Then repeat until it is fully implemented. Worst case only a few organisations go down for a few minutes until they have backed out of the change.
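
A minimal sketch of that kind of staged rollout, in Python. The function `staged_rollout` and its `apply`, `healthy` and `rollback` callbacks are hypothetical placeholders, and the batch sizes are arbitrary; a real deployment system would be far more involved.

```python
def staged_rollout(targets, apply, healthy, rollback, stages=(1, 5, 25)):
    """Apply a change in growing batches; back everything out on first failure.

    Returns (succeeded, touched) where `touched` lists every target the
    change was applied to, in order.
    """
    done = []
    remaining = list(targets)
    sizes = list(stages)
    while remaining:
        # Use the next stage size; once stages run out, do everything left.
        size = sizes.pop(0) if sizes else len(remaining)
        batch, remaining = remaining[:size], remaining[size:]
        for target in batch:
            apply(target)
            done.append(target)
        if not all(healthy(target) for target in batch):
            # Undo in reverse order so the most recent change goes first.
            for target in reversed(done):
                rollback(target)
            return False, done
    return True, done
```

For example, with `stages=(1, 2)` and eight targets, the change goes to one target, then two, then the remaining five. If a health check fails in the second batch, only three targets were ever touched and all three are rolled back.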
