• Quick note - the problem with YouTube videos not embedding on the forum appears to have been fixed, thanks to ZiprHead. If you do still see problems, let me know.

Internet chaos as Cloudflare goes down.

Hey - it's not that easy - you need to get hold of an old-style mini-spotlight bulb; an LED one doesn't produce enough heat.
I have been asked in the past to make a replacement 'bulb' for lava lamps (a mate actually collects them, lol - there's no accounting for some people's tastes...), and it really isn't that hard to make them up for them, or indeed for any application that needs a 'hot' lamp - e.g. some older egg incubators, reptile cages etc. all used bulbs as a heat source in the past.

You can buy 'bulb bases' online readily, and if it's just a heat source that's needed, an appropriately rated resistor does the job fine. For those cases where both heat and light are needed, a resistor coupled with an LED does it. It's a couple-of-minutes job to make a 'replacement' bulb/heat source up, and the parts are readily available online - hell, you can even buy the hand tool to do it (although you can do it without one), or even full-on production-line machines if you want to 'make your own' as a production-line job.
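If you want to size the resistor, it's just Ohm's law. Here's a rough sketch - the 12 V supply and 25 W heat target are purely my own example numbers, and whatever you pick should be rated well above the power it actually dissipates:

```rust
// Rough sketch: size a resistor used purely as a heat source.
// The 12 V supply and 25 W target are example numbers only.
fn resistor_for_heat(supply_volts: f64, target_watts: f64) -> (f64, f64) {
    // P = V^2 / R  =>  R = V^2 / P
    let resistance_ohms = supply_volts.powi(2) / target_watts;
    // Pick a part with generous headroom (say 2x) so it isn't run at its limit.
    let min_power_rating_watts = target_watts * 2.0;
    (resistance_ohms, min_power_rating_watts)
}

fn main() {
    let (ohms, rating) = resistor_for_heat(12.0, 25.0);
    println!("roughly {ohms:.1} ohm, rated for at least {rating:.0} W");
    // prints: roughly 5.8 ohm, rated for at least 50 W
}
```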
 
I generally like cloudflare. They're easy to use, and they're a perfect upstream for my pi-hole. This is a pretty big blunder though.
 
Yep. They do apologise, but from the start they describe the change to the database as if it were a natural event that just happens.
This has more detail.


The change to the database is actually made often, as new threats are detected and analysed; it's more of a continual process. For reasons of reliability and performance, the memory for the rules is allocated only once, before each monitoring task starts. That means only a certain number of rules can fit in that fixed-size allocation.
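Roughly the shape of that design, as a sketch only - the names and the 200-rule limit are placeholders I've made up, not anything from their actual code:

```rust
// Illustrative sketch of a preallocated, fixed-size rule table.
const MAX_RULES: usize = 200; // made-up limit for the example

struct RuleTable {
    rules: Vec<String>, // one entry per rule, allocated once up front
}

impl RuleTable {
    fn new() -> Self {
        // Allocate the full capacity once so the hot path never reallocates.
        Self { rules: Vec::with_capacity(MAX_RULES) }
    }

    // Anything bigger than the table was sized for has to be dealt with here;
    // this is exactly where "fail hard" versus "degrade gracefully" gets decided.
    fn load(&mut self, incoming: Vec<String>) -> Result<(), String> {
        if incoming.len() > MAX_RULES {
            return Err(format!(
                "got {} rules but the table only holds {}",
                incoming.len(), MAX_RULES
            ));
        }
        self.rules.clear();
        self.rules.extend(incoming); // stays within the original allocation
        Ok(())
    }
}

fn main() {
    let mut table = RuleTable::new();
    let update: Vec<String> = (0..250).map(|i| format!("rule-{i}")).collect();
    match table.load(update) {
        Ok(()) => println!("rules loaded, table holds {}", table.rules.len()),
        Err(e) => eprintln!("update rejected: {e}"),
    }
}
```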

Their code to detect and react to a larger-than-permitted number of rules is not well written. It just fails hard without providing helpful diagnostics. The logic for dealing elegantly with a larger and more complex set of rules than permitted was never implemented.
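To make that concrete - a sketch only, not their actual code - the difference in Rust-like terms is roughly between unwrapping a result, where one bad read kills the process with a bare panic, and matching on it so the failure at least says what happened:

```rust
use std::fs;

// Sketch only: two ways of reacting when loading a rule/config file fails.
fn main() {
    let path = "rules.conf"; // hypothetical file name

    // Hard-fail style: one bad read and the whole process panics,
    // with no context beyond the raw error.
    // let contents = fs::read_to_string(path).unwrap();

    // Same failure, but with diagnostics an operator can actually use:
    match fs::read_to_string(path) {
        Ok(contents) => println!("loaded {} bytes of rules", contents.len()),
        Err(e) => eprintln!("could not load rules from {path}: {e}"),
    }
}
```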
 
Questions such as, "Does the database query return the kind of result the programmer expected?" seem like something that should have been tested in a development environment and could easily be tested in a staging environment - not the kind of thing that needs to be thrown into production with fingers crossed. This is something my software team would likely have caught via review: all code in our critical applications must be approved by two senior software engineers before it can be accepted into the version control system. I know hindsight is 20/20, but programming constructs that on their face can produce an unrecoverable error are the kinds of red flags our team notices.
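For example, a check in that spirit (the names here are entirely hypothetical, standing in for whatever runs the real query) might pin down what a sane query result is allowed to look like:

```rust
// Hypothetical sketch of the kind of check that belongs in CI or staging.
// `fetch_rule_features` stands in for whatever runs the real database query.
fn fetch_rule_features() -> Vec<(String, String)> {
    // In a real test this would hit a staging database or a fixture.
    vec![("feature_a".to_string(), "v1".to_string())]
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::collections::HashSet;

    const MAX_FEATURES: usize = 200; // placeholder limit

    #[test]
    fn rule_query_stays_within_limits_and_has_no_duplicates() {
        let rows = fetch_rule_features();

        // The query should never hand back more than we allocated for...
        assert!(rows.len() <= MAX_FEATURES, "query returned {} rows", rows.len());

        // ...and each feature should appear exactly once.
        let names: HashSet<_> = rows.iter().map(|(name, _)| name).collect();
        assert_eq!(names.len(), rows.len(), "duplicate feature names in query result");
    }
}
```

Run with cargo test; in a real pipeline the stub would point at a staging database rather than fixture data.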

Programming that responds to errors in input data (which would include there being too much of that data) by aborting the program doesn't seem well thought out for a critical, ongoing process. I can immediately think of several techniques to mitigate this, but it boils down to simply not allowing an unhandled exceptional condition into production code. I agree with the presenter in the video: it largely doesn't matter what language you use or how it models program exceptions.
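One of those techniques, sketched very roughly (my own example, not a description of their actual system): treat every periodic refresh as optional, and keep serving with the last known-good rule set whenever an update looks wrong.

```rust
// Very rough sketch of "keep the last known-good config" (example only).
struct Proxy {
    rules: Vec<String>, // whatever the service is currently running with
}

impl Proxy {
    // Called on every periodic refresh. A bad update is logged and dropped;
    // the process keeps running on the previous rule set instead of aborting.
    fn try_refresh(&mut self, update: Result<Vec<String>, String>) {
        match update {
            Ok(new_rules) if !new_rules.is_empty() => self.rules = new_rules,
            Ok(_) => eprintln!("empty rule update ignored, keeping {} rules", self.rules.len()),
            Err(e) => eprintln!("bad rule update ignored ({e}), keeping {} rules", self.rules.len()),
        }
    }
}

fn main() {
    let mut proxy = Proxy { rules: vec!["allow-all".to_string()] };
    proxy.try_refresh(Err("rule set too large".to_string()));
    proxy.try_refresh(Ok(vec!["block-bad-bot".to_string()]));
    println!("running with {} rule(s)", proxy.rules.len());
}
```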
 
The hardest part for Cloudflare is that they let the cat out of the bag, too. When flaws like these get caught, it generally signals to hackers what type of flaws the company is prone to and how its processes handle data. Cloudflare had better be in panic mode right now.

They always seemed like they had it together. The sheer time involved in managing traffic to 1.1.1.1 when that project took off had to be crazy.
 
Not every aspect of a global-scale production system can be tested outside of that system.
These are billion-dollar companies. They can build environments big enough to test all of their distributable components.

Also, decent testing should cover edge conditions such as query-response overloads. "What happens if this query returns the whole database? Is that possible, and how? Is that actually a bad situation? How do we handle it? How might we prevent it?" Etc.
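As a concrete (and again hypothetical) example of that kind of edge-condition test: hand the loader a result that simulates "the whole database came back" and assert that it refuses it cleanly instead of blowing up.

```rust
// Hypothetical edge-condition test: "what if the query returns everything?"
fn accept_rule_rows(rows: Vec<String>, max: usize) -> Result<Vec<String>, String> {
    if rows.len() > max {
        Err(format!("{} rows exceeds the limit of {}", rows.len(), max))
    } else {
        Ok(rows)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn oversized_query_result_is_rejected_not_fatal() {
        // Simulate a query that accidentally returned the whole database.
        let everything: Vec<String> = (0..1_000_000).map(|i| format!("row-{i}")).collect();

        // The loader should return an error we can act on, not panic.
        assert!(accept_rule_rows(everything, 200).is_err());
    }
}
```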
 
One thing Cloudflare could do is apply any change to a single customer organisation first and see what happens. If it works, do a few more, then repeat until it is fully rolled out. Worst case, only a few organisations go down for a few minutes until the change is backed out.
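A crude sketch of that idea (the cohort sizes and the health check are placeholders for real monitoring, not anything Cloudflare actually runs):

```rust
// Crude sketch of a staged rollout with automatic back-out (placeholder logic).
fn apply_change(org: &str) { println!("change applied for {org}"); }
fn roll_back(org: &str) { println!("change rolled back for {org}"); }
fn looks_healthy(_org: &str) -> bool { true } // stand-in for real monitoring

fn staged_rollout(orgs: &[String]) {
    let mut done: Vec<String> = Vec::new();
    let mut batch = 1; // grow the blast radius gradually: 1, then 10, then 100...
    let mut idx = 0;

    while idx < orgs.len() {
        let end = (idx + batch).min(orgs.len());
        for org in &orgs[idx..end] {
            apply_change(org);
            done.push(org.clone());
        }
        // If anything looks wrong, back out everything touched so far and stop.
        if !orgs[idx..end].iter().all(|org| looks_healthy(org)) {
            for org in &done {
                roll_back(org);
            }
            return;
        }
        idx = end;
        batch *= 10;
    }
}

fn main() {
    let orgs: Vec<String> = (0..25).map(|i| format!("org-{i}")).collect();
    staged_rollout(&orgs);
}
```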
 