Software quality collapse

Wudang · Oct 13, 2025

The Great Software Quality Collapse: How We Normalized Catastrophe

The Apple Calculator leaked 32GB of RAM.

techtrenches.substack.com

I thought of posting this to the AI thread but AI is just part of the problem.

Our research found:

AI-generated code contains 322% more security vulnerabilities

45% of all AI-generated code has exploitable flaws

Junior developers using AI cause damage 4x faster than without it

70% of hiring managers trust AI output more than junior developer code

We've created a perfect storm: tools that amplify incompetence, used by developers who can't evaluate the output, reviewed by managers who trust the machine more than their people.

But AI wasn't the cause of the big Crowstrike fubar:

Total economic damage: $10 billion minimum.
The root cause? They expected 21 fields but received 20.
One. Missing. Field.
This wasn't sophisticated. This was Computer Science 101 error handling that nobody implemented. And it passed through their entire deployment pipeline.

Sanitize your inputs.

The Great Zaganza · Oct 13, 2025

the AI boom is build on load-bearing straws

Wudang · Oct 13, 2025

Somewhere I read an article about programmers building up islands of incompetence. It mirrored my experience of several application programming teams at a major bank. The team lead became so by performing adequately at churning requirements into code while dotting the ts and crossing the is as per the process. They distrust people who know how to do it better.
I’ll keep trying to find it.

Darat · Oct 13, 2025

Was watching a video about some of OpenAI's recent announcement and in light of the "levels of abstraction" comment caught this one:

Darat · Oct 13, 2025

The whole aim of a lot of these companies is to of course get rid of expensive humans and experienced coders and developers are expensive, but if all you need is someone who can type a prompt....

Deepmind has just published this about its AI to automatically detect and fix security issues. https://deepmind.google/discover/blog/introducing-codemender-an-ai-agent-for-code-security/

Apparently this has already been used to fix 72 security fixes in open-source projects, including detecting stuff like heap buffer overflows. Interestingly Google does say that human review is required before code submission. So in the end we get back to needing expensive humans and who is going to invest in low-cost humans i.e. junior coders so that you get expensive humans i.e. experienced and good developers in 20 years time who can do the human reviewing? I think they've added in the "requires human review" as a temporary fob.

Just before I was about to hit post reply I had a thought (occasionally still happens) what a great tool this will be for malicious actors, get it to find security flaws and then use it to create exploits based on those flaws. Zero-day vulnerabilities are exploited the day software is released... <gulp>

JayUtah · Oct 13, 2025

Darat said:
The whole aim of a lot of these companies is to of course get rid of expensive humans and experienced coders and developers are expensive, but if all you need is someone who can type a prompt....

We've used AI to generate code at my company with only limited success.

Since the software systems described in the OP seem to be mostly consumer-facing commodity systems, I wager that time to market is the driving force. Back in the day you could spend a year on a 50,000 line application and update it only yearly for bug fixes. Nowadays it seems like multi-million line programs are going to market on a much faster schedule and being updated on a much faster timetable. Availability seems to be more important than correctness or performance in that sector.

Darat said:
Deepmind has just published this about its AI to automatically detect and fix security issues. https://deepmind.google/discover/blog/introducing-codemender-an-ai-agent-for-code-security/

Apparently this has already been used to fix 72 security fixes in open-source projects, including detecting stuff like heap buffer overflows.

I might want to give that a try. There are already many static and dynamic code analysis tools out there. I'd like to know how this one is different aside from having the AI moniker attached to it.

Darat said:
Interestingly Google does say that human review is required before code submission. So in the end we get back to needing expensive humans and who is going to invest in low-cost humans i.e. junior coders so that you get expensive humans i.e. experienced and good developers in 20 years time who can do the human reviewing? I think they've added in the "requires human review" as a temporary fob.

Our senior software engineers are quite experienced. Most of them are in their 40s and 50s and quite a lot of them came from the gaming industry. Their management reports that they find about twice as many bugs through manual code review than via static code analysis and QA testing. But then again our software needs are weird. We do embedded systems extensively. But then we also have supercomputers. Very little of our software is meant to run on a consumer PC.

Ironically the memory leakages reported in the OP would be considered unacceptable for our supercomputers. Yes, we have scary amounts of RAM in them, but the problems we program almost always still bump up against the limits of the hardware. If one of our custom program solutions on a supercomputer is leaking 30+ gigabytes, that's a show-stopper bug. Some of our programs have to run continuously for up to two weeks on the supercomputers, not just without crashing but also without incurring anomalous performance issues.

From my chair there is still very much a market for diligent, experienced software developers.

Darat · Oct 13, 2025

And given one of the points in the article:

Broadcom Stock Surges On OpenAI Chip Deal

Broadcom stock jumped after the chipmaker announced a deal to supply AI processors for artificial intelligence leader OpenAI.

www.investors.com

….Under their "strategic collaboration," OpenAI plans to deploy 10 gigawatts of AI accelerators designed with Broadcom. A gigawatt is a measure of power that is increasingly being used to describe the biggest clusters of AI processors…..

lauwersw · Oct 14, 2025

At my company there's also a big push for AI and most of my team members were quite skeptical (I work in a great team

). The company is providing strict guidelines how to use AI and has setup internal platforms to avoid leaking any proprietary software to external models. AI generated software must stay below a certain percentage of the total lines and so on. All quite encouraging.

I have been playing with it a bit and I must say I'm reasonably impressed about what it can do. I'm learning Python and I asked the AI to setup an initial program that can take some arguments and perform some async IO stuff. That worked really well. Next putting the real intelligence into the code was less great, but I'm totally fine with that. For many tasks you need a ton of uninteresting boilerplate code and scaffolding and it can take that out of your hands and save you some time. We have used it to add a variable to our code base. This requires touching 5 files in different places and it does that in 30 seconds instead of 10 minutes of fiddling to remember where it should all go. The code still gets reviewed and tested manually after that.

It can do code reviews. A lot of the comments are noise. Not that the remarks are wrong, but it lacks some contest that humans have. Still it did point out a few interesting issues. I was impressed by a form of complex reasoning where it pointed out that normally you shouldn't set file permissions as wide, but because this other part seems to indicate it handles public information, it's probably ok. That's actual reasoning!

I must admit I'm starting to warm up to AI. It's a great smart code secretary that can help us, but it can't replace us. Yet...

Wudang · Oct 14, 2025

"The code still gets reviewed and tested manually after that."
That's a key part. I helped write test code and cases for IBM's CICS, MQ and other products and everything went through massive testing and regression testing. The lines of code (I know) to test the products were significantly larger than the products themselves.
Working as a developer later writing java webapps we wrote a big project template with all the mocking etc etc ready to have unit tests for everything and larger tests and SonarQube reviewing it all. We still turned fixes and changes around quickly.
I'd say the articles main point is that AI is the straw that threatens to break a software ecology that is already creaking because of a lack of these factors.

Darat · Oct 14, 2025

And it really doesn't help now that what would have been beta versions of apps a few years back are released as release candidates and this is considered normal and indeed often seems to be treated as a virtue!

Wudang · Oct 14, 2025

Darat said:
And it really doesn't help now that what would have been beta versions of apps a few years back are released as release candidates and this is considered normal and indeed often seems to be treated as a virtue!

Yes, often a big misunderstanding of the original "run fast and break things" which was about small changes easily backed out or fixed and deployed to a few instances at first.

Beelzebuddy · Oct 14, 2025

I remember hearing this same crap about the newfangled object oriented programming back in the day, and it was old then because people had been hearing it about compilers for decades. Look at that guy's timeline, he thinks software bloat only started becoming a problem in 2018.

Mongrel · Oct 15, 2025

Darat said:
And it really doesn't help now that what would have been beta versions of apps a few years back are released as release candidates and this is considered normal and indeed often seems to be treated as a virtue!

That's been happening in the gaming space since the early noughties, once the Xbox came out with an ethernet port and the expectation that you'd be persistently online, the game devs quickly brought them in-line with PC titles and were chucking half-finished crap out of the door with release day patches.

jeremyp · Oct 21, 2025

I just started Apple Calculator on my Mac and it's currently running at 61Mb*. This "it leaks 32Gb" is really meaningless. That was probably the amount it had leaked before the person running it noticed there was a problem. I generally leave Calculator running all the time and so, if there is a leak, it's bound to cause problems eventually. The real question is how fast does it leak memory.

* a trifling amount in today's terms but unimaginable to 13 year old me who started programming on a computer with 16k.

Darat · Oct 21, 2025

I presume it has been fixed? Perhaps that is why for many years there was no default calculator on iPads!

Darat · Oct 21, 2025

Mongrel said:
That's been happening in the gaming space since the early noughties, once the Xbox came out with an ethernet port and the expectation that you'd be persistently online, the game devs quickly brought them in-line with PC titles and were chucking half-finished crap out of the door with release day patches.

Well I do remember once arguing with Nintendo about a clipping bug on the N64, apparently there was one area that if you ran at the wall something like 30 plus times you could drive out of the world. Should that have been fixed before shipping - my view was that it wasn't a stop bug as no one in right mind would do that. (It was by the way a terrible game, god knows why anyone would ever have bought it, the producer had a cut out on his desk that showed it as "No 1 Game of the Year" - the context of the article that he didn't cut out was that it was the No 1 WORSE title of the year. It was a contractable obligation that we had a game with its title approved by Nintendo by a certain date. As ever the mighty dollar - or actually francs was the "quality" bar.)

Wudang · Oct 21, 2025

I won't talk about my part in the alpha of what became IBM Websphere as people tend to throw rocks at me. Customers actually paid large sums to be in the alpha program which was a POS but woe betide anyone who even tried hinting there were problems.

JayUtah · Oct 21, 2025

jeremyp said:
I just started Apple Calculator on my Mac and it's currently running at 61Mb*. This "it leaks 32Gb" is really meaningless.

On MacOS the ps -l command is showing me 34.6 GB vsize on a newly opened Calculator app after a few calculations. It's showing a resident set size of about 112 MB. The "resident set" is the set of virtual memory pages that are actually resident in RAM, generally corresponding to the minimum amount of memory for code and data needed to run the program at that given second.

Virtual memory size is meaningless for the purpose of determining resource usaget. All the processes running on my MacBook have virtual set sizes between 30 and 35 GB. There are many reasons why the Darwin kernel would report this without the process actually using that much memory. I wonder if that's the measurement they're (mistakenly) using.

jeremyp said:
That was probably the amount it had leaked before the person running it noticed there was a problem. I generally leave Calculator running all the time and so, if there is a leak, it's bound to cause problems eventually. The real question is how fast does it leak memory.

Yes, this. As I mentioned up-thread, the programs we run on our supercomputers run for up to two weeks and simply can't leak because even small leaks add up rapidly. It's a paradox that our programmers working on some of the most powerful computers available have to be just as resource-conscious as someone programming an embedded system. One of our systems has a RAM capacity of 4 petabytes, and that's not even within the ballpark of what's running at, say, the Dept. of Energy sites today.

Leaving a program running in the background waiting for human input is unlikely to result in a program state that multiples memory leaks. It's mostly idle. But it's absolutely the case that a program that's doing work in the background can get into a state where it's leaking memory at a measurable rate. We had one program that got into a state where it was leaking 32 kB for every message it received from another node in the computer. That translated to a one-line error in the source code.

jeremyp said:
* a trifling amount in today's terms but unimaginable to 13 year old me who started programming on a computer with 16k.

Indeed. The first computer I did engineering on was an IBM System 370 with a whopping 8 megabytes of RAM. We spent a day in meetings trying to justify to management an upgrade to 12 megabytes. We eventually upgraded to a System 3084 and finally a System 3090 with a whopping 6 CPUs and 32 megabytes per CPU. That was luxury.

catsmate · Oct 21, 2025

Wudang said:
I won't talk about my part in the alpha of what became IBM Websphere as people tend to throw rocks at me. Customers actually paid large sums to be in the alpha program which was a POS but woe betide anyone who even tried hinting there were problems.

View media item 4707

Blue Mountain · Oct 22, 2025

JayUtah said:
As I mentioned up-thread, the programs we run on our supercomputers run for up to two weeks and simply can't leak because even small leaks add up rapidly. It's a paradox that our programmers working on some of the most powerful computers available have to be just as resource-conscious as someone programming an embedded system. One of our systems has a RAM capacity of 4 petabytes, and that's not even within the ballpark of what's running at, say, the Dept. of Energy sites today.

Have you considered writing in a memory safe language such as Logo or Python? :rntongue:

JayUtah · Oct 22, 2025

Blue Mountain said:
Have you considered writing in a memory safe language such as Logo or Python?

Yes. We use Python extensively for in-house software. And some of our newer supercomputing applications are written in Python using its modules for parallel computation. I'll try to explain the paradox without derailing too much from software quality in general into specialized software techniques for supercomputing.

Memory-safe is a synonym for memory-managed. In a memory-managed language, the final code uses a language-specific runtime to manage the assignment of RAM to program structures and its subsequent recovery to avoid memory leaks. This comes at a computational and storage overhead. For computation, every reference to a program variable must be wrapped in additional code to interface with the memory manager and assure the correct dereference. For storage, the runtime must maintain its own metadata to organize the program storage which may exceed the actual program storage by a significant margin (or even a factor). This is often too much overhead and we have to turn to specialized Python modules to try to get around it.

The paradox is that both very small computers and very large computers achieve their desired ends by staying as close as possible to the bare metal. Our embedded data concentrators don't do any memory allocations on the fly, for example, and use CPU-specific SIMD instructions to eke the most performance out of the least silicon (i.e., the most robust silicon for hostile environments). Similarly our cluster-oriented software often uses hand-optimized code to avoid even normal overhead of general computing—cache behavior, for example. That's because your largest computers are always being asked to solve the largest problems, where even a slight overhead multiplies across many days of computation and many pages of RAM. Being able to optimize frequently-used code segments at the hardware level can win you days. And being able to use RAM without memory-management overhead can mean the difference between 10 and 20 days of computation. Doing that in Python means essentially suppressing all the advantages you would ordinarily get from Python.

There are other considerations too. Supercomputing on clusters is not just running the same software as before, only on a faster computer. The program architecture has to take advantage of the knowledge that it's running on a cluster. And the parallelism relies on predictable per-node computation times. That sort of synchronicity is hard to achieve with language runtimes jittering up the program steps.

Ironically medium-sized computers solving medium-sized problems is where you see the advantage from easier-to-use, write-once-run-everywhere languages like Python. If a computation is going to take a "day," it's often unimportant whether that day is 12 hours of computations or 15 hours. In all cases you're going to start it in the morning and come back the next morning to get your answer. But for large-scale computers, your problem sizes are once again bumping up against the limits of the hardware and you want to start throwing out things you don't need in order to accommodate just the problem.

Really it comes down to development cost, which is where we circle back to the thread topic. A lot of consumer-facing software today is of poor quality because its owners are trying to minimize development cost. Every bank, for example, has an online banking application. The cost of developing and maintaining that code is overhead for the bank. So they're all looking for the ideal price point, and the consumer's tolerance for bugs, inefficient computation, or inexpertly designed interactions is a factor in that calculus. In cases like Apple's calculator app, the most cost-effective solution overall might simply be to expand the memory size of the typical MacBook and charge the customer $50 extra for it. And our medium-sized supercomputers solving medium-sized problems can often overcome limits simply by adding $10,000 worth of additional RAM to them, and that's cheaper than asking a senior software engineer to spend a week optimizing the code.

It's quite possible to achieve good memory behavior using the classic compiled languages. It costs more to develop, but that additional cost pays for itself over time in terms of other advantages in our specific use cases. If it takes you 112 megabytes on a personal computer to find the square root of 42, that's okay. If it's a problem, add RAM to the computer. But if you're simulating a nuclear exposition at centimeter resolution or solving a matrix in 10¹² unknowns, every CPU cycle and every byte of memory still counts, and the best solution isn't always an incremental augmentation. Solving those gargantuan problems in finite time on the best hardware you can muster still means keeping your eye on the sparrow, ignoring the fact that there are trillions of sparrows.

The other problem is that a lot of the hard code is already written and debugged in the classical languages. This is often extremely difficult code derived from math that would make your eyes bleed. Its correct implementation on a computer is therefore often prohibitively difficult to verify. Once you've done it, you're reluctant to do it again. Hence rewriting it all from Fortran to Python is not a light undertaking. Thankfully Python provides a robust method (libpython) for Python programs to access compiled code in other languages. But then again in order to use it you typically have to sidestep Python's memory manager.

Gulliver Foyle · Oct 22, 2025

Darat said:
And given one of the points in the article:

Broadcom Stock Surges On OpenAI Chip Deal

Broadcom stock jumped after the chipmaker announced a deal to supply AI processors for artificial intelligence leader OpenAI.

www.investors.com

….Under their "strategic collaboration," OpenAI plans to deploy 10 gigawatts of AI accelerators designed with Broadcom. A gigawatt is a measure of power that is increasingly being used to describe the biggest clusters of AI processors…..

Probably not going to happen, or even come close.

The linked article is one by Ed Zitron discussing, at length (about 18,000 words) why LLMs and the companies building them are promising pie in the sky stuff and not even coming close to fulfilling their promises.

Blue Mountain · Oct 23, 2025

JayUtah said:
The other problem is that a lot of the hard code is already written and debugged in the classical languages. This is often extremely difficult code derived from math that would make your eyes bleed. Its correct implementation on a computer is therefore often prohibitively difficult to verify. Once you've done it, you're reluctant to do it again. Hence rewriting it all from Fortran to Python is not a light undertaking. Thankfully Python provides a robust method (libpython) for Python programs to access compiled code in other languages. But then again in order to use it you typically have to sidestep Python's memory manager.

In a way I'm envious that you get to do such programming on superscale hardware. I'm no slouch as a programmer, but what you're doing is well above my (last) pay grade.

JayUtah · Oct 23, 2025

Blue Mountain said:
In a way I'm envious that you get to do such programming on superscale hardware. I'm no slouch as a programmer, but what you're doing is well above my (last) pay grade.

Yes, it's exciting to work on large-scale computers. But we mostly consider them tools toward the greater end of achieving high standards in engineering and science. The results of large-scale analysis generally just become part of larger design and decision-making efforts that involve human brains.

As for the programming, I'm sure you'd be up to the task. Think of it not as some kind of high priesthood, but simply as a different vernacular of software writing. It can be learned and mastered just like any other. The software developers who slip easiest into the idiom tend to have come from backgrounds such as programming game consoles. Our software developers have really good tools to help them with our special needs. You might consider it a harder programming task than others, but it comes with more scheduled development time and better tools.

Now we do have some "magic" code here and there. In general these are carefully profiled and carefully optimized sections of critical libraries. And they have comments something like :—

C:

/* Don't touch this. If you touch it, Robbie will come find you and kill you. */

At least the Apollo lunar module programmers were classy enough to express the equivalent sentiment in Latin: NOLI SE TANGERE.

Apollo-11/Luminary099/BURN_BABY_BURN--MASTER_IGNITION_ROUTINE.agc at master · chrislgarry/Apollo-11

Original Apollo 11 Guidance Computer (AGC) source code for the command and lunar modules. - chrislgarry/Apollo-11

github.com

But we all understand that to be the product of the same hard word that every conscientious employee is occasionally called upon to do. Every programmer can talk about that one problem that was really, really hard to solve, and remain proud of the solution whether it's running in a game, a web app, or a computer the size of a tennis court.

JayUtah · Oct 23, 2025

As long as we're talking about quality collapse, can we talk about Apple?

I just upgraded my iOS devices to iOS 26. Apple has gone from sequential version numbers to year-based release numbers. I got my first look at the much anticipated Liquid Glass design language. It's awful. It looks like Windows Vista and has all kinds of use and accessibility problems. Basically, UI elements behave as if they're made of shaped glass. They're transparent, but they refractively distort underlying elements. The biggest problem is that it doesn't solve the text-atop-text problem that made so many transparency-based designs unworkable. Your foreground text competes visually with slightly distorted background text. This was a solved problem, Apple!

Similarly Safari 26 has completely broken its Bookmarks sidebar. As with most modern browsers, Safari models bookmarks as a folder hierarchy. But instead of Apple's version of the tree view that allows you to see the whole hierarchy but collapse and expand at will, you now have to click into and out of each bookmark folder. The only bookmarks visible in the UI are those within that particular folder. The subfolders are now unalterably clustered at the top with the leaf-level bookmarks in a group underneath them, regardless of the user-specified order from previous versions. Instead of one bookmark per line, each bookmark is now a tile containing metadata or scraped data from the bookmark target. This reduces the number of bookmarks you can display without scrolling. The ability to edit bookmark grouping and ordering in the sidebar itself has vanished, replaced by a separate Edit Bookmarks dialog. And despite the lessons learned from the "smart" playlist disaster in Apple Music, we still have Apple-mandated folders in the sidebar that can't be removed or relegated to lower-priority positions. And the final insult is that the sidebar doesn't remember or inherit the navigated position. In each newly opened window you have to navigate from the top-level view into your user-added bookmarks. This greatly decreases bookmark functionality and increases the UI load for navigating via bookmarks. I'm hearing other users call it nigh unto unusable for their purposes.

What happened, Apple? Back in the day, Apple was almost fanatical about data-driven UI design. They all read Don Norman's book and lived by it. They did all kinds of science to determine what worked better and forced app programmers to get in line with it or get lost. They eschewed what they felt was a more clunky design philosophy based on modal techniques such as dialogs and menus and just made everything the obvious affordance. If you could manipulate something directly and intuitively, that's how it was programmed.

Further, Apple sold its devices on the premise that everything was seamlessly similar and integrated. All the Apple-supplied apps behaved similarly, and apps for sale on Apple platforms were fairly coerced to obey its design. This was supposed to be their advantage over competitors that allowed different UI/UX personalities for each app. Apple sold you an ecosystem. Now their native apps are just steps above garbage. The Contacts, Mail, and Reminders apps all follow different interaction paradigms. Podcasts is so buggy as to be essentially unusable. Their attempt to split classical music into its own app messes up the underlying music metadata for the mainstream app.

We should probably take any followups to Computers and Internet, since there will always be the kind of criticism that descends into preference and holy wars. My point is not that Apple is or ought to be better than other ecosystems. It's that Apple lately has really fallen so very far from its tree. They set very high standards for themselves a decade or so ago, and then abandoned them.

theprestige · Oct 23, 2025

Cory Doctorow has opinions.

First you make the product good for users, to capture a large user base. Then you make the product good for businesses, at the expense of your users, because they're already locked in, and now you want to capture the businesses. Then you make the product good for shareholders, by running a bust-out on every feature that attracted users and businesses to begin with.

Wudang · Oct 24, 2025

For which Doctorow coined the term "◊◊◊◊◊◊◊◊◊◊◊◊◊◊◊" which seems to be gaining popularity with a number of people I follow.
eta FFS ok how about "en<faecal matter>ification"?
eta2 how about "enpooification"?

theprestige · Oct 24, 2025

Encrapification.

wea · Oct 24, 2025

JayUtah said:
...
At least the Apollo lunar module programmers were classy enough to express the equivalent sentiment in Latin: NOLI SE TANGERE.

What does SE stand for ? Software Engineer?

JayUtah · Oct 24, 2025

wea said:
What does SE stand for ? Software Engineer?

No, it doesn't stand for anything as an acronym. The original comments were written in all caps because lowercase did not functionally exist on the computers those programmers used to develop the Apollo software. Hence read instead, "Noli se tangere."

And you've actually hit upon some hacker lore, because the Latin is not quite correct. Se is the Latin reflexive pronoun, translating to one of "himself," "herself," or "itself." Think in law how a pro se litigant is one who argues "for himself," i.e. without a lawyer. The author probably meant to write Noli me tangere—"Don't touch me." This harks to what Jesus said to Mary in the Bible after he woke up from being dead. Instead, the comment translates literally to "Don't touch yourself." [Snickers from the back row] We know what meaning is intended from the context, but it's still funny if you understand Latin.

The Apollo code base is a fun early example of some snarky comments. Just a few lines above this comment is HONI SOIT QUI MAL Y PENSE, the motto of the Order of the Garter, generally interpreted here in context as "Don't judge me." This probably has to do with the notion that the code that follows is ugly and clunky, containing a bunch of empirically tuned constants to govern how the lunar module engine would be controlled. Just as every programmer can point to a shining moment of victory over impossible circumstances, every programmer can point to sections of code he wrote quickly under pressure and which he desperately wants to rewrite when he gets time.

JayUtah · Oct 24, 2025

Wudang said:
For which Doctorow coined the term "◊◊◊◊◊◊◊◊◊◊◊◊◊◊◊" which seems to be gaining popularity with a number of people I follow.
eta FFS ok how about "en<faecal matter>ification"?
eta2 how about "enpooification"?

Our company has "crapware" in its vernacular.

catsmate · Oct 27, 2025

Gulliver Foyle said:
Probably not going to happen, or even come close.

The linked article is one by Ed Zitron discussing, at length (about 18,000 words) why LLMs and the companies building them are promising pie in the sky stuff and not even coming close to fulfilling their promises.

Is anyone with an even partially working brain surprised? It's a stock bubble, money is the objective.

Dabop · Oct 27, 2025

JayUtah said:
No, it doesn't stand for anything as an acronym. The original comments were written in all caps because lowercase did not functionally exist on the computers those programmers used to develop the Apollo software. Hence read instead, "Noli se tangere."

And you've actually hit upon some hacker lore, because the Latin is not quite correct. Se is the Latin reflexive pronoun, translating to one of "himself," "herself," or "itself." Think in law how a pro se litigant is one who argues "for himself," i.e. without a lawyer. The author probably meant to write Noli me tangere—"Don't touch me." This harks to what Jesus said to Mary in the Bible after he woke up from being dead. Instead, the comment translates literally to "Don't touch yourself." [Snickers from the back row] We know what meaning is intended from the context, but it's still funny if you understand Latin.

The Apollo code base is a fun early example of some snarky comments. Just a few lines above this comment is HONI SOIT QUI MAL Y PENSE, the motto of the Order of the Garter, generally interpreted here in context as "Don't judge me." This probably has to do with the notion that the code that follows is ugly and clunky, containing a bunch of empirically tuned constants to govern how the lunar module engine would be controlled. Just as every programmer can point to a shining moment of victory over impossible circumstances, every programmer can point to sections of code he wrote quickly under pressure and which he desperately wants to rewrite when he gets time.

Why am I seeing 'Life of Brian' flashing before my eyes????
Romanes eunt domus

Wudang · Oct 27, 2025

When my last UK colleague and I at <Big Bank> were handing over all our code to our replacements in HK and China we told them not to touch just one project. A colleague who'd moved on wrote the initial version for his MSc in ML and it relied on Bayesian probability and none of the replacements had ever gone near that kind of maths or that depth into DNS. I'd long forgotten what I knew and needed a refresher.
FWIW - it used DNS queries to try to work out whether a given service or component thereof was running on primary or contingency servers.
Initial version here: Github. The associated patent has expired.

W.D.Clinger · Nov 23, 2025

I just now saw this thread.

JayUtah said:
Memory-safe is a synonym for memory-managed. In a memory-managed language, the final code uses a language-specific runtime to manage the assignment of RAM to program structures and its subsequent recovery to avoid memory leaks. This comes at a computational and storage overhead. For computation, every reference to a program variable must be wrapped in additional code to interface with the memory manager and assure the correct dereference. For storage, the runtime must maintain its own metadata to organize the program storage which may exceed the actual program storage by a significant margin (or even a factor). This is often too much overhead and we have to turn to specialized Python modules to try to get around it.

In the 1990s and early 2000s, several researchers sought to quantify the real-world overhead of using garbage collection (which is usually what people mean by "memory-managed") instead of manual deallocation. I was one of them.

That research was hard to publish, mainly because the experimental results usually showed manual deallocation to be less advantageous than conventional wisdom expected. It wasn't at all unusual for garbage collection to perform better than manual deallocation. You couldn't get that kind of result published.

When one of those papers was eventually accepted at a conference, someone asked the presenter how much work he had to put into rewriting the manually deallocated versions of the programs before they would perform as well as the garbage-collected versions. There was laughter in the hall, because some thought the question was a joke. But the presenter answered seriously, saying that when he first ran the experiments, manual deallocation was performing worse than garbage collection pretty much across the board. Yes, he did have to put a lot of work into rewriting the manually-deallocated programs before they performed better than garbage collection.

The reasons for that were pretty interesting. The easiest way to avoid disastrous dangling pointers is not to deallocate anything. That, of course, results in disastrous storage leaks. So programmers were deallocating objects that were obviously safe to delete, but very cautious about deallocating objects that were shared across procedure or module boundaries, where it can be extremely hard for a static analysis to determine where it's safe to delete the objects. (As a corollary of the halting problem's undecidability, there is no perfectly reliable way for any static analysis to do this job perfectly.) Erring on the cautious side, programmers were allowing objects to remain allocated long after garbage collection would have recovered their storage. That's one of the two main reasons why it was so hard for the researcher to rewrite the manually-deallocated versions so they would perform as well as or better than the garbage-collected versions.

The other reason is that the standard allocation and deallocation libraries used by C and C++ were adequately efficient for programs that didn't do much allocation and deallocation, but quite slow when stressed by the allocation-intensive programs where you can hope to see measurable differences between manual deallocation and garbage collection. The research I described above motivated those who maintained those libraries to rewrite them. The rewritten libraries used today are much more efficient than the libraries used back then.

JayUtah · Nov 25, 2025

W.D.Clinger said:
In the 1990s and early 2000s, several researchers sought to quantify the real-world overhead of using garbage collection (which is usually what people mean by "memory-managed") instead of manual deallocation. I was one of them.

Thanks, I should go read this, then. It will make a welcome departure from trying to teach fluid dynamics to a brick wall.

W.D.Clinger said:
The reasons for that were pretty interesting. The easiest way to avoid disastrous dangling pointers is not to deallocate anything. That, of course, results in disastrous storage leaks.

Oh my...

W.D.Clinger said:
So programmers were deallocating objects that were obviously safe to delete, but very cautious about deallocating objects that were shared across procedure or module boundaries, where it can be extremely hard for a static analysis to determine where it's safe to delete the objects.

I know our C++ team has a memory ownership policy and mandates "smart" pointers. I don't think that addresses the static analysis problem, but it does attempt to code in a way that makes resource ownership knowable to some extent. And then as soon as you embody smart-pointer semantics, I imagine someone can make the argument that you've incurred the overhead of memory safety (reference counts, indirection, etc.) without the advantages of garbage collection.

W.D.Clinger said:
The other reason is that the standard allocation and deallocation libraries used by C and C++ were adequately efficient for programs that didn't do much allocation and deallocation, but quite slow when stressed by the allocation-intensive programs where you can hope to see measurable differences between manual deallocation and garbage collection. The research I described above motivated those who maintained those libraries to rewrite them. The rewritten libraries used today are much more efficient than the libraries used back then.

Our compiled code that uses explicit memory management generally falls into the pattern of making all its allocations ahead of time and then releasing them only upon program exit. We write embedded code for small MCUs and tight memory sizes. There's an idiom of memory management that applies there. And then we write programs for supercomputing clusters that can basically assume they own the machine even though Linux is still present. We allocate at the page level for those, not through POSIX malloc and free or C++ new and delete. These seem like fairly idiomatic programming patters.

As you would expect, our middle-ground software is all written in Python.

I've seen some truly bad C and C++ code (in my opinion) that seems to treat the free store and the standard library features that rely on the free store as if it were, well, free. If you're going to compare that kind of coding to equivalent programs that use garbage collection, I expect quite poor performance from the explicit memory management.

malbui · Nov 25, 2025

I played in a metal band called the Dangling Pointers when I was a postgrad. The motivation behind the name is that none of us were very good C programmers.

Or musicians, to be completely honest.

W.D.Clinger · Nov 25, 2025

W.D.Clinger said:
In the 1990s and early 2000s, several researchers sought to quantify the real-world overhead of using garbage collection (which is usually what people mean by "memory-managed") instead of manual deallocation. I was one of them.

JayUtah said:
Thanks, I should go read this, then. It will make a welcome departure from trying to teach fluid dynamics to a brick wall.

The conference presentation I described became this journal article:

Benjamin Zorn. The measured cost of conservative garbage collection.

Software—Practice & Experience, Volume 23, Issue 7

Pages 733 - 756

https://doi.org/10.1002/spe.4380230704

Published: 01 July 1993 Publication History

A copy of the article is online here.

JayUtah said:
I know our C++ team has a memory ownership policy and mandates "smart" pointers. I don't think that addresses the static analysis problem, but it does attempt to code in a way that makes resource ownership knowable to some extent. And then as soon as you embody smart-pointer semantics, I imagine someone can make the argument that you've incurred the overhead of memory safety (reference counts, indirection, etc.) without the advantages of garbage collection.

Well, there are different kinds of smart pointers, with different overheads. Some smart pointers rely on reference counting. Studies that have compared reference counting to garbage collection have generally concluded that garbage collection is faster than reference counting, by a considerable margin. Smart pointers that rely on reference counting are viable when they aren't used very much. Programs that make heavy use of reference counting would probably perform better if they used garbage collection instead, even if they would have to use a conservative garbage collector instead of a precise collector.

JayUtah said:
Our compiled code that uses explicit memory management generally falls into the pattern of making all its allocations ahead of time and then releasing them only upon program exit. We write embedded code for small MCUs and tight memory sizes. There's an idiom of memory management that applies there. And then we write programs for supercomputing clusters that can basically assume they own the machine even though Linux is still present. We allocate at the page level for those, not through POSIX malloc and free or C++ new and delete. These seem like fairly idiomatic programming patters.

That remains the usual strategy for embedded systems.

JayUtah said:
I've seen some truly bad C and C++ code (in my opinion) that seems to treat the free store and the standard library features that rely on the free store as if it were, well, free. If you're going to compare that kind of coding to equivalent programs that use garbage collection, I expect quite poor performance from the explicit memory management.

That programming style is considered poor in C and C++ because those languages do not use and cannot use modern garbage collectors; they are limited to smart pointers and/or conservative garbage collection. In languages that are designed for precise garbage collection, with implementations that use modern garbage collectors, that programming style is not only acceptable but quite efficient.

malbui · Nov 25, 2025

I am completely out of the loop in respect of modern programming languages and techniques. In my day the management of limited resources was critical and the software I wrote for my MSc in 1992 had a central executable that was only 19K in size. One of my nephews is a clearly very good developer given his career so far but according to him questions of economising resources and making the code efficient seem to be left to the software itself. If ever things run too slowly he simply orders a new box... his current workhorse has something stupid like 64 cores and 16 TB of memory, or something like that.

alfaniner · Nov 26, 2025

In my 40+ years or programming I never once came across one that had anything to do with pointers or memory or garbage collection (from my perspective, anyway). This includes an allusion to them later on when I had to learn a new platform to develop iPhone and Android apps. The closest I ever came to that was probably in programming school using Assembly, but that was the only time I was ever exposed to it.

Even as a support person during most of that time, I never got a call on a problem with resources like mentioned above. But I was a good, conservative programmer and my code was efficient.

Software quality collapse

BOFH

Maledictorian

BOFH

Lackey

Lackey

Penultimate Amazing

Lackey

Thinker

BOFH

Lackey

BOFH

Penultimate Amazing

Begging for Scraps

Philosopher

Lackey

Lackey

BOFH

Penultimate Amazing

No longer the 1

Resident Skeptical Hobbit

Penultimate Amazing

Philosopher

Resident Skeptical Hobbit

Penultimate Amazing

Penultimate Amazing

Penultimate Amazing

BOFH

Penultimate Amazing

Critical Thinker

Penultimate Amazing

Penultimate Amazing

No longer the 1

Master Poster

BOFH

Philosopher

Penultimate Amazing

Beauf

Philosopher

Beauf

Penultimate Amazing

Similar threads