[ NOTE ] 12 April 2026
Building Dark Factories
AI-assisted GPT-5.4
Some thoughts on the role of developers now that the AI genie is out of the bottle.
This week at Martello the dev team got together to share thoughts on the role of AI in development. I wasn’t in the meeting, but it was recorded, which made for a fascinating and insightful watch.
Coupled with listening to DHH on the Pragmatic Engineer podcast and Simon Willison on Lenny’s Podcast, I… well. I have some thoughts.
Dark Factories
Simon explained the idea that, if a factory is truly fully automated, then there is no need to turn the lights on. The analogy is that, in a fully AI-driven development system, nobody needs to look at the code.
“But,” we say, “if we don’t look at the code, how will we know if it’s good?”
Isn’t that what people said when compiled languages came about?
In some cases, we never did move over to fully trusting the compiler. The @FFmpeg Twitter account frequently tweets about the lengths contributors go to in order to write some of the most performant code imaginable by hand. They absolutely do not trust the compiler's output.
But, for a surprisingly large number of applications, optimising machine code by hand isn’t needed, and we don’t look at the machine code generated by the compiler at all.
For the majority of applications, we solved this by building systems, tools and processes that reduced — and in most cases eliminated — the need for people to write and review low-level code.
Everyone can program now
Yes? No? Maybe? I don’t know. Yes, anyone can open an LLM application and type “build my company a CRM. make no mistakes” and you will get… something. A good something? Probably not.
In some respects, the difficult part of programming - at least the kind of programming that I do, which is to Computer Science as plumbing is to Fluid Dynamics - was never writing the code. Not really. The hard part was writing the right code. The code that finds the elusive “Product-Market Fit”. The code that results in being able to “build something people want”. The code that can be exchanged for cold, hard cash, again and again and again.
That’s not to say that writing “good” code isn’t important, or that writing code at all was always easy. My preferred definition of “good” code is “code that is easy to change”. There is an abundance of wisdom on this aspect of programming - SOLID, DRY, “Tell don’t ask” and loads, loads, loads more.
Anyone can write code, but writing “good” code is a craft and, in the same way a beautifully crafted chef’s knife may be valuable in its own right, its real value is proven when it achieves the outcome it was designed for.
We write code because we want to solve a problem. There are, usually, many ways to solve a problem, and usually those solutions come with trade-offs. This is true at the macro level - should the system do x at the expense of y, or a at the expense of b - but also at the micro level - should we use this pattern or that one?
Slop vs. “Taste”
An LLM will produce the application you prompt it to, in the same way it will produce the PRD you ask it to, summarise the meeting you ask it to, or whatever else.
The difference between a good and a bad outcome is what is distastefully referred to in common parlance as “taste”. Taste is the expression of experience applied to choices. Making the right choices is the difference between success and failure.
The goal, then, is not to verify that the output of the LLM is the same code one would have written by hand, in the same way that the goal is not to ensure the compiler writes the same code one would have written by hand.
The goal is to ensure the outcome matches the levels of taste and sophistication the author of the code expects.
Left unattended, LLMs will produce a torrent of unmitigated slop. I’ve rocked up to work on a Monday morning with a new project off the back of some variation of “build the thing, make no mistakes” and a cheery “hey everyone, look at this thing the clanker built - let’s ship it.” It doesn’t work; I don’t know how it’s supposed to work; this is not a useful experience for anyone.
And with AI, everyone can do this in astonishing volume. The slop tsunami is crashing down on us.
Building Dark Factories
This is the role of the future developer in the world of AI. Paul Dix described this as “Build the machine that builds the machine”.
I’d push it one step further. Our role as developers is to harness the slopodrolic power and coerce it into production code.
We must impose our experience and preferences not on the code written but on the outcomes achieved. We must do the engineering required to build the Dark Factories.
“Agents” are wildly capable but, ironically enough, the one attribute they lack is agency. Agents can’t decide to go from zero to one. They exist in loops and are triggered by events. LLMs themselves are stateless, and every message is brand new information.
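That loop-and-trigger shape can be sketched in a few lines. This is a minimal illustration, not any real SDK: `call_llm`, `run_tool`, and the message format are all hypothetical stand-ins, and the stub model simply asks for one tool call and then finishes.

```python
def call_llm(messages):
    # Hypothetical stand-in for a real model API call. The model is stateless:
    # it sees only the `messages` list it is handed on this turn.
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "tool_call": "run_tests"}
    return {"role": "assistant", "content": "done"}

def run_tool(name):
    # Stand-in for actually invoking a tool and capturing its result.
    return {"role": "tool", "content": f"{name}: all tests passed"}

def agent_loop(trigger_event):
    # The agent only runs because something external triggered it -
    # it has no agency of its own, just a loop around a stateless model.
    messages = [{"role": "user", "content": trigger_event}]
    while True:
        # Every turn re-sends the *entire* history; to the model,
        # each call is brand new information.
        reply = call_llm(messages)
        messages.append(reply)
        if "tool_call" not in reply:
            return messages  # the loop ends when the model stops asking for tools
        messages.append(run_tool(reply["tool_call"]))

history = agent_loop("CI failed on main - investigate")
```

The point of the sketch is the shape: an external event starts the loop, tool results are appended to the history, and the full history is replayed on every turn because nothing persists inside the model itself.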
The old IBM quote still stands:
A computer can never be held accountable, therefore a computer must never make a management decision.
— IBM (1979)
It’s not that LLMs can’t make any decisions. Clearly they can. But there is a class of decision-making — call it management — that an LLM should not do, because it cannot be held accountable for the consequences.
In “Anything You Want”, Derek Sivers praises the virtues of delegation but warns of the perils of abdication.
Yes, we’re now in a world in which AI can write code. Perhaps the most important role engineers will play in this Brave New World is being accountable for the support and maintenance of that code.
Someone needs to decide to string the LLMs and tool calls together. Someone needs to do so in a way that results in code that not only works, but produces the desired outcome and meets the real world where it is.
Someone needs to be on the hook for the resulting code: for prioritising fixes, for strategically rolling out changes, for deciding when to preserve backwards compatibility and when to make a breaking change, and for ensuring that an outage at 2am gets resolved.
It might be the case that AI actually does much of that work, but it is engineers who remain ultimately accountable for the success or failure of a system running in production.
AI has already led to an unimaginable volume of code, and there is vastly more yet to be written. The role of engineering cannot be to eyeball all that code and approve it before shipping.
We need patterns and practices that allow us to turn off the lights without losing control of the factory. Just as TDD, OOP, and other ideas helped us reason about software at higher levels of abstraction, the dawn of AI demands new concepts and tools. We need ways to channel the output of AI systems without having to pore over every line before it hits production, while still remaining accountable for the results.
Engineering will change dramatically in an AI world. But fundamentally the job remains the same: engineers own the outcome of running systems in production. That’s the job. It always has been.
“So what are you trying to tell me? That I can dodge bullets?”
“No, Neo. I’m trying to tell you that when you’re ready, you won’t have to.”
— The Matrix (1999)