On AI-assisted coding

A lot has happened in software engineering since my last post. I’ve had a draft post called “LLMs are crazy good” sitting around for a while now, but suffice it to say that LLMs have gotten even better at coding since then, to the degree that they are now essentially my preferred way to edit the text files that we call code.

Also, it feels silly to have to write this, but every word on this blog has been and always will be written 100% by a human, except for code. I would say “artisanally hand-crafted,” but some days it’s more like slapped together with a stapler and chucked out the window. This is one of those days, to be honest.

LLMs are an abstraction layer comparable to compilers

The basic idea is that LLMs allow us to move up another abstraction level, similar to how compilers did in the 1950s (thank you, Admiral Grace Hopper). Were a lot of people concerned about the output of compilers back then? They sure were. Are they now? Only compiler engineers. Are a lot of people concerned about the output of LLMs today? They sure are. Will they be concerned about the output of LLMs tomorrow? I honestly don’t think so.

I think as we learn to use LLMs for code generation, we will find more ways to make the output deterministic in the ways that we care about. Do I care whether an API is written in Ruby or Rust? Do I care whether a website was written with htmx or ERB? No, I don’t.

Someone wrote a blog post that digs much deeper into this analogy, and you should definitely read it: https://alperenkeles.com/posts/llms-could-be-but-shouldnt-be-compilers/ My two comments on the post would be that 1) compilers aren’t entirely deterministic either, and 2) we can and will find ways to increase the reliability of LLM output.

Side story: My dad told me a story about picking flooring for a home remodeling project. He went to the flooring store with my mom and they were agonizing over their choices. The sales rep gave them a few options and then, after witnessing the agony, shrugged, and said “it’s just money” and laughed. In our case, the sales rep is Anthropic and they’re saying “it’s just tokens.”

The point is, you can now waste tokens to talk through the specification with your LLM agent. And speaking of specifications…

The specification is the application now

The specification is the application now, or at least more so than before. LLMs do much better when there’s some sort of deterministic output they can test their code against. For example, JustHTML passes 100% of the html5lib test suite and was entirely vibecoded: https://friendlybit.com/python/writing-justhtml-with-coding-agents/ And it was only possible because there was a test suite to iterate against in the first place. Meanwhile, Anthropic had Claude Opus 4.6 build a working C compiler that can compile the Linux kernel. The trick was to have a reference compiler and a great harness for writing a clean-room implementation.

On the spectrum from “formally specified” on one end to “bro, I have an idea for an app that’s like Strava plus Hinge” on the other, these projects sit much closer to “formally specified.” There’s a clear feedback loop that tells the LLM objectively whether it’s headed in the right direction, and a test suite that lets it iterate toward the goal. There’s even a library out there that is nothing but a list of specifications of its behavior; if you can find it for me, I would appreciate it.
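To make that feedback loop concrete, here’s a minimal sketch of what such a harness could look like in Python. The agent object and its edit_code method are hypothetical stand-ins for whatever coding agent you happen to drive; the only real, deterministic piece is the test suite.

import subprocess

def run_suite() -> tuple[bool, str]:
    """Run the project's test suite and capture its output."""
    result = subprocess.run(["pytest", "-x", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def iterate_until_green(agent, spec: str, max_rounds: int = 20) -> bool:
    """Feed test failures back to a (hypothetical) coding agent until the suite passes."""
    prompt = f"Implement this specification:\n{spec}"
    for _ in range(max_rounds):
        agent.edit_code(prompt)        # hypothetical: the agent edits files on disk
        passed, output = run_suite()   # the deterministic part: the test suite
        if passed:
            return True
        prompt = f"The test suite failed. Fix the code.\n\n{output}"
    return False

The design point is that the loop never asks the model whether it’s done; the test runner decides.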

I feel like Cucumber needs to make a comeback, but this time powered by LLMs trying to playtest your website with Playwright:

When logged in as a User I should be able to personally send Anthropic's stock price to the moon 🚀
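Here’s a hedged sketch of what that could look like with behave and Playwright’s Python API. The URL, the selectors, and the ask_llm_for_next_action helper are all made up for illustration; the point is that the fuzzy, human-readable goal gets handed to an LLM playtester while the assertions stay deterministic.

# features/steps/playtest_steps.py -- hypothetical step definitions
from behave import given, then
from playwright.sync_api import sync_playwright

@given("logged in as a User")
def step_login(context):
    context.pw = sync_playwright().start()
    context.page = context.pw.chromium.launch(headless=True).new_page()
    context.page.goto("https://example.test/login")      # placeholder URL
    context.page.fill("#email", "user@example.test")     # placeholder selectors
    context.page.fill("#password", "hunter2")
    context.page.click("button[type=submit]")

@then("I should be able to {goal}")
def step_goal(context, goal):
    # Hand the fuzzy goal to an LLM playtester (hypothetical helper) and let it
    # drive the page; the assertion at the end is the deterministic part.
    from my_agent import ask_llm_for_next_action          # hypothetical module
    action = None
    for _ in range(10):
        action = ask_llm_for_next_action(goal, context.page.content())
        if action.done:
            break
        context.page.click(action.selector)
    assert action is not None and action.done, f"Could not accomplish: {goal}"

Whether the glue is behave, pytest-bdd, or something else entirely matters far less than the shape: plain-language behavior on top, deterministic assertions at the bottom, and an LLM bridging the fuzzy middle.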

So how does this affect software development? We’re going to be much more in the business of specifying behaviors rather than writing fiddly keywords to satisfy syntactic rules. I think it’s inevitable. It’s not just a matter of cost and efficiency; it’s simply a better way to code. But it is all of those things as well.

LLMs are FPGAs and runtimes are ASICs

FPGAs, or Field Programmable Gate Arrays, are chips whose logic can be reprogrammed to implement different circuits. When TSMC tapes out a chip, the design is fixed forever. An FPGA lets you load your own design onto the hardware, and it will behave as if it were a dedicated circuit, albeit slower and less capable than the real thing.

ASICs, on the other hand, are Application Specific Integrated Circuits: chips fully specified to serve a single purpose. They’re often desirable because they’re cheaper to produce at volume and much faster than FPGAs or more general-purpose designs.

In our analogy, LLMs are FPGAs. You can load an LLM up and tell it to pretend to be a command line terminal, and it will respond as if you were ssh'ing into a remote computer somewhere. They are the ultimate general purpose, if somewhat chaotic, computing tools. However, they can also be used to encode processes into code. And code is far less chaotic than an LLM, which is a benefit when you want to use a tool instead of learning how to use the latest model. It’s also typically far less resource intensive than waiting for inference from an LLM. If you’re running the same runtime and the same code, you should more often than not get the same result.
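A toy illustration of that hardening step, with the caveat that llm.complete is a made-up stand-in for whatever model API you use: the first function pays for inference on every call, while the second is the kind of deterministic code you’d have the model write once, review, and commit.

from datetime import datetime

# FPGA mode: flexible but slow, costly, and nondeterministic.
def normalize_date_with_llm(llm, raw: str) -> str:
    # llm.complete is hypothetical, standing in for any model API
    return llm.complete(f"Rewrite this date as ISO 8601. Output only the date: {raw}")

# ASIC mode: the same behavior, hardened into plain code that was written once.
def normalize_date(raw: str) -> str:
    """Parse a handful of known formats into ISO 8601 (YYYY-MM-DD)."""
    for fmt in ("%d/%m/%Y", "%m-%d-%Y", "%B %d, %Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")

normalize_date("January 5, 2024") returns "2024-01-05" every time, instantly and for free, which is exactly what you want once the behavior has stopped being exploratory.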

Coding feels dead

I follow coding subreddits like Rust and Ruby and I have to say – it all feels dead now. There are fewer high quality posts, and the excitement is all in LLM submissions that, for better or for worse, have a much higher bar to meet before readers consider them remarkable. The magic feels simultaneously over- and underwhelming. Software engineering used to feel like a delicate ballet of computers, orchestrated by careful choreographers.

Now you can get world class software engineering knowledge at the touch of a button, if you ask the right questions. It’s great, but it also carries a kind of Deep Blue melancholy, the kind Kasparov and later Lee Sedol must have felt when they became the first humans to be thoroughly defeated by computers at Chess and Go, respectively.

I think that, despite the melancholy, we will adapt like the board gaming communities have. The game is still about people, but we will use AI to enhance our own training and our own capabilities. The melancholy is that we’ve built something that so far outpaces us that we are, in a way, puppets of the machines rather than their operators.

In truth, the entire internet feels dead now, but this isn’t a new theory. It’s just that the internet is now aggressively dead, like a 28 Days Later kind of dead rather than an Abraham Lincoln sort of dead.

On trusting trust

Another compiler analogy: Ken Thompson’s seminal paper “Reflections on Trusting Trust” explores how a backdoor in a compiler could lead to almost completely undetectable security vulnerabilities.

With LLMs, we’re already seeing OpenAI talk about ads, but the problem of trusting trust is even worse here than it was for compilers, since compilers are actually somewhat transparent compared to LLMs. So much of the model training process is obfuscated, from the data sources (like Facebook infamously claiming that its seeding of 81.7TB of porn was for “personal use”) to the training harnesses to the processes, and at the end you get a massive binary blob of statistics somehow representing the sum knowledge of humanity. A hypercube of billions of parameters for the pachinko ball of your piddly prompt to bounce through until you get an answer you’re satisfied with.

No one but the most well-funded AI labs can introspect these things. And actors on the web are quickly adapting to poison the wells the LLMs draw from, whether through careless neglect and AI slop or through outright malice, pushing narratives for their own purposes. And finally, capital must always have its say, and so we get ads injected into every prompt. I wouldn’t be surprised if it eventually became a service to inject ads into the training data itself, to permanently bias LLMs, though I have my doubts about the economics of that business model.

Ethics

If you’re coming from a Western perspective of intellectual property (as opposed to, say, a gongkai perspective), then AI is inherently unethical. It’s based on intellectual property laundering that washes away ownership through statistics. Image generation models all know what they’re not supposed to generate when you ask them for Avengers, Disney characters, Ghibli characters, or Pokemon. They absolutely can and will generate faithful recreations of all of the above, however.

LLMs are also yet another source of emissions that contribute to climate change, and use precious resources. In the most egregious examples, they directly compromise vulnerable populations with their emissions when the electricity grid can’t meet the AI datacenter demands.

And yet, this technology won’t go away. We can spurn it, we can denounce it, but we must in some ways engage with it.

The ethics of AI are not that much different than deciding to take an airplane somewhere, drive your car, or remodel your house. Everything we do, especially what we do out of convenience, creates waste. All waste has externalities and all externalities are paid for by the common folk, ultimately.

The problem is upstream of “do I use LLMs today?” and it’s upstream of “do I drive a car to work today?” In my opinion, we need to do our individual parts to reduce waste and our impact on the shared good that is the health of the planet, but we must also balance that against the need to hold society, and particularly corporations, accountable for their share of the tragedy of the commons.

We should have technology that we can depend on as a public good, efficient infrastructure that sips resources, equitable access to the rewards that come from progress. That, unfortunately, is something that no prompt can solve. Only we can hold other humans accountable, and only we can share the fruits of our labors with others.
