Hacker News

> Able to review the code output of coding agents

That probably won’t be necessary in a few years.



It's necessary for devs right now, no matter how good they are, and it's those devs' code the models are trained on.


Even worse, the training set probably includes a lot of code that needed review but didn't get it...


If we know the outcome of that code, such as whether it caused bugs or data corruption or a crappy UX or tech debt -- which is potentially available in subsequent PR commit messages -- it's still valuable training data.

Probably even more valuable than code that just worked, because evidently we have enough of that and AI code still has issues.


I see this line of thought put out there many times, and I've been thinking: why do people do anything at all? What's the point? If no one at all is even reviewing the output of coding agents, genuinely, what are we doing as a society?

I fail to see how we transition society into a positive future without supplying means of verifying systemic integrity. There is a reason that Upton Sinclair became famous: wayward incentives behind closed doors generally cause subpar standards, which cause subpar results. If the FDA didn't exist, or they didn't "review the output", society would be materially worse off. If the whole pitch for AI ends with "and no one will even need to check anything" I find that highly convenient for the AI industry.


You could e.g. write specs and only review high level types plus have deterministic validation that no type escapes/"unsafe" hatches were used, or instruct another agent to create adversarial blackbox attempts to break functionality of the primary artifact (which is really just to say "perform QA").
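To make the "deterministic validation" idea concrete, here is a minimal sketch of a check that flags escape hatches in generated code before anyone reviews it. The pattern table and the `find_escape_hatches` helper are hypothetical names made up for illustration, and the regexes are crude (they also match inside comments and strings), so a real setup would more likely rely on compiler flags or linter rules.

```python
import re

# Hypothetical escape-hatch patterns, for illustration only.
# Adjust these per language and codebase.
ESCAPE_HATCHES = {
    "typescript": [r"\bas\s+any\b", r":\s*any\b", r"@ts-ignore", r"@ts-expect-error"],
    "rust": [r"\bunsafe\b"],
}


def find_escape_hatches(source: str, language: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line matching a pattern."""
    patterns = [re.compile(p) for p in ESCAPE_HATCHES[language]]
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if any(p.search(line) for p in patterns):
            hits.append((lineno, line.strip()))
    return hits


if __name__ == "__main__":
    sample = "const x = y as any;\nconst z: number = 1;\n// @ts-ignore\n"
    for lineno, text in find_escape_hatches(sample, "typescript"):
        print(f"line {lineno}: {text}")
```

A check like this could run in CI and fail the build on any hit, so the human only reviews the high-level types and the spec.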

As a simple use-case, I've found LLMs to be much better than me at macro programming, and I don't really need to care about what it does because ultimately the constraint is just that it bends the syntax I have into the syntax I want, and things compile. The details are basically irrelevant.


Code quality will impact the effectiveness of AI. Less code to read and change in subsequent changes is still useful. For a while I became more of a paper architect and stopped coding, and I realized I could no longer do adequate code reviews because I lacked context. When I eventually went back into the code, I saw the mess my team was making and spent a long while cleaning it up. This improved the productivity of everyone involved. I expect AI to fall into a similar predicament. Without firsthand knowledge of the implementation details, we won't know about the problems we need to tell the AI to address. There are also many systems that are constrained in memory and compute, and more code likely pushes you up against those limits.


I don't disagree that code quality is currently more important than it's ever been (to get the most out of the tools). I expect that quality will increase though as people refine either training or instructions. I was able to get much better (well factored, aligned to business logic) output that I'm generally happy-ish with a couple months ago with some coding guidelines I wrote. It's possible that newer models don't even need that, but they work well enough with it that I haven't touched those instructions since.


I mean, sure, for programming macros. Or programming quick scripts, or type-safe or memory-safe programs. Or web frontends, or a11y, or whatever tasks for which people are using AI.

But if you peel back that layer to the point where you are no longer discussing the code, and just saying "code X that does Y"... how big is X going to get without verifying it? This is a basic, fundamental question that gets deflected by evaluating each case where AI is useful.

When you stop being specific about what the AI is doing, and switch to the general tense, there is a massive and obvious gap that nobody is adequately addressing. I don't think anyone would say that details are irrelevant in the case of life-threatening scenarios, and yet no one is acknowledging where the logical end to this line of thinking goes.


I mean, the promise of perfect AI and perfect robotics is that humans would no longer have to do anything. They could live a life of leisure. Unfortunately, we're going to get these perfect AI and perfect robotics before we transition socially into a post-scarcity, post-ownership society. So what will happen is that ownership of the AI and robots will be consolidated into the hands of the few, the vast rest of us will have nothing economically relevant to do, and we'll probably just subsist or die.

We're already seeing this today. Every year, thousands of people are becoming essentially irrelevant to the economy. They don't own much, they don't invest much, they don't spend much money, they don't make much money, and they are invisible to economics.


> They don't own much, they don't invest much, they don't spend much money, they don't make much money, and they are invisible to economics.

Indeed. Sometimes I think the so-called “lower classes” end up functioning more like crops to be farmed by the rich. Think, dollar stores that sell tiny packages of things at worse unit cost, checking account fees, rent-a-center, 15% interest auto loans and store credit cards with 30% interest…


I've definitely felt this kind of way in the past. But these days I'm not so sure.

Setting aside the AI point, the idea of people becoming essentially irrelevant to the economy is an indictment of society. But I'd argue that the indictment is really of what counts as measurement in the economy, not of society itself or of technology.

Sure, someone may not spend much money or make much money, but if they produce scientific research or cultural work that is intangibly valuable, it is still valuable regardless of whether economists can point to a metric. The same goes for the countless contributions to our world from nature: what is the economic value of a garden snake or a beetle? It's a meaningless question when the economy can only see things in dollars.


They will still be turning out the same problematic code in a few years that they do now, because they aren’t intelligent and won’t be intelligent unless there is a fundamental paradigm shift in how an LLM works.

I use LLMs with best practices to program professionally in an enterprise every day, and even Opus 4.6 still consistently makes some of the dumbest architectural decisions, even with full context, complete access to the codebase and me asking very specific questions that should point it in the right direction.


I keep hearing that “they aren’t intelligent” and that they spit out “crap code”. That’s not been my experience. LLMs have prevented and also caught intricate concurrency issues that would have taken me a long time to find.

I just went “hmmm, nice” and went on. The problem there is that I didn’t get that sense of accomplishment I crave and I really didn’t learn anything. Those are “me” problems but I think programmers are collectively grappling with this.


They are not intelligent. Full stop. Very sophisticated next word prediction is not intelligence. LLMs don’t comprehend or understand things. They don’t think or feel. That’s just not how they work.

That said, very sophisticated next word predictors can and sometimes do write good code. It’s amazing how much they get right, and then they can turn around and make the weirdest, dumbest mistakes.

It’s a tool. Sometimes it’s the right tool, sometimes it’s not.



