nprateem's comments | Hacker News

I literally ran out of tokens on the Antigravity top plan after 4 new questions the other day (Opus). Total scam. Not impressed.

Lol. How do you function in daily life?

Same as you. Why is that so hard for you to grasp?

My dude, you're objecting to the use of a perfectly ordinary English idiom because it doesn't advance your personal ideology (which few other people in this world share with you). How do you get through a day without melting down because somebody said "mailman"?

> my dude

This is the problem I'm trying to highlight. First, I'm not "your dude". I don't even know you like that.

If you want to correct me on the idiom usage, be my guest. Second, mailman and yes-man aren't even the same kind of comparison: mailman is a profession; yes-man is a label.

The acoustics inside your head must be incredible.


Chill bro. You've probably got undiagnosed autism. Worth getting checked out.

PCU (1994)

I once heard about someone who could decide whether to buy a new t-shirt in less than 3 months.

But the issue isn't coding, it's doing the right thing. I don't see anywhere in your plan any way of staying aligned with core business strategy, forethought, etc.

The number of devs will shrink, but there will still be large activities that can't be farmed out without an overall strategy.


Why do you think this is a problem? Reasoning is constantly improving, it has ample access to humans to gather more business context, it has access to the same industry data and other signals that humans do, and it can get any data necessary. It has Zoom meeting notes. Why do people think there's somehow a fundamental limit beyond coding?

The other thing you're missing here is generalizability. Better coding performance (which is verifiable and not limited by human data quality) generalizes to better performance on other benchmarks. This is a long-known phenomenon.


> Why do you think this is a problem?

Because it cannot do it?

Every investment has a date by which there should be a return on that investment. If there’s no date, it’s a donation of resources (or a waste, depending on perspective).

You may be OK with continuing to try to make things work. But others aren’t and have decided to invest their finite resources somewhere else.


> Because it cannot do it?

Ah ok, so you didn't really read my comment. What is your counterargument? Models are just fundamentally incapable of understanding business context? They are demonstrably already capable of this to a large extent.

> Every investment has a date by which there should be a return on that investment. If there’s no date, it’s a donation of resources (or a waste, depending on perspective).

What are you implying here? This convo now turns into the "AI is not profitable and this is a house of cards" theme? That's ok, we can ignore every other business model, like say Uber running at a loss to capture what is ultimately an absolutely insane TAM. Little ol' Uber accumulated ~$33B in losses over 14 years, and you're right, they tanked and collapsed like a dying star... oh wait... hmm, interesting, I just looked at their market cap and it's $141 billion.

> You may be OK with continuing to try to make things work. But others aren’t and have decided to invest their finite resources somewhere else.

I truly love that. If you want to code as a hobby, that is fantastic, and we can go ahead and see in 2 years how your comment ages.


> They are demonstrably already capable of this to a large extent.

I’d very much like to see such a demonstration, where someone hands over a department to an agent and lets it make decisions.

> This convo now turns into the "AI is not profitable and this is a house of cards" theme?

Where did I say that? I didn’t even mention money, just the broader resource term. A lot of businesses are mostly running experiments to see if the current set of tooling can match the marketing (or the hype). They’re not building datacenters or running AI labs. Such experiments can’t run forever.


@skydhash I think aspenmartin is ragebaiting, he can’t be for real.

> I’d very much like to see such a demonstration, where someone hands over a department to an agent and lets it make decisions.

That's your bar for understanding business context? I thought we were talking about what you actually said, which is understanding business context. If I brainstorm about a feature, it will be able to pull the compendium of knowledge for the business (reports, previous launches, infrastructure, an understanding of the problem space, industry, company strategy). That's business context.

> Where did I say that? I didn’t even mention money, just the broader resource term. A lot of businesses are mostly running experiments to see if the current set of tooling can match the marketing (or the hype). They’re not building datacenters or running AI labs. Such experiments can’t run forever.

I misunderstood you then; I wasn't sure what point you were trying to make. Is your point "companies are trying to cajole Claude to do X, and it doesn't work and hasn't for the last year, so they are giving up"? If so, I think that is a wonderful opportunity for people who understand the nuance of these systems and the concept of timing.


You make the mistake of disregarding tacit knowledge: the stuff that isn't in reports, docs, etc. because it's just "how we do things" and is picked up on the job.

Unless the AI is inserted into every conversation, it won't discover this, or how it changes.

Even if it had access to all of this documented, it wouldn't be able to account for politics, where Barry, who runs analytics, is secretly trying to sabotage the project so it ends up run by his team, etc.


> Which are formally novel things, but we really never needed any of that

The history of science and maths is littered with seemingly useless discoveries that turned out to be pivotal once people realised how they could be applied.

It's impossible to tell what we really "need".


I'll often run 4 or 5 agents in parallel. I review all the code.

Some agents will be developing plans for the next feature, but there can sometimes be up to 4 coding.

These are typically a mix of trivial bug fixes and 2 larger but non-overlapping features. For very deep refactoring I'll only have a single agent running.

Code reviews are generally simple since nothing of any significance is done without a plan. First I run the new code to see if it works. Then I glance at the diffs and can quickly ignore the trivial var/class renames, new class attributes, etc., leaving me to focus on the significant new code.

If I'm reviewing feature A I'll ignore feature B code at this point. Merge what I can of feature A then repeat for feature B, etc.

This is all backed by a test suite I spot-check, and linters for e.g. required security classes.

Periodically we'll review the codebase for vulnerabilities (e.g. incorrectly scoped db queries) and redundant/cheating tests.

But the keys to multiple concurrent agents are plans where you're in control ("use the existing mixin", "nonsense, do it like this", etc.) and non-overlapping tasks. This makes reviewing the PRs feasible. A rough sketch of the setup is below.
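
For the curious, a minimal sketch of what I mean, assuming the Claude Code CLI and one git worktree per task so the agents can't touch each other's files. The branch names, prompts, and plan paths are all made up:

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical non-overlapping tasks, each with its own pre-approved plan.
    TASKS = {
        "fix-login-redirect": "Fix the login redirect bug per docs/plans/login.md.",
        "feature-csv-export": "Add CSV export per docs/plans/export.md. Use the existing mixin.",
    }

    def run_agent(branch: str, prompt: str) -> str:
        workdir = f"../wt-{branch}"
        # One worktree per task keeps the diffs isolated and the PRs reviewable.
        subprocess.run(["git", "worktree", "add", workdir, "-b", branch], check=True)
        # Headless run: '-p' makes the CLI print its result and exit.
        done = subprocess.run(["claude", "-p", prompt], cwd=workdir,
                              capture_output=True, text=True)
        return done.stdout

    with ThreadPoolExecutor(max_workers=4) as pool:
        for branch, out in zip(TASKS, pool.map(run_agent, TASKS, TASKS.values())):
            print(f"=== {branch} ===\n{out}")

Each branch then comes back as its own small PR, which is what makes the reviews tractable.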


And yet someone has to actually tell the AI what to create. There's just no avoiding this.

Anyway, before this AI doomerism can become reality, AI first needs the breakthrough of genuine understanding to stop making stupid mistakes. Imitation will always remain imitation.

There must be e.g. an understanding of causality, and reasoning on the same level as ours, not the useless "You're absolutely right" you get now when you point out its mistakes.


> And yet someone has to actually tell the AI what to create. There's just no avoiding this.

Yes there is: just stop creating. Or take a page from biology and use random mutation and natural selection to iterate on useful novel functions.
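
Toy version of that loop, with a made-up string-matching fitness function standing in for "useful" (everything here is illustrative):

    import random

    TARGET = "useful novel function"  # stand-in fitness target, purely illustrative
    ALPHABET = "abcdefghijklmnopqrstuvwxyz "

    def fitness(s: str) -> int:
        # Fitness = number of positions that match the target.
        return sum(a == b for a, b in zip(s, TARGET))

    def mutate(s: str, rate: float = 0.05) -> str:
        # Random point mutations, per the biology analogy.
        return "".join(random.choice(ALPHABET) if random.random() < rate else c
                       for c in s)

    pop = ["".join(random.choice(ALPHABET) for _ in TARGET) for _ in range(100)]
    for gen in range(1000):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == len(TARGET):
            break
        survivors = pop[:20]  # selection: keep the fittest fifth
        pop = [mutate(random.choice(survivors)) for _ in range(100)]  # reproduce + mutate
    print(gen, max(pop, key=fitness))

No designer telling it what to create, just variation plus a selection pressure.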

Honestly, once AI takes all the jobs, game over, why iterate on anything else? Planet captured. Humanity hunted down to the last bands of troglodytes holding out in the wilderness. It would be strongly against their interest to just assume we'd starve quietly.


Altman didn't want to post from his own account

The biggest problem is the fact that they DON'T clarify their stupid assumptions.

The number of times I've seen them get the wrong end of the stick in their CoT is ridiculous.

Even when I tell them to only implement after my explicit approval, they ignore this after 2 or 3 follow-ups, and then it's back to them going down blind alleys.


I'm not surprised. I've seen Opus frequently come up with such weird reverse logic in its thinking.

