Hacker News | vmg12's comments

If I hand my shopping list to AI, why wouldn't I tell it to price match everything? People will start doing this sooner than you think. I still remember when people were scared to buy things on the internet, this will be faster.

Are you going to choose to buy your protein bar online from mysteryBargainBar[.]com for a $1 savings, or just pick it up as part of your local grocery trip?

> I still remember when people were scared to buy things on the internet

People still /are/ scared to buy things from Amazon that go on or in their body.


> Are you going to choose to buy your protein bar online from mysteryBargainBar[.]com for a $1 savings, or just pick it up as part of your local grocery trip?

1. I buy in bulk.

2. I check amazon vs walmart usually.


Yep.

ChatAI: show the top 50 online retailers by revenue in the US and note any that have credible news stories about quality-control issues. Save all of them except StoreX and StoreY in the list you use for comparison shopping.

Or maybe another one: scan all my credit card purchases for as far back as you have history and record all the stores.

Done. And plenty of third-party sites (Consumer Reports, Wirecutter, etc...) will do this kind of thing too. And you could perhaps transitively trust them: either viewing their lists directly or just scraping the places they recommend.

And the average person doesn't need to figure this out ... skills encoding this will propagate.


> mysteryBargainBar[.]com for a $1 savings

The AI could also research which stores are reputable.

> People still /are/ scared to buy things from Amazon for things that go on or in their body.

Sure, there are also people scared of flying in airplanes; by your logic, airlines must be a dud too.


Yes, from all those reputable AI reviews.

"Reputable" + Stochastic LLMs + Profit motive = A vast sea of poisonously false data and prompt injection attacks

Presumably the agents will band together on Moltbook and build their own TrustPilot competitor? :-)

Too bad Moltbook was written by humans, for humans: https://arxiv.org/abs/2602.07432

Grok, show me the place where the least people died eating product X.

In other words, switching costs go to 0, margins collapse. Middle men and people with products that aren't differentiated get hit hardest.

A human can't search 10 apps for the best rates / lowest fees but an agent can.

Thinking ahead 100 years from now: companies like DoorDash and Uber Eats don't exist and are instead protocols agents use to bid for items their user asks for, and price discovery happens in real time.


Go to a supermarket and witness that dozens of brands sell the same things at wildly different prices, yet they all still make a profit. The same goes for most services: there are already comparators for subscriptions, mortgage rates, &c.

And a human can 100% search 10 apps and use their brain to do basic maths; that's what we've been doing until now. Sometimes I wonder if AI shills live in a parallel universe, because it truly feels like they're living a completely different life than the vast majority of people...


> a human can 100% search 10 apps and use his brain to do basic maths

A human _can_ do all of that, but it takes time. If I have to search 10 apps for each item I want to buy (clothes, daily food, movie tickets, laptops, etc.), I will spend the rest of my life just searching for better deals. I'd rather have a bot do all of these searches for me.


What exactly am I shilling?

I don't see what the role of AI is in this. You don't need an AI to aggregate data from a bunch of sources. You'd be better off having the AI write a scraper for you than burning GPU time on an agent doing the same thing every time.

If you're paying a monthly fee for your agent, might as well use it to save you another few mins

> A human can't search 10 apps for the best rates / lowest fees but an agent can.

Why would those apps permit access by agents?

It's always been the case that “agents” could watch content with ads, so that the users can watch the same content later, but without ads. The technology never went mainstream, though. I expect agents posing as humans would have a similar whiff of illegality, preventing wide adoption.

Local agents running open weights models won't really work because everybody will train their services against the most popular ones anyway.


What whiff of illegality? Personal recording and ad skipping DVRs are completely legal products (at least in the US). Courts have ruled on this.

As a U.S. consumer, can you buy a DVR that can record HDCP streams (without importing it yourself from a different country)? Even one that does not automatically edit out ads?

If I search "HDCP remover" on Amazon I see tons of results for $15-$30, sure. Reviews say they work as advertised. That typically exists in a different space from DVRs, since AFAIK there's nothing HDCP-related for DVRs to remove from broadcast TV in the first place, but it'd be easy enough to chain it if you needed to.

Right, but why the heck would you guess 100 years when we could build and adopt that in less than two weeks? There are already many people working on this type thing. Some of them have been working on it for years and a few probably already have solutions ready to go or even in use.

I was using 100 years as a way to handwave the timeframe to emphasize that this will happen some time in the future.

Do you use YouTube intending to be drawn into watching things you never intended to watch? I don't want a feed, but the people operating these sites do not care that they are destroying people's time. Go to Twitter, click on "Following". Next time you sign in, somehow it's on "For you" (the algorithmic feed).

Thankfully on Youtube I can completely disable recommendations on the site and I use it purely as a source of information, not as a dopamine addiction funnel.


You aren't thinking big enough, this is how he trains a model that detects prompt injection attempts and he spins into a billion dollar startup.


Good on him, then. Much luck and hopes of prosperity.


It happened before 1/26. I noticed when it started modifying plans significantly with "improvements".


> But in Tech, the playbook is different. Companies over-hire software engineers intentionally. To play the lottery.

The actual reason tech companies overhire is because people get promoted based on the number of people that are "under" them. All leaders are incentivized to fight for headcount.


Why not just write to the db? Just make every test independent and use UUIDs / random IDs.


> Just make every test independent

That's easier said than done. Simple example: an API that returns a count of all users in the database. The obvious correct implementation would be just `select count(*) from users`. But if some other test touches the users table beforehand, it won't work. There is no UUID to latch onto here.
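A minimal sketch of the problem, using Python's stdlib sqlite3 as a stand-in for a real shared database (the `tenant` column and helper names are hypothetical, just for illustration): a global `count(*)` sees every test's writes, while tagging rows with a per-test random ID keeps the assertion independent.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, tenant TEXT)")

def create_user(tenant):
    conn.execute("INSERT INTO users VALUES (?, ?)", (str(uuid.uuid4()), tenant))

def count_users(tenant=None):
    if tenant is None:
        # The "obvious" implementation: a global count, fragile under shared state.
        return conn.execute("SELECT count(*) FROM users").fetchone()[0]
    # Scoped variant: each test counts only rows tagged with its own random ID.
    return conn.execute(
        "SELECT count(*) FROM users WHERE tenant = ?", (tenant,)
    ).fetchone()[0]

# Two "tests" sharing one database:
t1, t2 = str(uuid.uuid4()), str(uuid.uuid4())
create_user(t1)
create_user(t2)

assert count_users(t1) == 1  # scoped count: sees only its own rows
assert count_users() == 2    # global count: sees both tests' writes
```

The catch the comment above points at: if the endpoint under test is defined as "count of all users", there's no natural column to scope on, so the scoped workaround changes what you're testing.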


That’s why you run each test in a transaction with the proper isolation level, and don’t commit the transaction; roll it back when the test ends. No test ever interferes with another that way.


Yes, and now this test also has to check that your Redis-based cache is populated correctly. And/or sends stuff down your RabbitMQ/Kafka pipeline.


That looks like an integration test. A possible way to handle that scenario is to drop all the databases after it ends and create them again, truncate all the tables, or whatever makes sense for that particular set of data stores.
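A tiny sketch of the truncate-everything teardown, using sqlite3 as the stand-in store (the helper name is invented; a real setup would also reset Redis, Kafka topics, etc.):

```python
import sqlite3

def reset_tables(conn):
    # Integration-test teardown: wipe every table so the next run starts clean.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )]
    for t in tables:
        conn.execute(f"DELETE FROM {t}")  # SQLite's equivalent of TRUNCATE
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

reset_tables(conn)
assert conn.execute("SELECT count(*) FROM users").fetchone()[0] == 0
```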

That could run on developer machines but maybe it runs only on a CI server and developers run only unit tests.


So in Elixir you can do this async, alongside your unit tests.


Frankly this is the better solution for async tests. If the app can handle multiple users interacting with it simultaneously, then it can handle multiple tests. If it can’t, then the dev has bigger problems.

As for assertions, it’s not that hard to think of a better way to check whether you made an insertion into the db than writing “assert user_count() == 0”.


I don’t disagree with you, but there are diminishing returns on making your test suite complex. To make async tests work properly, you need to know what you’re doing with regard to message passing, OTP, mocks, shared memory, blah blah blah. It can get really complicated, and it still isn’t a substitute for real user traffic. You’re going to have to rely on hiring experienced Elixir developers (small talent pool), allow for long onboarding time (expensive), or provide extensive training (difficult). Personally, for most cases, writing a sync test suite and just optimizing to keep it from getting too slow is probably more practical in the long term.


I think this explains why I'm not getting the most out of codex; I like to interrupt and respond to things I see in the reasoning tokens.


That's the main gripe I have with codex: I want better observability into what the AI is doing so I can stop it if I see it going down the wrong path. In CC I can see it easily and stop and steer the model. In codex, the model spends 20 minutes only for it to do something I didn't agree on. It burns OpenAI tokens too; they could save money by supporting this feature!


You're in luck -- /experimental -> enable steering.


I first need to see real time AI thoughts before I can steer it tho! Codex hides most of them


OpenCode launched a couple of months ago, so it makes sense that it's worse. It's much better than Claude Code now. Somehow, for the same model, OpenCode completes the same work faster than Claude Code, and the UX is much better.


You win by adoption.

Here adoption is a combination of the tool and the model.

If people can’t pay for the model to use the tool, they might not use the tool even if it’s better.

That’s what anthropic is doing.

It might be faster, but it’s more expensive.


There is no loyalty. Those who have the best models win.

The only remaining move is to try to lock consumers into your ecosystem.


Meant to say "it was worse" not "it's worse"


It's pretty simple: don't give llms access to anything that you can't afford to expose. You treat the llm as if it was the user.


> You treat the llm as if it was the user.

That's not sufficient. If a user copies customer data into a public google sheet, I can reprimand and otherwise restrict the user. An LLM cannot be held accountable, and cannot learn from mistakes.


I get that but just not entirely obvious how you do that for the Notion AI.


Don't use AI/LLMs that have unfettered access to everything?

Feels like the question is "How do I prevent unauthenticated and anonymous users from using my endpoint that doesn't have any authentication and is on the public internet?", which is the wrong question.


exactly?

