captainbland's comments

I think the issue is that LLMs are a cash problem as much as they are a technical problem. Consumer hardware architectures are still pretty unfriendly to running models that are actually competitive, so if you want to do inference on a model that's going to reliably give you decent results, you're basically in enterprise territory. Unless you want to do it really slowly.

The issue that I see is that Nvidia etc. are incentivised to perpetuate that so the open source community gets the table scraps of distills, fine-tunes etc.


You got me thinking that what's going to happen is some GPU maker is going to offer a subsidized GPU (or RAM stick, or ...whatever) if the GPU can do calculations while your computer is idle, not unlike Folding@home. This way, the company can use the distributed fleet of customer computers to do large computations, while the customer gets a reasonably priced GPU again.
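The client side of that wouldn't even need to be complicated. A rough sketch of the idle-time worker loop, in the Folding@home mould (the job server and its endpoints are entirely made up for illustration):

    # Hypothetical idle-time compute worker, Folding@home style.
    # The job server URL and endpoints are illustrative, not a real service.
    import time

    import psutil    # pip install psutil
    import requests  # pip install requests

    JOB_SERVER = "https://gpu-pool.example.com/api"  # placeholder
    IDLE_CPU_THRESHOLD = 10.0  # percent; treat the machine as idle below this

    def machine_is_idle() -> bool:
        # Crude proxy for "the user isn't doing anything": low CPU use over
        # a one-second sample. A real client would also watch GPU load,
        # input events, and power state.
        return psutil.cpu_percent(interval=1.0) < IDLE_CPU_THRESHOLD

    def run_on_gpu(job: dict) -> dict:
        # Stand-in for whatever compute kernel the vendor actually ships.
        return {"output": None}

    while True:
        if not machine_is_idle():
            time.sleep(60)
            continue
        job = requests.get(f"{JOB_SERVER}/next-job", timeout=30).json()
        result = run_on_gpu(job)
        requests.post(f"{JOB_SERVER}/results/{job['id']}", json=result, timeout=30)

The hard part, as the reply below points out, is enforcement: nothing stops the customer from simply never running it.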

The kinds of GPUs in use in enterprise cost $30-40k and require a ~10 kW system. The challenge with lower-power cards is that thirty $1k cards are not as powerful, especially since you usually have several of the enterprise cards in a single unit, joined efficiently via a high-bandwidth link. But even if someone else is paying the utility bill, what happens when the person you gave the card to just doesn't run the software? Good luck getting your GPU back.

Consumer hardware is there. Grab a Mac or an AMD 395+ (Strix Halo) box, Qwen Coder, and Cline or OpenCode, and you're getting 80% of the real efficiency.
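If you want to see how low the barrier is, a minimal local setup with llama-cpp-python looks something like this (the model file and settings are illustrative; the coding agents then just point at a locally served endpoint):

    # Minimal sketch: local inference on a quantized GGUF model with
    # llama-cpp-python. Model path and settings are illustrative.
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(
        model_path="qwen2.5-coder-32b-instruct-q4_k_m.gguf",  # any GGUF quant
        n_gpu_layers=-1,  # offload every layer to the GPU / unified memory
        n_ctx=8192,       # context window; raise it if you have the RAM
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Write a binary search in Python."}],
        max_tokens=512,
    )
    print(out["choices"][0]["message"]["content"])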

New Strix Halo (395+) user here. It is very liberating to be able to "just" load the larger open-weight MoEs. At this param-count class, bigger is almost always better -- my own vibe check confirms this, but obviously this is not going to be anywhere close to the leading cost-optimized closed-weight models (Flash / Sonnet).

The tradeoff with these unified LPDDR machines is compute and memory throughput. You'll have to live with the ~50 token/sec rate, and compact your prefix aggressively. That said, I'd take the effortless local model capability over outright speed any day.

I hope the popularity of these machines prompts future models to ship in sizes that fit perfectly: an ~80 GiB quant for a 128 GiB box, a ~480 GiB quant for a 512 GiB box, etc.
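For anyone wondering where those numbers come from, the footprint is just parameters times bits per weight. A quick back-of-the-envelope (parameter counts are illustrative, and KV cache plus runtime overhead still need headroom on top):

    # Rough memory footprint of a model at a given quantization.
    # Ignores KV cache and runtime overhead, which also need room.
    def model_gib(params_billions: float, bits_per_weight: float) -> float:
        total_bytes = params_billions * 1e9 * bits_per_weight / 8
        return total_bytes / 2**30

    # A 235B-parameter MoE at ~4.5 bits/weight is ~123 GiB: technically
    # loadable on a 128 GiB box, but with almost no room left for context.
    # Hence the wish for quants sized with real headroom in mind.
    for params, bits in [(120, 4.5), (235, 4.5), (480, 8.0)]:
        print(f"{params}B @ {bits} bpw ~= {model_gib(params, bits):.0f} GiB")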


Not for long, presumably. Apparently the majority of marketable skills will come from a handful of capex heavy, trillion dollar corporations and you will like it.

We've all seen shovelware, now introducing excavatorware. A single shovelware studio is now empowered to deliver on the order of kilogames per month.

Honestly it's really weird that it was ever allowed to get to this stage. Their leadership has been pretty "mask off" for a good while now.

That's what you get when decisions are made by people who don't understand anything about the stuff they vote on.

...but understand and appreciate bribes very well

You only need to buy one or two to get it on the agenda, then everybody votes along party lines, on stuff they don't understand. It's not even that expensive.

Just ask Wes Streeting and Peter Mandelson.

I wouldn't mind as much, but the bribes are absolutely tiny.

Farage's "Reform" colleague Robert Jenrick took a £12k bribe to make his donor £30m.

If you're going to be bribed, at least make it worthwhile.

https://news.sky.com/story/the-1bn-development-and-the-tory-...


I feel like this is a feature which improves the perceived confidence of the LLM but doesn't do much for correctness of other outputs, i.e. an exacerbation of the "confidently incorrect" criticism.


It's a mismatch with our intuition about how much effort things take.

If there are humans involved, "I took this data and made a really fancy interactive chart" means that you put a lot more work into it, and you can probably somewhat assume that this means some more effort was also put into the accuracy of the data.

But with the LLM it's not really very much more work to get the fancy chart. So the thing that was a signifier of effort is now misleading us into trusting data that got no extra effort.

(Humans have been exploiting this tendency to trust fancy graphics forever, of course.)


It's not limited to graphics: better-packaged products, a better-dressed, good-looking, well-spoken person, and so on. Celebrity endorsements depend on this thesis.

There has always been a bias towards form over function.


Once good form becomes commoditized, hopefully function starts taking priority


A recent LinkedIn post that I came across as an example of people trusting (or learning to trust) AI too much while not realizing that it can make up numbers too: https://www.linkedin.com/posts/mariamartin1728_claude-wrote-...

P.S. Credit to the poster, she posted a correction note when someone caught the issue: https://www.linkedin.com/posts/mariamartin1728_correction-on...


> A recent LinkedIn post that I came across as an example of people trusting (or learning to trust) AI too much while not realizing that it can make up numbers too

Honestly, people make them up just as much or generate equally incorrect graphs.

It's about time our trust in random visualizations was destroyed, at least when the actual formulas and data behind them aren't exposed.


A similar thing happened when Google started really pushing generating flowcharts as a use-case with Nano Banana. A slick presentation can distract people from the only thing that really matters - the accuracy of the underlying data.


As a slightly different tack, I’ve been using Copilot to generate flowcharts from some of the fiendishly complex (and badly written) standard operating procedures we have at work.

People find them quite easy to check - easier than the raw document. My angle with teams is: use these to check your processes. If the flow is wrong, it's either because the LLM has screwed up or because the policy is wrong/badly written. It's usually the latter. It's a good way to fix SOPs.
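The pattern is easy to reproduce outside Copilot, too. A minimal sketch with the OpenAI Python client (the model name and prompt wording are placeholders; any capable model and provider will do):

    # Sketch: turn a standard operating procedure into a Mermaid flowchart.
    # Assumes the official OpenAI Python client; model name is a placeholder.
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def sop_to_mermaid(sop_text: str, model: str = "gpt-4o") -> str:
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system",
                 "content": "Convert the following standard operating procedure "
                            "into a Mermaid flowchart. Output only Mermaid "
                            "markup, starting with 'flowchart TD'."},
                {"role": "user", "content": sop_text},
            ],
        )
        return response.choices[0].message.content

Text-based markup is also what makes the review loop work: people can eyeball the rendered flow, then diff and edit the markup like any other text.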


It’s interesting you mentioned that. One of the things I’ve started doing recently is throwing a large LLM such as codex-5.3 (highest level of reasoning) at some of the more complex systems we have to produce nicely formatted ASCII diagrams.

I still review each diagram afterward, but the great thing is that, unlike image-based diagrams, they remain fully text-readable and searchable. And you can even expose them as part of the knowledge base for the LLM to reference when needed going forward.


Copilot outputs Mermaid diagram markup for me - so the graphs are editable and importable into diagramming packages

Yeah, I'm a big fan of Mermaid as well. Extended ASCII works a bit better for me, depending on the text editor, because I don't need a custom visualization tool.

It's a usability / quality of life feature to me. Nothing to do with increasing perceived confidence. I guess it depends on how much you already (dis)trust LLMs.

I'm finding more and more often the limiting factor isn't the LLM, it's my intuition. This goes a way towards helping with that.


This is the tension I keep hitting when building data tools on top of LLMs. A nice-looking chart makes the output feel more trustworthy, but the data can still be wrong. The chart just makes it harder to notice. LLMs still need to come with receipts of where the data came from and the math they did. It's as bad as "I read the headline so I know everything in the article."
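One way to force the receipts is to make provenance a required part of the chart payload rather than an afterthought. A sketch of the shape (the field names are my own invention):

    # Sketch: a chart payload where every series must say where its
    # numbers came from. Field names are invented for illustration.
    from dataclasses import dataclass, field

    @dataclass
    class Series:
        name: str
        values: list[float]
        source_url: str       # where the numbers came from
        derivation: str = ""  # e.g. "sum of column 'revenue' in source CSV"

    @dataclass
    class ChartPayload:
        title: str
        series: list[Series] = field(default_factory=list)

        def validate(self) -> None:
            # Refuse to render anything that can't cite a source.
            for s in self.series:
                if not s.source_url:
                    raise ValueError(f"series {s.name!r} has no source")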



This was my first thought as well: all this does is further remove the user from the raw chat output and instead make the information appear concretely reliable.

I mean, is it really that shocking that you can have an LLM generate structured data and shove that into a visualizer? The concern is whether it's reliable, which we know it isn't.
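To be concrete about how mechanical it is, the whole pipeline fits in a few lines (the JSON shape here is just whatever you prompt the model to emit):

    # Sketch: take structured data an LLM emitted and shove it into a
    # visualizer. Nothing here checks whether the numbers are *true*,
    # which is exactly the concern.
    import json

    import matplotlib.pyplot as plt

    llm_output = """
    {"title": "Quarterly revenue ($M)",
     "labels": ["Q1", "Q2", "Q3", "Q4"],
     "values": [12.1, 13.4, 15.0, 14.2]}
    """  # stand-in for whatever the model actually returned

    data = json.loads(llm_output)
    assert len(data["labels"]) == len(data["values"])  # schema check, not a truth check

    plt.bar(data["labels"], data["values"])
    plt.title(data["title"])
    plt.show()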


The further they can get people from the reality of `This just spits out whatever it thinks the next token will be` the more they can push the agenda.


It's a reasonable concern. Often it can be mitigated by prompting in a manner that invokes research and verification instead of defaulting to a corpus.

Passive questions generate passive responses.


I think this does something for the correctness of LLMs by making it easier to check their output.


I suspect chain of thought while building the chart will improve the overall correctness of the answer


I agree. Maybe next they'll add emotionally evocative music, with swelling orchestral bits when you reach the exciting climax of the slop.


They'd already taken VC money hadn't they? It's got to be said though that tech startups are getting very formulaic. Monster of the week vibes.


On top of the stringent border checks and Minneapolis, Brits are now seeing things like this and thinking twice: https://www.theguardian.com/us-news/2026/feb/21/karen-newton...


Today: this

Tomorrow: trillions invested in new technology for simulating human torture accurately at the molecular level, requiring twice the level of all consumer electricity use on the planet. Advocates claim "all use is valid".


Is this a reference to "Torment Nexus"?


While I'm sure that subconsciously influenced what I wrote, it was more a general jab at the sentiment that negative externalities can always be justified so long as a technology has users who prefer to use it.


Ah, I thought you were just referring to the decades-long use of the most massive supercomputers to simulate nuclear arsenal maintenance and explosions (maybe literally at the molecular/atomic/sub-atomic level).


Yeah. Did you see the article about the brain organoid (actual brain neurons on a chip) they made play DOOM? What are those neurons experiencing?


> What are those neurons experiencing?

A reasonable explanation is that a few neurons probably don't have consciousness, so they can't really experience anything.


It's an interesting question as to what that level is likely to be though. The chip in question apparently has around 800,000 neurons (https://www.forbes.com/sites/johnkoetsier/2025/06/04/hardwar...) so not a trivial quantity which makes it significantly more complex than most insects' forebrains but still less complex than any mammal.

I think once they're able to put 15 million such neurons on a single device, that puts them in the range of more relatable animals like mice and Syrian hamsters, and I expect that relatability is what will drive most opinions about consciousness.


>a few neurons probably don't have consciousness

Given our piss poor understanding of consciousness, I have to ask: on what grounds do you make this claim?


> What are those neurons experiencing?

Doom. (Obviously.)


I hadn't until you mentioned it but now I have! I expect one day they'll generate a language model on one and then we can just ask it, assuming they don't give it a special rule about never describing its experiences.


The language model's output would be informed by its weights, not by its experiences as wetware. Substrate does not make a computation special: that's the whole point of the Chinese Room thought experiment.

What mechanism are you imagining that would allow a LLM built of neurons to describe what it's like to be made of neurons, when a LLM built of GPUs cannot describe what it's like to be organised sand? The LLM in the GPU cluster is evaluated by performing the same calculations that could be performed by intricate clockwork, or very very slowly by generations of monks using pencil and paper. Just as the monks have thoughts and feelings, it is conceivable (though perhaps impossible) that the brain tissue implementing a LLM has conscious experience; but if so, that experience would not be reflected in the LLM's output.


When I say language model, I mean whatever form would make it native to the wetware medium. This brings with it a few key distinctions. The one I think is most relevant is that human neurons, including in chips like the CL1, can dynamically re-organise topologically (i.e. neuroplasticity), which is something computed LLMs, with their fixed structure and weights, can't do.

We can't assume that a computer based neural network will have the same emergent behaviours as a biological one or vice versa.

The interesting point for me is the neuroplasticity, because it implies that the networks specialised for language could start forming synapses connecting them to the parts more specialised to play DOOM, giving rise to the possibility that this could be used for introspection.


It is meaningful to consider this case. My general objection does not apply here.


I prefer to think of it as a reference in the Torment Nexus.


It would also work as a jab at Roko's basilisk


Actually “last year: this”. It was published in early 2025.


Arguments against call it immoral, while counter-arguments call it "legitimate".

Meanwhile, a three-time billionaire claims he's solved the problem using Soylent Green while fifty thousand people react in awe at the live presentation.


After seeing natural gas prices spike like that I'd probably pull out of such an energy intensive investment too


For Nvidia's part, they're just giving money to one of their largest customers. They make money back even if they "lose" the bet.


It's like government XX giving "help" or "grants" to countries at war so they can purchase weapons from XX.


Selling shovels is quite lucrative whether there's an actual mining business or just a gold rush.

At some point Jensen Huang will be out (retired or forced out by stagnating sales) and can definitely look back on a very successful career. That much is certain.

