Hacker News | Loic's comments

This is the standard engineering approach in a factory: you always have multiple layers of safety systems.

The analogy is that each layer is a slice of Emmental cheese: you only end up with a bad event if the holes in all the slices align.


I hope they will learn from the success of the Brazilian PIX system.


You know that if you ask the LLM correctly you get top-notch answers, because you have the experience to judge whether an answer is top-notch or not.

I spend a couple of hours per week teaching software architecture to a junior on my team, because he does not yet have the experience to ask correctly, let alone assess the quality of the LLM's answer.


3 kids, same honest conversations, 2 where it worked and works very well, 1 where it is a constant battle.

So sorry, but no: the platforms are addictive, and not all kids can resist an armada of statisticians keeping the systems addictive, armed with honest conversations alone.

By the way, if that worked, it would mean you could solve every addiction problem the same way...


My personal experience: electric toothbrushes and razors. I especially hate the razors: you can replace the head, so they could last a lifetime, but the battery is practically dead after two years. Toothbrushes are improving; the last one has three years of service and still works OK.


I'm using an Oral-B electric toothbrush from 2009. The (non-replaceable) battery needs to be charged about every 3-5 days now, which is not a problem because it sits in its charging stand every night.

My wife bought some cheap electric toothbrush that runs on AA batteries, which can be rechargeable and have a lifespan independent of the gadget.


I think the OpenAI deal to lock up wafers was a wonderful coup. OpenAI is increasingly losing ground against the regularity[0] of the improvements coming from Anthropic, Google and even the open-weights models. By creating a choke point at the hardware level, OpenAI can prevent the competition from increasing their reach for lack of hardware.

[0]: For me this is a really important part of working with Claude: the model improves over time but stays consistent. Its "personality", or whatever you want to call it, has been really stable across the past versions, which allows a very smooth transition from version N to N+1.


Is anyone else deeply perturbed by the realization that a single unprofitable corporation can basically buy out the entire world's supply of computing hardware so nobody else can have it?

How did we get here? What went so wrong?


> unprofitable

I'm assuming you wouldn't see it as fine if the corporation was profitable.

> How did we get here?

We've always been here. Not that that makes it right, but it's an issue that is neither simple to fix nor something most lawmakers necessarily want to fix in the first place.

Nothing in the rules stops you from cornering most markets, and an international company with enough money can probably corner specific markets if it sees a matching ROI.


> I'm assuming you wouldn't see it as fine if the corporation was profitable.

I feel like the implication of what they said was "think of how much worse it would be if they could truly spare no expense on these types of things". If an "unprofitable" company can do this, what could a profitable company of their size do on a whim?


They're simply making a bet that they can put the DRAM dies to more valuable use than any of the existing alternatives, including e.g. average folks playing the latest videogames on their gaming rig. At this kind of scale, they had better be right or they are toast: they have essentially gone all-in on their bet that this whole AI thing is not going to 'pop' anytime soon.


> They're simply making a bet that they can put the DRAM dies to more valuable use than any of the existing alternatives

They can't. They know they can't. We all know they can't. But they can just keep abusing the infinite money glitch to price everyone else out, so it doesn't matter.


When they find out that it is not, in fact, an infinite money glitch, they're going to have to eat that cost. It will work out great for everyone as long as they aren't bailed out.


It's more like a waste-infinite-money glitch, if that's what they're trying. There's no way that a simple speculative attack actually makes DRAM more valuable in the long term on its own, and that's the only win condition for that kind of play. People have tried to hoard all sorts of commodities as a mere speculative play on the market, and it never works.


Perhaps ChatGPT has given them instructions.


I don't see this working for Google though, since they make their own custom hardware in the form of the TPUs. Unless those designs include components that are also susceptible?


That was why OpenAI went after the wafers, not the finished products. By buying up the supply of the raw materials they bottleneck everybody, even unrelated fields. It's the kind of move that requires a true asshole to pull off, knowing it will give your company an advantage but screw up life for literally billions of people at the same time.


> By buying up the supply

We actually don't know for certain whether these agreements are binding. If OpenAI gets in a credit crunch we'll soon find out.


Went after the right component too. RAM manufacturers love an opportunity to create as much scarcity as possible.


TPUs use HBM, which is impacted.


Even their TPU-based systems need RAM.


Still susceptible: TPUs need DRAM dies just as much as anything else that processes data. I think they use some form of HBM, so they basically have to compete alongside the DDR supply chain.


Could this generate pressure to produce less memory hungry models?


There has always been pressure to do so, but there are fundamental bottlenecks in performance when it comes to model size.

What I can think of is that there may be a push toward training for exclusively search-based rewards so that the model isn't required to compress a large proportion of the internet into their weights. But this is likely to be much slower and come with initial performance costs that frontier model developers will not want to incur.


> exclusively search-based rewards so that the model isn't required to compress a large proportion of the internet into their weights.

That just gave me an idea! I wonder how useful (and for what) a model would be if it was trained using a two-phase approach:

1) Put the training data through an embedding model to create a giant vector index of the entire Internet.

2) Train a transformer LLM that, instead of relying only on its weights, can also do lookups against the index.

It's like an MoE where one (or more) of the experts is a fuzzy Google search.

The best thing is that adding up-to-date knowledge won’t require retraining the entire model!
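
A rough sketch of that two-phase idea in Python (the embed() stand-in, the toy corpus and the prompt assembly are all invented for illustration; a real version would use a proper embedding model and an approximate-nearest-neighbour index, and phase 2 would train the transformer with the retrieved snippets in its context):

    import zlib
    import numpy as np

    def embed(texts, dim=256):
        # Toy stand-in for a real embedding model: hashed character trigrams.
        out = np.zeros((len(texts), dim))
        for i, t in enumerate(texts):
            for j in range(max(len(t) - 2, 1)):
                out[i, zlib.crc32(t[j:j + 3].lower().encode()) % dim] += 1.0
        return out / np.maximum(np.linalg.norm(out, axis=1, keepdims=True), 1e-9)

    # Phase 1: "index the internet" (here: two made-up snippets).
    corpus = ["DRAM prices spiked after datacenter buyers locked up wafer supply.",
              "KNX is a decentralized bus standard for building automation."]
    index = embed(corpus)

    # Phase 2 (sketched): instead of relying only on its weights, the model
    # queries the index and gets the best-matching snippets placed in its context.
    def lookup(query, k=1):
        scores = index @ embed([query])[0]          # cosine similarity
        return [corpus[i] for i in np.argsort(-scores)[:k]]

    question = "why is RAM so expensive?"
    prompt = "Context: " + " ".join(lookup(question)) + "\nQuestion: " + question
    # Updating knowledge = adding rows to the index, not retraining the model.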


Yeah that was my unspoken assumption. The pressure here results in an entirely different approach or model architecture.

If OpenAI is spending $500B, then someone can come out ahead by spending $1B, provided that $1B improves the model by >0.2% ($1B being 0.2% of $500B).

I bet there's a group or three that could improve results a lot more than 0.2% with $1B.


> so that the model isn't required to compress a large proportion of the internet into their weights.

The knowledge compressed into an LLM is a byproduct of training, not a goal. Training on internet data teaches the model to talk at all. The knowledge and ability to speak are intertwined.


I wonder if this maintains the natural-language capabilities, which are what make LLMs magic to me. There is probably some middle ground, but not having to know which expressions or idioms an LLM will understand is really powerful from a user-experience point of view.


Or maybe models that are much more task-focused? Like models that are trained on just math & coding?


Isn't that what the mixture-of-experts trick all the big players use is? A bunch of smaller, tightly focused models.


Not exactly. MoE uses a router to select a subset of experts (feed-forward blocks) per token. This makes inference faster but still requires the same amount of RAM, since every expert's weights must stay loaded.
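
A toy illustration of that trade-off (plain numpy, with invented shapes): the router activates only the top-k experts per token, so compute per token drops, but every expert's weights still have to sit in memory.

    import numpy as np

    d_model, n_experts, top_k = 64, 8, 2
    rng = np.random.default_rng(0)
    router_w = rng.normal(size=(d_model, n_experts))
    # All experts must stay resident in RAM, even though only top_k run per token.
    expert_w = rng.normal(size=(n_experts, d_model, d_model))

    def moe_layer(token):                        # token: vector of size d_model
        logits = token @ router_w                # one router score per expert
        chosen = np.argsort(-logits)[:top_k]     # pick the top-k experts
        gates = np.exp(logits[chosen])
        gates /= gates.sum()                     # softmax over the chosen experts
        return sum(g * (token @ expert_w[e]) for g, e in zip(gates, chosen))

    out = moe_layer(rng.normal(size=d_model))
    # Compute touches top_k/n_experts of the weights; memory holds all of them.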


Of course, and then watch those companies get reined in.


Please explain it to me like I'm five: why does OpenAI need so much RAM?

2024 production was (according to OpenAI/ChatGPT) 120 billion gigabytes. With 8 billion humans, that's about 15 GB per person.


What they need is not so much memory as memory bandwidth.

For training, the model needs a certain amount of memory to store its parameters, and this memory is touched for every example of every iteration. Big models have 10^12 (>1T) parameters, and with typical values of 10^3 examples per batch and 10^6 iterations, they need ~10^21 memory accesses per run. And they want to do multiple runs.

DDR5 RAM bandwidth is about 100 GB/s = 10^11 B/s; graphics RAM (HBM) is about 1 TB/s = 10^12 B/s. By buying the wafers they get to choose which types of memory they get.

10^21 / 10^12 = 10^9 s ≈ 30 years of memory access just to update the model weights (you also need to add a factor of 10^1-10^3 to account for the memory accesses needed for the model computation).

But the good news is that it parallelizes extremely well. If you replicate your 1T parameters 10^3 times, the run time is brought down to 10^6 s ≈ 12 days. But then you need 10^3 * 10^12 = 10^15 bytes of RAM per run for the weight updates and up to 10^18 for the computation (your 120 billion gigabytes is 10^20, so not so far off).
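
A back-of-the-envelope version of those numbers (orders of magnitude only; the batch size, iteration count and node count are the illustrative values used above):

    params = 1e12          # ~1T parameters
    batch = 1e3            # examples per batch
    iterations = 1e6       # optimizer steps per run
    accesses = params * batch * iterations       # ~1e21 parameter touches

    hbm_bw = 1e12          # ~1 TB/s per HBM device
    serial_s = accesses / hbm_bw                 # ~1e9 s of memory access
    nodes = 1e3
    parallel_s = serial_s / nodes                # ~1e6 s when replicated 1000x
    weight_ram = nodes * params                  # ~1e15 bytes of weight copies (loosely 1 byte/param)

    print(f"serial: {serial_s / 3.15e7:.0f} years, "
          f"over {nodes:.0f} nodes: {parallel_s / 86400:.0f} days, "
          f"weight RAM: {weight_ram:.0e} bytes")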

Are all these memory accesses technically required? No, not if you use other algorithms, but more compute and memory is better if money is not a problem.

Is it strategically good to deprive your competitors of access to memory? In a very short-sighted way, yes.

It's a textbook cornering of the computing market to prevent the emergence of local models, because customers won't be able to buy even the minimum RAM necessary to run models locally, even for just the inference part (not the training). Basically a war on people, where little Timmy won't be able to get a RAM stick to play computer games at Xmas.


Thanks - but this seems like fairly extreme speculation.

> if money is not a problem.

Money is a problem, even for them.


Large language models are large and must be loaded into memory to train or to run inference if we want to keep them fast. Older models like GPT-3 have around 175 billion parameters; at float32 that comes out to something like 700 GB of memory. Newer models are even larger, and OpenAI wants to run them as consumer web services.
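
Spelled out, assuming float32 and the GPT-3-scale parameter count mentioned above:

    params = 175e9                  # GPT-3-scale parameter count
    bytes_per_param = 4             # float32
    print(f"~{params * bytes_per_param / 1e9:.0f} GB just to hold the weights")
    # ~700 GB per model instance, before activations or KV cache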


I mean, I know that much. The numbers still don't make sense to me. How is my internal model this wrong?

For one, if this was about inference, wouldn't the bottleneck be the GPU computation part?


Concurrency?

Suppose some parallelized, distributed task requires 700 GB of memory per node to accomplish (I don't know whether it does or not), and that speed is a concern.

A single 700 GB pile of memory is insufficient not because it lacks capacity, but because it lacks scalability: that pile is only enough for one node.

If more nodes were added to increase speed but they all used that same single 700 GB pile, then RAM bandwidth (and latency) would get in the way.


This "memory shortage" is not about AI companies needing main memory (which you plug into mainboards), but manufacturers are shifting their production capacities to other types of memory that will go onto GPUs. That brings supply for other memory products down, increasing their market price.


The conspiracy theory (which, to be clear, may be correct) is that they don't actually need so much RAM, but they know they and all their competitors do still need quite a bit of RAM. By buying up all the memory supply they can, for a while, keep everyone else from being able to add compute capacity/grow their business/compete.


> By creating a chock point at the hardware level, OpenAI can prevent the competition from increasing their reach because of the lack of hardware

I already hate OpenAI, you don't have to convince me


This became very clear with the outrage, rather than excitement, at users being forced to upgrade from 4o to ChatGPT-5.


I'm not too keyed into the economics of this supposed AI bubble, but is this not an unfathomably risky move on OpenAI's part? If this thing actually pops, or a competitor like Google pulls ahead and comes out victorious, then OpenAI will be left holding a very expensive bag of unusable raw materials that they'll have to sell off at a discount?


Sure, but if the price is being inflated by inflated demand, then the suppliers will just build more factories until they hit a new, higher optimal production level, and prices will come back down, and eventually process improvements will lead to price-per-GB resuming its overall downtrend.


Memory fabs take billions of dollars and years to build, and the memory business is a tough one where losses are common, so no such relief is in sight.

With a bit of luck OpenAI collapses under its own weight sooner rather than later; otherwise we're screwed for several years.


Micron has said they're not scaling up production. Presumably they're afraid of being left holding the bag when the bubble does pop.


Not just Micron, SK Hynix has made similar statements (unfortunately I can only find sources in Korean).

DRAM manufacturers got burned multiple times in the past scaling up production during a price bubble, and it appears they've learned their lesson (to the detriment of the rest of us).


Why are they building a foundry in Idaho?

https://www.micron.com/us-expansion/id


Future demand, a.k.a. DDR6.

The 2027 timeline for the fab is when DDR6 is due to hit the market.


I mean, it says right on the page:

>help ensure U.S. leadership in memory development and manufacturing, underpinning a national supply chain and R&D ecosystem.

It's more political than supply-based.


Hedging is understandable. But what I don't understand is why they didn't hedge by keeping Crucial around but more dormant (higher prices, fewer SKUs, etc.).


The theory I've heard is built on the fact that China (CXMT) is starting to properly get into DRAM manufacturing - Micron might expect that to swamp the low end of the market, leaving Crucial unprofitable regardless, so they might as well throw in the towel now and make as much money as possible from AI/datacenter (which has bigger margins) while they can.

But yeah even if that's true I don't know why they wouldn't hedge their bets a bit.


So position Crucial as a premium brand, raise prices 4x instead of 3x, and drastically cut down on the SKUs to reduce overhead. If they tried that and kept spiraling into fewer and fewer SKUs and sales, I could understand it. But the discontinuation felt pretty abrupt.


Chip factories need years of lead time, and manufacturers might be hesitant to take on new debt in a massive bubble that might pop before they ever see any returns.


Long-lived, high-activity nuclear waste represents less than 3,500 m3 (one Olympic swimming pool), and that is since the start of civil nuclear electricity production in the '50s. Worldwide.


Global waste is 400,000+ tons (https://www.stimson.org/2020/spent-nuclear-fuel-storage-and-...). Even one pool full is ~28,000 tons (UO2 packs at roughly 8 tons/m3). Uranium is dense.


Twenty swimming pools of total waste isn't that impressive. I don't want to live near it, but I'm sure we can find a place to put it that will have minimal impact on people's lives.


Exactly. The waste isn't really a problem. But it doesn't have to be waste; that's the point. All that uranium sitting in 'spent' fuel storage? You can get 60x-100x its original power by feeding it to next-gen reactors. So cool.


Properly contained nuclear waste is almost as concerning to me as my wifi router is.


I wrote about the high-activity, long-lived waste, the part that really causes issues.


I guess you mean the "super hot for centuries" minor actinides (Np-237, Am-241/243, Cm-242/244/245, etc.)? These are less than 1% of the global waste, but next-gen reactors can still eat them. The majority of the waste (95%+) is U-238, then Pu, which next-gen reactors also eat.


Durable is really the French household name "par excellence".


You mean Duralex. And maybe not just in France. My British school canteen was monopolised by Duralex too.


La disparition[0], Georges Perec.

[0]: https://en.wikipedia.org/wiki/A_Void


Currently renovating our house; everything will be KNX-based. Offline, no servers needed (even within the house, though one is nice for visualization), a standard with 500+ vendors of compatible hardware. Highly recommended.


Also currently renovating our house, will not put any smart home stuff in it at all.

