$3000 for running a 397B total parameters model is quite a bargain. The Mac is b... | Hacker News


zozbot234 25 days ago | on: Flash-MoE: Running a 397B Parameter Model on a Lap...

$3000 for running a model with 397B total parameters is quite a bargain. The Mac is being used here for its fast internal storage, since that's the key bottleneck. You could probably achieve similar outcomes with conventional (even fairly low-end) iGPU/APU hardware plus a fast PCIe 5.0 x4 SSD (which would also allow you to overlap SSD transfers with iGPU/APU compute), but the cost would be in a similar range. (Unless you carefully chose low-end hardware, e.g. Intel, with proper PCIe 5.0 x4 NVMe support - which is still quite uncommon, especially for laptops.)
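To put rough numbers on the storage bottleneck, here is a back-of-the-envelope sketch. It assumes the weights needed for each token are streamed from the SSD every time with no caching, which is a simplification and not necessarily how Flash-MoE works. The function names, the 10B streamed-parameter figure and the 4-bit quantization are placeholders for illustration, not numbers from the project. The PCIe 5.0 figures (32 GT/s per lane, 128b/130b encoding) are the theoretical link rate; real SSDs deliver less.

```python
def pcie_bandwidth_gb_s(gt_per_s: float = 32.0, lanes: int = 4,
                        encoding: float = 128 / 130) -> float:
    """Theoretical PCIe link bandwidth in GB/s (defaults: PCIe 5.0 x4)."""
    # Each transfer carries one bit per lane; divide by 8 for bytes.
    return gt_per_s * encoding * lanes / 8


def tokens_per_second_upper_bound(bandwidth_gb_s: float,
                                  streamed_params_billions: float,
                                  bits_per_weight: float) -> float:
    """Upper bound on tokens/s if storage reads are the only bottleneck.

    Assumes (hypothetically) that `streamed_params_billions` weights are
    read from storage for every token, with no reuse between tokens.
    """
    gb_per_token = streamed_params_billions * bits_per_weight / 8
    return bandwidth_gb_s / gb_per_token


if __name__ == "__main__":
    bw = pcie_bandwidth_gb_s()
    # Placeholder workload: 10B streamed parameters at 4 bits per weight.
    bound = tokens_per_second_upper_bound(bw, 10.0, 4)
    print(f"link: {bw:.2f} GB/s, bound: {bound:.2f} tokens/s")
```

Caching hot experts in RAM or overlapping transfers with compute would change the picture, so treat the result as a ceiling under these assumptions rather than a prediction.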

ActorNightly 23 days ago

If you want to flex on being able to run 397B-parameter models at unusably slow tokens/second, sure.

You can buy a 3090 for $2k and run Qwen3.5 at 50+ tokens a second, and it will do everything you need, especially if you give it enough context.
