
So, can somebody in the know speculate about how DeepSeek (or OpenAI, or whoever really) is actually running their API?

If I wanted to run a production-grade service using the full DeepSeek model, with good tokens/sec and the ability to serve concurrent requests, what sort of hardware are we looking at?



Racks and racks of servers (likely NVIDIA HGX H100/H200 8-GPU servers) connected by links of at least 100 Gb/s (but more likely 400 Gb/s or 800 Gb/s). The servers alone start at about $350k. Then you need to supply power, cooling, networking, and a technical team to support the program.
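
To put rough numbers on that, here is a back-of-the-envelope sketch of how many 8-GPU servers it takes just to hold one copy of the weights. Everything in it is an assumption rather than anything DeepSeek has confirmed about their setup: 671B total parameters stored in FP8 (1 byte each), 80 GB per H100 and 141 GB per H200, a guessed 70% of GPU memory available for weights (the rest left for KV cache and activations), and the $350k figure above read as a per-server starting price.

```python
import math

# All values below are assumptions for illustration, not measured figures.
PARAMS_BILLIONS = 671        # total parameters of the full DeepSeek model
BYTES_PER_PARAM = 1          # assumes FP8 weights
GPU_MEMORY_GB = {"H100": 80, "H200": 141}
WEIGHT_FRACTION = 0.7        # guess: ~30% kept free for KV cache and activations
GPUS_PER_SERVER = 8          # HGX 8-GPU chassis
SERVER_PRICE_USD = 350_000   # "start at about $350k", read as per server


def servers_needed(gpu: str) -> int:
    """Minimum number of 8-GPU servers needed to hold one copy of the weights."""
    weights_gb = PARAMS_BILLIONS * BYTES_PER_PARAM  # 1e9 params * bytes / 1e9
    usable_gb_per_gpu = GPU_MEMORY_GB[gpu] * WEIGHT_FRACTION
    gpus = math.ceil(weights_gb / usable_gb_per_gpu)
    return math.ceil(gpus / GPUS_PER_SERVER)


for gpu in GPU_MEMORY_GB:
    n = servers_needed(gpu)
    cost = n * SERVER_PRICE_USD
    print(f"{gpu}: {n} server(s) per replica, roughly ${cost:,} before power and networking")
```

That is only the floor for a single replica. Serving many concurrent requests at good tokens/sec means running many replicas, or spreading one deployment across more nodes, which is where the racks and the fast interconnect come in.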



