
Very cool. I jumped in here thinking it was going to be something else, though: a packaged service for distributing on-prem model inference across multiple GPUs.

I'm basically imagining a vast.ai-type deployment of an on-prem GPT: assuming most of the infra is consumer GPUs on consumer devices, you'd run the "company cluster" on the combined compute of the company's machines.
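For the curious, here's roughly the shape of that idea: a minimal sketch using Ray (https://ray.io) to pool the GPUs on employees' workstations into one inference cluster. Everything specific here is assumed for illustration (the OnPremLLM actor, the model path, the worker count); it's a sketch of the pooled-workstation idea, not any real product.

    import ray

    # Each workstation joins the pool with `ray start --address=<head-node>`,
    # contributing its GPU; this script connects to that existing cluster.
    ray.init(address="auto")

    @ray.remote(num_gpus=1)
    class OnPremLLM:
        """One inference worker, scheduled onto whichever machine has a free GPU."""
        def __init__(self, model_path: str):
            # Load an open-weights model on this actor's consumer GPU.
            from llama_cpp import Llama  # assumes llama-cpp-python is installed
            self.llm = Llama(model_path=model_path)

        def generate(self, prompt: str) -> str:
            out = self.llm(prompt, max_tokens=256)
            return out["choices"][0]["text"]

    # One replica per GPU-equipped machine in the office (count is illustrative).
    workers = [OnPremLLM.remote("./models/model.gguf") for _ in range(4)]

    # Naive round-robin fan-out across the pooled workstations.
    futures = [w.generate.remote("Summarize our Q3 roadmap.") for w in workers]
    print(ray.get(futures))

A real deployment would need scheduling, failure handling, and something smarter than round-robin, but the pooled-compute part itself is pretty tractable.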



Great point. I can see how you'd land there. Also a great idea! xD

Maybe a better descriptor is "self-sovereign AI"? Or "self-hosted AI"?


Sounds like something that could be implemented with llm-d, though I've not experimented with it.

https://llm-d.ai/blog/intelligent-inference-scheduling-with-...
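From that post, the core of the "intelligent inference scheduling" idea is routing each request to the replica most likely to already hold the prompt's prefix in KV cache, falling back to the least-loaded one. A toy sketch of that routing idea follows; this is not llm-d's actual API (llm-d runs as a Kubernetes endpoint picker in front of vLLM replicas), and the scoring here is made up to show the concept:

    class Replica:
        def __init__(self, name: str):
            self.name = name
            self.in_flight = 0   # requests currently being served
            self.cache_hits = {} # prompt prefix -> times served here

    def pick_replica(replicas, prefix):
        # Prefer the replica most likely to hold this prefix's KV cache,
        # then break ties by lowest in-flight load.
        return min(replicas, key=lambda r: (-r.cache_hits.get(prefix, 0), r.in_flight))

    replicas = [Replica(f"vllm-{i}") for i in range(3)]
    for _ in range(5):
        r = pick_replica(replicas, "system-prompt-v1")
        r.in_flight += 1
        r.cache_hits["system-prompt-v1"] = r.cache_hits.get("system-prompt-v1", 0) + 1
        print("routed to", r.name)
    # Prints vllm-0 every time: this toy scoring lets cache affinity dominate,
    # pinning a shared prefix to one replica. Real schedulers weight cache
    # locality against load so hot replicas eventually shed traffic.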


Yeah, I don't see why we could not integrate that. I think that is the next step as we move our workloads to production.


`lf deploy` here we come!


We're building something closer to this at Muna: https://docs.muna.ai . Check us out and let me know what you think!



Let me know when you open-source it; I think there is a place for this, and I think we could integrate it pretty easily as a plug-in into the LlamaFarm framework :)



