
I think the advantage of Flash-MoE over plain mmap is mostly the coalesced representation, where each expert-layer is stored as a single extent of sequential data. That layout could be introduced into existing binary formats like GGUF or HF: there is already a provision for differently structured representations, and it would fit easily.
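To make the coalescing idea concrete, here is a rough toy sketch in Python. It is not Flash-MoE's actual on-disk format and not the GGUF or HF layout; the tensor names (gate/up/down), the sizes, and the helpers `fake_tensor`, `write_coalesced` and `read_expert` are all made up for illustration. The only point is that every tensor of one (layer, expert) pair sits back to back, so fetching an expert is one sequential read rather than several scattered ones.

```python
import os
import tempfile

# Hypothetical toy dimensions; real models are far larger.
N_LAYERS, N_EXPERTS = 2, 4
TENSORS = ("gate", "up", "down")  # illustrative per-expert tensor names
TENSOR_BYTES = 64                 # pretend every tensor is 64 bytes


def fake_tensor(layer, expert, name):
    """Deterministic placeholder bytes standing in for real weights."""
    tag = f"{layer}:{expert}:{name};".encode()
    return (tag * (TENSOR_BYTES // len(tag) + 1))[:TENSOR_BYTES]


def write_coalesced(path):
    """Write all tensors of each (layer, expert) pair back to back.

    Returns an index mapping (layer, expert) -> (offset, length),
    i.e. one contiguous extent per expert-layer.
    """
    index = {}
    with open(path, "wb") as f:
        for layer in range(N_LAYERS):
            for expert in range(N_EXPERTS):
                start = f.tell()
                for name in TENSORS:
                    f.write(fake_tensor(layer, expert, name))
                index[(layer, expert)] = (start, f.tell() - start)
    return index


def read_expert(fd, index, layer, expert):
    """Fetch a whole expert-layer with a single positional read."""
    offset, length = index[(layer, expert)]
    blob = os.pread(fd, length, offset)  # Unix-only; one sequential extent
    return {
        name: blob[i * TENSOR_BYTES:(i + 1) * TENSOR_BYTES]
        for i, name in enumerate(TENSORS)
    }


path = os.path.join(tempfile.mkdtemp(), "experts.bin")
index = write_coalesced(path)

fd = os.open(path, os.O_RDONLY)
expert = read_expert(fd, index, layer=1, expert=2)
os.close(fd)
```

In a layout where tensors are grouped by kind instead (all gate tensors, then all up tensors, and so on), the same expert would take one read per tensor at widely separated offsets, which is the access pattern the coalesced layout avoids.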

