Hacker News
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving (github.com/kvcache-ai)
8 points by sarkory 11 months ago | past
Show HN: KTransformers: 671B DeepSeek-R1 on a Single Machine, 286 tokens/s Prefill (github.com/kvcache-ai)
14 points by sssummer on Feb 10, 2025 | past
Show HN: KTransformers: 236B Model and 1M Context LLM Inference on Local Machines (github.com/kvcache-ai)
20 points by sssummer on Aug 29, 2024 | past | 3 comments
Mooncake: A KVCache-Centric Disaggregated Architecture for LLM Serving (github.com/kvcache-ai)
13 points by zinccat on June 29, 2024 | past
