Hacker News
vLLM (high-throughput LLM serving engine) (github.com/vllm-project)
2 points by roody_wurlitzer 14 hours ago
vLLM multi-turn conversations design (github.com/vllm-project)
1 point by CCs 34 days ago
VLLM-Omni: A framework for efficient model inference with Omni-modality models (github.com/vllm-project)
2 points by zyh888 85 days ago | 1 comment
Cost-efficient and pluggable Infrastructure components for GenAI inference (github.com/vllm-project)
1 point by rrampage on Feb 23, 2025
Cost-efficient and pluggable Infrastructure components for GenAI inference (github.com/vllm-project)
1 point by delduca on Feb 22, 2025
LLM compressor: compress models for efficient deployment (github.com/vllm-project)
1 point by hajduksplit on Aug 20, 2024 | 1 comment
VLLM Sacrifices Accuracy for Speed (github.com/vllm-project)
1 point by behnamoh on Jan 24, 2024
Easy, fast, and cheap LLM serving for everyone (github.com/vllm-project)
2 points by vincent_s on Dec 17, 2023
vllm (github.com/vllm-project)
1 point by tosh on Dec 15, 2023
Mixtral Expert Parallelism (github.com/vllm-project)
1 point by tosh on Dec 15, 2023
Official PR Reveals the Inference Code for Mixtral 8x7B (github.com/vllm-project)
2 points by georgehill on Dec 11, 2023
Vllm: High-throughput and memory-efficient inference and serving engine for LLMs (github.com/vllm-project)
3 points by tosh on Sept 10, 2023
Vllm (github.com/vllm-project)
3 points by kordlessagain on Aug 6, 2023
VLLM (github.com/vllm-project)
2 points by sherlockxu on June 25, 2023
