vLLM large scale serving: DeepSeek 2.2k tok/s/H200 with wide-EP
Link: https://blog.vllm.ai/2025/12/17/large-scale-serving.html
Discussion: https://news.ycombinator.com/item?id=46602737