Latent Space: The AI Engineer Podcast

Everything you need to run Mission Critical Inference (ft. DeepSeek v3 + SGLang)

Jan 19, 2025 · 1h 0m
AI Summary
  • DeepSeek v3 launched as the best open-weights model (LM Arena score 1319), shifting the bottleneck to serving large models and requiring specialized inference infrastructure such as Baseten's H200 clusters
  • Chinese labs are releasing 400B+ parameter models (Hunyuan-Large, MiniMax-Text) alongside DeepSeek v3, creating a supply of high-quality open weights but also significant infrastructure challenges
  • Mission-critical inference requires deep optimization, collaboration with model builders, and hardware investment to serve state-of-the-art models in production
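To make the serving challenge concrete: a minimal sketch of launching DeepSeek v3 behind SGLang's OpenAI-compatible server. This is an illustrative configuration, not the setup discussed on the episode; the model path is DeepSeek's Hugging Face repo, and the tensor-parallel degree of 8 assumes a single 8-GPU node (e.g. H200s, as mentioned in the summary).

```shell
# Launch SGLang's OpenAI-compatible inference server for DeepSeek v3.
# Assumes: sglang installed (pip install "sglang[all]"), weights available
# from Hugging Face, and 8 GPUs on one node (--tp 8 = tensor parallelism).
python -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-V3 \
  --tp 8 \
  --trust-remote-code \
  --port 30000
```

Once the server is up, clients can hit the OpenAI-compatible endpoint at `http://localhost:30000/v1`; production deployments layer further optimizations (quantization, prefix caching, multi-node parallelism) on top of this baseline.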
