Latent Space: The AI Engineer Podcast

Everything you need to run Mission Critical Inference (ft. DeepSeek v3 + SGLang)

Jan 19, 2025 · 1h 0m
AI Summary
  • DeepSeek v3 launched as the best open-weights model (LM Arena score 1319), shifting the bottleneck to serving large models and requiring specialized inference infrastructure such as Baseten's H200 clusters
  • Chinese labs are releasing 400B+ parameter models (Hunyuan-Large, MiniMax-Text) alongside DeepSeek v3, creating a supply of high-quality open weights but also significant infrastructure challenges
  • Mission-critical inference requires deep optimization, collaboration with model builders, and hardware investment to serve state-of-the-art models in production
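To make the serving challenge concrete: a minimal sketch of launching DeepSeek v3 behind SGLang's OpenAI-compatible server. This is an illustrative configuration, not the setup discussed on the episode; the model path is DeepSeek's Hugging Face repo, and the tensor-parallel degree of 8 assumes a single 8-GPU node (e.g. H200s, as mentioned in the summary).

```shell
# Launch SGLang's OpenAI-compatible inference server for DeepSeek v3.
# Assumes: sglang installed (pip install "sglang[all]"), weights available
# from Hugging Face, and 8 GPUs on one node (--tp 8 = tensor parallelism).
python -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-V3 \
  --tp 8 \
  --trust-remote-code \
  --port 30000
```

Once the server is up, clients can hit the OpenAI-compatible endpoint at `http://localhost:30000/v1`; production deployments layer further optimizations (quantization, prefix caching, multi-node parallelism) on top of this baseline.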
