PodcastIntel
Latent Space: The AI Engineer Podcast

State of the Art: Training >70B LLMs on 10,000 H100 clusters

Jun 25, 2024 · 1h 21m
AI Summary
  • Imbue's 70B model reportedly outperforms GPT-4o zero-shot on a range of reasoning and coding benchmarks while using 7x less training data than Llama 3 70B
  • Imbue is releasing 11 cleaned-up NLP benchmarks, a new code-reasoning benchmark, a dataset of 450k human judgments about ambiguity, and infrastructure scripts for bare-metal cluster training
  • Focus on cost-aware hyperparameter optimization and practical tools for training large models efficiently on 10,000 H100 clusters
