Latent Space: The AI Engineer Podcast

How to train a Million Context LLM — with Mark Huang of Gradient.ai

May 30, 2024 · 57m
AI Summary
  • Documents the evolution of context window lengths from 84k tokens (MPT-7B-StoryWriter) to current 1M+ token models, covering the competitive 'Context Extension Campaigns' among frontier labs
  • Discusses Mark Huang's techniques for training long-context LLMs at scale, including architectural innovations and training strategies to handle million-token inputs
  • Addresses practical challenges in long-context training, such as computational efficiency, positional interpolation methods (a minimal sketch follows below), and maintaining performance across extended sequences
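
The summary mentions interpolation methods only in passing; one widely used such method is positional interpolation for RoPE, which extends a model's usable context by rescaling position indices rather than extrapolating to unseen rotation angles. Below is a minimal NumPy sketch of that idea; the function names and the specific dimensions and lengths (rope_angles, dim=128, train_len=4096, target_len=32768) are illustrative assumptions, not details from the episode.

    import numpy as np

    def rope_angles(positions, dim, base=10000.0):
        # Standard RoPE: one rotation frequency per pair of embedding dimensions.
        freqs = base ** (-np.arange(0, dim, 2) / dim)   # shape (dim/2,)
        return np.outer(positions, freqs)               # shape (seq_len, dim/2)

    def interpolated_angles(positions, dim, train_len, target_len, base=10000.0):
        # Positional interpolation: rescale positions by train_len/target_len so a
        # longer sequence maps back into the angle range seen during training,
        # instead of extrapolating to rotation angles the model has never seen.
        scale = train_len / target_len
        return rope_angles(positions * scale, dim, base)

    # Illustrative example: a model pretrained at 4k context, run at 32k.
    pos = np.arange(32768, dtype=np.float64)
    angles = interpolated_angles(pos, dim=128, train_len=4096, target_len=32768)

In practice, context-extension recipes typically pair a positional rescaling like this with continued training on long sequences, which is the kind of training strategy the episode discusses.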
