PodcastIntel
Latent Space: The AI Engineer Podcast

RLHF 201 - with Nathan Lambert of AI2 and Interconnects

Jan 11, 2024 · 1h 25m
AI Summary
  • Nathan Lambert provides a deep dive into RLHF (Reinforcement Learning from Human Feedback), explaining how it turns transformer models from next-token predictors into helpful, honest assistants
  • Covers the "shoggoth mask factory" concept and training techniques such as DPO (Direct Preference Optimization), used in Tulu 2 and other open-source models
  • Educational survey episode on one of the most critical alignment and training techniques in modern LLM development
