PodcastIntel
Latent Space: The AI Engineer Podcast

Why RL Won — Kyle Corbitt, OpenPipe (acq. CoreWeave)

Oct 16, 2025 · 1h 8m
AI Summary
  • OpenPipe pivoted from distilling GPT-4 into cheaper models to RL-based agent training once frontier model prices dropped; the episode ties this to the claim that 90% of AI projects fail because of reliability problems rather than capability gaps
  • RULER (Relative Universal LLM-Elicited Rewards) makes RL training more accessible: an LLM judge ranks agent behaviors relative to one another, which removes the need for hand-engineered reward functions
  • Kyle Corbitt went from leading YC's Startup School to building OpenPipe, since acquired by CoreWeave; the company's path reflects a shift from supervised fine-tuning to reinforcement learning as the critical path forward
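
The relative-ranking idea behind RULER in the summary above can be illustrated with a minimal sketch. This is not OpenPipe's implementation: `relative_rewards` and `toy_judge` are hypothetical names, and `toy_judge` is a hand-written stand-in for what would be an LLM judge call. The sketch only shows the general shape, assuming a judge that scores a group of trajectories together and rewards that depend on each trajectory's standing within its group.

```python
from typing import Callable, List, Sequence


def relative_rewards(
    trajectories: Sequence[str],
    judge: Callable[[Sequence[str]], List[float]],
) -> List[float]:
    """Score a group of trajectories together and rescale the scores to [0, 1]."""
    if not trajectories:
        return []
    scores = judge(trajectories)
    if len(scores) != len(trajectories):
        raise ValueError("judge must return one score per trajectory")
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All trajectories judged equal: the group carries no preference signal.
        return [0.0 for _ in scores]
    # Only the relative position within the group matters, not the raw scale.
    return [(s - lo) / (hi - lo) for s in scores]


def toy_judge(trajectories: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for an LLM judge (assumption, not a real API).

    Prefers trajectories that end in DONE and take fewer steps.
    """
    return [
        (10.0 if t.endswith("DONE") else 0.0) - t.count("->")
        for t in trajectories
    ]


group = [
    "search -> read -> DONE",
    "search -> search -> search -> read -> DONE",
    "search -> give up",
]
rewards = relative_rewards(group, toy_judge)
```

In this toy group the shortest successful trajectory receives the highest reward and the failed one the lowest, without anyone having to define an absolute reward scale for the task.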

Guests on This Episode

Kyle Corbitt
