AI Summary
- Will Brown discusses multi-turn reinforcement learning (RL) for agents that run for multiple hours, and inference-time reasoning advances in Claude Opus 4 and Gemini's Deep Think
- Focuses on verifiers, turn-level credit assignment in reasoning models, and agentic RL as the next frontier beyond current state-of-the-art approaches
- Addresses the emerging paradigm of scaling reasoning through more inference-time compute rather than only through parameter and data scaling
Guests on This Episode
Will Brown