AI Summary
- Ashvin Nair from Cursor shipped RL breakthroughs on GPT-4o/o1/o3; reasoning team scaled from 12 to 300+ people; IOI Gold felt reachable in 2022 but only materialized when o1 shipped
- Key insight: RL doesn't generalize beyond its training distribution, so product-model co-design is needed to bring economically useful tasks into distribution rather than overfitting to benchmarks
- Cursor's continual learning approach, with policy updates every two hours and a bi-directional human-in-the-loop setup, prevents ADHD-like context-switching and positions the company for the next paradigm shift
Guests on This Episode
Ashvin Nair
1 podcast appearance