AI Summary
- Cosine Genie achieved the #1 ranking on SWE-Bench Full, Lite, and Verified by fine-tuning GPT-4o at scale on billions of tokens of synthetic data, beating all other agents, including Cognition's Devin
- Fine-tuning GPT-4o proved worthwhile even though competitors offer long context windows and prompt caching, showing practical value beyond the assumption that 'in-context learning is all you need'
- The breakthrough combined OpenAI's new fine-tuning capabilities with massive synthetic data generation, establishing a new state of the art (SOTA) for coding agents
Guests on This Episode
Alistair Pullen