Is finetuning GPT4o worth it? — with Alistair Pullen, Cosine (Genie) — Latent Space: The AI Engineer Podcast

✨ AI Summary

Cosine Genie ranked #1 on SWE-Bench Full, Lite, and Verified by fine-tuning GPT-4o at scale on billions of tokens of synthetic data, beating all other agents, including Cognition's Devin.
Fine-tuning GPT-4o proved worthwhile even though competitors offer long context windows and prompt caching, demonstrating practical value beyond the assumption that 'in context learning is all you need'.
The breakthrough combined OpenAI's new fine-tuning capabilities with massive synthetic data generation, establishing a new state of the art (SOTA) for coding agents.