Latent Space: The AI Engineer Podcast

Reality: The Final Eval — Lukas Petersson and Axel Backlund of Andon Labs

Jun 4, 2026 · 1h 15m
AI Summary
  • Industry benchmarks don't capture real-world model performance.
  • Andon Labs developed SWE-Bench Pro for comprehensive AI evaluation.
  • Lukas Petersson and Axel Backlund discussed AI evaluation methods.
