Episodes (Page 6)
✨
Bolt.new (by StackBlitz) surpasses $8M ARR in just 2 months as a Claude wrapper, demonstrating the explosive growth of vibe coding platforms alongside flow engineering for code agents
✨
Erik Schluntz from Anthropic discusses new Claude 3.5 Sonnet capabilities, computer use features, and building state-of-the-art agents with latest model
Erik Schluntz
✨
Discussion of compound AI systems and open source advantages versus closed AI models in competitive landscape
✨
Lindy.ai founder Flo Crivello discusses agent platform evolution from late 2022 to present, challenging conventional wisdom in agent design with Rails and structured workflows
✨
Stanislas from Dust discusses evolution of agent frameworks since LangChain vs Dust debate; Dust prioritizes practical agent workflows at enterprises
✨
LMSys Chatbot Arena leads Anastasios Angelopoulos and Wei-Lin Chiang discuss crowdsourced AI evaluation platform attracting 1M+ votes, becoming de facto LLM comparison standard
✨
NotebookLM's Audio Overviews feature converts documents, websites, and videos into conversational podcasts using voice models, RAG, and Gemini, as discussed with product lead Raiza Martin and engineer Usama Bin Shafqat
✨
Singapore's GovTech is hosting an AI CTF challenge with ~$15,000 in prizes starting October 26th on Dreadnode's Crucible platform, open to local and virtual hackers
Josephine Teo
✨
Drew Houston has spent 400+ hours coding with LLMs and is refocusing Dropbox's 2,500+ employees around AI-native development 17 years after founding the company
✨
Ankur Goyal of Braintrust argues that production AI engineering should start with evals, not operational tooling, following the pattern of successful LLMOps founders with AI/research backgrounds
✨
OpenAI DevDay 2024 focused on developer-facing API announcements including Realtime API, Vision Finetuning, Prompt Caching, and Model Distillation rather than ChatGPT product announcements
✨
OpenAI's o1 release and recent hiring of Noam Brown and Shunyu Yao signal a focus on tool-using chain-of-thought and tree-of-thought architectures for Level 3 Agents
✨
Sander Schulhoff's 'The Prompt Report' synthesizes 1,600+ arXiv papers on prompting techniques including few-shot learning, chain-of-thought, tree search, and self-criticism strategies
✨
Michelle Pokrass and OpenAI's DevRel team cover the entire OpenAI product suite including ChatGPT-latest, GPT-4o, o1 models, and how they're delivered via API with Structured Outputs
Michelle Pokrass
✨
AI inference costs decreased 10-100x in 2024, with open models like Llama 3.1 405B costing $3/mtok versus $30/mtok for Claude 3 Opus, and frontier model prices dropping 400x from 2022 to 2024
✨
Nicholas Carlini's 'How I Use AI' blog post demonstrates a practical approach focused on individual AI applications rather than broad AGI potential, covering 12 use cases with specific prompts
Nicholas Carlini
✨
Cosine Genie achieved #1 ranking on SWE-Bench Full, Lite, and Verified using GPT-4o fine-tuning at scale on billions of tokens of synthetic data, beating all other agents including Cognition's Devin
Alistair Pullen
✨
Jeremy Howard's Answer.AI ships 1000s of successful AI products with no managers and a team of 12, focusing on practical AI R&D aligned with GPU-poor needs
✨
Meta's Segment Anything 2 (SAM 2) improves image segmentation accuracy while being 6x faster than SAM 1, and elegantly solves video segmentation with 3x fewer interactions than prior approaches
✨
Q2 2024 AI progress analyzed through Four Wars framework: GPU-rich frontier labs (Claude 3.5, Mistral Large), GPU-rich helping GPU-poors (Llama 3.1 synthetic data, Phi 3, Gemma 2), and on-device LL...