Episodes
✨
Kanjun Qiu of Imbue (recently valued at $1B+) explains why current AI agents fail: they lack consistent reasoning, reliable tool use, and the ability to handle complex multi-step tasks
✨
Swyx discusses the AI Horcrux concept and cognitive science approaches to AI engineering, bridging software engineering and foundational AI research perspectives
✨
Swyx outlines the Software 3.0 landscape: a shift from traditional code to AI-native development, requiring new tools, architectures, and engineering practices
✨
Jerry Liu of LlamaIndex discusses RAG (Retrieval-Augmented Generation) as a pragmatic hack around the context-window limitations of early GPT-3 when working with large datasets
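A minimal, self-contained sketch of that retrieve-then-stuff pattern. The bag-of-words scorer is a toy stand-in for a real embedding model, and every name here (`embed`, `retrieve`, the sample chunks) is illustrative, not LlamaIndex's API:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words counts (stand-in for a real model)
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(x * x for x in a.values())) * math.sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by similarity to the query; keep only the top k
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "GPT-3 shipped with a 2,048-token context window.",
    "Vector stores index document chunks by embedding.",
    "RAG prepends retrieved chunks to the user prompt.",
]
# Only the retrieved chunks enter the prompt, so the corpus itself
# never has to fit inside the model's context window.
context = "\n".join(retrieve("How does RAG work around context limits?", chunks))
print(f"Answer using only this context:\n{context}\n\nQuestion: ...")
```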
✨
Raza Habib of Humanloop addresses the AI engineer's rite of passage: transitioning from demo to production requires prompt versioning, evaluation, monitoring, and finetuning infrastructure
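A hedged sketch of that loop in miniature: every prompt change becomes a new version and nothing ships until it is scored against a fixed eval set. `call_model`, the prompt registry, and the eval set are hypothetical stand-ins, not Humanloop's API:

```python
PROMPTS = {
    "summarize-v1": "Summarize in one sentence: {text}",
    "summarize-v2": "You are a concise editor. Summarize in one sentence: {text}",
}

EVAL_SET = [
    {"text": "The cat sat on the mat while the dog slept.", "must_include": "cat"},
]

def call_model(prompt: str) -> str:
    # Hypothetical LLM call; placeholder so the sketch runs end to end
    return prompt

def evaluate(version: str) -> float:
    hits = 0
    for case in EVAL_SET:
        output = call_model(PROMPTS[version].format(text=case["text"]))
        hits += case["must_include"] in output.lower()
    return hits / len(EVAL_SET)

for version in PROMPTS:   # monitoring: log a score per prompt version
    print(version, evaluate(version))
```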
✨
Youssef Rizk of Wondercraft.ai builds an AI-first podcasting startup, producing the HN Recap podcast
✨
Chris Lattner of Modular announced a $100M Series A for the Mojo language and modular AI infrastructure, addressing fragmentation and poor software quality in AI development
✨
Harrison Chase of LangChain discusses the rapid evolution from a 2022 startup to a $20-25M Series A, expanding from prompt templating to a comprehensive LLM orchestration framework
✨
Eugene Cheah discusses RWKV (Receptance Weighted Key Value) models as a significant Transformer alternative, reviving RNNs for GPT-class LLMs with linear rather than quadratic scaling in sequence length
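A simplified numpy sketch of the WKV recurrence at RWKV's core, following the paper's formulation but omitting the numerical-stability (running-max) trick and the receptance/token-shift machinery; parameter values are arbitrary:

```python
# WKV: an exponentially decaying weighted average of past values,
# computed in O(T) time with O(1) state per channel, versus
# attention's O(T^2) score matrix.
import numpy as np

def wkv(k, v, w, u):
    """k, v: (T,) keys/values for one channel; w: decay rate > 0;
    u: extra weight ('bonus') given to the current token."""
    a, b = 0.0, 0.0                # running weighted sum and normalizer
    out = np.empty_like(v)
    for t in range(len(k)):
        e = np.exp(u + k[t])       # current token's bonus weight
        out[t] = (a + e * v[t]) / (b + e)
        a = np.exp(-w) * a + np.exp(k[t]) * v[t]   # decay past, add current
        b = np.exp(-w) * b + np.exp(k[t])
    return out

rng = np.random.default_rng(0)
print(wkv(rng.standard_normal(8) * 0.1, rng.standard_normal(8), w=0.5, u=0.3))
```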
✨
Aman Sanger of Anysphere created Cursor, an AI-first code editor designed to push AI-assisted coding beyond Copilot's reported 46% of code written in VS Code toward 90%+
✨
Quentin Anthony of EleutherAI demystifies the tacit knowledge around training LLMs efficiently, enabling non-insiders to estimate scaling laws and cost-performance tradeoffs
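A taste of that estimation, as a back-of-envelope sketch using the standard C ≈ 6ND rule of thumb (roughly 6 FLOPs per parameter per training token); the model size, token count, and hardware figures below are illustrative assumptions, not numbers from the episode:

```python
# C ≈ 6·N·D: training compute from parameter count and token count
params = 7e9          # hypothetical 7B-parameter model
tokens = 2e12         # hypothetical 2T training tokens
flops = 6 * params * tokens

gpu_peak = 312e12     # A100 peak dense BF16 FLOP/s
mfu = 0.4             # assumed model FLOPs utilization
gpu_hours = flops / (gpu_peak * mfu) / 3600
print(f"{flops:.2e} FLOPs ≈ {gpu_hours:,.0f} A100-hours at {mfu:.0%} MFU")
```

For these assumed inputs the estimate comes out around 187,000 A100-hours, which is the right order of magnitude for published 7B training runs.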
✨
Tianqi Chen of CMU/OctoML discusses MLC (Machine Learning Compilation), enabling LLMs to run on consumer hardware without datacenter GPUs
✨
NLW and hosts discuss the Code Interpreter release as a potential GPT-4.5, highlighting unexpected capabilities beyond coding and challenges in AI model evaluation
✨
Tri Dao explains FlashAttention: an I/O-aware optimization reducing attention memory from quadratic O(N²) to linear O(N) while still computing exact attention, with no approximation
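An illustrative numpy sketch of the online-softmax idea underneath FlashAttention: keys/values are processed in blocks with running max/normalizer statistics, so the N×N score matrix is never materialized yet the output is exact. This is a simplification; the real kernel also tiles queries and fuses the loop into GPU SRAM:

```python
import numpy as np

def attention_tiled(Q, K, V, block=64):
    n, d = Q.shape
    out = np.zeros_like(Q)
    m = np.full(n, -np.inf)           # running row-wise max of scores
    l = np.zeros(n)                   # running softmax normalizer
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start+block], V[start:start+block]
        S = Q @ Kb.T / np.sqrt(d)     # scores for this K/V block only
        m_new = np.maximum(m, S.max(axis=1))
        scale = np.exp(m - m_new)     # rescale earlier partial sums
        P = np.exp(S - m_new[:, None])
        l = l * scale + P.sum(axis=1)
        out = out * scale[:, None] + P @ Vb
        m = m_new
    return out / l[:, None]           # exact softmax(QK^T/sqrt(d)) @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((256, 32)) for _ in range(3))
S = Q @ K.T / np.sqrt(32)             # naive reference: full N x N matrix
P = np.exp(S - S.max(axis=1, keepdims=True))
assert np.allclose(attention_tiled(Q, K, V), (P / P.sum(axis=1, keepdims=True)) @ V)
```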
✨
Llama 2 released for commercial use with 2 trillion tokens of pretraining, double the context length, and a ~$20M RLHF investment, immediately becoming the leading open LLM
✨
GPT-3 was trained on ~600GB of data (Wikipedia, Books, WebText, Common Crawl), not the entire internet as commonly claimed
✨
Code Interpreter launched with the ability to execute Python code, upload files, and handle edge cases with dependencies like Tesseract and TensorFlow; panel discussion with Alex Graveley, Alex Volkov, Aravind Srinivas, and Simon Willison
✨
Dan Whitenack ("Data Dan") of Practical AI discusses the podcast's 5-year journey covering the post-Transformers AI wave and lessons learned from past episodes
✨
Ronen Eldan and Yuanzhi Li of Microsoft Research discuss the tiny model revolution, showing how small models can match larger ones through carefully curated training data
✨
George Hotz discusses tinybox, a $15,000 'luxury AI computer' for local model training/inference with 738 FP16 TFLOPS and 144GB GPU RAM
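The headline specs fall out of simple arithmetic if one assumes the announced configuration of six AMD Radeon RX 7900 XTX cards (an assumption for illustration, not a quote from the episode):

```python
# Assumed config: 6x AMD Radeon RX 7900 XTX
per_gpu_tflops = 123     # 7900 XTX peak FP16, ~123 TFLOPS
per_gpu_vram_gb = 24     # VRAM per card
print(6 * per_gpu_tflops, "FP16 TFLOPS,", 6 * per_gpu_vram_gb, "GB GPU RAM")  # 738, 144
```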