Episodes (Page 10)
✨
Quentin Anthony of EleutherAI demystifies the tacit knowledge around training LLMs efficiently, enabling non-insiders to estimate scaling laws and cost-performance tradeoffs
✨
Tianqi Chen of CMU/OctoML discusses MLC (Machine Learning Compilation) enabling LLMs to run on consumer hardware without GPUs
✨
NLW and hosts discuss the Code Interpreter release as a potential GPT-4.5, highlighting unexpected capabilities beyond coding and the challenges of AI model evaluation
✨
Tri Dao explains FlashAttention: an I/O-aware optimization that reduces attention memory from quadratic O(N²) to linear O(N) while still computing exact attention, with no approximation
✨
Llama 2 released for commercial use, with pretraining on 2 trillion tokens, 2x the context length, and a ~$20M RLHF investment, immediately becoming the leading open LLM
✨
GPT-3 trained on ~600GB of data (Wikipedia, Books, WebText, CommonCrawl), not the entire internet as commonly claimed
✨
Code Interpreter launched with ability to execute Python code, upload files, and handle edge cases with dependencies like Tesseract and TensorFlow
Alex Graveley
Alex Volkov
Aravind Srinivas
Simon Willison
✨
Daniel "Data Dan" Whitenack from Practical AI discusses the podcast's 5-year journey covering the post-Transformers AI wave and what he learned from past episodes
✨
Ronen Eldan and Yuanzhi Li of Microsoft Research discuss tiny model revolution, showing how small models can match larger ones through clever training
✨
George Hotz discusses tinybox, a $15,000 'luxury AI computer' for local model training/inference with 738 FP16 TFLOPS and 144GB GPU RAM
✨
OpenAI released the Functions API, enabling structured JSON outputs and agent-like behavior, alongside a 75% embedding price drop and 4x context length
✨
Jeffrey Wang and Joe Reeve propose RLHB (Reinforcement Learning from Human Behavior) as alternative to RLHF, using implicit behavioral signals instead of explicit feedback
✨
Linus Lee from Notion AI discusses building AI × UX scenius and designing AI interfaces, emphasizing most knowledge work is not text generation
✨
Itamar Friedman from Codium AI discusses test generation IDE extension for Python/JS and vision of Code Integrity Agent for debugging software
✨
MosaicML's MPT-7B achieved SOTA in open-source with 84,000 token context length (vs GPT-3's 4,000), trained on 1 trillion tokens matching LLaMA-7B quality
✨
Shreya Rajpal of Guardrails AI addresses LLM output unpredictability and inability to follow requirements without extensive prompt engineering
✨
Sharif Shameem breaks podcast hiatus for exclusive interview on building Lexica (5B searches/day in 8 months, launched 24 hours after Stable Diffusion)
✨
Google's internal memo by Luke Sernau reveals panic over losing AI leadership to open source alternatives after Code Red declaration
Simon Willison
✨
Reza Shabani from Replit discusses open sourcing replit-code-v1-3b, which, when finetuned, beats OpenAI's Codex despite being 77% smaller
✨
Mike Conover from Databricks discusses the race for fully open source GPT-3/4-equivalent models that address the limitations of LLaMA's research-only license