✨ AI Summary
- Paul Christiano holds moderate AGI timelines (15% by 2030, 40% by 2040); he addresses whether inventing RLHF was regrettable and whether alignment research is necessarily dual-use
- The design of a post-AGI world remains unsettled, including whether it is ethical to keep superintelligent entities enslaved; Christiano favors pushing labs toward responsible scaling policies to prevent AI coups
- New proof systems could help solve alignment by mathematically explaining model behavior; preventing bioweapon development and AI takeover requires coordination on safety research