Scaling Test Time Compute to Multi-Agent Civilizations — Noam Brown, OpenAI — Latent Space: The AI Engineer Podcast

✨ AI Summary

Noam Brown's work on poker and Diplomacy demonstrates test-time compute scaling with multi-agent reasoning; the System 1/System 2 analogy oversimplifies, since reasoning models work differently from fast and slow thinking in humans
Reinforcement fine-tuning and long-term model adaptability outperform over-reliance on scaffolds and routers; fragile tool use remains a challenge for production agent systems
The multi-agent intelligence hypothesis suggests emergent civilization-like behavior; implicit world models, memory reuse, and PR review processes are needed for better AI developer collaboration