AI Summary
- RLVR (reinforcement learning with verifiable rewards) may not give LLMs reasoning abilities beyond those already present in their base models.
- Measured by pass@k, RLVR shows limited gains across math, code, and visual reasoning.
- Base models perform surprisingly well, which calls RLVR's impact into question.
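
For readers unfamiliar with the metric, pass@k is the probability that at least one of k sampled answers to a problem is correct. The sketch below uses the combinatorial estimator that is common in code-generation evaluation; whether the summarized work uses this exact estimator is an assumption, and the function name `pass_at_k` and its parameters are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: total samples drawn for the problem
    c: number of those samples that were correct
    k: sampling budget being evaluated (1 <= k <= n)

    Assumed formula (not confirmed by the summary): 1 - C(n - c, k) / C(n, k),
    i.e. one minus the chance that k samples drawn without replacement
    from the n are all incorrect.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples exist, so any k-subset contains a correct one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical counts for illustration: 1 correct answer out of 10 samples.
    for k in (1, 5, 10):
        print(k, pass_at_k(n=10, c=1, k=k))
```

With these hypothetical counts, a single correct answer in 10 samples gives a low pass@1 but a pass@10 of 1.0, which is why a model that rarely finds the right answer can still score well at large k.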