AI Summary
- ACPBench Hard evaluates LLM reasoning for automated planning.
- It features open-ended, generative planning tasks that mirror the challenges symbolic planners face.
- Current LLMs show limitations on these complex reasoning tasks.