04:00
2026-06-29
arxiv.org
large-language-models
NormAct: A Benchmark for Hidden Social Norm Compliance in Embodied Planning
Researchers introduced NormAct, a benchmark evaluating whether multimodal large language models comply with hidden social norms during embodied planning. Tests on GPT-5.4, Claude Opus 4.7, and Gemini …