04:00
2026-06-19
arxiv.org
natural-language-processing
LaViSA: A Language and Vision Structural Ambiguity Benchmark
Researchers introduced LaViSA, a benchmark to evaluate vision-language models' ability to resolve structural ambiguity using visual scenes. Tests on proprietary and open-source models showed they can …