04:00
2026-06-03
arxiv.org
artificial-intelligence
Consistent Yet Wrong: Evidence Insensitivity in Spatial Vision-Language Models
Leading vision-language models (VLMs) produce view-invariant and consistent answers to spatial distance queries even when those answers are incorrect, revealing a weak link between predictions and actโฆ