04:00
2026-06-16
arxiv.org
computer-vision
GridVQA-X: A Framework for Evaluating Multimodal Explainability Methods
Researchers introduced GridVQA-X, a diagnostic framework to evaluate cross-modal explainability in Vision-Language Models. The framework uses synthetic data with ground-truth explanations to test whet…