{"slug": "not-truly-multilingual-script-consistency-as-a-missing-dimension-in-vlm", "title": "Not Truly Multilingual: Script Consistency as a Missing Dimension in VLM Evaluation", "summary": "Researchers introduced PuMVR, a benchmark of 1,000 image-text pairs across Punjabi's three scripts, and found that state-of-the-art vision-language models show a systematic Script Gap, with accuracy differences up to 16% and script consistency rates as low as 24.8%. The findings reveal that current multilingual VLMs are not truly multi-script, highlighting the need for script-agnostic evaluation to ensure equitable AI access.", "body_md": "arXiv:2606.17188v1\nAbstract: Current multilingual evaluations for Vision-Language Models (VLMs) assume a one-to-one mapping between language and orthography, overlooking billions of users of multi-script languages. We introduce PuMVR (Punjabi Multimodal Visual Reasoning), a benchmark of 1,000 strictly parallel image-text instances across Punjabi's three active scripts: Gurmukhi, Shahmukhi, and Roman. Evaluating 10 state-of-the-art VLMs, we expose a substantial and systematic Script Gap. Models frequently solve visual tasks in one script while failing identical tasks in another, with accuracy deltas reaching 16%. Crucially, visual input boosts absolute performance uniformly yet does not close the orthographic gap. Furthermore, cross-script in-context transfer is highly brittle, exposing script-locked knowledge representation. Supported by McNemar tests across all script pairs, our findings demonstrate that current \"multilingual\" VLMs are not truly multi-script. We propose the Script Consistency Rate (SCR), which falls as low as 24.8% on our benchmark, as a mandatory metric for script-agnostic evaluation to ensure equitable AI access. Data and code are available at: https://github.com/prabhjotschugh/Not-Truly-Multilingual-PuMVR.", "url": "https://wpnews.pro/news/not-truly-multilingual-script-consistency-as-a-missing-dimension-in-vlm", "canonical_source": "https://arxiv.org/abs/2606.17188", "published_at": "2026-06-17 04:00:00+00:00", "updated_at": "2026-06-17 04:24:49.855103+00:00", "lang": "en", "topics": ["large-language-models", "computer-vision", "natural-language-processing", "ai-ethics", "ai-research"], "entities": ["PuMVR", "Punjabi Multimodal Visual Reasoning", "Gurmukhi", "Shahmukhi", "Roman", "Script Consistency Rate", "McNemar", "arXiv"], "alternates": {"html": "https://wpnews.pro/news/not-truly-multilingual-script-consistency-as-a-missing-dimension-in-vlm", "markdown": "https://wpnews.pro/news/not-truly-multilingual-script-consistency-as-a-missing-dimension-in-vlm.md", "text": "https://wpnews.pro/news/not-truly-multilingual-script-consistency-as-a-missing-dimension-in-vlm.txt", "jsonld": "https://wpnews.pro/news/not-truly-multilingual-script-consistency-as-a-missing-dimension-in-vlm.jsonld"}}