04:00
2026-06-26
arxiv.org
large-language-models
Staying VIGILant: Mitigating Visual Laziness via Counterfactual Visual Alignment in MLLMs
Researchers propose VIGIL, a reinforcement-learning post-training framework that reduces hallucinations in multimodal large language models by maximizing mutual information between visual input and ge…