Answer Self-Consistency with Margin-Triggered Question Re-Arbitration for the CVPR 2026 VidLLMs Challenge

Researchers proposed Answer Self-Consistency with Margin-Triggered Question Re-Arbitration (ASC-MQRA), a training-free test-time reasoning framework for video relational reasoning, as their entry to Track 2 of the CVPR 2026 VidLLMs Challenge. The ASC component runs stochastic video question answering several times and aggregates the answer choices across runs; it reached 81.16% average accuracy on the test set. The MQRA module, which re-arbitrates examples whose first-stage vote margin is low, improved on ASC in validation but slightly degraded test performance. The team therefore submitted ASC without re-arbitration as its final solution and reports that answer-level self-consistency substantially improves over single-pass inference on this task.
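The answer-level aggregation described above can be illustrated with a minimal sketch. It assumes each stochastic run returns a single choice label such as "A" or "B"; the function name `aggregate_answers` and the alphabetical tie-break are hypothetical choices for illustration, not details taken from the report.

```python
from collections import Counter
from typing import Sequence


def aggregate_answers(run_answers: Sequence[str]) -> tuple[str, Counter]:
    """Majority vote over the answer labels produced by repeated runs.

    Returns the winning label and the full vote distribution.
    Tie-breaking by alphabetical order is an assumption made here
    so that the result is deterministic.
    """
    if not run_answers:
        raise ValueError("need at least one run to aggregate")
    votes = Counter(run_answers)
    # Highest count first; among equal counts, the smaller label wins.
    winner = min(votes, key=lambda label: (-votes[label], label))
    return winner, votes


if __name__ == "__main__":
    # Five hypothetical runs on the same video question.
    winner, votes = aggregate_answers(["B", "A", "B", "C", "B"])
    print(winner, dict(votes))
```

The vote distribution is returned alongside the winner because a later stage could inspect how decisive the vote was.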

1 min read · published Jun 4, 2026

arXiv:2606.04323v1. Abstract: In this report, we present our solution for Track 2 of the CVPR 2026 VidLLMs Challenge. This track evaluates visual relational reasoning in videos, where models must infer relations that are not always explicitly visible. We propose Answer Self-Consistency with Margin-Triggered Question Re-Arbitration (ASC-MQRA), a training-free test-time reasoning framework built on a multimodal reasoning model. The core ASC component performs multiple stochastic video question-answering runs and aggregates their answer choices through answer-level self-consistency. This substantially improves over single-pass inference and forms our final test submission. We further study MQRA, a conditional re-arbitration module for low-margin examples where the first-stage vote distribution indicates uncertainty. Our vote-margin analysis shows that low-margin examples often retain the ground-truth answer among the top candidates, motivating MQRA to narrow the candidate set and re-watch the video only over the retained candidates. On validation, MQRA further improves over ASC, indicating that low-margin vote distributions can provide a useful uncertainty signal. On test, however, MQRA slightly degrades performance relative to ASC, suggesting that re-arbitration is sensitive to the size and category distribution of the triggered subset. Our final test submission therefore uses ASC without re-arbitration, achieving 72.73 average accuracy and 78.34 category-wise macro average accuracy on validation, and 81.16 average accuracy and 80.91 category-wise macro average accuracy on test. This report details our prompting strategy, implementation setup, ablation studies, and diagnostic analyses. The code is available at https://github.com/data-analytics-labo/ASC-MQRA
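To make the margin trigger concrete, here is a rough sketch of how a low-margin example might be detected and its candidate set narrowed. The abstract does not define the margin, the threshold, or how many candidates are retained, so the definition used here (gap between the top two vote shares), the `threshold` of 0.2, and `keep=2` are all assumptions, and the function names are hypothetical.

```python
from collections import Counter
from typing import Optional


def vote_margin(votes: Counter) -> float:
    """Gap between the top two vote shares, in [0, 1].

    Assumed definition: (top count - runner-up count) / total votes.
    """
    total = sum(votes.values())
    if total == 0:
        raise ValueError("vote distribution is empty")
    ranked = sorted(votes.values(), reverse=True)
    runner_up = ranked[1] if len(ranked) > 1 else 0
    return (ranked[0] - runner_up) / total


def retained_candidates(
    votes: Counter, threshold: float = 0.2, keep: int = 2
) -> Optional[list[str]]:
    """Return None when the first-stage vote is decisive enough to accept.

    Otherwise return the `keep` most-voted labels, which a second pass
    over the video would then choose between. Both `threshold` and
    `keep` are illustrative values, not settings from the report.
    """
    if vote_margin(votes) >= threshold:
        return None
    ranked = sorted(votes, key=lambda label: (-votes[label], label))
    return ranked[:keep]


if __name__ == "__main__":
    print(retained_candidates(Counter("AAAAB")))  # decisive vote
    print(retained_candidates(Counter("AABBC")))  # tied top candidates
```

Returning `None` for decisive votes keeps the first-stage answer untouched, so only the triggered subset would incur the cost of a second pass.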

source & further reading

Read original on arxiv.org → arxiv.org/abs/2606.04323
