arXiv:2606.04098v1. Abstract: Video misinformation increasingly operates at the semantic and evidential level: authentic footage may be selectively edited, temporally reordered, spliced across sources, or augmented with AI-generated content to construct false narratives. Such evidence-dependent manipulations cannot be reliably verified from the input video alone, because the missing, reordered, replaced, or recontextualized evidence lies outside the video itself. We introduce EVID-Bench, a benchmark for search-grounded video misinformation detection, where a system must search the open web for related videos and identify what information is false through cross-video comparison. EVID-Bench comprises 222 videos spanning 9 manipulation types across 3 categories: AI generation, single-source editing, and multi-source editing. All samples are verified to be undetectable by frontier models through visual inspection alone. We evaluate nine frontier multimodal models using a retrieval-augmented verification baseline. The best system achieves only 61.43% point-level accuracy and 43.24% video-level accuracy, while AI-generated manipulations remain especially challenging. Error analysis reveals recurring challenges: models fixate on irrelevant anchors, misattribute synthetic content to editorial splicing, and terminate search prematurely before fully explaining the manipulation.
When Seeing Is Not Believing -- A Benchmark for Search-Grounded Video Misinformation Detection
Researchers introduced EVID-Bench, a benchmark that requires systems to search the open web for related videos and identify false information through cross-video comparison. The motivation is that video misinformation increasingly relies on evidence-level manipulations that cannot be detected from the input video alone. The benchmark includes 222 videos spanning nine manipulation types in three categories (AI generation, single-source editing, and multi-source editing). Of the nine frontier multimodal models evaluated, the best-performing system achieves only 61.43% point-level accuracy and 43.24% video-level accuracy, and AI-generated manipulations are especially hard. Error analysis shows that models fixate on irrelevant anchors, misattribute synthetic content to editorial splicing, and terminate search prematurely before fully explaining the manipulation.
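The gap between the two reported accuracy figures is easier to see with a small sketch. The abstract does not define either metric, so the following is only one plausible reading, stated as an assumption: point-level accuracy scores each annotated false-information point independently, while video-level accuracy counts a video as correct only when every one of its points is identified. The `VideoResult` type and both function names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VideoResult:
    """Hypothetical per-video record: one flag per annotated manipulation point."""
    video_id: str
    point_correct: List[bool]


def point_level_accuracy(results: List[VideoResult]) -> float:
    # Assumed definition: fraction of all annotated points identified correctly.
    total = sum(len(r.point_correct) for r in results)
    correct = sum(sum(r.point_correct) for r in results)
    return correct / total if total else 0.0


def video_level_accuracy(results: List[VideoResult]) -> float:
    # Assumed definition: a video counts only if all of its points are correct.
    if not results:
        return 0.0
    solved = sum(1 for r in results if r.point_correct and all(r.point_correct))
    return solved / len(results)


# Made-up toy data, not taken from the benchmark.
toy_results = [
    VideoResult("a", [True, True]),
    VideoResult("b", [True, False]),
    VideoResult("c", [True, True, False]),
]
point_acc = point_level_accuracy(toy_results)
video_acc = video_level_accuracy(toy_results)
```

On this toy data, 5 of 7 points are correct (about 0.71) but only 1 of 3 videos is fully explained (about 0.33). Under the assumed definitions, a stricter all-points criterion would produce a video-level score below the point-level score, which matches the direction of the reported 61.43% versus 43.24%.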