{"slug": "perception-verdict-and-evolution-hindsight-driven-self-refining-forensics-agent", "title": "Perception, Verdict, and Evolution: Hindsight-Driven Self-Refining Forensics Agent for AI-Generated Image Detection", "summary": "Researchers propose ForeAgent, an agentic forensics framework for AI-generated image detection that uses a Perception-Verdict architecture and a Hindsight-Driven Self-Refining strategy to iteratively improve. The system achieves state-of-the-art performance on the Chameleon benchmark with 82.18% accuracy and 93.3% mean accuracy on AIGCDetect-Benchmark, and external evaluation finds its reasoning more consistent and causally grounded than that of GPT-5 and GPT-5-mini.", "body_md": "arXiv:2606.26552v1\nAbstract: The rapid advancement of generative models presents a significant challenge to existing deepfake detection methods, particularly given the widespread dissemination of highly realistic AI-generated images. Although Multimodal Large Language Models (MLLMs) show strong potential for this task, existing approaches suffer from two key limitations: insufficient sensitivity to fine-grained forensic artifacts and reliance on static synthetic supervision from frontier models, leading to limited flexibility and high cost. To address these issues, we propose ForeAgent, an agentic forensics framework for AI-generated image detection with iterative self-evolution. First, ForeAgent adopts a Perception-Verdict architecture that aggregates multi-view cues spanning semantic, spatial, and frequency-domain features, and leverages an MLLM as a verdict module to fuse these signals into a logically grounded verdict. Second, to enable continual self-improvement, we introduce a Hindsight-Driven Self-Refining strategy following a Sampling-Reflection-Evolution paradigm. The agent performs inference rollouts on training instances. Guided by ground-truth labels as hindsight, it reflects on failure cases and low-quality reasoning trajectories to regenerate higher-quality reasoning traces. 
These synthesized samples are then strictly filtered through a dual-expert quality gating module. ForeAgent continuously evolves via fine-tuning on self-curated high-quality samples. Extensive experiments demonstrate that ForeAgent achieves state-of-the-art performance on the Chameleon benchmark, reaching 82.18% accuracy (+16.41% over AIDE), and achieves 93.3% mean accuracy on AIGCDetect-Benchmark across 16 generators. In addition, external evaluation shows that ForeAgent produces more consistent and causally grounded reasoning compared to GPT-5 and GPT-5-mini.", "url": "https://wpnews.pro/news/perception-verdict-and-evolution-hindsight-driven-self-refining-forensics-agent", "canonical_source": "https://arxiv.org/abs/2606.26552", "published_at": "2026-06-26 04:00:00+00:00", "updated_at": "2026-06-26 04:09:59.421976+00:00", "lang": "en", "topics": ["artificial-intelligence", "computer-vision", "ai-research", "generative-ai", "ai-safety"], "entities": ["ForeAgent", "Chameleon benchmark", "AIGCDetect-Benchmark", "GPT-5", "Multimodal Large Language Models", "AIDE"], "alternates": {"html": "https://wpnews.pro/news/perception-verdict-and-evolution-hindsight-driven-self-refining-forensics-agent", "markdown": "https://wpnews.pro/news/perception-verdict-and-evolution-hindsight-driven-self-refining-forensics-agent.md", "text": "https://wpnews.pro/news/perception-verdict-and-evolution-hindsight-driven-self-refining-forensics-agent.txt", "jsonld": "https://wpnews.pro/news/perception-verdict-and-evolution-hindsight-driven-self-refining-forensics-agent.jsonld"}}