{"slug": "vigilformer-deformable-attention-for-video-anomaly-detection-with-causal-risk", "title": "VigilFormer: Deformable Attention for Video Anomaly Detection with Causal Risk Inference", "summary": "Researchers introduced VigilFormer, a video anomaly detection framework that combines deformable spatio-temporal attention with causal temporal modeling. It reports AUC scores of 87.83% on UCF-Crime, 97.21% on ShanghaiTech, and 89.74% on CUHK Avenue at 41.5 FPS on a single GPU. A Deformable Spatio-Temporal Encoder attends to a sparse set of informative locations to avoid the quadratic cost of dense attention, and an Adaptive Confidence Scheduler skips low-information frames at inference time. The authors report that it outperforms recent weakly-supervised methods in both accuracy and speed.", "body_md": "arXiv:2606.14724v1\nAbstract: Video anomaly detection in surveillance settings must balance detection accuracy against real-time throughput, a tension that existing methods address either through stronger feature extractors or more efficient architectures, but rarely both. We present VigilFormer, a unified framework that combines deformable spatio-temporal attention with causal temporal modeling to detect anomalies in untrimmed surveillance video. The proposed Deformable Spatio-Temporal Encoder (DSTE) attends to a sparse set of informative locations across frames, avoiding the quadratic cost of dense attention while retaining the ability to capture irregular motion patterns. A Causal Anomaly Classifier (CAC) applies dilated causal convolutions over snippet-level features and optimizes a contrastive multiple-instance learning objective that separates anomalous and normal representations without frame-level labels. To meet deployment constraints, an Adaptive Confidence Scheduler (ACS) dynamically skips low-information frames at inference time, reducing redundant computation in static scenes. Evaluated on UCF-Crime, ShanghaiTech, and CUHK Avenue, VigilFormer achieves AUC scores of 87.83%, 97.21%, and 89.74% respectively, at 41.5 FPS on a single GPU, outperforming recent weakly-supervised methods in both accuracy and speed.", "url": "https://wpnews.pro/news/vigilformer-deformable-attention-for-video-anomaly-detection-with-causal-risk", "canonical_source": "https://arxiv.org/abs/2606.14724", "published_at": "2026-06-16 04:00:00+00:00", "updated_at": "2026-06-16 04:18:27.641272+00:00", "lang": "en", "topics": ["computer-vision", "machine-learning", "artificial-intelligence"], "entities": ["VigilFormer", "UCF-Crime", "ShanghaiTech", "CUHK Avenue", "Deformable Spatio-Temporal Encoder", "Causal Anomaly Classifier", "Adaptive Confidence Scheduler"], "alternates": {"html": "https://wpnews.pro/news/vigilformer-deformable-attention-for-video-anomaly-detection-with-causal-risk", "markdown": "https://wpnews.pro/news/vigilformer-deformable-attention-for-video-anomaly-detection-with-causal-risk.md", "text": "https://wpnews.pro/news/vigilformer-deformable-attention-for-video-anomaly-detection-with-causal-risk.txt", "jsonld": "https://wpnews.pro/news/vigilformer-deformable-attention-for-video-anomaly-detection-with-causal-risk.jsonld"}}