cd /news/computer-vision/architectural-bias-in-face-presentat… · home topics computer-vision article
[ARTICLE · art-33438] src=arxiv.org ↗ pub= topic=computer-vision verified=true sentiment=· neutral

Architectural Bias in Face Presentation Attack Detection: A Comparative Study of Vision Transformers and Convolutional Neural Networks

A new study comparing Vision Transformers and CNNs for face presentation attack detection found that pretrained Vision Transformers reduce demographic bias and improve accuracy. The DeiT-S model achieved 97.27% accuracy and reduced the inter-ethnic accuracy gap by 83% compared to prior work, while also generalizing 3.6x better to unseen demographic groups. The findings suggest architectural design influences fairness in biometric systems.

read1 min views17 publishedJun 18, 2026

arXiv:2606.18510v1 Announce Type: new Abstract: Face Presentation Attack Detection (PAD) systems constitute a critical security layer in biometric authentication; however, existing approaches exhibit systematic performance disparities across demographic groups, disproportionately affecting individuals with darker skin tones. This paper presents a comparative empirical investigation of whether Vision Transformer architectures reduce demographic bias in face PAD systems relative to convolutional baselines. Experiments are conducted on the CASIA-SURF Cross-Ethnicity Face Anti-Spoofing (CeFA) dataset. Three architectures are evaluated: a Multimodal ViT-Tiny trained from scratch, a ResNet18 CNN baseline, and a pretrained DeiT-S fine-tuned on CeFA across African, East Asian, and zero-shot Central Asian demographic groups. DeiT-S achieves the highest overall accuracy of 97.27% and the lowest EER of 0.86%, outperforming ResNet18 at 90.15% accuracy. In terms of fairness, DeiT-S reduces the inter-ethnic ACER gap between African and East Asian subjects to 0.13%, compared to 0.75% reported in an LBP-based work [6], representing an 83% reduction. Most notably, while ResNet18 records a BPCER of 10.44% on zero-shot Central Asian subjects, DeiT-S maintains 2.89% on the same unseen group, demonstrating a 3.6x generalization advantage. These results suggest that pretrained Vision Transformers achieve superior PAD accuracy, produce smaller demographic performance gaps, and generalize more equitably across unseen demographic groups, indicating that cross-demographic fairness in PAD may partly be influenced by architectural design.

── more in #computer-vision 4 stories · sorted by recency
── more on @vision transformer 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/architectural-bias-i…] indexed:0 read:1min 2026-06-18 ·