
UtVAA: Ultra-tiny Vision Transformer with Affix Attention for Mobile Image Classification

Researchers have introduced UtVAA, an ultra-tiny Vision Transformer architecture with a novel Affix Attention block for mobile image classification. The smallest variant has 204.67K parameters and 53.95M FLOPs, and the authors report competitive accuracy on CIFAR-10, CIFAR-100, and two tomato disease datasets (PlantVillage-Tomato and SLIF-Tomato). According to the authors, the results show that transformer-based models can be redesigned to run on resource-constrained devices without significant loss in discriminative performance.

Published Jun 16, 2026

arXiv:2606.14735v1. Abstract: Vision Transformers (ViTs) have demonstrated strong representation capability in image classification. However, their quadratic self-attention complexity and large parameter counts limit deployment on resource-constrained mobile and edge devices. This paper introduces UtVAA, an ultra-tiny Vision Transformer architecture designed for efficient visual recognition under strict computational budgets. It incorporates a novel Affix Attention block that combines depthwise-pointwise local feature extraction, linear self-attention, coordinate attention for spatial dependency modelling, and a lightweight ternary fusion strategy to integrate local and global representations. In addition, Dilated Bottleneck blocks expand the receptive field using dilated depthwise separable convolutions while maintaining low FLOPs and stable optimisation through residual connections. UtVAA is implemented in scalable Tiny, Medium, and Large variants, with the smallest model containing 204.67K parameters and 53.95M FLOPs. Experimental results on CIFAR-10, CIFAR-100, PlantVillage-Tomato and SLIF-Tomato datasets show that UtVAA achieves competitive accuracy within a sub-million-parameter regime. Overall, the results demonstrate that transformer-based vision models can be redesigned into ultra-tiny architectures without significant loss in discriminative performance, making UtVAA suitable for mobile and edge deployment. Code is available at https://github.com/romiyal/UtVAA
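
The abstract names three general efficiency ideas: depthwise-pointwise (separable) convolutions, dilation to widen the receptive field, and linear rather than quadratic self-attention. The sketch below illustrates why each one saves parameters or compute, using standard textbook cost formulas. It is not taken from the UtVAA code: the function names, the channel and token sizes, and the choice of a kernelized linear-attention cost model (forming K^T V before multiplying by Q) are all assumptions made for illustration, and the paper's own linear attention may be costed differently.

```python
# Illustrative cost arithmetic only; sizes below are hypothetical,
# not UtVAA's actual layer configuration.

def conv_params(c_in, c_out, k, bias=False):
    """Weights in a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def dw_separable_params(c_in, c_out, k, bias=False):
    """Depthwise k x k conv (one filter per input channel) plus a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Spatial extent covered by a k-tap kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def softmax_attention_macs(n_tokens, dim):
    """Multiply-accumulates for Q K^T and (scores) V: quadratic in n_tokens."""
    return 2 * n_tokens * n_tokens * dim


def linear_attention_macs(n_tokens, dim):
    """Multiply-accumulates when K^T V (dim x dim) is formed first: linear in n_tokens.

    Assumes a kernelized linear-attention formulation; other variants differ.
    """
    return 2 * n_tokens * dim * dim


if __name__ == "__main__":
    c_in, c_out, k = 64, 64, 3  # hypothetical layer size
    print("standard conv params:    ", conv_params(c_in, c_out, k))
    print("separable conv params:   ", dw_separable_params(c_in, c_out, k))

    for d in (1, 2, 4):
        print(f"3-tap kernel, dilation {d}: covers {effective_kernel(3, d)} pixels")

    n_tokens, dim = 256, 64  # hypothetical token count and embedding width
    print("softmax attention MACs:  ", softmax_attention_macs(n_tokens, dim))
    print("linear attention MACs:   ", linear_attention_macs(n_tokens, dim))
```

For these hypothetical sizes, the separable convolution needs 4,672 weights against 36,864 for the standard one, a 3-tap kernel at dilation 2 spans 5 pixels with no extra weights, and the linear cost model is cheaper whenever the token count exceeds the embedding width.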
