ActQuant: Sub-4-bit Action-Guided Quantization for Vision-Language-Action Models

wpnews.pro


Source: arxiv.org · Published: 2026-05-26T04:00Z · Topic: machine-learning


Researchers introduced ActQuant, a sub-4-bit post-training quantization framework for Vision-Language-Action (VLA) models, aimed at cutting memory and compute demands for edge deployment. On the LIBERO benchmark at 3 bits per weight, it retained 95.0% on OpenVLA-OFT and 94.8% on π0.5. At 2.5 bits per weight it reached 90.1% on OpenVLA-OFT while compressing the backbone from 14.3 GB to 2.7 GB (5.3×). On a physical UR3 robotic arm, π0.5 quantized with ActQuant matched the baseline success rate while reducing the memory footprint by 2.5×.
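The compression figures can be sanity-checked with a few lines of arithmetic. The sketch below assumes the 14.3 GB baseline stores weights at 16 bits each; the source does not state the baseline precision, so treat that value as a labeled guess.

```python
# Sanity check of the reported compression figures.
# ASSUMPTION (not stated in the source): the 14.3 GB baseline uses 16 bits per weight.
BASELINE_GB = 14.3
QUANTIZED_GB = 2.7
TARGET_BPW = 2.5
BASELINE_BITS = 16  # assumed baseline precision

# Ratio of baseline size to quantized size.
ratio = BASELINE_GB / QUANTIZED_GB

# Average bits per weight implied by the reported file sizes.
effective_bpw = BASELINE_BITS * QUANTIZED_GB / BASELINE_GB

# Size the backbone would have if every stored bit were a 2.5-bpw weight.
ideal_gb = BASELINE_GB * TARGET_BPW / BASELINE_BITS

print(f"compression ratio:     {ratio:.1f}x")
print(f"implied bits/weight:   {effective_bpw:.2f}")
print(f"ideal size at 2.5 bpw: {ideal_gb:.2f} GB")
```

Under that assumption the ratio matches the reported 5.3×, while the implied average sits near 3 bits per weight. The gap to 2.5 bpw would be consistent with storage overhead such as per-block scales or components kept at higher precision, though the source gives no breakdown.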

Published May 26, 2026 · 1 min read

arXiv:2605.24011v1 (new submission). Abstract: Vision-Language-Action (VLA) models exhibit remarkable action generation for embodied intelligence, but their heavy compute demands make deployment on edge platforms impractical. Aggressive, sub-4-bit weight quantization is the natural solution, yet existing post-training quantization (PTQ) methods suffer severe performance degradation in this regime. To address this, we introduce ActQuant, an action-guided mixed-precision PTQ framework that operates in two stages: (1) an inter-tensor bit allocator that assigns each weight matrix a single bit-width based on how much it contributes to predicting the agent's actions; (2) an intra-tensor scale optimizer that tunes per-block quantization scales using action-aware curvature, so that dynamic range is concentrated on the weights most influential for control. To deliver the on-device benefits of our aggressive quantization, we further introduce OmniModel.cpp, an agentic conversion pipeline that ports architectures into a native C/C++ runtime with efficient low-bit kernels. We evaluate ActQuant both in simulation and on a real-world 6-DoF UR3 arm, with all models deployed through OmniModel.cpp. On the LIBERO benchmark, ActQuant is the only method that operates at or below 3 bits-per-weight (bpw), retaining 95.0% on OpenVLA-OFT and 94.8% on π0.5. Pushed further, ActQuant reaches 2.5 bpw at 90.1% on OpenVLA-OFT, compressing the backbone from 14.3 GB to 2.7 GB (5.3×). On the physical UR3 arm, π0.5 quantized with ActQuant retains the baseline's success rate while reducing the memory footprint by 2.5×.
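To make the two-stage idea concrete, here is a minimal, illustrative sketch in plain Python. It is not the paper's algorithm: the abstract gives no formulas, so the greedy allocator, the "error drops about 4× per extra bit" heuristic, the grid search over scales, and all names (`allocate_bits`, `best_scale`, the sensitivity and importance values) are assumptions made for illustration.

```python
import heapq

def allocate_bits(tensors, target_bpw, min_bits=2, max_bits=4):
    """Stage 1 (hypothetical): one bit-width per tensor under an average budget.
    tensors maps name -> (num_weights, sensitivity); sensitivity is an assumed
    stand-in for how much the tensor matters for action prediction."""
    bits = {name: min_bits for name in tensors}
    total = sum(n for n, _ in tensors.values())
    budget = (target_bpw - min_bits) * total  # spare bits above the floor
    # heapq is a min-heap, so store negated sensitivity-per-weight.
    heap = [(-s / n, name) for name, (n, s) in tensors.items()]
    heapq.heapify(heap)
    while heap:
        gain, name = heapq.heappop(heap)
        n = tensors[name][0]
        if bits[name] < max_bits and n <= budget:
            bits[name] += 1
            budget -= n
            # Assumed heuristic: each extra bit cuts quantization error ~4x,
            # so the benefit of the next upgrade shrinks accordingly.
            heapq.heappush(heap, (gain / 4, name))
    return bits

def quantize_block(w, scale, qmax):
    """Symmetric round-to-nearest quantization, returned in dequantized form."""
    return [scale * max(-qmax, min(qmax, round(x / scale))) for x in w]

def best_scale(w, importance, bits, grid=20):
    """Stage 2 (hypothetical): choose the per-block scale that minimizes an
    importance-weighted squared error. importance is an assumed proxy for
    action-aware curvature. Assumes the block has a non-zero weight."""
    qmax = 2 ** (bits - 1) - 1
    base = max(abs(x) for x in w) / qmax  # plain max-abs scale
    best = None
    for k in range(grid):
        scale = base * (1.0 - 0.5 * k / grid)  # shrink toward half of base
        deq = quantize_block(w, scale, qmax)
        err = sum(h * (x - q) ** 2 for x, q, h in zip(w, deq, importance))
        if best is None or err < best[0]:
            best = (err, scale)
    return best[1], best[0]

# Toy inputs, invented for illustration only.
tensors = {"attn": (1000, 8.0), "mlp": (3000, 3.0), "embed": (1000, 0.4)}
bits = allocate_bits(tensors, target_bpw=3.0)
print(bits)

block = [0.9, 0.11, -0.1, 0.12]
importance = [0.01, 1.0, 1.0, 1.0]  # the large outlier matters least here
scale, err = best_scale(block, importance, bits=3)
print(scale, err)
```

In this toy setup the allocator spends the budget on the most sensitive tensor first, and the scale search may clip a low-importance outlier when doing so gives finer resolution to the weights that carry more importance.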

source & further reading


Read original on arxiv.org → arxiv.org/abs/2605.24011


