
DriveStack-VLA: Render-Teacher Alignment for BEV-Based DeepStack Vision-Language-Action Model

Researchers introduced DriveStack-VLA, a framework that augments Vision-Language-Action (VLA) driving models with a Bird's-Eye-View (BEV) representation and Render-Teacher Alignment to strengthen driving-oriented spatial grounding. The model reports 91.6 PDMS on NAVSIMv1, 91.0 EPDMS on NAVSIMv2, and a driving score of 79.49 with a 56.36% success rate on the closed-loop Bench2Drive benchmark.

Published Jun 24, 2026

arXiv:2606.24051v1. Abstract: Vision-Language-Action (VLA) driving models convert a pretrained Vision-Language Model (VLM) into a driving policy, allowing them to use world knowledge and follow language guidance. However, existing VLA driving models still lack driving-oriented spatial intelligence: their policies are mainly grounded on perspective image tokens and language priors, while precise motion planning requires metric geometry, top-down scene structure, and attention to safety-critical perceptual cues. This limitation leaves current models vulnerable to weak visual geometry modeling and weak perceptual coverage in expert demonstrations. In this paper, we present DriveStack-VLA, a framework built upon a large VLM backbone. To strengthen the spatial grounding of VLA driving, we develop dual visual modeling components. We inject a Bird's-Eye-View (BEV) representation into the Large Language Model decoder through a DeepStack-style connection, and propose Render-Teacher Alignment to align the perceptual focus of real images with that of rasterized images. Furthermore, to bridge the gap in multimodal trajectory selection, we introduce a head-based self-critique module that ranks sampled trajectories and conditionally refines the best one. DriveStack-VLA achieves 91.6 PDMS on NAVSIMv1, 91.0 EPDMS on NAVSIMv2 (with the human penalty filter enabled), and a driving score of 79.49 with a success rate of 56.36% on the closed-loop Bench2Drive. More visualizations are available on our project page: https://anonymous.4open.science/w/drivestack-vla/.
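
The rank-then-conditionally-refine step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `rank_and_refine`, `toy_score`, and `toy_refine` are hypothetical, the learned scoring and refinement heads are replaced by hand-written toy functions, and the rule "refine only when the best score falls below a threshold" is an assumption, since the abstract does not say how the refinement condition is defined.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def rank_and_refine(
    candidates: Sequence[Trajectory],
    score_fn: Callable[[Trajectory], float],
    refine_fn: Callable[[Trajectory], Trajectory],
    refine_below: float,
) -> Tuple[Trajectory, float, bool]:
    """Pick the top-scoring candidate and refine it only if its score is low.

    Hypothetical sketch: the threshold rule is an assumption, not a detail
    taken from the paper.
    Returns (trajectory, score of the selected candidate, was_refined).
    """
    if not candidates:
        raise ValueError("need at least one candidate trajectory")
    # Rank: score every sampled trajectory and keep the best one.
    scores = [score_fn(traj) for traj in candidates]
    best_idx = max(range(len(candidates)), key=scores.__getitem__)
    best, best_score = candidates[best_idx], scores[best_idx]
    # Conditional refinement: only touch the winner when the critic is unsure.
    if best_score < refine_below:
        return refine_fn(best), best_score, True
    return list(best), best_score, False


# Toy stand-ins for the learned heads (assumptions for illustration only).
def toy_score(traj: Trajectory) -> float:
    # Penalize the largest lateral deviation from a lane center at y = 0.
    return -max(abs(y) for _, y in traj)


def toy_refine(traj: Trajectory) -> Trajectory:
    # Pull every waypoint halfway toward the lane center.
    return [(x, 0.5 * y) for x, y in traj]


samples = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (1.0, 0.4)],
    [(0.0, 0.0), (1.0, -2.0)],
]
chosen, score, refined = rank_and_refine(samples, toy_score, toy_refine, refine_below=-0.2)
```

In this toy setup the second sample wins the ranking, and because its score is below the assumed threshold it is passed through `toy_refine`. Raising the bar for refinement (a lower `refine_below`) returns the winner unchanged.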
