
NAVI-Orbital: First In-Orbit Demonstration of a Zero-Shot Vision-Language Model for Autonomous Earth Observation

On April 16, 2026, NAVI-Orbital achieved what its authors believe is the first in-orbit demonstration of a vision-language model performing autonomous multi-modal inference entirely onboard a Low Earth Orbit spacecraft. The system uses Gemma 3 to classify each captured scene, generate a text description, and answer operator follow-up questions in natural language. In ground benchmarking it reached 88.16% accuracy on a curated 7,960-image AID benchmark; in orbit it processed newly acquired, uncorrected YAM-9 imagery with hardware-accelerated GPU inference and no fine-tuning for the flight instrument. The authors present this as a move away from the conventional acquire-then-downlink-everything approach toward in-orbit semantic compression of Earth observations.

Published Jun 18, 2026 · 1 min read

arXiv:2606.18271v1 · Announce Type: new

Abstract: As Earth Observation data generation outpaces downlink bandwidth and human-in-the-loop processing, a widening gap has emerged between onboard collection and actionable ground intelligence. This paper presents NAVI-Orbital, a software system deployed on a Low Earth Orbit (LEO) spacecraft. On April 16, 2026, NAVI-Orbital achieved what is, to the authors' knowledge, the first in-orbit demonstration of a vision-language model performing autonomous multi-modal inference entirely onboard. NAVI-Orbital uses a local vision-language model (Gemma 3) to classify each captured scene, produce a text description of its content and the relationships between its features, and respond to operator follow-up via natural-language dialogue. The system is re-tasked through plain-English prompts in place of conventional command sequences, and is orchestrated by a graph-based state machine (LangGraph) coordinating dedicated agents for detection and dialogue. Results across ground benchmarking (88.16% accuracy on the 7,960-image curated AID benchmark), Flatsat validation, and live in-orbit captures of newly acquired, previously unseen Earth imagery (including uncorrected YAM-9 imagery, processed onboard with hardware-accelerated GPU inference and no fine-tuning for the flight instrument) demonstrate the feasibility of running foundation models on satellite-class edge computers to invert the conventional acquire-then-downlink-everything bandwidth profile through semantic compression of Earth observations in-orbit.
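
To make the orchestration idea in the abstract concrete, here is a minimal sketch of a graph-style state machine that routes one captured scene through a detection step, a description step, and an optional dialogue step, then packs the result into a small text record for downlink. It is not the authors' implementation and does not use LangGraph itself: the node names, the `SceneState` fields, the `SCENE_LABELS` list, and the `stub_vlm` function are all hypothetical stand-ins, and a real system would call an onboard model such as Gemma 3 where the stub is.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

END = "end"

# Hypothetical label set for illustration; the paper's actual prompts
# and class list are not given in the abstract.
SCENE_LABELS = ["airport", "farmland", "forest", "port", "river"]


@dataclass
class SceneState:
    """Shared state passed between nodes (field names are assumptions)."""
    image_bytes: bytes
    operator_query: Optional[str] = None
    label: Optional[str] = None
    description: Optional[str] = None
    answer: Optional[str] = None
    trace: List[str] = field(default_factory=list)


def stub_vlm(prompt: str, image: bytes) -> str:
    """Placeholder for an onboard vision-language model call."""
    if prompt.startswith("Classify"):
        # Deterministic fake prediction so the sketch runs without a model.
        return SCENE_LABELS[len(image) % len(SCENE_LABELS)]
    return f"[stub reply to: {prompt[:40]}]"


Vlm = Callable[[str, bytes], str]


def detect(state: SceneState, vlm: Vlm) -> str:
    # Zero-shot classification: the label set lives in the prompt,
    # so re-tasking means editing plain text rather than retraining.
    prompt = "Classify this scene as one of: " + ", ".join(SCENE_LABELS)
    state.label = vlm(prompt, state.image_bytes)
    return "describe"


def describe(state: SceneState, vlm: Vlm) -> str:
    prompt = f"Describe the {state.label} scene and how its features relate."
    state.description = vlm(prompt, state.image_bytes)
    # Route to the dialogue node only if an operator asked something.
    return "dialogue" if state.operator_query else END


def dialogue(state: SceneState, vlm: Vlm) -> str:
    state.answer = vlm(state.operator_query or "", state.image_bytes)
    return END


NODES: Dict[str, Callable[[SceneState, Vlm], str]] = {
    "detect": detect,
    "describe": describe,
    "dialogue": dialogue,
}


def run(state: SceneState, vlm: Vlm = stub_vlm, entry: str = "detect") -> SceneState:
    """Walk the graph: each node updates the state and names its successor."""
    node = entry
    while node != END:
        state.trace.append(node)
        node = NODES[node](state, vlm)
    return state


def downlink_summary(state: SceneState) -> bytes:
    """Text record that could be downlinked in place of the raw image."""
    text = f"{state.label}|{state.description}|{state.answer or ''}"
    return text.encode("utf-8")


if __name__ == "__main__":
    fake_image = b"\x00" * 1000  # stand-in for a captured frame
    result = run(SceneState(fake_image, operator_query="Is there a runway?"))
    summary = downlink_summary(result)
    print(result.trace)
    print(f"raw={len(fake_image)} B, summary={len(summary)} B")
```

The point of the last function is the bandwidth argument: if only the short text record is downlinked, the payload size depends on the length of the description rather than on the size of the image. The byte counts printed here come from placeholder data and say nothing about the ratios achieved in flight.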
