[ARTICLE · art-19911] src=arxiv.org pub= topic=artificial-intelligence verified=true sentiment=↑ positive

Cosmos 3: Omnimodal World Models for Physical AI

NVIDIA researchers introduced Cosmos 3, a family of omnimodal world models that jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. According to the authors' evaluation, the models set a new state of the art across a diverse suite of understanding and generation tasks while subsuming vision-language models, video generators, world simulators, and world-action models into a single framework for Physical AI. NVIDIA released the code, model checkpoints, curated synthetic datasets, and evaluation benchmark under the Linux Foundation's OpenMDW-1.1 license to accelerate research and deployment of embodied agents.

1 min read · published Jun 3, 2026

arXiv:2606.02800v1 · Announce Type: new

Abstract: We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. By supporting highly flexible input-output configurations, Cosmos 3 seamlessly unifies critical modalities for Physical AI -- effectively subsuming vision-language models, video generators, world simulators, and world-action models into a single framework. Our evaluation demonstrates that Cosmos 3 establishes a new state-of-the-art across a diverse suite of understanding and generation tasks, demonstrating omnimodal world models as scalable, general-purpose backbones for embodied agents. Our post-trained Cosmos 3 models were ranked as the best open-source Text-to-Image and Image-to-Video models by Artificial Analysis, and the best policy model by RoboArena at the time the technical report was written. To accelerate open research and deployment in Physical AI, we make our code, model checkpoints, curated synthetic datasets, and evaluation benchmark available under the Linux Foundation's OpenMDW-1.1 License (https://openmdw.ai/license/1-1/) at https://github.com/nvidia/cosmos and https://huggingface.co/collections/nvidia/cosmos3 . The project website is available at https://research.nvidia.com/labs/cosmos-lab/cosmos3 .
