Nvidia's new world model will spur robotics leap


Source: thedeepview.com · Published 2026-06-01T05:30Z · Topic: artificial-intelligence


At Computex in Taipei, Nvidia unveiled Cosmos 3, a new open-source world foundation model designed to improve generalization for physical AI and robotics development. The model, which processes text, video, images, sound and action, introduces a mixture-of-transformers architecture and is trained on 20 trillion tokens to help developers build more adaptable AI agents. The release targets a key barrier to physical AI deployment: Nvidia aims to provide a foundation for solving what it calls the "holy grail" of robotics, generalization.


As the AI industry looks beyond language models, Nvidia is betting big on the buzzy new technology powering physical AI: world models.

At Nvidia GTC Taipei at Computex, the company unveiled Cosmos 3, a new generalist world foundation model that it calls a "fully open omnimodel," capable of reasoning and generation across text, video, images, ambient sound and action. This iteration of the Cosmos world model family builds on previous generations by improving generalization, the lack of which is a major barrier to physical AI development and deployment.

"We wanted to build this Cosmos 3 model to help physical AI developers to build more generalizable physical AI models," Ming-Yu Liu, Nvidia's VP of Cosmos Labs, told The Deep View.

Cosmos 3 debuts a number of world model innovations, Liu said:

The model utilizes a new architecture called "mixture-of-transformers," which combines the best aspects of two types of transformers: one for reasoning and one for generation. This enables it to understand object interactions, motion, and spatiotemporal relationships before generating video or action paths.
Cosmos 3 also doesn’t treat just one kind of data as a first-class citizen, said Liu. Instead, being omnimodal, it reasons with and generates "image, video, sound, and action, together with text," he said.
Additionally, Cosmos 3 is trained on one of the largest multimodal datasets for physical AI, spanning 20 trillion tokens, 1 billion images and 400 million authentic and synthetic videos.

The model comes in several sizes: Super, the larger model for high-quality physics and accuracy, and Nano, for more efficient, quick generation needs, both of which are available now. Edge, which offers real-time inference for edge computing, will be available soon.

The models are also open-source, which Liu said offers developers more control and usability in physical AI development, a process that can be "challenging to do with API assets only." That allows enterprises to run them locally, customize them for their needs, and better control data security.

Because the foundation models themselves are "just a starting point for physical AI developers," the goal is to integrate these models into ecosystems to provide a foundation for solving critical problems, he said.

Cosmos 3 is just one step in the right direction in solving one of physical AI’s most pressing challenges. "We believe that the key problem to solve in physical AI is the generalization capability of the agent," Liu said. "To be clear, [Cosmos] is not yet solving the problem, but I think this architecture provides a great foundation to solve what I think is the holy grail in robotics."

Our Deeper View

With Cosmos, Nvidia is feeding the open model ecosystem, both for the benefit of the ecosystem and for its own benefit. Beyond giving developers a foundation for pursuing what Liu calls robotics’ "holy grail," feeding a market that will inevitably demand more compute is an opportunity for Nvidia to make money in the end, and potentially to improve its own chips through extreme hardware co-design. And while the benefits would extend back to Nvidia, a rising tide lifts all boats. As the industry broadly embraces the promise of physical AI, Nvidia's sharing of its resources and innovation will help stimulate further innovation.



