Nemotron 3 Ultra now available on AI Gateway

wpnews.pro

[ARTICLE · art-21819] src=vercel.com ↗ pub=2026-06-04T07:00Z topic=large-language-models verified=true sentiment=↑ positive

Nvidia's Nemotron 3 Ultra, an open Mixture-of-Experts reasoning model with a 1M token context window, is now available on Vercel AI Gateway. The model is designed for long-running agent workflows, offering up to 350 tokens per second throughput and up to 30% lower cost on agentic tasks. AI Gateway provides a unified API for the model with features including custom reporting, zero data retention, and dynamic provider sorting by latency and cost.

1 min read · 14 views · published Jun 4, 2026

Nemotron 3 Ultra from Nvidia is now available on Vercel AI Gateway. It is an open Mixture-of-Experts reasoning model with a 1M token context window, built for orchestrating long-running, multi-turn agent workflows: planning, tool use, sub-agent delegation, and error recovery. Throughput reaches up to 350 tokens per second, with up to 30% lower cost on agentic tasks.

To use Nemotron 3 Ultra, set model to `nvidia/nemotron-3-ultra-550b-a55b` in the [AI SDK](https://ai-sdk.dev/).
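As a rough illustration, the model setting could be wired up along these lines. This is a minimal sketch, not code from the original post: the helper names `nemotronRequest` and `askNemotron` and the `TextGenerator` type are hypothetical, and the generator function is passed in so the snippet runs without the `ai` package installed. It assumes the AI SDK's `generateText` accepts a plain string model ID and routes it through AI Gateway, with credentials supplied through the environment.

```typescript
// Model ID as given in the article.
const MODEL_ID = "nvidia/nemotron-3-ultra-550b-a55b";

// Hypothetical minimal shapes: just the fields this sketch relies on.
type TextRequest = { model: string; prompt: string };
type TextGenerator = (req: TextRequest) => Promise<{ text: string }>;

// Build the options object with the model set to Nemotron 3 Ultra.
function nemotronRequest(prompt: string): TextRequest {
  return { model: MODEL_ID, prompt };
}

// Send a prompt through whichever generator is supplied and return the text.
async function askNemotron(
  generate: TextGenerator,
  prompt: string
): Promise<string> {
  const { text } = await generate(nemotronRequest(prompt));
  return text;
}

// Assumed usage with the real SDK (needs the `ai` package and an
// AI Gateway API key in the environment):
//
//   import { generateText } from "ai";
//   const answer = await askNemotron(
//     (req) => generateText(req),
//     "Plan a three-step refactor of this module."
//   );
```

Passing the generator in keeps the model ID in one place and makes the call easy to stub in tests. In application code, calling `generateText` directly with the same `model` string should be equivalent.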

AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, Zero Data Retention support, dynamic provider sorting by latency and cost, and more. AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on Bring Your Own Key (BYOK) requests.

Learn more about AI Gateway, view the AI Gateway model leaderboard or try it in our model playground.


Read original on vercel.com → vercel.com/changelog/nemotron-3-ultra-now-availa…
