{"slug": "nemotron-3-ultra-now-available-on-ai-gateway", "title": "Nemotron 3 Ultra now available on AI Gateway", "summary": "Nvidia's Nemotron 3 Ultra, an open Mixture-of-Experts reasoning model with a 1M token context window, is now available on Vercel AI Gateway. The model is designed for long-running agent workflows, offering up to 350 tokens per second throughput and up to 30% lower cost on agentic tasks. AI Gateway provides a unified API for the model with features including custom reporting, zero data retention, and dynamic provider sorting by latency and cost.", "body_md": "Nemotron 3 Ultra from Nvidia is now available on [Vercel AI Gateway](https://vercel.com/ai-gateway).\n\nNemotron 3 Ultra is an open Mixture-of-Experts reasoning model built for orchestrating long-running agent workflows, with a 1M token context window. The model targets multi-turn agent workflows: planning, tool use, sub-agent delegation, and error recovery. Throughput reaches up to 350 tokens per second, with up to 30% lower cost on agentic tasks.\n\nTo use Nemotron 3 Ultra, set model to `nvidia/nemotron-3-ultra-550b-a55b` in the [AI SDK](https://ai-sdk.dev/).\n\nAI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in [custom reporting](https://vercel.com/changelog/custom-reporting-ai-gateway), [Zero Data Retention support](https://vercel.com/blog/zdr-on-ai-gateway), [dynamic provider sorting by latency and cost](https://vercel.com/changelog/sort-providers-by-cost-latency-or-throughput-on-ai-gateway), and more. 
AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on [Bring Your Own Key](https://vercel.com/docs/ai-gateway#bring-your-own-key) (BYOK) requests.\n\nLearn more about [AI Gateway](https://vercel.com/docs/ai-gateway), view the [AI Gateway model leaderboard](https://vercel.com/ai-gateway/leaderboards) or try it in our [model playground](https://vercel.com/ai-gateway/models/nemotron-3-ultra-550b-a55b).", "url": "https://wpnews.pro/news/nemotron-3-ultra-now-available-on-ai-gateway", "canonical_source": "https://vercel.com/changelog/nemotron-3-ultra-now-available-on-ai-gateway", "published_at": "2026-06-04 07:00:00+00:00", "updated_at": "2026-06-04 18:53:12.574598+00:00", "lang": "en", "topics": ["large-language-models", "ai-infrastructure", "ai-products", "ai-tools", "ai-agents"], "entities": ["Nemotron 3 Ultra", "Nvidia", "Vercel AI Gateway", "AI SDK"], "alternates": {"html": "https://wpnews.pro/news/nemotron-3-ultra-now-available-on-ai-gateway", "markdown": "https://wpnews.pro/news/nemotron-3-ultra-now-available-on-ai-gateway.md", "text": "https://wpnews.pro/news/nemotron-3-ultra-now-available-on-ai-gateway.txt", "jsonld": "https://wpnews.pro/news/nemotron-3-ultra-now-available-on-ai-gateway.jsonld"}}