NVIDIA Nemotron 3 Ultra is Here – and it’s Free to Use in Kilo

Source: blog.kilo.ai · Published: 2026-06-04T14:05Z · Topic: large-language-models

NVIDIA has released Nemotron 3 Ultra, a 550-billion-parameter open-weights model, and Kilo Code is offering it free for a limited time. The model, announced by CEO Jensen Huang at Computex 2026, is the top open model on the PinchBench agentic benchmark and is said to achieve 5x higher throughput than other open models in its class. The release signals the growing viability of open-weight models for agentic coding tasks.

The open-weight future is looking bright

We are incredibly excited to announce that **NVIDIA Nemotron 3 Ultra** is now available to use in Kilo Code!

NVIDIA just dropped a game-changer for agentic coding, and you can experience the most powerful open-weights model right now directly in your terminal, VS Code, or JetBrains IDE, powered by Kilo Code.

Even better? **NVIDIA Nemotron 3 Ultra is FREE in Kilo for a limited time.**

Meet the 550B Heavyweight: Nemotron 3 Ultra

Introduced by NVIDIA CEO Jensen Huang during his keynote last weekend at Computex 2026 in Taipei, **Nemotron 3 Ultra** is NVIDIA’s flagship open-weights model. Its size isn’t just for show: it is also efficient. On stage, Huang noted the model’s high PinchBench score, which currently makes it the top open model on that agentic benchmark. As he put it, the model is “frontier smart” and achieves 5x higher throughput compared to other open models in its class.

Nemotron 3 Super, a 120B-parameter open hybrid MoE model NVIDIA released earlier this year, has become a daily driver for many on Kilo, but it has limitations around planning and long-horizon tasks. The release of Nemotron 3 Ultra sends a signal to the industry that open-weight models are here to stay.

Built on a hybrid Mamba-Transformer Mixture-of-Experts (MoE) architecture, Nemotron 3 Ultra has 550 billion total parameters but activates only 55 billion of them per token during a forward pass. This means you get the reasoning capabilities of a frontier-class model while maintaining fast inference, delivering over 300 tokens per second.
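To make those numbers concrete, here is a minimal back-of-the-envelope sketch. The three constants come from the figures quoted above; the function names are hypothetical helpers for illustration, and the timing bound simply assumes throughput never drops below the quoted 300 tokens per second, which real workloads may not satisfy.

```python
# Figures quoted in the post; everything else here is illustrative.
TOTAL_PARAMS = 550_000_000_000   # total parameters
ACTIVE_PARAMS = 55_000_000_000   # parameters activated per token
QUOTED_TOKENS_PER_SEC = 300      # the post says "over 300", so treat it as a floor


def active_fraction(total: int, active: int) -> float:
    """Share of the model's parameters used for a single token."""
    return active / total


def max_seconds_to_generate(num_tokens: int, tokens_per_sec: float) -> float:
    """Upper bound on generation time, assuming throughput stays at or above the floor."""
    return num_tokens / tokens_per_sec


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    bound = max_seconds_to_generate(3000, QUOTED_TOKENS_PER_SEC)
    print(f"active share per token: {share:.0%}")
    print(f"3,000 output tokens: at most {bound:.0f} s at the quoted floor")
```

With the quoted figures, one in ten parameters is active for any given token, which is where the speed claim for a model of this size comes from.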

Benchmarks & Why it Shines in Kilo Code

Built with contributions from the Nemotron Coalition, Nemotron 3 Ultra was explicitly engineered for tool use, agentic reasoning, and complex coding environments. Here is why it pairs perfectly with Kilo:

- **The Open-Weights Champion:** Nemotron 3 Ultra currently holds the “Best Open-Weights” title on the PinchBench Agentic leaderboard with an impressive 90% median success rate.
- **High Intelligence and Speed:** It scores a 48 on the Artificial Analysis Intelligence Index, making it the smartest open model from the US to date and placing it in the optimal quadrant for both high capability and fast output speed.
- **A Strong Qwen Competitor:** In KiloBench, our internal evals, Nemotron 3 Ultra performed very similarly to Qwen 3.7 Plus. It will be interesting to see which model comes out ahead on agentic tasks like coding and planning on the Kilo Leaderboard.
- **1 Million Token Context Window:** Nemotron 3 Ultra natively supports up to 1,000,000 tokens of context. You can load entire codebases, deep API documentation, and massive error logs into Kilo without worrying about forced truncations or the model losing the plot mid-session.
- **Built for Agentic Workflows:** The Nemotron 3 family is heavily optimized for multi-environment reinforcement learning (including SWE-RL). It excels at the exact operations coding agents run in their inner loops: multi-step planning, codebase navigation, tool calling, and structured code generation.
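For a rough sense of what fits in a 1,000,000-token window, the sketch below estimates token counts from character counts. The 4-characters-per-token ratio and the output reserve are assumptions for illustration only; the model's actual tokenizer will give different counts, and the helper names are hypothetical rather than part of Kilo or any NVIDIA tooling.

```python
import math

CONTEXT_LIMIT = 1_000_000  # context window quoted in the post, in tokens


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. chars_per_token is an assumed heuristic, not a tokenizer."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(documents: list[str],
                    reserve_for_output: int = 50_000,
                    limit: int = CONTEXT_LIMIT) -> bool:
    """True if the estimated input tokens plus an output reserve stay within the limit."""
    used = sum(estimate_tokens(doc) for doc in documents)
    return used + reserve_for_output <= limit


if __name__ == "__main__":
    small_repo = ["x" * 400_000]      # about 100k estimated tokens
    huge_logs = ["x" * 4_000_000]     # about 1M estimated tokens before any reserve
    print(fits_in_context(small_repo))
    print(fits_in_context(huge_logs))
```

A check like this is only a first-pass budget; an exact count requires running the real tokenizer over the files you plan to load.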

Open and Customizable, Deployable Anywhere

At Kilo, we believe in the power of open source. That is why Nemotron 3 Ultra is such a natural fit for our platform. Just as Kilo provides an open-source foundation for your coding workflows, Nemotron 3 Ultra is fully open, customizable, and deployable anywhere.

The Nemotron models are released with open weights, datasets, and recipes, giving organizations total transparency and control to customize models for domain-specific workflows and deploy them exactly where their applications and data reside. Developers can leverage tools like NVIDIA NeMo to customize, evaluate, and optimize the model for their specific use cases. Because the Nemotron family of models is open, organizations can deploy them in entirely self-hosted environments that meet strict regulatory, sovereignty, or data localization requirements—putting you firmly in the driver’s seat.

Give Nemotron 3 Ultra a spin today wherever you use Kilo. It’s totally free in all of our products and features for a limited time!

Read original on blog.kilo.ai → blog.kilo.ai/p/nvidia-nemotron-3-ultra
