cd /news/large-language-models/nvidia-nemotron-3-ultra-glm-5-2-the-… · home topics large-language-models article
[ARTICLE · art-44803] src=dev.to ↗ pub= topic=large-language-models verified=true sentiment=↑ positive

NVIDIA Nemotron 3 Ultra & GLM-5.2: The Open Model Flood Is Here (June 2026)

NVIDIA released Nemotron 3 Ultra, a 550-billion-parameter open model under a fully permissive license, competitive with GPT-4.5 on code and reasoning. Z.AI launched GLM-5.2 with MIT-licensed weights, excelling in long-context and multilingual tasks on consumer hardware. Google DeepMind added computer use capabilities to Gemini 3.5 Flash, enabling low-latency browser automation. These June 2026 releases demonstrate that open models now rival proprietary ones across performance, licensing, and deployment flexibility.

read1 min views1 publishedJun 30, 2026

June 2026 is shaping up to be the month open models stopped playing catch-up. Three major releases in as many weeks have shifted the landscape, and none of them involve the usual frontier-lab drama.

On June 4, NVIDIA quietly dropped Nemotron 3 Ultra — a 550-billion-parameter behemoth under a fully permissive open license. That's not "open-weight with strings attached" — it's the most capable model you can download, modify, and deploy commercially without asking permission. Early benchmarks show it competitive with GPT-4.5-class models on code generation and reasoning tasks, while significantly outperforming Llama 4 on mathematical reasoning. If you have the hardware (think 8×H100 nodes minimum), this is the new default for self-hosted enterprise AI.

Z.AI launched GLM-5.2 on June 13, and it arrived with full MIT-licensed weights within the week. What makes this noteworthy isn't just the permissive license — it's that GLM-5.2 punches well above its weight class on long-context retrieval and multilingual benchmarks. Developers running locally can deploy it on consumer-grade hardware with quantization, making it a strong contender for privacy-sensitive applications. The API tier starts at ~$18/month, but the real value is in the self-hosted path.

Google DeepMind also shipped computer use capabilities in Gemini 3.5 Flash this month. Think Claude's computer-use agent paradigm, but running on the fastest Flash-tier model Google offers. Early demos show agents completing multi-step browser tasks — form filling, data extraction, web scraping — at significantly lower latency than competing solutions.

The throughline is clear: open models are no longer a compromise. Whether you need 550B monsters for reasoning, MIT-licensed alternatives for compliance, or fast agents for automation, June 2026 delivered on all fronts.

── more in #large-language-models 4 stories · sorted by recency
── more on @nvidia 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/nvidia-nemotron-3-ul…] indexed:0 read:1min 2026-06-30 ·