June 2026 is shaping up to be the month open models stopped playing catch-up. Three major releases in as many weeks have shifted the landscape, and none of them involve the usual frontier-lab drama.
On June 4, NVIDIA quietly dropped Nemotron 3 Ultra — a 550-billion-parameter behemoth under a fully permissive open license. That's not "open-weight with strings attached" — it's the most capable model you can download, modify, and deploy commercially without asking permission. Early benchmarks show it competitive with GPT-4.5-class models on code generation and reasoning tasks, while significantly outperforming Llama 4 on mathematical reasoning. If you have the hardware (think 8×H100 nodes minimum), this is the new default for self-hosted enterprise AI.
Z.AI launched GLM-5.2 on June 13, and it arrived with full MIT-licensed weights within the week. What makes this noteworthy isn't just the permissive license — it's that GLM-5.2 punches well above its weight class on long-context retrieval and multilingual benchmarks. Developers running locally can deploy it on consumer-grade hardware with quantization, making it a strong contender for privacy-sensitive applications. The API tier starts at ~$18/month, but the real value is in the self-hosted path.
Google DeepMind also shipped computer use capabilities in Gemini 3.5 Flash this month. Think Claude's computer-use agent paradigm, but running on the fastest Flash-tier model Google offers. Early demos show agents completing multi-step browser tasks — form filling, data extraction, web scraping — at significantly lower latency than competing solutions.
The throughline is clear: open models are no longer a compromise. Whether you need 550B monsters for reasoning, MIT-licensed alternatives for compliance, or fast agents for automation, June 2026 delivered on all fronts.