Key Points #
- OpenAI's new GPT-5.6 generation includes the flagship Sol and two cheaper tiers, Terra and Luna.
- By OpenAI's own benchmark numbers, Sol edges past Anthropic's Claude Mythos 5 in agentic coding and roughly matches it in cybersecurity.
- The US government is restricting access to select partners for now. OpenAI says the policy hurts developers and businesses.
OpenAI's new flagship GPT-5.6 Sol claims a lead over Anthropic's Claude Mythos in agentic coding and goes toe to toe with it in cybersecurity. Access stays limited to a handful of partners for now.
OpenAI has unveiled GPT-5.6, a new generation of models led by the flagship Sol and built to compete with Anthropic's Mythos class. The limited preview is open only to select partners through the API and Codex, at the explicit direction of the US government. The same government previously yanked Anthropic's Mythos-class model Fable 5 off the market.
OpenAI isn't subtle about its frustration. "We don’t believe this kind of government access process should become the long-term default. It keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them," the company says.
GPT-5.6 also brings a new layered naming scheme that looks a lot like Anthropic's. The version number (here 5.6) marks the generation, while Sol, Terra, and Luna are permanent performance tiers that can evolve on their own. Sol is the flagship. Terra matches GPT-5.5 at half the cost. Luna is the budget option. On top of that, there's a "max" mode for deeper reasoning and an "ultra" mode that farms out complex tasks to sub-agents running in parallel.
Sol edges past Claude Mythos in agentic coding #
OpenAI's benchmark numbers put Sol ahead of Anthropic's Claude Mythos 5 in agentic coding. On Terminal-Bench 2.1, Sol scores 88.8 percent. Sol Ultra hits 91.9, Claude Mythos 5 lands at 88 percent, and Fable 5 trails at 84.3.
Sol also shows gains in biology. On GeneBench v1, a benchmark for genomics and quantitative biology, it beats GPT-5.5 (30 percent vs. 22 percent best case) while burning fewer tokens.
On ExploitBench, which tests how well AI agents can find and exploit real security flaws in Google's V8 JavaScript engine all the way to full code execution, Sol matches Mythos Preview's performance while using roughly a third of the output tokens, OpenAI says.
On ExploitGym, a benchmark built by UC Berkeley researchers with OpenAI and other labs, all three GPT-5.6 models get better as reasoning effort goes up. That points to room for scaling with more compute. Claude numbers for this benchmark aren't available yet.
OpenAI calls Sol its most capable cybersecurity model yet but frames it as a defender, not an attacker. The model is better at spotting and fixing flaws than at running full end-to-end attacks on its own, the company says. Mythos, by contrast, did manage a full end-to-end attack in a different benchmark.
In tests with Chromium and Firefox, Sol found bugs and exploitation primitives but never produced an autonomous full-chain exploit. OpenAI says GPT-5.6 Sol is still below the "Cyber Critical" threshold in its Preparedness Framework.
Pricing, availability, and a Cerebras launch in July #
Per million tokens, OpenAI charges $5 input and $30 output for Sol, $2.50 and $15 for Terra, and $1 and $6 for Luna. The company has also revamped its prompt caching system with explicit cache breakpoints and a guaranteed minimum lifetime of 30 minutes. Cache writes cost 1.25x the regular input price. Cache reads still get a 90 percent discount.
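To make the pricing concrete, here is a minimal Python sketch of the arithmetic. The prices and cache multipliers come from the figures above; the function name `request_cost`, the split of input tokens into fresh, cache-write, and cache-read buckets, and the assumption that the 90 percent discount applies to the regular input price are illustrative guesses, not OpenAI's documented billing logic.

```python
# Per-million-token list prices in USD, as quoted in the article:
# (input price, output price) per tier.
PRICES = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}
CACHE_WRITE_MULTIPLIER = 1.25  # cache writes cost 1.25x the input price
CACHE_READ_MULTIPLIER = 0.10   # cache reads get a 90 percent discount (assumed off the input price)


def request_cost(tier, fresh_input, cache_write, cache_read, output):
    """Return the USD cost of one request. All counts are in tokens.

    Hypothetical helper: it assumes the three input buckets are disjoint
    and billed independently, which the article does not spell out.
    """
    input_price, output_price = PRICES[tier]
    weighted_input = (
        fresh_input
        + cache_write * CACHE_WRITE_MULTIPLIER
        + cache_read * CACHE_READ_MULTIPLIER
    )
    return (weighted_input * input_price + output * output_price) / 1_000_000
```

Under these assumptions, a 20,000-token prompt on Sol costs $0.10 uncached, $0.125 when written to the cache, and $0.01 on each later cache read, so caching would pay for itself from the second request onward.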
Since Sol uses fewer tokens to match or beat competitors across several benchmarks, the effective cost per task could end up lower than previous generations. That would push back against the trend of AI models getting pricier with each release, a frequent criticism lately, and a competitive weak spot against cheaper Chinese models.
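The trade-off between per-token price and token efficiency can be sketched in a few lines of Python. This is a simplified illustration that looks at output tokens only and ignores input and caching costs; the helper names and the $15 rival price are hypothetical, chosen only to show the arithmetic.

```python
def task_cost(output_tokens, output_price_per_million):
    """USD cost of a task's output tokens at a given per-million price."""
    return output_tokens * output_price_per_million / 1_000_000


def break_even_token_ratio(price_a, price_b):
    """Fraction of model B's output tokens that model A may use
    before A's output cost exceeds B's (output tokens only)."""
    return price_b / price_a


SOL_OUTPUT_PRICE = 30.0    # USD per million output tokens, from the article
RIVAL_OUTPUT_PRICE = 15.0  # hypothetical cheaper competitor, not from the article

# Sol stays cheaper per task only if it uses less than this share
# of the rival's output tokens.
ratio = break_even_token_ratio(SOL_OUTPUT_PRICE, RIVAL_OUTPUT_PRICE)
```

With these illustrative numbers the break-even ratio is 0.5. A model that needs roughly a third of the output tokens, as OpenAI claims for Sol on ExploitBench, would therefore come out cheaper per task than a rival charging half the per-token price.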
In July, Sol is set to go live on Cerebras at up to 750 tokens per second.