I Ran Claude Code on Every New Claude Model. Here's What Actually Ships.

An engineer spent a month routing all of Anthropic's 2026 Claude models (Haiku, Sonnet 4.6, Opus 4.8, Fable 5, and Mythos 5) through Claude Code across real codebases. The developer found that Sonnet 4.6 should be the default for everyday coding, Opus 4.8 excels at judgment-heavy tasks, and Fable 5 is best for long-horizon migrations. The key insight is that routing tasks to the right model doubled throughput and avoided wasted token costs.

Fable, Mythos, Opus 4.8, Sonnet 4.6, Haiku: Anthropic's 2026 lineup is no longer "one model you talk to." It's a fleet you route between. I spent a month inside Claude Code orchestrating all of them across real codebases. Here's which model to reach for, when, and the routing playbook that quietly doubled my throughput.

Last time I wrote about Claude Skills and called Claude Code the killer host for them. Since then, two things happened that changed how I work day to day.

First, the models got genuinely strange-good. In the span of a few months Anthropic shipped Sonnet 4.6, Opus 4.8, and then an entirely new tier above Opus, the Mythos class, released to the public as Claude Fable 5. We went from "the AI suggested a decent diff" to Stripe reporting that Fable 5 ran a codebase-wide migration on a 50-million-line Ruby codebase in a single day, work that would've taken a team over two months by hand.

Second, Claude Code stopped being a single-model tool. With a fleet of models at different price/speed/intelligence points, the highest-leverage skill in 2026 isn't prompting, it's routing. Knowing which model to put on which task is the difference between burning $200 of tokens on a typo fix and one-shotting a multi-service refactor.

So I did the obvious thing: I wired all of them into Claude Code and ran them against real work for a month: bug fixes, migrations, greenfield features, test suites, the boring stuff and the scary stuff.
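To put rough numbers on the cost side of routing, here's a minimal sketch of a per-task cost estimate. The prices are the per-million-token figures from the model table in this post. The dictionary keys are illustrative labels (not real API model IDs), and the token counts are made-up assumptions, not measurements.

```python
# Per-million-token prices in USD (input, output), as quoted in this post's
# model table. Keys are illustrative labels, NOT official API model IDs.
PRICES = {
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.8": (5.00, 25.00),
    "fable-5": (10.00, 50.00),
}


def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one task from its token counts."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical task size: 200k tokens read, 20k tokens written.
    for model in PRICES:
        print(f"{model}: ${task_cost(model, 200_000, 20_000):.2f}")
```

With those assumed token counts, the same task comes out at $0.90 on Sonnet 4.6, $1.50 on Opus 4.8, and $3.00 on Fable 5. The gap is small per task, but it compounds quickly across an agent loop that runs hundreds of times a day.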
This is what I learned.

Forget "Claude" as one thing. In 2026 it's a graded ladder, and each rung exists for a reason.

| Model | Class | Sweet spot | Price (input / output, per 1M tokens) |
|---|---|---|---|
| Haiku | Fast tier | High-volume, latency-sensitive, cheap glue work | Lowest |
| Sonnet 4.6 | Workhorse | Everyday coding, agents, 1M context | $3 / $15 |
| Opus 4.8 | Heavy lifter | Architecture, refactors, judgment-heavy work | $5 / $25 ($10 / $50 in fast mode) |
| Fable 5 | Mythos-class (safe) | Long-horizon, frontier coding, vision, research | $10 / $50 |
| Mythos 5 | Mythos-class (restricted) | Cyber defense, life sciences; vetted access only | $10 / $50 |

A few things worth knowing about how these actually relate:

Here's the mental model I settled on after a month. Think of it as a triage flow:

php flowchart TD A New task -- B{How long-horizon