The weights are identical. The invoice is not.
Call Claude Sonnet 4.6 through Anthropic’s API, through AWS Bedrock, or through OpenRouter and you get the same model — same tokenizer, same outputs, same context window. What changes is everything around the weights: how you’re billed, which credits you can burn, how fast a new model lands, and how you authenticate. This isn’t a model benchmark. It’s a delivery comparison — three doors to the same room — for a developer already shipping with Claude.
Short version first, then the receipts.
The 10-second answer #
On AWS with committed spend or credits → Bedrock. Same per-token price, and usage retires against the AWS bill you already negotiated.**Want the newest Claude day-one, or frictionless caching/batch → Anthropic first-party.**Route across model vendors, want one key, no contract → OpenRouter. Passthrough pricing, multi-provider failover, one API.
No clean match? Default to Anthropic first-party — it’s the reference everything else chases.
The numbers (as of 2026-06-29) #
Per-token list price is the same on all three. Nobody marks up the tokens.
| Axis | Anthropic | AWS Bedrock | OpenRouter |
|---|---|---|---|
| Sonnet 4.6 ($/Mtok in/out) | $3 / $15 | $3 / $15 | $3 / $15 |
| Opus 4.8 ($/Mtok in/out) | $5 / $25 | $5 / $25 | $5 / $25 |
| Surcharge | inference_geo:"us" 1.1x (Opus 4.6+) |
+10% regional endpoints | 5.5% credit-load fee; 5% BYOK tail >1M req/mo |
| Spend cloud credits | no | yes — retires AWS commit | no |
| New-model freshness | day-one | lag + per-region rollout | usually fast (routes upstream) |
| Auth | API key | IAM / SigV4 | OpenRouter key (OpenAI-compatible) |
| Failover | Anthropic owns it | AWS SLA; you wire multi-region | built-in multi-provider |
So the decision is never “who’s cheapest per token.” It’s “whose billing, freshness, and limits fit how I operate.”
Anthropic first-party #
Flat $5 / $25 on Opus, $3 / $15 on Sonnet, global routing by default, no surcharge. One knob: on Opus 4.6+, inference_geo: "us"
(US-only residency) applies a 1.1x multiplier across input, output, and cache tokens. Leave it global and you pay standard. You also get every new Claude first — this is where Anthropic launches — and the simplest billing: prepaid credits or a monthly USD invoice, no fee math.
AWS Bedrock #
Matches direct pricing in standard regions, with the same caching and batch mechanics. Where it diverges:
Regional endpoints cost ~10% more than global (guaranteed data routing through a specific geography). If compliance forces you regional, budget the premium.New models lag the direct API and roll out region by region.“On Bedrock” rarely means “in every Bedrock region at once.” Check the region table before you architect.** Non-token costs can dwarf the model bill**on RAG/agents: CloudWatch log ingestion, OpenSearch Serverless for Knowledge Bases (~$350/mo floor), per-step Agent charges. Not model price, but real spend.
What you buy: IAM, VPC endpoints, KMS, CloudTrail, and the compliance umbrellas (HIPAA, FedRAMP). For regulated work that posture often beats chasing a few percent on tokens. (Note: Claude Platform on AWS is a newer Anthropic-operated path billed through AWS Marketplace in Claude Consumption Units at $0.01/CCU — same underlying rates, different mechanics. Standard Bedrock via bedrock-runtime
is the partner-operated path that matches direct.)
OpenRouter #
Tokens pass through untouched — no inference markup. The fee is on the money, not the tokens: credits with a card costs 5.5% plus a floor, so a $100 top-up nets ~$94.50 of inference; bring-your-own-key is free to 1M requests/month, then 5% of the on-platform call cost. Budget ~5–7% all-in.
What you buy: automatic failover across providers, per-key budget caps, one billing view across every model vendor. For Claude Code specifically, OpenRouter now exposes an Anthropic-compatible endpoint — thinking blocks, native tool use, streaming, and multi-turn context pass through with no local proxy, so the only overhead is the credit fee.
Billing and credits: the real decision #
For most teams the choice settles here, not on price. If your company carries an **AWS Enterprise Discount Program commitment** or a pile of AWS credits, Bedrock lets Claude usage **retire against spend you’ve already promised to make**. That’s not a discount — it’s making a sunk commitment do double duty, and it usually beats any caching optimization. One AWS invoice, existing terms, no new vendor onboarding. OpenRouter’s equivalent lever isn’t credits, it’s breadth: one balance funds Claude, GPT, Gemini, and a long tail behind a single OpenAI-compatible endpoint, with a kill-switch to reroute off a vendor. Anthropic first-party is the clean option when you have no cloud-credit story and just want to pay for what you use.
## Cost levers (work on every channel)
These apply no matter which door you pick:
Prompt caching— cache reads cost ~0.1x base input (Sonnet cached input drops $3 → $0.30/Mtok). Biggest lever for repeated system prompts or RAG prefixes. Writes are 1.25x (5-min) / 2x (1-hour).Batch API— 50% off input and output for async work (classification, tagging, bulk summarization). Sonnet batch is $1.50 / $7.50. First-party and Bedrock only; OpenRouter routes sync.Model routing— don’t run Opus for work Sonnet ($3/$15) or Haiku ($1/$5) handles. That’s up to a 5x spread. (The same active-vs-total-params instinct that governspicking a local modelapplies here: match the model to the job, not to the brand.)Tokenizer drift— Opus 4.7+ uses a new tokenizer that can map the same text to ~1.0–1.35x more tokens. List price is unchanged, but youreffectivetask cost can rise. Measure your own counts; don’t trust a word-count estimate.
Verdict by situation #
Same model, different doors. Pick by your actual constraint:
“We run on AWS / have committed spend.”→ Bedrock.Parity pricing, usage retires against your AWS bill, IAM you already manage. Accept the freshness lag; check the region table.“I want the newest model day-one and frictionless caching/batch.”→ Anthropic first-party.The reference path.“I route across vendors / want one key / have no contract.”→ OpenRouter.Passthrough tokens, multi-provider failover, OpenAI-compatible. Pay the few-percent credit fee for consolidation.“I need strict data residency.”→ First-party(inference_geo
) orBedrock regional(+10%). Not OpenRouter — it routes to whichever upstream serves the model. (If residency is a hard line for you, it’s worth treatingavailability as a regulatory variable, not just a billing one.)
The one-line rule: choose the provider whose billing and operational model fits how you already work — the tokens cost the same no matter which door you walk through.
We re-verify these numbers every quarter. Reply with your provider and region if you want the matrix expanded for your case.
Sources #
Verify against the official pages before budgeting — third-party tables lag.
- Anthropic pricing:
[https://platform.claude.com/docs/en/about-claude/pricing](https://platform.claude.com/docs/en/about-claude/pricing) - Claude Opus on Amazon Bedrock (AWS):
https://aws.amazon.com/blogs/machine-learning/claude-opus-4-5-now-in-amazon-bedrock/ - OpenRouter Claude model page:
[https://openrouter.ai/anthropic/claude-opus-4.5](https://openrouter.ai/anthropic/claude-opus-4.5) - OpenRouter + Claude Code setup:
[https://openrouter.ai/blog/tutorials/claude-code-openrouter/](https://openrouter.ai/blog/tutorials/claude-code-openrouter/)
Price table verified against Anthropic, AWS Bedrock, and OpenRouter live docs on 2026-06-29. Provider terms change; confirm current rates before committing.