# What's new in Claude Opus 4.8

> Source: <https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-8>
> Published: 2026-05-28 17:13:56+00:00

We use cookies to deliver and improve our services, analyze site usage, and if you agree, to customize or personalize your experience and market our services to you. You can read our Cookie Policy [here](https://www.anthropic.com/legal/cookies).

Claude Opus 4.8 is Anthropic's most capable generally available model to date. It builds on Claude Opus 4.7. This page summarizes everything new at launch, including fast mode (research preview on the Claude API) and a lower 1,024-token minimum cacheable prompt length.

| Model | API model ID | Description |
|---|---|---|
| Claude Opus 4.8 | claude-opus-4-8 | Anthropic's most capable model for complex reasoning, long-horizon agentic coding, and high-autonomy work |

Claude Opus 4.8 supports the [1M token context window](/docs/en/build-with-claude/context-windows) by default on the Claude API, Amazon Bedrock, and Vertex AI (200k on Microsoft Foundry), 128k max output tokens, [adaptive thinking](/docs/en/build-with-claude/adaptive-thinking), and the same set of tools and platform features as Claude Opus 4.7.

For complete pricing and specs, see the [models overview](/docs/en/about-claude/models/overview).

Claude Opus 4.8 accepts `role: "system"`

messages immediately after a user turn in the `messages`

array (subject to [placement rules](/docs/en/build-with-claude/mid-conversation-system-messages#limitations)). This lets you append updated instructions later in a long-running conversation without restating the full system prompt, which preserves [prompt cache](/docs/en/build-with-claude/prompt-caching) hits on the earlier turns and reduces input cost on agentic loops. No beta header is required. See [Mid-conversation system messages](/docs/en/build-with-claude/mid-conversation-system-messages) for usage details.

The `stop_details`

object on refusal responses (available since Claude Opus 4.7) is now publicly documented. When Claude declines to complete a request, this object describes the category of refusal, in addition to the existing `refusal`

stop reason, making it easier for your application to tell apart different classes of declined request and to route the user to the right next step. No beta header is required. See [Handling stop reasons](/docs/en/build-with-claude/handling-stop-reasons) for the category list and handling guidance.

The [effort parameter](/docs/en/build-with-claude/effort) default on Claude Opus 4.8 is `high`

on all surfaces, including the Claude API and Claude Code. If you set effort explicitly today, your setting is unchanged. See [Effort](/docs/en/build-with-claude/effort) for per-level guidance.

[Fast mode](/docs/en/build-with-claude/fast-mode) is now available for Claude Opus 4.8 as a research preview on the Claude API. Set `speed: "fast"`

to get up to 2.5x higher output tokens per second from the same model at premium pricing. See [Fast mode](/docs/en/build-with-claude/fast-mode) for access, supported models, and pricing.

The minimum cacheable prompt length on Claude Opus 4.8 is 1,024 tokens, lower than on Claude Opus 4.7. Prompts that were too short to cache on Claude Opus 4.7 can now create cache entries with no code changes. See [Prompt caching](/docs/en/build-with-claude/prompt-caching#cache-limitations) for per-model minimums.

These constraints are unchanged from Claude Opus 4.7, so code that already runs on Claude Opus 4.7 needs no changes. They apply to the Messages API only; Claude Managed Agents are unaffected.

Setting `temperature`

, `top_p`

, or `top_k`

to a non-default value returns a 400 error on Claude Opus 4.8, same as on Claude Opus 4.7. Omit these parameters and use prompting to guide the model's behavior.

Like Claude Opus 4.7, Claude Opus 4.8 does not support extended thinking budgets. Setting `thinking: {"type": "enabled", "budget_tokens": N}`

returns a 400 error. Use [adaptive thinking](/docs/en/build-with-claude/adaptive-thinking) and the [effort parameter](/docs/en/build-with-claude/effort) to control thinking depth.

```
# Before (Opus 4.6 or earlier)
thinking = {"type": "enabled", "budget_tokens": 32000}

# After (Opus 4.7 and later)
thinking = {"type": "adaptive"}
output_config = {"effort": "high"}
```

Compared with Claude Opus 4.7, Claude Opus 4.8 targets behavioral improvements in:

With [adaptive thinking](/docs/en/build-with-claude/adaptive-thinking) enabled, Claude Opus 4.8 triggers reasoning only when it judges the turn needs it. On simple lookups and short agentic steps it responds directly; on complex multi-step problems it reasons before answering. This reduces wasted thinking tokens on bimodal workloads compared to Claude Opus 4.7 at the same effort level. As on Claude Opus 4.7, thinking is off unless you explicitly set `thinking: {type: "adaptive"}`

in your request.

These are not API breaking changes but may require prompt updates. See [Migrating to Claude Opus 4.8](/docs/en/about-claude/models/migration-guide#migrating-from-claude-opus-47) for full guidance.

For step-by-step migration instructions and the full migration checklist, see [Migrating to Claude Opus 4.8](/docs/en/about-claude/models/migration-guide#migrating-from-claude-opus-47). If you use Claude Code or the Agent SDK, the [Claude API skill](/docs/en/agents-and-tools/agent-skills/claude-api-skill) can apply these migration steps to your codebase automatically.

Step-by-step upgrade instructions from Claude Opus 4.7.

Per-level effort guidance, including the new defaults.

The only supported thinking-on mode on Claude Opus 4.8.

How mid-conversation system messages preserve cache hits.

Refusal stop details and how to handle them.

Higher output speed at premium pricing.

Was this page helpful?