Claude API Tool Versions: response_inclusion Cuts Agent Bloat Now


Source: byteiota.com · Published: 2026-06-20T14:23Z · Topic: ai-tools


Anthropic updated three Claude API tool versions on June 19, introducing a response_inclusion parameter that strips consumed tool results from agent responses to reduce bloat and costs, and a code execution tool version that discloses the 90-second per-cell limit to Claude for better reasoning. The updates are generally available with no beta header required.


On June 19, Anthropic updated three Claude API tool versions in the release notes, with no blog post and no announcement. The headline feature is response_inclusion, a new parameter in web_search_20260318 and web_fetch_20260318 that strips consumed tool results from your agent responses. The second is code_execution_20260521, which finally exposes the 90-second per-cell limit in the tool description so Claude can actually reason about it. All three versions are GA. No beta header required.

The Agentic Bloat Problem #

Here is what happens in a five-step web research agent without this optimization:

Step 1: Claude fetches a webpage — 5,000 tokens of raw HTML land in the result block.
Step 2: Claude processes the result, extracts the key data.
Step 3: Claude searches for related information — another 3,000 tokens.
Step 4: Claude cross-references and synthesizes.
Step 5: Final answer returned to your application.

Without response_inclusion, your API response carries all of those tool result blocks (the raw HTML, the search payloads) even though Claude already consumed them in steps 2 and 4. You are paying output token rates to transmit data your client never asked for and will throw away. At Claude Sonnet 4.6 output pricing ($15 per million tokens), an agent running 1,000 times a day that carries 3,000 unnecessary tokens per run costs an extra $16,425 a year, or about $45 a day.
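The cost figure above is simple arithmetic, and it can be worth re-running with your own numbers. The sketch below uses only the article's example inputs (price, wasted tokens per run, runs per day); the function name is a hypothetical helper, not part of any SDK.

```python
# Inputs taken from the article's example; replace them with your own measurements.
OUTPUT_PRICE_PER_MILLION_USD = 15.0  # Claude Sonnet 4.6 output price quoted in the article
WASTED_TOKENS_PER_RUN = 3_000
RUNS_PER_DAY = 1_000
DAYS_PER_YEAR = 365


def annual_waste_usd(tokens_per_run, runs_per_day, price_per_million, days=DAYS_PER_YEAR):
    """Yearly cost of carrying unneeded tokens in every response."""
    tokens_per_year = tokens_per_run * runs_per_day * days
    return tokens_per_year / 1_000_000 * price_per_million


if __name__ == "__main__":
    cost = annual_waste_usd(WASTED_TOKENS_PER_RUN, RUNS_PER_DAY, OUTPUT_PRICE_PER_MILLION_USD)
    print(f"${cost:,.0f} per year")  # prints "$16,425 per year"
```

Because the cost scales linearly with every input, an agent that also carries the 5,000-token fetch result from step 1 would waste proportionally more.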

What response_inclusion Does #

The new parameter tells the API whether to include consumed tool result blocks in the response. When a search or fetch result was consumed by Claude in the same turn — processed, used, done — you can drop it from the response payload entirely. The underlying execution still happened. Claude still saw the result. You just do not carry the raw output forward.
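To picture the effect on the payload, here is a client-side analogy, not the API's actual implementation. It assumes response content arrives as a list of blocks with a "type" field, and that search and fetch results use the block types "web_search_tool_result" and "web_fetch_tool_result"; treat those names, and the helper itself, as assumptions made for illustration. Whether the API also drops the matching tool-use blocks is not stated in the source, so the sketch keeps them.

```python
# Assumed block type names for consumed server-tool results (illustrative only).
CONSUMED_RESULT_TYPES = {"web_search_tool_result", "web_fetch_tool_result"}


def strip_consumed_results(content):
    """Return the content blocks with raw search/fetch result blocks removed.

    This only mimics the shape of a response sent with
    "response_inclusion": "none"; the real filtering happens server-side.
    """
    return [block for block in content if block.get("type") not in CONSUMED_RESULT_TYPES]


if __name__ == "__main__":
    # A mock response body, hand-written for this example.
    mock_content = [
        {"type": "server_tool_use", "name": "web_fetch"},
        {"type": "web_fetch_tool_result", "content": "<html>...</html>"},
        {"type": "server_tool_use", "name": "web_search"},
        {"type": "web_search_tool_result", "content": ["result payload"]},
        {"type": "text", "text": "Final answer."},
    ]
    for block in strip_consumed_results(mock_content):
        print(block["type"])
```

The point of the parameter is that you do not need a filter like this at all: the blocks never reach your client in the first place.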

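The upgrade is a small change to the tool definition: bump the version string and add the new parameter.

# Before
tools=[{"type": "web_search_20260209", "name": "web_search"}]

# After: new tool version, consumed results dropped from the response
tools=[{
    "type": "web_search_20260318",
    "name": "web_search",
    "response_inclusion": "none"
}]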

The same swap applies to web_fetch_20260318. No beta header. Supported on Claude Fable 5, Opus 4.8, Mythos 5, Opus 4.7, Opus 4.6, and Sonnet 4.6. Check the web search tool documentation for the full parameter reference.

Code Execution: Claude Was Always Flying Blind #

The code execution tool has always had a 90-second per-cell wall-clock limit. Code that exceeds it returns a detection_timeout result. The problem: Claude had no way to know about this limit from the tool spec itself, so it would generate long-running computations and hit the timeout without any model-level awareness that the constraint existed.

code_execution_20260521 does not change the limit. It discloses it. The tool description Claude reads now states the 90-second constraint explicitly. Claude can now structure multi-step computations across cells to stay within budget, flag operations likely to time out, and reason about execution cost in its planning, rather than writing code and discovering the limit at runtime. See the code execution tool documentation for the full updated spec.
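As an illustration of what "structuring a computation across cells" can look like, here is a minimal sketch that splits a batch job into chunks sized to fit the 90-second limit. The per-item timing, the safety margin, and the function name are all assumptions for this example; nothing here reflects how Claude plans internally.

```python
CELL_LIMIT_SECONDS = 90  # per-cell wall-clock limit stated in the article


def plan_cells(total_items, seconds_per_item, limit=CELL_LIMIT_SECONDS, margin_seconds=15):
    """Split total_items into (start, stop) ranges, one per cell.

    Each range is sized so its estimated runtime stays within
    limit - margin_seconds. seconds_per_item is your own estimate.
    """
    if seconds_per_item <= 0:
        raise ValueError("seconds_per_item must be positive")
    budget = limit - margin_seconds
    items_per_cell = int(budget // seconds_per_item)
    if items_per_cell < 1:
        raise ValueError("a single item is estimated to exceed the per-cell budget")
    return [
        (start, min(start + items_per_cell, total_items))
        for start in range(0, total_items, items_per_cell)
    ]


if __name__ == "__main__":
    # 1,000 items at an estimated 0.25 s each: 300 items fit in a 75 s budget.
    for start, stop in plan_cells(1_000, 0.25):
        print(f"cell handles items {start}..{stop - 1}")
```

The margin leaves headroom for estimates that turn out optimistic; an item that cannot fit in a single cell raises an error instead of silently producing a plan that would time out.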

tools=[{"type": "code_execution_20260521", "name": "code_execution"}]

How to Upgrade Today #

All three versions are GA, no beta header required. To upgrade your agent:

Replace web_search_20260209 with web_search_20260318 and add "response_inclusion": "none".
Replace web_fetch_20260209 with web_fetch_20260318 and add "response_inclusion": "none".
Replace code_execution_20260120 with code_execution_20260521.
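If your tool definitions live in code, the three replacements above can be applied mechanically. The sketch below is one possible migration helper; the version strings and the "response_inclusion" value come from the article, while the function and constant names are hypothetical. It operates on plain dictionaries and makes no API calls.

```python
import copy

# Old -> new tool versions, as listed in the article.
VERSION_UPGRADES = {
    "web_search_20260209": "web_search_20260318",
    "web_fetch_20260209": "web_fetch_20260318",
    "code_execution_20260120": "code_execution_20260521",
}

# Per the article, only the web search and web fetch versions take the parameter.
SUPPORTS_RESPONSE_INCLUSION = {"web_search_20260318", "web_fetch_20260318"}


def upgrade_tools(tools):
    """Return a copy of the tool list with versions bumped and response_inclusion set."""
    upgraded = []
    for tool in tools:
        new_tool = copy.deepcopy(tool)
        new_type = VERSION_UPGRADES.get(new_tool.get("type"))
        if new_type is not None:
            new_tool["type"] = new_type
        if new_tool.get("type") in SUPPORTS_RESPONSE_INCLUSION:
            # Do not overwrite a value the caller already chose.
            new_tool.setdefault("response_inclusion", "none")
        upgraded.append(new_tool)
    return upgraded


if __name__ == "__main__":
    old_tools = [
        {"type": "web_search_20260209", "name": "web_search"},
        {"type": "web_fetch_20260209", "name": "web_fetch"},
        {"type": "code_execution_20260120", "name": "code_execution"},
    ]
    for tool in upgrade_tools(old_tools):
        print(tool)
```

Tools with an unrecognized type pass through unchanged, so the helper is safe to run over a mixed list that includes your own custom tools.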

The Claude Developer Platform release notes confirm that dynamic filtering from 20260209, which cut input tokens by roughly 24% by post-processing search results before they hit the context window, carries forward in 20260318. You lose nothing and gain the response optimization.

These are not demo-era features anymore. The Anthropic tool layer is being optimized for production economics, version by version: input tokens in February, output tokens in June. If you are running Claude agents in production with native tools and have not upgraded your tool versions recently, this is worth a fifteen-minute update pass. The broader guide to controlling token costs in Claude agent workflows is also worth a read if you want to go deeper on context management.
