Case Study: Pydantic AI Tool Context

A case study comparing Codex GPT-5.5 runs with and without GitHits found that using GitHits reduced token usage by 508,192 and saved 90 seconds when fixing Pydantic AI tools to use context-aware patterns. The fix involved changing tool decorators from @agent.tool_plain to @agent.tool with RunContext[SupportDeps] to enable per-run dependency access for tenant routing.

June 14, 2026 · 4 min read

A measured Codex run fixing Pydantic AI tools that ignored per-run dependencies.

The fixture is a Pydantic AI support-routing agent. Each run passes tenant-local data through Pydantic AI dependencies: tenant, region, ticket priorities, and escalation policies. The generated code created the agent with `deps_type=SupportDeps`, but registered both tools with `@agent.tool_plain`. Those tools cannot receive `RunContext`, so they returned global fallback values.

Both runs used Codex GPT-5.5 against the same fixture. The prompt was: "Fix this Python fixture so pytest succeeds, preserving tenant-aware Pydantic AI routing tools." The target package was `pydantic-ai 2.0.0b7`.

The replay for each run ends with its summary:

- Without GitHits (189s): The Pydantic AI tools now use `@agent.tool` with `RunContext[SupportDeps]`, so tenant, region, priorities, and escalation policies come from the active per-run dependencies.
- With GitHits (99s): Used GitHits to confirm Pydantic AI 2.0.0b7's context-aware tool pattern, then changed the two tools from `tool_plain` to `tool` with `RunContext[SupportDeps]`.

Result

| Run | Time | Tokens | Tools |
|---|---|---|---|
| With GitHits | 99s | 393,469 | 21 |
| Without GitHits | 189s | 901,661 | 28 |

Both runs produced a passing patch. The GitHits run used 508,192 fewer processed tokens and finished 90 seconds sooner.
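The two headline differences can be re-derived from the Result table. The short sketch below only restates that arithmetic; the variable names are illustrative and the relative reductions are computed here, not reported by the original runs.

```python
# Figures copied from the Result table above.
with_githits = {"seconds": 99, "tokens": 393_469, "tools": 21}
without_githits = {"seconds": 189, "tokens": 901_661, "tools": 28}

# Absolute gaps between the two runs.
token_gap = without_githits["tokens"] - with_githits["tokens"]
time_gap = without_githits["seconds"] - with_githits["seconds"]

# Relative reductions, measured against the run without GitHits.
token_reduction_pct = round(100 * token_gap / without_githits["tokens"], 1)
time_reduction_pct = round(100 * time_gap / without_githits["seconds"], 1)

print(token_gap, time_gap, token_reduction_pct, time_reduction_pct)
```

Relative to the run without GitHits, that works out to roughly 56% fewer processed tokens and roughly 48% less wall-clock time.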
Failure

The tools were registered with `tool_plain`:

```python
# Both tools ignore per-run dependencies and return global defaults.
@agent.tool_plain
def lookup_ticket(ticket_id: int) -> str:
    return f"{DEFAULT_TENANT}:{ticket_id}:{DEFAULT_PRIORITY}"


@agent.tool_plain
def escalation_policy(ticket_id: int) -> str:
    return (
        f"{DEFAULT_TENANT}:{DEFAULT_PRIORITY}:"
        f"{DEFAULT_POLICY}:{DEFAULT_REGION}"
    )
```

`tool_plain` is correct for tools that do not need the run context. These tools depend on `SupportDeps`.

The tests checked data flow:

- `acme` and `globex` must route differently.
- `lookup_ticket` and `escalation_policy` must read the same per-run dependencies.
- Unknown tickets should still fall back to `normal` and `standard`, but the fallback must keep the active tenant and region.

Fix

Use context-aware tools and read `ctx.deps`:

```python
from pydantic_ai import Agent, RunContext

# `agent`, `SupportDeps`, and the DEFAULT_* constants come from the fixture.


@agent.tool
def lookup_ticket(ctx: RunContext[SupportDeps], ticket_id: int) -> str:
    # Unknown tickets fall back to the default priority but keep the active tenant.
    priority = ctx.deps.priorities.get(ticket_id, DEFAULT_PRIORITY)
    return f"{ctx.deps.tenant}:{ticket_id}:{priority}"


@agent.tool
def escalation_policy(ctx: RunContext[SupportDeps], ticket_id: int) -> str:
    priority = ctx.deps.priorities.get(ticket_id, DEFAULT_PRIORITY)
    policy = ctx.deps.escalation_policies.get(priority, DEFAULT_POLICY)
    return f"{ctx.deps.tenant}:{priority}:{policy}:{ctx.deps.region}"
```

The patch depends on three package facts:

- `@agent.tool` is the decorator for tools that receive `RunContext`.
- `RunContext[SupportDeps]` gives the tool access to the active dependency object.
- Both tools need to read `ctx.deps`; fixing only one still leaves inconsistent routing.

Trace

The replay shows seven GitHits tool calls:

- Two search calls for Pydantic AI docs around `agent.tool`, `RunContext`, `deps`, and `tool_plain`.
- Three docs read calls on the current tools documentation.
- One code grep call for the `def tool` implementation.
- One code read call on the package source where `Agent.tool` is defined.
Those calls gave the agent the package contract before editing: use `@agent.tool` when a tool needs `RunContext`, then access per-run dependencies through `ctx.deps`.

The no-GitHits run had to reconstruct the same information from the local environment. It searched installed `pydantic_ai` internals, found the package path, read exports, read agent source, read run-context source, patched, tested, cleaned local artifacts, reread the file, and tested again. That local probing accounts for most of the 508k-token gap.

Evidence

A passing fixture test proves the local behavior for the tested cases. The docs and source establish that the patch uses the intended Pydantic AI mechanism.

The GitHits trace had a short evidence chain:

- Docs showed the context-aware tool pattern for the current package.
- Source search found the `Agent.tool` implementation surface.
- Source reads confirmed that this was the right API boundary.
- `pytest` verified tenant-specific behavior in the fixture.

The package has multiple valid tool decorators. The evidence points to the one that matches the data-flow requirement.

The final patch did not rewrite the routing model, change the tests, or move tenant data into prompts. It changed the decorator and dependency access in the two tools.

Accuracy Risk

The incorrect alternatives are close to the correct patch:

- Keep `tool_plain` and add globals.
- Put tenant data into the prompt.
- Capture dependencies in a closure instead of using Pydantic AI’s run context.
- Fix `lookup_ticket` but leave `escalation_policy` on defaults.
- Change the tests to match global fallback behavior.

All of those preserve the original bug or create a more brittle fixture. The GitHits run found the package mechanism in docs and source before editing. The no-GitHits run found the same mechanism by probing the installed package.
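To make the risk concrete, here is a minimal sketch of the data-flow checks described under Failure, written without importing `pydantic_ai` so it runs on the standard library alone. `FakeRunContext` is a hypothetical stand-in that exposes only `.deps`, not the library's `RunContext`. The `SupportDeps` field names follow the fixture description, and the `normal` and `standard` defaults come from the test requirements. The regions, ticket IDs, priorities, and policy names are invented sample data.

```python
from dataclasses import dataclass, field

DEFAULT_PRIORITY = "normal"
DEFAULT_POLICY = "standard"


@dataclass
class SupportDeps:
    """Per-run tenant data; field names follow the fixture description."""
    tenant: str
    region: str
    priorities: dict[int, str] = field(default_factory=dict)
    escalation_policies: dict[str, str] = field(default_factory=dict)


@dataclass
class FakeRunContext:
    """Hypothetical stand-in for RunContext[SupportDeps]; exposes only .deps."""
    deps: SupportDeps


def lookup_ticket(ctx: FakeRunContext, ticket_id: int) -> str:
    priority = ctx.deps.priorities.get(ticket_id, DEFAULT_PRIORITY)
    return f"{ctx.deps.tenant}:{ticket_id}:{priority}"


def escalation_policy(ctx: FakeRunContext, ticket_id: int) -> str:
    priority = ctx.deps.priorities.get(ticket_id, DEFAULT_PRIORITY)
    policy = ctx.deps.escalation_policies.get(priority, DEFAULT_POLICY)
    return f"{ctx.deps.tenant}:{priority}:{policy}:{ctx.deps.region}"


# Invented sample data: two tenants with different priorities and policies.
acme = FakeRunContext(
    SupportDeps("acme", "us-east", {101: "urgent"}, {"urgent": "page-oncall"})
)
globex = FakeRunContext(
    SupportDeps("globex", "eu-west", {101: "low"}, {"low": "email"})
)

# The same ticket routes differently per tenant.
assert lookup_ticket(acme, 101) == "acme:101:urgent"
assert lookup_ticket(globex, 101) == "globex:101:low"

# Both tools read the same per-run dependencies.
assert escalation_policy(acme, 101) == "acme:urgent:page-oncall:us-east"

# Unknown tickets fall back to defaults but keep the active tenant and region.
assert escalation_policy(globex, 999) == "globex:normal:standard:eu-west"
```

A half-fixed patch that leaves `escalation_policy` on global defaults would fail the last two checks, because it would return the default tenant and region instead of the active ones.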