Agent-stdlib: A standard library for building agents

Anthropic released agent-stdlib, a standard library for building AI agents that packages Claude Code skills, MCP servers, and a tool-gating hook derived from its engineering blog. The library includes components for evaluation, sandboxing, multi-agent orchestration, and parallel autonomous agents. Most parts are model-agnostic and portable across platforms like OpenCode and Cursor.

Anthropic's engineering blog documents how to build, evaluate, and run agents in production. Most of that knowledge never ships as something you can install; it stays prose you reopen when you hit the problem it solves.

agent-stdlib packages the parts nobody else has: Claude Code skills, a few MCP servers, and a tool-gating hook. Each component names the article it comes from and says how it differs from any skill that already covers similar ground.

The pack ships only what was missing. Topics that strong community skills already handle stay out, with pointers below.

| Skill | What it gives you | Source article |
|---|---|---|
| `build-agent-evals` | Build automated evals for an agent: pick a grader, choose pass@k vs pass^k, run the zero-to-one roadmap | |
| `calibrate-eval-infrastructure` | | [Quantifying infrastructure noise in agentic coding evals](https://www.anthropic.com/engineering/infrastructure-noise) |
| `coding-agent-scaffold` | | [Raising the bar on SWE-bench Verified](https://www.anthropic.com/engineering/swe-bench-sonnet) |
| `durable-agent-architecture` | | [Scaling Managed Agents](https://www.anthropic.com/engineering/managed-agents) |
| `sandboxing-agentic-systems` | | [How we contain Claude](https://www.anthropic.com/engineering/how-we-contain-claude) |
| `using-the-think-step` | | [The "think" tool](https://www.anthropic.com/engineering/claude-think-tool) |
| `multi-agent-orchestration` | | [How we built our multi-agent research system](https://www.anthropic.com/engineering/multi-agent-research-system) |
| `parallel-autonomous-agents` | | [Building a C compiler with parallel Claudes](https://www.anthropic.com/engineering/building-c-compiler) |

Add the marketplace and install the plugin:

    /plugin marketplace add pebeto/agent-stdlib
    /plugin install agent-stdlib@agent-stdlib

Skills trigger themselves when a task matches their description. You can also load one explicitly with the Skill tool.

Three servers live under `mcp-servers/`, each paired with a skill. They need [uv](https://docs.astral.sh/uv/), which installs each server's one dependency from the script header on first run.

- `think` (enabled). The no-op think tool, paired with `using-the-think-step`.
- `tool-gateway` (enabled). search tools and call tool over a larger catalog, so the agent reaches many tools through two. Paired with the tool-scaling guidance in `advanced-tool-use`.
- `code-execution` (opt-in). Presents tools as importable code and runs composed Python in a subprocess. It executes model-written code, so it is not enabled by default.
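
The two-tool gateway idea described for `tool-gateway` can be sketched in plain Python. This is a minimal illustration of the pattern, not the server's actual code: the names `CATALOG`, `search_tools`, and `call_tool`, the sample tools, and the keyword matching are all assumptions made for the example, and the MCP transport is left out entirely.

```python
from typing import Any, Callable

# Hypothetical catalog: tool name -> (description, callable).
# A real gateway would hold many more entries than an agent could list at once.
CATALOG: dict[str, tuple[str, Callable[..., Any]]] = {
    "add": ("Add two numbers", lambda a, b: a + b),
    "upper": ("Uppercase a string", lambda text: text.upper()),
    "word_count": ("Count words in a string", lambda text: len(text.split())),
}


def search_tools(query: str, limit: int = 5) -> list[dict[str, str]]:
    """Return the name and description of catalog tools matching any query term."""
    terms = query.lower().split()
    hits = []
    for name, (description, _) in CATALOG.items():
        haystack = f"{name} {description}".lower()
        if any(term in haystack for term in terms):
            hits.append({"name": name, "description": description})
    return hits[:limit]


def call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Dispatch to a catalog tool by name, passing arguments as keywords."""
    if name not in CATALOG:
        raise KeyError(f"unknown tool: {name}")
    _, func = CATALOG[name]
    return func(**arguments)
```

In this sketch the agent only ever sees two entry points: it calls `search_tools("string")` to discover candidates, then `call_tool("upper", {"text": "hi"})` to run one. The catalog can grow without adding anything to the agent's visible tool list.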
Turn `code-execution` on once you have wrapped it in real isolation; see `mcp-servers/code-execution/README.md` and the `sandboxing-agentic-systems` skill.

`think` and `tool-gateway` are wired into the plugin's `.mcp.json`. To enable `code-execution`, point your client's MCP config at `uv run .../mcp-servers/code-execution/server.py`.

runs the orchestrator-worker flow: it decomposes the question, dispatches /research