A standard library for building agents.
Anthropic's engineering blog documents how to build, evaluate, and run agents in production. Most of that knowledge never ships as something you can install; it stays prose you reopen when you hit the problem it solves. `agent-stdlib` packages the parts nobody else has: Claude Code skills, a few MCP servers, and a tool-gating hook.
Each component names the article it comes from and says how it differs from any skill that already covers similar ground. The pack ships only what was missing. Topics that strong community skills already handle stay out, with pointers below.
| Skill | What it gives you | Source article |
|---|---|---|
| `build-agent-evals` | Build automated evals for an agent: pick a grader, choose pass@k vs pass^k, run the zero-to-one roadmap | |
| `calibrate-eval-infrastructure` | | Quantifying infrastructure noise in agentic coding evals |
| `coding-agent-scaffold` | | Raising the bar on SWE-bench Verified |
| `durable-agent-architecture` | | Scaling Managed Agents |
| `sandboxing-agentic-systems` | | How we contain Claude |
| `using-the-think-step` | | The "think" tool |
| `multi-agent-orchestration` | | How we built our multi-agent research system |
| `parallel-autonomous-agents` | | Building a C compiler with parallel Claudes |

Add the marketplace and install the plugin:
/plugin marketplace add pebeto/agent-stdlib
/plugin install agent-stdlib@agent-stdlib
Skills trigger themselves when a task matches their description. You can also load one explicitly with the `Skill` tool.
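To make the pass@k versus pass^k choice that `build-agent-evals` covers more concrete, here is a minimal sketch. It assumes every trial succeeds independently with the same probability `p`, which is a simplification, and the function names are illustrative rather than anything the pack ships.

```python
def pass_at_k(p: float, k: int) -> float:
    """Probability that at least one of k independent trials succeeds."""
    return 1.0 - (1.0 - p) ** k


def pass_hat_k(p: float, k: int) -> float:
    """Probability that all k independent trials succeed (pass^k)."""
    return p ** k


if __name__ == "__main__":
    # Assumed per-trial success rate and trial count, chosen for illustration.
    p, k = 0.75, 3
    print(f"pass@{k} = {pass_at_k(p, k):.3f}")
    print(f"pass^{k} = {pass_hat_k(p, k):.3f}")
```

With the same per-trial rate, the two metrics move in opposite directions as `k` grows: pass@k rewards an agent that succeeds at least once, while pass^k rewards one that succeeds every time.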
Three servers live under `mcp-servers/`, each paired with a skill. They need `uv`, which installs each server's one dependency from the script header on first run.
- `think` (enabled). The no-op `think` tool, paired with `using-the-think-step`.
- `tool-gateway` (enabled). `search_tools` and `call_tool` over a larger catalog, so the agent reaches many tools through two. Paired with the tool-scaling guidance in `advanced-tool-use`.
- `code-execution` (opt-in). Presents tools as importable code and runs composed Python in a subprocess. It executes model-written code, so it is not enabled by default. Turn it on once you have wrapped it in real isolation; see `mcp-servers/code-execution/README.md` and the `sandboxing-agentic-systems` skill.
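The two-tool gateway idea can be sketched in plain Python. This is not the pack's `tool-gateway` server: the catalog contents, the keyword matching, and the function signatures below are all assumptions made for illustration, and a real server would expose the two functions over MCP rather than as direct calls.

```python
from typing import Any, Callable

# Hypothetical catalog: tool name -> (description, implementation).
CATALOG: dict[str, tuple[str, Callable[..., Any]]] = {
    "add": ("Add two numbers", lambda a, b: a + b),
    "upper": ("Uppercase a string", lambda text: text.upper()),
    "word_count": ("Count words in a string", lambda text: len(text.split())),
}


def search_tools(query: str, limit: int = 5) -> list[dict[str, str]]:
    """Return catalog entries whose name or description contains every query term."""
    terms = query.lower().split()
    hits = []
    for name, (description, _) in CATALOG.items():
        haystack = f"{name} {description}".lower()
        if all(term in haystack for term in terms):
            hits.append({"name": name, "description": description})
    return hits[:limit]


def call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Dispatch to a catalog tool by name, passing arguments as keywords."""
    if name not in CATALOG:
        raise KeyError(f"unknown tool: {name}")
    _, implementation = CATALOG[name]
    return implementation(**arguments)


if __name__ == "__main__":
    print(search_tools("string"))
    print(call_tool("word_count", {"text": "two tools reach many"}))
```

The point of the design is that the agent's context only ever holds two tool definitions; the catalog can grow without growing the prompt.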
`think` and `tool-gateway` are wired into the plugin's `.mcp.json`. To enable `code-execution`, point your client's MCP config at `uv run .../mcp-servers/code-execution/server.py`.
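As a rough sketch of such an entry, the snippet below prints a config fragment in the common `mcpServers` layout (`command` plus `args`). Whether your client uses that layout, and the `code-execution` key itself, are assumptions; check your client's documentation. The leading `...` in the path is left elided as above and stands for wherever the pack lives on your machine.

```python
import json

# Elided on purpose: replace "..." with the directory that contains the pack.
SERVER_SCRIPT = ".../mcp-servers/code-execution/server.py"


def code_execution_entry(script_path: str) -> dict:
    """Build a config entry that launches the server script through `uv run`."""
    return {
        "mcpServers": {
            # Hypothetical server key; any unique name should work.
            "code-execution": {
                "command": "uv",
                "args": ["run", script_path],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(code_execution_entry(SERVER_SCRIPT), indent=2))
```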
- `/research <question>` runs the orchestrator-worker flow: it decomposes the question, dispatches `research-worker` subagents in parallel, and synthesizes a cited answer. Paired with `multi-agent-orchestration`.
- `/autonomous-loop` sets up lock-file coordination for unsupervised agents on one repo, using `scripts/locks.py` and `scripts/autonomy_loop.sh`. Paired with `parallel-autonomous-agents`.
- `action-gating` is a `PreToolUse` hook that tiers Bash commands by risk and denies or asks on the dangerous ones. It stays off until you set `AGENT_STDLIB_GATING=warn` or `enforce`, and it only ever adds friction. See `hooks/README.md`.
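Lock-file coordination of the kind `/autonomous-loop` sets up can be illustrated with atomic file creation. The sketch below is not `scripts/locks.py`, whose interface is not shown here; the function names, the `.lock` suffix, and the directory layout are hypothetical, and it assumes all agents share one filesystem.

```python
import os
import tempfile
from pathlib import Path


def try_acquire(lock_dir: Path, task: str, owner: str) -> bool:
    """Claim a task by atomically creating its lock file; False if already claimed."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    path = lock_dir / f"{task}.lock"
    try:
        # O_CREAT | O_EXCL fails if the file exists, so only one agent can win.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(owner)
    return True


def release(lock_dir: Path, task: str, owner: str) -> bool:
    """Remove the lock only if this owner's name is recorded in it."""
    path = lock_dir / f"{task}.lock"
    try:
        if path.read_text() != owner:
            return False
    except FileNotFoundError:
        return False
    path.unlink()
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        locks = Path(tmp) / "locks"
        print(try_acquire(locks, "parser", "agent-a"))  # first claim
        print(try_acquire(locks, "parser", "agent-b"))  # already held
        print(release(locks, "parser", "agent-a"))
        print(try_acquire(locks, "parser", "agent-b"))  # free again
```

Recording the owner in the file lets an agent release only its own claims. Agents working from separate clones would need to share lock state some other way, for example through the repository itself.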
Most of this pack is not Claude-specific. The MCP servers speak the open Model Context Protocol, the scripts are plain Python and Bash, and the skill content is harness-neutral procedural knowledge. To use it in OpenCode, Cursor, Cline, or a custom agent on any model, see AGENTS.md, which maps each component to its portable form.
These topics from the same blog have solid community skills, so they stay out of this pack. Reach for these instead:
- Choosing an agent pattern (prompt chaining, routing, orchestrator-workers): `markpitt/claude-skills` → `agent-patterns`
- Context engineering (compaction, note-taking, just-in-time retrieval): `muratcankoylan/agent-skills-for-context-engineering`
- Designing agent or MCP tools (consolidation, namespacing, token-efficient responses): the same pack's `tool-design`
- Long-running build harness (initializer + coding agent, git-tracked state): `eddiearc/long-running-harness`
- GAN-style generator/evaluator harness: `affaan-m/everything-claude-code` → `gan-style-harness`
- Contextual retrieval RAG: packaged versions exist on mcpmarket
This pack distills public writing on the Anthropic engineering blog. It is an independent project with no endorsement from Anthropic, and it ships no Anthropic code.
See CONTRIBUTING.md. Each skill stands alone in `skills/<name>/SKILL.md` and opens with the article it distills.
MIT. See LICENSE.