Lessons from building Claude Code: How we use skills

Anthropic has identified nine categories of effective skills for its Claude Code AI agent after cataloging hundreds of skills used internally. The company found that the most successful skills fit cleanly into a single category, while skills attempting to cover multiple areas confuse the agent and reduce output quality. The lessons aim to help developers structure skills more effectively by focusing on specific use cases such as library usage, verification, and data monitoring.

What we learned building and scaling hundreds of skills internally at Anthropic.

Skills have become one of the most used extension points in Claude Code. They’re flexible, easy to make, and easy to distribute. But this flexibility also makes it hard to know what works best. What types of skills are worth making? How do you structure a skill? When do you share them with others?

We've been using skills in Claude Code extensively at Anthropic, with hundreds of them in active use. These are the lessons we've learned about using skills to accelerate our development.

Skills are folders of instructions, scripts, and resources that agents can discover and use to do things more accurately and efficiently. This blog post assumes familiarity with skills basics; if you’re new, start with our Introduction to agent skills course on Skilljar (https://anthropic.skilljar.com/introduction-to-agent-skills).

A common misconception we hear about skills is that they are “just markdown files.” They’re actually folders that can include scripts, assets, data, etc. that the agent can discover, explore, and manipulate. In Claude Code, skills also have a wide variety of configuration options (https://code.claude.com/docs/en/skills#frontmatter-reference), including registering dynamic hooks.
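To make the "folder, not a markdown file" point concrete, here is a minimal sketch that scaffolds one possible skill folder. It assumes only that a skill has a `SKILL.md` with `name` and `description` frontmatter; the skill name `example-skill`, the `references/` and `scripts/` subfolders, and every file name and file body are hypothetical placeholders, not a layout prescribed by this post.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: SKILL.md is the entry point; the other paths are
# illustrative assumptions about what a skill folder might contain.
SKILL_FILES = {
    "SKILL.md": (
        "---\n"
        "name: example-skill\n"
        "description: Illustrative skill; says when the agent should use it.\n"
        "---\n"
        "\n"
        "Read references/gotchas.md before writing code.\n"
        "Run scripts/check.py to verify the result.\n"
    ),
    "references/gotchas.md": "- Placeholder list of known pitfalls.\n",
    "scripts/check.py": "print('placeholder verification script')\n",
}


def scaffold_skill(root: Path, name: str = "example-skill") -> list[str]:
    """Create the skill folder under root; return its files as sorted relative paths."""
    skill_dir = root / name
    for rel_path, content in SKILL_FILES.items():
        target = skill_dir / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
    return sorted(
        p.relative_to(skill_dir).as_posix()
        for p in skill_dir.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    # Build the folder in a temporary directory and list what the agent could explore.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_skill(Path(tmp)):
            print(path)
```

The instructions in `SKILL.md` point at the other files, which is what lets the agent discover and use them rather than treating the skill as a single document.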
We’ve found that some of the most effective skills in Claude Code make good use of these configuration options and this folder structure.

After cataloging all of our internal skills at Anthropic, we noticed they cluster into nine categories. The best skills fit cleanly into one; the ones that try to do too much straddle several and confuse the agent. This isn't a definitive list, but it is a useful framework for identifying gaps in your own skills library.

These are skills that explain how to correctly use a library, CLI, or SDK. They can cover either internal libraries or common libraries that Claude Code sometimes struggles to handle. These skills often include a folder of reference code snippets and a list of gotchas for Claude to avoid when writing a script.

Examples include:
- billing-lib: your internal billing library: edge cases, footguns, etc.
- internal-platform-cli: every subcommand of your internal CLI wrapper, with examples of when to use each.
- sandbox-proxy: configuring your org's egress gateway for dev work: which hosts are reachable, how to debug "connection refused" errors, how to add an allowlist entry.

These are skills that describe how to test or verify that your code is working. They are often paired with Playwright, tmux, or other external tools for verification.

Verification skills have had the most measurable impact on Claude’s output quality internally. It can be worth having an engineer spend a week just making your verification skills excellent.

Consider techniques like having Claude record a video of its output so you can see exactly what it tested, or enforcing programmatic assertions on state at each step. These are often implemented by including a variety of scripts in the skill.
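One way to picture "programmatic assertions on state at each step" is a small step runner of the kind a verification skill's script might contain. This is a minimal sketch under stated assumptions, not the implementation used at Anthropic: `Step`, `StepFailed`, `run_flow`, and the in-memory `sign_up` and `verify_email` actions are all hypothetical names, and a real skill would drive a browser or CLI instead of mutating a dictionary.

```python
from dataclasses import dataclass
from typing import Callable

State = dict[str, object]


@dataclass
class Step:
    name: str
    action: Callable[[State], None]  # performs the step (here: mutates state in memory)
    check: Callable[[State], bool]   # assertion on state once the action has run


class StepFailed(AssertionError):
    """Raised when the state after a step does not match expectations."""


def run_flow(steps: list[Step], state: State) -> list[str]:
    """Run steps in order, asserting state after each; return names of passed steps."""
    passed: list[str] = []
    for step in steps:
        step.action(state)
        if not step.check(state):
            # Stop at the first bad state so the failure points at one step.
            raise StepFailed(f"{step.name}: unexpected state {state!r}")
        passed.append(step.name)
    return passed


# Hypothetical stand-ins for real actions such as driving a headless browser.
def sign_up(state: State) -> None:
    state["user"] = "test@example.com"


def verify_email(state: State) -> None:
    state["email_verified"] = True


SIGNUP_FLOW = [
    Step("signup", sign_up, lambda s: s.get("user") is not None),
    Step("email-verify", verify_email, lambda s: s.get("email_verified") is True),
]

if __name__ == "__main__":
    print(run_flow(SIGNUP_FLOW, {}))
```

Failing fast with the step name and the observed state gives the agent a specific thing to fix, rather than a single pass/fail result for the whole flow.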
Examples include:
- signup-flow-driver: runs through signup → email verify → onboarding in a headless browser, with hooks for asserting state at each step
- checkout-verifier: drives the checkout UI with Stripe test cards, verifies the invoice actually lands in the right state
- tmux-cli-driver: for interactive CLI testing where the thing you're verifying needs a TTY

These are skills that connect to your data and monitoring stacks. These skills might include libraries to fetch your data with credentials, specific dashboard IDs, etc., as well as instructions on common workflows or ways to get data.

Examples include:
- funnel-query: "which events do I join to see signup → activation → paid" plus the table that actually has the canonical user ID
- cohort-compare: compare two cohorts' retention or conversion, flag statistically significant deltas, link to the segment definitions
- grafana: datasource UIDs, cluster names, problem → dashboard lookup table
- datadog: field reference (@request_id vs trace_id), service list, metric prefix conventions

These are skills that automate repetitive workflows into one command. These skills are usually fairly simple instructions but might have more complicated dependencies on other skills or MCPs. For these skills, saving previous results in log files can help the model stay consistent and reflect on previous executions of the workflow.

Examples include:
- standup-post: aggregates your ticket tracker, GitHub activity, and prior Slack → formatted standup, delta-only
- create-