Making your docs site agent-readable: llms.txt, MCP, and the .well-known files that actually matter

OrchestKit's documentation site implements a stack of standard files (llms.txt, OpenAPI specs, MCP endpoints, and .well-known identity files) to make its docs machine-readable for AI agents. The project, a free MIT-licensed plugin for Claude Code, publishes these resources at predictable paths so agents can fetch structured data without scraping HTML. The approach includes agent-card.json, a schema.org graph, and explicit robots.txt allowances for AI crawlers.

Increasingly, AI agents read your docs instead of humans. If your documentation site only emits HTML for a browser, an agent has to scrape and guess. There's a better surface, and most of it is a handful of small, standard files. Here's the full stack we ship on the OrchestKit docs site, why each piece exists, and how to verify it.

llms.txt, the agent's table of contents, is a plain-text index at /llms.txt: what the product is, its constraints, and a link map to every machine-readable resource. Keep it under ~30k chars; put the exhaustive page list in /docs/llms.txt and the full corpus in /llms-full.txt. The win: an agent gets oriented in one fetch instead of crawling.

Let agents append .md to any page URL or send Accept: text/markdown, and return the raw Markdown. Agents get clean tokens; humans still get the rendered page.

Even a docs site has an API surface (search, page fetch). Publish an OpenAPI document at a predictable path so an agent can call it without reverse-engineering. Pair it with RFC 9727: a /.well-known/api-catalog linkset that enumerates every API entry point.

The Model Context Protocol lets agents call your tools natively. We expose a read-only MCP server over Streamable HTTP at /api/mcp, plus a server-card.json for discovery. Two tools (search docs, get a doc by id) are enough to be useful.

.well-known identity files:

- agent-card.json (A2A): declares your agent skills.
- agent-skills/index.json: the Agent Skills Discovery RFC, with a SHA-256 digest per skill so a consumer can verify it.
- oauth-protected-resource (RFC 9728): if your API is anonymous, an explicitly empty authorization_servers list is a positive signal, not an omission.

Emit a schema.org graph (Organization, SoftwareApplication, WebSite) linked by @id, with sameAs pointing at the registries that already verify you (GitHub, your package registry, Wikidata). Use one canonical Organization block, reused everywhere, so the graph never sees conflicting identifiers. Never fabricate an aggregateRating; surface real signals (e.g. GitHub stars as an InteractionCounter) instead.

In robots.txt, explicitly allow the named AI crawlers you want (GPTBot, ClaudeBot, OAI-SearchBot, Google-Extended…), and emit a Content-Signal directive. Link your sitemap and a schema-map.

To verify, run curl -s https://yoursite/llms.txt, fetch each .well-known path, and run your JSON-LD through a structured-data validator.

If you build on Claude Code, the open-source OrchestKit docs site implements every item above. The source is on GitHub, MIT-licensed, and you can read the route handlers directly.

I maintain OrchestKit (a free, MIT plugin for Claude Code, 111 skills/37 agents/210 hooks). The agent-discovery surface described here is what its docs site ships today.
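
For the Markdown option described earlier (a .md suffix or an Accept: text/markdown header), the routing decision can be sketched in a few lines of Python. This is an illustration only, not the OrchestKit route handler: the function name wants_markdown is hypothetical, and the header parsing deliberately ignores q-values.

```python
def wants_markdown(path: str, accept: str = "") -> tuple[bool, str]:
    """Return (serve_markdown, page_path) for an incoming request.

    Hypothetical helper: a ".md" suffix or an Accept header that lists
    text/markdown both select the raw Markdown representation.
    """
    if path.endswith(".md"):
        # Strip the suffix so the caller can look up the underlying page.
        return True, path[: -len(".md")]

    # Split the Accept header into media types, dropping parameters such
    # as ";q=0.9". A stricter server would also honor q=0 as "not acceptable".
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return "text/markdown" in media_types, path
```

A handler would call this first, then either stream the stored Markdown with Content-Type: text/markdown or fall through to the normal HTML render.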
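
The per-skill SHA-256 digest in agent-skills/index.json is only useful if consumers actually check it. Here is a minimal sketch of that check using Python's standard library. It assumes the digest is published as a plain hex string, which the index format may or may not match exactly; the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(content: bytes) -> str:
    """Hex-encoded SHA-256 of the downloaded skill bytes."""
    return hashlib.sha256(content).hexdigest()


def verify_skill(content: bytes, expected_digest: str) -> bool:
    """Compare the computed digest with the published one.

    Assumption: expected_digest is a hex string taken from the index
    entry for this skill. hmac.compare_digest gives a constant-time
    comparison, which is a cheap habit even for public content.
    """
    return hmac.compare_digest(sha256_hex(content), expected_digest.strip().lower())
```

Hash the exact bytes you fetched, before any decoding or newline normalization, or the digests will not match.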
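
To make the "one canonical Organization block, linked by @id" idea concrete, here is a rough Python sketch that builds such a graph as JSON-LD. The @id fragments (#organization, #software, #website), the function name, and the choice of LikeAction to represent GitHub stars are all assumptions made for illustration, not necessarily what the OrchestKit site emits.

```python
import json


def build_graph(base_url: str, org_name: str, app_name: str,
                same_as: list[str], star_count: int) -> dict:
    """Build a schema.org @graph whose nodes reference one Organization by @id."""
    base = base_url.rstrip("/")
    org_id = f"{base}/#organization"  # assumed fragment naming

    org = {
        "@type": "Organization",
        "@id": org_id,
        "name": org_name,
        "url": base,
        "sameAs": same_as,  # registries that already verify you
    }
    app = {
        "@type": "SoftwareApplication",
        "@id": f"{base}/#software",
        "name": app_name,
        "publisher": {"@id": org_id},  # reference, not a second copy
        # A real, countable signal instead of an invented aggregateRating.
        "interactionStatistic": {
            "@type": "InteractionCounter",
            "interactionType": "https://schema.org/LikeAction",  # assumed mapping for stars
            "userInteractionCount": star_count,
        },
    }
    site = {
        "@type": "WebSite",
        "@id": f"{base}/#website",
        "url": base,
        "publisher": {"@id": org_id},
    }
    return {"@context": "https://schema.org", "@graph": [org, app, site]}


def to_json_ld(graph: dict) -> str:
    """Serialize for embedding in a script tag of type application/ld+json."""
    return json.dumps(graph, indent=2)
```

Because every node points at the same @id, changing the Organization's name or sameAs list happens in exactly one place.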
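
The verification step can also be scripted rather than run by hand with curl. The sketch below checks that each path returns HTTP 200. The path list is assembled from the files named in this post and assumes the identity files live directly under /.well-known/; adjust it to your own routes. The function names are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable

# Assumed locations, based on the files discussed in this post.
PATHS = [
    "/llms.txt",
    "/docs/llms.txt",
    "/llms-full.txt",
    "/robots.txt",
    "/.well-known/api-catalog",
    "/.well-known/agent-card.json",
    "/.well-known/agent-skills/index.json",
    "/.well-known/oauth-protected-resource",
]


def http_status(url: str) -> int:
    """Return the HTTP status for a GET, or 0 if the request never completed."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code
    except urllib.error.URLError:
        return 0  # DNS failure, refused connection, timeout, ...


def check_surface(base_url: str,
                  fetch: Callable[[str], int] = http_status) -> dict[str, bool]:
    """Map each expected path to True if it answered with HTTP 200."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in PATHS}
```

Calling check_surface("https://yoursite") and printing the paths mapped to False gives a quick list of what is still missing. The fetch parameter exists so the check can be exercised in tests without touching the network.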