# Making your docs site agent-readable: llms.txt, MCP, and the .well-known files that actually matter

> Source: <https://dev.to/yonyonai/making-your-docs-site-agent-readable-llmstxt-mcp-and-the-well-known-files-that-actually-matter-33c6>
> Published: 2026-06-14 17:13:20+00:00

AI agents increasingly read your docs *instead of* a human. If your documentation site only emits HTML for a browser, an agent has to scrape and guess. There's a better surface — and most of it is a handful of small, standard files. Here's the full stack we ship on the OrchestKit docs site, why each piece exists, and how to verify it.

## `llms.txt`: the agent's table of contents

A plain-text index at `/llms.txt`: what the product is, its constraints, and a link map to every machine-readable resource. Keep it under ~30k chars; put the exhaustive page list in `/docs/llms.txt` and the full corpus in `/llms-full.txt`. The win: an agent gets oriented in one fetch instead of crawling.
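As a rough illustration of the index described above, here is a minimal sketch that renders a small `llms.txt` and refuses to exceed the ~30k character budget. The `DocLink` shape, the `buildLlmsTxt` name, and the example product are hypothetical, not taken from the OrchestKit source.

```typescript
// Hypothetical link record; the field names are illustrative only.
interface DocLink {
  title: string;
  url: string;
  description?: string;
}

const MAX_CHARS = 30_000; // rough budget from the guidance above

// Render an H1 title, a blockquote summary, then a list of links.
function buildLlmsTxt(name: string, summary: string, links: DocLink[]): string {
  const lines: string[] = [`# ${name}`, "", `> ${summary}`, "", "## Resources", ""];
  for (const link of links) {
    const suffix = link.description ? `: ${link.description}` : "";
    lines.push(`- [${link.title}](${link.url})${suffix}`);
  }
  const text = lines.join("\n") + "\n";
  if (text.length > MAX_CHARS) {
    // Fail loudly so the exhaustive list goes to /docs/llms.txt instead.
    throw new Error(`llms.txt is ${text.length} chars; move detail to /docs/llms.txt`);
  }
  return text;
}

const llmsTxt = buildLlmsTxt("Example Docs", "Documentation for a hypothetical product.", [
  { title: "All pages", url: "/docs/llms.txt", description: "exhaustive page list" },
  { title: "Full corpus", url: "/llms-full.txt" },
]);
console.log(llmsTxt);
```

Throwing on overflow is one possible design choice; truncating or splitting the list automatically would also work, at the cost of hiding the problem from whoever maintains the index.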

When a request appends `.md` to any page URL (or sends `Accept: text/markdown`), return the raw Markdown. Agents get clean tokens; humans still get the rendered page.
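The two triggers above (a `.md` suffix or an `Accept: text/markdown` header) can be sketched as a small decision function. The `negotiate` name and the `Negotiated` shape are assumptions for illustration, and the header check is deliberately simplified.

```typescript
// Result of deciding how to serve a docs request.
interface Negotiated {
  pagePath: string;  // path with any `.md` suffix removed
  markdown: boolean; // true when raw Markdown should be returned
}

function negotiate(pathname: string, acceptHeader: string | null): Negotiated {
  // Rule 1: an explicit `.md` suffix always means raw Markdown.
  if (pathname.endsWith(".md")) {
    return { pagePath: pathname.slice(0, -3), markdown: true };
  }
  // Rule 2: simplified substring match on the Accept header (q-values are ignored).
  const wantsMarkdown = (acceptHeader ?? "").toLowerCase().includes("text/markdown");
  return { pagePath: pathname, markdown: wantsMarkdown };
}

console.log(negotiate("/docs/hooks.md", null));
console.log(negotiate("/docs/hooks", "text/markdown, text/html;q=0.8"));
console.log(negotiate("/docs/hooks", "text/html"));
```

If the same URL can return either HTML or Markdown depending on the header, sending `Vary: Accept` on the response keeps shared caches from mixing the two representations.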

Even a docs site has an API surface (search, page fetch). Publish an OpenAPI document at a predictable path so an agent can call it without reverse-engineering. Pair it with RFC 9727: a `/.well-known/api-catalog` linkset that enumerates every API entry point.
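To make the catalog idea concrete, the sketch below builds a linkset-style JSON document in which each API entry points at its OpenAPI description through a `service-desc` link. The `ApiEntry` shape, the `buildApiCatalog` name, and the example URLs are hypothetical; check the exact fields you need against RFC 9727 before shipping.

```typescript
// Hypothetical description of one API exposed by the docs site.
interface ApiEntry {
  anchor: string;     // base URL of the API
  openapiUrl: string; // where its OpenAPI document is published
}

// Build a linkset object: one context per API, each linking to its description.
function buildApiCatalog(apis: ApiEntry[]) {
  return {
    linkset: apis.map((api) => ({
      anchor: api.anchor,
      "service-desc": [{ href: api.openapiUrl, type: "application/json" }],
    })),
  };
}

const catalog = buildApiCatalog([
  {
    anchor: "https://docs.example.com/api/search",        // assumed URL
    openapiUrl: "https://docs.example.com/openapi.json",  // assumed URL
  },
]);
console.log(JSON.stringify(catalog, null, 2));
```

A route serving this at `/.well-known/api-catalog` would typically respond with the `application/linkset+json` media type.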

The Model Context Protocol lets agents call your tools natively. We expose a read-only MCP server over Streamable HTTP at `/api/mcp` plus a discovery `server-card.json`. Two tools (search docs, get a doc by id) are enough to be useful.
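MCP messages are JSON-RPC 2.0, so calling either tool comes down to a `tools/call` request. The sketch below only builds the request bodies; the tool names `search_docs` and `get_doc` and their argument names are assumptions, since the actual names are not given here.

```typescript
// Minimal JSON-RPC 2.0 request shape used for MCP tool calls.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Build a `tools/call` request for a named tool with its arguments.
function toolCall(id: number, name: string, args: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical tool names and arguments for the two read-only tools.
const searchRequest = toolCall(1, "search_docs", { query: "hooks" });
const getRequest = toolCall(2, "get_doc", { id: "hooks/overview" });

console.log(JSON.stringify(searchRequest));
console.log(JSON.stringify(getRequest));
```

A real client would first complete the protocol's `initialize` handshake and then POST these bodies to the server endpoint (`/api/mcp` in the setup described above).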

## `.well-known` identity files

- `agent-card.json` (A2A): declares your agent skills.
- `agent-skills/index.json`: the Agent Skills Discovery RFC, with a SHA-256 digest per skill so a consumer can verify it.
- `oauth-protected-resource` (RFC 9728): if your API is anonymous, `authorization_servers` is a positive signal, not an omission.

Emit a `schema.org` graph (`Organization`, `SoftwareApplication`, `WebSite`) linked by `@id`, with `sameAs` pointing at the registries that already verify you (GitHub, your package registry, Wikidata). One canonical Organization block, reused everywhere, so the graph never sees conflicting identifiers. Never fabricate an `aggregateRating`; surface real signals (e.g. GitHub stars as an `InteractionCounter`) instead.
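The `schema.org` graph described above might look like the following sketch: one Organization node, referenced by `@id` from the other nodes, with the star count passed in from a real source instead of being hard-coded. The origin, names, `sameAs` URL, and the choice of `LikeAction` for stars are all assumptions for illustration.

```typescript
const SITE = "https://docs.example.com"; // hypothetical origin
const ORG_ID = `${SITE}/#organization`;

// The single canonical Organization block, defined once and referenced by @id.
const organization = {
  "@type": "Organization",
  "@id": ORG_ID,
  name: "Example Org",
  url: SITE,
  sameAs: ["https://github.com/example-org"], // assumed registry profile
};

// `stars` must come from a real measurement (e.g. the GitHub API), never a made-up number.
function buildGraph(stars: number) {
  return {
    "@context": "https://schema.org",
    "@graph": [
      organization,
      { "@type": "WebSite", "@id": `${SITE}/#website`, url: SITE, publisher: { "@id": ORG_ID } },
      {
        "@type": "SoftwareApplication",
        "@id": `${SITE}/#software`,
        name: "Example Plugin",
        applicationCategory: "DeveloperApplication",
        author: { "@id": ORG_ID },
        interactionStatistic: {
          "@type": "InteractionCounter",
          interactionType: { "@type": "LikeAction" },
          userInteractionCount: stars,
        },
      },
    ],
  };
}

// Placeholder value for the sketch; replace with the fetched count.
console.log(JSON.stringify(buildGraph(0), null, 2));
```

Because other nodes only carry `{ "@id": ORG_ID }`, a change to the organization's name or `sameAs` list happens in one place.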

## `robots.txt`

Explicitly allow the named AI crawlers you want (GPTBot, ClaudeBot, OAI-SearchBot, Google-Extended…), and emit a `Content-Signal` directive. Link your sitemap and a schema-map.
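A small generator keeps the crawler list and the signal in one place. In the sketch below, the `Content-Signal` value, the sitemap path, and the `buildRobotsTxt` name are assumptions; pick the signal values that match your own policy. The schema-map link is left out because its exact directive is not given above.

```typescript
// Crawlers named in the text above; extend or trim to taste.
const ALLOWED_BOTS = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "Google-Extended"];

// Emit one group that names every allowed crawler, then the sitemap link.
function buildRobotsTxt(origin: string, bots: string[], contentSignal: string): string {
  const lines: string[] = [];
  for (const bot of bots) {
    lines.push(`User-agent: ${bot}`); // several User-agent lines share one group
  }
  lines.push(`Content-Signal: ${contentSignal}`, "Allow: /", "");
  lines.push(`Sitemap: ${origin}/sitemap.xml`); // assumed sitemap location
  return lines.join("\n") + "\n";
}

const robotsTxt = buildRobotsTxt(
  "https://docs.example.com",             // hypothetical origin
  ALLOWED_BOTS,
  "search=yes, ai-input=yes, ai-train=no" // assumed policy values
);
console.log(robotsTxt);
```

Putting the directive inside the same group as the named crawlers matters: a crawler that matches a specific group does not also read a separate `User-agent: *` group.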

To verify, run `curl -s https://yoursite/llms.txt`, fetch each `.well-known` path, and run your JSON-LD through a structured-data validator.

If you build on Claude Code, the open-source **OrchestKit** docs site implements every item above: the source is on GitHub, MIT-licensed, and you can read the route handlers directly.
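The fetch-each-path check can be scripted. The sketch below lists the text and `.well-known` paths named in this article and reports any that do not return a success status. The `.well-known/` prefix on each identity file is assumed, and the `surfaceUrls`, `checkSurface`, and `Fetcher` names are hypothetical. The fetcher is injected so the logic can be exercised with a stub before pointing it at a live site.

```typescript
// Well-known files mentioned in this article.
const WELL_KNOWN = [
  "api-catalog",
  "agent-card.json",
  "agent-skills/index.json",
  "oauth-protected-resource",
];

// Build the full list of URLs to probe for a given origin.
function surfaceUrls(origin: string): string[] {
  const base = origin.replace(/\/+$/, ""); // drop trailing slashes
  return [
    `${base}/llms.txt`,
    `${base}/docs/llms.txt`,
    `${base}/llms-full.txt`,
    ...WELL_KNOWN.map((path) => `${base}/.well-known/${path}`),
  ];
}

// Minimal fetch-like signature; the global `fetch` in recent Node versions fits it.
type Fetcher = (url: string) => Promise<{ ok: boolean; status: number }>;

// Return one "status url" string per failing resource.
async function checkSurface(origin: string, fetcher: Fetcher): Promise<string[]> {
  const failures: string[] = [];
  for (const url of surfaceUrls(origin)) {
    const res = await fetcher(url);
    if (!res.ok) failures.push(`${res.status} ${url}`);
  }
  return failures;
}

// Stub that pretends the api-catalog is missing; swap in `fetch` for a real run.
const stub: Fetcher = async (url) =>
  url.endsWith("api-catalog") ? { ok: false, status: 404 } : { ok: true, status: 200 };

checkSurface("https://yoursite", stub).then((failures) => console.log(failures));
```

This only checks reachability; validating the JSON-LD still needs a structured-data validator as described above.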

*I maintain OrchestKit (a free, MIT plugin for Claude Code, 111 skills/37 agents/210 hooks). The agent-discovery surface described here is what its docs site ships today.*
