{"slug": "writing-api-docs-an-ai-agent-can-actually-consume", "title": "Writing API docs an AI agent can actually consume", "summary": "FamNest engineer discovered that AI agents fail to call APIs correctly when documentation is written for humans, not machines. The coach agent in FamNest's agent graph repeatedly constructed malformed calls to a retrieval endpoint because the docs omitted exact schemas, enums, and bounds. The fix is to publish machine-readable schemas (e.g., Zod) instead of prose descriptions, ensuring agents can pattern-match without guessing.", "body_md": "*Your docs are written for a human who can guess. The agent calling your API can't*.\n\nI found the gap the embarrassing way: one of my own agents couldn't call one of my own APIs.\n\nFamNest runs a small agent graph — a router hands a parent's message to a retriever, the retriever grounds an answer in a vetted corpus, a coach agent (Groq, Llama 3.3 70B) drafts a reply, and a safety-reviewer agent signs off before anything reaches a human. The agents call internal endpoints the same way a third-party integrator would. And one afternoon the coach kept constructing malformed calls to the retrieval endpoint — wrong field name, missing a required filter, occasionally inventing a parameter that never existed.\n\nThe endpoint wasn't broken. The docs were. They were written for a human who could fill in the blanks, and the agent had no blanks to fill — only the tokens I gave it.\n\nThat's the whole lesson, and it's worth more than a trend.\n\nThe trend everyone's shipping — and where it stops\n\nIf you've touched developer tooling in 2026 you've watched llms.txt go from a September-2024 proposal to a routine piece of infrastructure. It's a Markdown file at your domain root that points AI systems at the content that matters, with a one-line summary of each link. Mintlify, Fern, and GitBook ship one-click toggles for it. 
IDE agents — Cursor, Windsurf, Claude Code, Copilot — fetch it when you point them at a docs site, then pull only the linked pages they need before writing code. LangChain even shipped an MCP server (mcpdoc) that hands those files to host apps as a fetch_docs tool.\n\nPeople are calling this the Business-to-Agent web, and the framing is right: just as you once needed a site humans could navigate, you now need surfaces agents can route on. Ship the llms.txt. It's a half-day of work.\n\nBut notice what it actually solves: discovery. It answers \"which page matters.\" It says nothing about the harder question that broke my coach agent:\n\nOnce the agent has found your endpoint, can it call it correctly on the first try — with no human in the loop to recover when your prose is ambiguous?\n\nThat's not a discovery problem. That's a contract problem. And it's where most docs quietly fail.\n\nAn agent is a different kind of reader\n\nA human reading your docs brings a lifetime of priors. They infer that userId is probably a UUID. They notice the example uses snake_case and adjust. They hit a 400, shrug, read the error, and try again. If they're really stuck they ask a teammate. Human docs can be good enough because the human closes the gap.\n\nAn agent closes nothing. It has your tokens and a probability distribution. It pattern-matches structure: if your example shows one field, it produces one field; if you describe an error in a sentence, it treats the sentence as flavor, not as a branch it has to handle. Ambiguity doesn't make an agent cautious — it makes it confident and wrong.\n\nSo the doc stops being documentation and becomes the interface itself. Everything the agent will ever know about your endpoint is in the text. If a fact isn't on the page, it doesn't exist.\n\nThat reframes what a good endpoint doc has to contain. 
Here are the five things mine were missing.\n\nProse says: \"Send the user's question and an optional list of topic tags.\"\n\nA schema says exactly what's allowed, and an agent can pattern-match it without guessing:\n\n```ts\n// retrieve — request\nimport { z } from \"zod\";\n\nconst RetrieveRequest = z.object({\n  query: z.string().min(1).max(2000),\n  topics: z.array(z.enum([\"sleep\", \"feeding\", \"behavior\", \"self_care\"]))\n    .max(4)\n    .default([]),\n  topK: z.number().int().min(1).max(10).default(5),\n});\n```\n\nThe difference is the enum, the bounds, the default. \"Optional list of topic tags\" let my coach invent `\"toddler_tantrums\"`. `z.enum([...])` makes the valid set impossible to guess wrong. Publish the schema, not a paragraph about the schema.\n\nAgents copy examples. Whatever you show is what you'll get back. If your only example is the happy path, the happy path is the only thing the model knows how to produce.\n\nSo I document the empty result and the rejected request as first-class examples, not footnotes:\n\n```jsonc\n// 200 — results found\n{ \"matches\": [{ \"id\": \"c_18\", \"score\": 0.82, \"text\": \"...\" }], \"truncated\": false }\n\n// 200 — valid query, nothing relevant (NOT an error)\n{ \"matches\": [], \"truncated\": false }\n\n// 422 — query failed validation\n{ \"error\": \"validation_error\", \"field\": \"topics\", \"detail\": \"unknown topic 'toddler_tantrums'\" }\n```\n\nThe middle case is the one humans leave out and agents desperately need. \"No matches\" is a normal outcome, not a failure — and if you don't say so, the agent will treat an empty array as a bug and retry forever.\n\nMost docs describe errors. Agents need to be told what to do about them. \"Returns 429 when rate-limited\" is a description. 
An agent needs a decision.\n\nSo I ship a table where every row ends in an action:\n\n| Code | `error` | Cause | What the caller should do |\n|---|---|---|---|\n| 422 | `validation_error` | Bad input | Fix the field named in `detail`; do not retry unchanged |\n| 429 | `rate_limited` | Too many calls | Back off using `Retry-After`; retry the same body |\n| 503 | `model_unavailable` | Upstream LLM down | Fall back to cached/deterministic path; do not retry tightly |\n| 409 | `idempotency_conflict` | Key reused, different body | Stop; surface to a human |\n\nA human reads that table for reference. An agent reads it as a control-flow graph. The \"do not retry\" cells are the ones that stop a confused agent from hammering your endpoint at 3am.\n\nThe single most useful thing I added to any endpoint doc was the answer to one question: \"Is it safe to retry this?\"\n\nAgents retry. Networks are flaky, and a retried call that isn't idempotent is how you double-charge a card or send two replies to one anxious parent. For anything with a side effect, I now state the contract in the doc itself:\n\nIdempotency: required for POST /coach/reply and all payment routes.\n\nSend an `Idempotency-Key` header (UUID). Replays with the same key + same body return the original result. Same key + different body → 409.\n\nRetrieval (GET /retrieve) is side-effect-free and safe to retry freely.\n\nThat paragraph is the difference between a retry loop that heals and one that does damage. It's also the kind of thing humans infer and agents simply won't — there is no prior that tells a model your payment webhook is replay-safe. You have to say it.\n\n\"Authenticated requests only, please don't spam it\" is not a contract. 
Scopes, the exact header, the rate limit, and the window belong in the doc as structured fields the agent can read and self-regulate against:\n\n```plaintext\nAuth: Bearer token in Authorization. Scope coach:read for /retrieve.\nLimits: 60 req/min/token. On exceed → 429 + Retry-After (seconds).\n```\n\nNow the agent can pace itself instead of discovering your limit by tripping it.\n\nKeep it honest: one source of truth\n\nAll of this rots the moment your docs and your code disagree — and an agent can't smell a stale doc the way a human can. So the contract has to be generated, not hand-maintained.\n\nMy chain is boring on purpose: the typed Next.js handler validates with the Zod schema, the schema generates the OpenAPI spec, and my llms.txt links to the generated reference. The schema is the only thing I edit. The doc can't drift, because the doc is downstream of the thing that's actually true.\n\n**Zod schema ──► request validation (runtime)\n│\n└────────► OpenAPI spec ──► /llms.txt entry ──► agent reads it\n**\n\nThe test that actually proves it\n\nHere's the check I run before I trust an endpoint doc: give a fresh model only the doc — no codebase, no context — and ask it to (a) construct a valid call and (b) handle a seeded error. If it can't, the gap is in the doc, not the model. I keep these as tiny snapshot tests next to the endpoint, so a doc regression fails CI like any other bug.\n\nWhen my coach agent broke, this test would have caught it in seconds. The retrieval doc, fed to a cold model, produced exactly the malformed call I saw in production — because the ambiguity was right there on the page.\n\nThis isn't new. It's just a new reader.\n\nNone of this is novel discipline. \"Unambiguous, complete, verifiable\" is the spine of IEEE 29148 — the requirements standard I write everything against. What's changed is that the consumer of your interface is no longer guaranteed to be a person who can paper over a vague spec. Half your integrators in 2026 are agents, and they read your docs literally, exhaustively, and without charity.\n\nShip the llms.txt so agents can find you. 
But the thing that makes them succeed once they arrive is older and less glamorous: a contract precise enough that it can't be misread. The agentic web doesn't need prettier docs. It needs docs that can't be guessed wrong.\n\nI build FamNest, an AI wellness coach for busy parents, and write about production reliability and safety for multi-agent systems. If you're documenting an agent-callable API and want a second pair of eyes on the contract, my notes are open.", "url": "https://wpnews.pro/news/writing-api-docs-an-ai-agent-can-actually-consume", "canonical_source": "https://dev.to/virginiamwega2svg/writing-api-docs-an-ai-agent-can-actually-consume-16bb", "published_at": "2026-06-29 15:01:58+00:00", "updated_at": "2026-06-29 15:19:41.280561+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "large-language-models", "natural-language-processing"], "entities": ["FamNest", "Groq", "Llama 3.3 70B", "Mintlify", "Fern", "GitBook", "Cursor", "Windsurf"], "alternates": {"html": "https://wpnews.pro/news/writing-api-docs-an-ai-agent-can-actually-consume", "markdown": "https://wpnews.pro/news/writing-api-docs-an-ai-agent-can-actually-consume.md", "text": "https://wpnews.pro/news/writing-api-docs-an-ai-agent-can-actually-consume.txt", "jsonld": "https://wpnews.pro/news/writing-api-docs-an-ai-agent-can-actually-consume.jsonld"}}