{"slug": "agentrail-an-ai-agent-friendly-layer-for-websites", "title": "AgentRail. An AI-agent friendly layer for websites", "summary": "AgentRail launched a Cloudflare edge layer that serves deterministic Markdown responses to known AI agents from the same URLs humans browse, eliminating the need for separate AI-optimized endpoints. The open-source tool uses a background crawler to extract and cache Markdown in KV storage, returning it on subsequent agent requests without adding latency to human visitors.", "body_md": "AgentRail is a Cloudflare edge layer that gives known AI agents deterministic Markdown responses from the same URLs humans already visit.\n\n```\nBrowser or search crawler -> /pricing -> origin HTML\nKnown AI agent           -> /pricing -> generated Markdown if ready\nKnown AI agent           -> /pricing -> origin HTML if Markdown is unavailable\n```\n\nThe crawler runs in the background. Request handling never waits for extraction, so cache misses fall through to the original site without adding generation latency.\n\nWhen a known AI agent requests a page that is not in KV yet, AgentRail returns the origin page and uses `ctx.waitUntil` to warm KV from that same origin response. 
A later AI-agent request can then receive the prepared Markdown.\n\n```mermaid\nflowchart TD\n  browser[\"Human browser\"] --> worker[\"Cloudflare Worker route\"]\n  search[\"Search crawler\"] --> worker\n  ai[\"Known AI agent\"] --> worker\n\n  worker --> classify{\"Classify request\"}\n  classify -->|\"Browser, search crawler, unknown bot, asset, or non-GET/HEAD\"| origin[\"Origin website HTML\"]\n  classify -->|\"Known AI agent\"| kvcheck{\"KV record exists?\"}\n\n  kvcheck -->|\"ready or fresh stale\"| markdown[\"Return deterministic Markdown\"]\n  markdown --> headers[\"text/markdown + x-ai-response-layer\"]\n\n  kvcheck -->|\"missing\"| originfetch[\"Fetch origin HTML\"]\n  originfetch --> firstbot[\"Return origin HTML to first bot\"]\n  originfetch --> waituntil[\"ctx.waitUntil warmup\"]\n  waituntil --> extract[\"Extract deterministic Markdown\"]\n  extract --> store[\"Store page:<normalized-url> in AGENTRAIL_RESOURCES KV\"]\n\n  kvcheck -->|\"pending, failed, skipped, or too stale\"| origin\n  cron[\"Cloudflare Cron Trigger\"] --> sitemap[\"Fetch sitemap\"]\n  sitemap --> crawl[\"Crawl sitemap URLs\"]\n  crawl --> extract\n\n  store --> nextbot[\"Next AI-agent request\"]\n  nextbot --> kvcheck\n```\n\n- `@agentrail/bot-detector`: classifies AI agents, search crawlers, browsers, and unknown bots.\n- `@agentrail/markdown-extractor`: deterministic HTML to Markdown extraction.\n- `@agentrail/crawler`: sitemap parsing, link discovery, resource keys, and crawl processing.\n- `@agentrail/worker`: Cloudflare Worker runtime.\n- `create-agentrail`: scaffold generator for Cloudflare projects.\n\nAgentRail expects Node 22 or newer. 
Current Wrangler 4 releases require it.\n\n```\nnpm test\n```\n\nThe repository uses Node's built-in test runner and has no runtime test dependency.\n\nFrom this repository:\n\n```\nnode --import tsx packages/create-agentrail/bin/create-agentrail.ts my-site \\\n  --origin=https://example.com \\\n  '--route=example.com/*' \\\n  --schedule=\"0 */6 * * *\"\n```\n\nThe CLI checks Cloudflare through Wrangler, reuses an existing `AGENTRAIL_RESOURCES` KV namespace if one is present, or creates it automatically if it is missing. When that setup succeeds, the generated project contains a Wrangler-compatible Worker entrypoint and config with the real KV namespace id already written into `wrangler.jsonc`. If automatic setup is skipped or fails, the config keeps a placeholder and the generated README explains the manual KV setup.\n\nIt also runs `npm install` inside the generated project by default, so the normal next step is deploy:\n\n```\ncd my-site\nnpm run deploy\n```\n\nAgentRail includes a Cron Trigger for background crawling. On a fresh Cloudflare account, open the Cloudflare dashboard and visit Workers & Pages once before the first deploy. Cloudflare creates the required `workers.dev` subdomain there. 
If `npm run deploy` fails with Cloudflare `code: 10063`, do that dashboard step and rerun the deploy command.\n\nIf you want to generate files only:\n\n```\nnode --import tsx packages/create-agentrail/bin/create-agentrail.ts my-site \\\n  --origin=https://example.com \\\n  '--route=example.com/*' \\\n  --skip-install\n```\n\nIf you are offline, not logged into Wrangler, or want to wire Cloudflare later:\n\n```\nnode --import tsx packages/create-agentrail/bin/create-agentrail.ts my-site \\\n  --origin=https://example.com \\\n  '--route=example.com/*' \\\n  --skip-cloudflare\n```\n\nThe generated `wrangler.jsonc` will contain this placeholder until you add the real KV namespace id:\n\n```\n{\n  \"binding\": \"AGENTRAIL_RESOURCES\",\n  \"id\": \"replace-with-agentrail-resources-kv-id\"\n}\n```\n\nIf you already have a namespace id:\n\n```\nnode --import tsx packages/create-agentrail/bin/create-agentrail.ts my-site \\\n  --origin=https://example.com \\\n  '--route=example.com/*' \\\n  --kv-id=your-kv-namespace-id\n```\n\nUse this when automatic Cloudflare setup was skipped or failed.\n\nFirst make sure Wrangler is logged in:\n\n```\nnpx wrangler login\n```\n\nCheck whether the namespace already exists:\n\n```\nnpx wrangler kv namespace list --json\n```\n\nIf the output includes a namespace with `\"title\": \"AGENTRAIL_RESOURCES\"`, copy its `\"id\"`.\n\nIf it does not exist, create it:\n\n```\nnpx wrangler kv namespace create AGENTRAIL_RESOURCES\n```\n\nWrangler prints an id. It may look like this:\n\n```\nid = \"abc123...\"\n```\n\nPaste that id into `wrangler.jsonc`:\n\n```\n{\n  \"kv_namespaces\": [\n    {\n      \"binding\": \"AGENTRAIL_RESOURCES\",\n      \"id\": \"abc123...\"\n    }\n  ]\n}\n```\n\nThen deploy:\n\n```\nnpm install\nnpm run deploy\n```\n\nGenerated projects are local deployment workspaces. 
Keep them under `projects/`; that folder is ignored so your site-specific Cloudflare config does not get committed to the AgentRail source repo.\n\nCopy the example config and edit the route and origin:\n\n```\ncp wrangler.example.jsonc wrangler.jsonc\n```\n\nFollow the manual KV setup above if `AGENTRAIL_RESOURCES` is not configured yet, then deploy:\n\n```\nnpm install\nnpm run deploy\n```\n\nIf this is the first Worker on the Cloudflare account, open Workers & Pages in the Cloudflare dashboard once before deploying so Cloudflare creates the required `workers.dev` subdomain for cron schedules.\n\nAgentRail only returns Markdown when a stored resource is safe to serve:\n\n- `ready`: return Markdown.\n- `stale`: return Markdown only inside the configured stale window.\n- `missing`, `pending`, `failed`, `skipped`, or too stale: pass through to origin.\n\nHumans, traditional search crawlers, unknown bots, assets, and non-GET/HEAD requests always pass through to origin. Known AI-agent GET requests with no KV record also schedule a background warmup from the origin response before passing through. 
That keeps the first miss fast and prepares the next bot request.\n\nAgentRail treats these user agents as AI-agent traffic by default:\n\n```\nApplebot\nGPTBot\nChatGPT-User\nOAI-SearchBot\nGoogle-CloudVertexBot\nClaudeBot\nClaude-User\nClaude-SearchBot\nAnthropic-AI\nPerplexityBot\nPerplexity-User\nYouBot\nCohere-AI\nAmazonbot\nAnchor Browser\nBytespider\nCloudflare Crawler\nCCBot\nDuckAssistBot\nFacebookBot\nManus Bot\nMeta-ExternalAgent\nMeta-ExternalFetcher\nMistralAI-User\nNovellum AI Crawl\nPetalBot\nProRataInc\nTikTok Spider\nTimpibot\n```\n\nGooglebot, Bingbot, DuckDuckBot, YandexBot, Baiduspider, archive.org_bot, Arquivo Web Crawler, Terracotta Bot, Slurp, and other traditional search crawlers stay on the origin path.\n\nThe basic mode uses:\n\n- Worker routes for request switching.\n- Cron Trigger for sitemap crawling.\n- KV namespace named `AGENTRAIL_RESOURCES` for Markdown records.\n- Request-time warmup for AI-agent misses.\n\nCron can crawl sitemap pages directly into KV. A production deployment can add Queues and D1 later, but they are not required for the first useful version.\n\nLocal Wrangler does not run Cron Triggers by itself. AgentRail's dev script uses `--test-scheduled`, so you can run `npm run dev` and trigger the crawler manually:\n\n```\ncurl \"http://localhost:8787/__scheduled?cron=0+*/6+*+*+*\"\n```\n\nEach record stores Markdown with this shape:\n\n```\n# Page Title\n\nCanonical URL: https://example.com/page\nLast generated: 2026-06-03T00:00:00.000Z\nSource: public HTML\n\n## Description\nMeta description or first meaningful paragraph.\n\n## Content\nClean extracted page content.\n```\n\nThe extractor preserves source ordering where practical and does not use LLM summarization.\n\nApache-2.0. 
See [LICENSE](https://github.com/gharibyan/agentrail/blob/main/LICENSE).", "url": "https://wpnews.pro/news/agentrail-an-ai-agent-friendly-layer-for-websites", "canonical_source": "https://github.com/gharibyan/agentrail", "published_at": "2026-06-04 08:22:32+00:00", "updated_at": "2026-06-04 08:48:00.258581+00:00", "lang": "en", "topics": ["ai-infrastructure", "ai-agents", "ai-tools", "ai-products", "artificial-intelligence"], "entities": ["AgentRail", "Cloudflare"], "alternates": {"html": "https://wpnews.pro/news/agentrail-an-ai-agent-friendly-layer-for-websites", "markdown": "https://wpnews.pro/news/agentrail-an-ai-agent-friendly-layer-for-websites.md", "text": "https://wpnews.pro/news/agentrail-an-ai-agent-friendly-layer-for-websites.txt", "jsonld": "https://wpnews.pro/news/agentrail-an-ai-agent-friendly-layer-for-websites.jsonld"}}