llms.txt is a Markdown index for LLMs, placed at the site root. Where sitemap.xml is a machine-readable list of URLs, llms.txt describes, with one-line notes, what the site is and where to start reading.
In Astro you can generate it from Content Collections as an API route, so the post list never has to be hand-maintained. This post is the minimum setup for a bilingual (EN/JA) site: emit /llms.txt, /ja/llms.txt and /llms-full.txt from one renderer.
Up front: how much llms.txt actually helps AI-search traffic hasn't been settled or measured yet. This post covers only the implementation.
Astro's file-based API routes return text when you drop a .txt.ts file under src/pages/. Return a text/plain Response from a GET handler.
```typescript
// src/pages/llms.txt.ts
import type { APIContext } from "astro";
import { renderLlmsTxt } from "../lib/llmsTxt";

export async function GET(_context: APIContext) {
  const body = await renderLlmsTxt({ docLang: "en" });
  return new Response(body, {
    status: 200,
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "Cache-Control": "public, max-age=3600",
    },
  });
}
```
Astro strips the trailing .ts, so src/pages/llms.txt.ts builds to the URL /llms.txt. Keep the assembly logic in src/lib/llmsTxt.ts and leave the route thin, so a per-language endpoint can reuse it.
Get the post list with getCollection and lay it out on the fly. A hand-kept list goes stale: add a post, forget the index, and llms.txt drifts from the content.
```typescript
// src/lib/llmsTxt.ts (excerpt)
import { getCollection } from "astro:content";

export async function renderLlmsTxt(opts: LlmsTxtOptions): Promise<string> {
  // Exclude drafts so unpublished URLs never reach llms.txt.
  const blog = await getCollection("blog", ({ data }) => !data.draft);
  // Newest first.
  blog.sort((a, b) => b.data.pubDate.getTime() - a.data.pubDate.getTime());
  // ...assemble sections and return join("\n")
}
```
Don't drop the ({ data }) => !data.draft filter. Skip it and a half-written draft lands in llms.txt, advertising a URL you haven't published. Reuse the same exclusion that your sitemap and RSS feed use.
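The filter, sort, and assemble steps can be sketched without Astro as a pure function over plain objects. Everything named here (the BlogEntry shape, isPublished, postUrl, renderIndex, the site title and the /blog/ URL scheme) is an assumption for illustration, not the site's actual code. The layout follows the general llms.txt convention of an H1 title, a blockquote summary, and H2 sections of annotated links.

```typescript
// Hypothetical entry shape, loosely mirroring a Content Collections entry.
interface BlogEntry {
  id: string;
  data: { title: string; description: string; pubDate: Date; draft?: boolean };
}

// One shared predicate, so llms.txt, the sitemap, and RSS can exclude the same entries.
const isPublished = (entry: BlogEntry): boolean => !entry.data.draft;

// Assumed URL scheme: <site>/blog/<id>/
const postUrl = (site: string, id: string): string => `${site}/blog/${id}/`;

function renderIndex(entries: BlogEntry[], site: string): string {
  const posts = entries
    .filter(isPublished) // filter() returns a new array, so the caller's array is not reordered
    .sort((a, b) => b.data.pubDate.getTime() - a.data.pubDate.getTime()); // newest first

  const lines = [
    "# Example Site",
    "",
    "> One-line summary of what the site is.",
    "",
    "## Posts",
    "",
    ...posts.map(
      (p) => `- [${p.data.title}](${postUrl(site, p.id)}): ${p.data.description}`,
    ),
  ];
  return lines.join("\n");
}

const sample: BlogEntry[] = [
  { id: "old-post", data: { title: "Old", description: "Older note", pubDate: new Date("2024-01-10") } },
  { id: "wip", data: { title: "WIP", description: "Unfinished", pubDate: new Date("2024-03-01"), draft: true } },
  { id: "new-post", data: { title: "New", description: "Newer note", pubDate: new Date("2024-02-20") } },
];

// The draft is dropped and "new-post" is listed before "old-post".
const indexText = renderIndex(sample, "https://example.com");
```

Keeping the predicate as a named function is one way to make the "same exclusion everywhere" rule hard to forget: each generator imports it rather than restating the condition.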
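This is the part that matters for a multilingual site. Give the renderer two axes: filterLang, which decides which language's posts are listed (leave it unset to list every language), and docLang, which sets the language of the file's own headings. Separating them lets one renderer emit three endpoints.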
```typescript
// src/pages/llms.txt.ts → English headings, posts from all languages
renderLlmsTxt({ docLang: "en" });

// src/pages/ja/llms.txt.ts → Japanese headings, Japanese posts only
renderLlmsTxt({ filterLang: "ja", docLang: "ja" });
```
The English /llms.txt leaves filterLang unset on purpose: it's the whole-site entry point, so it surfaces posts in either language. The Japanese /ja/llms.txt narrows to Japanese posts only with filterLang: "ja".
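To make the two axes concrete, here is a minimal sketch of how an options type and a section renderer might keep them independent. The LlmsTxtOptions fields are inferred from the calls shown in this post. The LangPost shape, langOf, HEADINGS, renderSection, and the rule that Japanese entries have ids starting with "ja/" are assumptions for illustration.

```typescript
type Lang = "en" | "ja";

// The two axes: which posts are listed, and which language the file itself is written in.
interface LlmsTxtOptions {
  filterLang?: Lang; // unset means "posts from every language"
  docLang: Lang;
}

interface LangPost {
  id: string; // assumed convention: Japanese entries live under "ja/"
  title: string;
}

// Assumption: the language is derived from the id prefix.
const langOf = (id: string): Lang => (id.startsWith("ja/") ? "ja" : "en");

// Headings keyed by docLang (wording is illustrative only).
const HEADINGS: Record<Lang, { posts: string }> = {
  en: { posts: "## Posts" },
  ja: { posts: "## 記事" },
};

function renderSection(posts: LangPost[], opts: LlmsTxtOptions): string {
  // filterLang controls membership; docLang only controls the heading text.
  const filtered =
    opts.filterLang === undefined
      ? posts
      : posts.filter((p) => langOf(p.id) === opts.filterLang);
  return [HEADINGS[opts.docLang].posts, ...filtered.map((p) => `- ${p.title}`)].join("\n");
}

const mixedPosts: LangPost[] = [
  { id: "hello", title: "Hello" },
  { id: "ja/hello", title: "こんにちは" },
];

// English headings, both posts listed.
const enDoc = renderSection(mixedPosts, { docLang: "en" });
// Japanese headings, Japanese post only.
const jaDoc = renderSection(mixedPosts, { filterLang: "ja", docLang: "ja" });
```

Because filterLang is optional in the type, "unset" is an explicit, documented state rather than a missing argument, which makes the English route's behavior easier to recognize as intentional.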
One design call: the English version can surface both languages, but the Featured section alone narrows to posts written in docLang.
```typescript
const featuredSource = filteredBlog.filter(
  (p) => entryLangLocal(p.id) === opts.docLang,
);
```
Listing both halves of a translation pair in the featured slots spends two slots on one piece of content and halves the unique signal in a bounded list. So a limited list (featured) narrows by language; a full dump with loose size limits (llms-full.txt) carries both.
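The slot arithmetic can be shown with a small sketch. The entryLangLocal below is a stand-in with an assumed "ja/" id-prefix rule, since the real helper is not shown in this post. The FeaturedCandidate shape, its featured flag, pickFeatured, pickFull, and the limit of 2 are likewise hypothetical.

```typescript
type DocLang = "en" | "ja";

interface FeaturedCandidate {
  id: string; // assumed convention: "ja/<slug>" for Japanese, "<slug>" for English
  featured: boolean; // hypothetical frontmatter flag
}

// Stand-in for the post's entryLangLocal helper (assumed implementation).
function entryLangLocal(id: string): DocLang {
  return id.startsWith("ja/") ? "ja" : "en";
}

// Bounded list: narrow to docLang first, then cap, so a translation pair takes one slot.
function pickFeatured(posts: FeaturedCandidate[], docLang: DocLang, limit: number): string[] {
  return posts
    .filter((p) => p.featured && entryLangLocal(p.id) === docLang)
    .slice(0, limit)
    .map((p) => p.id);
}

// Loosely bounded list: both halves of every pair are kept.
function pickFull(posts: FeaturedCandidate[]): string[] {
  return posts.map((p) => p.id);
}

const candidates: FeaturedCandidate[] = [
  { id: "astro-llms", featured: true },
  { id: "ja/astro-llms", featured: true },
  { id: "rss-setup", featured: true },
  { id: "ja/rss-setup", featured: true },
];

const featuredEn = pickFeatured(candidates, "en", 2); // ["astro-llms", "rss-setup"]
const fullList = pickFull(candidates); // all four ids
```

Without the language check, the same two slots would go to "astro-llms" and "ja/astro-llms", which is one piece of content listed twice.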
- Keep the draft exclusion in the shared renderer.
- Keep filterLang and docLang as separate axes.
- filterLang unset on the English version is deliberate. Don't read it later as a bug and add a filter.

Generate llms.txt from Content Collections as an Astro API route and the manual upkeep goes away. Split on filterLang and docLang and one renderer emits all three files.
The language cross-references in llms-full.txt, how it sits next to robots.txt, and how I word the usage/citation section are on the Aulvem site → Generating llms.txt and llms-full.txt in Astro for a Bilingual Site