Your LLM reads the whole file. It doesn't have to.

A developer created md2idx, a CLI tool that splits Markdown files at heading boundaries into a JSON index and sections, enabling LLM coding agents to read only relevant parts instead of entire files. This reduces token consumption by 80-98% and improves answer quality by avoiding context window bloat. The tool is available on GitHub and includes a skill for autonomous agent use.

Coding agents read specs, design docs, and long READMEs every day. Most of the time, they only need a few sections. Yet they load the entire file into context.

Here's a scenario that plays out constantly. You ask your agent to check the error handling section of a 5,000-line API spec. The agent opens the file, reads all 5,000 lines into its context window, finds the 80 lines it needs, and answers your question. The result is correct. But the agent also consumed a large number of tokens on the 4,920 lines it didn't need. Repeat this for every file read in a session, and the waste compounds fast.

The cost isn't just tokens. A context window stuffed with irrelevant content makes the agent's answers worse.

When a human picks up a 300-page technical book, they don't read cover to cover to find the chapter on authentication. They flip to the table of contents, scan the chapter titles, and jump to page 47. LLMs can do the same thing.

Markdown documents have a built-in structure: headings. A Title, followed by Section A, followed by Subsection A.1, creates a hierarchy that mirrors a book's table of contents. Split a Markdown file at heading boundaries, and you get a natural "table of contents + sections" structure. Each heading starts a new section, the heading text becomes the index entry, and the section number becomes the address.

This is the idea behind md2idx (https://github.com/oubakiou/md2idx), a CLI tool. md2idx converts a Markdown file into JSON with two fields: index < markers for depth
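
To make the heading-boundary split concrete, here is a minimal Python sketch of the idea. It is not md2idx's code and does not reproduce its actual JSON schema: the function name `split_markdown`, the entry keys `section`, `depth`, and `title`, and the choice to store depth as an integer are all assumptions made for illustration. Only the two top-level fields, an index and the sections, come from the description above.

```python
import json
import re

# ATX heading: one to six '#' characters, whitespace, then the title text.
HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
# Opening or closing line of a fenced code block (simple toggle, a sketch only).
FENCE = re.compile(r"^\s*(```|~~~)")


def split_markdown(text: str) -> dict:
    """Split Markdown at heading boundaries into an index and sections.

    Hypothetical layout, not md2idx's real schema: index[i] describes the
    heading that opens sections[i].
    """
    index = []
    sections = []
    in_fence = False  # '#' lines inside fenced code are not headings

    for line in text.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            # A heading starts a new section; its number is the address.
            index.append({
                "section": len(sections),
                "depth": len(match.group(1)),
                "title": match.group(2),
            })
            sections.append([line])
        elif sections:
            sections[-1].append(line)
        # Lines before the first heading are dropped in this sketch.

    return {"index": index, "sections": ["\n".join(s) for s in sections]}


doc = """# API Spec
Intro text.

## Authentication
Use a token.

## Error handling
Errors return JSON.

~~~
# not a heading
~~~
"""

result = split_markdown(doc)
# An agent would read the small index first...
print(json.dumps(result["index"], indent=2))
# ...then fetch only the section it needs, here "Error handling".
print(result["sections"][2])
```

The point of the two-step read is that the index stays small no matter how long the document is, so the agent pays for the full text of only the sections it actually opens.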