{"slug": "the-five-thousand-line-file", "title": "The Five-Thousand-Line File", "summary": "This article explains the concept of a \"god file\": a single, excessively large file in a codebase that has grown incrementally over time by accumulating unrelated functions and concepts. It argues that such files are particularly problematic for AI coding agents because they consume excessive context budget, dilute recognizable patterns, and encourage adding more code to the file rather than refactoring it. The key diagnostic for a god file is not its length but its lack of conceptual coherence, meaning its functions do not belong together under a single, clear purpose.", "body_md": "Every team has one. Sometimes it is called `utils.ts` or `helpers.py`. Sometimes it has the name of a domain concept that originally meant something specific and has since absorbed everything tangentially related. The file is large enough that nobody opens it casually. It has multiple maintainers, each of whom understands a different third of it. New additions go into it because that is where similar things already live, and the file gets larger.\nThis is the god file: a single file that has grown to do too much, and that resists the refactor that would split it because the refactor is large and the file mostly works.\nThe god file is one of the most agent-hostile shapes a codebase can take.\nNo file is born five thousand lines long. The growth is incremental.\nThe file starts as something reasonable: a module that handles one thing, three hundred lines, well-organized. A developer adds a function related to the thing. Another developer adds a function that is related to one of the existing functions, but also touches a new concept. The new concept does not justify its own file, so it lives in this one. Over time, the file accumulates concepts that share an author, a directory, or nothing in particular except convenience.\nThe reason nobody splits it is that splitting it is a project. 
The file is imported from many places. Each import has to be updated. The functions inside have shared private helpers that have to be sorted out. Tests for the file have the same problem in miniature. The estimate for the refactor is \"a sprint,\" and a sprint is always too expensive when the file is \"fine.\"\nSo the file stays. New developers add to it because the existing functions are there. The agent does the same. The file grows.\nThe god file is expensive for an agent in three specific ways.\nThe first is context budget. The agent loads files into its working memory to understand them. A five-thousand-line file consumes a large fraction of that budget for a small change. The agent has less room left for the rest of the codebase — the calling files, the tests, the conventions. Quality drops, not because the agent is dumber, but because it is operating with less situational awareness.\nThe second is pattern dilution. The agent pattern-matches against the file it is editing. A file with five hundred coherent lines teaches the agent one strong pattern. A file with five thousand lines teaches the agent ten weak patterns, often contradictory. The agent picks one, often the wrong one for the specific change.\nThe third is the path-of-least-resistance problem. When asked to add new functionality, the agent looks for where similar functionality lives. It finds the god file. It adds to the god file. The file grows by one more function, in the same shape as the previous additions. The agent, like every previous contributor, has chosen the cheap path. The file is now slightly more god-like.\nA small coherent file is a force multiplier for an agent. A god file is a tax.\nIt is worth being careful about the diagnosis. Not every large file is a god file. A file that defines a complex but coherent thing (a state machine, a parser, a single algorithm) may legitimately be large. The size is not the smell. 
The smell is unrelated things sharing a file.\nThe diagnostic question is: if you had to give this file a name that described what it does, in a single concept, could you? `parser.ts` is a coherent file even at three thousand lines, because everything in it is parser. `helpers.ts` is incoherent at five hundred lines, because nothing about the name tells you what is in it. The size is downstream of the coherence.\nA useful test: pick five functions from the file at random. Do they belong together? If yes, the file is big but legitimate. If no, the file is a junk drawer with a misleading name.\nThe right way to split a god file is not by line count. It is by concern.\nLook at the functions in the file. Group them by what they are for, not by what they touch. Two functions that both manipulate strings are not necessarily related; two functions that both implement steps of the same workflow are.\nFor each group, ask: would this group, alone, make sense as a file? Does it have a name that describes what it does? Are the dependencies between this group and the rest of the file mostly external, or mostly internal?\nGroups that score well on this become candidate files. Move them. The imports update mechanically. The tests follow. The original god file shrinks by one concept; the codebase gains a coherent module.\nThis is the kind of refactor an agent is good at, given a clear scope. \"Move these eight functions to a new file called `pricing.ts`, update all callers, and split the corresponding test file.\" A concrete instruction. The agent does the mechanical work. A human reviews the result.\nOnce you have done the initial split, the way to keep the file from re-growing is the same as with every other limit: make it mechanical.\nMost linters can enforce a maximum file length. Set the limit slightly above your current largest legitimate file. The build fails when a file exceeds it. 
New code cannot grow a file past the limit; it has to go somewhere else.\nThe limit is a forcing function, not a precise number. The point is not that 500 is correct and 501 is wrong. The point is that the team is forced to make an active decision when a file approaches the limit, instead of letting it drift past 1,000, 2,000, 5,000 without noticing.\nThe agent will respect the limit because the agent runs the linter. It will offer to put new functions in new files when the existing file is near the threshold. The default direction shifts from \"grow the god file\" to \"split the god file,\" which is what you wanted.\nIf your codebase has god files and you want to start fixing them:\nFind your largest source file. Count its lines. Note the number. Open it and look at the function list. Are they coherent? Or is the file a junk drawer?\nIf it is a junk drawer, pick the most distinct group of functions: the smallest set you can extract without untangling shared dependencies. Move that group to its own file. Update imports. Run tests. Ship the PR.\nAdd a `max-lines` rule to your linter, set 20% above the largest file you have decided to keep. The build now prevents new files from exceeding the limit.\nQuarterly, look at the file-length distribution. Pick the largest file. Split one group. Repeat.\nAdd a rule to AGENTS.md: \"When adding new functions, prefer creating a new file in a domain-appropriate directory over extending a large existing file. Files larger than [N] lines are a smell; do not extend them without splitting at the same time.\"\nThe god file did not arrive overnight. It will not leave overnight. But the trajectory matters. A team that splits one group per quarter is on a path toward a codebase made of coherent modules. A team that does not is on a path toward one file that contains everything, and an agent that gets worse the more it touches the codebase.\nThe size of any one file is small. 
The cost of letting them all grow is not.", "url": "https://wpnews.pro/news/the-five-thousand-line-file", "canonical_source": "https://dev.to/tacoda/the-five-thousand-line-file-die", "published_at": "2026-05-22 14:48:01+00:00", "updated_at": "2026-05-22 15:05:56.595095+00:00", "lang": "en", "topics": ["developer-tools"], "entities": [], "alternates": {"html": "https://wpnews.pro/news/the-five-thousand-line-file", "markdown": "https://wpnews.pro/news/the-five-thousand-line-file.md", "text": "https://wpnews.pro/news/the-five-thousand-line-file.txt", "jsonld": "https://wpnews.pro/news/the-five-thousand-line-file.jsonld"}}