{"slug": "the-supply-chain-risk-of-llm-code-in-dependencies", "title": "The Supply Chain Risk of LLM Code in Dependencies", "summary": "Maintainers using generative AI to write library code are introducing unprecedented quality, legal, and maintenance risks into the open-source supply chain, as downstream projects face broken changes, unmaintainable code, and ambiguous legal provenance. The shift in software economics, where LLMs dramatically lower the cost of producing low-quality code, threatens to create a 'lemon market' that drives high-quality libraries out of use.", "body_md": "# The Supply Chain Risk of LLM Code in Dependencies\n\nAs maintainers use generative AI to write library code, downstream projects face unprecedented quality, legal, and maintenance risks.\n\n[Ji-ho Choi](https://sourcefeed.dev/u/jiho_choi)\n\nThe open-source supply chain has always been fragile, but the nature of that fragility is shifting. For years, developers worried about malicious typosquatting, compromised maintainer accounts, and left-pad-style deletions. Today, a quieter, more insidious risk is creeping into our dependency trees: LLM-generated code.\n\nWhen Joey Hess, the creator of [git-annex](https://git-annex.branchable.com), spent roughly 100 hours over a month purging LLM-generated code from his project's dependency tree, it sounded to some like a quixotic crusade. But Hess's effort highlights a structural shift in how software is built and maintained. By introducing a `NoLLMDependencies` build flag, Hess forced a hard question: what happens when the libraries we rely on are written by machines, reviewed by tired humans, and shipped to production without anyone fully understanding the side effects?\n\nThis is not a theoretical debate about the ethics of AI training data. 
It is a practical, technical crisis of software quality, maintainability, and legal liability.\n\n## The Anatomy of LLM Slop in Production Libraries\n\nTo understand the risk, we have to look at what is actually getting merged into active open-source repositories. In his audit, Hess uncovered several alarming examples of what happens when maintainers treat LLMs as a shortcut to productivity.\n\nConsider the Haskell library `ram`, a fork of the unmaintained `memory` package. In version 0.21.0, the library introduced massive LLM-generated code churn, resulting in broken changes that had to be abruptly reverted in version 0.21.1 without any explanation. For downstream projects, this kind of churn is a nightmare. It introduces silent regressions that bypass basic compiler checks but fail under specific production workloads.\n\nEven more egregious is the impact on code readability and reviewability. In the `yesod` web framework (specifically starting in version 1.7.0.0), a commit was merged that featured an incoherent 1,489-line commit message accompanying over 10,000 lines of changes to a 26,000-line codebase. When LLMs make it easy to generate thousands of lines of syntactically correct but structurally opaque code, the traditional code review process breaks down. No human maintainer can thoroughly review a 10,000-line diff generated in seconds. The result is dark code (code that compiles and runs but is effectively unmaintainable).\n\nThen there is the legal dimension. The copyright status of LLM-generated code remains an open question. If an LLM is prompted to copy functionality from another project, it can easily generate code that skirts the edge of copyright infringement. Hess noted one such prompt in a dependency that only avoided infringement by pure luck. 
For enterprise users and long-term projects like git-annex, where future-proofing is a core design goal, shipping code with ambiguous legal provenance is a ticking time bomb.\n\n## The Lemon Market of Modern Software\n\nThe root of this problem lies in a fundamental shift in the economics of software development. In a classic economic paper, George Akerlof described the \"market for lemons,\" where buyers cannot distinguish between high-quality and low-quality goods, eventually driving high-quality goods out of the market entirely.\n\nLLMs have dramatically altered the relative cost of software quality. Before generative AI, writing a poorly designed, hard-to-maintain library still required a non-trivial amount of human effort. Writing a high-quality, well-tested library required perhaps twice as much effort. Today, a skilled developer might use an LLM to reduce the effort of writing a high-quality library by 25 percent. But a low-effort developer can use an LLM to reduce the effort of generating a massive, poorly understood library by 90 percent.\n\nThe market is suddenly flooded with cheap, polished-looking code that is actually full of architectural holes. Because this code compiles and passes basic tests, it looks like quality work. But it degrades the overall ecosystem, making downstream consumers increasingly suspicious of all dependencies.\n\nThe breach of trust here is not the use of LLMs as a typing assistant. The failure occurs when maintainers abdicate their role as gatekeepers, accepting massive, unreadable commits because the tool made it easy to generate them.\n\n## The Developer Angle: Defending Your Dependency Tree\n\nIf you want to protect your project from the risks of LLM-generated dependencies, the path is steep and full of compromises.\n\nThe most direct approach is the one pioneered by git-annex: pinning your dependencies to versions that pre-date the widespread adoption of LLM code. 
On [Codeberg](https://codeberg.org), a community project called `open-slopware` has begun tracking the last known untainted commit or version of various packages to help developers establish a baseline.\n\nHowever, this defense comes with severe trade-offs:\n\n- **The Security Dilemma**: If a critical security vulnerability is discovered in a dependency, and the maintainer only releases the patch in a newer version that contains LLM-generated code, you are trapped. You must choose between running a known, vulnerable library or accepting the unreviewed LLM code.\n- **Language Stagnation**: GHC (the Glasgow Haskell Compiler) is slated to include its first LLM-generated code in GHC 9.15. Because git-annex refuses to link against LLM-generated code, it must remain buildable with GHC 9.6.6. This prevents the project from adopting any new Haskell language features or performance improvements introduced in future compiler releases.\n- **The Maintenance Burden**: Auditing every transitive dependency for LLM usage is an exhausting, manual process. It requires digging through commit histories, analyzing prompt-like commit messages, and tracking down the origin of large, sudden code dumps. For most small teams, this level of scrutiny is simply unsustainable.\n\nIf you choose to adopt a strict \"no LLM\" policy for your dependencies, you can implement a build configuration similar to git-annex's `NoLLMDependencies` flag. 
This involves maintaining custom dependency lockfiles or build configurations (such as a custom `stack.yaml` in the Haskell ecosystem) that explicitly exclude tainted versions.\n\n```\n# Example of pinning to pre-LLM versions in a stack-NoLLMDependencies.yaml\nresolver: lts-22.0\npackages:\n- .\nextra-deps:\n- ram-0.20.0 # Pinning to version before LLM code churn in 0.21.0\n- persistent-2.14.3.0 # Pinning to version before 2.15.0.0\n- yesod-1.6.2.1 # Pinning to version before 1.7.0.0\n```\n\nThis approach works in the short term, but as the ecosystem moves forward, the gravity of new language features, performance updates, and security patches will make maintaining these pinned legacy versions increasingly difficult.\n\n## The Hard Choice Ahead\n\nWe are reaching a point where maintaining a completely human-written software stack will require freezing your development environment in the year 2023. For highly sensitive, long-term projects, that trade-off may be worth the cost. For the vast majority of web and application developers, it is a losing battle.\n\nThe realistic path forward is not a blanket ban, but a demand for better gatekeeping. Maintainers must treat LLM-generated contributions with higher scrutiny, not lower. If a contributor submits a PR with thousands of lines of code and an incoherent commit message, it must be rejected, regardless of whether it compiles. 
The tool we use to write the code is less important than our willingness to take responsibility for every single line we merge.\n\n## Sources & further reading\n\n- [No LLM Code in Dependencies](https://joeyh.name/blog/entry/no_LLM_code_in_dependencies/) (joeyh.name)\n- [No LLM code in dependencies | Lobsters](https://lobste.rs/s/oe8pxn/no_llm_code_dependencies) (lobste.rs)\n- [no llm code](https://git-annex.branchable.com/no_llm_code/) (git-annex.branchable.com)\n- [Stop Getting Average Code from Your LLM | Krzysztof Zabłocki](https://merowing.info/posts/stop-getting-average-code-from-your-llm/) (merowing.info)\n\n[Ji-ho Choi](https://sourcefeed.dev/u/jiho_choi) · Security & Cloud Editor\n\nJi-ho covers the increasingly tangled overlap between cloud architecture and security, drawing on a background as a penetration tester to keep his reporting grounded in real-world attack paths. He never lets a vendor claim go unquestioned and insists that every buzzword come with a proof of concept.", "url": "https://wpnews.pro/news/the-supply-chain-risk-of-llm-code-in-dependencies", "canonical_source": "https://sourcefeed.dev/a/the-supply-chain-risk-of-llm-code-in-dependencies", "published_at": "2026-07-03 23:03:15+00:00", "updated_at": "2026-07-03 23:21:46.974571+00:00", "lang": "en", "topics": ["large-language-models", "ai-safety", "ai-ethics", "ai-policy", "developer-tools"], "entities": ["Joey Hess", "git-annex", "Haskell", "ram", "memory", "yesod", "George Akerlof"], "alternates": {"html": "https://wpnews.pro/news/the-supply-chain-risk-of-llm-code-in-dependencies", "markdown": "https://wpnews.pro/news/the-supply-chain-risk-of-llm-code-in-dependencies.md", "text": "https://wpnews.pro/news/the-supply-chain-risk-of-llm-code-in-dependencies.txt", "jsonld": "https://wpnews.pro/news/the-supply-chain-risk-of-llm-code-in-dependencies.jsonld"}}