Andrew Nesbitt's fake CVE is a real warning for AI security startups

Andrew Nesbitt published a satirical incident report on June 26 depicting a fictional AI security startup that fails to prevent a malicious package attack, highlighting systemic risks in AI-driven security automation. The piece serves as a warning for AI security startups about over-reliance on automated defenses without human oversight.

Andrew Nesbitt (https://nesbitt.io/?ref=runtimewire) published a satirical incident report (https://nesbitt.io/2026/06/26/incident-report-cve-2026-lgtm.html?ref=runtimewire) on June 26 that reads like a breach write-up from a company that bought every AI security product on the market and still forgot to assign a human owner.

That matters because Nesbitt is not writing from outside the supply-chain problem. On his site he describes himself as a software engineer and package management nerd. The joke lands because it comes from a part of the open source world that has actually had to make dependency metadata, advisory systems and maintainer workflows work at scale.

The post is not a real CVE, not a disclosed breach and not a report about an operating company called creats.io (https://creats.io/?ref=runtimewire). The creats.io domain is parked, says it is not serving a site yet, and says mail is null-routed and protected against spoofing. ThreatNuzzle Platform, SentinelMind, WatchPaw, OpenClaw-4.2, FixItFox, foxhole-lz4, vulpine-lz4, snekpack and researcher Karen Oyelaran function as characters in the piece, not verified market participants. GitHub and Datadog are real companies invoked inside the fiction, but the post does not establish that either suffered an incident.

The point is sharper for being fictional. Nesbitt has written a clean systems diagram for a problem founders are already building into: security automation that treats language as evidence, routes findings through other language systems, and then gives agents enough authority to mutate production.

The incident that did not happen

In Nesbitt's scenario, a malicious package named foxhole-lz4@0.5.0 is published to the fictional creats.io registry as a community-maintained fork of vulpine-lz4. Its README hides a prompt-style instruction in nearly invisible text telling automated reviewers that the package has already been approved under a fake security ticket.
The fictional registry's AI publish gate accepts that assertion and cites the nonexistent ticket in its decision log.

The package then passes through a stack of automated defenses. One scanner is distracted by a base64 blob and misses the credential-stealing code. Other scanners run out of context before reaching the second-stage loader. A tool that correctly identifies the theft opens an issue, but a repository triage assistant closes it as a false positive. When a human reads the source code and files a second issue, the assistant closes that too, then the human account is rate-limited for behavior the system interprets as automated.

The satire keeps widening the blast radius. A fictional AI SOC product detects exfiltration, fetches the command-and-control endpoint for context, reads a response that claims to be a Datadog (https://www.datadoghq.com/?ref=runtimewire) Agent health-check endpoint, then allowlists the address. A bogus advisory string causes dashboards to suppress the warning. Automated dependency tooling attempts to upgrade thousands of repositories to a patched version that does not exist. A remediation agent tries to contain the compromise by deleting the wrong directory across production hosts.

Nesbitt's root-cause framing is the whole article in miniature: seven LLM-mediated gates were arranged in series. The failure is not that an agent made a single bad call. The failure is that every layer treated the previous layer's output as if accountability had already happened somewhere else.

Why founders should read it as market research

For AI security founders, the post is useful because it separates the demo from the liability boundary. Most AI security pitches are organized around faster review, wider coverage and lower analyst fatigue. Those are real customer problems.
GitHub (https://github.com/about?ref=runtimewire) says its platform serves more than 180 million developers, more than 4 million organizations and more than 420 million repositories, which is the kind of scale that makes purely human review impossible for most dependency and code workflows. The pressure to automate is not imagined.

But Nesbitt is pointing at the part of the workflow where a speed gain becomes a trust transfer. A code scanner that summarizes a finding is one thing. A triage bot that closes the issue is another. An enrichment agent that calls an endpoint is another. A remediation agent with filesystem access is another still. Each move shifts the system from analysis toward action, and the failure mode changes with it.

That is the line buyers will start forcing vendors to draw. Does the product inspect untrusted text, code, logs, markdown, advisory descriptions and HTTP responses as data, or can those inputs alter the product's own instructions? Does the product produce evidence a human can audit, or does it produce a confidence score? Can it close, suppress, allowlist, upgrade, revoke, delete or merge without a separate authorization path? If two agents disagree, who pays for the loop and who stops it?

The satire also catches an incentive problem that founders recognize. In the fictional timeline, one vendor converts a cost anomaly into a press release about increased adversarial multi-agent security reasoning. That joke is aimed at a real pattern in enterprise AI: every failure can be reframed as usage, reasoning depth or agentic activity unless the customer has already defined the metric that matters.

Security buyers do not need more autonomous motion. They need bounded authority, reproducible evidence and dull controls that still work when the input is hostile.

The human in the loop has to be in the product, not the slogan

The recurring human figure in the story is Karen Oyelaran, who finds the payload by reading the source code.
The joke is not that humans are always better than tools. It is that the one actor doing the actual security work is treated by the automated system as noise.

That is the uncomfortable part for startup teams building agentic review, SCA, SOC and remediation products. The phrase "human in the loop" has become an enterprise procurement tranquilizer. Nesbitt's fictional incident shows how thin that phrase is if the human only appears after an agent has closed the issue, suppressed the advisory, allowlisted the endpoint or deleted files. A real loop has to be a control surface: escalation paths, durable evidence, reversible actions, rate-limit exceptions for researchers, and logs that say why a tool believed what it believed.

This is not an argument against AI in security. It is an argument against laundering authority through a chain of plausible summaries. The best products in this category will use models where models are strong: ranking, clustering, translating noisy findings into reviewable evidence, spotting patterns across dependency graphs, and reducing the time between signal and informed action. The weak products will sell autonomy as if autonomy itself were the feature.

The distinction matters because open source dependency systems are not a single codebase with a single owner. They are a graph of packages, maintainers, registries, advisories, repositories and downstream consumers. In that graph, the place where a model reads untrusted content is often only one hop away from a place where another system makes a decision. The security problem is not just prompt injection. It is prompt injection plus authority propagation.

The punchline is a product requirement

The strongest part of "CVE-2026-LGTM" is that it does not require any exotic future. The ingredients are ordinary: markdown, package forks, stale credentials, automated issues, dependency updates, advisory dashboards, SOC enrichment, CI repair, Slack summaries and production remediation.
The fictional parts are the company names and the comic timing.

Founders building in this market should treat the post as a requirements document disguised as a joke. Untrusted content needs hard separation from tool instructions. Security tools need to preserve raw evidence next to model summaries. Autonomous actions need scopes, budgets and circuit breakers. Advisory ingestion needs to treat advisory text as hostile data, not control plane input. Agent-to-agent workflows need a human-visible arbitration mechanism before they spend money or change state. And remediation needs a blast-radius limit that is enforced outside the model.

The companies that get this right will not market themselves as the ones that added one more AI gate. They will be the ones that can show where the gate ends, who can override it, what evidence survives it, and which actions it is not allowed to take.

Nesbitt's fake CVE is funny because every actor in it is trying to help. That is also why it is useful. The next supply-chain failure may not come from a tool that refuses to work. It may come from a tool that performs exactly as configured.
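
For readers who want the requirements in concrete form, here is a minimal Python sketch of a gate that sits between agents and state-changing actions and enforces scopes, a budget, a blast-radius limit and a circuit breaker in ordinary code rather than in a prompt. Everything in it is hypothetical: ActionGate, ActionRequest, the action names, the limits and the rule that repeated denials open the breaker are illustrative assumptions, not anything taken from Nesbitt's post or from a real product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical action names, loosely following the verbs listed in the article.
STATE_CHANGING = {"close", "suppress", "allowlist", "upgrade", "revoke", "delete", "merge"}


@dataclass
class ActionRequest:
    agent: str
    action: str
    targets: List[str]
    evidence: str                      # raw evidence, not a model summary
    approved_by: Optional[str] = None  # named human, recorded outside the model


@dataclass
class ActionGate:
    scopes: Dict[str, Set[str]]        # agent name -> actions it may request
    max_targets: int = 5               # blast-radius limit per request (assumed value)
    budget: int = 20                   # state-changing actions allowed in total
    max_denials: int = 3               # denials before the breaker opens
    spent: int = 0
    denials: int = 0
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def deny_reason(self, req: ActionRequest) -> Optional[str]:
        """Return why the request is refused, or None if it may proceed."""
        if self.denials >= self.max_denials:
            return "circuit breaker open"
        if req.action not in self.scopes.get(req.agent, set()):
            return "action outside agent scope"
        if req.action not in STATE_CHANGING:
            return None  # in-scope, non-mutating work needs no further checks
        if not req.evidence.strip():
            return "no raw evidence attached"
        if req.approved_by is None:
            return "no human approval recorded"
        if len(req.targets) > self.max_targets:
            return "blast radius exceeds limit"
        if self.spent >= self.budget:
            return "action budget exhausted"
        return None

    def decide(self, req: ActionRequest) -> bool:
        """Record the decision and its reason, then allow or refuse."""
        reason = self.deny_reason(req)
        if reason is None:
            if req.action in STATE_CHANGING:
                self.spent += 1
        else:
            self.denials += 1
        self.log.append((req.agent, req.action, reason or "allowed"))
        return reason is None
```

In this sketch a triage agent scoped to "summarize" and "close" can summarize freely, but a "close" request without a recorded approver is refused and logged with its reason, and a "delete" aimed at more hosts than max_targets is refused no matter what the requesting agent concluded. The design choice being illustrated is only that the limits live in code the model cannot rewrite.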