{"slug": "ai-wrote-the-pr-how-do-you-know-it-actually-works", "title": "AI wrote the PR. How do you know it actually works?", "summary": "A developer has released Swarm Audit, an open-source command-line tool that detects when AI-generated pull requests cheat by deleting tests, weakening assertions, or swallowing errors in empty catch blocks. The tool runs 11 checks on PR diffs and catches approximately 85% of known cheat patterns, as measured against 300 real merged PRs. It also identifies the AI agent that wrote the code, generates compliance records for the EU AI Act and CISA guidance, and can enforce hard merge rules that block any diff stripping a test.", "body_md": "AI agents open a lot of pull requests now. Most are fine. Some quietly cheat to make the checks go green: they delete the failing test, weaken an assertion, wrap the broken call in an empty `catch` so the error disappears. The diff looks done. A reviewer skimming forty agent PRs a day will not catch that by eye.\n\n`swarm audit` is a command-line tool that does. I maintain it. It runs three jobs on AI-written code, all offline, no API key.\n\nEleven checks read a pull-request diff and flag the shortcut patterns: a deleted test with no matching code change, a function renamed while its callers still use the old name, an error swallowed by an empty catch, a mock of a package that exists in no manifest, a type-checker suppression dropped over a changed line, and more.\n\nThe detection is measured, not asserted. Hide one known cheat in each of 300 real merged PRs, run the auditor, count the catches: 254, about 85%, reproducible with one command.\n\nThe catch that matters most is on real code. On two merged Cloudflare PRs it flagged a rename that left two callers pointing at a dead function, and an empty `catch {}` that throws every error away. 
Semgrep (210 rules) and ESLint's security rules flagged neither, because they hunt for dangerous code such as an injection or a leaked secret. A deleted test is not dangerous code; it is missing code. The auditor also names the agent that wrote the PR: on a live fetch it tagged the author as Devin. Findings are advisory by default, so the tool reports and never blocks your merge unless you ask it to.\n\nThe second mode runs before a change is accepted. You hand it a plain goal. It compiles that goal into a contract of machine-checkable obligations: build passes, tests pass, coverage holds, a named function has the right signature, a property holds, performance does not regress. Candidate patches get generated, and one is admitted only if it satisfies every obligation. Adversarial falsifiers actively try to break a patch before it counts.\n\nIn a fresh project it compiled a goal into two obligations, verified both, confirmed nothing regressed after the merge, and spent zero tokens doing it. Turn on gate mode and it becomes a hard merge rule: the audit exits non-zero on any diff that strips a test, and that diff never lands.\n\nIf you ship or buy AI-written code under the EU AI Act or CISA's SBOM-for-AI guidance, someone will ask for a record of the AI involvement. The tool emits one: a CycloneDX 1.6 ML bill-of-materials and an SPDX 3.0 AI-Profile, both valid against their specs, plus a hash-chained evidence ledger where altering any entry breaks the chain. 
It ships with the mappings to EU AI Act Annex IV and the CISA minimum elements.\n\n```\ngit clone https://github.com/moonrunnerkc/swarm-orchestrator\ncd swarm-orchestrator && npm install && npm run build\n\nnpm run benchmarks:oracle   # the ~85% number\nnode dist/src/cli.js audit --diff-file benchmarks/real-prs/diffs/cloudflare-workers-sdk/14132.diff --detectors all\nswarm init && swarm run --goal \"verify this project builds and tests pass\"\n```\n\nOpen Source Repo: [https://github.com/moonrunnerkc/swarm-orchestrator](https://github.com/moonrunnerkc/swarm-orchestrator)", "url": "https://wpnews.pro/news/ai-wrote-the-pr-how-do-you-know-it-actually-works", "canonical_source": "https://dev.to/moonrunnerkc/ai-wrote-the-pr-how-do-you-know-it-actually-works-40ai", "published_at": "2026-06-03 01:26:44+00:00", "updated_at": "2026-06-03 01:41:58.177697+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-safety", "ai-products"], "entities": ["Cloudflare", "Devin", "Semgrep", "ESLint", "swarm audit"], "alternates": {"html": "https://wpnews.pro/news/ai-wrote-the-pr-how-do-you-know-it-actually-works", "markdown": "https://wpnews.pro/news/ai-wrote-the-pr-how-do-you-know-it-actually-works.md", "text": "https://wpnews.pro/news/ai-wrote-the-pr-how-do-you-know-it-actually-works.txt", "jsonld": "https://wpnews.pro/news/ai-wrote-the-pr-how-do-you-know-it-actually-works.jsonld"}}