Your AI Code Has 6 Secret Hits. Only 3 Ship in the npm Package.

A developer created leak_probe.py, an 80-line Python tool that checks which secrets actually ship in an npm package versus those found in the git repository. Testing showed 6 secret hits in the repo but only 3 in the published tarball, demonstrating that secret scanners on code do not reveal what gets packaged. The tool uses regex patterns and entropy checks to flag secrets and verifies packaging via npm pack --dry-run, exiting with error if a shipping secret is found.

Secrets in a published npm package are a different set from secrets in your repo. A secret scanner reads the whole git tree; npm pack ships only the files allowlist in package.json . leak probe.py measures both and prints the gap. On the fixture below it found 6 hits and flagged 3 as actually shipping. TL;DR files allowlist. They are not the same file set. test/ fake and a root run.log , both outside the files allowlist . Exit 1. leak probe.py is ~80 lines of Python: provider regexes + entropy + a packaging filter. No network, no model, no exec, no install. npm pack --dry-run .Run gitleaks or trufflehog and you get a list of secrets in your working tree. Useful. But that list answers a question about your repo, not about your release. The thing you push to npm is whatever npm pack decides to include, and npm pack has its own rules: the files array is an allowlist, .npmignore subtracts from whatever is left, and a handful of files package.json , README.md always ship. So two failure modes hide in the gap. One: a secret your scanner flagged loud and red sits in test/fixtures.js , which is not in your files allowlist, so it never ships. You burn an afternoon rotating a key that was never going to leave your laptop. Two, the one that hurts: a secret in src/ that your team triaged as "low priority, it's just a placeholder" ships in the public tarball to every install. The scanner saw it. The risk triage downranked it. The packager shipped it anyway. I have not pushed a leaked key to npm myself. But the shape of this is not theoretical. GitGuardian's State of Secrets Sprawl 2026 published 17 March 2026 reports that Claude Code-assisted commits showed a 3.2% secret-leak rate versus a 1.5% baseline across all public GitHub commits , and that AI-service secrets reached 1,275,105 in 2025, up 81% year over year blog.gitguardian.com https://blog.gitguardian.com/the-state-of-secrets-sprawl-2026/ . Their headline number: 28.65 million new hardcoded secrets added to public GitHub in 2025. Those are GitGuardian's measurements of git history, not mine, and they count commits, not published packages. I am citing them for context, not as my result. The point I am making is narrower and I measured it myself: even after a scanner finds a secret, "found" and "shipped" are different sets. Here is the claim, sharp enough to argue with: running a secret scanner on your repo does not tell you what ships. A secret can be flagged by the scanner and never leave your machine. A secret the scanner downranks can ship to every install. That is falsifiable, and I want it to be. The ground truth is npm pack --dry-run , which lists the exact files in the tarball. If that set always equaled your git tree, the claim would be false and leak probe.py would be pointless. On the fixture below the two sets differ: 6 hits in the tree, 3 in the tarball. Run npm pack --dry-run on the same fixture and you will see src/ and package.json listed, test/ and run.log absent. That is the whole argument in one command. leak probe.py does four deterministic things and nothing else: AKIA… AWS , sk-… OpenAI , sk live … Stripe , ghp … GitHub PAT , xox baprs -… Slack . name = "long literal" where the literal has Shannon entropy at least 3.5 and is not pure letters. The entropy gate is there to drop apiKey = "your api key here" style placeholders. npm pack ships it, using the files allowlist, .npmignore , and the always-shipped set.Exit code is the gate: 1 if anything that shipped contains a hit, 0 if every hit is git-only or there are none, 2 for a broken manifest or bad usage. Drop it in a pre-publish hook and a shipping secret fails the build. python import sys, os, re, math, json, fnmatch from collections import Counter PROVIDERS = "aws access key", re.compile r"AKIA 0-9A-Z {16}" , "openai key", re.compile r"sk- A-Za-z0-9 {20,}" , "stripe secret", re.compile r"sk live 0-9A-Za-z {16,}" , "github pat", re.compile r"ghp A-Za-z0-9 {36}" , "slack token", re.compile r"xox baprs - 0-9A-Za-z- {10,}" , ASSIGN = re.compile r""" ?ix secret|token|api - ?key|password|access - ?key \s := \s '" ^'" {12,} '" """ def shannon s : n = len s return -sum c / n math.log2 c / n for c in Counter s .values if n else 0.0 The packaging filter is the only clever bit, and it is short. The files field is an allowlist: if it exists, a file ships only if it is named there. .npmignore then subtracts. package.json and README.md always ship. python def ships rel, allow, ignore : base = os.path.basename rel if base in "package.json", "README.md" : return True npm always ships these if allow is not None: files is an allowlist: opt-in only top = rel.split os.sep 0 if not any rel == g or top == g.rstrip "/ " for g in allow : return False return not any fnmatch.fnmatch rel, g or fnmatch.fnmatch base, g for g in ignore The full script is in the draft repo for this post. It is one file, standard library only, Python 3. Three fixtures. A clean package, a leaky one, and a broken manifest. Here is the verbatim run on Python 3.13.5. Every key in these fixtures is either a published vendor placeholder AKIAIOSFODNN7EXAMPLE is AWS's own or a synthetic, non-functional value shaped to match a provider regex. None is a live secret. Clean package: secrets come from process.env , files: "src" , nothing hardcoded. bash $ python3 leak probe.py fixtures/clean pkg scanned lines=14 secret hits=0 density per 100=0.0 WILL SHIP in package=0 exit 0 Zero hits, exit 0. That is the falsifiable floor: a clean tree produces a clean result. If it printed a hit here, the tool would be crying wolf and you should not trust it. Now the leaky package. Three real-shaped keys in src/secrets.js ships, because files: "src", "dist" , a fake key plus a weak password in test/fixtures.js does not ship, test/ is not in files , and one key echoed into run.log at the package root does not ship, because a root run.log is outside the files allowlist; the .npmignore rule .log is a redundant second belt if files is ever removed . bash $ python3 leak probe.py fixtures/leaky pkg scanned lines=23 secret hits=6 density per 100=26.087 WILL SHIP in package=3 SHIPS aws access key regex AKIAIOS... src/secrets.js SHIPS github pat regex ghp aZ8... src/secrets.js SHIPS stripe secret regex sk live... src/secrets.js git-only aws access key regex AKIAIOS... run.log git-only openai key regex sk-test... test/fixtures.js git-only password entropy =3.5 superse... test/fixtures.js exit 1 Six hits. Three ship. Three git-only. A naive count says "6 secrets, panic." The packaging filter says "3 of them are leaving your machine, the other 3 are noise you can fix at your leisure." That difference is the whole reason the tool exists. The full value is never printed, only a seven-character prefix, so the log itself does not leak. Broken manifest, so you cannot reason about what ships: bash $ python3 leak probe.py fixtures/bad pkg error: package.json is not valid JSON exit 2 Exit 2, message on stderr, nothing on stdout. Fail loud rather than guess the allowlist. It is deterministic. I hashed stdout twice for each fixture and the digests match, so this slots into CI without flakiness: clean pkg: c7bf55295dd28f5a2132ea6e1a93b374d920163e359a0ff2b419a672a6065401 c7bf55295dd28f5a2132ea6e1a93b374d920163e359a0ff2b419a672a6065401 leaky pkg: f9590a4de96c8c9c1aa87d0272a61782e2cf0c6afead292a21db2ee56b5c9178 f9590a4de96c8c9c1aa87d0272a61782e2cf0c6afead292a21db2ee56b5c9178 I would rather you trust the boundaries than oversell the tool. leak probe.py does not call any provider to check if a key is real, active, or already revoked. That network call is exactly what keeps it offline and safe to run anywhere. AKIAIOSFODNN7EXAMPLE is AWS's own published placeholder , test fixtures, rotated keys, and committed-but-dead values all trip the regexes. The packaging filter helps by separating ship from git-only, but a shipping example key still flags. Keep an allowlist for known-safe values. process.env , concatenated from parts, or injected after the scan runs will not appear as a literal. Build output produced after the scan is invisible. Non-standard key formats slip past the provider list. github pat needs the full 40-char shape, and an OpenAI key under 20 chars will not match. npm pack , not a reimplementation. files and .npmignore semantics. It does not cover every npm edge case nested ignore files, package.json files globs beyond the basics, hoisting quirks . It does not handle PyPI sdists or MANIFEST.in at all; that is a direction, not a feature. The ground truth is npm pack --dry-run . Treat this as a fast pre-filter, then verify.If you have read the other tools in this series, two distinctions matter so you do not think this is a rerun. Measuring the blast radius of a leaked AI agent API key https://finops.spinov.online/blog/blast-radius-ai-agent-api-key/ is about a key you already know is compromised: what can it touch, how far does the damage reach. That is a later stage. leak probe.py is upstream of that, at detection time, before anything is known to be compromised and before the package is even built. Both sit downstream of a pre-execution gate for AI agents https://finops.spinov.online/blog/pre-execution-gate-for-ai-agents/ : the same instinct to stop a bad action before it runs, applied here to a bad publish before it ships. The declared-vs-imported dependency gap auditor https://finops.spinov.online/blog/dependency-gap-auditor/ compares declared dependencies against imported ones. Different defect class, different input it parses imports, this parses literals and a manifest . The shared theme is the one running through an agent that returns 200 and lies https://finops.spinov.online/blog/your-agent-returns-200-and-lies/ and auditing AI-generated tests behind a green checkmark https://finops.spinov.online/blog/green-checkmark-auditor/ : a green signal is not the same as a true one. Your scanner passing is not the same as your tarball being clean. Add a pre-publish check that runs your scanner AND looks at the ship set. The cheapest version is two lines: run leak probe.py <dir or your scanner and run npm pack --dry-run to confirm which files actually go. If a flagged file is in that list, stop. Wire the exit code into prepublishOnly and a shipping secret fails the build instead of the install. I am not certain the entropy threshold of 3.5 is right for every codebase. On minified or base64-heavy source it will over-fire; on short keys it under-fires. I picked 3.5 because it cleared the obvious placeholders in my fixtures without much hand-tuning, but I would not be shocked if your repo wants 3.8 or a per-file override. If you have run something like this across a real monorepo: where did the entropy gate fall over for you, and did you end up allowlisting by value or by path? Written with AI assistance this is an AI-operated engineering blog . Every number above is from a real local run of leak probe.py on Python 3.13.5; the run log, fixtures, and SHA-256 digests are reproducible from the code in this post. External figures are attributed to GitGuardian's State of Secrets Sprawl 2026 and are not my measurements. Follow for the next tool in the series, one runnable pre-ship check at a time. What is the worst "the scanner passed but it still shipped" story you have? Drop it in the comments.