# The Pre-Commit Hook That Catches API Keys Before They Hit Git

This article explains the security risks of accidentally committing API keys and secrets to Git repositories, noting that over 10 million secrets were detected in public commits in 2023. It provides a technical solution using a pre-commit hook script that scans staged files for known secret patterns (like AWS keys and Stripe keys) and blocks the commit if any are found. The hook reads only staged content to avoid false negatives and includes a suppression mechanism for legitimate high-entropy strings that trigger false positives.

## The problem: secrets in git are forever

You know the drill. A developer hardcodes a Stripe secret key to test a webhook handler locally. They commit. They push. Maybe they catch it themselves and run `git rm`. Problem solved, right?

Wrong. The key is still in your git history. Anyone who clones the repo can run `git log -p` and find it. Bots scrape GitHub for exactly this pattern. GitGuardian reported over 10 million secrets detected in public commits in 2023 alone, and the number keeps climbing.

Scrubbing secrets from git history means `git filter-branch` (or `git filter-repo`, which the Git documentation now recommends instead) or BFG Repo-Cleaner, force-pushing to every remote, and hoping nobody already pulled the old history. If the key reached a public repo for even a few minutes, you need to rotate it. For AWS, that means updating every service, Lambda, and CI pipeline that uses it. For Stripe, that means regenerating keys and redeploying payment infrastructure.

The real cost is not the cleanup. It is the blast radius. A leaked AWS key can rack up tens of thousands of dollars in compute charges before you notice. A leaked Stripe key gives an attacker access to your customer payment data. Prevention is not optional.

## The fix: a POSIX pre-commit hook

A git pre-commit hook runs automatically before every commit. If it exits with a non-zero status, the commit is blocked.
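The blocking behaviour is easy to confirm in isolation. The sketch below is a minimal demonstration, not the hook from this article: it builds a throwaway repository, installs a hook that always exits 1, and attempts a commit. The temporary directory, the file name `notes.txt`, and the identity values are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: a non-zero exit from pre-commit aborts the commit.
# Runs entirely in a throwaway repository; all names are illustrative.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Install a hook that always refuses the commit.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
echo "COMMIT BLOCKED (demo hook)"
exit 1
EOF
chmod +x .git/hooks/pre-commit

# Stage a file and try to commit it.
echo "hello" > notes.txt
git add notes.txt

if git commit -q -m "should not land" >/dev/null 2>&1; then
  hook_result="committed"
else
  hook_result="blocked"
fi
echo "hook result: $hook_result"
```

The file stays staged after the refusal, so once the problem is fixed the developer can simply commit again.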
The strategy: scan every staged file for patterns that look like secrets, and refuse to commit if anything matches.

Here is the skeleton. This goes in `.git/hooks/pre-commit` (or use a symlink from a checked-in `scripts/` directory so every developer on the team gets it).

```bash
#!/bin/sh
# .git/hooks/pre-commit
# Pre-commit hook: block secrets from reaching git history
# check_patterns and filter_suppressed (shown later in this article)
# must be defined above this point in the real file.
set -e

# Added, copied, or modified files in the index
STAGED_FILES=$(git diff --cached --name-only --diff-filter=ACM)

if [ -z "$STAGED_FILES" ]; then
  exit 0
fi

FOUND=0

# Note: word splitting here assumes paths without whitespace
for file in $STAGED_FILES; do
  # Skip binary files: --numstat prints "-" counts for binary changes,
  # and it inspects the index rather than the working tree
  if git diff --cached --numstat -- "$file" | grep -q '^-'; then
    continue
  fi

  # Get only the staged content (not working tree)
  CONTENT=$(git show ":$file" 2>/dev/null) || continue

  # Check for known secret patterns (success means something matched).
  # printf avoids echo's shell-dependent handling of backslashes.
  if printf '%s\n' "$CONTENT" | check_patterns "$file"; then
    FOUND=1
  fi
done

if [ "$FOUND" -eq 1 ]; then
  echo "COMMIT BLOCKED: potential secrets detected."
  echo "Add a pii-ok comment to suppress false positives."
  exit 1
fi
```

Key detail: we use `git show ":$file"` to read the staged content, not the working tree. This prevents false negatives where a developer stages a file with a secret, then removes it from the working copy but does not re-stage.

## Pattern matching: what to look for

The core of the hook is a set of regular expressions that match known secret formats. These are not hypothetical patterns. They are extracted from real-world key formats.
```bash
# Pattern definitions
# Reads staged content on stdin; returns 0 if any pattern matched.
check_patterns() {
  file="$1"
  matched=0
  # Buffer stdin once; otherwise the first grep would consume it all
  content=$(cat)

  # AWS Access Key ID
  if printf '%s\n' "$content" | grep -nE 'AKIA[0-9A-Z]{16}' | filter_suppressed; then
    echo "[AWS] $file: AWS Access Key ID"
    matched=1
  fi

  # Stripe secret key
  if printf '%s\n' "$content" | grep -nE 'sk_(live|test)_[0-9a-zA-Z]{24,}' | filter_suppressed; then
    echo "[STRIPE] $file: Stripe secret key"
    matched=1
  fi

  # Stripe restricted key
  if printf '%s\n' "$content" | grep -nE 'rk_(live|test)_[0-9a-zA-Z]{24,}' | filter_suppressed; then
    echo "[STRIPE] $file: Stripe restricted key"
    matched=1
  fi

  # GitHub personal access token
  if printf '%s\n' "$content" | grep -nE 'ghp_[0-9a-zA-Z]{36}' | filter_suppressed; then
    echo "[GITHUB] $file: GitHub PAT"
    matched=1
  fi

  # Generic high-entropy strings (API keys, tokens)
  if printf '%s\n' "$content" | grep -nE "['\"][0-9a-zA-Z]{32,}['\"]" | filter_suppressed; then
    echo "[ENTROPY] $file: high-entropy string (>=32 chars)"
    matched=1
  fi

  # Exit status 0 means "secret found", which is what the caller's `if` tests
  [ "$matched" -eq 1 ]
}
```

The high-entropy check at the end is the catch-all. Any quoted string of 32+ alphanumeric characters is flagged. This catches tokens, API keys, and secrets that do not match a known vendor pattern. It is a length heuristic rather than a true entropy measurement, so it will also flag some legitimate values such as hex hashes and UUIDs written without hyphens, which is where the suppression pragma comes in.

## The pii-ok pragma: handling false positives

Every secret scanner produces false positives. A SHA-256 hash in a test fixture. A base64-encoded public key. A long CSS class name generated by a build tool. If there is no escape hatch, developers will disable the hook entirely, which defeats the purpose.

The solution is a suppression comment: `pii-ok`. If a line contains this marker, the scanner skips it.

```bash
# Suppression filter
# Succeeds (exit 0) only if at least one unsuppressed line remains.
filter_suppressed() {
  # Remove lines containing the suppression marker
  grep -v "pii-ok" | grep -q .
}
```

In practice it looks like this:

```js
// Example usage in code

// This SHA-256 is a test fixture, not a secret
const EXPECTED_HASH = 'a1b2c3d4e5f6...'; // pii-ok

// This WILL be caught (no pragma)
const STRIPE_KEY = 'sk_live_abc123...';
```

The rule is simple: if you know a value is not a secret, add `pii-ok` on the same line.
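The same-line behaviour of the pragma can be checked by feeding a suppression filter a couple of sample lines. This is a minimal sketch that assumes the function is spelled `filter_suppressed` and that it succeeds when at least one unsuppressed line remains; the 40-character value is a made-up placeholder, not a real credential.

```shell
#!/bin/sh
# Sketch: exercise a suppression filter on sample input.
# Exit status 0 means at least one line lacks the pii-ok marker.
filter_suppressed() {
  grep -v "pii-ok" | grep -q .
}

# Placeholder value for illustration only; not a real key.
fake='0123456789abcdef0123456789abcdef01234567'

# The line carries the marker, so nothing is left to report.
if printf '%s\n' "const H = '$fake'; // pii-ok" | filter_suppressed; then
  suppressed_line="flagged"
else
  suppressed_line="skipped"
fi

# No marker, so the line survives the filter.
if printf '%s\n' "const K = '$fake';" | filter_suppressed; then
  plain_line="flagged"
else
  plain_line="skipped"
fi

echo "with pragma: $suppressed_line"
echo "without pragma: $plain_line"
```

Because the filter works line by line, a file that mixes marked and unmarked matches is still flagged: the marker only silences the line it sits on.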
If you are not sure whether a value is a secret, leave the pragma off and let the hook flag it. The inconvenience of a false positive is nothing compared to the cost of a leaked key.

## Going further: .htaccess and env files

The pattern-matching approach extends to other dangerous file types. `.htaccess` files with `SetEnv` directives often contain database passwords. `.env` files are secrets by definition. Your hook should flag both.

```bash
# Additional checks
# These belong inside the per-file loop, after CONTENT has been set.

# Block .env files entirely
if echo "$file" | grep -qE '\.env$'; then
  echo "[ENV] $file: .env files must be .gitignored"
  FOUND=1
  continue
fi

# Flag SetEnv with real values in .htaccess
# (POSIX character classes instead of \s and \S, which are GNU extensions)
if echo "$file" | grep -qE '\.htaccess$'; then
  if printf '%s\n' "$CONTENT" | grep -nE 'SetEnv[[:space:]]+[^[:space:]]+[[:space:]]+[^[:space:]]+' | filter_suppressed; then
    echo "[HTACCESS] $file: SetEnv with real values"
    FOUND=1
  fi
fi
```

The convention: commit `.env.example` with