{"slug": "i-automated-opt-outs-for-500-data-broker-sites-open-source", "title": "I automated opt-outs for 500 data broker sites (open source)", "summary": "Open-source macOS tool called \"auto-identity-remove\" that automates the process of opting out of over 500 data broker and people-search sites. The script runs monthly, uses AI-powered CAPTCHA solving, tracks completed opt-outs to avoid resubmission, and sends iMessage notifications with results. The tool is designed for users who want a free, self-hosted alternative to paid data removal services.", "body_md": "Automated data broker opt-out runner for macOS, Linux, and Windows. Removes your personal information from **500+ people-search sites and data broker databases** on a monthly schedule - with CAPTCHA solving, persistent state tracking (so completed opt-outs aren't resubmitted every run), and an iMessage notification when done.\n\nEach month, the script:\n\n- **Searches** each data broker site for your name + state\n- **Finds your specific listing** (for sites that need a profile URL)\n- **Fills and submits** the opt-out form automatically\n- **Solves CAPTCHAs** via [CapSolver](https://capsolver.com) (AI-powered, ~$0.001/solve)\n- **Skips** brokers you were already removed from recently (90-day re-check window)\n- **Sends you an iMessage** with the results summary\n- **Opens** any sites that require manual action in your browser\n\nRequirements:\n\n- Node.js 18+\n- macOS, Linux, or Windows (scheduling adapts automatically)\n- [Playwright](https://playwright.dev) browsers installed\n\n```\n# 1. Clone the repo\ngit clone https://github.com/stephenlthorn/auto-identity-remove.git\ncd auto-identity-remove\n\n# 2. Install dependencies and the Playwright Chromium browser\nnpm install\nnpx playwright install chromium\n\n# 3. Run interactive setup (creates config.json and schedules the monthly job)\nnode setup.js\n\n# 4. 
Run manually anytime\n./run.sh\n```\n\n`node setup.js` guides you through:\n\n| Step | What it does |\n|---|---|\n| Personal info | Name, city, state, ZIP, email, phone |\n| Aliases | Past names or variations (e.g. \"Steve Doe\") |\n| CapSolver key | For CAPTCHA-protected opt-out forms |\n| One-time accounts | Creates accounts on sites that require login (stored in `config.json`, gitignored) |\n| iMessage | Phone number to text the results summary to |\n| Monthly schedule | Registers a monthly job to run on the 1st at 9am (launchd / systemd / crontab / schtasks - detected automatically) |\n\n**Your personal info never leaves your machine.** `config.json` and `state.json` are both gitignored.\n\nSome opt-out forms have reCAPTCHA. Without CapSolver, those sites go to your manual list instead of being handled automatically.\n\n- Sign up at [capsolver.com](https://capsolver.com) - free, pay-as-you-go\n- Add $1-2 of credits (enough for months of use at ~$0.001/solve)\n- Paste your API key when `setup.js` asks, or add it to `config.json`:\n\n```\n\"capsolver\": {\n  \"apiKey\": \"CAP-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\"\n}\n```\n\nCapSolver is optional. Without it, CAPTCHA-protected sites are flagged as manual and opened in your browser for completion. Pass `--no-capsolver` to skip them entirely rather than opening the browser.\n\nThe included `Dockerfile` uses the official Playwright image, so Chromium and all system dependencies are pre-installed. 
No Mac required.\n\n```\n# Build the image (once)\ndocker build -t auto-identity-remove .\n\n# Dry-run (forms are filled but nothing is submitted)\ndocker run --rm \\\n  -v \"$(pwd)/config.json:/app/config.json\" \\\n  -v \"$(pwd)/state.json:/app/state.json\" \\\n  auto-identity-remove node watcher.js --dry-run\n\n# Full run\ndocker run --rm \\\n  -v \"$(pwd)/config.json:/app/config.json\" \\\n  -v \"$(pwd)/state.json:/app/state.json\" \\\n  auto-identity-remove\n```\n\n**Persistent state:** mount `state.json` so completed opt-outs are remembered between container runs. If the file does not exist yet, create an empty one first: `echo '{}' > state.json`.\n\nWhen running headless or in Docker you won't have iMessage or a desktop - use a webhook instead. Set `notify.webhook` in `config.json` to any ntfy.sh, Slack incoming-webhook, or Discord webhook URL:\n\n```\n\"notify\": {\n  \"textTo\": \"\",\n  \"webhook\": \"https://ntfy.sh/my-private-channel\"\n}\n```\n\nThe tool POSTs `{\"text\": \"<summary>\"}` after every run. Works on macOS, Linux, and Windows - the webhook fires in addition to (not instead of) any platform notification that is available.\n\n```\nauto-identity-remove/\n├── setup.js            ← Run once: interactive setup + scheduling\n├── watcher.js          ← Main runner\n├── brokers.js          ← Broker list with opt-out strategies\n├── run.sh              ← Manual trigger\n├── config.example.json ← Template (copy → config.json)\n├── package.json\n├── .gitignore\n│\n├── config.json         ← YOUR personal info (gitignored, created by setup.js)\n├── state.json          ← Opt-out history / skip logic (gitignored)\n└── logs/               ← Per-run JSON logs (gitignored)\n```\n\n`state.json` tracks when each broker was last successfully opted out. 
The default re-check window is **90 days** - brokers typically re-add your data within that window, so the script re-submits when it's time.\n\n```\n{\n  \"optOuts\": {\n    \"Spokeo\": {\n      \"lastSuccess\": \"2026-05-01T09:00:00.000Z\",\n      \"totalRuns\": 3,\n      \"detail\": \"\"\n    }\n  }\n}\n```\n\nOn each run you'll see:\n\n- `✅ Submitted (form accepted)` - opt-out form was submitted this run\n- `📧 Awaiting email confirm` - broker replied \"check your email to confirm\"; click the link in your inbox. Auto-retried after 14 days if still pending.\n- `⏭ Skipped (fresh)` - removed recently, re-check not due yet\n- `🔍 Not listed` - your name wasn't found on that site\n- `📋 Manual needed` - opened in your browser for you to handle\n- `❌ Error` - network/timeout issue, will retry next run\n- `💀 Dead (stale URL)` - broker URL is gone (DNS/404); not counted as an error\n\nSubmitted ≠ confirmed deleted. Use `node watcher.js --verify` for spot-check verification. See [STATUS.md](https://github.com/stephenlthorn/auto-identity-remove/blob/main/STATUS.md) for a per-broker confidence table.\n\nThis tool covers 500+ data brokers in two tiers:\n\n| Tier | Count | Confidence |\n|---|---|---|\n| Explicit brokers | 42 | Hand-mapped with specific selectors. `verified` entries have been tested live; `untested` ones may have drifted since they were added. |\n| Generic runner | ~490 | Best-effort heuristic - tries 4 strategies (Do Not Sell click, OneTrust/TrustArc, generic form, DSAR link). Many succeed; some fail silently. |\n\nThe `✅ Submitted` count means the form was accepted by the broker. It does **not** prove deletion. To check:\n\n- Run `node watcher.js --verify` - re-searches each broker where a successful opt-out was recorded and reports whether your name still appears. 
\n- Look at the `📧 Awaiting email confirm` section after each run - these are half-done until you click the link.\n\nIf you want to know exactly which brokers are hand-verified vs heuristic, see [STATUS.md](https://github.com/stephenlthorn/auto-identity-remove/blob/main/STATUS.md).\n\n| Site | Method |\n|---|---|\n| Spokeo | Search → find listing → opt-out form |\n| WhitePages | Search → find listing → suppression form |\n| FastPeopleSearch | Search → opt-out form |\n| TruePeopleSearch | Direct opt-out form |\n| BeenVerified | Opt-out search form |\n| Radaris | Search → privacy form |\n| Intelius | Direct opt-out form |\n| PeopleFinders | Direct opt-out form |\n| PeopleSmart | Direct opt-out form |\n| MyLife | Search → opt-out |\n| Nuwber | Search → removal form |\n| FamilyTreeNow | Direct opt-out form |\n| CheckPeople | Direct opt-out form |\n| ThatsThem | Direct opt-out form |\n| USPhonebook | Direct opt-out form |\n| PublicDataUSA | Direct opt-out form |\n| SmartBackgroundChecks | Direct opt-out form |\n| SearchPeopleFree | Direct opt-out form |\n| PeopleSearchNow | Direct opt-out form |\n| InfoTracer | Direct opt-out form |\n| SocialCatfish | Direct opt-out form |\n| NationalPublicData | Direct opt-out form |\n| ClustrMaps | Direct opt-out form |\n| PrivateRecords | Direct opt-out form |\n| Acxiom | Direct form (feeds dozens of downstream brokers) |\n| LexisNexis | Direct form (legal/financial data) |\n| ZoomInfo | Direct form (B2B professional data) |\n| Clearbit | Direct form (B2B enrichment data) |\n| Pipl | Email opt-out via Mail.app |\n\n`generic-runner.js` handles the remaining ~470 brokers from two public datasets, including [BADBOOL](https://github.com/yaelwrites/Big-Ass-Data-Broker-Opt-Out-List).\n\nFor each site it tries four strategies in order:\n\n- Click a \"Do Not Sell My Personal Information\" button\n- Opt out via OneTrust / TrustArc / Osano privacy manager\n- Fill any generic opt-out form (email, name, state) and submit\n- 
Find and record a DSAR / data request link for manual follow-up\n\nSites requiring manual action are opened in your browser automatically.\n\n| Site | Why manual |\n|---|---|\n| Google - Results About You | Requires Google account interaction |\n| Google - Outdated Content | Case-by-case URL submission |\n\nTo add a broker, edit `brokers.js` and add an entry:\n\n```\n{\n  name: 'NewBrokerSite',\n  method: 'direct-form',           // or 'search-form', 'email', 'manual'\n  optOutUrl: 'https://example.com/opt-out',\n  formFields: {\n    'input[name*=\"first\" i]': F,   // F, L, N, E, ST, Z are from config\n    'input[name*=\"last\" i]':  L,\n    'input[type=\"email\"]':    E,\n  },\n  submitSelector: 'button[type=\"submit\"]',\n  captchaLikely: false,\n  priority: 2,\n}\n```\n\nPRs welcome - especially for brokers with verified working selectors.\n\nTo run manually at any time:\n\n```\n./run.sh\n```\n\n**Dry-run mode** - navigates to each site and fills forms but does NOT submit anything. Good for verifying what the script will do before your first real run:\n\n```\nnode watcher.js --dry-run\n```\n\nOr to run in the background and log output:\n\n```\n./run.sh >> logs/manual-run.log 2>&1 &\n```\n\nRun a read-only spot-check to see whether previous opt-outs are still in effect:\n\n```\nnode watcher.js --verify\n```\n\nThis opens a browser, searches each broker where you have a recorded successful opt-out, and reports what it finds. 
No forms are submitted, nothing is written to `state.json`.\n\nOutput is grouped into three sections:\n\n| Section | Meaning |\n|---|---|\n| `VERIFIED CLEAR` | Your name was not found in the broker's search today |\n| `STILL LISTED` | A listing was found - the opt-out may have failed, or your data was re-added |\n| `UNVERIFIABLE` | The broker uses a direct-form, email, or manual method - no automated search signal exists to check |\n\nA dated JSON report is saved to `logs/verify-YYYY-MM-DD.json`.\n\n**Important caveats:**\n\n- Only `search-form` brokers (those with a `searchUrl` and `listingPattern`) can be checked automatically. Direct-form and email opt-outs are always `unverifiable`.\n- \"Verified clear\" means your name was not found in one search today. It is **not** a legal guarantee of deletion. Brokers routinely re-ingest data from upstream sources.\n- \"Still listed\" can mean the opt-out failed **or** the broker re-added your data since the last successful opt-out was recorded. Either way, re-running `node watcher.js` will attempt removal again.\n- If the broker's search page is down or slow, the result is classified as `unverifiable` (a timeout is not counted as \"still listed\").\n\nWARNING: This feature may violate broker Terms of Service. Submitting fabricated opt-out requests to data broker sites is ethically questionable and could expose you to legal risk. Use at your own discretion. This feature is off by default and is provided only as a research/experimental tool.\n\nThe `--pollute N` flag submits `N` randomly-generated fake person records to data brokers that are explicitly tagged `acceptsBogus: true` in `brokers.js`. 
The goal (inspired by a suggestion on Hacker News) is to flood broker databases with junk records, degrading the accuracy of their search results.\n\n```\n# Submit 10 bogus records to each acceptsBogus broker\nnode watcher.js --pollute 10\n```\n\nEach fake record uses:\n\n- A random name from a small fixture list (not real people)\n- A US city/state/zip from a fixture of 50+ valid combos (not your address)\n- A 10-digit phone with an area code valid for the fake state\n- A randomised `firstname.lastname+XXXXXX@gmail.com` email\n\nOnly brokers tagged `acceptsBogus: true` in `brokers.js` will receive noise submissions. Currently tagged: ThatsThem, SearchPeopleFree, PeopleSearchNow, InfoTracer, SocialCatfish. These are direct-form brokers with no SSN/DOB gate.\n\n**Regular opt-outs run first** - noise submissions happen after the normal run. The `--pollute` flag has no effect on your real opt-out submissions.\n\nThe Markup dataset is years old; many of the ~489 generic opt-out URLs now 404 or fail DNS lookup. These are classified as `💀 Dead (stale URL)` in run output and do **not** count as errors.\n\nAfter several runs have accumulated in `logs/`, trim permanently-dead hostnames from future runs so they are skipped without any network request:\n\n```\nnode scripts/prune-dead.js\n```\n\nThe script:\n\n- Reads every `logs/run-*.json` file\n- Finds hostnames whose status was `dead` in **every** run they appeared in\n- Merges them into `data/dead-urls.json` (deduped, sorted)\n- Prints a summary of how many new hosts were added\n\nThe script is **idempotent** - running it twice produces no change. 
You can add it as a post-run step or run it manually whenever you want to prune the dead list.\n\n`data/dead-urls.json` is committed to the repo so the dead list is shared with all clones.\n\nTo remove the scheduled monthly job:\n\n| Platform | Command |\n|---|---|\n| macOS (launchd) | `launchctl unload ~/Library/LaunchAgents/com.auto-identity-remove.plist` then `rm ~/Library/LaunchAgents/com.auto-identity-remove.plist` |\n| Linux (systemd) | `systemctl --user disable --now auto-identity-remove.timer` then `rm ~/.config/systemd/user/auto-identity-remove.{service,timer}` |\n| Linux (crontab fallback) | Run `crontab -e` and delete the `auto-identity-remove` line |\n| Windows (schtasks) | `schtasks /Delete /TN auto-identity-remove /F` |\n\nThis tool supports non-US users with a few important caveats.\n\n- `setup.js` will prompt for **Country** (2-letter ISO code, e.g. `CA`, `GB`, `AU`) and then replace the US-centric \"State\" / \"ZIP code\" prompts with **Province/Region** and **Postal code** prompts that accept any format (`K1A 0A6`, `SW1A 1AA`, `2000`, etc.) with no coercion.\n- Phone numbers for non-US users are stored verbatim - no `(xxx) xxx-xxxx` reformatting is applied.\n- `lib/forms.js` automatically tries province/postal/postcode HTML field variants (e.g. `input[name*=\"province\"]`, `input[name*=\"postcode\"]`) when filling forms for non-US users, with no change needed in broker definitions.\n- A country `<select>` on opt-out forms is targeted and filled with your 2-letter country code when present.\n- Global brokers (ZoomInfo, Clearbit, Acxiom, Radaris, etc.) are attempted for all users.\n\nThe following brokers are flagged `usOnly: true` and are silently skipped when your configured country is not `US`. 
These sites index US public records, voter data, or phone directories - someone who has never lived in the US is very unlikely to have a record to remove there:\n\n| Broker | Reason |\n|---|---|\n| Spokeo | US people-search (state-keyed search) |\n| WhitePages | US white-pages directory |\n| FastPeopleSearch | US people-search |\n| TruePeopleSearch | US people-search |\n| BeenVerified | US background-check (requires US state) |\n| USPhonebook | US phone directory |\n| PublicDataUSA | US public records |\n\nAll other brokers in the list are attempted regardless of country.\n\nUS people-search sites (`Spokeo`, `WhitePages`, etc.) hold records sourced from US public records - if you have never lived in the US, your data is very unlikely to appear on these sites. The script skips them for you automatically.\n\nA fair concern raised by some users: aren't you just confirming your data to the brokers by filling out their forms?\n\nA few things worth knowing:\n\n- **These brokers already have your info.** You're not revealing anything new - you're using the legally required removal mechanism they're obligated to provide.\n- **CCPA (California) and similar state laws require brokers to honor opt-out requests.** Where those laws apply, submitting the form creates a legal obligation to remove you. Doing nothing does not.\n- **The script uses info you're already listed under** - your name as it appears publicly, your state, your email. It doesn't add new data points.\n- **The alternative is worse.** Every month that passes, more brokers scrape and resell your data. Opt-outs are imperfect, but they work more often than not.\n\nThat said: if you're in a situation where even confirming your email address to a broker is a risk, this tool is not the right approach. Consider a paid service that uses a proxy email.\n\nCalifornia is launching an official **Delete Me** opt-out registry on August 1, 2025. 
Once registered, data brokers are legally required to delete your info automatically - no individual form submissions needed for participating brokers.\n\nRegister at: **optoutregistry.oag.ca.gov** (live August 1)\n\n**Recommended:** Register with the CA Delete Registry first, then run this script for the brokers that aren't covered.\n\nPaid services like [Incogni](https://incogni.com) ($96/yr) or [Optery](https://optery.com) ($39/yr) are excellent and cover more brokers with professionally maintained opt-out flows. This tool is for people who want full control, transparency, and no recurring subscription - or who want to handle the gaps those services miss (Acxiom, LexisNexis, ZoomInfo, Clearbit).\n\nUsing both is the strongest approach: a paid service for the bulk of brokers + this script for the gaps.\n\nLicense: MIT", "url": "https://wpnews.pro/news/i-automated-opt-outs-for-500-data-broker-sites-open-source", "canonical_source": "https://github.com/stephenlthorn/auto-identity-remove", "published_at": "2026-05-18 11:32:42+00:00", "updated_at": "2026-05-18 14:31:24.226734+00:00", "lang": "en", "topics": ["open-source", "developer-tools", "cybersecurity", "data", "products"], "entities": ["CapSolver", "Playwright", "Node.js", "macOS", "OneTrust", "TrustArc", "Osano", "Spokeo"], "alternates": {"html": "https://wpnews.pro/news/i-automated-opt-outs-for-500-data-broker-sites-open-source", "markdown": "https://wpnews.pro/news/i-automated-opt-outs-for-500-data-broker-sites-open-source.md", "text": "https://wpnews.pro/news/i-automated-opt-outs-for-500-data-broker-sites-open-source.txt", "jsonld": "https://wpnews.pro/news/i-automated-opt-outs-for-500-data-broker-sites-open-source.jsonld"}}