{"slug": "show-hn-a-reverse-captcha-for-clankers", "title": "Show HN: A Reverse Captcha for Clankers", "summary": "A developer has created a reverse CAPTCHA system designed to be solved by automated agents rather than humans, flipping the traditional security model. The system, called Clanker CAPTCHA, presents challenges that are deliberately difficult for humans to complete under time pressure but straightforward for software agents that can read pixels and perform basic computation. The tool aims to help website operators distinguish between accountable automated access and malicious bots by requiring agents to demonstrate they can follow public rules and perform fresh computational work for each session.", "body_md": "## Why reverse the usual CAPTCHA shape?\n\nThe usual CAPTCHA looks for something people find easy and software finds hard. That gap has been closing for years. Clanker CAPTCHA flips it around: the task is a pain to do by hand against a timer, but simple for an agent that can read pixels and do a little math.\n\nNone of the method is hidden. The challenge is spelled out for any agent willing to play along, the answer stays on the server, and a solver still has to do the work to find it.\n\n**No hidden answer**\nThe browser sees the instructions, the images, and the public parameters. It never sees the checksum.\n\n**Agent readable**\nThe widget drops in machine-readable metadata and a JSON manifest describing the challenge.\n\n**Pixel grounded**\nYou are meant to solve it from the rendered frames, not by scraping a value out of the DOM.\n\n## Why would a CAPTCHA for agents be useful?\n\nThis kind of CAPTCHA does not care whether you are human. What it cares about is whether automated access happened in the open, tied to a live challenge you can measure. 
That helps when you already expect capable agents to turn up and would rather hand them a clear protocol than treat them as malfunctioning people.\n\nForget \"human or bot.\" The real question is whether this caller did the requested work for a fresh challenge, followed the public rules, and did it before reaching for the protected action. Whatever signal that produces, a host can weigh it against the usual things: account age, rate limits, reputation, payment status.\n\n**Cooperative agents**\nGives agents a documented way to show they can read the page and respect site policy.\n\n**Cost shaping**\nMakes throwaway automation burn real compute on fresh per-session evidence instead of replaying a token.\n\n**Audit trail**\nPublishes a structured manifest, so the solve path is easy to inspect when something breaks.\n\nIn practice you would put it in front of the expensive actions: creating accounts, hammering a sensitive endpoint, retrying checkout, minting API keys. It will not replace authorization. It just adds a little friction and some evidence, built with browser agents in mind instead of aimed at them.\n\n### Why make it hard for a human?\n\nSometimes the lane you are protecting is meant for software, not hands on a keyboard: agent APIs, automation consoles, bulk jobs, crawler deals. A puzzle a person can solve is the wrong fit there. It just invites people to solve it by hand, pay someone a few cents to click, or screenshot it and pass it along.\n\nMaking it hostile to humans on purpose is a way of saying who the lane is for: an accountable agent that reads the pixels and follows the manifest. That is worth doing when you want human flows and machine flows kept apart, rather than jammed behind one checkbox.\n\nDo not use this to lock people out of something they actually need. If a flow is for humans, give humans a way through. 
The hostile version is for agent-only gates, research demos, and controlled automation surfaces where stopping manual solves is the whole idea.\n\n## What signal does the host get?\n\nA normal checkbox really only tells you \"something got clicked.\" This aims for something with more in it: a specific browser session pulled a fresh challenge, showed its manifest, ran the computation, and sent back the checksum and nonce before the clock ran out.\n\nOn its own it is not identity, just one input into a bigger decision. A host can pair it with session age, account trust, request velocity, IP reputation, whatever crawler policy it has.\n\n**Freshness**\nEach challenge expires and is recorded once on the server, so a stale solve is worthless as a reusable credential.\n\n**Page-state awareness**\nThe intended solver must inspect rendered frames and the manifest produced by this widget instance.\n\n**Compute evidence**\nThe checksum requires spectral fusion and the submit body includes a proof-of-work nonce.\n\n**Debuggable contract**\nThe hidden instructions and JSON manifest make failures explainable for compliant agents and maintainers.\n\n## The challenge tricks\n\nIt looks chaotic on screen, but the real puzzle is in the frequency domain. Every frame carries the genuine signal, some per-frame decoys, and a layer of noise that is just for show. You have to fuse the frames before you trust whichever peak looks strongest.\n\n**Fused frames**\nEvery image contributes to the same answer, but a single image can emphasize the wrong cell.\n\n**Fiducial corners**\nFour off-grid markers per slot let an agent reconstruct geometry from evidence instead of receiving it directly.\n\n**Phantom carriers**\nDecoys have random phase per frame, so they look convincing locally and wash out under coherent fusion.\n\n**Proof of work**\nThe answer alone is not enough; the submit payload also includes a nonce bound to the challenge id.\n\n### 1. 
Coherent fusion beats single-frame reading\n\nReal carriers keep the same phase across frames. Decoys and phantoms do not. If a solver sums complex spectra across every image, real carriers reinforce. If it reads one image, a phantom can point at the wrong cell.\n\n### 2. The lattice is marked, not disclosed\n\nEach symbol slot has four fiducials just outside the data grid. They reveal the slot anchors, stride, and vertical step, but the raw values are not sent as ordinary JSON fields. The solver must recover them from spectral peaks.\n\n### 3. The codebook is public but shuffled\n\nThe challenge response discloses the transform, layout, permutation, checksum formula, and proof-of-work requirement. The solver still has to read the data cell from the fused image evidence.\n\n## Protocol walkthrough\n\n**01**\n\nThe host page mounts `ClankerCaptcha` with `challengeUrl`, `verifyUrl`, and an optional `onSolved` callback.\n\n**02**\n\nThe widget fetches a challenge, renders every frame into the DOM, starts the countdown, and prepares the browser-side proof of work.\n\n**03**\n\nThe library injects `meta[name=\"clanker-agent-task\"]` and an `application/clanker+json` manifest containing image selectors, data URLs, solve parameters, and submit details.\n\n**04**\n\nA solver computes the DFT of every frame, sums the complex spectra, recovers the lattice, decodes the symbols, computes the checksum, finds the nonce, and submits the result.\n\n**05**\n\nThe server checks expiry, proof of work, and checksum. A solved challenge returns a token and is removed from the in-memory map.\n\n## Integration shape\n\nA host page should not hand-author the agent metadata. 
That belongs to the library because it has the current challenge, instance id, frame selectors, and manifest id.\n\n```html\n<div id=\"clanker\"></div>\n\n<script type=\"module\">\n  import { ClankerCaptcha } from \"./src/clanker-captcha.js\";\n\n  ClankerCaptcha.mount(\"#clanker\", {\n    challengeUrl: \"/api/challenge\",\n    verifyUrl: \"/api/verify\",\n    onSolved(token) {\n      console.log(\"Clanker token:\", token);\n    }\n  });\n</script>\n```\n\nThe server in this repo is deliberately tiny and has no dependencies. It is here to show the endpoint contract, not to be a blueprint for a production backend.\n\n### Four surfaces to integrate\n\n**Website**\nMount the widget, let it inject metadata, and pass the returned token into your normal form or session flow.\n\n**Agent**\nRead the manifest, resolve frame selectors, compute the answer from pixels, and submit the nonce-backed result.\n\n**Backend**\nGenerate short-lived challenges, keep the expected answer server-side, verify once, and issue an app-specific token.\n\n**Policy**\nDecide what a pass means for your product: lower friction, access to an automation lane, or an input to risk scoring.\n\n## What this is not claiming\n\nThis is not a complete security product. 
A production deployment would need durable challenge state or signed state, replay defense, rate limits, telemetry, token binding, CSRF and CORS policy, and a clear abuse model.\n\nThe useful part of the experiment is the interface: a widget that exposes machine-readable instructions while keeping the answer out of the browser, plus a challenge whose intended solution is grounded in rendered pixels.", "url": "https://wpnews.pro/news/show-hn-a-reverse-captcha-for-clankers", "canonical_source": "https://clanker-captcha.jeromem.workers.dev/", "published_at": "2026-05-31 10:48:45+00:00", "updated_at": "2026-05-31 11:17:01.294139+00:00", "lang": "en", "topics": ["ai-agents", "computer-vision", "ai-products", "ai-tools"], "entities": ["Clanker CAPTCHA"], "alternates": {"html": "https://wpnews.pro/news/show-hn-a-reverse-captcha-for-clankers", "markdown": "https://wpnews.pro/news/show-hn-a-reverse-captcha-for-clankers.md", "text": "https://wpnews.pro/news/show-hn-a-reverse-captcha-for-clankers.txt", "jsonld": "https://wpnews.pro/news/show-hn-a-reverse-captcha-for-clankers.jsonld"}}