I built a phishing detector into Chrome using Claude AI. Here's exactly how.

A developer built a Chrome extension that uses Claude AI to detect phishing messages. The extension sends suspicious messages to a Cloudflare Worker, which proxies requests to the Anthropic API using the Haiku model for fast, cost-effective classification. In tests against 50 real phishing attempts, the system correctly identified 48.

My mother called me last week. Someone had sent her an SMS claiming to be from DHL, asking her to pay a £2.99 customs fee via a link. She almost clicked it. That was enough. I spent a weekend building a Chrome extension that lets you paste any suspicious message and get an instant verdict. Here's how it works.

The obvious approach is to call the Claude API directly from the extension. Don't do this. Your API key lives in the extension code, which anyone can extract from the Chrome Web Store in about 30 seconds.

The right pattern: extension → Cloudflare Worker → Claude API. The Worker lives server-side, holds the API key as an environment variable, and acts as a proxy. Cloudflare's free tier handles 100,000 requests/day, which is more than enough.

```js
export default {
  async fetch(request, env) {
    // The extension POSTs a JSON body containing the full prompt.
    const { prompt } = await request.json();

    // Forward it to the Anthropic Messages API. The key never leaves the Worker.
    const response = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': env.ANTHROPIC_API_KEY,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json'
      },
      body: JSON.stringify({
        model: 'claude-haiku-4-5-20251001',
        max_tokens: 350,
        messages: [{ role: 'user', content: prompt }]
      })
    });

    // Pass the API response straight back to the extension.
    return response;
  }
}
```

I'm using Haiku, not Opus. For a classification task like this (is this phishing or not?), Haiku is faster, 10x cheaper, and gets the same result. Opus is overkill.

After a dozen iterations, this is the prompt that actually works:

```
You are an expert cybersecurity analyst specializing in phishing detection. Analyze the following message and determine if it is PHISHING, SUSPICIOUS, or LEGITIMATE. Pay special attention to impersonation of financial institutions (PayPal, Chase, Barclays), government agencies (IRS, HMRC, DVLA), delivery services (UPS, FedEx, Royal Mail) and major tech companies (Amazon, Apple, Microsoft, Netflix).

Respond ONLY in this format:
VERDICT: PHISHING / SUSPICIOUS / LEGITIMATE
CONFIDENCE: High / Medium / Low
SIGNALS: comma-separated list, max 4
ADVICE: one clear action sentence
```

One thing worth knowing: parse only the VERDICT line, not the whole response. Otherwise `txt.includes("PHISHING")` will always return true because the word appears in the template itself.

```js
// Find the line that carries the verdict; fall back to an empty string if it is missing.
const verdictLine = txt.split('\n').find(l => l.startsWith('VERDICT:')) || '';
const isPhishing = verdictLine.includes('PHISHING');
```

Obvious in hindsight. Took me longer than I'd like to admit.

I tested it against 50 real phishing attempts. Claude got 48 right. The two it missed were unusually well-crafted: legitimate-looking domains with no obvious red flags. For anything with a suspicious link or an urgency pattern, it's essentially perfect.

If you want the full source code (extension, Worker, and deploy instructions), I packaged it here: https://carlosdevlop.gumroad.com/l/ai-phishing-detector-bundle
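
A postscript on parsing: the snippet above reads only the verdict, but the response format also carries CONFIDENCE, SIGNALS, and ADVICE lines. Here is a minimal sketch of how all four fields could be pulled out, assuming `txt` is the plain text of the model's reply. The helper name `parseVerdict` and the `UNKNOWN` fallback are my own illustration, not part of the packaged extension.

```javascript
// Hypothetical helper: turns the four-line reply into a plain object.
// Field names come from the prompt's template; the rest is an assumption.
const VERDICTS = ['PHISHING', 'SUSPICIOUS', 'LEGITIMATE'];

function parseVerdict(txt) {
  const fields = {};
  for (const line of txt.split('\n')) {
    // Accept only the four labelled lines, ignoring surrounding whitespace.
    const match = line.trim().match(/^(VERDICT|CONFIDENCE|SIGNALS|ADVICE):\s*(.*)$/i);
    if (match) fields[match[1].toUpperCase()] = match[2].trim();
  }

  // Exact match only, so an echoed template line such as
  // "PHISHING / SUSPICIOUS / LEGITIMATE" is rejected rather than misread.
  const rawVerdict = (fields.VERDICT || '').toUpperCase();
  const verdict = VERDICTS.includes(rawVerdict) ? rawVerdict : 'UNKNOWN';

  return {
    verdict,
    confidence: fields.CONFIDENCE || 'Unknown',
    // Split the comma-separated list and drop empty entries.
    signals: (fields.SIGNALS || '').split(',').map(s => s.trim()).filter(Boolean),
    advice: fields.ADVICE || '',
  };
}
```

Treating anything that is not exactly one of the three verdicts as `UNKNOWN` is a deliberate choice: the popup can then show a "could not classify" state instead of guessing.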