# I built a phishing detector into Chrome using Claude AI. Here's exactly how.

> Source: <https://dev.to/carlos_lopez_e0907403c1b4/i-built-a-phishing-detector-into-chrome-using-claude-ai-heres-exactly-how-2d6c>
> Published: 2026-06-17 14:15:23+00:00

My mother called me last week. Someone had sent her an SMS claiming to be from DHL, asking her to pay a £2.99 customs fee via a link. She almost clicked it.

That was enough. I spent a weekend building a Chrome extension that lets you paste any suspicious message and get an instant verdict. Here's how it works.

The obvious approach is to call the Claude API directly from the extension. Don't do this. Your API key would live in the extension code, which anyone can extract from the Chrome Web Store in about 30 seconds.

The right pattern: extension → Cloudflare Worker → Claude API. The Worker lives server-side, holds the API key as an environment variable, and acts as a proxy. Cloudflare's free tier handles 100,000 requests/day, which is more than enough.

``` js
export default {
  async fetch(request, env) {
    // The extension sends a JSON body of the form { prompt: "..." }.
    const { prompt } = await request.json();

    // Forward the prompt to Claude. The API key never leaves the Worker.
    const response = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': env.ANTHROPIC_API_KEY,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json'
      },
      body: JSON.stringify({
        model: 'claude-haiku-4-5-20251001',
        max_tokens: 350,
        messages: [{ role: 'user', content: prompt }]
      })
    });

    // Pass Claude's response straight back to the extension.
    return response;
  }
};
```
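On the extension side, the call to the Worker can stay small. Here is a minimal sketch, not the packaged source: `WORKER_URL` is a placeholder, and `buildWorkerRequest`, `extractText`, and `askWorker` are names I made up for illustration. It assumes the Worker returns Claude's Messages API response unchanged, where the reply text sits in a `content` array of blocks.

```javascript
// Hypothetical Worker URL: replace with your own deployment.
const WORKER_URL = 'https://phishing-proxy.example.workers.dev';

// Build the fetch options for the Worker. The body shape ({ prompt })
// matches what the Worker reads from request.json().
function buildWorkerRequest(prompt) {
  return {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ prompt })
  };
}

// Pull the model's text out of a Messages API response body.
function extractText(data) {
  const block = (data.content || []).find(b => b.type === 'text');
  return block ? block.text : '';
}

// Send the prompt through the Worker and return the reply text.
// fetchImpl is injectable so the function can be tried without a network.
async function askWorker(prompt, fetchImpl = fetch) {
  const res = await fetchImpl(WORKER_URL, buildWorkerRequest(prompt));
  if (!res.ok) throw new Error(`Worker returned ${res.status}`);
  return extractText(await res.json());
}
```

Depending on how the extension is set up, it will likely need the Worker's origin listed in `host_permissions`, or the Worker will need to send CORS headers.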

I'm using Haiku, not Opus. For a classification task like this (is this phishing or not?), Haiku is faster, 10x cheaper, and gets the same result. Opus is overkill.

After a dozen iterations, this is what actually works:

"You are an expert cybersecurity analyst specializing in phishing detection. Analyze the following message and determine if it is PHISHING, SUSPICIOUS, or LEGITIMATE.

Pay special attention to impersonation of financial institutions (PayPal, Chase, Barclays), government agencies (IRS, HMRC, DVLA), delivery services (UPS, FedEx, Royal Mail) and major tech companies (Amazon, Apple, Microsoft, Netflix).

Respond ONLY in this format:

VERDICT: [PHISHING / SUSPICIOUS / LEGITIMATE]

CONFIDENCE: [High / Medium / Low]

SIGNALS: [comma-separated list, max 4]

ADVICE: [one clear action sentence]"
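The prompt still has to be combined with the message the user pasted. The article doesn't show that step, so the following is only a sketch under my own assumptions: the `MESSAGE:` label, the triple-quote delimiters, and the 4,000-character cap are illustrative choices, and `buildPrompt` is a hypothetical helper name.

```javascript
// Instructions taken from the prompt above.
const INSTRUCTIONS = [
  'You are an expert cybersecurity analyst specializing in phishing detection.',
  'Analyze the following message and determine if it is PHISHING, SUSPICIOUS, or LEGITIMATE.',
  'Pay special attention to impersonation of financial institutions (PayPal, Chase, Barclays),',
  'government agencies (IRS, HMRC, DVLA), delivery services (UPS, FedEx, Royal Mail)',
  'and major tech companies (Amazon, Apple, Microsoft, Netflix).',
  'Respond ONLY in this format:',
  'VERDICT: [PHISHING / SUSPICIOUS / LEGITIMATE]',
  'CONFIDENCE: [High / Medium / Low]',
  'SIGNALS: [comma-separated list, max 4]',
  'ADVICE: [one clear action sentence]'
].join('\n');

// Arbitrary cap (an assumption) so a huge paste cannot inflate the request.
const MAX_MESSAGE_CHARS = 4000;

// Append the pasted message after the instructions, inside delimiters,
// so the model can tell where the untrusted text begins and ends.
function buildPrompt(message) {
  const trimmed = String(message).trim().slice(0, MAX_MESSAGE_CHARS);
  return `${INSTRUCTIONS}\n\nMESSAGE:\n"""\n${trimmed}\n"""`;
}
```

Keeping the pasted text inside delimiters is a cheap guard against a message that tries to smuggle in its own instructions, though it is not a guarantee.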

One thing worth knowing: parse only the VERDICT line, not the whole response. Otherwise `txt.includes("PHISHING")` will always return true because the word appears in the template itself.

``` js
// Look only at the line that carries the verdict.
const verdictLine = txt.split('\n')
  .find(l => l.startsWith('VERDICT:')) || '';

const isPhishing = verdictLine.includes('PHISHING');
```

Obvious in hindsight. Took me longer than I'd like to admit.
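The same idea extends to the other three fields. Here is a sketch of a fuller parser, assuming the model sticks to the four-line format from the prompt; `parseVerdict` and the `UNKNOWN` fallback are my own illustrative additions, not part of the original code.

```javascript
// Parse the four labelled lines into an object. Anything that does not
// start with a known label is ignored, so stray mentions of "PHISHING"
// elsewhere in the text cannot affect the verdict.
function parseVerdict(txt) {
  const fields = {};
  for (const line of txt.split('\n')) {
    const match = line.trim().match(/^(VERDICT|CONFIDENCE|SIGNALS|ADVICE):\s*(.*)$/);
    // Keep the first occurrence of each label.
    if (match && !(match[1] in fields)) fields[match[1]] = match[2].trim();
  }

  // Tolerate square brackets copied from the template, e.g. "[PHISHING]".
  const verdict = (fields.VERDICT || '').replace(/[\[\]]/g, '').trim().toUpperCase();
  const allowed = ['PHISHING', 'SUSPICIOUS', 'LEGITIMATE'];

  return {
    verdict: allowed.includes(verdict) ? verdict : 'UNKNOWN',
    confidence: fields.CONFIDENCE || '',
    signals: (fields.SIGNALS || '').split(',').map(s => s.trim()).filter(Boolean),
    advice: fields.ADVICE || ''
  };
}
```

Falling back to `UNKNOWN` rather than `LEGITIMATE` means a malformed reply is never shown to the user as a clean bill of health.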

I tested it against 50 real phishing attempts. Claude got 48 right. The two it missed were unusually well-crafted: legitimate-looking domains with no obvious red flags. For anything with a suspicious link or an urgency pattern, it's essentially perfect.
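If you want to repeat this kind of check on your own samples, the arithmetic is short. This is a sketch with placeholder data: the array just mirrors the 48-of-50 count, the two misses are filled in as `LEGITIMATE` purely for illustration, and `detectionRate` is a hypothetical helper.

```javascript
// Fraction of known-phishing samples that the detector flagged.
// Which verdicts count as "flagged" is an assumption you can change,
// e.g. pass ['PHISHING', 'SUSPICIOUS'] to count both.
function detectionRate(verdicts, flagged = ['PHISHING']) {
  if (verdicts.length === 0) return 0;
  const caught = verdicts.filter(v => flagged.includes(v)).length;
  return caught / verdicts.length;
}

// Placeholder data shaped like the test above: 48 caught, 2 missed.
const verdicts = [...Array(48).fill('PHISHING'), 'LEGITIMATE', 'LEGITIMATE'];
console.log(detectionRate(verdicts)); // 0.96
```

This only measures how many phishing messages get caught. Measuring false alarms would need a second set of genuinely legitimate messages run through the same function.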

If you want the full source code (extension, Worker, and deploy instructions), I packaged it here: [https://carlosdevlop.gumroad.com/l/ai-phishing-detector-bundle](https://carlosdevlop.gumroad.com/l/ai-phishing-detector-bundle)
