I built an AI Chrome extension with zero backend cost: here's the exact architecture

A developer built three AI-powered Chrome extensions (PR summarization, risk scoring, and draft review generation) with zero backend cost by using a Bring Your Own Key (BYOK) architecture. The extensions call AI providers directly from the browser using the user's own API key, eliminating the need for a server and addressing privacy concerns. The approach supports multiple providers including OpenAI, Groq, Mistral, and local models via Ollama.

You want to add AI to your Chrome extension. The obvious path: spin up a Node.js server, hold a master API key, charge users monthly, eat the AI cost. That's what everyone does.

I didn't do that. I built three Chrome extensions with AI features (PR summarization, risk scoring, draft review generation) and my monthly infrastructure bill is $0. No server. No backend. No API key to protect.

Here's the exact architecture, the real trade-offs, and the specific places where this approach breaks down so you don't find out the hard way.

Most AI-powered extensions work like this:

User → Extension → Your server → AI provider → Your server → Extension → User

Your server holds a master API key. Users pay you. You pay the AI provider out of that margin. The problems:

You're a proxy business now. You're paying OpenAI $X, charging users $Y, and the difference is your margin. But you're also responsible for rate limiting, uptime, abuse prevention, and GDPR compliance for every request that touches your server.

Private code goes through your infra. For a developer tool that reads GitHub diffs, this is the question users ask first: "is my code going to your server?" With a hosted backend, the honest answer is yes.

You're competing on price against companies with VC money. CodeRabbit, GitHub Copilot, Linear, and a dozen others are running hosted AI with economies of scale you can't match as a solo developer.

There's a different architecture.
It's not new. It's called BYOK (Bring Your Own Key), and it shifts the AI provider relationship from you to the user.

User → Extension → AI provider (user's own key)

No server in the middle. No margin math. No "is my code safe" question.

The core mechanic is simple: instead of your extension calling your server, it calls the AI provider directly from the browser using the user's own API key.

```js
// The user pastes their API key during onboarding.
// You store it locally and never send it anywhere else.
await chrome.storage.local.set({
  aiApiKey: userProvidedKey,
  aiProvider: 'groq' // or 'openai', 'mistral', 'ollama'
});

// Every AI call uses their key, from their browser.
// getEndpoint() and getModel() map the provider name to its URL and model.
async function callAI(prompt) {
  const { aiApiKey, aiProvider } = await chrome.storage.local.get(['aiApiKey', 'aiProvider']);
  const endpoint = getEndpoint(aiProvider);

  const response = await fetch(endpoint, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${aiApiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: getModel(aiProvider),
      messages: [{ role: 'user', content: prompt }],
      max_tokens: 500
    })
  });

  // Surface HTTP errors (bad key, rate limit) instead of parsing an error body as a result.
  if (!response.ok) {
    throw new Error(`AI request failed: ${response.status}`);
  }
  return response.json();
}
```

The API key lives in chrome.storage.local. It never leaves the browser except to go directly to the AI provider. Nothing you operate ever receives it: the extension reads it back from local storage only to attach it to the outgoing request.

For direct API calls from a Chrome extension, declare host permissions for each provider you support:

```json
{
  "manifest_version": 3,
  "permissions": ["storage"],
  "host_permissions": [
    "https://api.openai.com/*",
    "https://api.groq.com/*",
    "https://api.mistral.ai/*",
    "http://localhost:*/*"
  ]
}
```

The localhost entry covers Ollama, for users who want a fully local model with zero API costs.

Important: In MV3, host permissions are scrutinized during review. Be specific. Don't use