How to Use Transformers.js in a Chrome Extension

Technical guide for developers on integrating Transformers.js into a Chrome extension under Manifest V3 constraints. It details a three-part architecture: a background service worker that hosts the AI models, a side panel for the chat UI, and a content script for page-level actions, with the background acting as the central coordinator. The guide also covers the typed messaging contract used for communication between these components and practical considerations for model loading and runtime management.

While building this extension, we ran into several practical observations about Manifest V3 runtimes, model loading, and messaging that are worth sharing.

Who this is for

This guide is for developers who want to run local AI features in a Chrome extension with Transformers.js under Manifest V3 constraints. By the end, you will have the same architecture used in this project: a background service worker that hosts models, a side panel chat UI, and a content script for page-level actions.

What we will build

In this guide, we will recreate the core architecture of Transformers.js Gemma 4 Browser Assistant, using the published extension as a reference and the open-source codebase as the implementation map.

- Live extension: Chrome Web Store
- Source code: github.com/nico-martin/gemma4-browser-extension
- End result: a background-hosted Transformers.js engine, a side panel chat UI, and a content script for page extraction and highlighting.

1 Chrome extension architecture (MV3)

Before diving in, a quick scope note: I will not go deep on the React UI layer or Vite build configuration. The focus here is the high-level architecture decisions: what runs in each Chrome runtime and how those pieces are orchestrated.

If Manifest V3 is new to you, read this short overview first: What is Manifest V3?

1.1 Runtime contexts and entry points

In MV3, your architecture starts in `public/manifest.json`.
This project defines three entry points:

- `background.service_worker` = `background.js`, built from `src/background/background.ts`.
- `side_panel.default_path` = `sidebar.html`, built from `src/sidebar/index.html`.
- `content_scripts[].js` = `content.js` with `matches: http(s)://*/*` and `run_at: document_idle`, built from `src/content/content.ts`.

The background service worker also handles `chrome.action.onClicked` to open the side panel for the active tab.

Related entry point to know: a popup can be defined with `action.default_popup` and works well for quick actions. This project uses a side panel for persistent chat, but the orchestration pattern is the same.

1.2 What runs where

The key design decision is to keep heavy orchestration in the background and keep UI/page logic thin.

- Background (`src/background/background.ts`) is the control plane: agent lifecycle, model initialization, tool execution, and shared services like feature extraction.
- Side panel (`src/sidebar/`) is the interaction layer: chat input/output, streaming updates, and setup controls.
- Content script (`src/content/content.ts`) is the page bridge: DOM extraction and highlight actions.

One practical consequence of this division is that the conversation history also lives in the background (`Agent.chatMessages`): the UI sends events like `AGENT_GENERATE_TEXT`, the background appends the message, runs inference, then emits `MESSAGES_UPDATE` back to the side panel.

This split avoids duplicate model loads, keeps the UI responsive, and respects Chrome's security boundaries around DOM access.

1.3 Messaging contract

Once runtimes are separated, messaging becomes the backbone. In this project, all messages are typed through enums in `src/shared/types.ts`.
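To make the idea of an enum-typed contract concrete, here is a minimal sketch of the side panel and background half of it. The enum and member names follow the ones used in this guide; the string values, the payload shapes (`text`, `messages`), the `ChatMessage` type, and the `handleSidePanelRequest` function are assumptions made for illustration and are not taken from `src/shared/types.ts`. The handler is a plain function so it can run outside the browser; in an extension it would presumably be called from a `chrome.runtime.onMessage` listener.

```typescript
// Hypothetical sketch of an enum-typed messaging contract.
// Member names mirror this guide; values and payload shapes are assumptions.
enum BackgroundTasks {
  CHECK_MODELS = "CHECK_MODELS",
  INITIALIZE_MODELS = "INITIALIZE_MODELS",
  AGENT_INITIALIZE = "AGENT_INITIALIZE",
  AGENT_GENERATE_TEXT = "AGENT_GENERATE_TEXT",
  AGENT_GET_MESSAGES = "AGENT_GET_MESSAGES",
  AGENT_CLEAR = "AGENT_CLEAR",
  EXTRACT_FEATURES = "EXTRACT_FEATURES",
}

enum BackgroundMessages {
  DOWNLOAD_PROGRESS = "DOWNLOAD_PROGRESS",
  MESSAGES_UPDATE = "MESSAGES_UPDATE",
}

// Assumed message shape for the chat history.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// A discriminated union lets the compiler check each request's payload.
type SidePanelRequest =
  | { type: BackgroundTasks.AGENT_GENERATE_TEXT; text: string }
  | { type: BackgroundTasks.AGENT_GET_MESSAGES }
  | { type: BackgroundTasks.AGENT_CLEAR };

interface MessagesUpdate {
  type: BackgroundMessages.MESSAGES_UPDATE;
  messages: ChatMessage[];
}

// Background-side state: the conversation history lives here, not in the UI.
const chatMessages: ChatMessage[] = [];

// Handles one side panel request and returns the update to send back.
function handleSidePanelRequest(request: SidePanelRequest): MessagesUpdate {
  switch (request.type) {
    case BackgroundTasks.AGENT_GENERATE_TEXT:
      chatMessages.push({ role: "user", content: request.text });
      // Inference and tool steps would run here in the real extension.
      break;
    case BackgroundTasks.AGENT_CLEAR:
      chatMessages.length = 0;
      break;
    case BackgroundTasks.AGENT_GET_MESSAGES:
      // Nothing to change; the current history is returned below.
      break;
  }
  // Return a copy so callers cannot mutate the background's history.
  return {
    type: BackgroundMessages.MESSAGES_UPDATE,
    messages: [...chatMessages],
  };
}
```

Because every request carries an enum-valued `type`, a misspelled task name or a missing payload field becomes a compile-time error instead of a silently dropped message. The full set of messages used in the project is: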
- Side panel -> background (`BackgroundTasks`): `CHECK_MODELS`, `INITIALIZE_MODELS`, `AGENT_INITIALIZE`, `AGENT_GENERATE_TEXT`, `AGENT_GET_MESSAGES`, `AGENT_CLEAR`, `EXTRACT_FEATURES`
- Background -> side panel (`BackgroundMessages`): `DOWNLOAD_PROGRESS`, `MESSAGES_UPDATE`
- Background -> content (`ContentTasks`): `EXTRACT_PAGE_DATA`, `HIGHLIGHT_ELEMENTS`, `CLEAR_HIGHLIGHTS`

The orchestration rule is simple: the background is the single coordinator; side panel and content script are specialized workers that request actions and render results.

Typical request flow:

- Side panel sends `AGENT_GENERATE_TEXT`.
- Background appends to `Agent.chatMessages` and runs model/tool steps.
- Background emits `MESSAGES_UPDATE`.
- Side panel re-renders from the updated message list.

2 Transformers.js integration details

2.1 Models and responsibilities

In `src/shared/constants.ts`, this extension uses two model roles:

- TextGeneration / LLM: `onnx-community/gemma-4-E2B-it-ONNX` (`text-generation`, `q4f16`)
- VectorEmbeddings: `onnx-community/all-MiniLM-L6-v2-ONNX` (`feature-extraction`, `fp32`)

The split is intentional: Gemma 4 handles reasoning/tool decisions, while MiniLM generates vector embeddings for the semantic similarity search in `ask_website` and `find_history`.

2.2 Where inference runs

All inference runs in the background (`src/background/background.ts`):

- text generation via `pipeline("text-generation", ...)` with consistent KV caching enabled by our new `DynamicCache` class
- embeddings via `pipeline("feature-extraction", ...)` plus vector normalization

This gives a single model host for all tabs/sessions, avoids duplicate memory usage, and keeps the side panel UI responsive.

Because models are loaded from the background service worker, artifacts are cached under the extension origin chrome-extension://