{"slug": "practical-nlp-in-the-browser-with-transformers-js", "title": "Practical NLP in the Browser with Transformers.js", "summary": "Hugging Face released Transformers.js, a JavaScript library that runs state-of-the-art NLP models directly in the browser on the user's device with no server required. The library, which is functionally equivalent to Hugging Face's Python transformers library, uses ONNX Runtime to execute models converted from PyTorch, TensorFlow, or JAX, and caches model weights locally after the first download. This enables developers to perform text classification, zero-shot labeling, and question answering entirely offline through a browser-based pipeline API.", "body_md": "# Practical NLP in the Browser with Transformers.js\n\nThis tutorial covers three NLP tasks: text classification, zero-shot labelling, and question answering using Transformers.js's pipeline() API.\n\n## # Introduction\n\nFor a long time, running transformer models meant maintaining a Python server, paying for GPU time, and routing every inference request through an API. The user typed something, it left their machine, touched your infrastructure, and came back as a prediction. That architecture made sense when the models were too large to run anywhere else. It is no longer the only option.\n\n[ Transformers.js](https://huggingface.co/docs/transformers.js/en/index) changes the equation. It runs state-of-the-art NLP models directly in the browser, on the user's device, with no server involved. The models download once, cache locally, and run offline from that point forward. 
The Python-to-JavaScript translation is almost one-to-one:\n\n``` js\n// JavaScript -- nearly identical\nimport { pipeline } from '@huggingface/transformers';\nconst classifier = await pipeline('sentiment-analysis');\nconst result = await classifier('I love transformers!');\n```\n\nThis tutorial covers three NLP tasks (text classification, zero-shot labelling, and question answering) using Transformers.js's `pipeline()` API. For each task, you will see how to initialize the pipeline, what the output structure looks like and how to interpret it, and a working HTML example you can open directly in a browser. The tutorial closes with a complete support ticket routing application that combines all three pipelines into one practical tool.\n\nEvery code example in this article uses the CDN import path, so there is no build step required. Open a text editor, paste the code, and run it.\n\n## # What Transformers.js Actually Is\n\nThe library is designed to be [functionally equivalent to Hugging Face's Python transformers library](https://huggingface.co/docs/transformers.js/en/index), meaning the same pretrained models, the same task names, and the same pipeline API, just in JavaScript. Under the hood, the bridge that makes this possible is [ONNX Runtime](https://onnxruntime.ai/).\n\nModels trained in PyTorch, TensorFlow, or JAX are converted to [ONNX format](https://onnx.ai/) using [Hugging Face Optimum](https://github.com/huggingface/optimum). ONNX Runtime then executes these models in the browser. By default, it runs on CPU via WebAssembly (WASM), which works in every modern browser. If you want GPU acceleration, setting `device: 'webgpu'` routes computation through the browser's WebGPU API, which is meaningfully faster where available, though still experimental in some environments.\n\n**Model caching**. 
The first time a pipeline runs, the model weights download from the [Hugging Face Hub](https://huggingface.co/models?library=transformers.js) and are cached locally: in browser storage (the Cache API) in a browser context, or on the filesystem in Node.js. [Developer testing shows the sentiment analysis pipeline](https://www.raymondcamden.com/2024/12/03/using-transformersjs-for-ai-in-the-browser) downloads around 111 MB on first load. Subsequent runs skip the download entirely and load from cache. This means the first user session has a bandwidth cost; every session after is fast and offline-capable.\n\n**Quantization**. The `dtype` option controls model precision. `q8` (8-bit quantization) is the WASM default; it gives you a good balance of size and accuracy. `q4` cuts the file roughly in half with a 1–3% accuracy loss on most tasks, which is the right trade-off for mobile or slow connections. For Node.js server-side use, `fp32` gives full precision with no size constraint.\n\n``` js\n// Default WASM execution -- works everywhere\nconst wasmPipe = await pipeline('sentiment-analysis');\n\n// WebGPU for faster inference on compatible hardware\nconst gpuPipe = await pipeline('sentiment-analysis', null, { device: 'webgpu' });\n\n// 4-bit quantization for smaller model downloads\nconst q4Pipe = await pipeline('sentiment-analysis',\n  'Xenova/distilbert-base-uncased-finetuned-sst-2-english',\n  { dtype: 'q4' }\n);\n```\n\n## # The pipeline() API\n\nThe **pipeline** function is the entire public interface for most use cases. It bundles three things (a pretrained model, a tokenizer, and postprocessing logic) into a single callable object. You do not touch the tokenizer or model weights directly. 
You call the pipeline with text and get structured output back.\n\nThe signature has three parts:\n\n``` js\nconst pipe = await pipeline(task, model?, options?);\nconst result = await pipe(input, inferenceOptions?);\n```\n\n`task` is a string identifier that tells the library which kind of model to load and how to handle input and output. `model` is optional; if you omit it, the library loads the default model for that task. If you specify a model ID (like `Xenova/distilbert-base-uncased-finetuned-sst-2-english`), that model loads from the Hub. `options` is where you set `device`, `dtype`, and `progress_callback`.\n\nBoth steps are async. `pipeline()` downloads and loads the model into memory. This is the slow part on the first run. The pipe call itself is usually fast once the model is loaded. Both return Promises, which means your UI needs to handle the loading state.\n\nA `progress_callback` lets you track the download and show progress to the user:\n\n``` js\n// progress_callback fires during model download with status updates\n// This is important UX -- users need to know something is happening\nconst pipe = await pipeline(\n  'sentiment-analysis',\n  'Xenova/distilbert-base-uncased-finetuned-sst-2-english',\n  {\n    dtype: 'q8',\n    progress_callback: (progress) => {\n      // progress.status can be: 'initiate', 'download', 'progress', 'done', 'ready'\n      if (progress.status === 'progress') {\n        const pct = Math.round(progress.progress);\n        document.getElementById('progress').textContent =\n          `Loading model: ${pct}%`;\n      }\n      if (progress.status === 'ready') {\n        document.getElementById('progress').textContent = 'Model ready';\n      }\n    }\n  }\n);\n```\n\nOne important note from the [official documentation](https://huggingface.co/docs/transformers.js/en/index): Transformers.js is an inference-only library. You cannot fine-tune or train models with it. 
If your task needs a custom model, training happens elsewhere (Python, cloud), and the resulting ONNX export runs in the browser.\n\n## # Task 1: Text Classification\n\nText classification assigns a label and a confidence score to input text. The most common form is sentiment analysis (positive vs. negative), but the same pipeline architecture handles any fixed set of categories the model was trained on.\n\nWhat the output looks like:\n\n``` js\nconst result = await classifier('This product completely exceeded my expectations.');\n// [{ label: 'POSITIVE', score: 0.9997 }]\n```\n\nOutput is an array of objects. Each object has `label` (the predicted class as a string) and `score` (a float between 0 and 1 representing the model's confidence). A score of 0.9997 means the model is highly confident. A score of 0.52 means it is barely above the decision threshold; treat that as uncertain and handle it accordingly in your application logic.\n\nThe output is always an array, even for a single input, because the same pipeline call handles batches:\n\n``` js\nconst results = await classifier([\n  'This is great!',\n  'Completely broken, waste of money.'\n]);\n// [\n//   { label: 'POSITIVE', score: 0.9998 },\n//   { label: 'NEGATIVE', score: 0.9991 }\n// ]\n```\n\n### // Full Working Example\n\nThe example below is a complete, self-contained HTML file. Open it in any modern browser. 
The model downloads on the first run and is cached, so subsequent loads skip the download.\n\n```\n<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n  <meta charset=\"UTF-8\" />\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\" />\n  <title>Text Classification with Transformers.js</title>\n  <style>\n    body { font-family: system-ui, sans-serif; max-width: 680px;\n           margin: 2rem auto; padding: 0 1rem; }\n    textarea { width: 100%; height: 100px; padding: 0.5rem;\n               font-size: 1rem; margin-bottom: 0.5rem; }\n    button { padding: 0.5rem 1.5rem; font-size: 1rem; cursor: pointer; }\n    button:disabled { opacity: 0.5; cursor: not-allowed; }\n    #status { color: #666; font-size: 0.9rem; margin: 0.5rem 0; }\n    #result { margin-top: 1rem; font-size: 1.1rem; font-weight: bold; }\n    .positive { color: #16a34a; }\n    .negative { color: #dc2626; }\n  </style>\n</head>\n<body>\n  <h1>Sentiment Classifier</h1>\n  <p>Runs entirely in your browser -- no server, no API calls.</p>\n\n  <textarea id=\"input\" placeholder=\"Enter text to classify...\">\nI really enjoyed using this product. 
The setup was easy and everything works perfectly.\n  </textarea>\n\n  <button id=\"classify-btn\" disabled>Loading model...</button>\n  <div id=\"status\">Downloading model on first run (this may take a moment)...</div>\n  <div id=\"result\"></div>\n\n  <script type=\"module\">\n    import { pipeline } from\n      'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';\n\n    const statusEl  = document.getElementById('status');\n    const resultEl  = document.getElementById('result');\n    const btn       = document.getElementById('classify-btn');\n    const inputEl   = document.getElementById('input');\n\n    let classifier;\n\n    async function loadModel() {\n      classifier = await pipeline(\n        'text-classification',\n        'Xenova/distilbert-base-uncased-finetuned-sst-2-english',\n        {\n          dtype: 'q8',\n          progress_callback: (p) => {\n            if (p.status === 'progress') {\n              const pct = Math.round(p.progress ?? 0);\n              statusEl.textContent = `Downloading model: ${pct}%`;\n            }\n          }\n        }\n      );\n\n      btn.textContent  = 'Classify';\n      btn.disabled     = false;\n      statusEl.textContent = 'Model loaded and cached. Subsequent loads are instant.';\n    }\n\n    async function classify() {\n      const text = inputEl.value.trim();\n      if (!text) return;\n\n      btn.disabled         = true;\n      btn.textContent      = 'Classifying...';\n      resultEl.textContent = '';\n\n      // The pipeline returns an array, even for a single input\n      const results = await classifier(text);\n      const { label, score } = results[0];\n\n      const pct       = (score * 100).toFixed(1);\n      const cssClass  = label === 'POSITIVE' ? 
'positive' : 'negative';\n\n      resultEl.innerHTML =\n        `<span class=\"${cssClass}\">${label}</span> -- ${pct}% confidence`;\n\n      btn.disabled    = false;\n      btn.textContent = 'Classify';\n    }\n\n    btn.addEventListener('click', classify);\n\n    loadModel().catch(err => {\n      statusEl.textContent = `Error loading model: ${err.message}`;\n    });\n  </script>\n</body>\n</html>\n```\n\nThe `loadModel` function calls `pipeline()` with the task name, model ID, and options. The `progress_callback` fires repeatedly during the download and updates the status text so the user is not staring at a frozen screen. Once the model loads, the button is enabled. When the user clicks Classify, `classifier(text)` runs inference asynchronously on the already-loaded model, typically under 200ms on a modern laptop. The code destructures `label` and `score` from the first array element, formats the confidence as a percentage, and applies a CSS class for color coding.\n\n## # Task 2: Zero-Shot Classification\n\nZero-shot classification does something regular text classification cannot: it classifies text into categories you define at runtime, with no training data required. You pass the text and a list of labels in plain English. The model decides which label fits best based on its understanding of language semantics.\n\nThis is useful any time you cannot or do not want to train a model on labelled examples, which is most of the time in real projects.\n\n### // How It Works Under the Hood\n\nThe model reformulates each candidate label as a natural language inference (NLI) hypothesis. For the label \"**billing issue**\", it builds a hypothesis such as \"**This text is about a billing issue**\" and computes the probability that the hypothesis is entailed by the input text. The label with the highest entailment score wins. 
This [NLI-based approach](https://huggingface.co/tasks/zero-shot-classification) is why you can use any descriptive English phrase as a label and get a meaningful result. The model understands the meaning of your labels, not just their surface form.\n\nWhat the output looks like:\n\n``` js\nconst classifier = await pipeline('zero-shot-classification',\n  'Xenova/bart-large-mnli');\n\nconst result = await classifier(\n  'My invoice is wrong and I was charged twice.',\n  ['billing', 'technical support', 'shipping', 'returns', 'account access']\n);\n\n// {\n//   sequence: 'My invoice is wrong and I was charged twice.',\n//   labels:   ['billing', 'returns', 'account access', 'technical support', 'shipping'],\n//   scores:   [0.871,      0.063,     0.031,             0.022,               0.013]\n// }\n```\n\nThe output is an object with three fields. `sequence` is the original input text. `labels` is an array of your candidate labels, sorted from highest to lowest score. `scores` is an array of confidence scores in the same order. The first element of both arrays is always the winning prediction. Scores across all labels sum to approximately 1 when `multi_label` is false (the default).\n\nSetting `multi_label: true` changes the behavior: each label scores independently rather than competing, so multiple labels can all have high scores simultaneously. Use this when text plausibly belongs to several categories at once.\n\n### // Full Working Example\n\n```\n<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n  <meta charset=\"UTF-8\" />\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\" />\n  <title>Zero-Shot Classifier -- Support Ticket Router</title>\n  <style>\n    body { font-family: system-ui, sans-serif; max-width: 720px;\n           margin: 2rem auto; padding: 0 1rem; }\n    textarea { width: 100%; height: 120px; padding: 0.5rem; font-size: 1rem; }\n    button { margin-top: 0.5rem; padding: 0.5rem 1.5rem;\n             font-size: 1rem; cursor: pointer; }\n    button:disabled { opacity: 0.5; cursor: not-allowed; }\n    #status  { color: #666; font-size: 0.9rem; margin: 0.5rem 0; }\n    .result-row { display: flex; justify-content: space-between;\n                  padding: 0.4rem 0; border-bottom: 1px solid #eee; }\n    .bar-container { width: 60%; background: #f0f0f0;\n                     border-radius: 4px; height: 18px; }\n    .bar { background: #2563eb; height: 100%;\n           border-radius: 4px; transition: width 0.3s; }\n    .label-name { min-width: 160px; font-weight: 500; }\n    .score-text { min-width: 50px; text-align: right; color: #555; }\n  </style>\n</head>\n<body>\n  <h1>Support Ticket Router</h1>\n  <p>Paste a support ticket. The model routes it to the right department\n     with no training data needed.</p>\n\n  <textarea id=\"ticket\">\nI placed an order three days ago but it still hasn't shipped. I have an event\nthis weekend and really need this to arrive on time. 
My order number is #48821.\n  </textarea>\n\n  <button id=\"route-btn\" disabled>Loading model...</button>\n  <div id=\"status\">Downloading model on first run...</div>\n  <div id=\"results\"></div>\n\n  <script type=\"module\">\n    import { pipeline } from\n      'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';\n\n    const statusEl  = document.getElementById('status');\n    const resultsEl = document.getElementById('results');\n    const btn       = document.getElementById('route-btn');\n    const ticketEl  = document.getElementById('ticket');\n\n    const DEPARTMENTS = [\n      'shipping and delivery',\n      'billing and payment',\n      'technical support',\n      'returns and refunds',\n      'account and login'\n    ];\n\n    let classifier;\n\n    async function loadModel() {\n      classifier = await pipeline(\n        'zero-shot-classification',\n        'Xenova/bart-large-mnli',\n        {\n          dtype: 'q8',\n          progress_callback: (p) => {\n            if (p.status === 'progress') {\n              statusEl.textContent =\n                `Downloading model: ${Math.round(p.progress ?? 
0)}%`;\n            }\n          }\n        }\n      );\n\n      btn.disabled    = false;\n      btn.textContent = 'Route Ticket';\n      statusEl.textContent = 'Model ready.';\n    }\n\n    async function routeTicket() {\n      const text = ticketEl.value.trim();\n      if (!text) return;\n\n      btn.disabled         = true;\n      btn.textContent      = 'Routing...';\n      resultsEl.innerHTML  = '';\n\n      const output = await classifier(text, DEPARTMENTS, {\n        multi_label: false\n      });\n\n      // labels and scores are sorted best-first, so index 0 is the winner\n      const winner = output.labels[0];\n      const confidence = (output.scores[0] * 100).toFixed(1);\n\n      let html = `<h3>Route to: <strong>${winner}</strong>\n                  (${confidence}% confidence)</h3>\n                  <p style=\"color:#666; font-size:0.9rem\">\n                  Full department score breakdown:</p>`;\n\n      output.labels.forEach((label, i) => {\n        const pct = (output.scores[i] * 100).toFixed(1);\n        const barWidth = (output.scores[i] * 100).toFixed(0);\n        html += `\n          <div class=\"result-row\">\n            <span class=\"label-name\">${label}</span>\n            <div class=\"bar-container\">\n              <div class=\"bar\" style=\"width: ${barWidth}%\"></div>\n            </div>\n            <span class=\"score-text\">${pct}%</span>\n          </div>`;\n      });\n\n      resultsEl.innerHTML  = html;\n      btn.disabled         = false;\n      btn.textContent      = 'Route Ticket';\n    }\n\n    btn.addEventListener('click', routeTicket);\n    loadModel().catch(err => {\n      statusEl.textContent = `Error: ${err.message}`;\n    });\n  </script>\n</body>\n</html>\n```\n\nThe `DEPARTMENTS` array is all the routing configuration this system needs. No training data, no labeled examples. When a ticket arrives, `classifier(text, DEPARTMENTS, { multi_label: false })` runs all five entailment checks internally and returns them ranked. 
The results loop builds a horizontal bar chart showing each department's score, a sorted visualization that makes it immediately obvious where the ticket should go and how confident the model was. Try changing the `DEPARTMENTS` array to completely different labels; the model re-routes against the new labels without any code change beyond that array.\n\n## # Task 3: Question Answering\n\nQuestion answering in Transformers.js is extractive: you provide a passage of text as context and ask a question in plain English. The model locates the span within the passage that best answers the question and returns it. It does not generate text or reason beyond what is literally in the context. The answer is always a substring of the input you provided.\n\nThis makes it well-suited for document interrogation. The user provides the document; the model navigates it.\n\nWhat the output looks like:\n\n``` js\nconst qa = await pipeline('question-answering', 'Xenova/distilbert-base-uncased-distilled-squad');\n\n// The JavaScript pipeline takes the question and the context as two arguments\nconst result = await qa(\n  'What is the return window for electronics?',\n  `Our return policy allows customers to return most items within 30 days\n            of purchase. Electronics must be returned within 15 days and must be\n            in original packaging. Software and digital downloads are non-refundable.`\n);\n\n// {\n//   answer: '15 days',\n//   score:  0.9823,\n//   start:  97,    // character index of answer start in context (may be absent in some releases)\n//   end:    104    // character index of answer end in context (may be absent in some releases)\n// }\n```\n\nThe output has up to four fields. `answer` is the extracted text. `score` is the model's confidence that this span answers the question. `start` and `end`, when the library returns them, are character indices into the original context; you can use these to highlight the answer in the source text, which is valuable UX for longer documents. They may not be present in every release, so check for them before relying on them.\n\nWhen the question has no clear answer in the context, `score` will be low and `answer` may be a short, seemingly random span. 
Treating low-confidence answers (below 0.3 or 0.4) as \"not found\" is standard practice.\n\n### // Full Working Example\n\n```\n<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n  <meta charset=\"UTF-8\" />\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\" />\n  <title>Document Q&A with Transformers.js</title>\n  <style>\n    body { font-family: system-ui, sans-serif; max-width: 720px;\n           margin: 2rem auto; padding: 0 1rem; }\n    label { font-weight: 600; display: block; margin-top: 1rem; }\n    textarea { width: 100%; padding: 0.5rem; font-size: 0.95rem; }\n    input[type=\"text\"] { width: 100%; padding: 0.5rem;\n                         font-size: 0.95rem; box-sizing: border-box; }\n    button { margin-top: 0.75rem; padding: 0.5rem 1.5rem;\n             font-size: 1rem; cursor: pointer; }\n    button:disabled { opacity: 0.5; cursor: not-allowed; }\n    #status { color: #666; font-size: 0.9rem; margin: 0.5rem 0; }\n    #answer-box { margin-top: 1rem; padding: 1rem;\n                  background: #f8fafc; border-left: 3px solid #2563eb; }\n    .highlight { background: #fef08a; border-radius: 2px; }\n    .confidence { color: #666; font-size: 0.85rem; margin-top: 0.5rem; }\n  </style>\n</head>\n<body>\n  <h1>Document Question Answering</h1>\n  <p>Paste any document, then ask questions about it.\n     Answers are extracted directly from the text.</p>\n\n  <label for=\"context\">Document / Context</label>\n  <textarea id=\"context\" rows=\"8\">\nAcme Corp Return Policy (Updated March 2025)\n\nCustomers may return most standard items within 30 days of the original purchase\ndate for a full refund. 
Electronics and peripherals have a shorter return window\nof 15 days and must be returned in original, unopened packaging to qualify.\n\nRefunds are processed within 3-5 business days after we receive the returned item.\nOriginal shipping charges are non-refundable. For items valued over $200, customers\nmust contact support at returns@acmecorp.com before initiating a return.\n\nSoftware licenses and digital downloads are non-refundable under any circumstances.\nGift cards cannot be returned or exchanged for cash.\n  </textarea>\n\n  <label for=\"question\">Your Question</label>\n  <input type=\"text\" id=\"question\"\n         value=\"How long does it take to process a refund?\" />\n\n  <button id=\"ask-btn\" disabled>Loading model...</button>\n  <div id=\"status\">Downloading model on first run...</div>\n  <div id=\"answer-box\" style=\"display:none\"></div>\n\n  <script type=\"module\">\n    import { pipeline } from\n      'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';\n\n    const contextEl   = document.getElementById('context');\n    const questionEl  = document.getElementById('question');\n    const statusEl    = document.getElementById('status');\n    const answerBox   = document.getElementById('answer-box');\n    const btn         = document.getElementById('ask-btn');\n\n    const CONFIDENCE_THRESHOLD = 0.1;\n\n    let qaModel;\n\n    async function loadModel() {\n      qaModel = await pipeline(\n        'question-answering',\n        'Xenova/distilbert-base-uncased-distilled-squad',\n        {\n          dtype: 'q8',\n          progress_callback: (p) => {\n            if (p.status === 'progress') {\n              statusEl.textContent =\n                `Downloading model: ${Math.round(p.progress ?? 
0)}%`;\n            }\n          }\n        }\n      );\n\n      btn.disabled    = false;\n      btn.textContent = 'Ask';\n      statusEl.textContent = 'Model ready.';\n    }\n\n    async function askQuestion() {\n      const context  = contextEl.value.trim();\n      const question = questionEl.value.trim();\n      if (!context || !question) return;\n\n      btn.disabled         = true;\n      btn.textContent      = 'Thinking...';\n      answerBox.style.display = 'none';\n\n      // The JavaScript pipeline takes question and context as two arguments\n      const result = await qaModel(question, context);\n\n      answerBox.style.display = 'block';\n\n      if (result.score < CONFIDENCE_THRESHOLD) {\n        answerBox.innerHTML = `\n          <strong>Answer not found</strong>\n          <p class=\"confidence\">The model could not find a clear answer\n          to this question in the provided text.</p>`;\n      } else {\n        // Use start/end when the library provides them; otherwise fall back\n        // to a case-insensitive search for the answer text in the context.\n        const start = result.start ??\n          context.toLowerCase().indexOf(result.answer.toLowerCase());\n        const end       = result.end ?? start + result.answer.length;\n        const before    = context.slice(0, start);\n        const answer    = context.slice(start, end);\n        const after     = context.slice(end);\n        const highlight = start >= 0\n          ? `${before}<mark class=\"highlight\">${answer}</mark>${after}`\n          : context;\n\n        const confidence = (result.score * 100).toFixed(1);\n        answerBox.innerHTML = `\n          <strong>Answer:</strong> ${result.answer}\n          <p class=\"confidence\">Confidence: ${confidence}%</p>\n          <details style=\"margin-top:1rem\">\n            <summary style=\"cursor:pointer; color:#2563eb\">\n              Show answer highlighted in document\n            </summary>\n            <pre style=\"white-space:pre-wrap; font-size:0.85rem;\n                        margin-top:0.5rem\">${highlight}</pre>\n          </details>`;\n      }\n\n      btn.disabled    = false;\n      btn.textContent = 'Ask';\n    }\n\n    btn.addEventListener('click', askQuestion);\n\n    questionEl.addEventListener('keydown', (e) => {\n      if (e.key === 'Enter' && !btn.disabled) askQuestion();\n    });\n\n    loadModel().catch(err => {\n      
statusEl.textContent = `Error: ${err.message}`;\n    });\n  </script>\n</body>\n</html>\n```\n\nThe QA pipeline takes the question and the context together rather than a single plain string. The `start` and `end` character indices locate the answer in the context string, which the code uses to inject a `<mark>` tag around the exact span the model identified. The `<details>` element wraps the highlighted context in a collapsible section so the UI stays clean. The confidence threshold prevents low-quality extractions from appearing as confident answers; any result below 0.1 gets replaced with a \"not found\" message.\n\n## # Real-World Application: Support Ticket Router\n\nThe three pipelines cover the full analytical surface of a support ticket. Sentiment tells you how the customer feels. Zero-shot classification routes the ticket to the right team. Question answering extracts the structured data you need (order number, product name, and the core issue) without parsing rules or regex.\n\nThis is a complete support ticket analysis tool that combines all three. It is a single HTML file, fully self-contained.\n\n```\n<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n  <meta charset=\"UTF-8\" />\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\" />\n  <title>Support Ticket Analyzer</title>\n  <style>\n    * { box-sizing: border-box; }\n    body { font-family: system-ui, sans-serif; max-width: 800px;\n           margin: 2rem auto; padding: 0 1rem; background: #f9fafb; }\n    h1   { margin-bottom: 0.25rem; }\n    .subtitle { color: #666; margin-bottom: 1.5rem; }\n\n    textarea { width: 100%; height: 130px; padding: 0.75rem;\n               font-size: 0.95rem; border: 1px solid #d1d5db;\n               border-radius: 6px; resize: vertical; }\n    button { padding: 0.6rem 1.8rem; font-size: 1rem;\n             background: #2563eb; color: white; border: none;\n             border-radius: 6px; cursor: pointer; margin-top: 0.5rem; }\n    button:disabled { background: #93c5fd; cursor: not-allowed; }\n\n    #status { font-size: 0.85rem; color: #666; margin: 0.5rem 0; }\n\n    .cards { display: grid; grid-template-columns: repeat(3, 1fr);\n             gap: 1rem; margin-top: 1.5rem; }\n    .card  { background: white; border-radius: 8px; padding: 1rem;\n             border: 1px solid #e5e7eb; }\n    .card h3 { margin: 0 0 0.75rem; font-size: 0.9rem;\n               text-transform: uppercase; letter-spacing: 0.05em;\n               color: #6b7280; }\n    .card .value { font-size: 1.15rem; font-weight: 600; }\n    .card .sub   { font-size: 0.85rem; color: #666; margin-top: 0.25rem; }\n\n    .positive { color: #16a34a; }\n    .negative { color: #dc2626; }\n    .neutral  { color: #d97706; }\n\n    .dept-bar { display: flex; align-items: center; gap: 0.5rem;\n                margin-top: 0.4rem; font-size: 0.85rem; }\n    .bar-bg   { flex: 1; background: #f0f0f0; border-radius: 3px; height: 8px; }\n    .bar-fill { background: #2563eb; height: 100%;\n                border-radius: 3px; transition: 
width 0.4s; }\n\n    .qa-item  { margin-top: 0.6rem; font-size: 0.9rem; }\n    .qa-label { font-weight: 600; color: #374151; }\n    .qa-ans   { color: #111; }\n    .qa-low   { color: #9ca3af; font-style: italic; }\n\n    @media (max-width: 600px) {\n      .cards { grid-template-columns: 1fr; }\n    }\n  </style>\n</head>\n<body>\n  <h1>Support Ticket Analyzer</h1>\n  <p class=\"subtitle\">Powered by Transformers.js -- runs entirely in your browser</p>\n\n  <textarea id=\"ticket\">\nHi, I ordered a laptop stand last Tuesday (order #73021) but it arrived completely\nbroken -- one of the arms snapped off right out of the box. I've been a customer for\nthree years and this is honestly really disappointing. I need a replacement sent out\nas soon as possible or I'd like a full refund. Please advise.\n  </textarea>\n\n  <button id=\"analyze-btn\" disabled>Loading models...</button>\n  <div id=\"status\">Initializing -- downloading models on first run...</div>\n\n  <div class=\"cards\" id=\"cards\" style=\"display:none\">\n    <div class=\"card\" id=\"card-sentiment\">\n      <h3>Sentiment</h3>\n      <div class=\"value\" id=\"sent-label\">--</div>\n      <div class=\"sub\"   id=\"sent-score\">--</div>\n    </div>\n\n    <div class=\"card\" id=\"card-route\">\n      <h3>Department</h3>\n      <div id=\"dept-results\"></div>\n    </div>\n\n    <div class=\"card\" id=\"card-qa\">\n      <h3>Key Info</h3>\n      <div id=\"qa-results\"></div>\n    </div>\n  </div>\n\n  <script type=\"module\">\n    import { pipeline } from\n      'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';\n\n    const ticketEl  = document.getElementById('ticket');\n    const btn       = document.getElementById('analyze-btn');\n    const statusEl  = document.getElementById('status');\n    const cardsEl   = document.getElementById('cards');\n\n    const DEPARTMENTS = [\n      'returns and refunds',\n      'shipping and delivery',\n      'billing and payment',\n      'technical support',\n    
  'account management'\n    ];\n\n    const QA_QUERIES = [\n      { label: 'Order number',  question: 'What is the order number?' },\n      { label: 'Issue',         question: 'What is the main problem or complaint?' },\n      { label: 'Request',       question: 'What does the customer want?' }\n    ];\n\n    let sentimentPipe, zeroPipe, qaPipe;\n    let modelsLoaded = 0;\n\n    function onModelLoaded(name) {\n      modelsLoaded++;\n      statusEl.textContent =\n        `Loading models: ${modelsLoaded}/3 ready (${name} loaded)`;\n      if (modelsLoaded === 3) {\n        btn.disabled    = false;\n        btn.textContent = 'Analyze Ticket';\n        statusEl.textContent = 'All models ready.';\n      }\n    }\n\n    async function loadModels() {\n      // 'done' fires once per downloaded file; 'ready' fires once per pipeline,\n      // so counting 'ready' events gives exactly one tick per model.\n      [sentimentPipe, zeroPipe, qaPipe] = await Promise.all([\n\n        pipeline(\n          'text-classification',\n          'Xenova/distilbert-base-uncased-finetuned-sst-2-english',\n          { dtype: 'q8',\n            progress_callback: p =>\n              p.status === 'ready' && onModelLoaded('Sentiment') }\n        ),\n\n        pipeline(\n          'zero-shot-classification',\n          'Xenova/bart-large-mnli',\n          { dtype: 'q8',\n            progress_callback: p =>\n              p.status === 'ready' && onModelLoaded('Routing') }\n        ),\n\n        pipeline(\n          'question-answering',\n          'Xenova/distilbert-base-uncased-distilled-squad',\n          { dtype: 'q8',\n            progress_callback: p =>\n              p.status === 'ready' && onModelLoaded('Q&A') }\n        )\n      ]);\n    }\n\n    async function analyzeTicket() {\n      const text = ticketEl.value.trim();\n      if (!text) return;\n\n      btn.disabled    = true;\n      btn.textContent = 'Analyzing...';\n      cardsEl.style.display = 'none';\n\n      const [sentResult, zeroResult, qaResults] = await Promise.all([\n        sentimentPipe(text),\n        zeroPipe(text, DEPARTMENTS, { multi_label: false }),\n        
Promise.all(\n          QA_QUERIES.map(({ question }) =>\n            // The JS question-answering pipeline takes (question, context)\n            qaPipe(question, text)\n          )\n        )\n      ]);\n\n      // Classification pipelines return an array of results; take the first\n      const { label, score } = sentResult[0];\n      const sentLabel = document.getElementById('sent-label');\n      const sentScore = document.getElementById('sent-score');\n\n      sentLabel.textContent = label;\n      sentLabel.className   = `value ${label === 'POSITIVE' ? 'positive' : 'negative'}`;\n      sentScore.textContent = `${(score * 100).toFixed(1)}% confidence`;\n      sentScore.style.color = ''; // clear any urgency color from a previous run\n\n      if (label === 'NEGATIVE' && score > 0.85) {\n        sentScore.textContent += ' -- HIGH URGENCY';\n        sentScore.style.color = '#dc2626';\n      }\n\n      // Labels come back sorted by score, so index 0 is the winning department\n      const deptEl = document.getElementById('dept-results');\n      deptEl.innerHTML = `<div class=\"value\">${zeroResult.labels[0]}</div>`;\n\n      zeroResult.labels.slice(0, 3).forEach((dept, i) => {\n        const pct = (zeroResult.scores[i] * 100).toFixed(0);\n        deptEl.innerHTML += `\n          <div class=\"dept-bar\">\n            <span style=\"min-width:130px\">${dept}</span>\n            <div class=\"bar-bg\">\n              <div class=\"bar-fill\" style=\"width:${pct}%\"></div>\n            </div>\n            <span>${pct}%</span>\n          </div>`;\n      });\n\n      const qaEl = document.getElementById('qa-results');\n      qaEl.innerHTML = '';\n\n      QA_QUERIES.forEach(({ label: qLabel }, i) => {\n        const { answer, score: qScore } = qaResults[i];\n        const found = qScore >= 0.1;\n\n        qaEl.innerHTML += `\n          <div class=\"qa-item\">\n            <span class=\"qa-label\">${qLabel}: </span>\n            <span class=\"${found ? 'qa-ans' : 'qa-low'}\">\n              ${found ? 
answer : 'not found'}\n            </span>\n          </div>`;\n      });\n\n      cardsEl.style.display = 'grid';\n      btn.disabled    = false;\n      btn.textContent = 'Analyze Ticket';\n    }\n\n    btn.addEventListener('click', analyzeTicket);\n    loadModels().catch(err => {\n      statusEl.textContent = `Error loading models: ${err.message}`;\n    });\n  </script>\n</body>\n</html>\n```\n\nThe three pipelines load in parallel via `Promise.all`. This is faster than loading them sequentially because the downloads overlap. A counter tracks how many have finished, so the button only enables once all three are ready. When the user submits a ticket, all three inferences are also dispatched together. The sentiment card checks whether the result is high-confidence negative and flags it as urgent, a practical routing signal that requires no additional model.\n\nThe department card shows the top three candidates as score bars rather than just the winner, which gives the support team enough information to override the routing if the top score is close to the second. The QA card runs three extractive queries against the ticket body and applies a confidence threshold: answers scoring below 0.1 show as \"not found\" rather than surfacing low-quality extractions.\n\n## # Performance, Limitations, and When Not to Use It\n\nTransformers.js removes the server but does not eliminate trade-offs. Knowing them up front saves you from unpleasant surprises in production.\n\n**Download size**. The sentiment analysis pipeline downloads around [111 MB on first load](https://www.raymondcamden.com/2024/12/03/using-transformersjs-for-ai-in-the-browser), not huge, but not invisible either. The zero-shot BART model is larger. For applications targeting mobile users or users on metered connections, use a more aggressive quantization (the dtype table below suggests `q4`) to cut model sizes roughly in half, and treat the model as a progressive enhancement; do not block the user interface on model load.\n\n**Inference speed**. 
On a modern laptop, WASM inference for a short text classification takes 50–200 ms. Zero-shot classification is slower because it runs multiple NLI passes, one per candidate label. A five-label zero-shot run typically takes 1–3 seconds on CPU. WebGPU reduces this significantly where supported.\n\n**Inference only**. Transformers.js [cannot fine-tune or train models](https://transformersjs-for-developers.hashnode.dev/comprehensive-guide-to-using-transformersjs-for-developers-and-aiml-fans). If your use case requires a custom model (a classifier trained on your own labelled tickets, for example), training happens on a server (Python, cloud), and only the exported ONNX model runs in the browser.\n\n**Model availability**. Not every model on Hugging Face Hub has an ONNX version available. To find compatible models, [filter by the transformers.js library tag on the Hub](https://huggingface.co/models?library=transformers.js).\n\n**When to prefer a server instead**. Bulk processing of hundreds of texts where latency per item matters; tasks that require the largest frontier models, which are too large for browser delivery; or simple applications where the development cost of browser-based inference outweighs its benefits.\n\nA quick reference for choosing dtype by context:\n\n| Context | Recommended dtype | Why |\n|---|---|---|\n| Browser, general use | q8 | WASM default, good balance of size and accuracy |\n| Mobile or slow connection | q4 | Roughly half the file size, 1-3% accuracy cost |\n| Node.js server-side | fp32 | Full precision, no download size concern |\n| WebGPU enabled | fp16 | Fast, good quality on compatible GPU hardware |\n\n## # Wrapping Up\n\nTransformers.js puts production-quality NLP in the browser without a server, without an API key, and without user data leaving the device. The three pipelines in this tutorial (text classification, zero-shot labelling, and question answering) cover the analytical surface of a large share of real NLP use cases. 
The support ticket router shows how they combine into something genuinely useful in fewer than 200 lines of HTML and JavaScript.\n\nThe entry point is as low as it gets: one CDN import, one `await pipeline()` call, one inference call. Start with the simplest example in this article and run it. Modify the labels in the zero-shot demo. Point the QA model at a different document. The [official Transformers.js documentation](https://huggingface.co/docs/transformers.js/en/index) and the [examples repository](https://github.com/huggingface/transformers.js-examples) cover a much wider range of tasks, including summarization, translation, named entity recognition, and more, all following the same `pipeline()` pattern.\n\n[Shittu Olumide](https://www.linkedin.com/in/olumide-shittu/) is a software engineer and technical writer passionate about leveraging cutting-edge technologies to craft compelling narratives, with a keen eye for detail and a knack for simplifying complex concepts.", "url": "https://wpnews.pro/news/practical-nlp-in-the-browser-with-transformers-js", "canonical_source": "https://www.kdnuggets.com/practical-nlp-in-the-browser-with-transformers-js", "published_at": "2026-05-29 14:00:02+00:00", "updated_at": "2026-05-29 14:50:42.124316+00:00", "lang": "en", "topics": ["natural-language-processing", "machine-learning", "ai-tools", "ai-infrastructure", "artificial-intelligence"], "entities": ["Transformers.js", "Hugging Face", "NLP", "pipeline() API"], "alternates": {"html": "https://wpnews.pro/news/practical-nlp-in-the-browser-with-transformers-js", "markdown": "https://wpnews.pro/news/practical-nlp-in-the-browser-with-transformers-js.md", "text": "https://wpnews.pro/news/practical-nlp-in-the-browser-with-transformers-js.txt", "jsonld": "https://wpnews.pro/news/practical-nlp-in-the-browser-with-transformers-js.jsonld"}}