Practical NLP in the Browser with Transformers.js


This tutorial covers three NLP tasks: text classification, zero-shot labelling, and question answering using Transformers.js's pipeline() API.

# Introduction #

For a long time, running transformer models meant maintaining a Python server, paying for GPU time, and routing every inference request through an API. The user typed something, it left their machine, touched your infrastructure, and came back as a prediction. That architecture made sense when the models were too large to run anywhere else. It is no longer the only option.

Transformers.js changes the equation. It runs state-of-the-art NLP models directly in the browser, on the user's device, with no server involved. The models download once, cache locally, and run offline from that point forward. The Python-to-JavaScript translation is almost one-to-one:

// JavaScript -- nearly identical
import { pipeline } from '@huggingface/transformers';
const classifier = await pipeline('sentiment-analysis');
const result = await classifier('I love transformers!');

This tutorial covers three NLP tasks: text classification, zero-shot labelling, and question answering, all through Transformers.js's pipeline() API. For each task, you will see how to initialize the pipeline, what the output structure looks like and how to interpret it, and a working HTML example you can open directly in a browser. The tutorial closes with a complete support ticket routing application that combines all three pipelines into one practical tool.

Every code example in this article uses the CDN import path, so there is no build step required. Open a text editor, paste the code, and run it.

# What Transformers.js Actually Is #

The library is designed to be functionally equivalent to Hugging Face's Python transformers library: the same pretrained models, the same task names, and the same pipeline API, just in JavaScript. Under the hood, the bridge that makes this possible is ONNX Runtime.

Models trained in PyTorch, TensorFlow, or JAX are converted to ONNX format using Hugging Face Optimum. ONNX Runtime then executes these models in the browser. By default, it runs on CPU via WebAssembly (WASM), which works in every modern browser. If you want GPU acceleration, setting device: 'webgpu' routes computation through the browser's WebGPU API, which is meaningfully faster where available, though still experimental in some environments.

Model caching. The first time a pipeline runs, the model weights download from the Hugging Face Hub and are cached: in browser storage (Transformers.js uses the Cache API by default) in a browser context, and on the filesystem in Node.js. Developer testing shows the sentiment analysis pipeline downloads around 111 MB on first load. Subsequent runs skip the download entirely and load from cache. This means the first user session has a bandwidth cost; every session after is fast and offline-capable.

Quantization. The dtype option controls model precision. q8 (8-bit quantization) is the WASM default; it gives you a good balance of size and accuracy. q4 cuts the file roughly in half with a 1–3% accuracy loss on most tasks, which is the right trade-off for mobile or slow connections. For Node.js server-side use, fp32 gives full precision with no size constraint.

// The three calls below are alternatives; distinct names avoid redeclaring a const.

// Default WASM execution -- works everywhere
const wasmPipe = await pipeline('sentiment-analysis');

// WebGPU for faster inference on compatible hardware
const gpuPipe = await pipeline('sentiment-analysis', null, { device: 'webgpu' });

// 4-bit quantization for smaller model downloads
const q4Pipe = await pipeline('sentiment-analysis',
  'Xenova/distilbert-base-uncased-finetuned-sst-2-english',
  { dtype: 'q4' }
);
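One way to act on these trade-offs is to derive the options object from a couple of capability flags. The sketch below is illustrative only: `choosePipelineOptions` and its `hasWebGPU` / `slowConnection` inputs are hypothetical names, and the policy (prefer WebGPU when present, otherwise pick q4 on slow links) is an assumption rather than a library recommendation.

```javascript
// Hypothetical helper: maps capability flags to pipeline() options.
// The device and dtype values are the ones discussed above; the decision
// policy itself is an assumption made for illustration.
function choosePipelineOptions({ hasWebGPU = false, slowConnection = false } = {}) {
  if (hasWebGPU) {
    // Leave dtype unset so the library picks its own default for WebGPU.
    return { device: 'webgpu' };
  }
  return { device: 'wasm', dtype: slowConnection ? 'q4' : 'q8' };
}

// In a browser, the WebGPU flag could come from simple feature detection:
// const hasWebGPU = typeof navigator !== 'undefined' && 'gpu' in navigator;
const pipelineOptions = choosePipelineOptions({ slowConnection: true });
console.log(pipelineOptions); // { device: 'wasm', dtype: 'q4' }
```

The returned object can be passed as the third argument to pipeline(). Feature detection only tells you the API exists, not that it is fast on a given device, so measuring on real hardware is still worthwhile.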

# The pipeline() API #

The pipeline function is the entire public interface for most use cases. It bundles three things: a pretrained model, a tokenizer, and postprocessing logic, into a single callable object. You do not touch the tokenizer or model weights directly. You call the pipeline with text and get structured output back.

The signature has three parts:

const pipe = await pipeline(task, model?, options?);
const result = await pipe(input, inferenceOptions?);

task is a string identifier that tells the library which kind of model to load and how to handle input and output. model is optional; if you omit it, the library loads the default model for that task. If you specify a model ID (like 'Xenova/distilbert-base-uncased-finetuned-sst-2-english'), that model loads from the Hub. options is where you set device, dtype, and progress_callback.

Both steps are async. pipeline() downloads and loads the model into memory; this is the slow part on the first run. The pipe call itself is usually fast once the model is loaded. Both return Promises, which means your UI needs to handle the loading state.
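Because loading is slow and asynchronous, it helps to make sure it happens only once even when several parts of the UI ask for the model. The following is a minimal sketch of that idea, assuming a hypothetical `createLazyLoader` helper; the fake factory stands in for a real call such as `() => pipeline('sentiment-analysis')` so the example runs without the library.

```javascript
// Wraps an async factory so the expensive load runs at most once.
// Every caller receives the same Promise; a failed load can be retried.
function createLazyLoader(factory) {
  let pending = null;
  return function load() {
    if (!pending) {
      pending = Promise.resolve()
        .then(factory)
        .catch((err) => {
          pending = null; // forget the failure so the next call retries
          throw err;
        });
    }
    return pending;
  };
}

// Fake factory standing in for pipeline(); returns a stub classifier.
const getClassifier = createLazyLoader(async () => {
  return (text) => [{ label: 'POSITIVE', score: 0.99 }];
});

const firstRequest = getClassifier();
const secondRequest = getClassifier();
console.log(firstRequest === secondRequest); // true
```

Event handlers can then `await getClassifier()` without worrying about whether the model has started loading yet.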

A progress_callback lets you track the download and show progress to the user:

// progress_callback fires during model download with status updates
// This is important UX -- users need to know something is happening
const pipe = await pipeline(
  'sentiment-analysis',
  'Xenova/distilbert-base-uncased-finetuned-sst-2-english',
  {
    dtype: 'q8',
    progress_callback: (progress) => {
      // progress.status can be: 'initiate', 'download', 'progress', 'done', 'ready'
      if (progress.status === 'progress') {
        const pct = Math.round(progress.progress);
        document.getElementById('progress').textContent =
          `Loading model: ${pct}%`;
      }
      if (progress.status === 'ready') {
        document.getElementById('progress').textContent = 'Model ready';
      }
    }
  }
);
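A model usually consists of several files, and the callback reports each one separately, so a single percentage can jump around. One possible approach is to aggregate bytes across files. This sketch assumes each 'progress' event carries `file`, `loaded`, and `total` fields; `createProgressTracker` is a hypothetical helper, not part of the library.

```javascript
// Aggregates per-file progress events into one overall percentage.
// Assumption: 'progress' events expose `file`, `loaded`, and `total`.
function createProgressTracker() {
  const files = new Map(); // file name -> { loaded, total }
  return function onProgress(event) {
    if (event.status === 'progress' && event.total > 0) {
      files.set(event.file, { loaded: event.loaded, total: event.total });
    }
    let loaded = 0;
    let total = 0;
    for (const entry of files.values()) {
      loaded += entry.loaded;
      total += entry.total;
    }
    return total === 0 ? 0 : Math.round((loaded / total) * 100);
  };
}

const track = createProgressTracker();
track({ status: 'progress', file: 'tokenizer.json', loaded: 50, total: 100 });
const overall = track({ status: 'progress', file: 'model.onnx', loaded: 100, total: 900 });
console.log(overall); // 15
```

The overall figure can still dip when a new, large file is first reported, because the known total grows. Showing a label such as "Loading model" next to the number keeps that from looking like an error.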

One important note from the official documentation: Transformers.js is an inference-only library. You cannot fine-tune or train models with it. If your task needs a custom model, training happens elsewhere (Python, cloud), and the resulting ONNX export runs in the browser.

# Task 1: Text Classification #

Text classification assigns a label and a confidence score to input text. The most common form is sentiment analysis, positive vs. negative, but the same pipeline architecture handles any fixed set of categories the model was trained on.

What the output looks like:

const result = await classifier('This product completely exceeded my expectations.');
// [{ label: 'POSITIVE', score: 0.9997 }]

Output is an array of objects. Each object has label (the predicted class as a string) and score (a float between 0 and 1 representing the model's confidence). A score of 0.9997 means the model is highly confident. A score of 0.52 means it is barely above the decision threshold; treat that as uncertain and handle it accordingly in your application logic.
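Handling uncertainty can be as simple as a cutoff below which the label is not trusted. The helper below is a sketch under that assumption: `interpretSentiment` is a hypothetical name, and the 0.75 cutoff is an arbitrary starting point you would tune on your own data.

```javascript
// Converts a text-classification result into a decision, treating
// low-confidence predictions as 'UNCERTAIN'. The cutoff is an assumption.
function interpretSentiment(results, minConfidence = 0.75) {
  const [{ label, score }] = results; // pipeline output is an array
  return score >= minConfidence ? label : 'UNCERTAIN';
}

console.log(interpretSentiment([{ label: 'POSITIVE', score: 0.9997 }])); // POSITIVE
console.log(interpretSentiment([{ label: 'POSITIVE', score: 0.52 }]));   // UNCERTAIN
```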

The output is always an array, even for a single input, because the same pipeline call handles batches:

const results = await classifier([
  'This is great!',
  'Completely broken, waste of money.'
]);
// [
//   { label: 'POSITIVE', score: 0.9998 },
//   { label: 'NEGATIVE', score: 0.9991 }
// ]

// Full Working Example

The example below is a complete, self-contained HTML file. Open it in any modern browser. The model downloads on the first run and is then cached, so subsequent loads are much faster.

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Text Classification with Transformers.js</title>
  <style>
    body { font-family: system-ui, sans-serif; max-width: 680px;
           margin: 2rem auto; padding: 0 1rem; }
    textarea { width: 100%; height: 100px; padding: 0.5rem;
               font-size: 1rem; margin-bottom: 0.5rem; }
    button { padding: 0.5rem 1.5rem; font-size: 1rem; cursor: pointer; }
    button:disabled { opacity: 0.5; cursor: not-allowed; }
    #status { color: #666; font-size: 0.9rem; margin: 0.5rem 0; }
    #result { margin-top: 1rem; font-size: 1.1rem; font-weight: bold; }
    .positive { color: #16a34a; }
    .negative { color: #dc2626; }
  </style>
</head>
<body>
  <h1>Sentiment Classifier</h1>
  <p>Runs entirely in your browser -- no server, no API calls.</p>

  <textarea id="input" placeholder="Enter text to classify...">
I really enjoyed using this product. The setup was easy and everything works perfectly.
  </textarea>

  <button id="classify-btn" disabled>Loading model...</button>
  <div id="status">Downloading model on first run (this may take a moment)...</div>
  <div id="result"></div>

  <script type="module">
    import { pipeline } from
      'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';

    const statusEl  = document.getElementById('status');
    const resultEl  = document.getElementById('result');
    const btn       = document.getElementById('classify-btn');
    const inputEl   = document.getElementById('input');

    let classifier;

    async function loadModel() {
      classifier = await pipeline(
        'text-classification',
        'Xenova/distilbert-base-uncased-finetuned-sst-2-english',
        {
          dtype: 'q8',
          progress_callback: (p) => {
            if (p.status === 'progress') {
              const pct = Math.round(p.progress ?? 0);
              statusEl.textContent = `Downloading model: ${pct}%`;
            }
          }
        }
      );

      btn.textContent  = 'Classify';
      btn.disabled     = false;
      statusEl.textContent = 'Model loaded and cached. Subsequent loads are instant.';
    }

    async function classify() {
      const text = inputEl.value.trim();
      if (!text) return;

      btn.disabled         = true;
      btn.textContent      = 'Classifying...';
      resultEl.textContent = '';

      const results = await classifier(text);
      // The pipeline returns an array; take the first (and only) result.
      const [{ label, score }] = results;

      const pct       = (score * 100).toFixed(1);
      const cssClass  = label === 'POSITIVE' ? 'positive' : 'negative';

      resultEl.innerHTML =
        `<span class="${cssClass}">${label}</span> -- ${pct}% confidence`;

      btn.disabled    = false;
      btn.textContent = 'Classify';
    }

    btn.addEventListener('click', classify);

    loadModel().catch(err => {
      statusEl.textContent = `Error loading model: ${err.message}`;
    });
  </script>
</body>
</html>

The loadModel function calls pipeline() with the task name, model ID, and options. The progress_callback fires repeatedly during the download and updates the status text so the user is not staring at a frozen screen. Once the model loads, the button is enabled. When the user clicks Classify, classifier(text) runs inference on the already-loaded model, typically in under 200 ms on a modern laptop. The code destructures label and score from the first array element, formats the confidence as a percentage, and applies a CSS class for color coding.

# Task 2: Zero-Shot Classification #

Zero-shot classification does something regular text classification cannot: it classifies text into categories you define at runtime, with no training data required. You pass the text and a list of labels in plain English. The model decides which label fits best based on its understanding of language semantics.

This is useful any time you cannot or do not want to train a model on labelled examples, which is most of the time in real projects.

// How It Works Under the Hood

The pipeline reformulates each candidate label as a natural language inference (NLI) hypothesis. For the label "billing issue", it fills a hypothesis template (by default "This example is {}.") to produce a sentence such as "This example is billing issue." and computes the probability that the input text entails it. The label with the highest entailment score wins. This NLI-based approach is why you can use any descriptive English phrase as a label and get a meaningful result. The model understands the meaning of your labels, not just their surface form.
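The label-to-hypothesis step is easy to picture in code. This is an illustration rather than the library's internal implementation: `buildHypotheses` is a hypothetical helper, and the default template string is assumed to match what the zero-shot pipeline uses when none is supplied.

```javascript
// Turns candidate labels into NLI hypotheses by filling a template.
// The '{}' placeholder marks where the label is inserted.
function buildHypotheses(labels, template = 'This example is {}.') {
  return labels.map((label) => template.replace('{}', label));
}

console.log(buildHypotheses(['billing issue', 'shipping']));
// [ 'This example is billing issue.', 'This example is shipping.' ]

// A custom template can read more naturally for a given label set.
console.log(buildHypotheses(['billing issue'], 'This text is about a {}.'));
// [ 'This text is about a billing issue.' ]
```

Each hypothesis is then paired with the input text and scored for entailment, which is why phrasing labels as short descriptive noun phrases tends to help.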

What the output looks like:

const classifier = await pipeline('zero-shot-classification',
  'Xenova/bart-large-mnli');

const result = await classifier(
  'My invoice is wrong and I was charged twice.',
  ['billing', 'technical support', 'shipping', 'returns', 'account access']
);

// {
//   sequence: 'My invoice is wrong and I was charged twice.',
//   labels:   ['billing', 'returns', 'account access', 'technical support', 'shipping'],
//   scores:   [0.871,      0.063,     0.031,             0.022,               0.013]
// }

The output is an object with three fields. sequence is the original input text. labels is an array of your candidate labels, sorted from highest to lowest score. scores is an array of confidence scores in the same order. The first element of both arrays is always the winning prediction. Scores across all labels sum to approximately 1 when multi_label is false (the default).

Setting multi_label: true changes the behavior: each label scores independently rather than competing, so multiple labels can all have high scores simultaneously. Use this when text plausibly belongs to several categories at once.
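The difference between the two modes comes down to where the softmax is applied. The worked example below uses made-up logits (not real model output) to show the usual NLI-based scheme: in single-label mode the entailment logits of all labels compete in one softmax, while in multi-label mode each label's entailment logit is compared only against its own contradiction logit. Treat it as a sketch of the idea, not the library's exact code.

```javascript
// Numerically stable softmax over an array of numbers.
function softmax(values) {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Made-up [contradiction, entailment] logits for three labels.
const nliLogits = {
  billing: [-2.0, 3.0],
  returns: [-1.0, 2.5],
  shipping: [2.0, -1.5],
};
const candidateLabels = Object.keys(nliLogits);

// multi_label: false -> labels compete: one softmax across entailment logits.
const single = softmax(candidateLabels.map((l) => nliLogits[l][1]));

// multi_label: true -> each label alone: softmax over its own logit pair.
const multi = candidateLabels.map((l) => softmax(nliLogits[l])[1]);

console.log(single.map((s) => s.toFixed(3))); // sums to 1 across labels
console.log(multi.map((s) => s.toFixed(3)));  // each score is independent
```

With these numbers, "billing" and "returns" split the probability mass in single-label mode, yet both score above 0.9 in multi-label mode, which matches the behavior described above.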

// Full Working Example


<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Zero-Shot Classifier -- Support Ticket Router</title>
  <style>
    body { font-family: system-ui, sans-serif; max-width: 720px;
           margin: 2rem auto; padding: 0 1rem; }
    textarea { width: 100%; height: 120px; padding: 0.5rem; font-size: 1rem; }
    button { margin-top: 0.5rem; padding: 0.5rem 1.5rem;
             font-size: 1rem; cursor: pointer; }
    button:disabled { opacity: 0.5; cursor: not-allowed; }
    #status  { color: #666; font-size: 0.9rem; margin: 0.5rem 0; }
    .result-row { display: flex; justify-content: space-between;
                  padding: 0.4rem 0; border-bottom: 1px solid #eee; }
    .bar-container { width: 60%; background: #f0f0f0;
                     border-radius: 4px; height: 18px; }
    .bar { background: #2563eb; height: 100%;
           border-radius: 4px; transition: width 0.3s; }
    .label-name { min-width: 160px; font-weight: 500; }
    .score-text { min-width: 50px; text-align: right; color: #555; }
  </style>
</head>
<body>
  <h1>Support Ticket Router</h1>
  <p>Paste a support ticket. The model routes it to the right department
     with no training data needed.</p>

  <textarea id="ticket">
I placed an order three days ago but it still hasn't shipped. I have an event
this weekend and really need this to arrive on time. My order number is #48821.
  </textarea>

  <button id="route-btn" disabled>Loading model...</button>
  <div id="status">Downloading model on first run...</div>
  <div id="results"></div>

  <script type="module">
    import { pipeline } from
      'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';

    const statusEl  = document.getElementById('status');
    const resultsEl = document.getElementById('results');
    const btn       = document.getElementById('route-btn');
    const ticketEl  = document.getElementById('ticket');

    const DEPARTMENTS = [
      'shipping and delivery',
      'billing and payment',
      'technical support',
      'returns and refunds',
      'account and login'
    ];

    let classifier;

    async function loadModel() {
      classifier = await pipeline(
        'zero-shot-classification',
        'Xenova/bart-large-mnli',
        {
          dtype: 'q8',
          progress_callback: (p) => {
            if (p.status === 'progress') {
              statusEl.textContent =
                `Downloading model: ${Math.round(p.progress ?? 0)}%`;
            }
          }
        }
      );

      btn.disabled    = false;
      btn.textContent = 'Route Ticket';
      statusEl.textContent = 'Model ready.';
    }

    async function routeTicket() {
      const text = ticketEl.value.trim();
      if (!text) return;

      btn.disabled         = true;
      btn.textContent      = 'Routing...';
      resultsEl.innerHTML  = '';

      const output = await classifier(text, DEPARTMENTS, {
        multi_label: false
      });

      // labels and scores are sorted, so index 0 is the winning department.
      const winner = output.labels[0];
      const confidence = (output.scores[0] * 100).toFixed(1);

      let html = `<h3>Route to: <strong>${winner}</strong>
                  (${confidence}% confidence)</h3>
                  <p style="color:#666; font-size:0.9rem">
                  Full department score breakdown:</p>`;

      output.labels.forEach((label, i) => {
        const pct = (output.scores[i] * 100).toFixed(1);
        const barWidth = (output.scores[i] * 100).toFixed(0);
        html += `
          <div class="result-row">
            <span class="label-name">${label}</span>
            <div class="bar-container">
              <div class="bar" style="width: ${barWidth}%"></div>
            </div>
            <span class="score-text">${pct}%</span>
          </div>`;
      });

      resultsEl.innerHTML  = html;
      btn.disabled         = false;
      btn.textContent      = 'Route Ticket';
    }

    btn.addEventListener('click', routeTicket);
    loadModel().catch(err => {
      statusEl.textContent = `Error: ${err.message}`;
    });
  </script>
</body>
</html>

The DEPARTMENTS array is all the routing configuration this system needs. No training data, no labeled examples. When a ticket arrives, classifier(text, DEPARTMENTS, { multi_label: false }) runs all five entailment checks internally and returns them ranked. The results loop builds a horizontal bar chart showing each department's score, a sorted visualization that makes it immediately obvious where the ticket should go and how confident the model was. Try changing the DEPARTMENTS array to completely different labels; the model routes to the new labels without any code change beyond that array.
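In a real queue you may not want to auto-route every ticket. One option is to defer to a human when the top score is low or the runner-up is close. The sketch below assumes the output shape shown earlier (sorted labels and scores); `pickDepartment` and both thresholds are hypothetical and would need tuning.

```javascript
// Picks a department from zero-shot output, or defers to manual review
// when the prediction is weak or ambiguous. Thresholds are assumptions.
function pickDepartment(output, { minScore = 0.5, minMargin = 0.15 } = {}) {
  const [top, second = 0] = output.scores; // scores arrive sorted, highest first
  const confident = top >= minScore && top - second >= minMargin;
  return confident ? output.labels[0] : 'manual review';
}

const clearCase = {
  labels: ['billing', 'returns', 'account access'],
  scores: [0.871, 0.063, 0.031],
};
const ambiguousCase = {
  labels: ['returns and refunds', 'shipping and delivery'],
  scores: [0.41, 0.39],
};

console.log(pickDepartment(clearCase));     // billing
console.log(pickDepartment(ambiguousCase)); // manual review
```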

# Task 3: Question Answering #

Question answering in Transformers.js is extractive: you provide a passage of text as context and ask a question in plain English. The model locates the span within the passage that best answers the question and returns it. It does not generate text or reason beyond what is literally in the context. The answer is always a substring of the input you provided.

This makes it well-suited for document interrogation. The user provides the document; the model navigates it.

What the output looks like:

const qa = await pipeline('question-answering', 'Xenova/distilbert-base-uncased-distilled-squad');

// The Transformers.js QA pipeline takes the question and context as
// two separate arguments.
const question = 'What is the return window for electronics?';
const context = `Our return policy allows customers to return most items within 30 days
            of purchase. Electronics must be returned within 15 days and must be
            in original packaging. Software and digital downloads are non-refundable.`;

const result = await qa(question, context);

// {
//   answer: '15 days',
//   score:  0.9823,
//   start:  97,    // character index of answer start in context
//   end:    104    // character index of answer end in context
// }
// Note: depending on the library version, start and end may be omitted.

The output has up to four fields. answer is the extracted text. score is the model's confidence that this span answers the question. start and end, when the library version you are running returns them, are character indices into the original context; you can use these to highlight the answer in the source text, which is valuable UX for longer documents.

When the question has no clear answer in the context, score will be low and answer may be a short, seemingly random span. Treating low-confidence answers (below 0.3 or 0.4) as "not found" is standard practice.
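That rule fits in a one-line guard. The helper below is a minimal sketch: `answerOrNull` is a hypothetical name, and the 0.3 default simply follows the range mentioned above.

```javascript
// Returns the extracted answer, or null when the model is not confident
// enough for the span to be trusted. The default cutoff is an assumption.
function answerOrNull(result, minScore = 0.3) {
  return result.score >= minScore ? result.answer : null;
}

console.log(answerOrNull({ answer: '15 days', score: 0.9823 })); // 15 days
console.log(answerOrNull({ answer: 'the', score: 0.04 }));       // null
```

Returning null rather than an empty string makes the "not found" case explicit, so the calling code has to decide what to show.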

// Full Working Example


<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Document Q&A with Transformers.js</title>
  <style>
    body { font-family: system-ui, sans-serif; max-width: 720px;
           margin: 2rem auto; padding: 0 1rem; }
    label { font-weight: 600; display: block; margin-top: 1rem; }
    textarea { width: 100%; padding: 0.5rem; font-size: 0.95rem; }
    input[type="text"] { width: 100%; padding: 0.5rem;
                         font-size: 0.95rem; box-sizing: border-box; }
    button { margin-top: 0.75rem; padding: 0.5rem 1.5rem;
             font-size: 1rem; cursor: pointer; }
    button:disabled { opacity: 0.5; cursor: not-allowed; }
    #status { color: #666; font-size: 0.9rem; margin: 0.5rem 0; }
    #answer-box { margin-top: 1rem; padding: 1rem;
                  background: #f8fafc; border-left: 3px solid #2563eb; }
    .highlight { background: #fef08a; border-radius: 2px; }
    .confidence { color: #666; font-size: 0.85rem; margin-top: 0.5rem; }
  </style>
</head>
<body>
  <h1>Document Question Answering</h1>
  <p>Paste any document, then ask questions about it.
     Answers are extracted directly from the text.</p>

  <label for="context">Document / Context</label>
  <textarea id="context" rows="8">
Acme Corp Return Policy (Updated March 2025)

Customers may return most standard items within 30 days of the original purchase
date for a full refund. Electronics and peripherals have a shorter return window
of 15 days and must be returned in original, unopened packaging to qualify.

Refunds are processed within 3-5 business days after we receive the returned item.
Original shipping charges are non-refundable. For items valued over $200, customers
must contact support at returns@acmecorp.com before initiating a return.

Software licenses and digital downloads are non-refundable under any circumstances.
Gift cards cannot be returned or exchanged for cash.
  </textarea>

  <label for="question">Your Question</label>
  <input type="text" id="question"
         value="How long does it take to process a refund?" />

  <button id="ask-btn" disabled>Loading model...</button>
  <div id="status">Downloading model on first run...</div>
  <div id="answer-box" style="display:none"></div>

  <script type="module">
    import { pipeline } from
      'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';

    const contextEl   = document.getElementById('context');
    const questionEl  = document.getElementById('question');
    const statusEl    = document.getElementById('status');
    const answerBox   = document.getElementById('answer-box');
    const btn         = document.getElementById('ask-btn');

    const CONFIDENCE_THRESHOLD = 0.1;

    let qaModel;

    async function loadModel() {
      qaModel = await pipeline(
        'question-answering',
        'Xenova/distilbert-base-uncased-distilled-squad',
        {
          dtype: 'q8',
          progress_callback: (p) => {
            if (p.status === 'progress') {
              statusEl.textContent =
                `Downloading model: ${Math.round(p.progress ?? 0)}%`;
            }
          }
        }
      );

      btn.disabled    = false;
      btn.textContent = 'Ask';
      statusEl.textContent = 'Model ready.';
    }

    async function askQuestion() {
      const context  = contextEl.value.trim();
      const question = questionEl.value.trim();
      if (!context || !question) return;

      btn.disabled         = true;
      btn.textContent      = 'Thinking...';
      answerBox.style.display = 'none';

      // Question and context are passed as two separate arguments.
      const result = await qaModel(question, context);

      answerBox.style.display = 'block';

      if (result.score < CONFIDENCE_THRESHOLD) {
        answerBox.innerHTML = `
          <strong>Answer not found</strong>
          <p class="confidence">The model could not find a clear answer
          to this question in the provided text.</p>`;
      } else {
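        // start/end may be missing in some library versions, so fall back
        // to a case-insensitive search for the answer text.
        const start = result.start ??
          context.toLowerCase().indexOf(result.answer.toLowerCase());
        const end       = result.end ?? start + result.answer.length;
        const before    = start < 0 ? context : context.slice(0, start);
        const answer    = start < 0 ? '' : context.slice(start, end);
        const after     = start < 0 ? '' : context.slice(end);
        const highlight = `${before}<mark class="highlight">${answer}</mark>${after}`;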

        const confidence = (result.score * 100).toFixed(1);
        answerBox.innerHTML = `
          <strong>Answer:</strong> ${result.answer}
          <p class="confidence">Confidence: ${confidence}%</p>
          <details style="margin-top:1rem">
            <summary style="cursor:pointer; color:#2563eb">
              Show answer highlighted in document
            </summary>
            <pre style="white-space:pre-wrap; font-size:0.85rem;
                        margin-top:0.5rem">${highlight}</pre>
          </details>`;
      }

      btn.disabled    = false;
      btn.textContent = 'Ask';
    }

    btn.addEventListener('click', askQuestion);

    questionEl.addEventListener('keydown', (e) => {
      if (e.key === 'Enter' && !btn.disabled) askQuestion();
    });

    loadModel().catch(err => {
      statusEl.textContent = `Error: ${err.message}`;
    });
  </script>
</body>
</html>

The QA pipeline takes two inputs, a question and a context, rather than a single string. The code uses character offsets into the context string to inject a <mark> tag around the exact span the model identified. The <details> element wraps the highlighted context in a collapsible section so the UI stays clean. The confidence threshold prevents low-quality extractions from appearing as confident answers; any result below 0.1 gets replaced with a "not found" message.
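Because the highlighted context is written with innerHTML, a document containing angle brackets could break the markup. A slightly more defensive variant of the highlighting step is sketched below. `escapeHtml` and `highlightAnswer` are hypothetical helpers; the fallback search covers the case where the result carries no start and end offsets.

```javascript
// Escapes text so it can be placed inside innerHTML safely.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wraps the answer span in <mark>. Uses start/end when given, otherwise
// falls back to a case-insensitive search for the answer text.
function highlightAnswer(context, { answer, start, end }) {
  const from = Number.isInteger(start)
    ? start
    : context.toLowerCase().indexOf(answer.toLowerCase());
  if (from < 0) return escapeHtml(context); // nothing to highlight
  const to = Number.isInteger(end) ? end : from + answer.length;
  return (
    escapeHtml(context.slice(0, from)) +
    '<mark class="highlight">' + escapeHtml(context.slice(from, to)) + '</mark>' +
    escapeHtml(context.slice(to))
  );
}

const marked = highlightAnswer('Refunds take <5 days.', { answer: '<5 days' });
console.log(marked);
// Refunds take <mark class="highlight">&lt;5 days</mark>.
```

The case-insensitive fallback matters for uncased models, whose decoded answer may be lowercased relative to the source text.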

# Real-World Application: Support Ticket Router #

The three pipelines cover the full analytical surface of a support ticket. Sentiment tells you how the customer feels. Zero-shot classification routes the ticket to the right team. Question answering extracts the structured data you need: order number, product name, and the core issue, without parsing rules or regex.

This is a complete support ticket analysis tool that combines all three. It is a single, fully self-contained HTML file.


<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Support Ticket Analyzer</title>
  <style>
    * { box-sizing: border-box; }
    body { font-family: system-ui, sans-serif; max-width: 800px;
           margin: 2rem auto; padding: 0 1rem; background: #f9fafb; }
    h1   { margin-bottom: 0.25rem; }
    .subtitle { color: #666; margin-bottom: 1.5rem; }

    textarea { width: 100%; height: 130px; padding: 0.75rem;
               font-size: 0.95rem; border: 1px solid #d1d5db;
               border-radius: 6px; resize: vertical; }
    button { padding: 0.6rem 1.8rem; font-size: 1rem;
             background: #2563eb; color: white; border: none;
             border-radius: 6px; cursor: pointer; margin-top: 0.5rem; }
    button:disabled { background: #93c5fd; cursor: not-allowed; }

    #status { font-size: 0.85rem; color: #666; margin: 0.5rem 0; }

    .cards { display: grid; grid-template-columns: repeat(3, 1fr);
             gap: 1rem; margin-top: 1.5rem; }
    .card  { background: white; border-radius: 8px; padding: 1rem;
             border: 1px solid #e5e7eb; }
    .card h3 { margin: 0 0 0.75rem; font-size: 0.9rem;
               text-transform: uppercase; letter-spacing: 0.05em;
               color: #6b7280; }
    .card .value { font-size: 1.15rem; font-weight: 600; }
    .card .sub   { font-size: 0.85rem; color: #666; margin-top: 0.25rem; }

    .positive { color: #16a34a; }
    .negative { color: #dc2626; }
    .neutral  { color: #d97706; }

    .dept-bar { display: flex; align-items: center; gap: 0.5rem;
                margin-top: 0.4rem; font-size: 0.85rem; }
    .bar-bg   { flex: 1; background: #f0f0f0; border-radius: 3px; height: 8px; }
    .bar-fill { background: #2563eb; height: 100%;
                border-radius: 3px; transition: width 0.4s; }

    .qa-item  { margin-top: 0.6rem; font-size: 0.9rem; }
    .qa-label { font-weight: 600; color: #374151; }
    .qa-ans   { color: #111; }
    .qa-low   { color: #9ca3af; font-style: italic; }

    @media (max-width: 600px) {
      .cards { grid-template-columns: 1fr; }
    }
  </style>
</head>
<body>
  <h1>Support Ticket Analyzer</h1>
  <p class="subtitle">Powered by Transformers.js -- runs entirely in your browser</p>

  <textarea id="ticket">
Hi, I ordered a laptop stand last Tuesday (order #73021) but it arrived completely
broken -- one of the arms snapped off right out of the box. I've been a customer for
three years and this is honestly really disappointing. I need a replacement sent out
as soon as possible or I'd like a full refund. Please advise.
  </textarea>

  <button id="analyze-btn" disabled>Loading models...</button>
  <div id="status">Initializing -- downloading models on first run...</div>

  <div class="cards" id="cards" style="display:none">
    <div class="card" id="card-sentiment">
      <h3>Sentiment</h3>
      <div class="value" id="sent-label">--</div>
      <div class="sub"   id="sent-score">--</div>
    </div>

    <div class="card" id="card-route">
      <h3>Department</h3>
      <div id="dept-results"></div>
    </div>

    <div class="card" id="card-qa">
      <h3>Key Info</h3>
      <div id="qa-results"></div>
    </div>
  </div>

  <script type="module">
    import { pipeline } from
      'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';

    const ticketEl  = document.getElementById('ticket');
    const btn       = document.getElementById('analyze-btn');
    const statusEl  = document.getElementById('status');
    const cardsEl   = document.getElementById('cards');

    const DEPARTMENTS = [
      'returns and refunds',
      'shipping and delivery',
      'billing and payment',
      'technical support',
      'account management'
    ];

    const QA_QUERIES = [
      { label: 'Order number',  question: 'What is the order number?' },
      { label: 'Issue',         question: 'What is the main problem or complaint?' },
      { label: 'Request',       question: 'What does the customer want?' }
    ];

    let sentimentPipe, zeroPipe, qaPipe;
    let modelsLoaded = 0;

    function onModelLoaded(name) {
      modelsLoaded++;
      statusEl.textContent =
        `Loading models: ${modelsLoaded}/3 ready (${name} loaded)`;
      if (modelsLoaded === 3) {
        btn.disabled    = false;
        btn.textContent = 'Analyze Ticket';
        statusEl.textContent = 'All models ready.';
      }
    }

    async function loadModels() {
      [sentimentPipe, zeroPipe, qaPipe] = await Promise.all([

        // 'done' fires once per downloaded file, which would overcount;
        // 'ready' fires once per pipeline when it is fully loaded.
        pipeline(
          'text-classification',
          'Xenova/distilbert-base-uncased-finetuned-sst-2-english',
          { dtype: 'q8',
            progress_callback: p =>
              p.status === 'ready' && onModelLoaded('Sentiment') }
        ),

        pipeline(
          'zero-shot-classification',
          'Xenova/bart-large-mnli',
          { dtype: 'q8',
            progress_callback: p =>
              p.status === 'ready' && onModelLoaded('Routing') }
        ),

        pipeline(
          'question-answering',
          'Xenova/distilbert-base-uncased-distilled-squad',
          { dtype: 'q8',
            progress_callback: p =>
              p.status === 'ready' && onModelLoaded('Q&A') }
        )
      ]);
    }

    async function analyzeTicket() {
      const text = ticketEl.value.trim();
      if (!text) return;

      btn.disabled    = true;
      btn.textContent = 'Analyzing...';
      cardsEl.style.display = 'none';

      const [sentResult, zeroResult, qaResults] = await Promise.all([
        sentimentPipe(text),
        zeroPipe(text, DEPARTMENTS, { multi_label: false }),
        Promise.all(
          QA_QUERIES.map(({ question }) =>
            qaPipe(question, text)
          )
        )
      ]);

      // The sentiment pipeline returns an array; take the first result.
      const [{ label, score }] = sentResult;
      const sentLabel = document.getElementById('sent-label');
      const sentScore = document.getElementById('sent-score');

      sentLabel.textContent = label;
      sentLabel.className   = `value ${label === 'POSITIVE' ? 'positive' : 'negative'}`;
      sentScore.textContent = `${(score * 100).toFixed(1)}% confidence`;

      if (label === 'NEGATIVE' && score > 0.85) {
        sentScore.textContent += ' -- HIGH URGENCY';
        sentScore.style.color = '#dc2626';
      }

      const deptEl = document.getElementById('dept-results');
      // Labels come back sorted by score, so the first one is the winner.
      deptEl.innerHTML = `<div class="value">${zeroResult.labels[0]}</div>`;

      zeroResult.labels.slice(0, 3).forEach((dept, i) => {
        const pct = (zeroResult.scores[i] * 100).toFixed(0);
        deptEl.innerHTML += `
          <div class="dept-bar">
            <span style="min-width:130px">${dept}</span>
            <div class="bar-bg">
              <div class="bar-fill" style="width:${pct}%"></div>
            </div>
            <span>${pct}%</span>
          </div>`;
      });

      const qaEl = document.getElementById('qa-results');
      qaEl.innerHTML = '';

      QA_QUERIES.forEach(({ label: qLabel }, i) => {
        const { answer, score: qScore } = qaResults[i];
        const found = qScore >= 0.1;

        qaEl.innerHTML += `
          <div class="qa-item">
            <span class="qa-label">${qLabel}: </span>
            <span class="${found ? 'qa-ans' : 'qa-low'}">
              ${found ? answer : 'not found'}
            </span>
          </div>`;
      });

      cardsEl.style.display = 'grid';
      btn.disabled    = false;
      btn.textContent = 'Analyze Ticket';
    }

    btn.addEventListener('click', analyzeTicket);
    loadModels().catch(err => {
      statusEl.textContent = `Error loading models: ${err.message}`;
    });
  </script>
</body>
</html>

The three pipelines load in parallel via Promise.all. This is faster than loading them sequentially because the downloads overlap. A counter tracks how many have finished, so the button only enables once all three are ready. When the user submits a ticket, all three inferences also run in parallel. The sentiment card checks whether the result is high-confidence negative (a score above 0.85) and flags it as urgent: a practical routing signal that requires no additional model.
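The loading pattern can be looked at separately from Transformers.js. The sketch below is a minimal illustration, not the router's own code: loadAllWithProgress and fakeLoad are hypothetical names, and the stand-in loaders only simulate a download with a timer. In the router, each load function would wrap one pipeline() call.

```javascript
// Hypothetical helper (not part of Transformers.js): run every loader in
// parallel and report progress each time one of them finishes.
async function loadAllWithProgress(loaders, onProgress) {
  let finished = 0;
  return Promise.all(
    loaders.map(async ({ name, load }) => {
      const value = await load();   // all loaders are already running here
      finished += 1;
      onProgress({ name, finished, total: loaders.length });
      return value;                 // Promise.all keeps the input order
    })
  );
}

// Stand-in loader so the sketch runs without downloading anything.
const fakeLoad = (value, ms) => () =>
  new Promise(resolve => setTimeout(() => resolve(value), ms));

const demo = loadAllWithProgress(
  [
    { name: 'Sentiment', load: fakeLoad('sentimentPipe', 30) },
    { name: 'Routing',   load: fakeLoad('zeroPipe', 10) },
    { name: 'Q&A',       load: fakeLoad('qaPipe', 20) },
  ],
  ({ name, finished, total }) =>
    console.log(`${finished}/${total} ready (${name} loaded)`)
);
```

Because the counter increments once per resolved loader rather than once per downloaded file, it can never exceed the total, and the results still arrive in the order the loaders were listed.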

The department card shows the top three candidates as score bars rather than just the winner, which gives the support team enough information to override the routing if the top score is close to the second. The QA card runs three extractive queries against the ticket body and displays the results with a confidence threshold: answers scoring below 0.1 show as "not found" rather than surfacing low-quality extractions.
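The routing rules described above are small enough to pull out as pure functions, which makes them easy to test without loading a model. This is a sketch under stated assumptions: the function names are hypothetical, the 0.85 and 0.1 thresholds come from the router, and the 0.1 "close call" margin and the sample department names are illustrative values that do not appear in the original code.

```javascript
const URGENT_SCORE = 0.85;  // from the router
const QA_MIN_SCORE = 0.1;   // from the router
const CLOSE_MARGIN = 0.1;   // assumption: tune against real tickets

// High-confidence negative sentiment is treated as urgent.
function isUrgent({ label, score }) {
  return label === 'NEGATIVE' && score > URGENT_SCORE;
}

// Pair labels with scores and keep the first n. Like the router, this
// relies on the labels arriving sorted by score, highest first.
function topDepartments({ labels, scores }, n = 3) {
  return labels.slice(0, n).map((label, i) => ({ label, score: scores[i] }));
}

// True when the runner-up is within `margin` of the winner.
function needsReview({ scores }, margin = CLOSE_MARGIN) {
  return scores.length > 1 && scores[0] - scores[1] < margin;
}

// Replace low-confidence extractions with a fixed placeholder.
function answerOrNotFound({ answer, score }, minScore = QA_MIN_SCORE) {
  return score >= minScore ? answer : 'not found';
}

// Illustrative zero-shot result (made-up labels and scores).
const sample = {
  labels: ['Billing', 'Technical Support', 'Shipping'],
  scores: [0.48, 0.41, 0.11],
};

console.log(topDepartments(sample, 2));
console.log(needsReview(sample));
console.log(answerOrNotFound({ answer: 'A-1042', score: 0.03 }));
```

Keeping the thresholds as named constants also puts the values a support team is most likely to tune in one place.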

# Performance, Limitations, and When Not to Use It #

Transformers.js removes the server but does not eliminate trade-offs. Knowing them up front saves you from unpleasant surprises in production.

Download size. The sentiment analysis pipeline downloads around 111 MB on first load: not huge, but not invisible either. The zero-shot BART model is larger. For applications targeting mobile users or users on metered connections, use dtype: 'q4' to cut model sizes roughly in half, and treat the model as a progressive enhancement; do not block the user interface on model load.
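One way to keep the interface responsive is to start the download in the background and let callers wait only when they actually need a prediction. The sketch below assumes a hypothetical wrapper named createLazyModel; the stand-in loader returns a fake classifier so the example runs without a network. In a real page the loader would call pipeline().

```javascript
// Hypothetical wrapper: begin loading immediately and expose the model
// through run(), which is safe to call before loading has finished.
function createLazyModel(load) {
  let state = 'loading';
  const ready = load().then(
    model => { state = 'ready'; return model; },
    error => { state = 'failed'; throw error; }
  );
  // Mark the rejection as handled in case run() is never called.
  ready.catch(() => {});

  return {
    status: () => state,  // 'loading' | 'ready' | 'failed'
    run: async (...args) => (await ready)(...args),
  };
}

// Stand-in loader; in the browser this would return the pipeline() result.
const sentiment = createLazyModel(async () => {
  return async text => [{ label: 'POSITIVE', score: 0.99 }];
});

// The page can render and accept input right away. Only this call waits.
sentiment.run('Thanks for the quick fix!').then(result => {
  console.log(sentiment.status(), result);
});
```

The status() value can drive a small "model loading" hint in the interface, and a 'failed' state lets the page fall back to a non-ML path instead of leaving a disabled button.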

Inference speed. On a modern laptop, WASM inference for a short text classification takes 50–200 ms. Zero-shot classification is slower because it runs multiple NLI passes, one per candidate label. A five-label zero-shot run typically takes 1–3 seconds on CPU. WebGPU reduces this significantly where supported.

Inference only. Transformers.js cannot fine-tune or train models. If your use case requires a custom model (a classifier trained on your own labelled tickets, for example), training happens on a server (Python, cloud), and the ONNX export runs in the browser.

Model availability. Not every model on Hugging Face Hub has an ONNX version available. To find compatible models, filter by the transformers.js library tag on the Hub.

When to prefer a server instead: bulk processing of hundreds of texts where latency per item matters; tasks that require the largest frontier models, which are too large for browser delivery; or simple applications where the development cost of browser-based inference outweighs its benefits.
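The latency figures above depend heavily on the device, so it is worth measuring on the hardware your users actually have. The helper below is a minimal sketch with a hypothetical name, medianLatencyMs; it times any async function and uses performance.now(), which is available in browsers and recent Node.js versions. The stand-in workload is a timer, not a real model.

```javascript
// Hypothetical helper: run an async function several times and return the
// median duration in milliseconds. One untimed warm-up call comes first,
// because the first inference often includes one-time setup cost.
async function medianLatencyMs(fn, runs = 5) {
  await fn();  // warm-up, not measured
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Stand-in workload; replace with e.g. () => zeroPipe(text, labels).
const fakeInference = () => new Promise(resolve => setTimeout(resolve, 20));

medianLatencyMs(fakeInference, 3).then(ms => {
  console.log(`median latency: ${ms.toFixed(1)} ms`);
});
```

Comparing the median for one candidate label against the median for five is a quick way to see how zero-shot cost grows with the label count on a given machine.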

A quick reference for choosing dtype by context:

| Context | Recommended dtype | Why |
|---|---|---|
| Browser, general use | q8 | WASM default, good balance of size and accuracy |
| Mobile or slow connection | q4 | Roughly half the file size, 1-3% accuracy cost |
| Node.js server-side | fp32 | Full precision, no download size concern |
| WebGPU enabled | fp16 | Fast, good quality on compatible GPU hardware |
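The table can be turned into a small selection function. This is a sketch, not a recommendation from the library: chooseDtype and detectContext are hypothetical names, the order in which the rules are checked is an assumption, and navigator.connection.saveData belongs to the Network Information API, which not all browsers implement.

```javascript
// Map the contexts in the table to a dtype string. Only the dtype values
// come from the table; the flag names and rule priority are assumptions.
function chooseDtype({ isNode = false, hasWebGPU = false, slowConnection = false } = {}) {
  if (isNode) return 'fp32';          // no download size concern
  if (slowConnection) return 'q4';    // smallest download
  if (hasWebGPU) return 'fp16';       // compatible GPU hardware
  return 'q8';                        // general browser default
}

// Best-effort environment detection; every check tolerates a missing API.
function detectContext() {
  const isNode =
    typeof process !== 'undefined' && Boolean(process.versions?.node);
  const nav = typeof navigator !== 'undefined' ? navigator : {};
  return {
    isNode,
    hasWebGPU: 'gpu' in nav,                            // WebGPU entry point
    slowConnection: Boolean(nav.connection?.saveData),  // user asked to save data
  };
}

console.log(chooseDtype(detectContext()));
```

The result would be passed as the dtype option when creating a pipeline, in the same position as the 'q8' value in the router.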

# Wrapping Up #

Transformers.js puts production-quality NLP in the browser without a server, without an API key, and without user data leaving the device. The three pipelines in this tutorial (text classification, zero-shot labelling, and question answering) cover a large share of real NLP use cases. The support ticket router shows how they combine into something genuinely useful in fewer than 200 lines of HTML and JavaScript.

The entry point is as low as it gets: one CDN import, one await pipeline() call, one inference call. Start with the simplest example in this article and run it. Modify the labels in the zero-shot demo. Point the QA model at a different document. The official Transformers.js documentation and the examples repository cover a much wider task range: summarization, translation, named entity recognition, and more, all following the same pipeline() pattern.

Shittu Olumide is a software engineer and technical writer passionate about leveraging cutting-edge technologies to craft compelling narratives, with a keen eye for detail and a knack for simplifying complex concepts.

source & further reading

kdnuggets.com (original article)
