[ARTICLE · art-39107] src=dev.to ↗ pub= topic=computer-vision verified=true sentiment=· neutral

How I built a browser-side background remover (and benchmarked Canvas vs WebAssembly)

A developer built two browser-based background removal tools—one using the Canvas API for uniform backgrounds and another using WebAssembly with an ONNX Runtime machine learning model for complex backgrounds. Benchmarking 500 product photos on a 2023 MacBook Pro, the Canvas approach processed images in 12-18 ms with 24 MB memory, while the ML model took 180-220 ms per image and 180 MB memory but handled complex backgrounds like hair and fur.

5 min read · published Jun 25, 2026

I had 300 product photos sitting in a folder. White backgrounds, mostly. I needed them on transparent backgrounds for a client's Shopify store. The obvious move: upload them to some online background remover. Five minutes, done.

Then I thought about it. These were unreleased product shots. Uploading them to a random server felt wrong. Plus, I'd have to do this every month when new products came in. I wanted something that ran locally.

Browsers can do this now. The question was how well.

There are two practical ways to remove backgrounds in the browser without sending pixels to a server:

Canvas API pixel bashing. You load the image onto a <canvas>, grab the pixel data with getImageData(), and manually set alpha values. Pick a reference color from the background, calculate each pixel's distance from it, threshold it. This is fast and needs zero dependencies. But it only works on uniform backgrounds.

WebAssembly + ML model. You load a segmentation model and run inference in the browser on a runtime compiled to WASM. ONNX Runtime Web makes this practical. It handles hair, fur, and complex edges, but you're downloading an 8+ MB model and burning more CPU.

I built both and ran them through 500 test images. Here's what happened.

The code is straightforward. Load the image, sample a pixel from a corner (assuming that's the background), and set alpha to zero for every pixel within a threshold:

function removeBackgroundCanvas(image, threshold = 40) {
  const canvas = document.createElement('canvas');
  canvas.width = image.width;
  canvas.height = image.height;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(image, 0, 0);

  const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
  const data = imageData.data;

  // Sample background from top-left corner
  const bgR = data[0], bgG = data[1], bgB = data[2];

  for (let i = 0; i < data.length; i += 4) {
    const dr = data[i] - bgR;
    const dg = data[i + 1] - bgG;
    const db = data[i + 2] - bgB;
    const distance = Math.sqrt(dr * dr + dg * dg + db * db);

    if (distance < threshold) {
      data[i + 3] = 0; // Set alpha to 0
    }
  }

  ctx.putImageData(imageData, 0, 0);
  return canvas;
}

This works surprisingly well on studio-lit product photos. A white or light-gray background gets nuked in about 15 milliseconds per image on my laptop. No dependencies, no spinners, no model downloads.

The problem: as soon as the background isn't uniform — a gradient, a textured wall, someone's shirt that's close to the wall color — it falls apart. You get jagged edges, halos, or chunks of the subject vanishing.
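One common way to soften the jagged edges and halos is to replace the hard cutoff with a ramp, so pixels near the threshold become partially transparent instead of flipping between opaque and invisible. The sketch below is not from the original tool; `featherBackground`, `inner`, and `outer` are hypothetical names and values chosen for illustration, and it works directly on the RGBA byte array you would get from `imageData.data`.

```javascript
// Hypothetical helper: maps color distance to alpha with a linear ramp
// instead of a single hard threshold. `inner` and `outer` are illustrative.
function featherBackground(data, inner = 30, outer = 60) {
  // Sample the background color from the top-left pixel before any writes.
  const bgR = data[0], bgG = data[1], bgB = data[2];

  for (let i = 0; i < data.length; i += 4) {
    const dr = data[i] - bgR;
    const dg = data[i + 1] - bgG;
    const db = data[i + 2] - bgB;
    const distance = Math.sqrt(dr * dr + dg * dg + db * db);

    if (distance <= inner) {
      // Clearly background: fully transparent.
      data[i + 3] = 0;
    } else if (distance < outer) {
      // Transition band: scale the existing alpha linearly from 0 to 1.
      const t = (distance - inner) / (outer - inner);
      data[i + 3] = Math.round(data[i + 3] * t);
    }
    // distance >= outer: clearly foreground, alpha left untouched.
  }
  return data;
}
```

This does not rescue gradients or textured walls. It only trades a hard, aliased edge for a softer one on backgrounds that are already close to uniform.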

For real images, you need a model that understands what's foreground and what's not. MediaPipe's selfie segmentation model runs in the browser and can be loaded via ONNX Runtime Web:

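import * as ort from 'onnxruntime-web';

// Create the session once and reuse it. Creating it inside every call
// would reload the model for each image.
let sessionPromise = null;

async function removeBackgroundML(image) {
  sessionPromise ??= ort.InferenceSession.create('model.onnx');
  const session = await sessionPromise;

  // Preprocess: resize to model input size, normalize
  const tensor = preprocessImage(image, 256, 256);

  // 'input' and 'output' must match the tensor names in your model
  const results = await session.run({ input: tensor });
  const mask = results.output.data; // Float32Array, 256x256

  // Apply mask as alpha channel
  return applyMaskToImage(image, mask);
}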
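The snippet calls `preprocessImage` and `applyMaskToImage` without showing them. Here is a minimal sketch of the pixel-level work they would need to do, written against plain typed arrays so it makes no assumptions about the DOM. `rgbaToCHW` and `applyMaskToAlpha` are hypothetical names, and the planar channel-first layout with values scaled to [0, 1] is an assumption: models converted from TFLite often expect channel-last input instead, so check your model's input shape.

```javascript
// Hypothetical helper: converts RGBA bytes (already resized to
// width x height) into planar channel-first floats in [0, 1].
// Layout and scaling are assumptions, not taken from the article.
function rgbaToCHW(rgba, width, height) {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let p = 0; p < plane; p++) {
    out[p] = rgba[p * 4] / 255;                 // R plane
    out[plane + p] = rgba[p * 4 + 1] / 255;     // G plane
    out[2 * plane + p] = rgba[p * 4 + 2] / 255; // B plane
  }
  return out;
}

// Hypothetical helper: writes a low-resolution mask (values in [0, 1])
// into the alpha channel of full-size RGBA data using nearest-neighbor
// sampling. Bilinear sampling would give smoother edges.
function applyMaskToAlpha(rgba, width, height, mask, maskW, maskH) {
  for (let y = 0; y < height; y++) {
    const my = Math.min(maskH - 1, Math.floor((y * maskH) / height));
    for (let x = 0; x < width; x++) {
      const mx = Math.min(maskW - 1, Math.floor((x * maskW) / width));
      // Clamp in case the model emits values slightly outside [0, 1].
      const m = Math.min(1, Math.max(0, mask[my * maskW + mx]));
      rgba[(y * width + x) * 4 + 3] = Math.round(m * 255);
    }
  }
  return rgba;
}
```

In the browser, the output of `rgbaToCHW` would be wrapped as `new ort.Tensor('float32', data, [1, 3, 256, 256])` before being passed to `session.run`, assuming the model takes a single 256x256 channel-first image.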

The catch: the model file is 8.3 MB. First load takes about 1.2 seconds on a fast connection. Inference takes roughly 180-220 milliseconds per image. You're also pulling in onnxruntime-web, which adds about 2 MB to your bundle.

But the output is dramatically better. Hair strands, fur, transparent objects — things the Canvas approach can't touch — get handled reasonably well.

I ran 500 images through both approaches on a 2023 MacBook Pro (M2, 16 GB RAM):

| Metric | Canvas | WebAssembly |
|---|---|---|
| Time per image | 12-18 ms | 180-220 ms |
| Model load time | 0 ms | 1,100 ms |
| Memory peak | 24 MB | 180 MB |
| Works on uniform bg | Yes | Yes |
| Works on complex bg | No | Yes (mostly) |
| Handles hair/fur | No | Yes |
| Bundle size added | 0 KB | ~10 MB |

The Canvas approach is basically free — you're already paying for image decoding, and the pixel loop runs at native speed once the JIT kicks in. On 500 images, total processing time was under 8 seconds.

The WebAssembly approach took about 95 seconds for the same batch, plus the initial model download. But it handled 412 out of 500 images correctly, versus 187 for Canvas.

I combined both. The tool tries the Canvas approach first. If more than 30% of edge pixels end up partially transparent — a sign of a non-uniform background — it falls back to the ML model. This hybrid approach averages about 40 ms per image on typical product photo batches.
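The fallback check can be sketched as a small function over the RGBA output of the Canvas pass. The post does not show its implementation, so this is one possible reading: "edge pixels" is taken to mean the outer one-pixel frame of the image, and "partially transparent" means an alpha strictly between 0 and 255. That only occurs if the Canvas pass produces soft alpha values. `partialBorderFraction`, `needsMLFallback`, and `expectedMs` are hypothetical names.

```javascript
// Fraction of pixels on the image's outer frame whose alpha is
// neither fully transparent (0) nor fully opaque (255).
function partialBorderFraction(rgba, width, height) {
  let border = 0;
  let partial = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const onBorder =
        x === 0 || y === 0 || x === width - 1 || y === height - 1;
      if (!onBorder) continue;
      border++;
      const alpha = rgba[(y * width + x) * 4 + 3];
      if (alpha > 0 && alpha < 255) partial++;
    }
  }
  return border === 0 ? 0 : partial / border;
}

// True when the Canvas result looks unreliable and the ML model
// should be used instead. The 0.3 limit is the 30% from the text.
function needsMLFallback(rgba, width, height, limit = 0.3) {
  return partialBorderFraction(rgba, width, height) > limit;
}

// Simple cost model: every image pays for the Canvas pass, and a
// fraction `fallbackRate` additionally pays for ML inference.
function expectedMs(canvasMs, mlMs, fallbackRate) {
  return canvasMs + fallbackRate * mlMs;
}
```

Under this cost model, taking roughly 15 ms for Canvas and 200 ms for inference, `expectedMs(15, 200, 0.125)` comes out at 40 ms. An average of about 40 ms would therefore correspond to about one image in eight falling back to the model.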

If you need a browser-based background remover that handles both simple and complex images, this one runs entirely on your machine. The UI is in Spanish, but drag-and-drop needs no translation.

A few things I learned that aren't obvious from the docs:

**getImageData() triggers a GPU-to-CPU readback on most browsers.** If you're processing multiple images in sequence, batch your Canvas operations before reading pixels back, or you'll pay the sync cost every time.
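A related knob, not mentioned in the post, is the `willReadFrequently` option on `getContext('2d')`, which hints to the browser that the canvas will be read back often so it can keep the backing store where reads are cheap. The sketch below reuses one scratch canvas for a whole batch. `createPixelReader` is a hypothetical name, and whether the hint helps depends on the browser, so measure before relying on it.

```javascript
// Hypothetical helper: wraps a single scratch canvas that is reused
// for every image in a batch. The context is requested once, with a
// hint that pixel readbacks will be frequent.
function createPixelReader(canvas) {
  const ctx = canvas.getContext('2d', { willReadFrequently: true });

  return function readPixels(image) {
    // Resizing clears the canvas, so no explicit clearRect is needed.
    canvas.width = image.width;
    canvas.height = image.height;
    ctx.drawImage(image, 0, 0);
    return ctx.getImageData(0, 0, canvas.width, canvas.height);
  };
}
```

In a page this would be used as `const readPixels = createPixelReader(document.createElement('canvas'))`, followed by one `readPixels(img)` call per decoded image.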

ONNX Runtime Web has two backends: wasm and webgl. The WebGL backend is 3-5x faster for inference but only works if WebGL is available. On the wasm backend, always check ort.env.wasm.numThreads and set it to navigator.hardwareConcurrency, otherwise you'll leave cores idle.

For product photos, 256x256 model input is enough. Going to 512x512 buys you noticeably better edges but roughly 4x the inference time. Not worth it unless you're doing professional retouching.
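The backend and thread settings from the ONNX Runtime Web tip can be sketched as follows. `ort` is the `onnxruntime-web` module, passed in as a parameter so the snippet has no imports. `threadCount`, `createSession`, and the `env` object are hypothetical names. The cross-origin isolation check is an addition of mine: multi-threaded WASM relies on SharedArrayBuffer, which browsers only expose on cross-origin isolated pages.

```javascript
// Picks a thread count for the wasm backend. Without cross-origin
// isolation, threads are unavailable, so fall back to a single thread.
function threadCount(hardwareConcurrency, crossOriginIsolated) {
  if (!crossOriginIsolated) return 1;
  return Math.max(1, hardwareConcurrency || 1);
}

// Configures the wasm backend and creates a session. The thread count
// must be set before the first session is created.
// `env` stands in for { hardwareConcurrency: navigator.hardwareConcurrency,
//                       crossOriginIsolated: self.crossOriginIsolated }.
async function createSession(ort, modelUrl, env) {
  ort.env.wasm.numThreads = threadCount(
    env.hardwareConcurrency,
    env.crossOriginIsolated
  );
  // Execution providers are listed in order of preference.
  return ort.InferenceSession.create(modelUrl, {
    executionProviders: ['webgl', 'wasm'],
  });
}
```

Listing `webgl` first reflects the speedup reported in the tip. Not every model runs on the WebGL backend, so it is worth testing with `['wasm']` alone as well.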

If you just need to quickly strip a background from a photo, try the Canvas approach first. It's simpler than you think. If it doesn't work, the browser can run a real ML model now — you just need to wait a second.
