Most background removal tools work like this: upload your photo to a server, wait for an AI model to process it, download the result. Your image sits on someone else's infrastructure. You hope they delete it.
I built one that works differently. The AI model runs in your browser tab. Your image never leaves your device. And I just open-sourced the core logic — two files, zero dependencies beyond a CDN import.
Here's how it works under the hood.
The Pipeline #
The full flow from "user drops an image" to "transparent PNG download" goes through five stages:
Upload → ONNX Model Load → WebAssembly Inference → Mask Generation → Canvas Compositing
Each stage runs entirely client-side. Let me walk through them.
Stage 1: The AI Model in the Browser #
The backbone is @imgly/background-removal, an open-source library that bundles an ONNX segmentation model with ONNX Runtime Web (WebAssembly backend).
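const LIB_CDN = 'https://cdn.jsdelivr.net/npm/@imgly/background-removal@1.5.5';
let removeBackgroundFn = null; // set once the library has loaded
async function loadLibrary() {
  // jsDelivr's /+esm endpoint serves the package as an ES module
  const module = await import(LIB_CDN + '/+esm');
  removeBackgroundFn = module.removeBackground;
}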
The first call downloads ~40MB of model weights. That sounds heavy, but:
- The browser caches it automatically
- Subsequent uses load instantly from cache
- No server round-trip on any future use
This is the same trade-off FFmpeg.wasm makes — big initial download, but then your browser becomes a local processing powerhouse.
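The browser cache covers repeat visits. Within a single page session, one way to make sure the import is only kicked off once (say, if a user drops two images in quick succession) is to memoize the loader promise. This is a minimal sketch, not code from the repo: createLazyLoader and importer are hypothetical names.

```javascript
// Sketch: memoize a loader so the library (and its model download)
// is requested at most once per page. `createLazyLoader` and `importer`
// are hypothetical names, not part of @imgly/background-removal.
function createLazyLoader(importer) {
  let pending = null; // cached promise shared by all callers

  return function load() {
    if (pending === null) {
      // The executor runs synchronously, so importer() is called right away;
      // a synchronous throw inside it becomes a rejected promise.
      pending = new Promise((resolve) => resolve(importer())).catch((err) => {
        pending = null; // allow a retry after a failed download
        throw err;
      });
    }
    return pending;
  };
}

// Browser usage (URL built from the LIB_CDN constant above):
// const loadLibrary = createLazyLoader(() =>
//   import('https://cdn.jsdelivr.net/npm/@imgly/background-removal@1.5.5/+esm'));
// const { removeBackground } = await loadLibrary();
```

Every caller awaits the same promise, and a failed download clears the cache so the next call can try again.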
Stage 2: Running AI Inference Locally #
Once the model is loaded, inference is straightforward:
const imageBlob = await new Promise(r => canvas.toBlob(r, 'image/png'));
const resultBlob = await removeBackgroundFn(imageBlob, {
  model: 'medium',
  output: { format: 'image/png' },
  progress: (key, current, total) => {
    // Update UI
  }
});
What's happening behind the scenes:
- The library resizes your image to the model's input dimensions
- Pixel data is converted to a tensor
- ONNX Runtime Web runs the segmentation model via WebAssembly
- The output tensor (a per-pixel foreground probability map) is converted back to an image with transparent background
The medium model balances quality and speed. On a decent laptop, inference takes 2-5 seconds for a typical photo. On a phone, maybe 8-15 seconds. Acceptable for a free, private tool.
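To make the "pixel data is converted to a tensor" step concrete, here is a rough sketch of what such a conversion can look like. The planar CHW layout and the [0, 1] scaling are assumptions for illustration; the library's real preprocessing may normalize differently, and rgbaToChwTensor is a hypothetical name.

```javascript
// Sketch: convert RGBA bytes (as returned by getImageData) into a planar
// float tensor in CHW order, scaled to [0, 1]. Layout and normalization
// are assumptions; the library's actual preprocessing may differ.
function rgbaToChwTensor(rgba, width, height) {
  const pixels = width * height;
  if (rgba.length !== pixels * 4) {
    throw new RangeError('rgba length does not match width * height * 4');
  }
  const data = new Float32Array(3 * pixels);
  for (let p = 0; p < pixels; p++) {
    data[p] = rgba[p * 4] / 255;                  // R plane
    data[pixels + p] = rgba[p * 4 + 1] / 255;     // G plane
    data[2 * pixels + p] = rgba[p * 4 + 2] / 255; // B plane (alpha is dropped)
  }
  // dims follows the common [batch, channels, height, width] convention
  return { data, dims: [1, 3, height, width] };
}
```

The interleaved RGBA bytes from the canvas become three separate channel planes, which is the shape many image models expect as input.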
Stage 3: Building the Editable Mask #
Here's where it gets interesting. The AI output isn't final — it's a starting point. I extract the alpha channel from the AI result and build an editable grayscale mask:
async function buildMaskFromResult() {
  const w = originalImage.naturalWidth;
  const h = originalImage.naturalHeight;
  // Draw AI result to a temporary canvas
  const resultCanvas = document.createElement('canvas');
  resultCanvas.width = w;
  resultCanvas.height = h;
  const rCtx = resultCanvas.getContext('2d');
  rCtx.drawImage(resultImg, 0, 0);
  const resultData = rCtx.getImageData(0, 0, w, h);
  // Extract alpha channel → grayscale mask
  // White = foreground (keep), Black = background (remove)
  maskCanvas = document.createElement('canvas');
  maskCanvas.width = w;
  maskCanvas.height = h;
  maskCtx = maskCanvas.getContext('2d');
  const maskData = maskCtx.createImageData(w, h);
  for (let i = 0; i < resultData.data.length; i += 4) {
    const alpha = resultData.data[i + 3];
    maskData.data[i] = alpha;     // R
    maskData.data[i + 1] = alpha; // G
    maskData.data[i + 2] = alpha; // B
    maskData.data[i + 3] = 255;   // A (mask itself is always opaque)
  }
  maskCtx.putImageData(maskData, 0, 0);
}
Why a separate mask canvas?
Because users need to fix the AI's mistakes. Hair edges, transparent objects, similar-colored backgrounds — no AI gets these perfect 100% of the time. The mask canvas becomes a paintable surface.
Stage 4: Manual Refinement with Brush & Eraser #
This is the feature that separates a toy demo from a usable tool. Users can:
- Brush (paint white on mask) → restore foreground areas the AI removed
- Eraser (paint black on mask) → remove background areas the AI missed
function paintOnMask(e) {
  const rect = editCanvas.getBoundingClientRect();
  const x = (e.clientX - rect.left) / rect.width * maskCanvas.width;
  const y = (e.clientY - rect.top) / rect.height * maskCanvas.height;
  const brushSize = parseInt(brushSizeEl.value, 10);
  const softness = parseInt(brushSoftEl.value, 10) / 100;
  maskCtx.lineCap = 'round';
  maskCtx.lineWidth = brushSize;
  // Softness = blur filter on the mask canvas context; reset to 'none'
  // at zero softness so an earlier blur does not carry over
  maskCtx.filter = softness > 0
    ? `blur(${Math.round(brushSize * softness * 0.3)}px)`
    : 'none';
  if (currentTool === 'brush') {
    maskCtx.globalCompositeOperation = 'lighter';
    maskCtx.strokeStyle = '#ffffff';
  } else {
    maskCtx.globalCompositeOperation = 'source-over';
    maskCtx.strokeStyle = '#000000';
  }
  maskCtx.beginPath();
  maskCtx.moveTo(lastX, lastY);
  maskCtx.lineTo(x, y);
  maskCtx.stroke();
  // Remember this point so the next segment starts here
  lastX = x;
  lastY = y;
}
Key details:
- Coordinate mapping: The edit canvas is CSS-scaled to fit the viewport, but the mask operates at full image resolution. Every mouse position gets mapped from display coordinates to mask coordinates.
- Edge softness: Uses the Canvas 2D filter: blur() on the stroke, which creates feathered edges instead of hard cuts.
- Undo stack: Each mousedown saves a full ImageData snapshot of the mask. Up to 20 undo levels.
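The capped undo stack can be sketched roughly like this. createUndoStack is a hypothetical helper, not the repo's code; in the browser each snapshot would be the ImageData returned by maskCtx.getImageData(0, 0, w, h) on mousedown.

```javascript
// Sketch: a bounded undo stack. `createUndoStack` is a hypothetical helper.
// Snapshots can be any value; in the tool they would be ImageData objects.
function createUndoStack(maxLevels = 20) {
  const snapshots = [];
  return {
    push(snapshot) {
      snapshots.push(snapshot);
      if (snapshots.length > maxLevels) snapshots.shift(); // drop the oldest
    },
    pop() {
      return snapshots.pop(); // undefined when there is nothing to undo
    },
    get size() {
      return snapshots.length;
    },
  };
}

// Browser usage (assumes maskCtx, w, h from the earlier snippets):
// undo.push(maskCtx.getImageData(0, 0, w, h));   // on mousedown
// const prev = undo.pop();                       // on undo
// if (prev) maskCtx.putImageData(prev, 0, 0);
```

The cap matters for memory: a full snapshot of a 4000×3000 mask is about 48MB, so 20 levels could approach 960MB on a large photo.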
The brush cursor is a position: fixed div that follows the mouse, sized to match the display-scaled brush diameter. The actual canvas cursor is set to none.
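Both scale conversions (pointer position into mask space, brush size back into display space) are simple ratios. Here is a sketch as pure functions, assuming rect is the result of editCanvas.getBoundingClientRect(); the function names are hypothetical.

```javascript
// Sketch: the two scale conversions between display space and mask space.
// `rect` has the shape returned by getBoundingClientRect().
// Function names are hypothetical, not taken from the repo.
function displayToMask(clientX, clientY, rect, maskWidth, maskHeight) {
  return {
    x: ((clientX - rect.left) / rect.width) * maskWidth,
    y: ((clientY - rect.top) / rect.height) * maskHeight,
  };
}

function cursorDiameterPx(brushSize, rect, maskWidth) {
  // brushSize is in mask pixels; the cursor div is sized in CSS pixels.
  return (brushSize * rect.width) / maskWidth;
}
```

For a 4000×3000 mask shown at 400×300 CSS pixels, a pointer at the center of the canvas maps to (2000, 1500) in the mask, and an 80-pixel brush gets an 8-pixel cursor.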
Stage 5: Compositing the Final Output #
To generate the downloadable PNG, the mask is applied to the original image:
function applyMaskToOriginal() {
  const origData = origCtx.getImageData(0, 0, w, h);
  const mData = maskCtx.getImageData(0, 0, w, h);
  const outData = oCtx.createImageData(w, h);
  for (let i = 0; i < origData.data.length; i += 4) {
    outData.data[i] = origData.data[i];         // R from original
    outData.data[i + 1] = origData.data[i + 1]; // G from original
    outData.data[i + 2] = origData.data[i + 2]; // B from original
    outData.data[i + 3] = mData.data[i];        // A from mask R channel
  }
  oCtx.putImageData(outData, 0, 0);
  return outCanvas;
}
The mask's R channel (which equals G and B since it's grayscale) becomes the alpha channel of the output. White mask pixels → fully opaque. Black → fully transparent. Gray → semi-transparent (useful for hair and soft edges).
The Refine Mode Overlay #
In refine mode, users see the original image with a semi-transparent red overlay on removed areas:
function renderMaskOverlay() {
  editCtx.drawImage(maskCanvas, 0, 0, dw, dh);
  const overlayData = editCtx.getImageData(0, 0, dw, dh);
  for (let i = 0; i < overlayData.data.length; i += 4) {
    const maskVal = overlayData.data[i];
    if (maskVal < 128) {
      // Removed area → semi-transparent red
      overlayData.data[i] = 220;     // R
      overlayData.data[i + 1] = 50;  // G
      overlayData.data[i + 2] = 50;  // B
      overlayData.data[i + 3] = 120; // A
    } else {
      // Kept area → fully transparent (show original underneath)
      overlayData.data[i + 3] = 0;
    }
  }
  editCtx.putImageData(overlayData, 0, 0);
}
This gives immediate visual feedback — you can see exactly what the AI removed and paint corrections in real time.
Performance Considerations #
- Memory: Three full-resolution canvases live in memory (original, mask, output). For a 4000×3000 photo, that's ~144MB of pixel data. Mobile devices with <4GB RAM may struggle.
- Real-time rendering: Every brush stroke triggers renderPreview() via requestAnimationFrame. This redraws the preview canvas + overlay from the mask. On large images, there's a noticeable lag.
- Touch support: Full touch event handling with passive: false to prevent scroll interference.
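The memory figure is plain arithmetic: width × height × 4 bytes (RGBA) per canvas. The sketch below reproduces it and adds one possible mitigation, a downscale factor that fits a byte budget. The budget idea and both function names are hypothetical, not something the tool does, and browsers may allocate more than this rough lower bound.

```javascript
// Sketch: estimate raw pixel memory for N same-sized RGBA canvases.
// Function names are hypothetical; treat the result as a rough lower bound.
function estimateCanvasBytes(width, height, canvasCount = 3) {
  const BYTES_PER_PIXEL = 4; // RGBA, 8 bits per channel
  return width * height * BYTES_PER_PIXEL * canvasCount;
}

// Hypothetical mitigation: the uniform scale factor (0 < s <= 1) that would
// bring the canvases under a byte budget. Area shrinks with scale squared.
function scaleForBudget(width, height, budgetBytes, canvasCount = 3) {
  const needed = estimateCanvasBytes(width, height, canvasCount);
  if (needed <= budgetBytes) return 1; // already fits, no downscale
  return Math.sqrt(budgetBytes / needed);
}
```

For the 4000×3000 example, estimateCanvasBytes returns 144,000,000 bytes, and a 36MB budget would call for halving each dimension.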
What I Stripped for the Open-Source Version #
The production version on ToolKnit includes:
- Daily usage limits (fair-use throttling)
- Analytics tracking
- Self-hosted model weights (faster from our CDN)
- Sound effects on completion
- Site navigation and SEO shell
The open-source version strips all of that down to two files:
- index.html: standalone UI (~250 lines)
- app.js: core logic (~380 lines)
You can clone it, run npx serve . in the cloned folder, and have a working background remover in 30 seconds.
What's Next #
Some ideas for anyone who wants to fork and extend:
- Background replacement: solid color or custom image behind the subject
- Batch processing: drop multiple images, process all sequentially
- WebGPU acceleration: ONNX Runtime Web supports WebGPU; inference could be 3-5x faster
- Edge feathering controls: post-process the mask with adjustable blur radius
- Before/after slider: drag to compare original and result
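As a starting point for the first idea, here is a minimal sketch of flattening the transparent result onto a solid color with straight (non-premultiplied) alpha blending. flattenOntoColor is a hypothetical helper, not part of the repo; it works on the same RGBA byte layout that getImageData returns.

```javascript
// Sketch: composite RGBA pixels over a solid background color using
// straight alpha: out = fg * a + bg * (1 - a). Hypothetical helper.
function flattenOntoColor(rgba, [bgR, bgG, bgB]) {
  const out = new Uint8ClampedArray(rgba.length); // rounds and clamps on write
  for (let i = 0; i < rgba.length; i += 4) {
    const a = rgba[i + 3] / 255;
    out[i] = rgba[i] * a + bgR * (1 - a);
    out[i + 1] = rgba[i + 1] * a + bgG * (1 - a);
    out[i + 2] = rgba[i + 2] * a + bgB * (1 - a);
    out[i + 3] = 255; // result is fully opaque
  }
  return out;
}

// Browser usage (assumes oCtx, w, h from the compositing step):
// const img = oCtx.getImageData(0, 0, w, h);
// img.data.set(flattenOntoColor(img.data, [255, 255, 255]));
// oCtx.putImageData(img, 0, 0);
```

Semi-transparent pixels such as hair edges blend smoothly into the new color instead of keeping a hard outline.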
Try It #
- Live tool: toolknit.com/tools/background-remover.html
- Open source: github.com/2645149786-dotcom/toolknit
- All 61 tools: toolknit.com
If you've ever needed to remove a background without uploading your photo to a random website, this is it. Clone it, use it, break it, improve it.
Built by Zihang Dong. Building browser-first tools at ToolKnit.