{"slug": "i-built-an-ai-video-clip-finder-that-runs-100-in-your-browser-no-uploads-no-api", "title": "I built an AI video clip finder that runs 100% in your browser — no uploads, no API, no GPU costs", "summary": "A developer built ClipGG's AI Video Highlights tool, which runs entirely in the browser using Web Audio API and FFmpeg.wasm to find and extract video highlights without file uploads, server costs, or GPU requirements. The tool processes audio locally by decoding and downsampling, then scores segments based on RMS, zero-crossing rate, and peak volume with content-type-specific weights. It handles iOS limitations by pre-extracting audio to WAV format and supports fast export via stream copying for H.264/AAC MP4 files.", "body_md": "Every time I used Opus Clip or Vidyo.ai, the same thought hit me:\n\nI’m paying $20/month to upload my video to someone else’s server,\n\nwait in a queue, and hope their AI finds something useful.\n\nSo I built an alternative that runs entirely in the browser.\n\nNo file uploads. No subscriptions. No server costs on my end.\n\nThe result is ClipGG’s AI Video Highlights tool —\n\nand in this post I’ll walk through exactly how it works technically.\n\nFinding highlights in a long video is genuinely hard to automate well.\n\nThe expensive approach: transcribe with Whisper, feed text to GPT-4,\n\nprofit. 
But that requires a backend, API costs, and user uploads.\n\nI wanted zero server involvement.\n\nThat meant doing everything with browser APIs.\n\nThe pipeline has four stages:\n\n``` js\nconst arrayBuffer = await file.arrayBuffer()\n// The file never leaves the device.\n// ArrayBuffer is passed directly to Web Audio API.\n```\n\nI use `OfflineAudioContext`\n\nto decode audio faster than real-time,\n\nthen downsample to 8000–11025 Hz before analysis.\n\nThis reduces RAM usage from ~115MB to ~19MB for a 10-minute video.\n\n``` js\n// Decode in a Web Worker so the UI never freezes\nconst tempCtx = new OfflineAudioContext(1, 44100, 44100)\nconst audioBuffer = await tempCtx.decodeAudioData(arrayBuffer)\n\n// Downsample manually — OfflineAudioContext does NOT resample automatically\nfunction downsample(channelData, originalRate, targetRate) {\n  const ratio = originalRate / targetRate\n  const output = new Float32Array(Math.floor(channelData.length / ratio))\n  for (let i = 0; i < output.length; i++) {\n    const start = Math.floor(i * ratio)\n    const end = Math.min(Math.floor((i + 1) * ratio), channelData.length)\n    let sum = 0\n    for (let j = start; j < end; j++) sum += channelData[j]\n    output[i] = sum / (end - start)\n  }\n  return output\n}\n```\n\nFor each 500ms window I compute:\n\nThen I do relative normalization so a quiet podcast\n\nand a loud gaming stream are scored fairly against themselves:\n\n``` js\n// Relative normalization — key insight\nconst normalizedRms = (seg.rms - globalMinRms) / (globalMaxRms - globalMinRms)\n```\n\nDifferent content types use different weights:\n\n| Mode | RMS | ZCR | Peak |\n|---|---|---|---|\n| Gaming | 0.20 | 0.35 | 0.20 |\n| Podcast | 0.50 | 0.05 | 0.20 |\n| Funny | 0.15 | 0.20 | 0.35 |\n| General | 0.30 | 0.20 | 0.25 |\n\nThe selector groups high-scoring segments into zones,\n\nfinds the peak moment in each zone, and centers a 30–90 second\n\nclip around it. 
A diversity radius of 12 seconds prevents\n\nthree clips from covering the same moment.\n\n``` js\nconst combinedSignal =\n  (seg.score ?? 0) +\n  (seg.energyChange ?? 0) * 2.0 +\n  (seg.volumePeak ?? 0) * 1.5\n\n// Center the clip around the strongest combined signal,\n// not just the loudest sustained section\n```\n\nSafari on iOS can’t decode video containers\n\nvia `AudioContext.decodeAudioData()`.\n\nIt only accepts clean audio files.\n\nThe fix: detect iOS and pre-extract audio with FFmpeg.wasm\n\nbefore passing it to the Web Audio API:\n\n``` js\nconst isIOS =\n  /iPhone|iPad|iPod/i.test(navigator.userAgent) ||\n  // iPadOS 13+ reports a desktop Mac user agent by default\n  (navigator.platform === 'MacIntel' && navigator.maxTouchPoints > 1)\n\nif (isIOS) {\n  await ffmpeg.exec([\n    '-i', 'input_video',\n    '-vn',\n    '-acodec', 'pcm_s16le',  // WAV — guaranteed to work on all iOS versions\n    '-ar', '44100',\n    '-ac', '1',\n    'audio.wav'\n  ])\n  // Pass audio.wav to Web Audio instead of the original video\n}\n```\n\nWAV/PCM is uncompressed and works reliably on every iOS version.\n\nAAC containers do not.\n\nOnce highlights are found, FFmpeg.wasm cuts the clips:\n\n``` js\n// Fast path: H.264 + AAC + MP4 = stream copy, no re-encoding\n// A 90-second clip exports in ~2–3 seconds\nawait ffmpeg.exec([\n  '-ss', String(clip.start),\n  '-i', 'input',\n  '-t', String(clip.end - clip.start),\n  '-c', 'copy',               // copy bytes, don't re-encode\n  '-avoid_negative_ts', 'make_zero',\n  '-movflags', '+faststart',\n  outputName\n])\n```\n\nNon-standard formats (MOV, MKV, AV1) get converted to MP4\n\nbefore the analysis pipeline runs. This also fixed all the\n\n“file won’t export” bugs from iPhone footage.\n\n**OfflineAudioContext doesn’t resample.**\n\nI assumed `new OfflineAudioContext(1, length, 8000)`\n\nwould give me 8 kHz audio. 
It doesn’t.\n\nYou get whatever sample rate the source file has.\n\nDownsampling has to be manual.\n\n**Transfer, don’t copy ArrayBuffers.**\n\n`worker.postMessage({ arrayBuffer }, [arrayBuffer])`\n\ntransfers ownership with zero memory copy.\n\nWithout the second argument you’re doubling RAM usage.\n\n**`-ss` before `-i` for stream copy, after for re-encode.**\n\nThis one cost me an hour. For `-c copy`, seek before input\n\nfor speed. For re-encoding, seek after input for frame accuracy.\n\nThe tool is live and free at:\n\n👉 [https://clipgg.uk/en/ai-video-highlights](https://clipgg.uk/en/ai-video-highlights)\n\nDrop a video, pick a mode (Gaming / Podcast / Funny / General),\n\nand get three highlight clips with timestamps in about 30 seconds.\n\nNo account. No upload. Works on desktop Chrome, Firefox,\n\nand now iOS Safari too.\n\nCurious what others think about the audio scoring approach —\n\nwould love feedback on the algorithm in the comments.", "url": "https://wpnews.pro/news/i-built-an-ai-video-clip-finder-that-runs-100-in-your-browser-no-uploads-no-api", "canonical_source": "https://dev.to/__a570829a/i-built-an-ai-video-clip-finder-that-runs-100-in-your-browser-no-uploads-no-api-no-gpu-costs-101o", "published_at": "2026-06-16 06:00:00+00:00", "updated_at": "2026-06-16 06:17:06.563168+00:00", "lang": "en", "topics": ["artificial-intelligence", "developer-tools", "ai-tools"], "entities": ["ClipGG", "Opus Clip", "Vidyo.ai", "FFmpeg.wasm", "Web Audio API", "Whisper", "GPT-4", "Safari"], "alternates": {"html": "https://wpnews.pro/news/i-built-an-ai-video-clip-finder-that-runs-100-in-your-browser-no-uploads-no-api", "markdown": "https://wpnews.pro/news/i-built-an-ai-video-clip-finder-that-runs-100-in-your-browser-no-uploads-no-api.md", "text": "https://wpnews.pro/news/i-built-an-ai-video-clip-finder-that-runs-100-in-your-browser-no-uploads-no-api.txt", "jsonld": 
"https://wpnews.pro/news/i-built-an-ai-video-clip-finder-that-runs-100-in-your-browser-no-uploads-no-api.jsonld"}}