Show HN: VibeClip – open-source AI video editor you control by chatting VibeClip, an open-source AI video editor, now allows users to transform long-form videos like podcasts and interviews into vertical shorts by chatting with the tool. The software transcribes video on-device, scores for the strongest moments, and lets users refine clips with natural language commands such as "make clip 2 punchier" or "add a zoom at 0:05." Designed for self-hosting with a single Docker command, VibeClip keeps speech-to-text and rendering local while requiring only a user-provided LLM key for its editing "brain. Drop in a long video — podcast, interview, talk, stream — and VibeClip cuts it into vertical, captioned, ready-to-post shorts. Then you refine every clip by chatting : “make clip 2 punchier,” “bigger captions,” “add a zoom at 0:05,” “undo.” Quick start -quick-start · Features -what-it-does · How it works -how-it-works · Bring your own key -bring-your-own-key-byok · Configuration -configuration · Contributing -contributing Left: the raw clip. Right: after one sentence — “make it mrbeast style and add gameplay underneath” — captioned, reframed to 9:16, and split-screened. Real pipeline output, not a mockup. Footage: Andy Dickinson CC-BY · gameplay: Orbital - No Copyright Gameplay CC-BY · Minecraft © Mojang. Spin up a private instance in three commands. All you add is one LLM key. git clone https://github.com/oktaydbk54/vibeclip.git cd vibeclip cp .env.example .env add ONE line: OPENAI API KEY=sk-... docker compose up -d --build → open http://localhost:8765 With the defaults EMAIL MODE=console , REQUIRE EMAIL VERIFICATION=false sign-up logs you straight in — no email provider needed. Bring an OpenAI or DeepSeek key DeepSeek is the cheap one , or point LLM BASE URL at any OpenAI-compatible server Ollama, LM Studio, OpenRouter… . Prefer no Docker? See local install run-without-docker . 🎬 Long → shorts, automatically | Transcribes on-device, scores the strongest moments hook / flow / value — not a dumb keyword scan , reframes to 9:16 around the speaker, and burns word-synced captions. | 💬 Edit by chatting | A tool-calling agent turns plain language into real edits — trims, filler-word removal “uhh”/“ee” , zooms, styles, music, b-roll, brand overlays. One undo reverts a whole multi-step plan. | 🎨 Styles in one shot | hormozi , mrbeast , podcast minimal , kinetic — captions, pace, zoom, music and SFX applied together. Drop in your own preset as a JSON file. | 🖥️ A real studio UI | Web app with a live 9:16 preview, clip cards, a CapCut-style timeline, and the chat copilot right beside it. | 🔑 Your key, your data | Bring your own LLM key OpenAI · Gemini · Claude · DeepSeek · any compatible endpoint . Nothing is proxied through us — there is no “us.” | 🏠 Self-host first | One Docker command. Speech-to-text and every render run locally via faster-whisper + ffmpeg. AGPL-3.0, no SaaS lock-in. | upload │ ┌──────▼───────┐ faster-whisper local, no API key │ transcribe │ └──────┬───────┘ ┌──────▼────────────┐ LLM "brain" your key — structure + scored moments │ analyze structure │ │ find highlights │ └──────┬────────────┘ ┌──────▼───────┐ per clip, replayed from cached intermediates ~2–4s/edit │ auto edit │ jumpcut → 9:16 reframe → captions → music+ambience ducked │ │ → SFX → fades · then your chat commands layer on top └──────┬───────┘ export → vertical MP4, publish-ready Only two things ever hit the network: your chosen LLM to understand intent and score moments and, optionally, Pexels stock b-roll . Speech-to-text and all rendering stay on your machine. VibeClip never ships with a key and never proxies your prompts anywhere except the provider you choose. Two ways to supply one: Per instance — set OPENAI API KEY or DEEPSEEK API KEY , or any OpenAI-compatible endpoint via LLM BASE URL in .env . Per user — each account pastes its own key on the in-app Settings page, with a live test-connection . Keys are encrypted at rest and never sent back to the browser. | Provider | Routed via | Notes | |---|---|---| OpenAI | native | Default, best-supported. | DeepSeek | native | The budget pick — a typical short costs a few cents. | Google Gemini | OpenAI-compat endpoint | gemini-2.5-flash / pro . | Anthropic Claude | OpenAI-compat endpoint | claude-haiku / sonnet . | Anything else | LLM BASE URL | Ollama, LM Studio, OpenRouter, your own proxy… | Speech-to-text runs locally and needs no key. Everything is driven by .env see .env.example for the full, commented list . The ones that matter most: | Variable | Default | Purpose | |---|---|---| OPENAI API KEY | — | Your LLM key preferred . | DEEPSEEK API KEY | — | Cheaper fallback, used if no OpenAI key. | LLM BASE URL | — | Any OpenAI-compatible endpoint local models, proxies . | EMAIL MODE | console | console prints OTP to the log; resend sends real email. | REQUIRE EMAIL VERIFICATION | false | true enforces email confirmation public instances . | HOSTED STUDIO | true | true = the landing offers login/signup use your own instance . false = a public marketing site that points everyone to GitHub to self-host no login . | GA MEASUREMENT ID | — | Empty = no analytics injected self-host default . | SITE URL | http://localhost:8765 | Public base URL for blog canonical/OG/sitemap. | VIDEO ENCODER | libx264 | Use h264 videotoolbox on Apple Silicon. | VIBECLIP BIND | 127.0.0.1 | docker-compose publish address 0.0.0.0 to expose . | MAX UPLOAD SECONDS | 0 | Longest uploadable video, seconds. 0 = no limit self-host . | MAX PROJECTS PER USER | 0 | Projects per account. 0 = unlimited; cap it on a public instance. | Requirements: Python 3.12+ , ffmpeg , and the DejaVu fonts for caption rendering . cp .env.example .env add your LLM key uv sync or: pip install -e . python -m chat.app → http://127.0.0.1:8765 First run downloads the Whisper model. Prefer the terminal? python -m chat.cli