Build an AI Video Editing Agent with Claude and FFmpeg Micro MCP

A developer built an AI video editing agent using Claude and the FFmpeg Micro MCP server that translates natural language instructions into FFmpeg commands. The agent can perform operations like transcoding, trimming, and concatenating videos without manual terminal intervention. The system uses six MCP tools, with transcode_and_wait being the primary tool for agent workflows.

Originally published at ffmpeg-micro.com

Most developers using AI for video editing are still doing it manually. They prompt Claude or ChatGPT to generate an FFmpeg command, copy it, paste it into a terminal, debug the errors, and repeat. That works for one-off jobs. But if you're building a product or workflow where users describe video edits in plain English and get results back automatically, you need something different.

You need an agent. An AI agent that understands video operations, calls the right API, and returns the finished video without anyone touching a terminal. This tutorial builds exactly that using Claude and the FFmpeg Micro MCP server.

By the end, you'll have a working agent that takes natural language instructions like "trim this video to the first 30 seconds and convert it to 720p" and executes the entire job.

The flow looks like this:

No local FFmpeg installation. No shell commands. No manual intervention after the initial prompt.

You need three things:

Add this to your MCP configuration file:

    {
      "mcpServers": {
        "ffmpeg-micro": {
          "type": "http",
          "url": "https://mcp.ffmpeg-micro.com"
        }
      }
    }

The first time you use it, your browser opens for OAuth sign-in. After that, the token is cached.
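If you would rather script the config change than edit the JSON by hand, the merge can be sketched in Python. This is a minimal sketch, not part of FFmpeg Micro's tooling: `add_mcp_server` is a hypothetical helper name, and it assumes the target file, if it exists, is plain JSON with an optional top-level `mcpServers` object like the snippet above. Only the `ffmpeg-micro` entry itself comes from this article.

```python
import json
from pathlib import Path

# Server entry copied from the HTTP configuration shown above.
FFMPEG_MICRO_HTTP = {"type": "http", "url": "https://mcp.ffmpeg-micro.com"}


def add_mcp_server(config_path, name, entry):
    """Merge one server entry into an MCP config file, keeping existing servers.

    Hypothetical helper: assumes the file (if present) holds a JSON object
    with an optional top-level "mcpServers" mapping.
    """
    path = Path(config_path).expanduser()
    # Start from the existing config so other servers are preserved.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = entry  # replaces only the entry with the same name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_mcp_server(".mcp.json", "ffmpeg-micro", FFMPEG_MICRO_HTTP)` from a project directory would add the entry and leave any other configured servers untouched. Back up the file first if it already contains settings you care about.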
Config file locations:

Claude Desktop: `~/Library/Application Support/Claude/claude_desktop_config.json` (Mac)

Claude Code: `.mcp.json` or `~/.claude.json`

If your tool doesn't support HTTP MCP yet, use the stdio transport instead:

    {
      "mcpServers": {
        "ffmpeg-micro": {
          "command": "npx",
          "args": ["-y", "@ffmpeg-micro/mcp-server"],
          "env": {
            "FFMPEG_MICRO_API_KEY": "your_api_key_here"
          }
        }
      }
    }

Once connected, Claude can call six MCP tools:

| Tool | What It Does |
|---|---|
| `transcode_video` | Creates a transcode job (returns immediately) |
| `transcode_and_wait` | Creates a job and polls until it finishes |
| `get_transcode` | Checks the status of a specific job |
| `list_transcodes` | Lists recent jobs with optional filters |
| `get_download_url` | Generates a signed download link |
| `cancel_transcode` | Cancels a queued or in-progress job |

For most agent workflows, `transcode_and_wait` is the one you want. It creates the job and blocks until the video is ready, so the agent can return the download link in a single turn.

Open Claude (or whichever tool you connected) and try this:

    Take this video and convert it to 720p MP4 with medium quality:
    https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Claude reads your request, picks the right tool (`transcode_and_wait`), constructs the API call with the correct parameters, and waits for the result. You get back a download URL for the processed video.

Behind the scenes, the MCP call looks like this:

    {
      "inputs": [
        { "url": "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4" }
      ],
      "outputFormat": "mp4",
      "preset": {
        "quality": "medium",
        "resolution": "720p"
      }
    }

You didn't write that JSON. Claude figured it out from your plain English instruction.

The agent handles more than basic transcoding. Try these:

Extract audio from a video:

    Extract the audio from this video as an MP3:
    https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Compress for web delivery:

    Compress this video for web. Target quality CRF 28, use H.265 encoding:
    https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Claude translates that into the right FFmpeg options:

    {
      "inputs": [
        { "url": "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4" }
      ],
      "outputFormat": "mp4",
      "options": [
        { "option": "-c:v", "argument": "libx265" },
        { "option": "-crf", "argument": "28" },
        { "option": "-preset", "argument": "medium" }
      ]
    }

Trim a section:

    Trim this video from 5 seconds to 15 seconds:
    https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Stitch multiple videos together:

    Concatenate these two videos into one MP4:
    1. https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4
    2. https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Every operation goes through the same flow: natural language in, processed video out.

If you're building this into a product, give Claude a system prompt that constrains the agent to video operations:

    You are a video editing assistant. You have access to the FFmpeg Micro MCP server for video processing.

    When a user describes a video edit:
    1. Identify the operation (transcode, trim, compress, extract audio, etc.)
    2. Use the transcode_and_wait tool with the correct parameters
    3. Return the download URL to the user
    4. If the job fails, explain what went wrong

    Always use preset mode for simple operations (quality + resolution).
    Use options mode for specific codecs, filters, or advanced FFmpeg flags.
    Never ask the user for technical parameters; infer them from the request.

This turns Claude from a general-purpose assistant into a focused video editing agent that handles the technical details automatically.

Using a non-public video URL. The FFmpeg Micro API needs to fetch your video. If the URL requires authentication or is behind a firewall, the job will fail. For private files, use the upload flow first (presigned URL, PUT, confirm) to get a cloud storage URL.

Forgetting to check job status.
If you use `transcode_video` instead of `transcode_and_wait`, the job runs asynchronously. You'll need to call `get_transcode` to check when it's done. For agent workflows, `transcode_and_wait` avoids this entirely.

Asking for unsupported output formats. The API supports mp4, webm, avi, mov, mkv, gif, mp3, wav, and more. But if you ask Claude to output a format that FFmpeg doesn't support for your input, the job will fail with a clear error message.

The traditional way to automate video editing with AI is prompting for FFmpeg commands, then running them locally. That works, but it has problems:

With the MCP agent approach, Claude handles the complexity and FFmpeg Micro handles the infrastructure. Your code (or your users) just describes what they want in plain English.

FFmpeg Micro processes over 50,000 API calls per month. A 1-minute video transcode takes about 3 seconds. Pricing starts with a free tier, so you can prototype without spending anything.

MCP is currently supported by Claude Desktop, Claude Code, Cursor, Windsurf, and other MCP-compatible tools. For OpenAI or other providers, you'd call the FFmpeg Micro REST API directly instead of going through MCP. The API itself works with any HTTP client.

No. That's the point. The agent translates natural language to the right API calls. You say "compress this video" and Claude figures out the codec, CRF, and preset. If you do know FFmpeg, you can be more specific ("use libx265 with CRF 24"), and the agent will use those exact settings.

FFmpeg Micro supports files up to 2GB. For larger files, you'll need to split them first. The free tier has usage limits. Check [ffmpeg-micro.com/pricing](https://www.ffmpeg-micro.com/pricing) for details on each plan.

Each API call handles one operation. But the agent can chain them automatically. Ask Claude "trim this video to 30 seconds, then convert to 720p WebM" and it will run two sequential transcode jobs, using the output of the first as the input for the second.

Yes. The FFmpeg Micro MCP server is published as `@ffmpeg-micro/mcp-server` on npm. You can inspect the source, fork it, or contribute.
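
To make the chaining described earlier concrete, here is a rough Python sketch of the two sequential jobs behind "trim this video to 30 seconds, then convert to 720p WebM". Treat it as an illustration under stated assumptions rather than FFmpeg Micro's actual client code. `run_job` is a hypothetical stand-in for however your agent invokes `transcode_and_wait`, and it is assumed to return the output URL as a string. The argument shapes follow my reading of the JSON examples in this article (including `inputs` as a list). `-t` is FFmpeg's standard duration flag, but whether the service accepts it in options mode is an assumption, as is the `medium` quality default for the second job.

```python
from typing import Callable, Dict, List, Tuple

JobArgs = Dict[str, object]


def preset_args(url: str, output_format: str, quality: str, resolution: str) -> JobArgs:
    """Preset-mode arguments, mirroring the 720p example in the article."""
    return {
        "inputs": [{"url": url}],
        "outputFormat": output_format,
        "preset": {"quality": quality, "resolution": resolution},
    }


def options_args(url: str, output_format: str, flags: List[Tuple[str, str]]) -> JobArgs:
    """Options-mode arguments: one option/argument pair per FFmpeg flag."""
    return {
        "inputs": [{"url": url}],
        "outputFormat": output_format,
        "options": [{"option": opt, "argument": arg} for opt, arg in flags],
    }


def trim_then_convert(url: str, run_job: Callable[[JobArgs], str]) -> str:
    """Run two jobs in sequence, feeding the first output URL into the second.

    run_job is a hypothetical stand-in for a transcode_and_wait call; it is
    assumed to block until the job finishes and return the output URL.
    """
    # Job 1: keep the first 30 seconds (assumes -t is accepted in options mode).
    trimmed_url = run_job(options_args(url, "mp4", [("-t", "30")]))
    # Job 2: convert the trimmed clip to 720p WebM using preset mode.
    return run_job(preset_args(trimmed_url, "webm", "medium", "720p"))
```

Passing the runner in as a function keeps the sketch independent of any particular MCP client, and it means the chaining logic can be exercised with a fake runner before any real job is submitted.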