Originally published at ffmpeg-micro.com
Most developers using AI for video editing are still doing it manually. They prompt Claude or ChatGPT to generate an FFmpeg command, copy it, paste it into a terminal, debug the errors, and repeat. That works for one-off jobs. But if you're building a product or workflow where users describe video edits in plain English and get results back automatically, you need something different.
You need an agent. An AI agent that understands video operations, calls the right API, and returns the finished video without anyone touching a terminal.
This tutorial builds exactly that using Claude and the FFmpeg Micro MCP server. By the end, you'll have a working agent that takes natural language instructions like "trim this video to the first 30 seconds and convert it to 720p" and executes the entire job.
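The flow looks like this: you describe the edit in plain English, Claude picks an MCP tool and builds the request, FFmpeg Micro processes the video, and Claude returns a download URL.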
No local FFmpeg installation. No shell commands. No manual intervention after the initial prompt.
You need three things:
Add this to your MCP configuration file:
{
  "mcpServers": {
    "ffmpeg-micro": {
      "type": "http",
      "url": "https://mcp.ffmpeg-micro.com"
    }
  }
}
The first time you use it, your browser opens for OAuth sign-in. After that, the token is cached.
Config file locations:
Claude Desktop: `~/Library/Application Support/Claude/claude_desktop_config.json` (Mac)
Claude Code: `.mcp.json` or `~/.claude.json`
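If you'd rather script the setup than edit the file by hand, the merge can be sketched roughly as below. This is a minimal sketch, not part of FFmpeg Micro's tooling: `add_mcp_server` is a hypothetical helper, and it assumes the config file is plain JSON with a top-level `mcpServers` object, as in the snippet above.

```python
import json
from pathlib import Path

# Server entry copied from the HTTP configuration shown above.
FFMPEG_MICRO_HTTP = {"type": "http", "url": "https://mcp.ffmpeg-micro.com"}


def add_mcp_server(config_path, name, entry):
    """Merge one server entry into an MCP config file, keeping existing entries.

    Hypothetical helper: assumes the file, if it exists, holds a JSON object.
    """
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    # Create the "mcpServers" object if missing, then add or overwrite the entry.
    config.setdefault("mcpServers", {})[name] = entry
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `add_mcp_server(".mcp.json", "ffmpeg-micro", FFMPEG_MICRO_HTTP)` leaves any servers you already configured in place. Back up the file first if it holds settings you care about.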
If your tool doesn't support HTTP MCP yet, use the stdio transport instead:
{
  "mcpServers": {
    "ffmpeg-micro": {
      "command": "npx",
      "args": ["-y", "@ffmpeg-micro/mcp-server"],
      "env": {
        "FFMPEG_MICRO_API_KEY": "your_api_key_here"
      }
    }
  }
}
Once connected, Claude can call six MCP tools:
| Tool | What It Does |
|---|---|
| `transcode_video` | Creates a transcode job (returns immediately) |
| `transcode_and_wait` | Creates a job and polls until it finishes |
| `get_transcode` | Checks the status of a specific job |
| `list_transcodes` | Lists recent jobs with optional filters |
| `get_download_url` | Generates a signed download link |
| `cancel_transcode` | Cancels a queued or in-progress job |
For most agent workflows, `transcode_and_wait` is the one you want. It creates the job and blocks until the video is ready, so the agent can return the download link in a single turn.
Open Claude (or whichever tool you connected) and try this:
Take this video and convert it to 720p MP4 with medium quality:
https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4
Claude reads your request, picks the right tool (`transcode_and_wait`), constructs the API call with the correct parameters, and waits for the result. You get back a download URL for the processed video.
Behind the scenes, the MCP call looks like this:
{
  "inputs": [
    { "url": "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4" }
  ],
  "outputFormat": "mp4",
  "preset": {
    "quality": "medium",
    "resolution": "720p"
  }
}
You didn't write that JSON. Claude figured it out from your plain English instruction.
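If you want to check what the agent sends, or build the same arguments yourself in a test, the preset-mode shape can be sketched like this. `preset_request` is a hypothetical helper; the field names come from the JSON above, and which other `quality` and `resolution` values the API accepts is not confirmed here.

```python
import json


def preset_request(video_url, output_format, quality, resolution):
    """Build preset-mode arguments shaped like the example JSON above."""
    return {
        "inputs": [{"url": video_url}],
        "outputFormat": output_format,
        "preset": {"quality": quality, "resolution": resolution},
    }


payload = preset_request(
    "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4",
    "mp4",
    "medium",
    "720p",
)
print(json.dumps(payload, indent=2))
```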
The agent handles more than basic transcoding. Try these:
Extract audio from a video:
Extract the audio from this video as an MP3:
https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4
Compress for web delivery:
Compress this video for web. Target quality CRF 28, use H.265 encoding:
https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4
Claude translates that into the right FFmpeg options:
{
  "inputs": [
    { "url": "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4" }
  ],
  "outputFormat": "mp4",
  "options": [
    { "option": "-c:v", "argument": "libx265" },
    { "option": "-crf", "argument": "28" },
    { "option": "-preset", "argument": "medium" }
  ]
}
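Options mode is a list of flag and argument pairs, so it maps naturally from a dictionary of FFmpeg flags. The sketch below reproduces the JSON above; `options_request` is a hypothetical helper, and it assumes every argument is sent as a string, as in the example.

```python
SAMPLE_URL = "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4"


def options_request(video_url, output_format, flags):
    """Build options-mode arguments from a mapping of FFmpeg flag -> value.

    Flags are emitted in the mapping's insertion order; values are
    converted to strings to match the example payload.
    """
    return {
        "inputs": [{"url": video_url}],
        "outputFormat": output_format,
        "options": [
            {"option": flag, "argument": str(value)}
            for flag, value in flags.items()
        ],
    }


payload = options_request(
    SAMPLE_URL,
    "mp4",
    {"-c:v": "libx265", "-crf": 28, "-preset": "medium"},
)
```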
Trim a section:
Trim this video from 5 seconds to 15 seconds:
https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4
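The exact call Claude generates for a trim isn't shown here. A plausible sketch, assuming options mode passes FFmpeg's standard `-ss` (start) and `-to` (end) flags through like any other flag, looks like this. `trim_options` is a hypothetical helper.

```python
def trim_options(start_seconds, end_seconds):
    """Return options-mode entries that keep the span [start, end) in seconds.

    Assumption: the API forwards -ss and -to to FFmpeg unchanged.
    """
    if not 0 <= start_seconds < end_seconds:
        raise ValueError("start must be non-negative and before end")
    return [
        {"option": "-ss", "argument": str(start_seconds)},
        {"option": "-to", "argument": str(end_seconds)},
    ]


options = trim_options(5, 15)
```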
Stitch multiple videos together:
Concatenate these two videos into one MP4:
1. https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4
2. https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4
Every operation goes through the same flow: natural language in, processed video out.
If you're building this into a product, give Claude a system prompt that constrains the agent to video operations:
You are a video editing assistant. You have access to the FFmpeg Micro
MCP server for video processing. When a user describes a video edit:
1. Identify the operation (transcode, trim, compress, extract audio, etc.)
2. Use the transcode_and_wait tool with the correct parameters
3. Return the download URL to the user
4. If the job fails, explain what went wrong
Always use preset mode for simple operations (quality + resolution).
Use options mode for specific codecs, filters, or advanced FFmpeg flags.
Never ask the user for technical parameters — infer them from the request.
This turns Claude from a general-purpose assistant into a focused video editing agent that handles the technical details automatically.
Using a non-public video URL. The FFmpeg Micro API needs to fetch your video. If the URL requires authentication or is behind a firewall, the job will fail. For private files, use the upload flow first (presigned URL, PUT, confirm) to get a cloud storage URL.
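A quick local check can catch the most obvious cases before you submit a job. The sketch below is a heuristic, not something the API provides: `looks_publicly_fetchable` is a hypothetical helper that rejects non-HTTP schemes, embedded credentials, `localhost`, and private IP literals. It cannot tell whether a normal hostname sits behind a login or firewall; only a real fetch can.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_publicly_fetchable(url):
    """Return False for URLs a remote service clearly cannot fetch."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    if parts.username or parts.password:
        return False  # embedded credentials suggest the URL needs auth
    host = parts.hostname
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; reachability is unknown without fetching
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)
```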
Forgetting to check job status. If you use `transcode_video` instead of `transcode_and_wait`, the job runs asynchronously. You'll need to call `get_transcode` to check when it's done. For agent workflows, `transcode_and_wait` avoids this entirely.
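If you do go the asynchronous route, the status check amounts to a polling loop. The sketch below shows the general pattern only: `get_status` is a function you supply (for example, a wrapper around `get_transcode`), and the status names `completed`, `failed`, and `cancelled` are assumptions rather than confirmed API values.

```python
import time


def wait_for_job(get_status, job_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll get_status(job_id) until the job reaches a terminal state.

    get_status must return a dict with a "status" key. The terminal
    status names used here are assumed, not taken from the API docs.
    """
    waited = 0.0
    while True:
        job = get_status(job_id)
        if job["status"] in ("completed", "failed", "cancelled"):
            return job
        if waited >= timeout:
            raise TimeoutError(
                f"job {job_id} still {job['status']} after {waited:.0f}s"
            )
        sleep(interval)
        waited += interval
```

The `sleep` parameter exists so you can swap in a no-op during tests instead of actually waiting.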
Asking for unsupported output formats. The API supports mp4, webm, avi, mov, mkv, gif, mp3, wav, and more. But if you ask Claude to output a format that FFmpeg doesn't support for your input, the job will fail with a clear error message.
The traditional way to automate video editing with AI is prompting for FFmpeg commands, then running them locally. That works, but it means installing FFmpeg locally, copying commands into a terminal, and debugging the errors by hand.
With the MCP agent approach, Claude handles the complexity and FFmpeg Micro handles the infrastructure. Your code (or your users) just describes what they want in plain English.
FFmpeg Micro processes over 50,000 API calls per month. A 1-minute video transcode takes about 3 seconds. Pricing starts with a free tier, so you can prototype without spending anything.
MCP is currently supported by Claude Desktop, Claude Code, Cursor, Windsurf, and other MCP-compatible tools. For OpenAI or other providers, you'd call the FFmpeg Micro REST API directly instead of going through MCP. The API itself works with any HTTP client.
You don't need to know FFmpeg to use the agent; that's the point. The agent translates natural language to the right API calls. You say "compress this video" and Claude figures out the codec, CRF, and preset. If you do know FFmpeg, you can be more specific ("use libx265 with CRF 24"), and the agent will use those exact settings.
FFmpeg Micro supports files up to 2GB. For larger files, you'll need to split them first. The free tier has usage limits. Check ffmpeg-micro.com/pricing for details on each plan.
Each API call handles one operation. But the agent can chain them automatically. Ask Claude "trim this video to 30 seconds, then convert to 720p WebM" and it will run two sequential transcode jobs, using the output of the first as the input for the second.
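The chaining the agent does can be pictured as a simple loop in which each job's output URL becomes the next job's input. This is a sketch of that idea, not the agent's actual implementation: `run_job` is a hypothetical function you supply that submits a request, waits for it, and returns the output's download URL. The `-t 30` trim step assumes options mode forwards that FFmpeg flag unchanged.

```python
def run_chain(run_job, source_url, steps):
    """Run steps in order, feeding each job's output URL into the next job."""
    current_url = source_url
    for step in steps:
        request = {"inputs": [{"url": current_url}], **step}
        current_url = run_job(request)
    return current_url


# "Trim this video to 30 seconds, then convert to 720p WebM" as two steps.
STEPS = [
    {"outputFormat": "mp4", "options": [{"option": "-t", "argument": "30"}]},
    {"outputFormat": "webm", "preset": {"quality": "medium", "resolution": "720p"}},
]
```

The URL returned by `run_chain(run_job, video_url, STEPS)` points at the final WebM.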
The FFmpeg Micro MCP server is published as `@ffmpeg-micro/mcp-server` on npm. You can inspect the source, fork it, or contribute.