Build an AI Video Editing Agent with Claude and FFmpeg Micro MCP

Originally published at ffmpeg-micro.com

Most developers using AI for video editing are still doing it manually. They prompt Claude or ChatGPT to generate an FFmpeg command, copy it, paste it into a terminal, debug the errors, and repeat. That works for one-off jobs. But if you're building a product or workflow where users describe video edits in plain English and get results back automatically, you need something different.

You need an agent. An AI agent that understands video operations, calls the right API, and returns the finished video without anyone touching a terminal.

This tutorial builds exactly that using Claude and the FFmpeg Micro MCP server. By the end, you'll have a working agent that takes natural language instructions like "trim this video to the first 30 seconds and convert it to 720p" and executes the entire job.

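The flow looks like this: you describe the edit in plain English, Claude picks the right MCP tool and constructs the call, FFmpeg Micro processes the video, and you get back a download URL.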

No local FFmpeg installation. No shell commands. No manual intervention after the initial prompt.

You need three things:

Add this to your MCP configuration file:

{
  "mcpServers": {
    "ffmpeg-micro": {
      "type": "http",
      "url": "https://mcp.ffmpeg-micro.com"
    }
  }
}

The first time you use it, your browser opens for OAuth sign-in. After that, the token is cached.

Config file locations:

Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (Mac)

Claude Code: .mcp.json (project) or ~/.claude.json
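If the config file already lists other MCP servers, pasting the snippet over it would wipe them out. Here is a minimal sketch of merging the entry instead. The helper name `add_mcp_server` is made up for illustration and is not part of FFmpeg Micro or Claude; only the `type` and `url` values come from the config above.

```python
import json
from pathlib import Path

# Values copied from the HTTP config above; the helper below is illustrative.
FFMPEG_MICRO_HTTP = {"type": "http", "url": "https://mcp.ffmpeg-micro.com"}


def add_mcp_server(config_path, name, entry):
    """Merge one server entry into an MCP config file, keeping existing servers."""
    path = Path(config_path).expanduser()
    config = {}
    if path.exists():
        text = path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Create the "mcpServers" object if it is missing, then add or replace the entry.
    config.setdefault("mcpServers", {})[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_mcp_server(".mcp.json", "ffmpeg-micro", FFMPEG_MICRO_HTTP)` would add the server to a project-level config and leave any other entries untouched.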

If your tool doesn't support HTTP MCP yet, use the stdio transport instead:

{
  "mcpServers": {
    "ffmpeg-micro": {
      "command": "npx",
      "args": ["-y", "@ffmpeg-micro/mcp-server"],
      "env": {
        "FFMPEG_MICRO_API_KEY": "your_api_key_here"
      }
    }
  }
}

Once connected, Claude can call six MCP tools:

| Tool | What It Does |
|---|---|
| `transcode_video` | Creates a transcode job (returns immediately) |
| `transcode_and_wait` | Creates a job and polls until it finishes |
| `get_transcode` | Checks the status of a specific job |
| `list_transcodes` | Lists recent jobs with optional filters |
| `get_download_url` | Generates a signed download link |
| `cancel_transcode` | Cancels a queued or in-progress job |

For most agent workflows, `transcode_and_wait` is the one you want. It creates the job and blocks until the video is ready, so the agent can return the download link in a single turn.

Open Claude (or whichever tool you connected) and try this:

Take this video and convert it to 720p MP4 with medium quality:
https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Claude reads your request, picks the right tool (`transcode_and_wait`), constructs the API call with the correct parameters, and waits for the result. You get back a download URL for the processed video.

Behind the scenes, the MCP call looks like this:

{
  "inputs": [
    { "url": "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4" }
  ],
  "outputFormat": "mp4",
  "preset": {
    "quality": "medium",
    "resolution": "720p"
  }
}

You didn't write that JSON. Claude figured it out from your plain English instruction.
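If you ever need to produce the same arguments from your own code (for logging, tests, or a non-MCP integration), the preset shape is easy to build by hand. This is a sketch that mirrors the JSON above; the function name `build_preset_request` and its defaults are illustrative, not part of any SDK.

```python
import json


def build_preset_request(url, output_format="mp4", quality="medium", resolution="720p"):
    """Build tool arguments in the preset shape shown above."""
    return {
        "inputs": [{"url": url}],
        "outputFormat": output_format,
        "preset": {"quality": quality, "resolution": resolution},
    }


args = build_preset_request("https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4")
print(json.dumps(args, indent=2))
```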

The agent handles more than basic transcoding. Try these:

Extract audio from a video:

Extract the audio from this video as an MP3:
https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Compress for web delivery:

Compress this video for web. Target quality CRF 28, use H.265 encoding:
https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Claude translates that into the right FFmpeg options:

{
  "inputs": [
    { "url": "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4" }
  ],
  "outputFormat": "mp4",
  "options": [
    { "option": "-c:v", "argument": "libx265" },
    { "option": "-crf", "argument": "28" },
    { "option": "-preset", "argument": "medium" }
  ]
}
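The options array is a list of FFmpeg flags paired with their values. A rough sketch of that translation, assuming every flag takes exactly one value (bare switches are not handled here); the helper names are made up for this example.

```python
def options_from_flags(flags):
    """Turn ["-c:v", "libx265", "-crf", "28"] into option/argument pairs."""
    if len(flags) % 2 != 0:
        raise ValueError("expected flag/value pairs")
    options = []
    # Walk the list two items at a time: flag, then its value.
    for flag, value in zip(flags[0::2], flags[1::2]):
        if not str(flag).startswith("-"):
            raise ValueError(f"not a flag: {flag!r}")
        options.append({"option": flag, "argument": str(value)})
    return options


def build_options_request(url, output_format, flags):
    """Build tool arguments in the options shape shown above."""
    return {
        "inputs": [{"url": url}],
        "outputFormat": output_format,
        "options": options_from_flags(flags),
    }
```

Calling `build_options_request(url, "mp4", ["-c:v", "libx265", "-crf", 28, "-preset", "medium"])` reproduces the JSON above for the same URL.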

Trim a section:

Trim this video from 5 seconds to 15 seconds:
https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Stitch multiple videos together:

Concatenate these two videos into one MP4:
1. https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4
2. https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4

Every operation goes through the same flow: natural language in, processed video out.
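The article only shows the payloads for the 720p and H.265 examples. For the other prompts, the sketch below shows what the arguments might look like, reusing the same `inputs`, `outputFormat`, and `options` fields. These shapes are guesses and are not confirmed by the FFmpeg Micro docs. In particular, passing `-ss` and `-to` (standard FFmpeg seek flags) as options, and listing several inputs for concatenation, are assumptions.

```python
SAMPLE = "https://www.ffmpeg-micro.com/samples/quickstart-sample.mp4"


def trim_request(url, start_seconds, end_seconds, output_format="mp4"):
    """Hypothetical trim payload using FFmpeg's -ss (start) and -to (end) flags."""
    if not 0 <= start_seconds < end_seconds:
        raise ValueError("start must be non-negative and before end")
    return {
        "inputs": [{"url": url}],
        "outputFormat": output_format,
        "options": [
            {"option": "-ss", "argument": str(start_seconds)},
            {"option": "-to", "argument": str(end_seconds)},
        ],
    }


def extract_audio_request(url):
    """Hypothetical audio extraction: same input, audio output format."""
    return {"inputs": [{"url": url}], "outputFormat": "mp3"}


def concat_request(urls, output_format="mp4"):
    """Hypothetical concatenation: several inputs, one output."""
    if len(urls) < 2:
        raise ValueError("need at least two inputs to concatenate")
    return {"inputs": [{"url": u} for u in urls], "outputFormat": output_format}
```

In practice Claude decides the exact arguments, so treat these as a way to reason about what the agent sends, not as a reference.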

If you're building this into a product, give Claude a system prompt that constrains the agent to video operations:

You are a video editing assistant. You have access to the FFmpeg Micro
MCP server for video processing. When a user describes a video edit:

1. Identify the operation (transcode, trim, compress, extract audio, etc.)
2. Use the transcode_and_wait tool with the correct parameters
3. Return the download URL to the user
4. If the job fails, explain what went wrong

Always use preset mode for simple operations (quality + resolution).
Use options mode for specific codecs, filters, or advanced FFmpeg flags.
Never ask the user for technical parameters — infer them from the request.

This turns Claude from a general-purpose assistant into a focused video editing agent that handles the technical details automatically.
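The preset-versus-options rule in that prompt is simple enough to express in code, which helps if you want to test or log the agent's choices. A minimal sketch, assuming the user's request has already been parsed into a dict; the `intent` structure and the key names are hypothetical.

```python
# Keys that signal the user asked for something presets cannot express (assumed names).
ADVANCED_KEYS = ("codec", "crf", "filters", "flags")


def choose_mode(intent):
    """Return "options" when the request names advanced settings, else "preset"."""
    if any(key in intent for key in ADVANCED_KEYS):
        return "options"
    return "preset"


print(choose_mode({"resolution": "720p", "quality": "medium"}))
print(choose_mode({"codec": "libx265", "crf": 28}))
```

The first call falls back to preset mode because it names only quality and resolution. The second picks options mode because it names a codec.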

Using a non-public video URL. The FFmpeg Micro API needs to fetch your video. If the URL requires authentication or is behind a firewall, the job will fail. For private files, use the upload flow first (presigned URL, PUT, confirm) to get a cloud storage URL.
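You can catch the most obvious cases of this before submitting a job. The check below is a sketch that works offline: it rejects non-HTTP schemes, embedded credentials, localhost, and private IP addresses. It cannot detect URLs that need a cookie or token, so a URL that passes can still fail. The function name is made up for this example.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_publicly_fetchable(url):
    """Cheap offline check; True means "not obviously private", not "guaranteed"."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    if parts.username or parts.password:
        return False  # credentials in the URL suggest a protected resource
    host = parts.hostname
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; nothing more can be learned without a network call
    return address.is_global
```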

Forgetting to check job status. If you use `transcode_video` instead of `transcode_and_wait`, the job runs asynchronously. You'll need to call `get_transcode` to check when it's done. For agent workflows, `transcode_and_wait` avoids this entirely.
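If you do go the asynchronous route, the status check becomes a polling loop. Here is a sketch of that loop, assuming a `get_status` callable that wraps whatever you use to call `get_transcode`. The status names in `TERMINAL` are assumptions, so check them against the responses you actually see.

```python
import time

# Assumed names for "the job is over" states; adjust to the real API values.
TERMINAL = {"completed", "failed", "cancelled"}


def wait_for_job(get_status, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(job_id) until a terminal status or until the timeout passes."""
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.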

Asking for unsupported output formats. The API supports mp4, webm, avi, mov, mkv, gif, mp3, wav, and more. But if you ask Claude to output a format that FFmpeg doesn't support for your input, the job will fail with a clear error message.

The traditional way to automate video editing with AI is prompting for FFmpeg commands, then running them locally. That works, but it has problems:

With the MCP agent approach, Claude handles the complexity and FFmpeg Micro handles the infrastructure. Your code, or your users, just describe the edit they want in plain English.

FFmpeg Micro processes over 50,000 API calls per month. A 1-minute video transcode takes about 3 seconds. Pricing starts with a free tier, so you can prototype without spending anything.

The MCP protocol is currently supported by Claude Desktop, Claude Code, Cursor, Windsurf, and other MCP-compatible tools. For OpenAI or other providers, you'd call the FFmpeg Micro REST API directly instead of going through MCP. The API itself works with any HTTP client.

No. That's the point. The agent translates natural language to the right API calls. You say "compress this video" and Claude figures out the codec, CRF, and preset. If you do know FFmpeg, you can be more specific ("use libx265 with CRF 24"), and the agent will use those exact settings.

FFmpeg Micro supports files up to 2GB. For larger files, you'll need to split them first. The free tier has usage limits. Check ffmpeg-micro.com/pricing for details on each plan.

Each API call handles one operation. But the agent can chain them automatically. Ask Claude "trim this video to 30 seconds, then convert to 720p WebM" and it will run two sequential transcode jobs, using the output of the first as the input for the second.
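Chaining amounts to feeding each job's output URL into the next job's input. A minimal sketch, assuming a `run_job` callable that submits one request, waits for it, and returns the output URL. That callable is hypothetical and stands in for the agent's `transcode_and_wait` plus `get_download_url` calls.

```python
def run_chain(run_job, source_url, steps):
    """Run steps in order, using each job's output URL as the next job's input."""
    current = source_url
    for step in steps:
        # Each step supplies everything except the input, e.g. outputFormat and preset.
        request = {"inputs": [{"url": current}], **step}
        current = run_job(request)
    return current
```

For "trim to 30 seconds, then convert to 720p WebM", `steps` would hold two entries: one describing the trim and one describing the WebM conversion.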

Yes. The FFmpeg Micro MCP server is published as `@ffmpeg-micro/mcp-server` on npm. You can inspect the source, fork it, or contribute.
