I Built Dusk: Playwright MCP, but for Flutter Apps

Source: dev.to · Published 2026-06-14 17:09 UTC · Topic: developer-tools

A developer built Dusk, an open-source tool that gives AI agents direct access to Flutter app semantics via VM Service extensions, enabling live, unscripted testing without test files or build steps. Dusk provides 32 CLI commands and 31 MCP tools for actions like tap, type, scroll, and screenshot, with a 6-step actionability gate to ensure reliability. The tool is designed for ad hoc driving of running Flutter apps by humans and AI agents, complementing existing test suites.

4 min read · Published Jun 14, 2026

Last week I watched my AI agent try to test a Flutter screen. It wrote a test file, ran `flutter test`, copied the stack trace back into the prompt, pasted a screenshot, and called it a workflow. It was slow, and it was guessing.

On the web, agents do not work like that anymore. Playwright MCP gives them an accessibility tree to read and stable refs to act on. 33k stars, no screenshot guessing. Flutter never had that layer.

So I built Dusk.

End-to-end testing on Flutter has always been a stitched-together ritual. `flutter_driver` ships a one-off socket protocol and is on the legacy track. `integration_test` runs in-process against a simulated `WidgetTester`, but you write a test file, build, run, and wait. Maestro is nice but pays around 3 seconds per action. Patrol is powerful but tends to be unstable on CI.

The deeper issue is the loop. An agent that wants to drive your app reaches for ad hoc `flutter test` runs, copies stack traces by hand, and pastes screenshots back. There is no live connection between the agent and the running app.

// The old loop: write a test file, build, run, wait, read the failure, repeat.
testWidgets('checkout flow', (tester) async {
  await tester.tap(find.byKey(const Key('checkout')));
  await tester.pumpAndSettle();
  // ...and you still rebuild and rerun the whole thing to see what happened.
});

Dusk attaches to a running Flutter app over VM Service extensions. No test file, no `flutter_test` harness, no build step. You start your app, attach, and the agent has eyes and hands.
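For readers who have not used VM Service extensions, here is a minimal sketch of the mechanism using only `dart:developer`. The extension name `ext.example.snap` and the stub payload are assumptions made up for illustration; the article does not say which method names or payloads Dusk actually registers.

```dart
import 'dart:convert';
import 'dart:developer' as developer;

/// Hypothetical extension name. Extension names must start with "ext.";
/// the names Dusk really uses are not given in the article.
const String snapExtension = 'ext.example.snap';

/// Handler with the signature dart:developer expects for service extensions.
Future<developer.ServiceExtensionResponse> handleSnap(
  String method,
  Map<String, String> parameters,
) async {
  // A real driver would walk the Semantics tree here. This returns a stub.
  final payload = <String, Object>{
    'method': method,
    'nodes': <String>['e1', 'e2'],
  };
  return developer.ServiceExtensionResponse.result(jsonEncode(payload));
}

/// Registers the handler so a VM Service client can call it while the app runs.
void registerSnapExtension() {
  developer.registerExtension(snapExtension, handleSnap);
}
```

An app would call `registerSnapExtension()` once at startup. After that, a tool attached to the VM Service can invoke the extension by name without any test harness in the loop.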

First it snapshots the Semantics tree:

dart run fluttersdk_dusk dusk:snap

That returns a YAML tree with stable `[ref=eN]` tokens. Every action targets a ref, so there is no brittle XPath and no coordinate guessing.

dusk:tap --ref=e7
dusk:type --ref=e3 --text "ada@fluttersdk.com"
dusk:screenshot
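To make the ref-based flow concrete, here is a rough sketch of how a script might pull refs out of a snapshot and build the argument list for a tap. Only the `[ref=eN]` token shape and the `dusk:tap --ref=e7` form come from the article; the sample snapshot text and the helper names `extractRefs` and `tapArgs` are hypothetical.

```dart
/// Matches tokens of the form [ref=e7], the shape quoted in the article.
final RegExp refPattern = RegExp(r'\[ref=(e\d+)\]');

/// Returns every ref found in a snapshot, in order of appearance.
List<String> extractRefs(String snapshot) {
  return refPattern.allMatches(snapshot).map((m) => m.group(1)!).toList();
}

/// Builds the CLI arguments for a tap, rejecting anything that is not a ref.
List<String> tapArgs(String ref) {
  if (!RegExp(r'^e\d+$').hasMatch(ref)) {
    throw ArgumentError('Not a ref: $ref');
  }
  return <String>['dusk:tap', '--ref=$ref'];
}

/// Hypothetical snapshot text. The real YAML layout is not shown in the
/// article; only the [ref=eN] tokens are.
const String sampleSnapshot = '''
- textfield "Email" [ref=e3]
- button "Sign in" [ref=e7]
''';
```

With this sample, `extractRefs(sampleSnapshot)` yields the two refs in document order, and `tapArgs` turns either one into the arguments for the tap command shown above.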

The same contracts power your terminal and your AI agent. `dusk:tap --ref=e7` on the CLI and `dusk_tap` as an MCP tool reach the exact same code path. 32 CLI commands and 31 MCP tools: snap, tap, type, scroll, drag, observe, screenshot, and a hot-reload-and-snap round trip that returns the new tree, a screenshot, and any exceptions in one call.
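One way to picture "the exact same code path" is a single handler table with two thin front-ends. This is a sketch, not Dusk's implementation: the `dusk:` and `dusk_` prefixes are inferred from the one example pair in the article, the handlers are stubs, and since the counts differ (32 commands, 31 tools) the real mapping is evidently not strictly one-to-one.

```dart
/// One handler per action. Both front-ends look it up by the same key.
typedef ActionHandler = String Function(Map<String, String> args);

/// Stub handlers standing in for the real actions.
final Map<String, ActionHandler> handlers = <String, ActionHandler>{
  'tap': (args) => 'tapped ${args['ref']}',
  'snap': (args) => 'snapshot taken',
};

/// CLI front-end: names look like dusk:tap.
String runCli(String command, Map<String, String> args) =>
    _dispatch(command, 'dusk:', args);

/// MCP front-end: tool names look like dusk_tap.
String runMcp(String tool, Map<String, String> args) =>
    _dispatch(tool, 'dusk_', args);

/// Strips the front-end prefix and calls the shared handler.
String _dispatch(String name, String prefix, Map<String, String> args) {
  if (!name.startsWith(prefix)) {
    throw ArgumentError('Unknown name: $name');
  }
  final handler = handlers[name.substring(prefix.length)];
  if (handler == null) {
    throw ArgumentError('Unknown action: $name');
  }
  return handler(args);
}
```

Because both entry points end in `_dispatch`, a behavior fix in a handler reaches the terminal and the agent at the same time, which is the property the article is describing.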

Every gesture passes a 6-step actionability gate before it runs: not defunct, enabled, non-zero rect, on-viewport (it auto-scrolls), stable across 2 frames, and actually hit-testable. So your agent never taps a button that is not really there yet.
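The six checks can be modeled as an ordered list of guards over a node's state. The sketch below is plain Dart with a hypothetical `NodeState` type; it assumes the checks run in the order listed and simply reports the viewport check as a failure, whereas the article says Dusk auto-scrolls in that case.

```dart
/// Hypothetical snapshot of the facts the gate needs about one target node.
class NodeState {
  const NodeState({
    required this.defunct,
    required this.enabled,
    required this.width,
    required this.height,
    required this.onViewport,
    required this.stableFrames,
    required this.hitTestable,
  });

  final bool defunct;
  final bool enabled;
  final double width;
  final double height;
  final bool onViewport;
  final int stableFrames; // consecutive frames with an unchanged rect
  final bool hitTestable;
}

/// Returns the name of the first failed check, or null if the node is actionable.
String? firstFailedCheck(NodeState n) {
  if (n.defunct) return 'defunct';
  if (!n.enabled) return 'disabled';
  if (n.width <= 0 || n.height <= 0) return 'zero rect';
  // Dusk is described as auto-scrolling here; this sketch only reports it.
  if (!n.onViewport) return 'off viewport';
  if (n.stableFrames < 2) return 'not stable';
  if (!n.hitTestable) return 'not hit-testable';
  return null;
}
```

Returning the first failure rather than a plain boolean gives the caller a reason it can surface to the agent, so a blocked tap can be retried or explained instead of silently misfiring.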

This is the part that turns "drive the app" from a demo into something you trust. The boring check is the whole point.

Dusk does not replace your test suite. It owns a different niche: the unscripted, running app.

Tool	What it is	Where Dusk fits
integration_test	Authored test file via `WidgetTester`	Owns the test file. Dusk owns the live, unscripted app. Use both.
patrol	Native dialogs on integration_test	Owns authored tests with native permissions. Dusk owns ad hoc driving by humans and agents.
flutter_driver	Legacy socket protocol	Dusk is hot-restart safe, one contract for CLI and MCP, no separate isolate.
maestro	YAML DSL over the OS accessibility layer	Dusk drives the Flutter widget tree directly. Zero YAML to author.
playwright-mcp	Browser MCP via the accessibility tree	Dusk is the Flutter-native equivalent, ported to Semantics.

flutter pub add fluttersdk_dusk
dart run fluttersdk_dusk dusk:install

`dusk:install` patches `lib/main.dart` behind `kDebugMode` and scaffolds the CLI. Release builds tree-shake the entire driver across web, desktop, and mobile, so Dusk never ships to production.
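The reason a `kDebugMode` guard lets release builds drop the driver is that the flag is a compile-time constant. The sketch below reproduces how Flutter's foundation library derives that constant, without importing Flutter, and wraps a hypothetical install call in it. The helper `installDriverIfDebug` is an assumption for illustration; the article does not show what `dusk:install` writes into `main.dart`.

```dart
/// Mirrors the definitions of kReleaseMode, kProfileMode, and kDebugMode
/// in Flutter's foundation library.
const bool isReleaseBuild = bool.fromEnvironment('dart.vm.product');
const bool isProfileBuild = bool.fromEnvironment('dart.vm.profile');
const bool isDebugBuild = !isReleaseBuild && !isProfileBuild;

/// Hypothetical stand-in for the guarded call in main.dart.
/// Returns true if the driver was installed.
bool installDriverIfDebug(void Function() install) {
  // isDebugBuild is const, so in a release build this branch is dead code
  // that the compiler can remove along with everything only it references.
  if (isDebugBuild) {
    install();
    return true;
  }
  return false;
}
```

Since nothing outside the guarded branch refers to the driver, a release build has no live reference left to keep it in the binary.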

Wire it into your agent with one more command:

dart run fluttersdk_dusk mcp:install

That registers the stdio MCP server for Claude Code, Cursor, Windsurf, VS Code Copilot, and any MCP-compatible agent. Dusk also ships its own agent skill, so the agent learns the ref grammar and the tool surface, not just the syntax.

Two things stuck with me.

First, the accessibility tree is the right interface for agents on Flutter just as much as on the web. Semantics nodes are stable, cheap, and already there. Screenshots are the slow, expensive fallback, not the default.

Second, the actionability gate matters more than the tool count. An agent that taps confidently on a widget that has not settled is worse than no automation at all. The 6-step check is what makes the rest usable.

Docs: https://fluttersdk.com/dusk

Agent setup: https://fluttersdk.com/dusk/ai

If you try it with your agent, I would love to hear what breaks. That's all.
