How to Write a Flutter Agent Skill That Actually Works: The 2026 Recipe

A developer has published a recipe for writing effective Flutter agent skills, synthesizing official guidance from Flutter, Anthropic, Google, and OpenAI into a single copy-pasteable format. The recipe emphasizes that a great skill is a tightly scoped `SKILL.md` file with a description engineered for discovery, ruthless conciseness, anti-patterns stated upfront, a checklist workflow, and a feedback loop. The format is an open standard that works across Claude Code, OpenAI Codex, Google Antigravity, Gemini CLI, and Cursor.

TL;DR: A great agent skill is not a pile of documentation. It is a tightly scoped `SKILL.md` with a description engineered for discovery, ruthless conciseness, anti-patterns stated up front, a checklist workflow, and a feedback loop. The format is an open standard that works across Claude Code, OpenAI Codex, Google Antigravity, Gemini CLI, and Cursor. This post synthesizes the official authoring guidance from Flutter, Anthropic, Google, and OpenAI into one recipe, hands you a complete copy-pasteable Flutter skill, and shows you how to actually evaluate it instead of guessing.

In my last article, I wrote about the official Dart and Flutter Agent Skills and why they stop your AI from writing 2022 Flutter. The most common reply I got was some version of the same question: "Cool. How do I write my own?"

So I went and read the actual playbooks. Not the hot takes, the primary sources: Flutter's skill docs and eval framework, Anthropic's skill authoring best practices, Google's Antigravity skill docs, and OpenAI's Codex skill guide. The good news is they agree on almost everything. The better news is that the gap between a skill that works and a skill that gets silently ignored comes down to a handful of decisions, and most people get them wrong.

Here is the recipe, Flutter-flavored.

AI agents are generalists.
They average across years of Flutter code, much of it deprecated, and hand you the most statistically common answer instead of the currently correct one. The Flutter team named this the knowledge gap: the framework ships features faster than language models can update their training data. Skills exist to close that gap by handing the agent a task-specific, expert workflow.

But here is what nobody tells you. A poorly written skill does not just fail to help. It actively costs you. Every skill's metadata sits in the agent's context budget at all times. A vague skill that never triggers is dead weight. A skill with a fuzzy description that triggers on the wrong tasks is worse, because now your agent is following the wrong playbook with full confidence.

The bar is not "wrote some Markdown." The bar is "the agent reliably finds it, trusts it, and follows it." Everything below is in service of that bar.

A skill is the simplest possible thing: a folder with one required file.

    building-riverpod-async-screens/
    ├── SKILL.md        # Required: metadata + instructions
    ├── references/     # Optional: deep-dive docs loaded on demand
    ├── examples/       # Optional: reference implementations
    ├── scripts/        # Optional: scripts the agent runs, not reads
    └── assets/         # Optional: templates, images

The `SKILL.md` itself is YAML frontmatter plus a Markdown body:

    ---
    name: building-riverpod-async-screens
    description: "Build a Flutter screen that loads async data with Riverpod..."
    ---
    # Building Riverpod Async Screens

    instructions go here

The magic that makes this scale is progressive disclosure. At startup the agent loads only the lightweight metadata (name, description, path) of every skill. It reads the full `SKILL.md` only when a task matches, and it reads anything in `references/` or `examples/` only when the body points it there. If you write Flutter, you already know this pattern: it is deferred loading for the context window. OpenAI, Anthropic, and Google all describe the exact same mechanism.
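To make progressive disclosure concrete, here is a minimal Python sketch of the two-stage load. It is an illustration under assumptions, not any agent's actual loader: `SkillMeta`, `split_frontmatter`, `load_metadata`, and `load_body` are hypothetical names, and the parser only understands flat `key: value` frontmatter rather than full YAML.

```python
from dataclasses import dataclass


@dataclass
class SkillMeta:
    """The lightweight record kept in context for every installed skill."""
    name: str
    description: str
    path: str


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (frontmatter fields, Markdown body).

    Simplification (assumption): only flat `key: value` pairs are parsed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter fence")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is never closed with '---'") from None
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


def load_metadata(path: str, text: str) -> SkillMeta:
    """Stage 1: runs at startup for every skill; the body is discarded."""
    fields, _ = split_frontmatter(text)
    return SkillMeta(fields["name"], fields["description"], path)


def load_body(text: str) -> str:
    """Stage 2: runs only once a task has matched the description."""
    _, body = split_frontmatter(text)
    return body


SAMPLE = """---
name: building-riverpod-async-screens
description: "Build a Flutter screen that loads async data with Riverpod..."
---
# Building Riverpod Async Screens

instructions go here
"""

meta = load_metadata(
    ".agents/skills/building-riverpod-async-screens/SKILL.md", SAMPLE
)
print(meta.name)
print(len(load_body(SAMPLE)), "characters of body loaded on demand")
```

The point of the split is that the context cost of an idle skill is roughly its `name` plus `description`. The body only costs tokens after the description has earned the match.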
This is the part that makes writing a skill worth your time. `SKILL.md` is an open standard published at agentskills.io. It originated at Anthropic and has since been adopted across the ecosystem. One skill works almost everywhere:

| Tool | Vendor | Where skills live |
|---|---|---|
| Claude Code | Anthropic | `.claude/skills/` (project), `~/.claude/skills/` (personal) |
| OpenAI Codex | OpenAI | `.codex/skills/` (project), `~/.codex/skills/` or `~/.agents/skills/` |
| Antigravity | Google | `.agents/skills/` (workspace), `~/.gemini/antigravity/skills/` (global) |
| Gemini CLI | Google | `SKILL.md` standard locations |
| Cursor / Copilot | Various | supported with manual placement |

The Flutter team's installer targets the cross-tool location directly:

    npx skills add flutter/skills --skill ' ' --agent universal

The `--agent universal` flag drops everything into `.agents/skills`, the folder compatible agents auto-discover. Write a skill once, and your whole team gets the same expertise regardless of which agent they prefer. Codex adds a distribution layer on top (it calls the authoring format a "skill" and the installable package a "plugin"), but the core file is identical.

Every official source converges on the same ingredients. I have ordered them by how much they matter in practice.

If your skill does not trigger, it is almost never the instructions. It is the description. This is the single most important line in the entire file, because it is the only part the agent reads when deciding whether to load your skill at all, often choosing from 100+ candidates.

Three rules from the official guidance:

Compare:

    # Weak: vague, no triggers, will rarely fire correctly
    description: Helps with Flutter screens.

    # Strong: what + when + triggers + boundary
    description: Build a Flutter screen that loads async data with Riverpod, handling loading, error, and data states with AsyncValue. Use when fetching from a repository or API and rendering spinners, retry UI, and lists. Do not use for purely static screens with no async data.
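The weak/strong contrast above can be turned into a rough pre-commit check. The sketch below is a heuristic of my own, not part of any official tooling: `lint_description` is a hypothetical helper, the 12-word minimum is an arbitrary threshold, and matching on the phrases "Use when" and "Do not use" is only a proxy for the "when" and "boundary" parts of a description.

```python
import re


def lint_description(description: str) -> list[str]:
    """Return heuristic warnings for a skill description (empty list = passes)."""
    text = description.lower()
    warnings: list[str] = []
    # Hypothetical threshold: very short descriptions rarely say what + when.
    if len(description.split()) < 12:
        warnings.append("too short to state what the skill does and when")
    # Proxy for the "when" clause that tells the agent what triggers the skill.
    if not re.search(r"\buse (this )?when\b", text):
        warnings.append("no 'Use when ...' trigger clause")
    # Proxy for the boundary that keeps the skill from firing on wrong tasks.
    if not re.search(r"\bdo not use\b|\bdon't use\b", text):
        warnings.append("no 'Do not use ...' boundary clause")
    return warnings


WEAK = "Helps with Flutter screens."
STRONG = (
    "Build a Flutter screen that loads async data with Riverpod, handling "
    "loading, error, and data states with AsyncValue. Use when fetching from "
    "a repository or API and rendering spinners, retry UI, and lists. Do not "
    "use for purely static screens with no async data."
)

for label, desc in (("weak", WEAK), ("strong", STRONG)):
    print(label, lint_description(desc))
```

A check like this only catches missing structure. Whether the description actually triggers on the right tasks still has to be measured by running the agent against real prompts.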
Anthropic puts it perfectly: the context window is a public good. Your skill shares it with the system prompt, the conversation, every other skill's metadata, and the user's actual request.

The default assumption must be that the agent is already very smart. Do not explain what Flutter is. Do not explain what a widget is. Do not define JSON. Challenge every sentence: does the agent really not know this?

Keep the `SKILL.md` body under 500 lines. If it grows past that, split it into `references/` files.

    <!-- Bad: wastes tokens on what the model already knows -->
    Flutter is Google's UI toolkit. A widget is a building block of the UI.
    To make a network call, you first need an HTTP client, which is a piece
    of software that...

    <!-- Good: assumes competence, gets to the point -->
    Use the http package for REST calls. Wrap responses in a typed model.

This framing from Anthropic is the one most people miss. Think of the agent as a robot walking a path, and decide how much freedom each step should give it. A low-freedom instruction pins the exact command: "`dart run build_runner build --delete-conflicting-outputs`. Do not modify the flags."

Fragile, deterministic Flutter operations (code generation, migrations, platform config) want low freedom. Architectural and design decisions want high freedom. Most skills need a mix.

This is what makes the official Flutter skills so effective, and it is the ingredient that separates a senior skill from a junior one. Do not only say what to do. Ban the wrong instinct explicitly.

The official `flutter-build-responsive-layout` skill does exactly this. It does not just say "be responsive." It says: do NOT switch layouts on `MediaQuery.orientationOf`, do NOT check for "phone" vs "tablet", do NOT lock orientation. Those negative rules are what stop the model from reaching for the plausible-but-wrong pattern it learned from a thousand old tutorials.

    ## Rules
    - Use AsyncValue.when to render data/loading/error. Never assume data is present.
    - Do NOT use FutureBuilder for server state. It re-runs on every rebuild and causes duplicate network calls.
    - Do NOT swallow exceptions or show an infinite spinner on failure.

For any multi-step task, give the agent a checklist it can copy into its response and tick off. This prevents skipped steps, which is the most common failure mode on complex work. Both Anthropic and Flutter's own skills use this pattern.

    ## Workflow
    Copy this checklist and track progress:
    - Define the immutable data model.
    - Add the repository method returning Future