# Claude Code Workflows: Deterministic Multi-Agent Orchestration

Anthropic's Claude Code introduced deterministic multi-agent orchestration workflows, enabling developers to script parallel agent execution, fan-out, reduce, and synthesis patterns. A 130-line workflow example demonstrated nine agents independently scanning Vue and Nuxt ecosystem sources, merging findings, and ranking them by impact, showcasing the feature's ability to handle comprehensive, confidence-critical, and large-scale tasks beyond a single context window.

Claude Code shipped workflows recently, and the docs describe a lot of machinery: deterministic orchestration, `parallel` and `pipeline`, journaling and resume, adversarial verify patterns. I wanted to understand it rather than skim the feature list, and the way I learn a tool is to build the smallest real thing with it.

So I picked a task with an obvious fan-out shape: “what happened in the Vue and Nuxt ecosystem this week.” Many independent sources to check, then a merge, then a write-up. I wrote a ~130-line workflow that spawns nine agents in parallel, each scouring a different source, collects their findings into one list, ranks them by impact, and writes a digest. It’s a throwaway, but building it taught me how the whole feature fits together. This post is what I learned.

A workflow is the newest piece of Claude Code’s orchestration story. In my post on agent teams, [Claude Code Agent Teams: How Multiple Sessions Coordinate](/posts/from-tasks-to-swarms-agent-teams-in-claude-code/), I traced the progression from subagents to teams. Workflows are the next rung, and they solve a different problem than either: when you want the control flow itself to be deterministic, not decided turn-by-turn by a model.

If you want a sense of the ceiling before the toy example, Jarred Sumner credited dynamic workflows and adversarial code review for porting Bun from Zig to Rust in six days:

> Dynamic workflows and adversarial code review was part of what made it possible to rewrite Bun in Rust in 6 days.
>
> > New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word "workflow" in a prompt to get started. ✨

## TLDR

- A workflow is a plain JavaScript script that orchestrates subagents deterministically: you own the loops and fan-out, agents do the thinking
- The shape that generalizes: fan out → reduce → synthesize
- `agent` runs one subagent (use a `schema` for validated JSON), `parallel` is a barrier, `pipeline` streams items through stages with no barrier
- Default to `pipeline`; reach for a `parallel` barrier only when a stage needs all prior results at once
- Compose verify/judge/loop-until-dry patterns for confidence, not more agents
- It's opt-in and token-hungry, so reach for it when a job needs breadth, verification, or scale a single context can't hold

## Where Workflows Fit

Most of the time a single Claude Code session works turn-by-turn: read a file, decide, call a tool, look at the result, decide again. That loop is the right tool for most work. Some jobs don’t fit one head and one context window though:

- Comprehensive jobs: “review every file in this diff”, “audit all 40 dependencies”.
- Confidence-critical jobs: “find the bug, then have three independent skeptics try to refute it”.
- Scale jobs: migrations, sweeps, anything bigger than one context can hold.

Subagents and agent teams can attack these, but there’s a subtle difference in who holds the plan.

| | Subagents | Agent Teams | Workflows |
|---|---|---|---|
| What it is | A worker Claude spawns | Independent Claude sessions | A script the runtime executes |
| Who decides what’s next | Claude, turn by turn | Claude and the teammates | The script |
| Where results live | Claude’s context | Each session’s context | Script variables |
| What’s repeatable | The worker definition | The team setup | The orchestration itself |
| Scale | A few per turn | A handful of sessions | Dozens to hundreds of agents |

With subagents and skills, Claude is the orchestrator. It decides turn by turn what to spawn, and every result lands back in its context window. A workflow moves the plan into code. The script holds the loop, the branching, and the intermediate results, so Claude’s context only ever sees the final answer. That is what lets a workflow scale to hundreds of agents without drowning the conversation.

## The Core Idea

A normal agent decides the control flow as it goes. A workflow inverts that. You write the control flow as plain code, and each individual step is delegated to a fresh subagent. The orchestration is deterministic; only the work inside each agent call is model-powered.

That distinction is the whole point. When you write this:

```js
// One thunk per file; parallel() runs them concurrently and returns an array.
const results = await parallel(files.map((f) => () => agent(`Review ${f}`)));
```

You know exactly one agent runs per file, they all run concurrently, and you get an array back. There are no emergent “the model decided to skip three files” surprises. You get determinism in the orchestration and model judgment inside each step.

The shape that keeps showing up is fan out → reduce → synthesize:

```mermaid
graph LR
  A["fan out"] --> B["agent 1"]
  A --> C["agent 2"]
  A --> D["agent ..."]
  A --> E["agent N"]
  B --> F["reduce: dedupe + rank"]
  C --> F
  D --> F
  E --> F
  F --> G["synthesize: write the result"]
```

Swap the sources and prompts and the same skeleton becomes a market scan, a dependency audit, a code review, or a research report.

## The Example I Built to Learn It

Here is the workflow I wrote.
I picked the newsletter task because it forces you to use every part of the feature: a wide fan-out, a reduce step, and a synthesis step.

Every script starts with a `meta` block that must be a pure literal, then a body using the orchestration primitives.

```js
export const meta = {
  name: "vue-newsletter",
  description: "Research Vue/Nuxt sources in parallel and synthesize a newsletter",
  phases: [
    { title: "Research", detail: "one agent per source" },
    { title: "Curate", detail: "dedupe + rank by impact" },
    { title: "Write", detail: "synthesize the newsletter" },
  ],
};
```

### 1. Fan out with `parallel`

Nine sources, nine agents, all at once. Each returns structured JSON validated against a schema, so the model retries on mismatch and I never parse free text:

```js
phase("Research");
const raw = await parallel(
  SOURCES.map((s) => () =>
    agent(s.prompt, {
      label: `research:${s.key}`,
      phase: "Research",
      schema: ITEM_SCHEMA, // forces validated structured output
      agentType: "general-purpose",
    }),
  ),
);
```

The `SOURCES` array is just data: one entry per source with a prompt. GitHub core releases, the Nuxt ecosystem, the official blogs, Hacker News, Reddit, dev.to, key people like Evan You and Anthony Fu, and the newsletter/podcast circuit.

### 2. Reduce with plain JavaScript

Flattening, deduping, and filtering is just code. No agent needed:

```js
const collected = raw.filter(Boolean); // skipped/failed agents become null
const flatItems = collected.flatMap((c) => c.items);
log(`Collected ${flatItems.length} items`);
```

### 3. Synthesize with sequential `agent` calls

```js
phase("Curate");
const curated = await agent(curatePrompt, { phase: "Curate", schema: CURATED_SCHEMA });

phase("Write");
const newsletter = await agent(writePrompt, { phase: "Write" });

return { newsletter, itemCount: flatItems.length, curated };
```

The run I did while testing pulled together a Nuxt UI release, a Vue Router v5 minor, a Vue core patch, and a Madrid conference recap: seventeen items across nine sources in about three minutes.
Good enough to convince me the orchestration worked, which was the whole point of building it.

## The Primitives

A handful of functions do all the work.

`agent(prompt, opts?)` spawns one subagent. Without options it returns the agent’s final text. The options worth knowing:

- `schema`: a JSON Schema. The subagent is forced to return validated structured data.
- `label`: the display name in the progress UI.
- `phase`: assigns the agent to a progress group. Use it inside `parallel` and `pipeline` to avoid racing on the global phase state.
- `model`: override the model for this one call. Default is to omit it so the agent inherits your session model.
- `agentType`: use a custom subagent type instead of the default workflow agent.
- `isolation: "worktree"`: run the agent in its own git worktree. Only when agents write files in parallel and would otherwise conflict.

`parallel(thunks)` runs tasks concurrently. It is a barrier: it waits for every thunk before returning. A thunk that throws resolves to `null` rather than rejecting the whole call, so always `.filter(Boolean)` the results. You can pass a hundred thunks and they’ll all complete, but only a handful run at once: concurrency is capped at roughly your core count, and the excess is queued.

`pipeline(items, ...stages)` runs each item through all stages independently, with no barrier between stages. Item A can be in stage 3 while item B is still in stage 1. Each stage callback receives `(prevResult, originalItem, index)`.

`workflow(nameOrRef, args?)` runs another workflow inline as a sub-step and returns whatever it returns. Pass a name to invoke a saved workflow, or `{ scriptPath }` to run a script file. This is composition: a research workflow can call `/deep-research` as one of its stages instead of reimplementing the fan-out. The child shares the parent’s concurrency cap, agent counter, and token budget, and shows up as its own group in `/workflows`. Nesting is one level deep: a `workflow` call inside a child throws.

```js
// inside a script: hand a sub-question off to the bundled deep-research workflow
const report = await workflow("deep-research", { question: topic });
```

The rest are small helpers: `phase(title)` starts a progress group, `log(msg)` emits a narrator line, `args` carries the JSON you passed in when launching, and `budget` exposes the token target so you can scale depth dynamically (it’s `null` when you launch without a target, so guard any loop-until-budget on `budget.total` or it runs to the agent cap).

> Warning: `Date.now()`, `Math.random()`, and an argless `new Date()` all throw inside a workflow. Workflows journal every agent call so a run can resume, and non-determinism would invalidate that cache. If you need a timestamp, pass it through `args`. If you need variety across agents, vary the prompt or label by index.

## `pipeline` vs `parallel`: The Decision That Matters

This trips people up, so here is the rule I follow.

Default to `pipeline`. Reach for a `parallel` barrier between stages only when a stage needs all prior results at once.

Legitimate reasons for a barrier:

- ✅ Dedupe or merge across the full result set before expensive downstream work.
- ✅ Early-exit on the total (“0 findings, skip verification entirely”).
- ✅ A prompt that references “the other findings” for comparison.

Not legitimate:

- ❌ “I need to flatten or filter first.” Do it inside a pipeline stage.
- ❌ “The stages feel conceptually separate.” Separate is not the same as synchronized.
- ❌ “It’s cleaner code.” Barrier latency is real wall-clock waste.

The smell test: if you wrote `parallel → transform → parallel`, and that middle transform has no cross-item dependency, you should have used a pipeline.

The newsletter example does use a barrier, and correctly: curation has to see every source before it can dedupe and rank across them.

## Quality Patterns

The primitives compose into reusable harnesses. This is the real value over spawning more agents: the structure is what produces confidence.
A few I lean on:

- Adversarial verify: for each finding, spawn N independent skeptics prompted to refute it. Kill the finding unless it survives a majority of them. Stops plausible-but-wrong findings from shipping.
- Perspective-diverse verify: give each verifier a distinct lens (correctness, security, performance, does-it-reproduce) instead of N identical ones. Diversity catches failure modes redundancy can’t.
- Judge panel: generate N attempts from different angles, score with parallel judges, synthesize from the winner while grafting the best of the runners-up.
- Loop-until-dry: for unknown-size discovery, keep spawning finders until K consecutive rounds surface nothing new.

Here is loop-until-dry with a diverse-lens verify, condensed:

```js
const seen = new Set(); // keys of every finding ever surfaced, confirmed or not
const confirmed = [];
let dry = 0;

while (dry < 2) {
  const found = (
    await parallel(
      FINDERS.map((f) => () => agent(f.prompt, { phase: "Find", schema: BUGS })),
    )
  )
    .filter(Boolean)
    .flatMap((r) => r.bugs);

  // key(b) is a helper (not shown) that returns a stable identity for a finding
  const fresh = found.filter((b) => !seen.has(key(b)));
  if (!fresh.length) { dry++; continue; }
  dry = 0;
  fresh.forEach((b) => seen.add(key(b)));

  const judged = await parallel(
    fresh.map((b) => () =>
      parallel(
        ["correctness", "security", "repro"].map((lens) => () =>
          agent(`Judge "${b.desc}" via the ${lens} lens — real?`, {
            phase: "Verify",
            schema: VERDICT,
          }),
        ),
      ).then((vs) => ({
        b,
        real: vs.filter(Boolean).filter((v) => v.real).length >= 2,
      })),
    ),
  );
  // v?.real tolerates a null entry if an outer thunk failed
  confirmed.push(...judged.filter((v) => v?.real).map((v) => v.b));
}
```

One detail makes or breaks this: dedupe against everything in `seen`, not just confirmed results. Otherwise rejected findings reappear every round and the loop never converges.

## A Shipped Example: How `/deep-research` Works

My newsletter generator is a toy. If you want to see these patterns in a real, bundled workflow, run `/deep-research`. It takes a question and returns a cited report, and under the hood it’s the same fan out → reduce → synthesize skeleton with an adversarial verify pass bolted on. It’s the quality pattern from the section above, running in production.
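
Once the skeptic agents have returned, an adversarial verify pass reduces to a vote-counting rule in plain JavaScript. The sketch below is illustrative only and is not taken from the bundled workflow: the verdict shape `{ refuted: boolean }`, the function names `survivesRefutation` and `keepSurvivors`, and the choice to count a failed skeptic (`null`) against the claim are all assumptions made for the example.

```js
// Hypothetical verdict shape: { refuted: boolean }. A null entry stands for a
// skeptic that failed or was skipped, mirroring how parallel() reports failures.
function survivesRefutation(verdicts) {
  const upheld = verdicts.filter(Boolean).filter((v) => !v.refuted).length;
  // Strict majority of all skeptics asked, so a missing vote counts against the claim.
  return upheld > verdicts.length / 2;
}

// claims: [{ text, verdicts }]; returns the texts of the claims that held up.
function keepSurvivors(claims) {
  return claims.filter((c) => survivesRefutation(c.verdicts)).map((c) => c.text);
}

// Made-up sample data, three skeptics per claim.
const sample = [
  { text: "claim A", verdicts: [{ refuted: false }, { refuted: false }, { refuted: true }] },
  { text: "claim B", verdicts: [{ refuted: true }, { refuted: false }, null] },
  { text: "claim C", verdicts: [{ refuted: false }, null, { refuted: false }] },
];

console.log(keepSurvivors(sample)); // [ 'claim A', 'claim C' ]
```

Counting a missing vote against the claim is the conservative choice. Dividing by the number of skeptics that actually answered would let a claim pass on a single surviving vote when the other skeptics crashed.
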

When you launch it, the workflow announces its plan and runs in the background while you keep working.

It moves through five phases:

1. Scope: one agent decomposes your question into five distinct search angles, so the searches don’t all chase the same wording.
2. Search: five web searches run in parallel, one per angle. This is the fan-out.
3. Fetch: dedupe the URLs across angles, pull the top ~15 sources, and extract individual claims from them.
4. Verify: the interesting part. Each claim gets an adversarial three-vote check, with skeptics trying to refute it. Claims that don’t survive never reach the report.
5. Synthesize: one final agent writes the cited report from the claims that held up.

Map that onto the primitives and you can almost see the script: a single `agent` for scope, a `parallel` fan-out for the five searches, plain JavaScript to dedupe in fetch, a per-claim verify pass (the same `parallel` of skeptics from the loop-until-dry example), and a closing `agent` to synthesize. The phases show up in `/workflows` as named groups (`Scope 1/1`, `Search 0/5`, `Fetch`, `Verify`, `Synthesize`), each with its own agent count, token total, and elapsed time, so you can drill into any single search or verification and read its prompt and result.

This is the difference between “ask Claude to research something” and a workflow. A single agent doing web research holds every half-read source in one context and never checks its own claims. `/deep-research` decomposes the search so coverage is wide, keeps the intermediate sources out of your conversation, and runs a verification pass a single turn-by-turn agent would never run against itself.

## Triggering and Watching a Run

Worth saying plainly: from Claude Code’s side, a workflow is a tool. There’s a `Workflow` tool the same way there’s a `Read` or `Bash` tool, and “running a workflow” means Claude calls that tool with a script.
The runtime executes the script in the background while your session stays responsive, which is why you can keep chatting while dozens of agents churn away.

There are a few ways a workflow gets written and launched:

- Say “workflow” in your prompt. Include the word and Claude writes a workflow script for the task instead of working through it turn by turn.
- Run a saved or bundled command. A workflow you saved to the project, or the built-in `/deep-research` covered above.
- Turn on `ultracode`. Claude plans a workflow for every substantial task in the session.

> Run a workflow to audit every API endpoint under src/routes/ for missing auth checks. Spawn one agent per route file, then have a second pass verify each finding before reporting.

When a run does what you wanted, you can save it: Claude Code writes the script into `.claude/workflows/` in your repo as a