Dynamic Workflows in Opus 4.8: Build a Self-Verifying PR Reviewer

Opus 4.8 introduces dynamic workflows that replace the traditional chat-based interaction model with an orchestrator-driven system where plain code spawns and manages hundreds of parallel subagents. The orchestrator controls workflow graphs using two primitives, `parallel()` for independent fan-out and `pipeline()` for sequential dependencies, while each subagent receives a focused prompt, structured-output schema, and individual effort control. A developer demonstrated the system by building a self-verifying pull-request reviewer that fans out across correctness, security, and performance dimensions, then adversarially refutes every finding before presenting results.

Most people use Opus 4.8 the way they used every model before it: open a chat, type a request, watch the cursor, correct it, repeat. That's a conversation. A dynamic workflow is something else entirely.

The shift is this: you stop being the loop. Instead, an orchestrator (plain code you control) spawns subagents you design, fanning out work in parallel, running steps in sequence, judging and merging results, and reporting back when the whole thing is done. Opus 4.8 can drive hundreds of parallel subagents inside a single workflow, with effort control per node so cheap steps stay cheap and hard steps think harder.

In this tutorial you'll learn the core patterns by building one concrete thing: a pull-request reviewer that fans out across correctness, security, and performance, then adversarially verifies every finding before it reaches you.

```js
// You design the shape. The orchestrator runs it.
const found = await parallel(DIMENSIONS.map(d => () => agent(d.prompt, { schema: FINDINGS })))
const deduped = dedupeByFileLine(found.flatMap(r => r.findings))
const verified = await parallel(deduped.map(f => () => agent(refutePrompt(f), { schema: VERDICT })))
const real = verified.filter(v => v.refuted === false)
```

By the end you'll know when to reach for `parallel` versus `pipeline`, how structured output schemas keep subagents composable, and where to set effort per node.

Stop thinking "I send a prompt, I get a completion." Start thinking: an orchestrator runs a workflow graph, and each node is an agent call.

The orchestrator is plain code. It decides what runs, in what order, and what to do with each result. Subagents are the leaf workers: each gets a focused prompt, a structured-output schema, and its own effort setting. The unit of work is no longer the prompt; it's the graph.

Two primitives compose every graph, and the difference between them is entirely about barriers: when the orchestrator blocks and waits.

`parallel` is a barrier

`parallel` fans work out to many subagents at once and resolves only when all of them return. Nothing downstream runs until the slowest node finishes. Use it for independent work that must be fully collected before the next decision: one subagent per review dimension, N-way verification, hundreds of concurrent checks.

```js
// FAN-OUT: dimensions are independent → run them together
const found = await parallel(
  DIMENSIONS.map(d => () => agent(d.prompt, { schema: FINDINGS, effort: "medium" }))
)
// barrier: every dimension has returned before we continue
const deduped = dedupeByFileLine(found.flatMap(r => r.findings)) // plain code, no agent
```

Note the `() =>` thunks. `parallel` invokes them itself: it schedules the work; it doesn't receive already-started promises.

`pipeline` enforces order

`pipeline` chains stages where stage N+1 depends on stage N's output. Each stage blocks until its input exists, so the stages run strictly in sequence and the latencies add up.
Reach for it when there's a true data dependency: you can't synthesize a review before findings exist, and you can't verify findings before they're deduplicated.

```js
// Three stages: review, dedupe, verify. The result is the list of verdicts.
const verified = await pipeline(
  () => parallel(DIMENSIONS.map(d => () => agent(d.prompt, { schema: FINDINGS }))),
  found => dedupeByFileLine(found.flatMap(r => r.findings)),
  deduped => parallel(deduped.map(f => () => agent(refutePrompt(f), { schema: VERDICT }))),
)
```

Notice `dedupeByFileLine` is not an agent: deterministic work stays in code. You only spend a subagent where judgment is required.

The whole grammar: `parallel` for independence, `pipeline` for dependency. Real workflows alternate between the two, fanning out for breadth and chaining where order matters.

Every agent call above passes a `schema`. The model returns data shaped to that contract (`FINDINGS`, `VERDICT`, `REVIEW`), so you index fields instead of regexing prose. This is what lets the dedup and filter steps be plain code rather than yet another LLM call:

```js
const real = verified.filter(v => v.refuted === false)
```

Schemas are the seams that keep subagents composable. A node's output is machine-readable, so the next node, whether agent or code, consumes it without a parsing layer in between.

Most "AI code review" is one model, one prompt, one pass. It finds plausible bugs and reports them with equal confidence, including the ones that aren't real. Dynamic workflows let you do better: fan out across review dimensions in parallel, then make the model attack its own findings before reporting them.

Here's the full pipeline.

Run one subagent per review dimension. They don't depend on each other, so they execute concurrently behind a barrier.
```js
const DIMENSIONS = [
  { name: "correctness", prompt: correctnessPrompt(diff) },
  { name: "security", prompt: securityPrompt(diff) },
  { name: "performance", prompt: perfPrompt(diff) },
];

const found = await parallel(
  DIMENSIONS.map(d => () => agent(d.prompt, { schema: FINDINGS }))
);
```

Each agent call is an isolated subagent with its own context window, so the security reviewer never sees the performance reviewer's noise. `{ schema: FINDINGS }` forces a structured output: an array of `{ file, line, severity, claim }`, not prose you have to regex later.

Several reviewers will often flag the same line. Merging is deterministic set logic, so don't spend a model on it.

```js
const deduped = dedupeByFileLine(found.flatMap(r => r.findings));
```

`flatMap` flattens the per-dimension arrays into one list; `dedupeByFileLine` collapses entries sharing a `(file, line)` key. Use code wherever the answer is mechanical. Agents are for judgment, not joins.

This is the step that kills false positives. For each surviving finding, spawn a skeptic subagent whose only job is to refute it.

```js
const verified = await parallel(
  deduped.map(f => () => agent(refutePrompt(f), { schema: VERDICT }))
);
const real = verified.filter(v => v.refuted === false);
```

`refutePrompt(f)` instructs the subagent: "Here is a claimed bug. Prove it's wrong: find the guard, the caller, the type that makes it safe." `VERDICT` is `{ refuted: boolean, reason: string }`. A finding that survives a dedicated attacker is worth reporting; one that doesn't, isn't.

For higher-stakes findings, fan out N skeptics per finding and keep only what a majority can't refute. Verification scales independently of review:

```js
async function survivesQuorum(f, n = 3) {
  // n independent skeptics attack the same finding
  const verdicts = await parallel(
    Array.from({ length: n }, () => () => agent(refutePrompt(f), { schema: VERDICT }))
  );
  const refutals = verdicts.filter(v => v.refuted).length;
  // strict comparison so a tie on even n does not count as survival
  return refutals < n / 2; // a majority could not refute it
}
```

This is a judge pattern: refutation is adjudication, kept separate from the generation in the fan-out step.
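
One detail the filter above leaves open is how a verdict gets matched back to the finding it judged, since `VERDICT` as described carries only `refuted` and `reason`. The following is a minimal sketch, assuming `parallel` returns results in the same order as its input thunks (an assumption, not something the text states). The helper names `unrefuted` and `survivesTally` and the sample data are hypothetical.

```javascript
// Hypothetical helpers; shapes follow the FINDINGS and VERDICT descriptions in the text.

// Keep the findings whose verdict at the same index was not refuted.
// Assumes verdicts[i] judges findings[i].
function unrefuted(findings, verdicts) {
  if (findings.length !== verdicts.length) {
    throw new Error("expected exactly one verdict per finding");
  }
  return findings.filter((_, i) => verdicts[i].refuted === false);
}

// Quorum tally: survive only if strictly fewer than half of the skeptics refuted.
function survivesTally(verdicts) {
  const refuted = verdicts.filter(v => v.refuted).length;
  return refuted < verdicts.length / 2;
}

// Made-up sample data for illustration only.
const findings = [
  { file: "auth.js", line: 42, severity: "high", claim: "token is never validated" },
  { file: "db.js", line: 7, severity: "low", claim: "query runs inside a loop" },
];
const verdicts = [
  { refuted: false, reason: "no guard found on any call path" },
  { refuted: true, reason: "the loop runs at most once" },
];

console.log(unrefuted(findings, verdicts).map(f => `${f.file}:${f.line}`)); // [ 'auth.js:42' ]
```

Pairing by index keeps the verdict schema small; an alternative would be to have each skeptic echo the finding's file and line back in its verdict.
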
Asking a model to merely re-summarize its own findings launders the weak ones into the report. Refutation is a sharper filter than agreement.

One agent turns confirmed findings into the review a human reads.

```js
const review = await agent(synthesisPrompt(real), { schema: REVIEW });
```

Assembled into a single pipeline, the whole review looks like this:

```js
const review = await pipeline(
  () => parallel(DIMENSIONS.map(d => () => agent(d.prompt, { schema: FINDINGS }))),
  found => dedupeByFileLine(found.flatMap(r => r.findings)),
  deduped => parallel(deduped.map(f => () => agent(refutePrompt(f), { schema: VERDICT }))),
  (verified, deduped) => synthesize(deduped, verified), // keep only refuted === false, then write
);
```

`pipeline` is sequential: each stage's output feeds the next. `parallel` is the barrier inside stages 1 and 3.

Not every node deserves the same compute. Set effort per call: skeptics run cheap because refutation is a narrow question; synthesis runs at high effort because it's the artifact a human trusts.

```js
agent(refutePrompt(f), { schema: VERDICT, effort: "low" });
agent(synthesisPrompt(real), { schema: REVIEW, effort: "high" });
```

You spend reasoning where judgment is hard and conserve it where the question is narrow, and a human still approves the final review before anything posts.

`parallel` returns when the slowest node finishes; `pipeline` runs stages in sequence and accumulates their latency. Mismatching them is the most common cost mistake. Your review dimensions are independent, so fan them out instead of chaining them.

```js
// Good: 3 dimensions run concurrently, wall-time ≈ slowest dimension
const found = await parallel(DIMENSIONS.map(d => () => agent(d.prompt, { schema: FINDINGS })))

// Bad: same work, ~3x the latency for no reason
const foundSlow = await pipeline(
  () => agent(DIMENSIONS[0].prompt, { schema: FINDINGS }),
  () => agent(DIMENSIONS[1].prompt, { schema: FINDINGS }),
  () => agent(DIMENSIONS[2].prompt, { schema: FINDINGS }),
)
```

Reserve `pipeline` for true data dependencies: verify needs dedup's output, so that edge stays sequential.
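
To make the barrier-versus-sequence distinction concrete, here is a minimal sketch of what the two primitives could look like in plain JavaScript. These are hypothetical stand-ins, not the actual orchestrator API: `parallel` is assumed to behave like `Promise.all` over thunks, and `pipeline` is assumed to pass each stage the earlier outputs, most recent first, which is one reading consistent with the `(verified, deduped)` stage shown earlier. `fakeAgent` and `timed` are invented for the demo.

```javascript
// Hypothetical stand-ins for the orchestrator primitives; not the real API.

// parallel: invoke every thunk now, resolve once all of them have resolved.
async function parallel(thunks) {
  return Promise.all(thunks.map(thunk => thunk()));
}

// pipeline: run stages one after another. Each stage receives the earlier
// outputs as arguments, most recent first (assumed calling convention).
async function pipeline(...stages) {
  const outputs = [];
  for (const stage of stages) {
    const next = await stage(...outputs);
    outputs.unshift(next);
  }
  return outputs[0];
}

// Fake subagent: waits `ms` milliseconds, then returns its label.
const fakeAgent = (label, ms) =>
  new Promise(resolve => setTimeout(() => resolve(label), ms));

// Measure the wall-clock time of an async run.
async function timed(run) {
  const start = Date.now();
  const result = await run();
  return { result, elapsedMs: Date.now() - start };
}

async function demo() {
  const names = ["correctness", "security", "performance"];
  const fanOut = await timed(() => parallel(names.map(n => () => fakeAgent(n, 50))));
  const chained = await timed(() => pipeline(...names.map(n => () => fakeAgent(n, 50))));
  console.log("parallel ms:", fanOut.elapsedMs, "pipeline ms:", chained.elapsedMs);
}

demo();
```

With three equal workers, the fan-out should take about as long as one worker while the chain takes about three times that, which is the cost gap the Good/Bad comparison above describes.
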
Verification is the expensive phase: it can spawn N skeptics per finding. If correctness and security both flag `auth.js:42`, verifying twice burns budget for nothing. Collapse duplicates first with plain code; no agent required.

The synthesize step is your human-in-the-loop checkpoint. Confirmed findings are a recommendation, not an auto-commit: a person approves before anything lands.

Fan-out multiplies whatever your base node produces, so the base node's reliability matters. Anthropic reports Opus 4.8 makes roughly 4x fewer silent code bugs than its predecessor; the more trustworthy each leaf reviewer is, the safer it is to run many of them in parallel.

A single agent is the right default. Reach for a dynamic workflow only when the task has structure you can name: independent dimensions that fan out in parallel, a verification step that must be adversarial rather than self-graded, or a synthesis pass that depends on confirmed inputs.

The PR-review example earns its workflow because each stage has a different shape: fan out, collapse in code, fan out again to refute, then synthesize. `parallel` is the barrier; `pipeline` enforces order; schemas keep the seams machine-readable; effort goes high on synthesis and low on the narrow refutation passes.

Open question: which of your "trust me" agent steps is actually an unverified claim waiting for a skeptic?
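
As a companion to the dedupe-before-verify advice above, here is one way `dedupeByFileLine` might be written. The article names the function but never shows it, so this is a hypothetical sketch. It assumes findings shaped like `{ file, line, severity, claim }`, a `low`/`medium`/`high` severity scale (an assumption), and a policy of keeping the highest-severity entry per location.

```javascript
// Hypothetical implementation; the severity scale and merge policy are assumptions.
const SEVERITY_RANK = { low: 0, medium: 1, high: 2 };

// Collapse findings that share a file and line, keeping the most severe one.
function dedupeByFileLine(findings) {
  const byKey = new Map();
  for (const f of findings) {
    const key = `${f.file}:${f.line}`;
    const kept = byKey.get(key);
    if (!kept || SEVERITY_RANK[f.severity] > SEVERITY_RANK[kept.severity]) {
      byKey.set(key, f); // first sighting, or a more severe duplicate
    }
  }
  return [...byKey.values()]; // Map keeps first-insertion order of keys
}

// Made-up sample: two reviewers flag auth.js:42, one flags cache.js:10.
const sample = [
  { file: "auth.js", line: 42, severity: "medium", claim: "null check missing" },
  { file: "auth.js", line: 42, severity: "high", claim: "token is never validated" },
  { file: "cache.js", line: 10, severity: "low", claim: "unbounded cache growth" },
];

console.log(dedupeByFileLine(sample).length); // 2
```

Keeping the most severe duplicate is only one policy; merging the claims from every reviewer into a single entry would give the skeptic more to attack at the same cost.
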