{"slug": "agentic-design-patterns-the-shapes-every-coding-agent-reuses", "title": "Agentic Design Patterns: The Shapes Every Coding Agent Reuses", "summary": "Anthropic's guide on agentic design patterns categorizes all agentic systems into workflows and agents, where workflows use predefined code paths and agents let LLMs dynamically direct their own processes. The guide emphasizes starting with the simplest solution, a single augmented LLM call, and escalating complexity only when needed, using patterns like prompt chaining, routing, and verification loops. It provides decision rules and cost tradeoffs for each pattern, grounded in Anthropic's Building Effective Agents and the Claude Agent SDK.", "body_md": "This is an adapted excerpt from a guide in my AI Knowledge Hub. The full interactive version is linked at the end.\n\nAgentic design patterns are named control structures for arranging LLM calls and tools. This post gives you the decision rule for picking one, the exact shape of each pattern, and the cost each adds, so you can match a task to the minimum structure that solves it. Everything here is model-agnostic and grounded in Anthropic's *Building Effective Agents* and the Claude Agent SDK.\n\nAnthropic divides all agentic systems into two categories, and the split decides every downstream tradeoff:\n\n| Category | Definition | Control lives in | Use when |\n|---|---|---|---|\n| Workflow | LLMs and tools orchestrated through predefined code paths | Your code | You can pre-map the decision tree; want accuracy, control, lower cost |\n| Agent | LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks | The model | Open-ended task where you can't predict the number of steps |\n\nEvery pattern composes one unit: the **augmented LLM**, an LLM enhanced with retrieval, tools, and memory. It generates its own search queries, selects tools, and decides what to retain. 
**If a single augmented LLM call solves the task, stop: no pattern required.**\n\nThe escalation rule is the whole game: find \"the simplest solution possible, and only increasing complexity when needed\", which \"might mean not building agentic systems at all.\" Agentic systems trade latency and cost for better task performance, so only escalate when a specific failure mode forces it.\n\nFor open-ended tasks, every agent runs the same four-beat loop: gather → act → verify → repeat.\n\nIn the gather step, search with shell tools (`grep` / `find` / `tail` to pull relevant slices instead of whole files), or delegate to subagents with isolated context windows.\n\nWithout ground-truth feedback at each step, the model guesses and compounds errors. Verification is the beat that makes this an agent rather than a script. Here it is wired up with the Claude Agent SDK:\n\n```\n# pip install claude-agent-sdk\n# TS equivalent: npm i @anthropic-ai/claude-agent-sdk\nimport anyio\nfrom claude_agent_sdk import query, ClaudeAgentOptions\n\nasync def main() -> None:\n    options = ClaudeAgentOptions(\n        # Make \"verify\" deterministic: a rule that either passes or fails.\n        allowed_tools=[\"Read\", \"Edit\", \"Bash\", \"Grep\"],\n        system_prompt=(\n            \"Fix the failing test in tests/. After every edit, run \"\n            \"'pytest -q' and only stop when it passes. Do not edit or delete \"\n            \"tests to make them pass.\"\n        ),\n    )\n    async for message in query(\n        prompt=\"The auth test is red after the password-reset change. 
Make it green.\",\n        options=options,\n    ):\n        print(message)  # gather -> act -> (pytest = verify) -> repeat until green\n\nanyio.run(main)\n```\n\n| Method | How it verifies | When to use it | Cost / caveat |\n|---|---|---|---|\n| Rules-based (linters, types, tests) | A defined rule passes or fails; the agent is told which rule failed and why | Anything expressible as a deterministic check: \"the best form of feedback\" | Cheap and fast; needs the rule to exist |\n| Visual feedback | Screenshots / renders the model inspects | Layout, styling, responsiveness: things a test cannot assert | Needs a render step and a vision-capable model |\n| LLM-as-judge | A separate model scores against fuzzy criteria | Only when no rule or render can capture the criterion | Heavy latency tradeoffs for marginal gains; last resort |\n\n| Pattern | Shape | When it wins | Example |\n|---|---|---|---|\n| Prompt chaining | Sequence of steps; each LLM call processes the previous output; optional programmatic gates between steps | Task can be \"easily and cleanly decomposed into fixed subtasks\" | Outline → gate-check outline meets brief → write doc |\n| Routing | A classifier (LLM or classical) sorts input, then sends it to a specialized handler | \"Distinct categories that are better handled separately, and where classification can be handled accurately\" | Support desk: general / refund / tech → different flows; easy → cheap model, hard → frontier model |\n| Parallelization: sectioning | \"Breaking a task into independent subtasks run in parallel\" | Subtasks parallelizable for speed | One model answers while another screens for inappropriate content |\n| Parallelization: voting | \"Running the same task multiple times to get diverse outputs\" | Multiple attempts needed for higher-confidence results | Several prompts review code for vulns; vote with a threshold |\n\nParallelization and orchestrator–workers both fan work across multiple LLM calls. 
The distinction is *who draws the subtasks*: with parallelization, your code fixes them in advance; with orchestrator–workers, the model decides them at runtime.\n\nUse orchestrator–workers for \"complex tasks where you can't predict the subtasks needed\"; Anthropic's example is \"coding products that make complex changes to multiple files each time.\" If subtasks are fixed, hardcode and parallelize; if they vary per input, let the orchestrator decide.\n\n**Mind the token cost.** In Anthropic's research system, a Claude Opus 4 lead with Claude Sonnet 4 subagents outperformed single-agent Opus 4 by 90.2% on their internal research eval, but \"agents typically use about 4× more tokens than chat interactions, and multi-agent systems use about 15× more tokens\" than chats. Multi-agent only pays off for valuable, parallelizable, breadth-first work exceeding one context window. The scaling rule from the lead prompt: simple fact-finding = 1 agent with 3–10 tool calls; direct comparisons = 2–4 subagents with 10–15 calls each; complex research = more than 10 subagents.\n\n**Evaluator–optimizer:** \"One LLM call generates a response while another provides evaluation and feedback in a loop.\" The broader literature calls this **reflection**: the same shape under a different name.\n\nIt is \"particularly effective when we have clear evaluation criteria, and when iterative refinement provides measurable value.\" Two signals it fits: a human articulating feedback demonstrably improves the output, *and* the LLM can produce that critique itself. With fuzzy criteria you get an expensive loop that polishes nothing, so prefer deterministic verification first.\n\n**Plan-and-execute:** a planner generates a full multi-step plan up front; executor(s) carry out each step (often smaller, cheaper models); a replanning step decides whether to finish or generate a follow-up plan. Three wins: speed (intermediate steps skip the big model), cost (the large model is only called for re-planning steps), and quality (the planner must explicitly think through all the steps). 
Footgun: no replanning means a wrong initial plan executes faithfully to a wrong answer.\n\n**ReAct:** the LLM only plans for one sub-problem at a time: think → act → observe, one tool call per turn, adapting continuously. It wins on simple, dynamic tasks solvable in a few tool calls where each next step depends on the last observation.\n\nFor long-running work, Anthropic's harness operationalizes plan/execute/review as a planner / generator / evaluator structure with durable artifacts that survive a context reset: an initializer agent writes an `init.sh` script, a `claude-progress.txt` file, and an initial git commit; a coding agent makes incremental progress against a feature list (JSON) of 200+ granular, testable features marked passing/failing; and it verifies as an engineer would, marking a feature done only when it actually works. One hard rule: **never let an agent delete its own tests**: \"it is unacceptable to remove or edit tests because this could lead to missing or buggy functionality.\" Treat the test suite as immutable ground truth.\n\nRead top to bottom; stop at the first row that fits.\n\n| If the task… | Use | Because |\n|---|---|---|\n| is solved by one augmented LLM call | No pattern | Simplest solution first; patterns add latency and cost |\n| splits into fixed, clean sequential steps | Prompt chaining | Each easier subtask raises accuracy; gates catch drift |\n| has distinct input categories handled best separately | Routing | Specialized prompts per class; cheap model for easy inputs |\n| splits into fixed independent subtasks, or needs many attempts | Parallelization (section / vote) | Run them at once for speed, or vote for confidence |\n| has subtasks you cannot predict until you see the input | Orchestrator–workers | The model decides the subtasks at runtime |\n| has a clear pass/fail check and improves with iteration | Evaluator–optimizer | A critique loop measurably refines the output |\n| is open-ended with no predictable number of steps | Agent (loop / plan-execute) | You can't hardcode the path; the model needs ground-truth feedback |\n\nPatterns compose, and frameworks can hide them. A real coding agent might *route* a request, hand hard ones to an *orchestrator* that fans work to *workers* each running a *gather → act → verify* loop with *rules-based* checks. Layer only as far as the task demands. Frameworks \"make it easy to get started\" but \"often create extra layers of abstraction that can obscure the underlying prompts and responses, making them harder to debug\", so start with raw API calls and understand the shape before a framework hides it.\n\nThis is an adapted excerpt. The **full interactive version** (including a pattern explorer that animates how control and data flow through each shape, a live fan-out visualizer with a token meter, and a quiz on matching tasks to patterns) lives at the canonical URL: [hussamahmed.com/ai/foundations/agentic-design-patterns](https://hussamahmed.com/ai/foundations/agentic-design-patterns). 
It is part of a larger, fact-checked hub on Claude Code, Codex, Gemini CLI, IDE agents, and the first principles beneath them.", "url": "https://wpnews.pro/news/agentic-design-patterns-the-shapes-every-coding-agent-reuses", "canonical_source": "https://dev.to/thesmartdude/agentic-design-patterns-the-shapes-every-coding-agent-reuses-3m15", "published_at": "2026-06-15 19:01:11+00:00", "updated_at": "2026-06-15 19:33:03.726956+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "ai-tools", "developer-tools", "ai-research"], "entities": ["Anthropic", "Claude Agent SDK", "Building Effective Agents"], "alternates": {"html": "https://wpnews.pro/news/agentic-design-patterns-the-shapes-every-coding-agent-reuses", "markdown": "https://wpnews.pro/news/agentic-design-patterns-the-shapes-every-coding-agent-reuses.md", "text": "https://wpnews.pro/news/agentic-design-patterns-the-shapes-every-coding-agent-reuses.txt", "jsonld": "https://wpnews.pro/news/agentic-design-patterns-the-shapes-every-coding-agent-reuses.jsonld"}}