{"slug": "the-loadout-pattern-handing-the-wheel-to-an-autonomous-llm", "title": "The Loadout Pattern: Handing the Wheel to an Autonomous LLM", "summary": "A developer introduced the loadout pattern, which inverts conventional LLM integration by letting an autonomous LLM drive a system on its own initiative. The pattern separates a toolbox (all available tools) from a loadout (the curated subset for a specific mission), enabling the model to decide which tools to use at each step. This approach shifts the system from executing fixed procedures to equipping the LLM as a 'suit' that the brain wears.", "body_md": "Conventional automation **executes** a procedure — code runs a fixed sequence of steps and decides nothing; same input, same path, every time. The loadout pattern keeps the steps but moves the *deciding* to the model. At each step the **brain** — an autonomous LLM — **judges**: what matters, which tool to reach for, whether to act at all. It's handed a **purpose** and the latitude to pursue it, and it *drives* — choosing its own tools as it goes. **Code executes; the brain decides.** Those tools come as a **loadout** — a curated, self-describing set drawn from a shared **toolbox** — and the brain is observed at the **interface** it calls, not by the side effects it leaves behind. The model is the driver; your system is the suit it wears. Everything below is how to build that.\n\nMost LLM integrations bolt a model into your code. This is about the opposite: letting the model drive your system — equipping itself, on its own initiative, with a loadout: the curated, self-describing set of tools it picks for each mission. The system stops being the program that calls an LLM, and becomes the suit the LLM wears.\n\n*Audience: engineers building agentic/automation systems. 
There's code, and there's a bit of philosophy — because the philosophy is what gives the code its shape.*\n\nTwo words, kept distinct (the whole post hinges on this): a toolbox (or catalog) is every tool you own — the whole armory. A loadout is the curated subset a routine equips for one mission — what it actually suits up with. The entire MCP server is a toolbox; a loadout is the handful of tools one routine is handed at wake.\n\nIn a typical LLM integration the model lives **inside** your process. Your code calls it:\n\n```\nanswer = agent.invoke({\"input\": \"What changed in the market overnight?\"})\n```\n\nThis is great for **human-triggered** work: a person asks, the system fetches and answers. The human is the caller; nothing happens until they show up. The LLM is a *component* — a function your program calls and pays per token to use.\n\nThis post is about the other mode: **the LLM doing the work on its own initiative.** A routine wakes on a schedule and gets on with it — digesting overnight news every hour, posting a morning briefing, watching a queue, reconciling a ledger. No one asked; the routine is its own caller. Wake the model on a cron — say, a headless Claude Code session every hour — and it is no longer a component inside your program. It's *outside*, periodically taking the wheel and deciding what to do.\n\nThe line that matters isn't human-vs-cron — and it isn't even steps-or-no-steps. It's **executing versus deciding**: a script runs its steps and decides nothing, while the brain — even when it\n\nThat inversion changes what your system should be.\n\nThree layers, and it matters which is which:\n\nHere's the leverage that falls out of this: **you don't hand-author JARVIS's intelligence.** It comes from the model — and it improves when you swap in a better model, not when you write more code. What you *build* is the *suit* — what the brain can sense, remember, and do. 
So the central question of the whole system becomes: *how do we equip the brain well — give it the right loadout — and let it reach for the right tool at the right moment?*\n\nWhen you first wire a cron-woken routine, you write a prompt (\"skill\") that mixes two very different things: the **mission** (what to judge, the actual work) and the **mechanics** (raw `curl`, database queries, hardcoded IDs). A real before-state:\n\n```\n# news-digest skill (before)\n1. Query Mongo for new headlines since the watermark:\n   docker exec db mongosh app --eval 'db.news.find({publishedAt:{$gt: ...}})...'\n2. Decide which are new stories vs updates vs noise.  ← the actual mission\n3. Post the briefing:\n   curl -X POST http://localhost:9000/notify -d '{\"type\":\"SIGNAL\", ...}'\n   Then create a Notion page: data_source_id \"<your-notion-data-source>\", icon \"📰\", ...\n```\n\nTwo problems compound. First, the mission (step 2 — judgment) is drowned in plumbing. Second, every *other* routine that needs to \"post a notification\" re-describes that same `curl` in its own prompt. Change the notification URL and you edit five skills. The mechanics are copy-pasted prose.\n\nSplit the system along the seam between **interface** and **implementation**.\n\n**1. Tools are named capabilities — a stable name over a swappable implementation.** Most are small, dumb, independent scripts, but the *name* is the only thing the brain depends on; what sits behind it is free to vary. Usually it wraps mechanics (a `curl`, a DB query, a stubbed no-op, a different backend tomorrow). But a tool can just as well **hand off to another agent** — a sub-brain with its own loadout — or **trigger the next task** in a pipeline. To the brain it's all the same: a name it can reach for. So a tool is sometimes an interface over mechanics, and sometimes the *next move* — another agent, or the start of the next step. 
`notify` sends a notification; `read_news` reads. They don't know about each other. Together, all of them are your **toolbox** (the catalog).\n\n``` bash\n#!/usr/bin/env bash\n# notify.sh — send a notification (hides the URL/payload mechanics)\nset -euo pipefail\n[ \"${1:-}\" = \"--describe\" ] && { echo \"notify|action|send a notification\"; exit 0; }\nTYPE=\"$1\"; TITLE=\"$2\"; MSG=\"$3\"\npayload=\"$(jq -n --arg t \"$TYPE\" --arg ti \"$TITLE\" --arg m \"$MSG\" '{type:$t,title:$ti,message:$m}')\"\ncurl -s -X POST \"${NOTIFY_URL:-http://localhost:9000/notify}\" \\\n     -H 'Content-Type: application/json' -d \"$payload\"\n```\n\n**2. Tools describe themselves.** One line, `--describe`, is the single source of truth for what the tool is. Not the skill, not a wiki — the tool.\n\n**3. A loadout assembler hands the brain its kit.** Given a list of tool\n\n``` bash\n#!/usr/bin/env bash\n# loadout.sh <tool> [tool...] — print the self-descriptions of the named tools (the loadout)\nset -euo pipefail\nDIR=\"$(cd \"$(dirname \"${BASH_SOURCE[0]}\")\" && pwd)\"\necho \"🧰 loadout for this mission\"\nfor t in \"$@\"; do\n  IFS='|' read -r name kind desc <<<\"$(bash \"$DIR/$t.sh\" --describe)\"\n  echo \"  - $name ($kind): $desc\"\ndone\n```\n\n**4. The skill becomes mission only.** It states what to do and names its loadout. The\n\n```\n# news-digest skill (after)\n## Loadout — download at start\nbash tools/loadout.sh read_news write_story notify publish_notion\n\n## Mission\nTurn new headlines into a running ledger of stories: skip repeats, extend ongoing\nstories, open new ones, ignore noise. Each morning, post a briefing from the ledger.\n```\n\n**5. The brain thinks for itself — tools don't auto-chain.** Keep `notify` and `publish_notion` separate; do *not* make \"writing to Notion\" secretly also send a notification. 
The moment you fuse two tools in the plumbing you've frozen a policy — you can no longer publish quietly, or notify without publishing. Leave the tools independent and let the *brain* reason about whether to call one, the other, or both. The thinking is the brain's job; the wiring must not pre-decide it.\n\n**From the model's point of view, this is the whole win.** When the routine wakes, it receives two cleanly separated things: a **mission** — what to accomplish and how to judge it — and a **loadout** — the named capabilities it is allowed to use. It never has to excavate the *how* (a URL, a query, an ID) out of the *what*; the mechanics are simply not in its field of view, leaving only the decision and the set of moves available to make it. The skill carries judgment (which changes often); the toolbox carries capability (stable, shared); a loadout is just the names a routine picks from it. A new routine lists tool names and gets their descriptions for free — change a URL and you edit one tool, not five prompts.\n\nSide effects are not proof. A notification arriving does not establish that the model invoked the tool, and a tool whose implementation is a no-op stub produces no side effect at all even when the model used it correctly. Verifying behavior therefore means observing the **interface** — the moment a tool is called — separately from what its implementation did.\n\nEach tool logs at that boundary:\n\n```\n# _log.sh (sourced by every tool)\ntlog() {  # tlog <event> [detail]   event: INVOKED | OK | DRY | ERR\n  printf '%s | %-12s | %-7s | %s\\n' \\\n    \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\" \"$(basename \"$0\" .sh)\" \"$1\" \"${*:2}\" \\\n    >> \"$LOG_FILE\"\n}\n```\n\nThe log separates two questions that side effects conflate:\n\n```\n... | notify | INVOKED | SIGNAL | Morning briefing   # the interface was called\n... | notify | OK      | SIGNAL HTTP 200             # the implementation sent it\n... 
| notify | DRY     | SIGNAL                      # called, but did not send (DRY_RUN)\n```\n\n`INVOKED` records that the model used the tool, independent of any outcome; `OK`/`DRY`/`ERR` record what the implementation did. Because the model depends on the interface rather than the implementation, the same routine can run in a **shadow mode** — where `notify` only logs and never sends — with no change in the model's behavior. The boundary log is also the reliable way to audit a past run: it records what executed, not merely what the skill instructed.\n\nThis is a pattern, not a framework — and that's its honest limit: **nothing enforces it at runtime.** There's no base class, no inversion of control, nothing that\n\nGo back to the suit. You upgrade the brain by adopting a better model — that's not code you write, it's a model you swap in. Your day-to-day engineering goes into the equipment: what the brain can discover, reach for, and be observed using. And because the brain depends on interfaces — a loadout of named tools — the suit is model-agnostic: change the model and the same loadout still fits. A self-describing, observable loadout is precisely how the brain *takes the wheel*: it wakes, downloads the tools it's allowed, sees what it can do, and acts — and you can watch it do so at the interface, not by guessing from side effects. The system stops being a program that occasionally calls a model, and becomes a suit a capable model wears.\n\n`--describe` line and a boundary log (`INVOKED` + `OK/DRY/ERR`). 
Side-effecting tools support a `DRY_RUN`.\n\nRunnable examples are in [examples/](https://github.com/bighaeil/agent-loadout-pattern/tree/main/examples).", "url": "https://wpnews.pro/news/the-loadout-pattern-handing-the-wheel-to-an-autonomous-llm", "canonical_source": "https://dev.to/bighaeil/the-loadout-pattern-handing-the-wheel-to-an-autonomous-llm-29lj", "published_at": "2026-06-29 07:32:54+00:00", "updated_at": "2026-06-29 07:57:24.380160+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "ai-infrastructure", "developer-tools"], "entities": ["Claude Code", "MCP", "MongoDB", "Notion"], "alternates": {"html": "https://wpnews.pro/news/the-loadout-pattern-handing-the-wheel-to-an-autonomous-llm", "markdown": "https://wpnews.pro/news/the-loadout-pattern-handing-the-wheel-to-an-autonomous-llm.md", "text": "https://wpnews.pro/news/the-loadout-pattern-handing-the-wheel-to-an-autonomous-llm.txt", "jsonld": "https://wpnews.pro/news/the-loadout-pattern-handing-the-wheel-to-an-autonomous-llm.jsonld"}}