This post is a TypeScript implementation of the pattern described in "Beyond the Agentic Loop: The Orchestrator Pattern for Multi-Agent Systems" by
Before the pattern, the scene. The demo is a small storefront assistant backed by a
few single-purpose agents:
A customer request might need just one of these, several of them at once, or a few in a
strict order — and deciding which of those shapes a request calls for is exactly what
the orchestrator is for.
while loop

The default way to build a multi-agent system is the agentic loop: you hand the
model a bag of tools and let it drive.
think → call a tool → observe the result → think again → call another tool → …
The LLM is both the brain and the control flow. That's wonderfully flexible, and
it's the right tool when the task is open-ended and you genuinely don't know the
steps in advance. But in production it has three nasty properties:
If you already know which agents exist and what they do, an open-ended reasoning loop
on every request is more freedom than the job needs.
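For contrast, the loop described above can be sketched in a few lines. This is a minimal illustration, not code from the repo: ModelTurn, Model, Tool, agenticLoop and the maxTurns budget are all hypothetical names, and the model is injected as a function so the sketch runs without an API key.

```typescript
// Hypothetical shapes: each turn, the model either asks for a tool or gives a final answer.
type ModelTurn =
  | { kind: "tool"; tool: string; args: Record<string, unknown> }
  | { kind: "final"; answer: string };

type Model = (transcript: readonly string[]) => Promise<ModelTurn>;
type Tool = (args: Record<string, unknown>) => Promise<string>;

// The agentic loop: the model decides, turn by turn, whether to keep going.
async function agenticLoop(
  query: string,
  model: Model,
  tools: Map<string, Tool>,
  maxTurns = 10, // assumed safety budget so the loop always terminates
): Promise<{ answer: string; llmCalls: number }> {
  const transcript = [`user: ${query}`];
  let llmCalls = 0;
  while (llmCalls < maxTurns) {
    const turn = await model(transcript); // think
    llmCalls++;
    if (turn.kind === "final") return { answer: turn.answer, llmCalls };
    const tool = tools.get(turn.tool);
    const observation = tool ? await tool(turn.args) : `unknown tool: ${turn.tool}`; // call a tool
    transcript.push(`tool ${turn.tool}: ${observation}`); // observe, then think again
  }
  return { answer: "stopped: turn budget exhausted", llmCalls };
}
```

The number of model calls is whatever the model makes it, up to the budget. That is the property the rest of the post designs away.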
The orchestrator's move is to separate the decision from the execution. Instead
of letting the model loop, you make exactly two LLM calls with plain,
deterministic code in between:
query ──▶ [ROUTE: LLM #1] ──▶ [EXECUTE: agents, no LLM] ──▶ [SYNTHESIZE: LLM #2] ──▶ answer
Two calls, every time, no matter how many agents run. That fixed shape is the whole
point: a plan you can inspect before anything happens, latency that doesn't depend on
the model's mood, and independent work you can fan out. (It's cheaper too — the article
puts the same query at ~2 calls instead of ~7 — but the cost isn't the headline; the
outcomes are.)
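The fixed shape can be written down as a single function. The sketch below is an assumption about how the three phases might be wired, not the repo's actual entry point: Phases and orchestrate are hypothetical names, and the types are trimmed to the minimum needed to show the flow.

```typescript
// Hypothetical minimal shapes; the repo's real types may differ.
type Mode = "single" | "parallel" | "sequential";
type PlanStep = { tool: string; args: Record<string, unknown> };
type RouteDecision = { mode: Mode; steps: PlanStep[] };
type Results = Record<string, unknown>;

interface Phases {
  route: (query: string) => Promise<RouteDecision>; // LLM call #1
  execute: (decision: RouteDecision) => Promise<Results>; // plain code, no LLM
  synthesize: (query: string, results: Results) => Promise<string>; // LLM call #2
}

// Route, execute, synthesize: always in this order, always exactly once each.
async function orchestrate(
  query: string,
  phases: Phases,
): Promise<{ plan: RouteDecision; answer: string }> {
  const plan = await phases.route(query); // the plan exists before any agent runs
  const results = await phases.execute(plan);
  const answer = await phases.synthesize(query, results);
  return { plan, answer };
}
```

Returning the plan alongside the answer is a design choice of this sketch: it keeps the router's decision available for logging or inspection.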
An agent is a name, a description (for the router), a JSON-Schema for its arguments,
and an execute function. Nothing more.
// src/server/orchestrator/types.ts
export type ExecuteFn = (args: AgentArgs, context: AgentContext) => Promise<AgentResult>;

export interface AgentDefinition {
  agent: string; // human name, e.g. "Catalog Agent"
  description: string; // shown to the router LLM so it can choose this tool
  parameters: Record<string, unknown>; // JSON Schema for the args
  execute: ExecuteFn;
}
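To make the interface concrete, here is a hedged example of what one agent could look like. The post does not show the definitions of AgentArgs, AgentResult or AgentContext, so the shapes below are assumptions, and exampleInventoryAgent and its in-memory STOCK table are invented for illustration rather than taken from the repo.

```typescript
// Assumed shapes: the post uses these names but does not show their definitions.
type AgentArgs = Record<string, unknown>;
type AgentResult = { ok: boolean; data?: unknown; error?: string };
type AgentContext = Record<string, AgentResult>;
type ExecuteFn = (args: AgentArgs, context: AgentContext) => Promise<AgentResult>;

interface AgentDefinition {
  agent: string;
  description: string;
  parameters: Record<string, unknown>;
  execute: ExecuteFn;
}

// Hypothetical in-memory stock table standing in for a real data source.
const STOCK = new Map<string, number>([
  ["laptop-13", 4],
  ["phone-15", 0],
]);

const exampleInventoryAgent: AgentDefinition = {
  agent: "Inventory Agent",
  description: "Check how many units of a product are in stock.",
  parameters: {
    type: "object",
    properties: { productId: { type: "string", description: "Product identifier" } },
    required: ["productId"],
  },
  execute: async (args) => {
    // Args come from a model, so validate them before use.
    const productId = args["productId"];
    if (typeof productId !== "string") return { ok: false, error: "productId must be a string" };
    const units = STOCK.get(productId);
    if (units === undefined) return { ok: false, error: `unknown product: ${productId}` };
    return { ok: true, data: { productId, units, inStock: units > 0 } };
  },
};
```

Returning an error value instead of throwing is a choice made for this sketch; it keeps a single bad argument from rejecting a whole batch of agents.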
The "registry" is a plain in-process object — agents are registered by hand.
There's deliberately no Redis, no database, no HTTP self-registration. That keeps the
whole thing runnable and testable with zero infrastructure.
// src/server/orchestrator/registry.ts
export const REGISTRY: Record<string, AgentDefinition> = {
  catalog_agent__list_categories: catalogCategoriesAgent,
  catalog_agent__search_products: catalogAgent,
  inventory_agent__check_stock: inventoryAgent,
  pricing_agent__get_deals: pricingAgent,
  reviews_agent__get_reviews: reviewsAgent,
  order_agent__place_order: orderAgent,
};
toolDefinitions() projects that map into the OpenAI tool format the router sees:
each agent becomes one function tool, plus one meta-tool we'll meet shortly.
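The repo's toolDefinitions() is not shown in the post, so the following is only a sketch of what such a projection might look like. AgentSummary, FunctionTool and toolDefinitionsSketch are hypothetical names; the object layout (type "function" wrapping name, description and parameters) follows the tools array of the OpenAI Chat Completions API.

```typescript
// Trimmed stand-in for an agent definition (execute omitted: the router never calls it).
interface AgentSummary {
  description: string;
  parameters: Record<string, unknown>;
}

// Shape of one entry in the `tools` array of a chat completion request.
interface FunctionTool {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
}

// Project a registry into router-facing tools; the registry key becomes the tool name.
function toolDefinitionsSketch(
  registry: Record<string, AgentSummary>,
  metaTools: FunctionTool[] = [], // e.g. a plan_execution entry, passed in by the caller
): FunctionTool[] {
  const agentTools = Object.entries(registry).map(
    ([name, def]): FunctionTool => ({
      type: "function",
      function: { name, description: def.description, parameters: def.parameters },
    }),
  );
  return [...agentTools, ...metaTools];
}
```

Because the registry key is what the router echoes back in its tool calls, the same key can be used to look the agent up again at execution time.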
The router is given a blunt system prompt: pick tools, do not answer.
// src/server/orchestrator/router.ts
const SYSTEM_PROMPT = `You are a query router. Your ONLY job is to decide which tool(s) to call.
Rules:
- If the query needs ONE agent, call that one tool.
- If the query needs MULTIPLE INDEPENDENT agents, call all of them.
- If the query needs steps IN ORDER (a later step depends on an earlier one), call plan_execution and provide the ordered steps.
Do NOT answer the user's question — just pick tools.`;
We call the model at temperature: 0 with tool_choice: "auto", then read its tool
calls back out. The shape of that tool-call list is the execution plan: we never
ask the model to "answer," only to choose.
// src/server/orchestrator/router.ts
export async function route(query: string): Promise<RouteDecision> {
  const response = await getOpenAIClient().chat.completions.create({
    model: getConfig().ROUTER_MODEL,
    temperature: 0,
    tools: toolDefinitions(),
    tool_choice: "auto",
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: query },
    ],
  });
  const toolCalls = response.choices[0]?.message.tool_calls ?? [];

  // plan_execution present -> sequential. Take its ordered steps.
  const planCall = toolCalls.find((c) => c.function.name === PLAN_EXECUTION_TOOL);
  if (planCall) {
    const parsed = safeParseArgs(planCall.function.arguments) as {
      steps?: Array<{ tool: string; args?: AgentArgs; reason?: string }>;
    };
    const steps = (parsed.steps ?? []).map((s) => ({ tool: s.tool, args: s.args ?? {}, reason: s.reason }));
    return { mode: "sequential", steps };
  }

  // Otherwise the tool calls themselves are the plan: more than one -> parallel, else single.
  const steps = toolCalls.map((c) => ({ tool: c.function.name, args: safeParseArgs(c.function.arguments) }));
  return { mode: steps.length > 1 ? "parallel" : "single", steps };
}
So the router collapses to three outcomes:
- one tool call → single
- several tool calls → parallel
- plan_execution → sequential
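One way to make those three outcomes checkable without an API call is to pull the classification into a pure function. The sketch below is an assumption, not the repo's code: toDecision and parseArgsSketch are hypothetical names (the latter standing in for safeParseArgs, whose body the post does not show), and ToolCall keeps only the fields the router reads.

```typescript
type Mode = "single" | "parallel" | "sequential";
type PlanStep = { tool: string; args: Record<string, unknown> };
type RouteDecision = { mode: Mode; steps: PlanStep[] };
// Only the fields of a tool call that the router reads.
type ToolCall = { function: { name: string; arguments: string } };

const PLAN_TOOL = "plan_execution";

// Hypothetical stand-in for safeParseArgs: anything but a JSON object becomes {}.
function parseArgsSketch(raw: string): Record<string, unknown> {
  try {
    const value: unknown = JSON.parse(raw);
    return typeof value === "object" && value !== null && !Array.isArray(value)
      ? (value as Record<string, unknown>)
      : {};
  } catch {
    return {};
  }
}

// Classify a tool-call list into one of the three execution modes.
function toDecision(toolCalls: ToolCall[]): RouteDecision {
  const plan = toolCalls.find((c) => c.function.name === PLAN_TOOL);
  if (plan) {
    const rawSteps = parseArgsSketch(plan.function.arguments)["steps"];
    const steps: PlanStep[] = [];
    if (Array.isArray(rawSteps)) {
      for (const s of rawSteps) {
        // Skip entries without a tool name rather than failing the whole plan.
        if (typeof s?.tool === "string") steps.push({ tool: s.tool, args: s.args ?? {} });
      }
    }
    return { mode: "sequential", steps };
  }
  const steps = toolCalls.map((c) => ({
    tool: c.function.name,
    args: parseArgsSketch(c.function.arguments),
  }));
  return { mode: steps.length > 1 ? "parallel" : "single", steps };
}
```

Note one edge this makes visible: an empty tool-call list classifies as single with zero steps, so a caller may want to treat that case explicitly.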
This is where parallel and sequential actually diverge — and it's pure TypeScript,
no model involved.
// src/server/orchestrator/executor.ts
export async function* executeStream(mode: Mode, steps: PlanStep[]): AsyncGenerator<ExecEvent, AgentContext> {
  const results: AgentContext = {};

  if (mode === "parallel") {
    for (const step of steps) yield { kind: "agent_start", tool: step.tool, args: step.args };
    const settled = await Promise.all(
      steps.map(async (step) => [step.tool, await runAgent(step, {})] as const),
    );
    for (const [tool, result] of settled) {
      results[tool] = result;
      yield { kind: "agent_result", tool, result };
    }
    return results;
  }

  // single + sequential: ordered; each step sees prior results as context.
  for (const step of steps) {
    yield { kind: "agent_start", tool: step.tool, args: step.args };
    const result = await runAgent(step, results);
    results[step.tool] = result;
    yield { kind: "agent_result", tool: step.tool, result };
  }
  return results;
}
Read the two branches side by side:
- Parallel is Promise.all. The agents are independent, so they all fire at
  once and you pay for the slowest one, not the sum.
- Sequential is a for loop where each step receives the accumulated results as
  its context. That's how a later agent consumes an earlier one's output.

(The generator yields a small event before and after each agent. That's only so a
transport can show progress; it doesn't change the logic.)
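One detail of the AsyncGenerator<ExecEvent, AgentContext> signature is worth spelling out: a for await loop sees only the yielded events and discards the generator's return value. A caller that wants both has to drive the generator by hand. The helper below is a hypothetical sketch (drain is not a name from the repo) of how that might look.

```typescript
// Drive an async generator manually so the return value is not lost.
// E is the yielded event type, R is the type of the final `return`.
async function drain<E, R>(
  gen: AsyncGenerator<E, R>,
  onEvent: (event: E) => void,
): Promise<R> {
  while (true) {
    const next = await gen.next();
    if (next.done) return next.value; // the generator's `return results`
    onEvent(next.value); // a yielded progress event
  }
}
```

With something like this, a transport could forward each event to the client as it arrives and still hand the final results to the synthesis step.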
plan_execution: a signal, not an agent

How does the router say "do these in order"? With a meta-tool that runs no code:
// src/server/orchestrator/registry.ts
export const PLAN_EXECUTION_TOOL = "plan_execution";
// ...its tool schema asks for { reason, steps: [{ tool, args, reason }] }
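The schema itself is elided in the post, so the following is a guess at what a tool definition with that { reason, steps: [{ tool, args, reason }] } shape could look like. The description text and the choice of required fields are assumptions, and unknownTools is a hypothetical helper showing one check a plan allows before anything runs.

```typescript
// Assumed definition: field names come from the post, everything else is illustrative.
const planExecutionToolSketch = {
  type: "function",
  function: {
    name: "plan_execution",
    description: "Use when steps must run in order because a later step depends on an earlier one.",
    parameters: {
      type: "object",
      properties: {
        reason: { type: "string", description: "Why ordering is needed" },
        steps: {
          type: "array",
          items: {
            type: "object",
            properties: {
              tool: { type: "string", description: "Registry key of the agent to run" },
              args: { type: "object", description: "Arguments for that agent" },
              reason: { type: "string", description: "Why this step is needed" },
            },
            required: ["tool"], // assumption: only the tool name is mandatory
          },
        },
      },
      required: ["steps"], // assumption
    },
  },
} as const;

// Because the plan is data, it can be checked before execution:
// list every step that names a tool the registry does not know.
function unknownTools(steps: { tool: string }[], registered: ReadonlySet<string>): string[] {
  return steps.map((s) => s.tool).filter((tool) => !registered.has(tool));
}
```

A non-empty result from a check like this could be used to reject the plan before any agent has side effects, which matters most when the last step places an order.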
When the router selects plan_execution, the orchestrator switches to sequential
mode. The original article treats it purely as a signal and leaves the ordering and
data-passing unspecified. This repo makes one deliberate addition so the demo
actually works end-to-end: **plan_execution returns the ordered steps**, and the
executor runs them in that order, passing each step's results forward as context.
The order agent then resolves the product to order from that context (see
resolveTargetProduct in src/server/lib/resolve-product.ts). That's the difference
between a pattern diagram and a demo that runs.

Once the agents have produced structured data, a second LLM call turns it into an
answer. This is the only step with any "writing" to do, so it runs at a higher
temperature (0.7, against the router's 0) and streams its tokens out.
// src/server/orchestrator/synthesizer.ts
export async function* synthesizeStream(query: string, results: AgentContext): AsyncGenerator<string> {
  const stream = await getOpenAIClient().chat.completions.create({
    model: getConfig().SYNTH_MODEL,
    temperature: 0.7,
    stream: true,
    messages: [
      { role: "system", content: "Summarize the agent results into a clear, helpful answer." },
      { role: "user", content: `User asked: ${query}\nResults: ${JSON.stringify(results)}` },
    ],
  });

  // Forward only chunks that carry text; role-only and final chunks have no content.
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) yield delta;
  }
}
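On the consuming side, a stream of string deltas is just an AsyncIterable<string>. As a rough sketch (collectTokens is a hypothetical helper, and how the repo's transport actually consumes the stream is not shown in the post), a caller might forward each token as it arrives while also keeping the full text:

```typescript
// Forward each token to an optional sink and return the concatenated answer.
async function collectTokens(
  tokens: AsyncIterable<string>,
  onToken?: (token: string) => void,
): Promise<string> {
  let text = "";
  for await (const token of tokens) {
    onToken?.(token); // e.g. write to an SSE response or a terminal
    text += token;
  }
  return text;
}
```

Keeping the full text as well as streaming it is handy for logging the final answer next to the plan that produced it.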
Putting the three phases together, the payoff is exactly the inverse of the loop's
pain points — and these enablements, not the price tag, are the real reason to reach
for it:
- RouteDecision is a plan you can inspect, produced before any agent runs.
- Independent agents fan out through Promise.all; you didn't have to
  teach the model to be concurrent.
- executeStream is an ordinary async generator you can unit-test with a stub
  registry: no API key, no flakiness.

| Query | Mode | Agents |
|---|---|---|
| what do you have? | single | catalog_agent__list_categories |
| what's the price, rating and availability of the iPhone 15? | parallel | pricing + reviews + inventory (at once) |
| find a laptop under $1000, make sure it's in stock, then order it | sequential | search → check stock → order |
Same agents, same data — the router decides the shape of the run.
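The testability claim is easy to demonstrate. The sketch below is a simplified executor, not the repo's executeStream: it drops the progress events and takes the agent runner as a parameter (an assumption made so a stub can be injected), but it keeps the same parallel and sequential branches.

```typescript
type Mode = "single" | "parallel" | "sequential";
type PlanStep = { tool: string; args: Record<string, unknown> };
type Results = Record<string, unknown>;
type RunAgent = (step: PlanStep, context: Results) => Promise<unknown>;

// Simplified executor with the agent runner injected for testing.
async function executeSketch(mode: Mode, steps: PlanStep[], runAgent: RunAgent): Promise<Results> {
  const results: Results = {};
  if (mode === "parallel") {
    // Independent agents: start them all, each with an empty context.
    const settled = await Promise.all(
      steps.map(async (step) => [step.tool, await runAgent(step, {})] as const),
    );
    for (const [tool, result] of settled) results[tool] = result;
    return results;
  }
  // single + sequential: each step sees what earlier steps produced.
  for (const step of steps) {
    results[step.tool] = await runAgent(step, results);
  }
  return results;
}

// A stub agent that reports which prior results it could see when it was called.
const stubAgent: RunAgent = async (_step, context) => ({ saw: Object.keys(context) });
```

Running two steps, a then b, through stubAgent in sequential mode gives b a context containing a, while in parallel mode both steps see an empty context. No model and no network are involved.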
This isn't "orchestrator good, loop bad." The agentic loop is the right tool when the
task is genuinely exploratory: you don't know the steps ahead of time, the toolset is
open-ended, or the agent needs to re-plan mid-flight based on what it discovers. The
orchestrator trades that adaptability for predictability — and it assumes you can
enumerate your agents up front. Note too that the router here is itself a single LLM
call, so a truly novel multi-hop plan it has never seen is out of scope by design.
The article's framing is the one to keep: loop for exploration, orchestrator for production. If you already know your agents and you need bounded latency, parallel
Please feel free to reach out on Twitter: @roamingcode