Where the agent decides, and where the tools actually run

A real Node.js team using LangGraph and TypeScript needed to integrate Microsoft’s Squad agent framework without rewriting their product in C#. That led to a three-layer architecture: a LangGraph orchestrating brain decides the flow, Azure Container Apps run the dangerous and stateful work, and graph state carries evidence between steps.

The agent demos all look beautiful. You ask the friendly chatbot a question, it thinks for a moment, it gives you an answer. Sometimes the answer is even right.

Then someone says the thing that ruins the demo: “Can we get it to actually do things, and trust it to do them?”

The moment “do things” is in scope, the architecture problem changes. Now the agent needs to run code. It needs files. It needs network access. It needs a credential to call the next model. It needs to remember what it did yesterday. The friendly little chat suddenly has a workspace, a shell, a token, and a very real ability to break something expensive.

This post outlines the architecture I want around that agent before I let it loose: a LangGraph (https://www.langchain.com/langgraph) factory that talks to a Squad (https://commandline.microsoft.com/squad-github-copilot-agent-teams-architecture-durable-memory/) coordinator for judgment, then dispatches the dangerous parts to two different Azure Container Apps (https://azure.microsoft.com/en-us/products/container-apps/) primitives: one for one-shot work, one for stateful work. The whole thing proved itself end to end last week, which is why I’m writing it down now.

And this isn’t a hypothetical I built to have something to write about. The triggering event was a real team that turned up wanting to use Squad in production and, interestingly, they were a Node.js shop. They had TypeScript. They had LangGraph. They had a package-lock.json that had clearly earned the right to be respected.
What they did not have was the Microsoft Agent Framework (https://commandline.microsoft.com/agent-framework-layered-sdk-loops-workflows-harnesses/), which is inconvenient for my C# heart. They wanted Squad inside their existing application, and the answer couldn’t be, “Please rewrite your product in C# first.”

So the question stopped being whether Squad is nice and became a harder one: Where exactly does a judgment step go inside an app that already has a deterministic state machine, a tool surface, a CI pipeline, and a product manager who would prefer the demo not catch fire?

That’s a different question from the last piece Brady Gaster and I wrote for Command Line (https://commandline.microsoft.com/squad-github-copilot-agent-teams-architecture-durable-memory/), which was about what survives an agent session: make the agents disposable, keep the memory in Git. This post is about where the agents go and where their tools actually run.

The shape I backed into has three layers: a brain that decides, two different pairs of hands that do the work, and a memory that carries the evidence from one step to the next. That is the first-order picture. The second-order detail, and it’s the one that actually matters, is that one of those two pairs of hands can hold a brain of its own.

Figure: the three-layer shape. A LangGraph orchestrating brain decides the flow, an ACA tool plane runs the dangerous and stateful work, and graph state carries the evidence between them. The orchestrating brain stays deterministic and never holds a shell; when a model needs to think with one, that thinking is sealed inside a sandbox.

The three problems agents create the moment they get hands

A chat agent has none of these problems, which is why chat agents are easy while agents that do things are hard.

Problem one is non-determinism. A model is great when you want it to weigh tradeoffs in a design document.
It’s terrible when you want it to decide whether step three of a workflow should happen before or after step four. Workflows are product decisions. The order of operations doesn’t need a creative reinterpretation on every run.

Problem two is dangerous code. The moment the agent can run a shell, it can also run the wrong shell. It can wipe the wrong directory. It can pip-install a package it found on a sketchy index. It can pull a token from an environment variable and quietly post it somewhere that isn’t yours. None of this is malice. It is what happens when a probabilistic process gets a deterministic side effect.

Problem three is state across steps. A useful agent for non-trivial work needs a workspace. It checks out a repo, installs a toolchain, opens files, runs an analysis. The result of step one is the input to step three. If the workspace dies with the call, nothing accumulates. If it survives across calls, you have a different set of problems, but at least the right shape for the work.

Three problems. They don’t all want the same solution. The trick is to give each one its own.

The brain: A deterministic graph with one judgment node

LangGraph is the brain in this design because it is deterministic where I want determinism. It decides what runs, in what order, with what state, and what happens if a node fails. It does not invent steps. It does not improvise the workflow on each run. It is boring in exactly the right way.

Everything below runs on a sample I keep calling the factory, so it’s worth 30 seconds on what it models. The use case is an internal software factory: the shared platform team a large enterprise stands up so its business groups don’t each invent their own stack from scratch. A group shows up with an idea and a rough set of requirements.
The factory reviews them against the organization’s approved technologies and best practices, rewrites the parts that don’t comply, folds in the operational signals the team supplied, and hands back a tech design the group can actually build from. It is small enough to read end to end and real enough to exercise every layer in this post.

The factory sample has seven nodes in a straight line: intake normalization, a standalone reviewer agent, deterministic stack fixes, the Squad design step, the Dynamic Sessions signal-analysis step, the ACA Sandbox workspace step, and a final assembler that produces one markdown design document.

Figure: one run through the seven-node factory graph, from intake to the final design document. Six nodes are plain TypeScript or a single bounded AI call; the one in the middle, squadTechDesign, is where judgment is allowed to drive.

Six of those nodes are uncontroversial: plain TypeScript or a single bounded SDK call. The interesting one is the design step.

Wiring the seven nodes is the boring half. LangGraph’s StateGraph takes a typed annotation and a list of addNode / addEdge calls and gives you back a compiled, synchronous-looking graph the rest of the app can invoke(). The graph below is the whole orchestration layer; there is no other dispatcher anywhere in the codebase.

```js
// src/graph.ts squad-langgraph-aca:wip
return new StateGraph(FactoryStateAnnotation)
  .addNode("intakeNormalize", intakeNormalize)
  .addNode("reviewerAgent", reviewerAgent)
  .addNode("applyApprovedStackFixes", applyApprovedStackFixes)
  .addNode("squadTechDesign", squadTechDesign)
  .addNode("runDynamicSession", runDynamicSessionNode)
  .addNode("runSandboxWorkspace", runSandboxWorkspaceNode)
  .addNode("assembleDesign", assembleDesign)
  .addEdge(START, "intakeNormalize")
  // ... linear edges in the same order ...
  .addEdge("runSandboxWorkspace", "assembleDesign")
  .addEdge("assembleDesign", END)
  .compile();
```

The state every node reads and writes is one typed shape.
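To make the idea of one typed shape concrete, here is a minimal, dependency-free sketch of nodes that return partial updates and a runner that merges them in a fixed order. It deliberately does not use LangGraph: `DemoState`, `DemoNode`, and `runLinear` are hypothetical names invented for illustration, and the shallow overwrite shown here is only an approximation of how a field declared without a reducer should behave, where the latest value wins.

```typescript
// Illustration only: NOT LangGraph's implementation.
// DemoState, DemoNode and runLinear are hypothetical names.

interface DemoState {
  request: string;
  notes: string[];
  design?: string;
}

// A node reads the whole state and returns only the fields it changed.
type DemoNode = (state: Readonly<DemoState>) => Partial<DemoState>;

// Run the nodes in array order, shallow-merging each partial update
// into the running state before the next node sees it.
function runLinear(initial: DemoState, nodes: DemoNode[]): DemoState {
  return nodes.reduce<DemoState>(
    (state, node) => ({ ...state, ...node(state) }),
    initial,
  );
}

// Three toy nodes standing in for intake, review and assembly.
const normalize: DemoNode = (s) => ({ request: s.request.trim() });
const review: DemoNode = (s) => ({
  notes: [...s.notes, `reviewed: ${s.request}`],
});
const assemble: DemoNode = (s) => ({
  design: `# Design\n${s.notes.join("\n")}`,
});

const finalState = runLinear(
  { request: "  build a portal ", notes: [] },
  [normalize, review, assemble],
);

console.log(finalState.design);
```

Because each node sees only the state and returns a delta, the order of the array is the entire workflow in this sketch, and nothing is shared between steps except the state object itself.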
Each node’s return value is a partial update; LangGraph merges it into the running state for the next node to read. No globals. No shared mutable singletons. The whole “memory between steps” story lives in one annotation declaration:

// src/graph.ts squad-langgraph-aca:wip
const FactoryStateAnnotation = Annotation.Root({ request: Annotation