{"slug": "my-ai-assistant-needed-a-control-plane-not-a-bigger-loop", "title": "\"My AI Assistant Needed a Control Plane, Not a Bigger Loop\"", "summary": "A developer building a local AI assistant found that adding more tools to a single processing loop created an unmanageable system, as the assistant struggled to distinguish between new tasks, follow-ups, status checks, and cancellations while coordinating with other running processes. The solution was to replace the monolithic loop with a \"local control plane\" architecture called CliGate, separating the system into distinct planes for user experience, assistant control, runtime execution, and model access. By introducing an observation plane that provides compact state facts instead of raw logs, and a file-based memory system that stores reusable procedures, the assistant can now handle concurrent runs, status queries, and task delegation without growing its context window or becoming less understandable.", "body_md": "I kept trying to make my AI assistant smarter by adding more tools to the same loop.\n\nThat worked for a while. Then the assistant had to do normal user things: continue a Codex task from chat, answer a status question from DingTalk, remember how a desktop workflow succeeded, wait behind another run that was using the mouse, and still route Claude Code traffic through the same localhost server.\n\nAt that point the problem was no longer \"how many tools can one agent call?\"\n\nThe problem was architecture.\n\nThe first shape was simple:\n\n``` text\nuser message -> assistant loop -> tools -> answer\n```\n\nThat is fine for a demo. It is not fine for a resident assistant.\n\nA resident assistant has to know whether a message is a new task, a follow-up, a status check, a correction, or a cancellation. It has to avoid stealing the desktop from another running task. It has to remember procedures without shoving every old transcript into context. 
It has to delegate coding work to Codex or Claude Code without pretending it is the executor.\n\nThose are different jobs. When I kept them inside one loop, every fix made the loop more capable and less understandable.\n\nSo I stopped thinking about the assistant as one agent and started treating it as a local control plane.\n\nIn CliGate, the architecture now looks more like this:\n\n``` text\nExperience Plane\n  -> Assistant Control Plane\n  -> Runtime Execution Plane\n  -> Proxy / Model Access Plane\n\nObservation Plane + Memory / Policy Plane sit across the side.\n```\n\nThe names sound formal, but the boundaries are practical.\n\nThe **experience plane** owns where the user is talking from: dashboard chat, assistant tasks, Telegram, Feishu, DingTalk, scheduled jobs.\n\nThe **assistant control plane** decides what kind of work this is. Should it answer from state? Should it start a task? Should it continue an existing one? Should it wait because the desktop is already held by another run?\n\nThe **runtime execution plane** is where Codex and Claude Code live. They do the actual coding work. The assistant can dispatch, continue, summarize, and coordinate them, but it does not need to become a worse version of them.\n\nThe **proxy/model access plane** handles the boring but necessary provider work: protocol translation, account pools, API keys, routing, model mapping, request logs, and usage.\n\nThe side planes, observation and memory/policy, are what keep the assistant sane.\n\nThe biggest improvement came from making the assistant consume observations instead of raw logs.\n\nIf a Codex run is waiting for approval, the assistant should not read a giant transcript to rediscover that. It should see a compact fact:\n\n\"Task X is waiting for approval to run command Y.\"\n\nIf another assistant run is currently driving the desktop, a new run should not guess from chat history. 
It should see a resource holder:\n\n\"desktop is held by run R.\"\n\nThat one change made status questions, cancellation, follow-ups, and concurrent runs much less fragile. The assistant no longer has to infer the system state from the last few messages. The system gives it a state model.\n\nI also learned that \"remembering\" is not the same as stuffing more chat history into a prompt.\n\nFor this assistant, memory is file-based and scoped. It can store a workflow, a fact, a standing directive, or a reference. On the next similar request, the prompt only gets a small memory index. If the assistant thinks one entry matters, it explicitly recalls the body.\n\nThat keeps the default context small while still letting the assistant learn from earlier work.\n\nFor procedure memories, the rule is verify-then-trust. Try the remembered steps, but confirm the UI still matches. If it changed, explore again and update the memory after success.\n\nThat is closer to how I want a practical assistant to evolve: not by growing a huge transcript, but by distilling successful work into reusable units.\n\nLocal AI tooling is messy in a specific way.\n\nThe user may have Claude Code, Codex CLI, Gemini CLI, OpenClaw, a browser session, a desktop app, a Telegram channel, and several provider accounts. The hard part is not only making one model call. The hard part is keeping all of those pieces coordinated without turning the assistant into an opaque supervisor that hijacks every message.\n\nThat is why CliGate still keeps a direct runtime path. If the user is already talking to a Codex session, the message can go straight there. 
The assistant control plane is for explicit coordination, background tasks, memory, policy, desktop work, and cross-channel workflows.\n\nThe split is not glamorous, but it is the difference between an impressive demo and a tool I can leave running.\n\nI used to ask: how do I make the assistant loop smarter?\n\nNow I ask: which plane should own this responsibility?\n\nThat question has prevented a lot of accidental complexity. It keeps provider routing out of the assistant loop, execution inside dedicated runtimes, observations out of raw logs, and memory out of unbounded chat history.\n\nThe project is open source here: [CliGate](https://github.com/codeking-ai/cligate).\n\nIf you are building agents around existing tools, are you putting everything inside one loop, or are you starting to split control, execution, observation, and memory too?", "url": "https://wpnews.pro/news/my-ai-assistant-needed-a-control-plane-not-a-bigger-loop", "canonical_source": "https://dev.to/codekingai/my-ai-assistant-needed-a-control-plane-not-a-bigger-loop-15aa", "published_at": "2026-06-03 08:34:33+00:00", "updated_at": "2026-06-03 08:42:44.946307+00:00", "lang": "en", "topics": ["ai-agents", "ai-infrastructure", "ai-tools", "ai-products", "artificial-intelligence"], "entities": ["Codex", "DingTalk", "Claude Code", "CliGate"], "alternates": {"html": "https://wpnews.pro/news/my-ai-assistant-needed-a-control-plane-not-a-bigger-loop", "markdown": "https://wpnews.pro/news/my-ai-assistant-needed-a-control-plane-not-a-bigger-loop.md", "text": "https://wpnews.pro/news/my-ai-assistant-needed-a-control-plane-not-a-bigger-loop.txt", "jsonld": "https://wpnews.pro/news/my-ai-assistant-needed-a-control-plane-not-a-bigger-loop.jsonld"}}