Source: dev.to · topic: ai-agents

"My AI Assistant Needed a Control Plane, Not a Bigger Loop"

A developer building a local AI assistant found that adding more tools to a single processing loop created an unmanageable system, as the assistant struggled to distinguish between new tasks, follow-ups, status checks, and cancellations while coordinating with other running processes. The solution was to replace the monolithic loop with a "local control plane" architecture called CliGate, separating the system into distinct planes for user experience, assistant control, runtime execution, and model access. By introducing an observation plane that provides compact state facts instead of raw logs, and a file-based memory system that stores reusable procedures, the assistant can now handle concurrent runs, status queries, and task delegation without growing its context window or becoming less understandable.

4 min read · published Jun 3, 2026

I kept trying to make my AI assistant smarter by adding more tools to the same loop.

That worked for a while. Then the assistant had to do normal user things: continue a Codex task from chat, answer a status question from DingTalk, remember how a desktop workflow succeeded, wait behind another run that was using the mouse, and still route Claude Code traffic through the same localhost server.

At that point the problem was no longer "how many tools can one agent call?"

The problem was architecture.

The first shape was simple:

user message -> assistant loop -> tools -> answer

That is fine for a demo. It is not fine for a resident assistant.

A resident assistant has to know whether a message is a new task, a follow-up, a status check, a correction, or a cancellation. It has to avoid stealing the desktop from another running task. It has to remember procedures without shoving every old transcript into context. It has to delegate coding work to Codex or Claude Code without pretending it is the executor.

Those are different jobs. When I kept them inside one loop, every fix made the loop more capable and less understandable.

So I stopped thinking about the assistant as one agent and started treating it as a local control plane.

In CliGate, the architecture now looks more like this:

Experience Plane
  -> Assistant Control Plane
  -> Runtime Execution Plane
  -> Proxy / Model Access Plane

The Observation Plane and the Memory / Policy Plane sit alongside all four.

The names sound formal, but the boundaries are practical.

The experience plane owns where the user is talking from: dashboard chat, assistant tasks, Telegram, Feishu, DingTalk, scheduled jobs.

The assistant control plane decides what kind of work this is. Should it answer from state? Should it start a task? Should it continue an existing one? Should it wait because the desktop is already held by another run?
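
A minimal sketch of that decision, assuming the message has already been classified and that the control plane can see which run, if any, holds the desktop. The names `Kind`, `Action`, `SystemState`, and `decide` are hypothetical and are not taken from CliGate's code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    NEW_TASK = "new_task"
    FOLLOW_UP = "follow_up"
    CORRECTION = "correction"
    STATUS_CHECK = "status_check"
    CANCELLATION = "cancellation"


class Action(Enum):
    ANSWER_FROM_STATE = "answer_from_state"
    START_TASK = "start_task"
    CONTINUE_TASK = "continue_task"
    CANCEL_TASK = "cancel_task"
    WAIT_FOR_DESKTOP = "wait_for_desktop"


@dataclass
class SystemState:
    # Hypothetical state model; a real one would carry far more detail.
    active_task_id: Optional[str] = None  # task this conversation is attached to
    desktop_holder: Optional[str] = None  # another run currently driving the desktop


def decide(kind: Kind, needs_desktop: bool, state: SystemState) -> Action:
    """Pick an action from the message kind and the current system state."""
    if kind is Kind.STATUS_CHECK:
        return Action.ANSWER_FROM_STATE
    if kind is Kind.CANCELLATION:
        # With nothing to cancel, the best response is a report of the state.
        return Action.CANCEL_TASK if state.active_task_id else Action.ANSWER_FROM_STATE
    if needs_desktop and state.desktop_holder is not None:
        return Action.WAIT_FOR_DESKTOP
    if kind in (Kind.FOLLOW_UP, Kind.CORRECTION) and state.active_task_id:
        return Action.CONTINUE_TASK
    return Action.START_TASK
```

The point of the sketch is that the decision reads only the message kind and a small state object, never the chat history.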

The runtime execution plane is where Codex and Claude Code live. They do the actual coding work. The assistant can dispatch, continue, summarize, and coordinate them, but it does not need to become a worse version of them.

The proxy/model access plane handles the boring but necessary provider work: protocol translation, account pools, API keys, routing, model mapping, request logs, and usage.

The side planes are what keep the assistant sane.

The biggest improvement came from making the assistant consume observations instead of raw logs.

If a Codex run is waiting for approval, the assistant should not read a giant transcript to rediscover that. It should see a compact fact:

"Task X is waiting for approval to run command Y."

If another assistant run is currently driving the desktop, a new run should not guess from chat history. It should see a resource holder:

"desktop is held by run R."

That one change made status questions, cancellation, follow-ups, and concurrent runs much less fragile. The assistant no longer has to infer the system state from the last few messages. The system gives it a state model.
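
One way to picture such a state model is a small object that folds runtime events into status sentences and tracks who holds each shared resource. This is a sketch under assumptions: the event types, the `ObservationPlane` class, and its methods are hypothetical, not CliGate's actual interface.

```python
import threading
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    task_id: str
    type: str         # assumed types: started, output, approval_requested, approval_granted, finished
    detail: str = ""  # e.g. the command awaiting approval


class ObservationPlane:
    """Keeps compact facts; raw output is never stored here."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._task_status: Dict[str, str] = {}
        self._holders: Dict[str, str] = {}  # resource name -> run id

    def record(self, e: Event) -> None:
        with self._lock:
            if e.type in ("started", "approval_granted"):
                self._task_status[e.task_id] = "is running"
            elif e.type == "approval_requested":
                self._task_status[e.task_id] = (
                    f"is waiting for approval to run command {e.detail}"
                )
            elif e.type == "finished":
                self._task_status[e.task_id] = "has finished"
            # Other event types (such as "output") do not change the status.

    def try_acquire(self, resource: str, run_id: str) -> bool:
        """Claim a resource; returns False if another run already holds it."""
        with self._lock:
            holder = self._holders.get(resource)
            if holder is None or holder == run_id:
                self._holders[resource] = run_id
                return True
            return False

    def release(self, resource: str, run_id: str) -> None:
        """Release a resource; a no-op unless run_id is the current holder."""
        with self._lock:
            if self._holders.get(resource) == run_id:
                del self._holders[resource]

    def facts(self) -> List[str]:
        """The short list of sentences the assistant would see."""
        with self._lock:
            tasks = [f"Task {t} {s}." for t, s in sorted(self._task_status.items())]
            held = [f"{r} is held by run {h}." for r, h in sorted(self._holders.items())]
        return tasks + held
```

After recording `started`, `output`, and `approval_requested` events for task X and letting run R acquire `"desktop"`, `facts()` returns two sentences, however long the underlying transcript is.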

I also learned that "remembering" is not the same as stuffing more chat history into a prompt.

For this assistant, memory is file-based and scoped. It can store a workflow, a fact, a standing directive, or a reference. On the next similar request, the prompt only gets a small memory index. If the assistant thinks one entry matters, it explicitly recalls the body.
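
A rough sketch of the index-then-recall idea, assuming one JSON file per entry and one directory per scope. The file layout, the `MemoryStore` class, and the field names are assumptions made for illustration; the article does not describe the on-disk format.

```python
import json
from pathlib import Path
from typing import List

# The four entry kinds named in the text.
KINDS = {"workflow", "fact", "directive", "reference"}


class MemoryStore:
    def __init__(self, scope_dir: Path) -> None:
        # Assumption: a scope is simply a directory.
        self.root = scope_dir
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, name: str, kind: str, summary: str, body: str) -> None:
        if kind not in KINDS:
            raise ValueError(f"unknown memory kind: {kind}")
        entry = {"name": name, "kind": kind, "summary": summary, "body": body}
        (self.root / f"{name}.json").write_text(json.dumps(entry), encoding="utf-8")

    def index(self) -> List[str]:
        """One short line per entry; this is all the prompt gets by default."""
        lines = []
        for path in sorted(self.root.glob("*.json")):
            e = json.loads(path.read_text(encoding="utf-8"))
            lines.append(f"[{e['kind']}] {e['name']}: {e['summary']}")
        return lines

    def recall(self, name: str) -> str:
        """Explicitly load the full body of a single entry."""
        path = self.root / f"{name}.json"
        return json.loads(path.read_text(encoding="utf-8"))["body"]
```

The prompt cost of `index()` grows with the number of entries, not with the length of their bodies, which is what keeps the default context small.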

That keeps the default context small while still letting the assistant learn from earlier work.

For procedure memories, the rule is verify-then-trust. Try the remembered steps, but confirm the UI still matches. If it changed, explore again and update the memory after success.
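
The verify-then-trust rule can be sketched as a small function. Everything here is hypothetical: `Procedure`, `run_with_memory`, and the three callbacks (`ui_matches`, `execute`, `explore`) stand in for whatever the real desktop tooling provides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Procedure:
    name: str
    steps: List[str]
    revision: int = 1


def run_with_memory(
    proc: Procedure,
    ui_matches: Callable[[List[str]], bool],  # does the UI still fit the steps?
    execute: Callable[[List[str]], bool],     # run the steps, report success
    explore: Callable[[], List[str]],         # discover a fresh set of steps
) -> Procedure:
    """Reuse remembered steps only if the UI still matches them."""
    if ui_matches(proc.steps) and execute(proc.steps):
        return proc  # memory was still valid; nothing to update
    new_steps = explore()
    if execute(new_steps):
        # Update the memory only after the new steps have succeeded.
        return Procedure(proc.name, new_steps, proc.revision + 1)
    return proc  # exploration failed; keep the old memory untouched
```

The returned `Procedure` is what would be written back to memory; a failed exploration never overwrites a procedure that used to work.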

That is closer to how I want a practical assistant to evolve: not by growing a huge transcript, but by distilling successful work into reusable units.

Local AI tooling is messy in a specific way.

The user may have Claude Code, Codex CLI, Gemini CLI, OpenClaw, a browser session, a desktop app, a Telegram channel, and several provider accounts. The hard part is not only making one model call. The hard part is keeping all of those pieces coordinated without turning the assistant into an opaque supervisor that hijacks every message.

That is why CliGate still keeps a direct runtime path. If the user is already talking to a Codex session, the message can go straight there. The assistant control plane is for explicit coordination, background tasks, memory, policy, desktop work, and cross-channel workflows.
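
As an illustration of that split, assuming each incoming message carries the runtime session it is bound to (if any) and a flag for whether the user explicitly asked for the assistant. The `Incoming` type, its fields, and the returned route strings are invented for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incoming:
    channel: str                          # e.g. "dashboard", "telegram"
    text: str
    bound_session: Optional[str] = None   # runtime session this chat is attached to
    addressed_to_assistant: bool = False  # user explicitly asked for coordination


def route(msg: Incoming) -> str:
    """Return where the message should go, without reading its content."""
    if msg.bound_session and not msg.addressed_to_assistant:
        # Direct runtime path: the assistant does not intercept the message.
        return f"runtime:{msg.bound_session}"
    return "assistant-control-plane"
```

A message in a chat bound to a Codex session goes straight to that session; the same message sent with no bound session, or explicitly addressed to the assistant, goes to the control plane.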

The split is not glamorous, but it is the difference between an impressive demo and a tool I can leave running.

I used to ask: how do I make the assistant loop smarter?

Now I ask: which plane should own this responsibility?

That question has prevented a lot of accidental complexity. It keeps provider routing out of the assistant loop, execution inside dedicated runtimes, observations out of raw logs, and memory out of unbounded chat history.

The project, CliGate, is open source.

If you are building agents around existing tools, are you putting everything inside one loop, or are you starting to split control, execution, observation, and memory too?
