{"slug": "building-an-outbound-only-websocket-bridge-for-local-ai-agents", "title": "Building an outbound-only WebSocket bridge for local AI agents", "summary": "The article describes the development of CTRL NODE, a browser-based control plane that enables remote management of local AI agents through an outbound-only WebSocket bridge. The Bridge is a lightweight Node.js daemon that connects outward to the cloud, allowing commands to be pushed down the connection without the local machine ever accepting an inbound connection or exposing a public port. Key technical features include heartbeat messages every 20 seconds to prevent cloud load balancers from killing idle connections, exponential backoff for reconnection, and an in-memory queue to prevent data loss during reconnection events.", "body_md": "I work with AI agents every day. Claude Code, Copilot, Gemini CLI — running locally, with access to my filesystem, my repos, my tools. The results are genuinely good. But there's a wall: **the moment you leave your desk, you lose control**. There's no real way to kick off an agent task from your phone, monitor a long-running pipeline from a coffee shop, or schedule something to run overnight.\n\nEvery solution I found had the same trade-off: you either open a port, install a tunnel daemon, or upload your code to someone's cloud. None of those felt right for infrastructure that has access to your local filesystem.\n\nSo I built [CTRL NODE](https://ctrlnode.ai) — a browser-based control plane for local AI agents. The key piece is a process called the **Bridge**: a lightweight Node.js daemon that runs on your machine and connects to the cloud without ever accepting an inbound connection.\n\nThis article is about how that works, why the design choices matter, and what the actual code looks like.\n\n## Why outbound-only?\n\nThe naive approach is to expose your local agent runtime on a port and let the cloud reach in. 
Tools like ngrok do exactly this: they create a reverse proxy to your localhost. It works, but it has real costs:\n\n- **Open port = attack surface.** Every ngrok tunnel is a publicly reachable endpoint. If auth breaks, someone else can talk to your agent.\n- **Third-party traffic relay.** Your prompts, file paths, and agent responses travel through ngrok's infrastructure.\n- **Daemon complexity.** You're running persistent infrastructure that you didn't write and can't audit easily.\n\nThe alternative: flip the connection direction. The Bridge connects *out* to the cloud. The cloud pushes commands *down* that connection. The local machine never listens on a public port.\n\n```\nYour machine                     ctrlnode.ai cloud\n──────────────────────────────────────────────────\nBridge ──── ws:// connect() ────▶ WebSocket server\n       ◀─── {action: \"run_task\", ...} ────────────\n       ─────  stdout/stderr events ──────────────▶\n```\n\nThis is the same pattern used by IoT devices, CI agents (like the GitHub Actions runner), and remote desktop clients. 
The cloud never initiates the connection; it waits for the Bridge to dial in.\n\n## The connection lifecycle\n\nHere's the core of `websocket.ts`:\n\n``` js\nexport function connect(): void {\n  const url = buildWsUrl();\n  ws = new WebSocket(url, { headers: buildAuthHeaders() });\n\n  ws.on(\"open\", () => {\n    logger.info(\"Bridge connected to SAAS\");\n    flushPendingQueue();\n    startHeartbeat();\n  });\n\n  ws.on(\"message\", (data: WebSocket.RawData) => {\n    const message = JSON.parse(data.toString()) as InboundMessage;\n    handleInboundMessage(message);\n  });\n\n  ws.on(\"close\", (code: number, reason: Buffer) => {\n    stopHeartbeat();\n    if (isAuthError(code, reason.toString())) {\n      logger.warn(`Auth error (${code}), retrying in ${AUTH_RETRY_MS / 1000}s`);\n      setTimeout(connect, AUTH_RETRY_MS);\n    } else {\n      scheduleReconnect();\n    }\n  });\n\n  ws.on(\"error\", (err: Error) => {\n    logger.error(`WebSocket error: ${err.message}`);\n  });\n}\n```\n\nThree things to notice:\n\n- **Auth errors get a longer timeout.** If the server closes with 1008 (Policy Violation) or 1002 (Protocol Error), or the reason string contains `\"401\"`, `\"403\"`, or `\"Unauthorized\"`, we wait 30 seconds before retrying. Hammering an auth-rejected endpoint is pointless and noisy.\n- **All other closes trigger exponential backoff.** `scheduleReconnect()` uses a standard backoff so a transient network blip doesn't flood logs.\n- **On open, we flush the queue.** More on this below.\n\n## Keeping the connection alive through load balancers\n\nMany cloud load balancers close idle WebSocket connections after roughly 30–60 seconds. 
The fix is a heartbeat:\n\n``` js\nconst HEARTBEAT_INTERVAL_MS = 20_000;\nlet heartbeatTimer: NodeJS.Timeout | null = null;\n\nfunction startHeartbeat(): void {\n  heartbeatTimer = setInterval(() => {\n    sendToSaas({ type: \"heartbeat\", timestamp: Date.now() });\n  }, HEARTBEAT_INTERVAL_MS);\n}\n\nfunction stopHeartbeat(): void {\n  if (heartbeatTimer) {\n    clearInterval(heartbeatTimer);\n    heartbeatTimer = null;\n  }\n}\n```\n\nEvery 20 seconds, a small message goes up. The server may or may not acknowledge it; either way, the goal is just to keep the TCP connection active. This is cheap and it works reliably with AWS ALB, Cloudflare, and most managed WebSocket proxies.\n\n## Buffering outbound messages during disconnection\n\nWhen the Bridge is reconnecting, agent output still arrives. If we drop those events, the user watching a pipeline in their browser sees a gap in the live log. The solution is a small in-memory queue:\n\n``` js\nconst PENDING_QUEUE_MAX = 100;\nconst pendingQueue: OutboundMessage[] = [];\n\nexport function sendToSaas(message: OutboundMessage): void {\n  if (!ws || ws.readyState !== WebSocket.OPEN) {\n    if (pendingQueue.length < PENDING_QUEUE_MAX) {\n      pendingQueue.push(message);\n    }\n    return;\n  }\n  ws.send(JSON.stringify(message));\n}\n\nfunction flushPendingQueue(): void {\n  while (pendingQueue.length > 0) {\n    const msg = pendingQueue.shift()!;\n    ws!.send(JSON.stringify(msg));\n  }\n}\n```\n\nCap at 100 messages, flush on reconnect. Simple, and it handles the common case of a 2–3 second reconnect window without losing events. Once the cap is reached, further messages are dropped, so a long outage can still leave a gap in the log.\n\n## Multi-agent routing via the filesystem\n\nHere's the part that took the most thought: how do you run multiple agents (Claude, Copilot, Gemini) on the same machine, routing tasks to the right one?\n\nThe answer isn't a routing layer in the WebSocket code. 
It's the **filesystem**.\n\nEach pipeline task gets an isolated directory:\n\n```\nworkspace/\n  tasks/\n    task-abc123/\n      input/\n        TASK.md          ← instructions for the agent\n        context-files/   ← any files the user attached\n      output/\n        TASK.md          ← agent writes progress here\n        artifacts/       ← anything the agent produces\n```\n\nThe Bridge watches these directories. When a `run_task` command arrives, it dispatches to the matching provider:\n\n```\ncase \"run_task\": {\n  const { taskId, agentProvider, workspacePath } = message.payload;\n  const provider = getProvider(agentProvider); // Claude | Copilot | Gemini | ...\n  await provider.executeTask(taskId, workspacePath);\n  break;\n}\n```\n\nEach provider implementation knows how to invoke its agent CLI with the right arguments and working directory. Claude Code gets `claude --print` with the task directory. Copilot gets its own invocation. They never share context: each runs in its own subprocess, reading from and writing to its own task folder.\n\nThis means:\n\n- **No prompt pollution.** Agent A's context doesn't leak into Agent B.\n- **Parallel execution.** Two agents can run simultaneously without coordination overhead.\n- **Auditability.** Every task leaves a paper trail on disk.\n- **Portability.** The cloud control plane never sees your file contents. It only sees task metadata and status events.\n\n## Provider selection and gating\n\nSome actions only make sense for certain providers. The message handler maintains an explicit set:\n\n``` js\nconst OPENCLAW_ONLY_ACTIONS = new Set([\n  \"openclaw_configure\",\n  \"openclaw_stream_chunk\",\n  \"openclaw_reset_context\",\n]);\n\nfunction handleInboundMessage(message: InboundMessage): void {\n  if (OPENCLAW_ONLY_ACTIONS.has(message.action) && activeProvider !== \"openclaw\") {\n    logger.warn(`Received ${message.action} but provider is ${activeProvider} — ignoring`);\n    return;\n  }\n  // ... 
dispatch to handler\n}\n```\n\nThis stops the Bridge from acting on a command meant for a different agent when the cloud side is misconfigured. The Bridge is the last line of defense before your filesystem.\n\n## The startup sequence\n\n`index.ts` ties it together:\n\n``` js\nasync function main(): Promise<void> {\n  const providers = await createProviders(config);\n  const multi = new MultiProvider(providers);\n\n  connect(); // start WebSocket, non-blocking\n\n  const keepaliveInterval = setInterval(() => {}, 1 << 30);\n  keepaliveInterval.unref(); // don't prevent process exit\n\n  process.on(\"SIGINT\", gracefulShutdown);\n  process.on(\"SIGTERM\", gracefulShutdown);\n\n  await multi.runSyncAgents(); // provider-specific background sync\n}\n```\n\nThe `keepaliveInterval` is worth a note. `connect()` returns immediately, and a long no-op interval is a common way to stop Node's event loop from draining while nothing else is pending. One caveat: a timer only holds the process open while it is ref'd. Because this snippet calls `unref()`, the interval does not keep the process alive on its own, so the process is actually held open by other active handles, such as the WebSocket itself and any pending reconnect timer. The `SIGINT`/`SIGTERM` handlers fire either way, so a clean shutdown is never blocked.\n\n## What this enables\n\nWith the Bridge running, the CTRL NODE web app can:\n\n- **Launch tasks** against any connected agent from any browser, anywhere\n- **Watch live output** streamed back over the same WebSocket\n- **Schedule routines**: the cloud scheduler wakes the Bridge at the configured time, no cron job needed on the local machine\n- **Run multi-step pipelines** where each node can use a different agent\n\nNone of your code leaves your machine. The cloud only sees: \"task started\", \"task output line\", \"task completed\". The actual file contents, prompts, and agent context stay local.\n\n## Why open source?\n\nThe Bridge is MIT licensed ([github.com/ctrlnode-ai/ctrlnode](https://github.com/ctrlnode-ai/ctrlnode)). You can read every line of the WebSocket handler, every message type, every auth check. 
If you don't trust the binary, build it yourself.\n\nThe rest of CTRL NODE (the cloud scheduler, the web app, the real-time pipeline view) runs as a hosted service. The Bridge is the trust boundary: it's the piece that runs with access to your local system, and it needs to be auditable.\n\n## Try it\n\nIf you work with AI agents and want a way to control them remotely without sacrificing privacy:\n\n- Install the Bridge: `npm install -g @ctrlnode/bridge && ctrlnode bridge start`\n- Sign up at [ctrlnode.ai](https://ctrlnode.ai) (it's free)\n- Open the web app from anywhere and connect\n\nQuestions, issues, or PRs: [github.com/ctrlnode-ai/ctrlnode](https://github.com/ctrlnode-ai/ctrlnode) or reply here.\n\n*Javier Vil — Creator of CTRL NODE*", "url": "https://wpnews.pro/news/building-an-outbound-only-websocket-bridge-for-local-ai-agents", "canonical_source": "https://dev.to/ctrlnodeai/building-an-outbound-only-websocket-bridge-for-local-ai-agents-2o6g", "published_at": "2026-05-23 00:25:34+00:00", "updated_at": "2026-05-23 00:31:26.599413+00:00", "lang": "en", "topics": ["developer-tools", "artificial-intelligence", "cloud-computing", "cybersecurity", "open-source"], "entities": ["Claude Code", "Copilot", "Gemini CLI", "CTRL NODE", "Bridge", "Node.js", "ngrok"], "alternates": {"html": "https://wpnews.pro/news/building-an-outbound-only-websocket-bridge-for-local-ai-agents", "markdown": "https://wpnews.pro/news/building-an-outbound-only-websocket-bridge-for-local-ai-agents.md", "text": "https://wpnews.pro/news/building-an-outbound-only-websocket-bridge-for-local-ai-agents.txt", "jsonld": "https://wpnews.pro/news/building-an-outbound-only-websocket-bridge-for-local-ai-agents.jsonld"}}