OTTO: Give your AI agent a real browser without running a browser farm


Source: dev.to · Published: 2026-06-16T10:02Z


A developer built Otto, an open-source tool that lets AI agents control real Chrome tabs from a CLI or LLM over a secure WebSocket relay, eliminating the need for headless browser farms or cloud-browser rentals. Otto splits responsibilities between deterministic code for browser mechanics and the model for high-level decisions, reducing token costs and latency. The tool includes an MCP server for agent integration and supports site-scoped command bundles.


Otto controls real Chrome tabs from your CLI or your LLM over a secure relay, with no headless farm and no cloud-browser rental. Here's how it works and why interaction should be code, not tokens.

Every time I needed an agent (or a test, or a monitor) to touch a real website, I hit the same wall: the browser. Not the automation logic — the infrastructure under it. You end up choosing between two bad options, and I got tired of both, so I built a third. It's open source and it's called Otto.

Option one: run a headless farm. Spin up Docker, manage a pool of Puppeteer/Playwright instances, keep them patched, scale them, and pray your IPs don't get flagged. It works, but it's a standing operational cost, and headless Chrome is not the same as the browser you actually use. Different fingerprint, different behavior, no logged-in session. Plenty of sites quietly serve headless traffic a worse — or broken — experience.

Option two: rent cloud browsers. Pay a per-session or per-minute fee to a hosted-browser service. Convenient until the bill scales with your usage, and you're still not running in your real session with your cookies.

What I actually wanted was dumb and simple: drive the real Chrome tab already sitting on my machine — logged in, warmed up, indistinguishable from me — from a script or an agent running somewhere else. No farm. No rental. Just the tab.

Otto is three pieces connected over authenticated WebSocket:

Controller (otto CLI / script)
        |  WebSocket (authenticated)
        v
  Relay daemon  (:8787)
        |  WebSocket (authenticated, node)
        v
  Extension node (Chrome)
        |  chrome.tabs / chrome.scripting
        v
  Browser tab (managed, site-scoped)

A lightweight Chrome extension turns a live tab into a node. A relay daemon authenticates connections, authorizes commands by scope, and routes them to the right node. A controller (the otto CLI, a script, or an agent) sends command envelopes to the relay.
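To make the relay's job concrete, here is a minimal sketch of the "authorize by scope, then route" step. Everything in it is an assumption for illustration: the `Envelope` fields, the `RelaySketch` class, and the idea that a scope is a set of allowed sites per client are hypothetical and not taken from Otto's actual protocol.

```typescript
// Hypothetical envelope shape; Otto's real wire format may differ.
type Envelope = {
  requestId: string;
  clientId: string; // controller that sent the command
  nodeId: string;   // browser node that should run it
  site: string;     // site the command targets
  command: string;  // e.g. "extract-content"
};

type RouteResult = { ok: boolean; nodeId?: string; reason?: string };

class RelaySketch {
  // clientId -> sites that client is allowed to act on (assumed scope model)
  private scopes = new Map<string, Set<string>>();
  // nodes that currently hold a live connection
  private nodes = new Set<string>();

  grant(clientId: string, site: string): void {
    const sites = this.scopes.get(clientId) ?? new Set<string>();
    sites.add(site);
    this.scopes.set(clientId, sites);
  }

  connectNode(nodeId: string): void {
    this.nodes.add(nodeId);
  }

  // Check the sender, then its scope, then whether the target node is reachable.
  route(env: Envelope): RouteResult {
    const sites = this.scopes.get(env.clientId);
    if (!sites) return { ok: false, reason: "unknown client" };
    if (!sites.has(env.site)) return { ok: false, reason: "site out of scope" };
    if (!this.nodes.has(env.nodeId)) return { ok: false, reason: "node offline" };
    return { ok: true, nodeId: env.nodeId };
  }
}

const relay = new RelaySketch();
relay.grant("my-laptop", "news.ycombinator.com");
relay.connectNode("chrome-1");

const base = { requestId: "req-1", clientId: "my-laptop", nodeId: "chrome-1", command: "extract-content" };
const allowed = relay.route({ ...base, site: "news.ycombinator.com" });
const denied = relay.route({ ...base, site: "reddit.com" });
console.log(allowed, denied);
```

The point of the ordering is that an out-of-scope command is rejected at the relay and never reaches the browser at all.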

The nice consequence: the controller and the browser don't have to live on the same machine. Your agent can run on a server and drive a Chrome tab open on your laptop, because the relay handles routing and auth in between. Execution is serial per tab and parallel across tabs, with replay protection on every command.
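"Serial per tab, parallel across tabs" is easy to say and easy to get subtly wrong, so here is one way it could be done: a promise chain per tab. This is a sketch under my own assumptions, not Otto's scheduler; `TabScheduler`, `command`, and the tab IDs are hypothetical names, and the sleeps stand in for real browser work.

```typescript
// Sketch: tasks sharing a tabId run one after another; different tabs overlap.
class TabScheduler {
  // tabId -> promise that settles when the last queued task on that tab finishes
  private tails = new Map<string, Promise<unknown>>();

  run<T>(tabId: string, task: () => Promise<T>): Promise<T> {
    const prev: Promise<unknown> = this.tails.get(tabId) ?? Promise.resolve();
    // Start once the previous task on this tab settles, whether it succeeded or failed.
    const next = prev.then(task, task);
    // Store a non-rejecting tail so one failure does not poison the queue.
    // (Entries are never evicted here; a real scheduler would clean up idle tabs.)
    this.tails.set(tabId, next.catch(() => undefined));
    return next;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
const order: string[] = [];
const scheduler = new TabScheduler();

// Stand-in for a browser command that takes `ms` milliseconds.
function command(tabId: string, name: string, ms: number): Promise<void> {
  return scheduler.run(tabId, async () => {
    order.push(`start ${tabId}:${name}`);
    await sleep(ms);
    order.push(`end ${tabId}:${name}`);
  });
}

const done = Promise.all([
  command("tab-a", "navigate", 30),
  command("tab-a", "click", 5),   // queued behind navigate on tab-a
  command("tab-b", "extract", 5), // runs alongside tab-a's work
]).then(() => order);

done.then((o) => console.log(o));
```

The click on tab-a waits for the navigate to finish, while tab-b's extract starts immediately and overlaps with it.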

This is the design decision I care about most, so let me be blunt about it.

If you wire an LLM directly to a browser and let it perform every interaction — "find the button," "click at these coordinates," "type this character" — you pay for all of that in tokens and latency. The model becomes a very expensive mouse.

Otto splits the responsibilities. Deterministic code handles the mechanics: opening tabs, navigating, querying the DOM, extracting text/markdown/clean HTML, screenshots, clicking, typing, intercepting network traffic. The model only decides what to do next. It calls a command, gets a structured result, and reasons about strategy — not about pixel coordinates.
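The split can be sketched as a small loop: the model picks the next command, deterministic code runs it, and only the structured result goes back into the model's context. This is an illustration, not Otto's implementation. The handlers below are stubs (a real handler would send the command through the relay), and `decideNext` is a hypothetical stand-in for an LLM call.

```typescript
type Command = { name: string; args: Record<string, string> };
type Result = { ok: boolean; data: string };

// Deterministic side: plain functions, no tokens spent. Stubbed for the sketch.
const handlers: Record<string, (args: Record<string, string>) => Result> = {
  "extract-content": (args) => ({ ok: true, data: `# Markdown for ${args.url}` }),
  "click": (args) => ({ ok: true, data: `clicked ${args.selector}` }),
};

function execute(cmd: Command): Result {
  const handler = handlers[cmd.name];
  if (!handler) return { ok: false, data: `unknown command: ${cmd.name}` };
  return handler(cmd.args);
}

// Model side: sees the goal and the structured history, returns the next
// command, or null when it considers the goal met. Hard-coded here.
function decideNext(goal: string, history: Result[]): Command | null {
  if (history.length === 0) return { name: "extract-content", args: { url: goal } };
  return null;
}

function runAgent(goal: string, maxSteps = 5): Result[] {
  const history: Result[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const cmd = decideNext(goal, history);
    if (cmd === null) break;
    history.push(execute(cmd));
  }
  return history;
}

const results = runAgent("https://news.ycombinator.com");
console.log(results);
```

Note what the model never sees in this loop: coordinates, keystrokes, or raw DOM. It makes one decision per step and gets one structured result back.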

Concretely, the surface looks like primitives plus site-scoped commands:

otto extract-content https://news.ycombinator.com   # → markdown

otto commands list --site reddit.com

Those site command bundles are versioned, shareable, and testable — you author a reusable command for a domain once instead of re-deriving the DOM dance on every run. And because there's an MCP server, an agent (Claude, an OpenAI tool-use loop, whatever speaks MCP) can call these directly as tools.
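To show why treating a bundle as data makes it versionable and testable, here is a hypothetical bundle shape with a lookup function. None of this is Otto's actual bundle format: the `Bundle`, `Step`, and `findCommand` names, the `top-posts` command, the `main` selector, and the rule that a bundle also covers subdomains of its site are all assumptions made for the sketch.

```typescript
// Hypothetical bundle format, for illustration only.
type Step = { action: "navigate" | "click" | "type" | "extract"; target: string; value?: string };
type SiteCommand = { description: string; steps: Step[] };
type Bundle = { site: string; version: string; commands: Record<string, SiteCommand> };

const redditBundle: Bundle = {
  site: "reddit.com",
  version: "1.0.0",
  commands: {
    "top-posts": {
      description: "Extract the front-page post list",
      steps: [
        { action: "navigate", target: "https://www.reddit.com/" },
        { action: "extract", target: "main" }, // assumed selector
      ],
    },
  },
};

// Assumed scoping rule: a bundle covers its site and that site's subdomains.
function appliesTo(bundle: Bundle, url: string): boolean {
  const host = new URL(url).hostname;
  return host === bundle.site || host.endsWith("." + bundle.site);
}

// Find a named command among the bundles that apply to the given URL.
function findCommand(bundles: Bundle[], url: string, name: string): SiteCommand | undefined {
  for (const bundle of bundles) {
    if (appliesTo(bundle, url) && bundle.commands[name]) return bundle.commands[name];
  }
  return undefined;
}

const found = findCommand([redditBundle], "https://www.reddit.com/r/programming", "top-posts");
const missing = findCommand([redditBundle], "https://notreddit.com/", "top-posts");
console.log(found?.description, missing);
```

Because the bundle is plain data with a version field, it can be diffed, reviewed, and unit-tested like any other code artifact.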

Requirements: Node.js 20+, Chrome, npm.

npm install -g @telepat/otto

otto setup


otto client register --name "my-laptop" --description "Local controller"
otto client login

otto commands list

For headless/CI use, otto setup --non-interactive emits deterministic JSON with no TTY prompts, most commands take --json, and otto logs follow --source all streams structured events by requestId, so you can correlate relay, controller, and node in real time while debugging an agent run.
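Correlating by requestId amounts to grouping events from all three sources into one timeline per request. The sketch below assumes each log line is a JSON object with `requestId`, `source`, `ts`, and `message` fields. That event shape is my guess for illustration and not Otto's documented log schema, so check the real output before relying on the field names.

```typescript
// Assumed event shape; the real schema may use different field names.
type LogEvent = { requestId: string; source: string; ts: number; message: string };

// Parse newline-delimited JSON, skipping anything that is not a tagged event.
function parseLines(lines: string[]): LogEvent[] {
  const events: LogEvent[] = [];
  for (const line of lines) {
    try {
      const parsed = JSON.parse(line);
      if (parsed && typeof parsed.requestId === "string") events.push(parsed as LogEvent);
    } catch {
      // Not JSON (banner text, blank line): ignore it.
    }
  }
  return events;
}

// Group events per request and order each group by timestamp.
function groupByRequest(events: LogEvent[]): Map<string, LogEvent[]> {
  const groups = new Map<string, LogEvent[]>();
  for (const event of events) {
    const group = groups.get(event.requestId) ?? [];
    group.push(event);
    groups.set(event.requestId, group);
  }
  groups.forEach((group) => group.sort((a, b) => a.ts - b.ts));
  return groups;
}

// Made-up sample lines, deliberately out of order.
const sampleLines = [
  '{"requestId":"req-1","source":"node","ts":3,"message":"command executed"}',
  'relay listening',
  '{"requestId":"req-2","source":"controller","ts":4,"message":"command sent"}',
  '{"requestId":"req-1","source":"controller","ts":1,"message":"command sent"}',
  '{"requestId":"req-1","source":"relay","ts":2,"message":"routed to node"}',
];

const timelines = groupByRequest(parseLines(sampleLines));
const req1 = (timelines.get("req-1") ?? []).map((e) => e.source).join(" -> ");
console.log(req1);
```

For `req-1` this reconstructs the hop order controller, relay, node, which is the view you want when an agent run stalls somewhere in the middle.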

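Handing automation a real logged-in browser is powerful, so the defaults are conservative: connections are authenticated, commands are authorized by scope, managed tabs are site-scoped, and every command carries replay protection.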

Otto is built for developers and automation teams who need real browser context — integration tests against logged-in flows, uptime/monitoring on pages that gate headless traffic, scraping that has to look human, and agent workflows that read and act on live sites. If your use case is genuinely fine with headless, you may not need this. If headless keeps lying to you, this is the alternative.

It's MIT-licensed and on npm as @telepat/otto. Repo and docs:

It's early. I'd genuinely rather hear where the design is wrong than collect polite stars — so if you run it and something feels off (the relay model, the security assumptions, the command-bundle approach), tell me in the comments. What would you point it at first?


Read original on dev.to → dev.to/sebivaduva/otto-give-your-ai-agent-a-real…
