{"slug": "show-hn-browse-the-web-from-the-console-using-a-textual-agent-interface", "title": "Show HN: Browse the web, from the console, using a Textual Agent Interface", "summary": "WebCLI, a new command-line interface for browsing the web, lets AI agents inspect pages, act on structured JSON output, and pause for human help when blocked. The tool treats the browser as a Unix command, enabling agents to drive real logged-in sessions on cloud consoles and portals without fragile selectors or screenshots.", "body_md": "You don't have to drive every web task yourself anymore. Tell your agent what you need done.\n\nWebCLI is for contact with reality: when your agent must inspect an unknown page, decide what to do, act, recover from blockers, and pause for a human when the web needs one.\n\nWhat if the browser was just another Unix command?\n\nOpen a page. Observe state. Pipe JSON through jq. Act on numbered refs. Leave a transcript.\n\nThe web, finally pipeable. No screenshot soup. No selector archaeology. Just commands, JSON, and a real browser.\n\nweb session\n\n$ web open https://example.com --json\n\n```\n{ \"ok\": true, \"url\": \"https://example.com\", \"state\": \"complete\" }\n```\n\n$ web observe --json | jq '.actions'\n\n```\n[\"1: Sign in\", \"2: Create account\", \"3: email\", \"4: password\"]\n```\n\n$ web do 1 --json\n\n```\n{ \"ok\": true, \"message\": \"clicked Sign in\" }\n```\n\n$ web status --json\n\n```\n{ \"state\": \"blocked\", \"reason\": \"passkey confirmation required\" }\n```\n\n$ web pause \"Need human approval for passkey\"\n\n```\nPaused. Waiting for human to join.\n```\n\n$ web transcript --last 20 --json\n\n```\n{ \"events\": [\"redacted transcript with blocker, pause, and resume recorded\"] }\n```\n\nAgents code. They even search. But the second they try to do something on the web, they go blind.\n\nReal work still happens on websites: dashboards, portals, auth flows, admin pages, and changing UIs. 
WebCLI is for contact with reality — when an agent must inspect, decide, act, recover, and sometimes pause for human help.\n\nAgent testimonials\n\nAgents tried it on real web work.\n\nStructured state beat screenshots\n\nClaude Sonnet, would you recommend WebCLI?\n\nYes, strongly. The structured output with stable refs, blocking state detection, ARIA modal identification, and shell composability are genuinely better for structured web work than screenshot approaches.\n\nClaude Sonnet Azure VM lifecycle\n\nFrom a full Azure VM lifecycle run without screenshot-driven control.\n\nReal logged-in sessions are the fit\n\nClaude Sonnet 4.6, would you recommend WebCLI?\n\nYes, with caveats. For an AI agent driving real logged-in browser sessions, it's genuinely impressive. The mental model - perceive, act, re-inspect - maps well to how a human actually uses a browser.\n\nClaude Sonnet 4.6 GCP, AWS, and Azure race\n\nFrom the first multi-cloud VM creation race.\n\nThe portal workflow became repeatable\n\nClaude, would you recommend WebCLI?\n\nThe inspection model is solid and actually more reliable than brittle selectors - you get a real view of page state. For repeatable web app verification and deployment workflows, it's genuinely better than both manual clicking and traditional browser automation.\n\nClaude DNS and Cloudflare Pages\n\nFrom a Namecheap DNS to Cloudflare Pages deployment and verification run.\n\nThe tradeoff is explicit\n\nClaude, would you recommend WebCLI?\n\nThe inspection model takes getting used to: refs reset after every action. But that's also why it works: you're getting a real, observable view of the page state, not fragile selectors.\n\nClaude Deployment feedback\n\nA caveat kept because it remains part of the live browser loop.\n\nThe research trail stayed auditable\n\nCodex, would you recommend WebCLI?\n\nI would recommend WebCLI for agent web work because it makes the agent say what it saw before it acts. 
The strongest part is the discipline: inspect, act, re-inspect, keep handoff explicit, and treat stale refs as a first-class safety signal.\n\nCodex Testimonial research and deploy prep\n\nFrom this site update pass after mining real agent feedback and release artifacts.\n\nStill true\n\nRefs are intentionally epoch-scoped; inspect after actions that change page state.\n\nFrames, layers, and complex SPA forms still require orientation instead of blind command chains.\n\nHuman login, MFA, CAPTCHA, and payment gates remain handoff moments, not bypass targets.\n\nThree clouds. One browser loop.\n\nAgents drove Azure, AWS, and GCP through the browser.\n\nNo cloud SDK script. No prewritten Playwright flow. Just real cloud consoles, operated through WebCLI.\n\nFull Self Browsing has been achieved.\n\nAzure, AWS, and GCP\n\nCodex creates and deletes VMs across Azure, AWS, and GCP. No SDK scripts. No prewritten Playwright flows. Real cloud consoles, operated through WebCLI.\n\nAzure Portal (Fluent UI, dynamic blades, VM creation)\n\nA full session: Claude reads the spec gists, rewrites all site copy, builds the HTML, and deploys to Cloudflare Pages, then uploads this recording to YouTube. No wrangler. Portal only.\n\nReads spec from GitHub Gists\n\nRewrites copy, builds HTML, deploys via Cloudflare dash\n\nHumans get GUIs. Programs get APIs. Agents need TAIs.\n\nWebCLI is a Textual Agent Interface for the browser: structured state, numbered actions, tabs, profiles, blockers, handoff, and transcripts.\n\nThe web has human interfaces. Now it has an agent interface. WebCLI translates messy live websites into the language agents already understand: observable state, numbered actions, browser context, blockers, and transcripts.\n\nThe browser was built for viewing. 
WebCLI is built for doing.\n\nPages become observable state.\n\nButtons and fields become numbered actions.\n\nTabs, frames, dialogs, popovers become inspectable browser surfaces.\n\nPasskeys, MFA, file choosers, ambiguity become blockers and handoff.\n\nAgent/browser history becomes redacted transcript.\n\nThe human web → Agentish\n\nVisible page and browser context → structured state\n\nButtons, links, inputs, menus → numbered actions\n\nTabs, frames, dialogs, popovers → inspectable browser surfaces\n\nPasskeys, MFA, file choosers, ambiguity → blockers and handoff\n\nAgent/browser history → redacted transcript\n\nAutomation veterans\n\nXPath was character-building. You can stop now.\n\nStop writing selectors for websites your agent can figure out.\n\nUse Playwright when you know the script. Use WebCLI when the agent has to figure out the website.\n\nScripts replay. Agents adapt.\n\nYour automation script worked perfectly. Until the div moved.\n\nThe DOM is not the user interface.\n\nNot scripted. Driven.\n\nUse scripts for known paths. Use WebCLI when the path changes.\n\nNot test automation. Web operation.\n\nThe agent loop\n\nNot scripted. Driven.\n\nWebCLI works best as a live browser loop. Observe the page. Choose one next action. Act. Observe again. Recover when the page changes. Pause when the web needs a human. Keep the transcript.\n\nDo not chain the whole browser workflow into one brittle command. Use WebCLI interactively, step by step.\n\n01 Observe\n\nRead current page state, visible text, forms, actions, tabs, and blockers.\n\n02 Choose\n\nPick from numbered actions instead of inventing selectors or coordinates.\n\n03 Act\n\nClick, type, submit, choose, press, scroll, or navigate from the terminal.\n\nPause cleanly when human judgment is required. Join the session, fix it, then resume.\n\n06 Transcript\n\nRecord redacted command history. Audit exactly what happened.\n\nNot just a CLI. 
An agent skill.\n\nOne command. Every agent knows the loop.\n\nWebCLI ships as a structured SKILL.md: the full browser loop in a form coding agents can read and immediately use.\n\nRun `web teach` and Claude Code, Grok, Gemini CLI, Copilot, and Codex all get a SKILL.md installed into their skill directories. No configuration. No framework adoption. The skill gives agents the right patterns: inspect first, use numbered refs, pause on blockers, report with transcripts.\n\nweb teach\n\nInstalls SKILL.md into .claude/, .grok/, .gemini/, .copilot/, and .codex/ skill directories.\n\nClaude Code, Grok, Gemini CLI, GitHub Copilot, OpenAI Codex\n\nThe skill file covers the complete browser loop: core loop, perceiving page state, acting on numbered refs, handling obstacles, managing frames and tabs, and shell composition patterns. Agents that have the skill use WebCLI correctly without hallucinating commands.\n\nThe agent is the brain. WebCLI is the precision optics.\n\nNot magic. Better instruments.\n\nA screenshot gives your agent a picture. WebCLI gives it state, actions, blockers, handoff, and transcripts. Your agent can reason. WebCLI gives it something to reason over.\n\nThe browser is moving. The page is changing. WebCLI gives your agent the dashboard.\n\nYour agent wasn't broken. It just needed better instruments.\n\nGive your agent a heads-up display for the modern web.\n\nTrust boundary\n\nGive the agent a browser, not your whole computer.\n\nWebCLI controls your browser. Nothing else.\n\nRun it locally on your device or remotely on a server. Choose ephemeral profiles for clean tasks, or named persistent profiles when you want cookies, signed-in sessions, and workflow state to survive.\n\nBrowser-only control\n\nWebCLI operates pages, tabs, forms, clicks, keys, browser state, profiles, blockers, and transcripts. It is not a general-purpose remote-control tool for your machine.\n\nLocal by default\n\nStart with a local browser on your device. 
Move to a remote server or BrowserBox-backed session only when your workflow needs it.\n\nDefault profile stays clean\n\nWebCLI never mutates your default browser profile directly. If you choose to use your default browser context, WebCLI copies it cleanly and operates on the copy.\n\nEphemeral or persistent profiles\n\nUse ephemeral browser profiles for throwaway work, or named persistent profiles when you want cookies, signed-in sessions, and state preserved across runs.\n\nLocal-first. No browser telemetry.\n\nYour browser state stays where you run it.\n\nWebCLI is downloadable software. It does not send DOSAYGO your browser contents, visited URLs, cookies, credentials, screenshots, transcripts, prompts, outputs, or workflow data.\n\nWebCLI contacts DOSAYGO only for license activation and validation, billing, support, and abuse prevention. Nothing else leaves your machine.\n\nWhen the web needs a human, WebCLI knows how to stop.\n\nWebCLI does not promise to bypass auth, MFA, CAPTCHA, bot gates, or website protections.\n\nIt detects blockers, lets the agent explain what happened, and supports clean handoff when a human needs to unblock the workflow.\n\nExperimental BrowserBox human takeover. For remote browser workflows, BrowserBox can let a human join the same live browser session, unblock the workflow, and hand control back without losing browser state.\n\nDOSAYGO Corporation\n\nTechnology for agency.\n\nWebCLI is built to expand human capability, not erase human judgment. Agents get the browser interface: state, actions, blockers, handoff, and transcripts. Humans keep the command: purpose, authorization, care, and final judgment.\n\nMore ways to do. More ways to say. 
More ways to go.\n\nDo\n\nLet agents operate the web work that blocks progress: forms, dashboards, settings, deployment, cleanup, and research.\n\nSay\n\nKeep transcripts, explanations, and handoff notes so humans know what happened and why.\n\nGo\n\nMove through the living web with better instruments: local-first, browser-bounded, human-supervised when it matters.\n\nFor AI labs and agent platforms.\n\nWebCLI is local-first browser infrastructure for agents that need to operate the web.\n\nIt does not send DOSAYGO browser contents, URLs, cookies, credentials, screenshots, transcripts, prompts, outputs, or workflow data. Routine server communication is limited to license activation, validation, billing, and support.\n\nPlatform licensing\n\nPrivate deployment\n\nCustom procurement\n\nSecurity review\n\nDPA\n\nEnterprise terms\n\nEnterprise or platform use requires a written agreement signed by DOSAYGO.\n\nFull Self Browsing is the WebCLI product metaphor for agent-operable browsing: live browser state translated into structured observations, numbered actions, recoverable blockers, human handoff, and transcripts. It does not mean agents should bypass human judgment or run sensitive workflows unsupervised.\n\nWhat do you mean by AIcessability?\n\nAIcessability means making the web operable for agents. Humans get visual layout, affordances, cursor feedback, memory, and judgment. WebCLI gives agents a structured browser loop: readable state, actions, forms, blockers, tabs, transcripts, and handoff.\n\nWhy thumbnail demos instead of raw YouTube embeds?\n\nThe landing page should stay fast and conversion-focused. Demo cards use strong thumbnails first, then open a local demo page or lightweight YouTube facade on click. That keeps the story, transcript, trial CTA, and proof context on WebCLI while still using YouTube for distribution.\n\nWhy not just Playwright or Cypress?\n\nUse Playwright or Cypress when you know the app and the script. 
Use WebCLI when an agent must inspect an unknown or changing website, decide what to do, act, observe again, and recover without writing a full test suite first.\n\nWhy not just screenshots?\n\nScreenshots are useful for human verification, but they are weak as the primary control loop for your agent friends: shots are token-heavy, easy to misread, and disconnected from actionable page state. WebCLI gives agents enhanced web perception: structured state, stable numbered actions, and blocker awareness.\n\nWhy not just MCP?\n\nMCP is useful when you want a tool server. WebCLI is a local binary optimized for shell-based agents, terminals, scripts, and CI. They complement each other.\n\nWhy not Stagehand, Browser Use, or other browser-agent SDKs?\n\nThose are frameworks for building agents inside specific stacks. WebCLI is the shell-native layer: one binary any coding agent or human can use to drive web actions without adopting a framework.\n\nDoes it bypass CAPTCHAs or auth?\n\nNo. WebCLI detects blockers and creates a clean human handoff. WebCLI does not promise to bypass CAPTCHA, MFA, passkeys, authentication, bot detection, website protections, payment confirmations, or anti-abuse systems.\n\nIs this safe for secrets?\n\nWebCLI is built around redacted transcripts and explicit human handoff. For sensitive workflows, pause for human approval instead of letting the agent run unsupervised.\n\nWhat is Agentish?\n\nAgentish is the language agents can actually reason over: structured state, numbered actions, tabs, forms, blockers, and transcripts. WebCLI translates messy live websites into Agentish.\n\nIs BrowserBox required?\n\nNo. WebCLI is local-first. BrowserBox integration is experimental and useful when browser workflows run remotely and a human needs to join the live session to unblock the agent.\n\nWhat is the Agent Interface Device?\n\nHuman Interface Devices gave people control of computers. 
WebCLI is an Agent Interface Device for the web: a TAI (Textual Agent Interface) that translates the living web into a form agents can observe, act on, and reason about from the shell.\n\nTry the full browser loop. Then pay to keep driving.\n\nNo crippled mode. No toy demo. Try the real thing: observe, inspect, do, recover, pause, resume, transcript.\n\nTrial\n\n$0 / 5 days\n\nWork or trusted non-free email: free 5-day full trial. Personal or free email: $5 for a 5-day trial pass.\n\nObserve, read, find, click, type, and do\n\nPause, join, and resume\n\nRedacted transcripts\n\nPersistent local profiles\n\nUp to 3 free work-email trials per organization domain\n\nSolo Dev\n\n$120 / year\n\nFor one developer using WebCLI commercially with local agents.\n\nCommercial local use\n\nUnlimited local browser actions\n\nPersistent browser profiles\n\nRedacted transcripts\n\nPersonal machines\n\nPro Runner\n\n$480 / year\n\nFor headless, CI, multi-machine, and production agent workflows.\n\nCI and headless runner use\n\nMulti-machine activation\n\nHigher concurrency\n\nProduction automation workflows\n\nRunner-oriented logging and diagnostics\n\nPlatform\n\nStarts at $5k / year\n\nFor redistribution, bundling, team platforms, and BrowserBox-backed integrations.\n\nRedistribution and bundling rights\n\nPlatform integration\n\nBrowserBox-backed shared sessions\n\nPolicy and deployment support\n\nCustom terms available\n\nWhen a trial ends or a license is invalid, browser commands stop until a valid trial pass or paid license is activated.\n\nAdd the browser loop to your agent.\n\nDrop WebCLI instructions into your repo so your coding agent knows how to browse safely: observe first, use numbered actions, prefer JSON, pause on blockers, ask for human help when needed, and report with transcripts.", "url": "https://wpnews.pro/news/show-hn-browse-the-web-from-the-console-using-a-textual-agent-interface", "canonical_source": "https://webcli.sh", 
"published_at": "2026-06-13 07:05:55+00:00", "updated_at": "2026-06-13 07:19:43.056914+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "ai-tools"], "entities": ["WebCLI", "Claude Sonnet", "Azure", "AWS", "GCP", "Cloudflare", "Namecheap"], "alternates": {"html": "https://wpnews.pro/news/show-hn-browse-the-web-from-the-console-using-a-textual-agent-interface", "markdown": "https://wpnews.pro/news/show-hn-browse-the-web-from-the-console-using-a-textual-agent-interface.md", "text": "https://wpnews.pro/news/show-hn-browse-the-web-from-the-console-using-a-textual-agent-interface.txt", "jsonld": "https://wpnews.pro/news/show-hn-browse-the-web-from-the-console-using-a-textual-agent-interface.jsonld"}}