Browser CLI for Agents

brow is an open-source, standalone Playwright CLI that gives your agent a real Chromium instance with an agent-friendly API: persistent profiles, structured commands for common browser actions, plus an `eval` escape hatch for full power. It reaches 82% task success at ~$0.22/task, beating browser-use, playwright-mcp, and agent-browser on 22 benchmark tasks.

Homebrew:

    brew tap detrin/tap
    brew install brow

pip:

    pip install brow-cli

Then install Chromium once (with either method above):

    brow setup  # ~150MB, one-time

Agent skill:

    # For most agents (Cline, Cursor, Amp, Gemini CLI, etc.)
    npx -y skills add detrin/brow

    # For OpenCode (manual install)
    git clone https://github.com/detrin/brow.git
    ln -s "$(pwd)/brow/skills/brow" ~/.opencode/skills/brow

A real use case: use your Google account to search Maps in a city you've never visited, and extract structured results.

Open a headed browser with a persistent profile and sign in manually:

    brow session new --profile personal --headed
    brow navigate -s 1 "https://accounts.google.com"
    # Sign in manually in the browser window...
    brow session delete 1

Your login is saved in `~/.brow/profiles/personal/`, so you won't need to sign in again.

Paste this into Claude Code:

    Open a brow session with my personal profile, go to Google Maps, and search for bars near Times Square in New York. Return the names, Google Maps URLs, ratings, and number of reviews in a markdown table.
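The free-text search in the prompt above ends up as a Google Maps search URL that the agent navigates to. As a minimal sketch of that mapping, the snippet below builds such a URL from a query string. `maps_search_url` is a hypothetical helper written for illustration, not part of brow, and the `/maps/search/<query>` URL shape is taken from the example in this README rather than from any documented Google API.

```python
from urllib.parse import quote_plus

# Base path as it appears in this README's example; assumed, not an official API.
MAPS_SEARCH_BASE = "https://www.google.com/maps/search/"


def maps_search_url(query: str) -> str:
    """Return a Google Maps search URL for a free-text query.

    Hypothetical helper: spaces become '+', and other unsafe
    characters are percent-encoded as UTF-8.
    """
    return MAPS_SEARCH_BASE + quote_plus(query.strip())


if __name__ == "__main__":
    print(maps_search_url("bars near Times Square New York"))
```

The resulting string can be passed to `brow navigate -s 1 "<url>"` in place of a hand-written URL.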
Claude Code runs:

    brow session new --profile personal --headed   # → 1 (already logged in)
    brow navigate -s 1 "https://www.google.com/maps/search/bars+near+Times+Square+New+York"
    brow screenshot -s 1
    brow eval -s 1 "
    results = await page.evaluate('''() => {
        // Each search result card in the Maps sidebar
        const items = document.querySelectorAll('div.Nv2PK');
        return Array.from(items).slice(0, 8).map(el => {
            const name = el.querySelector('.fontHeadlineSmall, .qBF1Pd');
            const rating = el.querySelector('.MW4etd');
            const reviews = el.querySelector('.UY7F9');
            const link = el.querySelector('a[href*=\"/maps/place\"]');
            return {
                name: name?.innerText || '',
                rating: rating?.innerText || '',
                // Strip the parentheses around the review count
                reviews: reviews?.innerText.replace(/[()]/g, '') || '',
                url: link?.href || ''
            };
        });
    }''')
    import json
    result = json.dumps(results, indent=2)
    "
    brow session delete 1

| Bar | Rating | Reviews | Link |
|---|---|---|---|
| The Riff Raff Club | 4.4 | 60 | |
| Ascent Lounge | | | [Maps](https://www.google.com/maps/place/Ascent+Lounge/) |
| Jimmy's Corner | | | [Maps](https://www.google.com/maps/place/Jimmy's+Corner/) |
| O'Donoghue's Times Square | | | [Maps](https://www.google.com/maps/place/O'Donoghue's+Times+Square/) |
| The Dickens | | | [Maps](https://www.google.com/maps/place/The+Dickens/) |
| The Woo Woo | | | [Maps](https://www.google.com/maps/place/The+Woo+Woo/) |

Because the `personal` profile persists your Google login, you get personalized results: no cookie banners, no sign-in walls, just data.

22 tasks total (16 fixture + 6 new), Claude Sonnet via AWS Bedrock. Compared against playwright-cli, MCP Playwright, agent-browser (Rust/CDP), and browser-use (full-stack agent framework).

| Metric | brow | agent-browser | browser-use | playwright-cli | MCP Playwright |
|---|---|---|---|---|---|
| Success rate (16 fixture) | 88% (14/16) | 63% (10/16) | 63% (10/16) | 50% (8/16) | 44% (7/16) |
| Success rate (22 total) | 82% (18/22) | 64% (14/22) | 64% (14/22) | 55% (12/22) | 36% (8/22) |
| Avg tokens/task (16 fixture) | 68K | 73K | 75K | 113K | 118K |
| Avg tokens/task (22 total) | 88K | 69K | 81K | 96K | 132K |
| Avg tool calls | 9.6 | 11.2 | 5.8 | 9.6 | 11.6 |
| Avg wall-clock (fixture) | 41s | 36s | 73s | 44s | 50s |
| Est. cost/task | $0.22 | $0.23 | $0.27 | $0.35 | $0.37 |

brow leads on success rate across both suites. On token efficiency, brow leads the 16-task fixture suite (68K avg), but agent-browser is the most efficient across all 22 tasks (69K avg). brow's 22-task average is inflated by one live task (github-trending-python: 383K tokens, where the agent didn't use snapshot filtering). browser-use runs its own agent loop and is included for completeness.

Per-task success grid, token breakdown, and analysis: [benchmarks/README.md](https://github.com/detrin/brow/blob/main/benchmarks/README.md)

    brow daemon start --port 19987
    brow daemon stop
    brow daemon status
    brow session new --profile