A few days ago, I received an email from the BrowserAct team after they came across one of my articles. They introduced BrowserAct as a browser automation CLI built for AI agents and invited me to try it out.
Browser automation usually means selectors, waits, scripts, browser state management, and debugging.
Whether you're using Playwright, Selenium, or Puppeteer, even simple browser workflows often require writing and maintaining automation code.
That invitation left me curious about one thing:
Can an AI-friendly CLI handle real browser workflows without requiring me to write Playwright or Selenium scripts?
Instead of writing a high-level overview, I decided to install BrowserAct, test it against real websites, create browser sessions, automate interactions, and evaluate how it performs in practical scenarios.
This article documents my hands-on experience and first impressions after testing BrowserAct on a real-world workflow.
BrowserAct is an open-source browser automation CLI that provides:
The interesting part is that BrowserAct tries to abstract browser automation into simple commands rather than requiring developers to write Playwright or Selenium scripts.
I installed BrowserAct using uv:
uv tool install browser-act-cli --python 3.12
After installation, I verified the available commands:
browser-act --help
The CLI immediately exposed commands for:
At this point it was clear that BrowserAct was much more than a simple web scraping utility.
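Before scripting anything around the CLI, it can help to confirm that the executable is actually on the PATH after installation. The following is a minimal POSIX shell sketch; `require_cli` is a hypothetical helper name used only for illustration and is not part of BrowserAct.

```shell
# require_cli NAME: succeed only if the named command can be found on PATH.
# (Hypothetical helper for illustration; not a BrowserAct feature.)
require_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "error: '$1' not found on PATH; install it first" >&2
  return 1
}

# Usage: require_cli browser-act && browser-act --help
```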
I started with the simplest possible example.
browser-act stealth-extract https://example.com
Output:
This domain is for use in documentation examples without needing permission. Avoid use in operations.
[Learn more](https://iana.org/domains/example)
The result was returned as clean Markdown.
No browser scripting.
No selectors.
No parsing logic.
Just a single command.
Next, I wanted to see how BrowserAct handled a modern JavaScript-driven website.
I chose the Whale TV careers page.
browser-act stealth-extract \
  https://www.whaletv.com/careers \
  --content-type markdown
Output:
BrowserAct successfully extracted:
Some of the positions extracted included:
The extraction was surprisingly clean and readable.
Next, I tested a specific job posting.
browser-act stealth-extract \
  https://www.whaletv.com/open-positions/smart-tv-app-engineer-smart-tv-app-specialist \
  --content-type markdown
Output:
BrowserAct extracted:
For example, the extracted technical stack included:
This was more than simple HTML retrieval. The content was structured and immediately usable.
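Since each extraction is a single command, several pages can be saved in one pass with a small loop. The sketch below reuses the `stealth-extract` invocation from the examples above; the names `ba` and `extract_all`, and the choice to name each file after the last URL path segment, are my own assumptions for illustration.

```shell
# ba: thin wrapper so the CLI call can be swapped out (for example, in a dry run).
ba() { browser-act "$@"; }

# extract_all OUTDIR URL...: save each page as OUTDIR/<last-path-segment>.md
# using the stealth-extract invocation shown in the article.
extract_all() {
  outdir=$1
  shift
  mkdir -p "$outdir" || return 1
  for url in "$@"; do
    name=$(basename "$url")   # ".../careers" becomes "careers"
    if ! ba stealth-extract "$url" --content-type markdown > "$outdir/$name.md"; then
      echo "extract failed: $url" >&2
      return 1
    fi
  done
}

# Usage: extract_all out https://www.whaletv.com/careers
```

Two URLs that end in the same path segment would overwrite each other, so a real script would need a more careful naming scheme.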
The next feature I wanted to evaluate was browser creation.
First, I listed the available browser profiles.
browser-act browser list-profiles
Output:
local_profile_101257381414961177 Your Chrome local shouriearyandev@gmail.com Chrome
browser:chrome_local_101261645860307073 whale-tv-evaluation managed - whale-tv-evaluation
Total: 2 profiles
Tip: Use "browser-act browser create --source-profile <PROFILE_ID>" to create a browser from a profile.
I then created a browser using my existing Chrome profile.
browser-act browser create \
  --type chrome \
  --name "whale-tv-evaluation-test" \
  --desc "Testing BrowserAct browser automation features" \
  --source-profile <profile-id>
Output:
id=chrome_local_101361762646884363 name="whale-tv-evaluation-test" type=chrome
desc="Testing BrowserAct browser automation features, navigation, interaction, and content extraction for Smart TV and developer tooling evaluation."
imported_cookies=1606
imported_ls_domains=160
BrowserAct reported:
imported_cookies=1592
imported_ls_domains=157
The browser was created successfully with existing browser state imported automatically.
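The `key=value` output makes it fairly easy to pick out a single field, such as the new browser's `id`, for use in a later command. Here is a minimal sketch that assumes the output keeps the shape shown above; `field` is a hypothetical helper, and it only handles values without spaces. If the CLI offers a machine-readable output mode, that would be preferable, but none is assumed here.

```shell
# field KEY: read CLI output on stdin and print the value of the first
# KEY=value token. Assumes the value contains no spaces (true for id and type
# in the output shown in this article; not true for desc).
field() {
  tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# Usage (flags as in the create command above):
#   browser_id=$(browser-act browser create ... | field id)
```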
To verify that the browser had been created successfully, I listed the available browsers.
browser-act browser list
Output:
The newly created browser appeared in the list and was ready to be used for automation.
After creating a browser, I opened a new browser session.
browser-act \
  --session whale-test \
  browser open \
  <browser-id> \
  https://www.google.com
Output:
session_name=whale-test
browser_type=chrome
url=https://www.google.com/
title=Google
BrowserAct immediately created a session and navigated to Google.
The output included:
session_name=whale-test
browser_type=chrome
url=https://www.google.com
title=Google
This confirmed that the browser was operational and ready for interaction.
One of BrowserAct's most interesting features is the ability to inspect a page and receive a structured representation of its interactive elements.
browser-act --session whale-test state
Output:
Instead of exposing raw HTML, BrowserAct generated an interaction tree.
For example:
[14] Search Box
[17] Google Search
[18] I'm Feeling Lucky
This makes interaction significantly easier because actions can be performed using element indexes rather than CSS selectors.
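Because indexes are assigned per page, a script probably should not hard-code them. One option is to look an index up by its visible label each time. The sketch below assumes `state` prints lines of the form `[N] label`, as in the excerpt above; the real output may contain more structure, and `find_index` is a hypothetical helper.

```shell
# find_index LABEL: read `state` output on stdin and print the index of the
# first line of the form "[N] ..." that contains LABEL (fixed-string match).
# Prints nothing if no line matches.
find_index() {
  grep -F -- "$1" | sed -n 's/^[[:space:]]*\[\([0-9][0-9]*\)\].*/\1/p' | head -n 1
}

# Usage: browser-act --session whale-test state | find_index "Google Search"
```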
Using the index returned by the state command, I entered a search query.
browser-act --session whale-test input 14 "Whale TV careers"
Output:
input="Whale TV careers" element=14
BrowserAct successfully entered the text into the search box.
Next, I triggered the search.
browser-act --session whale-test click 17
Output:
clicked=17
The click was executed successfully.
To allow the page to finish loading, I waited for the browser to become stable.
browser-act --session whale-test wait stable
Output:
wait completed: page is stable
This demonstrated how BrowserAct handles browser interactions through a simple command-driven workflow.
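The three steps above (type, click, wait) can be chained so that a failure in one stops the rest. This is a sketch rather than a recommended pattern: `ba` and `search_in_session` are hypothetical names, and it assumes the CLI returns a non-zero exit status on failure, which I did not verify.

```shell
# ba: thin wrapper so the CLI call can be swapped out (for example, in a dry run).
ba() { browser-act "$@"; }

# search_in_session SESSION BOX_INDEX BUTTON_INDEX QUERY
# Type QUERY into the element at BOX_INDEX, click BUTTON_INDEX, then wait
# for the page to settle. Stops at the first command that fails
# (assumes failures are reported through the exit status).
search_in_session() {
  ba --session "$1" input "$2" "$4" &&
  ba --session "$1" click "$3" &&
  ba --session "$1" wait stable
}

# Usage: search_in_session whale-test 14 17 "Whale TV careers"
```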
Next, I wanted to test direct navigation to a website.
browser-act --session whale-test navigate https://www.whaletv.com/careers
Output:
url=https://www.whaletv.com/careers
title=whaletv.com/careers
new_tab=False
BrowserAct immediately navigated to the requested URL and reported:
url=https://www.whaletv.com/careers
title=whaletv.com/careers
At this point I was interacting with a real-world website rather than a simple test page.
After the careers page loaded, I inspected the page state again.
browser-act --session whale-test state
Output:
BrowserAct identified actionable elements such as:
[11] Head of Ad Sales, Emerging Markets
[12] Ad Operations Specialist
[15] Accept Cookies
The conversion of page content into actionable elements is one of the most interesting aspects of BrowserAct's workflow.
Next, I clicked one of the available job listings.
browser-act --session whale-test click 11
Output:
clicked=11
BrowserAct opened the job posting and navigated to the detailed job description page.
Once again, I waited for the page to finish loading.
browser-act --session whale-test wait stable
Output:
wait completed: page is stable
The page was now ready for content extraction.
Finally, I extracted the content from the job details page.
browser-act --session whale-test get markdown
Output:
BrowserAct returned the complete job description including:
At this point I had completed an end-to-end workflow:
Create Browser
→ Open Session
→ Navigate
→ Inspect State
→ Click Elements
→ Wait
→ Extract Content
without writing a single line of browser automation code.
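Because every step is a CLI call, the same sequence can also be chained in a plain shell function for repeat runs. The sketch below covers the navigate, inspect, click, wait, and extract steps for a session that is already open. The names `ba` and `page_to_markdown` are hypothetical. Looking up the index from `[N] label` lines is an assumption based on the `state` excerpts in this article, and the error handling assumes the CLI signals failure through its exit status.

```shell
# ba: thin wrapper so the CLI call can be swapped out (for example, in a dry run).
ba() { browser-act "$@"; }

# page_to_markdown SESSION URL LABEL
# In an already-open session: navigate to URL, wait, look up the element
# whose state line contains LABEL, click it, wait again, print the Markdown.
page_to_markdown() {
  ba --session "$1" navigate "$2" >/dev/null || return 1
  ba --session "$1" wait stable >/dev/null || return 1

  # Read the index fresh from `state` instead of hard-coding it.
  # Parsing "[N] label" lines is an assumption about the output format.
  idx=$(ba --session "$1" state | grep -F -- "$3" |
        sed -n 's/^[[:space:]]*\[\([0-9][0-9]*\)\].*/\1/p' | head -n 1)
  if [ -z "$idx" ]; then
    echo "no element matching: $3" >&2
    return 1
  fi

  ba --session "$1" click "$idx" >/dev/null || return 1
  ba --session "$1" wait stable >/dev/null || return 1
  ba --session "$1" get markdown
}

# Usage:
#   page_to_markdown whale-test https://www.whaletv.com/careers \
#     "Ad Operations Specialist" > job.md
```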
During my evaluation, I successfully tested the following BrowserAct capabilities:
Content Extraction
browser-act stealth-extract <url>
Browser Profile Discovery
browser-act browser list-profiles
Browser Creation
browser-act browser create
Browser Sessions
browser-act browser open
Page Inspection
browser-act state
Browser Interaction
browser-act input
browser-act click
browser-act navigate
browser-act wait stable
Content Extraction From Active Sessions
browser-act get markdown
Through this testing, I was able to evaluate BrowserAct across both extraction and browser automation workflows.
During testing, I couldn't help comparing BrowserAct to tools such as Playwright and Selenium.
Traditional browser automation usually involves:
BrowserAct takes a different approach.
Instead of building automation through code, many workflows can be performed directly through CLI commands.
For quick tasks such as:
the workflow feels significantly lighter.
That doesn't replace full automation frameworks, but it does make many common browser tasks much faster to execute.
Based on my testing, BrowserAct could be useful for:
AI Agent Development
Developers building AI agents that need browser capabilities.
Research Workflows
Collecting information from websites and extracting structured content.
Browser Automation
Automating simple browser interactions without building a complete automation framework.
Developer Tooling
Internal tools that need browser-based capabilities.
Content Extraction
Extracting structured content from modern websites.
My goal was simple: install BrowserAct, test it on real websites, and determine whether it could handle practical browser automation workflows.
During testing I was able to:
✓ Install BrowserAct
✓ Extract content from multiple websites
✓ Extract detailed job descriptions
✓ Import an existing Chrome profile
✓ Create a browser
✓ Open browser sessions
✓ Navigate websites
✓ Inspect page structure
✓ Interact with page elements
✓ Extract content after navigation
Most importantly, I was able to complete an entire workflow:
Create Browser
→ Open Session
→ Navigate
→ Inspect
→ Click
→ Wait
→ Extract
using CLI commands instead of writing browser automation code.
For developers interested in browser automation, AI agents, content extraction, or developer tooling, BrowserAct is definitely worth exploring.
This was my first round of testing, and I focused primarily on installation, content extraction, browser creation, session management, navigation, and page interaction.
There are still several capabilities I haven't explored in depth yet, including verification handling, remote human handoff, screenshots, network inspection, multi-session workflows, and some of BrowserAct's more advanced browser management features.
Those areas deserve their own dedicated evaluation, which I'll likely cover in a follow-up article after additional testing.
This article focused on validating the core BrowserAct workflow:
In future testing, I plan to explore:
remote-assist
If those tests are successful, I'll publish a follow-up article documenting the results.
Try BrowserAct:
https://browseract.com?fpr=aryan21
BrowserAct GitHub Skills:
https://github.com/browser-act/skills
BrowserAct Documentation:
This article includes an affiliate link. If you decide to try BrowserAct through my referral link, I may earn a commission at no additional cost to you.