BrowserAct Hands-On: Real Browser Automation from the CLI

wpnews.pro

A few days ago, I received an email from the BrowserAct team after they came across one of my articles. They introduced BrowserAct as a browser automation CLI built for AI agents and invited me to try it out.

Browser automation usually means selectors, waits, scripts, browser state management, and debugging.

Whether you're using Playwright, Selenium, or Puppeteer, even simple browser workflows often require writing and maintaining automation code.

Going in, I was curious about one thing:

Can an AI-friendly CLI handle real browser workflows without requiring me to write Playwright or Selenium scripts?

Instead of writing a high-level overview, I decided to install BrowserAct, test it against real websites, create browser sessions, automate interactions, and evaluate how it performs in practical scenarios.

This article documents my hands-on experience and first impressions after testing BrowserAct on a real-world workflow.

BrowserAct is an open-source browser automation CLI that provides:

The interesting part is that BrowserAct tries to abstract browser automation into simple commands rather than requiring developers to write Playwright or Selenium scripts.

I installed BrowserAct using uv:

uv tool install browser-act-cli --python 3.12

After installation, I verified the available commands:

browser-act --help

The CLI immediately exposed commands for:

At this point it was clear that BrowserAct was much more than a simple web scraping utility.

I started with the simplest possible example.

browser-act stealth-extract https://example.com

Output:

This domain is for use in documentation examples without needing permission. Avoid use in operations.
[Learn more](https://iana.org/domains/example)

The result was returned as clean Markdown.

No browser scripting.

No selectors.

No parsing logic.

Just a single command.

Next, I wanted to see how BrowserAct handled a modern JavaScript-driven website.

I chose the Whale TV careers page.

browser-act stealth-extract \
  https://www.whaletv.com/careers \
  --content-type markdown

Output:

BrowserAct successfully extracted:

Some of the positions extracted included:

The extraction was surprisingly clean and readable.

Next, I tested a specific job posting.

browser-act stealth-extract \
  https://www.whaletv.com/open-positions/smart-tv-app-engineer-smart-tv-app-specialist \
  --content-type markdown

Output:

BrowserAct extracted:

For example, the extracted technical stack included:

This was more than simple HTML retrieval. The content was structured and immediately usable.
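Because the extracted Markdown appeared directly in the terminal, the two extractions above could probably be batched into files. The following is a minimal sketch rather than anything taken from the BrowserAct documentation: `url_to_filename` and `extract_all` are hypothetical helper names, and the sketch assumes that `stealth-extract` writes its result to standard output and exits with a non-zero status on failure.

```shell
#!/bin/sh
# Hypothetical helper: derive a filesystem-safe file name from a URL.
url_to_filename() {
  # Drop the scheme, collapse every run of non-alphanumeric characters
  # into "-", trim a trailing "-", then append ".md".
  printf '%s\n' "$1" | sed \
    -e 's|^[a-zA-Z]*://||' \
    -e 's|[^A-Za-z0-9]\{1,\}|-|g' \
    -e 's|-$||' \
    -e 's|$|.md|'
}

# Hypothetical helper: extract each URL into its own Markdown file.
# $1 = output directory, remaining arguments = URLs.
# Assumption: stealth-extract prints Markdown on stdout and returns
# a non-zero exit status when the extraction fails.
extract_all() {
  out_dir=$1
  shift
  mkdir -p "$out_dir" || return 1
  for url in "$@"; do
    browser-act stealth-extract "$url" --content-type markdown \
      > "$out_dir/$(url_to_filename "$url")" || echo "failed: $url" >&2
  done
}
```

Calling `extract_all out https://www.whaletv.com/careers` would then write the careers page to `out/www-whaletv-com-careers.md`.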

The next feature I wanted to evaluate was browser creation.

First, I listed the available browser profiles.

browser-act browser list-profiles

Output:

local_profile_101257381414961177 Your Chrome     local      shouriearyandev@gmail.com      Chrome
  browser:chrome_local_101261645860307073 whale-tv-evaluation managed    -                              whale-tv-evaluation

Total: 2 profiles

Tip: Use "browser-act browser create --source-profile <PROFILE_ID>" to create a browser from a profile.

I then created a browser using my existing Chrome profile.

browser-act browser create \
  --type chrome \
  --name "whale-tv-evaluation-test" \
  --desc "Testing BrowserAct browser automation features" \
  --source-profile <profile-id>

Output:

id=chrome_local_101361762646884363 name="whale-tv-evaluation-test" type=chrome
  desc="Testing BrowserAct browser automation features, navigation, interaction, and content extraction for Smart TV and developer tooling evaluation."
  imported_cookies=1606
  imported_ls_domains=160

The imported_cookies and imported_ls_domains fields in the output show that the browser was created successfully, with existing browser state imported automatically.
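The id= field in the create output is what later commands need as the browser ID. A small sketch of how it might be captured in a script follows. `browser_id_from_output` is a hypothetical helper, and it assumes the output keeps the `id=... name=... type=...` shape shown above, which may differ between BrowserAct versions.

```shell
#!/bin/sh
# Hypothetical helper: read `browser-act browser create` output on stdin
# and print the value of the first id= field.
# Assumption: the id appears at the start of a line as "id=<value> ...".
browser_id_from_output() {
  sed -n 's/^id=\([^ ]*\).*/\1/p' | head -n 1
}

# Usage (not run here; <profile-id> is the placeholder from the article):
#   browser_id=$(browser-act browser create --type chrome \
#     --name "whale-tv-evaluation-test" \
#     --source-profile <profile-id> | browser_id_from_output)
```

Capturing the ID this way avoids copying it by hand into the next command.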

To verify that the browser had been created successfully, I listed the available browsers.

browser-act browser list

Output:

The newly created browser appeared in the list and was ready to be used for automation.

After creating a browser, I opened a new browser session.

browser-act --session whale-test browser open <browser-id> https://www.google.com

Output:

session_name=whale-test
browser_type=chrome
url=https://www.google.com/
title=Google

BrowserAct immediately created a session and navigated to Google. The session name, browser type, URL, and page title in the output confirmed that the browser was operational and ready for interaction.
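Since the output is printed as key=value lines, a script could check a field such as the title before continuing. This is a minimal sketch under that assumption. `kv_get` is a hypothetical helper and not part of BrowserAct.

```shell
#!/bin/sh
# Hypothetical helper: print the value for one key from key=value lines
# read on stdin (one pair per line, as in the `browser open` output).
# Everything after the first "=" is kept, so URLs with "=" in them survive.
kv_get() {
  awk -v key="$1" '
    { sub(/^[ \t]+/, "") }                      # ignore leading indentation
    index($0, key "=") == 1 {                   # line starts with "key="
      print substr($0, length(key) + 2)
      exit
    }
  '
}

# Usage (not run here; <browser-id> is the placeholder from the article):
#   out=$(browser-act --session whale-test browser open <browser-id> https://www.google.com)
#   [ "$(printf '%s\n' "$out" | kv_get title)" = "Google" ] || echo "unexpected page" >&2
```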

One of BrowserAct's most interesting features is the ability to inspect a page and receive a structured representation of its interactive elements.

browser-act --session whale-test state

Output:

Instead of exposing raw HTML, BrowserAct generated an interaction tree.

For example:

[14] Search Box

[17] Google Search

[18] I'm Feeling Lucky

This makes interaction significantly easier because actions can be performed using element indexes rather than CSS selectors.
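In a script, the index could be looked up by label instead of being typed in by hand. The sketch below assumes each interactive element appears on its own line as `[index] label`, as in the excerpt above. The real state output may carry more detail, so treat `element_index` as a hypothetical helper that would need adjusting.

```shell
#!/bin/sh
# Hypothetical helper: print the index of the first element whose label
# matches "$1" exactly. Reads lines shaped like "[14] Search Box" on stdin.
# Assumption: that line shape mirrors the state output shown in the article.
element_index() {
  awk -v label="$1" '
    match($0, /^[ \t]*\[[0-9]+\][ \t]*/) {
      rest = substr($0, RLENGTH + 1)     # text after the "[N] " prefix
      if (rest == label) {
        idx = substr($0, 1, RLENGTH)     # the "[N] " prefix itself
        gsub(/[^0-9]/, "", idx)          # keep only the digits
        print idx
        exit
      }
    }
  '
}

# Usage (not run here):
#   idx=$(browser-act --session whale-test state | element_index "Google Search")
#   [ -n "$idx" ] && browser-act --session whale-test click "$idx"
```

Looking the index up on each run matters if indexes change between page loads, which seems likely on dynamic pages.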

Using the index returned by the state command, I entered a search query.

browser-act --session whale-test input 14 "Whale TV careers"

Output:

input="Whale TV careers" element=14

BrowserAct successfully entered the text into the search box.

Next, I triggered the search.

browser-act --session whale-test click 17

Output:

clicked=17

The click was executed successfully.

To allow the page to finish loading, I waited for the browser to become stable.

browser-act --session whale-test wait stable

Output:

wait completed: page is stable

This demonstrated how BrowserAct handles browser interactions through a simple command-driven workflow.

Next, I wanted to test direct navigation to a website.

browser-act --session whale-test navigate https://www.whaletv.com/careers

Output:

url=https://www.whaletv.com/careers
title=whaletv.com/careers
new_tab=False

BrowserAct immediately navigated to the requested URL and reported the new url and title in its output.

At this point I was interacting with a real-world website rather than a simple test page.

After the careers page loaded, I inspected the page state again.

browser-act --session whale-test state

Output:

BrowserAct identified actionable elements such as:

[11] Head of Ad Sales, Emerging Markets

[12] Ad Operations Specialist

[15] Accept Cookies

The conversion of page content into actionable elements is one of the most interesting aspects of BrowserAct's workflow.

Next, I clicked one of the available job listings.

browser-act --session whale-test click 11

Output:

clicked=11

BrowserAct opened the job posting and navigated to the detailed job description page.

Once again, I waited for the page to finish loading.

browser-act --session whale-test wait stable

Output:

wait completed: page is stable

The page was now ready for content extraction.

Finally, I extracted the content from the job details page.

browser-act --session whale-test get markdown

Output:

BrowserAct returned the complete job description including:

At this point I had completed an end-to-end workflow:

Create Browser → Open Session → Navigate → Inspect State → Click Elements → Wait → Extract Content

without writing a single line of browser automation code.
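For repeat runs, the same steps could be collected into a small shell function. This is a sketch built only from the commands used in this article. `run` and `workflow` are hypothetical names, the `DRY_RUN` switch is my own addition for previewing the commands, and the element index is passed in because it has to come from a fresh `state` call.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: open a session on a page, click one element,
# and extract the resulting page as Markdown.
# $1 = session name, $2 = browser id, $3 = start URL, $4 = element index.
# Each step runs only if the previous one succeeded.
workflow() {
  run browser-act --session "$1" browser open "$2" "$3" &&
  run browser-act --session "$1" wait stable &&
  run browser-act --session "$1" click "$4" &&
  run browser-act --session "$1" wait stable &&
  run browser-act --session "$1" get markdown
}

# Usage (not run here; <browser-id> is the placeholder from the article):
#   DRY_RUN=1
#   workflow whale-test <browser-id> https://www.whaletv.com/careers 11
```

With `DRY_RUN=1` the function prints the five commands without touching a browser, which makes it easy to review the sequence first.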

During my evaluation, I successfully tested the following BrowserAct capabilities:

Content Extraction

browser-act stealth-extract <url>

Browser Profile Discovery

browser-act browser list-profiles

Browser Creation

browser-act browser create

Browser Sessions

browser-act browser open

Page Inspection

browser-act state

Browser Interaction

browser-act input 
browser-act click 
browser-act navigate 
browser-act wait stable

Content Extraction From Active Sessions

browser-act get markdown

Through this testing, I was able to evaluate BrowserAct across both extraction and browser automation workflows.

During testing, I couldn't help comparing BrowserAct to tools such as Playwright and Selenium.

Traditional browser automation usually involves:

BrowserAct takes a different approach.

Instead of building automation through code, many workflows can be performed directly through CLI commands.

For quick tasks such as:

the workflow feels significantly lighter.

That doesn't replace full automation frameworks, but it does make many common browser tasks much faster to execute.

Based on my testing, BrowserAct could be useful for:

AI Agent Development

Developers building AI agents that need browser capabilities.

Research Workflows

Collecting information from websites and extracting structured content.

Browser Automation

Automating simple browser interactions without building a complete automation framework.

Developer Tooling

Internal tools that need browser-based capabilities.

Content Extraction

Extracting structured content from modern websites.

My goal was simple: install BrowserAct, test it on real websites, and determine whether it could handle practical browser automation workflows.

During testing I was able to:

✅ Install BrowserAct

✅ Extract content from multiple websites

✅ Extract detailed job descriptions

✅ Import an existing Chrome profile

✅ Create a browser

✅ Open browser sessions

✅ Navigate websites

✅ Inspect page structure

✅ Interact with page elements

✅ Extract content after navigation

Most importantly, I was able to complete an entire workflow:

Create Browser → Open Session → Navigate → Inspect → Click → Wait → Extract

using CLI commands instead of writing browser automation code.

For developers interested in browser automation, AI agents, content extraction, or developer tooling, BrowserAct is definitely worth exploring.

This was my first round of testing, and I focused primarily on installation, content extraction, browser creation, session management, navigation, and page interaction.

There are still several capabilities I haven't explored in depth yet, including verification handling, remote human handoff, screenshots, network inspection, multi-session workflows, and some of BrowserAct's more advanced browser management features.

Those areas deserve their own dedicated evaluation, which I'll likely cover in a follow-up article after additional testing.

This article focused on validating the core BrowserAct workflow:

In future testing, I plan to explore:

remote-assist

If those tests are successful, I'll publish a follow-up article documenting the results.

Try BrowserAct:

https://browseract.com?fpr=aryan21

BrowserAct GitHub Skills:

https://github.com/browser-act/skills

BrowserAct Documentation:

This article includes an affiliate link. If you decide to try BrowserAct through my referral link, I may earn a commission at no additional cost to you.

Source: dev.to (original article)