# BrowserAct Hands-On: Real Browser Automation from the CLI

> Source: <https://dev.to/aryan_shourie/browseract-hands-on-real-browser-automation-from-the-cli-41n6>
> Published: 2026-06-17 08:37:21+00:00

A few days ago, I received an email from the BrowserAct team after they came across one of my articles. They introduced BrowserAct as a browser automation CLI built for AI agents and invited me to try it out.

Browser automation usually means selectors, waits, scripts, browser state management, and debugging.

Whether you're using Playwright, Selenium, or Puppeteer, even simple browser workflows often require writing and maintaining automation code.

Going in, I was curious about one thing:

Can an AI-friendly CLI handle real browser workflows without requiring me to write Playwright or Selenium scripts?

Instead of writing a high-level overview, I decided to install BrowserAct, test it against real websites, create browser sessions, automate interactions, and evaluate how it performs in practical scenarios.

This article documents my hands-on experience and first impressions after testing BrowserAct on a real-world workflow.

BrowserAct is an open-source browser automation CLI that provides:

The interesting part is that BrowserAct tries to abstract browser automation into simple commands rather than requiring developers to write Playwright or Selenium scripts.

I installed BrowserAct using uv:

```
uv tool install browser-act-cli --python 3.12
```

After installation, I verified the available commands:

```
browser-act --help
```

The CLI immediately exposed commands for:

At this point it was clear that BrowserAct was much more than a simple web scraping utility.
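If you plan to script around the CLI, a small guard can stop early when the install did not land on `PATH`. This is only a sketch: `require_cmd` is a helper name I made up for illustration, not something BrowserAct ships.

```shell
# require_cmd: hypothetical helper. Succeeds only if the named command
# can be found on PATH; otherwise prints a message to stderr and fails.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing command: $1" >&2
  return 1
}

# Usage in a script: stop before doing any work if the CLI is absent.
# require_cmd browser-act || exit 1
```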

I started with the simplest possible example.

```
browser-act stealth-extract https://example.com
```

Output:

```
# Example Domain
This domain is for use in documentation examples without needing permission. Avoid use in operations.
[Learn more](https://iana.org/domains/example)
```

The result was returned as clean Markdown.

No browser scripting.

No selectors.

No parsing logic.

Just a single command.
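Because the result is plain Markdown, ordinary text tools can post-process it. The sketch below pulls link targets out of the sample output shown above. `md_links` is a hypothetical helper, and piping `stealth-extract` into it assumes the command writes its Markdown to standard output, which I have not verified beyond what the terminal showed.

```shell
# md_links: hypothetical helper. Prints the target of every inline
# Markdown link ([text](url)) found on stdin, one URL per line.
md_links() {
  grep -o ']([^)]*)' | sed 's/^](//; s/)$//'
}

# Sample copied from the example.com output above.
sample='# Example Domain
This domain is for use in documentation examples without needing permission. Avoid use in operations.
[Learn more](https://iana.org/domains/example)'

printf '%s\n' "$sample" | md_links
# prints: https://iana.org/domains/example
```

Under that assumption, the live version would be `browser-act stealth-extract https://example.com | md_links`.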

Next, I wanted to see how BrowserAct handled a modern JavaScript-driven website.

I chose the Whale TV careers page.

```
browser-act stealth-extract \
  https://www.whaletv.com/careers \
  --content-type markdown
```

BrowserAct successfully extracted:

Some of the positions extracted included:

The extraction was surprisingly clean and readable.

Next, I tested a specific job posting.

```
browser-act stealth-extract \
  https://www.whaletv.com/open-positions/smart-tv-app-engineer-smart-tv-app-specialist \
  --content-type markdown
```

BrowserAct extracted:

For example, the extracted technical stack included:

This was more than simple HTML retrieval. The content was structured and immediately usable.
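For more than one page, the same command can be looped. This is a rough sketch, assuming `stealth-extract` writes the Markdown to standard output. `slug_from_url` and `extract_all` are hypothetical helper names, and the output file naming is my own choice.

```shell
# slug_from_url: hypothetical helper. Uses the last path segment of a URL
# as a file name stem.
slug_from_url() {
  u=${1%/}                 # drop one trailing slash, if present
  printf '%s\n' "${u##*/}"
}

# extract_all: hypothetical helper. Reads URLs from stdin, one per line,
# and saves each page as <slug>.md in the current directory.
extract_all() {
  while IFS= read -r page; do
    [ -n "$page" ] || continue
    # </dev/null keeps the CLI from consuming the URL list on stdin.
    browser-act stealth-extract "$page" --content-type markdown \
      </dev/null > "$(slug_from_url "$page").md"
  done
}

slug_from_url https://www.whaletv.com/careers
# prints: careers
```

To run it for real, feed it a list: `printf '%s\n' <url-1> <url-2> | extract_all`.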

The next feature I wanted to evaluate was browser creation.

First, I listed the available browser profiles.

```
browser-act browser list-profiles
```

Output:

```
local_profile_101257381414961177 Your Chrome     local      shouriearyandev@gmail.com      Chrome
  browser:chrome_local_101261645860307073 whale-tv-evaluation managed    -                              whale-tv-evaluation

Total: 2 profiles

Tip: Use "browser-act browser create --source-profile <PROFILE_ID>" to create a browser from a profile.
```

I then created a browser using my existing Chrome profile.

```
browser-act browser create \
  --type chrome \
  --name "whale-tv-evaluation-test" \
  --desc "Testing BrowserAct browser automation features" \
  --source-profile <profile-id>
```

Output:

```
id=chrome_local_101361762646884363 name="whale-tv-evaluation-test" type=chrome
  desc="Testing BrowserAct browser automation features, navigation, interaction, and content extraction for Smart TV and developer tooling evaluation."
  imported_cookies=1606
  imported_ls_domains=160
```

The `imported_cookies` and `imported_ls_domains` fields in this output show how much state BrowserAct carried over from the source profile.

The browser was created successfully with existing browser state imported automatically.
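The `id=` field is what later commands need, so a script could capture it instead of copying it by hand. This is a minimal sketch, assuming the first line of the `browser create` output always starts with `id=` as it did in my run. `browser_id` is a hypothetical helper.

```shell
# browser_id: hypothetical helper. Prints the value of the leading id=
# field from `browser create` output read on stdin.
browser_id() {
  sed -n 's/^id=\([^ ]*\).*/\1/p' | head -n 1
}

# Sample based on the output above.
sample='id=chrome_local_101361762646884363 name="whale-tv-evaluation-test" type=chrome
  imported_cookies=1606
  imported_ls_domains=160'

printf '%s\n' "$sample" | browser_id
# prints: chrome_local_101361762646884363
```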

To verify that the browser had been created successfully, I listed the available browsers.

```
browser-act browser list
```

The newly created browser appeared in the list and was ready to be used for automation.

After creating a browser, I opened a new browser session.

```
browser-act --session whale-test \
  browser open <browser-id> https://www.google.com
```

Output:

```
session_name=whale-test
browser_type=chrome
url=https://www.google.com/
title=Google
```

BrowserAct immediately created a session and navigated to Google.

This confirmed that the browser was operational and ready for interaction.
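Since the output is one `key=value` pair per line, a script can check that the session landed where expected before going further. This is a sketch under that assumption. `get_field` is a hypothetical helper, and I have only seen this format in the runs shown here.

```shell
# get_field: hypothetical helper. Prints the value for KEY ($1) from
# "key=value" lines on stdin, using the first matching line only.
get_field() {
  sed -n "s/^$1=//p" | head -n 1
}

# Sample copied from the `browser open` output above.
sample='session_name=whale-test
browser_type=chrome
url=https://www.google.com/
title=Google'

title=$(printf '%s\n' "$sample" | get_field title)
if [ "$title" = "Google" ]; then
  echo "session is on the expected page"
fi
# prints: session is on the expected page
```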

One of BrowserAct's most interesting features is the ability to inspect a page and receive a structured representation of its interactive elements.

```
browser-act --session whale-test state
```

Instead of exposing raw HTML, BrowserAct generated an interaction tree.

For example:

[14] Search Box

[17] Google Search

[18] I'm Feeling Lucky

This makes interaction significantly easier because actions can be performed using element indexes rather than CSS selectors.
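Looking an index up by its label is easy to script. The sketch below assumes each element appears on its own line as `[N] label`, which is the simplified form I quoted above; the real `state` output may carry more detail, so treat the parsing as illustrative. `find_index` is a hypothetical helper.

```shell
# find_index: hypothetical helper. Reads "[N] label" lines on stdin and
# prints N for the first line whose label equals $1 exactly.
find_index() {
  awk -v want="$1" '
    { sub(/^[ \t]+/, "") }                    # tolerate leading indentation
    match($0, /^\[[0-9]+\] /) {
      idx = substr($0, 2, RLENGTH - 3)        # digits between "[" and "]"
      if (substr($0, RLENGTH + 1) == want) { print idx; exit }
    }'
}

find_index "Google Search" <<'EOF'
[14] Search Box
[17] Google Search
[18] I'm Feeling Lucky
EOF
# prints: 17
```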

Using the index returned by the state command, I entered a search query.

```
browser-act --session whale-test input 14 "Whale TV careers"
```

Output:

```
input="Whale TV careers" element=14
```

BrowserAct successfully entered the text into the search box.

Next, I triggered the search.

```
browser-act --session whale-test click 17
```

Output:

```
clicked=17
```

The click was executed successfully.

To allow the page to finish loading, I waited for the browser to become stable.

```
browser-act --session whale-test wait stable
```

Output:

```
wait completed: page is stable
```

This demonstrated how BrowserAct handles browser interactions through a simple command-driven workflow.
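The three steps above repeat the same `--session` prefix, so they wrap up neatly. This is a minimal sketch using only the subcommands I ran. The `ba` and `search` names and the `DRY_RUN` switch are my own inventions for illustration, not BrowserAct features.

```shell
SESSION=whale-test

# ba: hypothetical wrapper that prefixes every call with the session flag.
# With DRY_RUN=1 it prints the command instead of running it
# (the printed form drops shell quoting).
ba() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "browser-act --session $SESSION $*"
  else
    browser-act --session "$SESSION" "$@"
  fi
}

# search INPUT_INDEX QUERY BUTTON_INDEX: type, click, then wait.
# Stops at the first step that fails.
search() {
  ba input "$1" "$2" &&
    ba click "$3" &&
    ba wait stable
}

# Dry run with the indexes from the state output above.
DRY_RUN=1
search 14 "Whale TV careers" 17
# prints:
#   browser-act --session whale-test input 14 Whale TV careers
#   browser-act --session whale-test click 17
#   browser-act --session whale-test wait stable
```

The indexes come from the most recent `state` output, and I would assume they can change whenever the page does, so it seems safer to re-run `state` before reusing them.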

Next, I wanted to test direct navigation to a website.

```
browser-act --session whale-test navigate https://www.whaletv.com/careers
```

Output:

```
url=https://www.whaletv.com/careers
title=whaletv.com/careers
new_tab=False
```

BrowserAct immediately navigated to the requested URL and reported the new URL and page title.

At this point I was interacting with a real-world website rather than a simple test page.

After loading the careers page, I inspected the page state again.

```
browser-act --session whale-test state
```

BrowserAct identified actionable elements such as:

[11] Head of Ad Sales, Emerging Markets

[12] Ad Operations Specialist

[15] Accept Cookies

The conversion of page content into actionable elements is one of the most interesting aspects of BrowserAct's workflow.

Next, I clicked one of the available job listings.

```
browser-act --session whale-test click 11
```

Output:

```
clicked=11
```

BrowserAct opened the job posting and navigated to the detailed job description page.

Once again, I waited for the page to finish loading.

```
browser-act --session whale-test wait stable
```

Output:

```
wait completed: page is stable
```

The page was now ready for content extraction.

Finally, I extracted the content from the job details page.

```
browser-act --session whale-test get markdown
```

BrowserAct returned the complete job description including:

At this point I had completed an end-to-end workflow:

Create Browser → Open Session → Navigate → Inspect State → Click Elements → Wait → Extract Content

without writing a single line of browser automation code.
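For repeat runs, the same sequence fits in one small shell function. This is a sketch rather than a tested recipe. It assumes `state` prints one `[N] label` line per element and that each command exits non-zero on failure, neither of which I have confirmed. `open_and_extract` and the `BROWSER_ACT` override are hypothetical names of my own.

```shell
# Command and session are variables so the flow can be tried against a stub.
BROWSER_ACT=${BROWSER_ACT:-browser-act}
SESSION=${SESSION:-whale-test}

# open_and_extract URL LABEL: navigate to URL, click the first element whose
# state line contains "] LABEL", wait for the page to settle, then print
# the page as Markdown on stdout.
open_and_extract() {
  "$BROWSER_ACT" --session "$SESSION" navigate "$1" >/dev/null || return 1
  idx=$("$BROWSER_ACT" --session "$SESSION" state |
        grep -F -- "] $2" | head -n 1 | sed 's/^[^0-9]*\([0-9]*\).*/\1/')
  if [ -z "$idx" ]; then
    echo "no element labelled: $2" >&2
    return 1
  fi
  "$BROWSER_ACT" --session "$SESSION" click "$idx" >/dev/null &&
    "$BROWSER_ACT" --session "$SESSION" wait stable >/dev/null &&
    "$BROWSER_ACT" --session "$SESSION" get markdown
}

# Example call (needs an open session named whale-test):
# open_and_extract https://www.whaletv.com/careers \
#   "Head of Ad Sales, Emerging Markets" > job.md
```

Matching on the label instead of a hard-coded index is a deliberate choice here, since I would not expect index numbers to stay the same across page loads.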

During my evaluation, I successfully tested the following BrowserAct capabilities:

**Content Extraction**

```
browser-act stealth-extract <url>
```

**Browser Profile Discovery**

```
browser-act browser list-profiles
```

**Browser Creation**

```
browser-act browser create
```

**Browser Sessions**

```
browser-act browser open
```

**Page Inspection**

```
browser-act state
```

**Browser Interaction**

```
browser-act input <index> "<text>"
browser-act click <index>
browser-act navigate <url>
browser-act wait stable
```

**Content Extraction From Active Sessions**

```
browser-act get markdown
```

Through this testing, I was able to evaluate BrowserAct across both extraction and browser automation workflows.

During testing, I couldn't help comparing BrowserAct to tools such as Playwright and Selenium.

Traditional browser automation usually involves:

BrowserAct takes a different approach.

Instead of building automation through code, many workflows can be performed directly through CLI commands.

For quick tasks such as:

the workflow feels significantly lighter.

That doesn't replace full automation frameworks, but it does make many common browser tasks much faster to execute.

Based on my testing, BrowserAct could be useful for:

**AI Agent Development**

Developers building AI agents that need browser capabilities.

**Research Workflows**

Collecting information from websites and extracting structured content.

**Browser Automation**

Automating simple browser interactions without building a complete automation framework.

**Developer Tooling**

Internal tools that need browser-based capabilities.

**Content Extraction**

Extracting structured content from modern websites.

My goal was simple: install BrowserAct, test it on real websites, and determine whether it could handle practical browser automation workflows.

During testing I was able to:

✅ Install BrowserAct

✅ Extract content from multiple websites

✅ Extract detailed job descriptions

✅ Import an existing Chrome profile

✅ Create a browser

✅ Open browser sessions

✅ Navigate websites

✅ Inspect page structure

✅ Interact with page elements

✅ Extract content after navigation

Most importantly, I was able to complete an entire workflow:

Create Browser → Open Session → Navigate → Inspect → Click → Wait → Extract

using CLI commands instead of writing browser automation code.

For developers interested in browser automation, AI agents, content extraction, or developer tooling, BrowserAct is definitely worth exploring.

This was my first round of testing, and I focused primarily on installation, content extraction, browser creation, session management, navigation, and page interaction.

There are still several capabilities I haven't explored in depth yet, including verification handling, remote human handoff, screenshots, network inspection, multi-session workflows, and some of BrowserAct's more advanced browser management features.

Those areas deserve their own dedicated evaluation, which I'll likely cover in a follow-up article after additional testing.

This article focused on validating the core BrowserAct workflow:

In future testing, I plan to explore:

`remote-assist`

If those tests are successful, I'll publish a follow-up article documenting the results.

Try BrowserAct:

[https://browseract.com?fpr=aryan21](https://browseract.com?fpr=aryan21)

BrowserAct GitHub Skills:

[https://github.com/browser-act/skills](https://github.com/browser-act/skills)

BrowserAct Documentation:

This article includes an affiliate link. If you decide to try BrowserAct through my referral link, I may earn a commission at no additional cost to you.
