Security researchers convinced six AI browsers they were playing a game. The browsers then handed over their users’ passwords and treated it as a win.
The firm behind it, LayerX, calls the technique BioShocking, and says it worked on every agent it tried. The list reads like a roll-call of the new AI browser market: OpenAI’s ChatGPT Atlas, Perplexity’s Comet, Anthropic’s Claude extension for Chrome, and three smaller players, Fellou, Genspark, and Sigma.
The name nods to the video game BioShock, in which a brainwashed character obeys the trigger phrase “Would you kindly?” The attack runs on the same idea. Convince the AI that the normal rules do not apply, and it stops applying them.
How a maths puzzle breaks the rules #
An AI browser in agent mode does not just read pages. It clicks, types, and reaches into any site you have already logged into. That access is the point, and it is also the danger.
The attack starts on a booby-trapped web page built as a puzzle. To fit its dystopian theme, the puzzle rewards wrong answers, insisting that two plus two equals five. Once the agent accepts that “wrong” is the winning move, it switches from safety logic to game logic. From there, the researchers simply made stealing credentials the next level.
The final step told the agent to fetch a hidden “code” from another page. That page redirected to the victim’s work GitHub repository, where the agent pulled SSH login details and passed them to the attacker. Not one of the six agents flagged the theft. Afterward, they reported it as a completed objective.
LayerX used a harmless plaintext file in its test. In a real attack, the same trick could point the agent at anything it can reach in that session: open tabs, signed-in accounts, internal tools, a password manager.
Why the guardrails fold #
The root cause is old, and it has no easy fix. To an AI browser, the web page and your own instructions arrive as one stream of text. The agent cannot reliably tell a genuine command from a malicious one buried in a page. Researchers call this indirect prompt injection, and it has already hijacked AI agents from the biggest labs.
Guardrails exist to stop an AI from doing harm. Those rules assume the agent knows it sits in the real world. Change that assumption, and the rules lose their grip. No one hacks the AI in the traditional sense. The attacker simply talks it into the theft.
LayerX has pulled this off before. As The Hacker News noted, the firm previously showed that a single click could hijack Comet and quietly steal data. It is the same weakness that has let attackers steal credentials through AI coding tools and slip malicious skills past security scanners. Handing an agent the keys to your accounts turns a clever jailbreak into real access.
The vendors’ patchy response #
LayerX told each vendor between October 2025 and January 2026, before going public. The responses varied, and the gaps should worry anyone who uses these tools.
OpenAI fixed the flaw in ChatGPT Atlas. Anthropic tried to patch its Claude extension, but LayerX says the fix did not hold. Perplexity closed the report without acting on it. Fellou, Genspark, and Sigma never replied at all. So most of the browsers tested may still fall for the trick today.
LayerX wants vendors to add a simple check before an agent reads from a logged-in account. One prompt, “I’m about to copy data from your GitHub repository. Continue?”, would break the chain. It also wants agents to notice when a page tells them the usual rules no longer apply, and to let users cap what an agent can touch. Winning a game is no reason to open a private repository.
What to do now #
For users, the advice is short. Treat agent mode with care. Whatever you have logged into is fair game, so decide what the browser should see and revoke that access when you finish. Companies now securing fleets of AI agents face the same problem at scale. The wider lesson goes beyond one puzzle. The industry is racing to put AI agents in charge of real tasks on real accounts, from email to code. BioShocking is a reminder that these agents trust the world they are shown. Change the story they are told, and they will follow it, right out of their own guardrails.
Get the TNW newsletter #
Get the most important tech news in your inbox each week.