The risks of rogue AI agents are well-documented. Yet for many companies, the potential rewards are too great to ignore. Now, firms like Cursor and 1Password are in a race to build solutions that help users harness that power without losing control.
In an exclusive interview for The Deep View, I spoke with 1Password CTO Nancy Wang and Cursor Security Lead Travis McPeak, two practitioners building the infrastructure that AI agents run on, to understand what safe deployment actually looks like in practice. Their verdict: the risks are real, the margin for error is slim, and the difference between a useful agent and a dangerous one often comes down to oversight architecture.
"Agents are tools, and if you break something [like] a hammer, you don't blame the hammer," said McPeak. "These are all tools, and they should be driven by a responsible human in the business who's trying to get the job done and understands both the strengths and weaknesses of the system."
As to weaknesses, both McPeak and Wang explained that these agents, when stuck, are so determined to solve a problem that they become creative in dangerous ways, taking actions they are not authorized to take and beginning to "spiral" out of control.
The unintended actions agents can take are often destructive. Amazon's internal AI coding agent Kiro autonomously deleted a production environment, causing a 13-hour service disruption, and OpenClaw deleted over 200 emails from the inbox of Meta Superintelligence Labs' director of alignment, even as she repeatedly commanded it to stop.
However, both explain that there are actions you can take to mitigate the risks:
Continuous monitoring: "Govern the stay, don't just govern the access. Traditional access management systems really give you access, and then they walk away… but with agents and just given their blast radius, you really have to monitor what they're doing with that access," said Wang.
Add intelligence for approvals: "You can use a completely separate intelligent model that has its own goals and really get to a good spot with the primary, I think that's a much more durable model, and then the net impact of that is, if it's implemented well, a human will get much [fewer] approvals to send, and when they actually get one, it's because something interesting is happening," McPeak said.
Keep credentials private: "Don't give credentials to your agents. As tempting as it is to just copy your API key right into your context window, seriously, don't do it, because it might come out in another context, it might get used against your wishes," said Wang.
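For readers who want to see how these three mitigations might fit together, here is a minimal Python sketch. It is an illustration under stated assumptions, not how 1Password or Cursor implement anything: the names `Action`, `review`, `CredentialBroker`, and `run_action` are hypothetical, and the rule-based `review` function is only a stand-in for the separate reviewing model McPeak describes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    """One step the agent wants to take (hypothetical shape)."""
    tool: str          # e.g. "db" or "email"
    target: str        # e.g. "staging/orders"
    destructive: bool  # True for deletes, drops, overwrites


def review(action: Action) -> str:
    """Stand-in for a separate reviewer model: returns 'allow' or 'escalate'.

    Assumption: a real deployment would call an independent model here;
    these simple rules only illustrate the control flow.
    """
    if action.destructive or action.target.startswith("prod"):
        return "escalate"
    return "allow"


class CredentialBroker:
    """Holds secrets outside the agent's context window.

    The agent only names a tool; the broker attaches the secret at call time.
    """

    def __init__(self, secrets: dict[str, str]) -> None:
        self._secrets = secrets

    def call(self, action: Action, executor: Callable[[Action, str], str]) -> str:
        return executor(action, self._secrets[action.tool])


def run_action(
    action: Action,
    broker: CredentialBroker,
    executor: Callable[[Action, str], str],
    ask_human: Callable[[Action], bool],
    audit_log: list[tuple[Action, str]],
) -> str:
    """Log every action, auto-approve routine ones, escalate the rest."""
    verdict = review(action)
    audit_log.append((action, verdict))  # continuous monitoring: record each step
    if verdict == "escalate" and not ask_human(action):
        return "blocked"
    return broker.call(action, executor)


if __name__ == "__main__":
    log: list[tuple[Action, str]] = []
    broker = CredentialBroker({"db": "example-secret"})

    def executor(action: Action, secret: str) -> str:
        # The secret is used here and never returned to the agent.
        return f"ran {action.tool} on {action.target}"

    def deny(action: Action) -> bool:
        return False  # a human reviewer who declines every escalation

    print(run_action(Action("db", "staging/orders", False), broker, executor, deny, log))
    print(run_action(Action("db", "prod/orders", True), broker, executor, deny, log))
```

In this sketch the human is asked only about escalated actions, which mirrors the goal of fewer, more meaningful approval requests, and the audit log captures every action whether or not it was approved.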
Despite the possibility that agents could go rogue, McPeak encourages everyone to give agents a shot, with the right precautions in place. "My biggest advice [is] you should probably look into it. It's quite powerful, and it's really transformational," he said. If you want to hear more insights from Wang and McPeak, you can find them in the Zero-Shot Learning show, dropping Tuesday at 9:00 a.m. Eastern.
Our Deeper View
Understanding the risks of agents is foundational to deploying them effectively in businesses and organizations. Once the risks are identified, users can more accurately determine the best use cases, which risks are worth taking, and which mitigations to implement. For instance, someone just starting out may be better served by manually approving actions before adopting the approach described in this article, in which a separate agent monitors that process for you.