# Microsoft identifies seven new ways AI agents can be hacked

> Source: <https://www.infoworld.com/article/4181844/microsoft-identifies-seven-new-ways-ai-agents-can-be-hacked.html>
> Published: 2026-06-05 17:14:24+00:00

Microsoft has identified seven new failure modes in agentic AI systems, in addition to those it identified last year in its first [Taxonomy of Failure Modes in Agentic AI Systems](https://cdn-dynmedia-1.microsoft.com/is/content/microsoftcorp/microsoft/final/en-us/microsoft-brand/documents/Taxonomy-of-Failure-Mode-in-Agentic-AI-Systems-Whitepaper.pdf).

Four things contributed to the growing list of [ways agentic AI can go wrong](https://www.infoworld.com/article/4040909/why-ai-fails-at-business-context-and-what-to-do-about-it.html): the speed at which the technology went mainstream, the growing maturity of the Model Context Protocol (MCP) ecosystem, the rise of computer-use agents, and the accumulation of empirical evidence as researchers gathered more real-world findings.

The [seven new failure modes](https://www.microsoft.com/en-us/security/blog/2026/06/04/updating-taxonomy-failure-modes-agentic-ai-systems-year-red-teaming-taught-us/) Microsoft has identified are:

- Agentic Supply Chain Compromise: an agent’s behavior can be altered through natural language rather than malicious code;
- Goal Hijacking: adversarial instructions appear aligned with legitimate task completion while silently redirecting the agent’s terminal goal;
- Inter-Agent Trust Escalation: a compromised agent asserts a false identity or inflates its claimed permissions to an orchestrator;
- Computer Use Agent (CUA) Visual Attack: agents operating through graphical interfaces can be manipulated through on-screen content that carries adversarial instructions for the agent;
- Session Context Contamination: an adversary introduces data that biases the agent’s reasoning in subsequent steps without triggering safety controls at any individual step;
- MCP / Plugin Abuse: an update to the original taxonomy’s coverage of function compromise, focusing on the attack surfaces specific to MCP and plugin protocols;
- Capability / Architecture Disclosure: an agent reveals internal implementation details such as tool names and schemas, system-prompt structure, memory interfaces, or consent/human-in-the-loop trigger logic.

Microsoft advises security teams that use these definitions in their planning to take four steps: inventory their supply chain, generating a software bill of materials (SBOM) for every deployed agent; verify agent identity cryptographically, not positionally, by issuing [attestable credentials](https://www.csoonline.com/article/4163365/what-cisos-need-to-get-right-as-identity-enters-the-agentic-era.html) at provisioning; add the seven new failure modes to their red-team coverage matrix; and audit the human-in-the-loop user experience as a security control.
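
To illustrate what verifying agent identity cryptographically rather than positionally could look like, here is a minimal Python sketch. It is not taken from Microsoft's guidance: the function names, the credential layout, and the use of a shared HMAC key are all assumptions made for illustration, and a production system would more likely rely on asymmetric signatures with keys held in a managed key store. The idea is that an orchestrator accepts only the identity and permissions that were signed at provisioning, so a compromised agent cannot inflate its claims.

```python
import hashlib
import hmac
import json
from typing import List, Optional

# Hypothetical provisioning secret, hard-coded only to keep the sketch short.
PROVISIONING_KEY = b"demo-key-not-for-production"


def _sign(claims: dict) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON encoding of the claims."""
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return hmac.new(PROVISIONING_KEY, payload, hashlib.sha256).hexdigest()


def issue_credential(agent_id: str, permissions: List[str]) -> dict:
    """Issue a signed credential for an agent at provisioning time."""
    claims = {"agent_id": agent_id, "permissions": sorted(permissions)}
    return {"claims": claims, "tag": _sign(claims)}


def verify_credential(credential: dict) -> Optional[dict]:
    """Return the claims if the tag matches, otherwise None.

    The orchestrator trusts only what was signed, not what the agent asserts.
    """
    expected = _sign(credential["claims"])
    if hmac.compare_digest(expected, credential["tag"]):
        return credential["claims"]
    return None


if __name__ == "__main__":
    cred = issue_credential("summarizer", ["read:docs"])
    print(verify_credential(cred))

    # A compromised agent inflates its permissions but reuses the old tag.
    forged = {
        "claims": {"agent_id": "summarizer", "permissions": ["admin", "read:docs"]},
        "tag": cred["tag"],
    }
    print(verify_credential(forged))
```

In this sketch the forged credential is rejected because the tag no longer matches the altered claims, which is the property that guards against the Inter-Agent Trust Escalation failure mode listed above.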
