Agentjacking: your AI agent is now a privileged attack surface

A new class of attack called agentjacking emerged in mid-2026, where attackers hide instructions in data that AI agents read, causing the agent to execute malicious actions with its own privileges. The attack bypasses perimeter security because every step is technically authorized. The Athreix team provides a hardening checklist including separating data from instructions, least agency, confirmation gates, short-lived credentials, auditing, and prompt-injection tests in CI.

TL;DR: If an AI agent can read external data and also take actions, an attacker can hide instructions inside the data it reads. The agent cannot reliably tell a real instruction from a poisoned one, so it runs the attacker's intent with the agent's own privileges. Perimeter tools never see it because every step is authorized. Here is the attack model and a concrete hardening checklist.

A new class of attack surfaced in mid-2026, often called agentjacking. The setup is mundane: an agent reads an error report, a support ticket, a webpage, or a tool result to do its job. An attacker plants text in that source with hidden instructions. When the agent ingests it, the model treats the attacker's text as guidance and acts on it, with whatever access the agent already had. No firewall fires. No endpoint scanner flags it. Every call in the chain is technically legitimate.

This is the agentic version of an old truth: an LLM cannot reliably separate instructions from data. The moment you give that model tools and standing access, the blast radius stops being a bad answer and becomes a real action. A chatbot produces text. An agent produces effects: it queries a database, moves a file, approves a transaction, calls an API.

The numbers around production deployments are not reassuring. Most organizations running agents have already had a confirmed or suspected security incident, and only a small fraction went live with full security sign-off.
The deployment velocity is far ahead of the controls.

Treat the agent like a powerful new hire you do not fully trust yet.

1. Separate the data plane from the instruction plane. Content retrieved from tools is information, never commands. Make that explicit in how you assemble context.

Wrap untrusted tool output so it is clearly data, not instructions.

def as_evidence(source: str, content: str) -> str:
    return f"