MCP: defending the runtime layer of agent security

Agent security has four layers (identity, pre-deploy testing, observability, and runtime defense), but only the runtime layer can stop a malicious tool call from executing. Identity providers, observability platforms, and testing tools operate before or after an action; the runtime hot path between an agent deciding to invoke a tool and the tool executing remains a defensive gap that no existing SaaS solution can fill without introducing latency. Arcis's v1.6.0 release targets this gap with vector V32, making runtime defense feasible by porting HTTP-style request protections (allowlisting, sanitization, and refusal) to the agent tool-call boundary.

Agent identity tells you who. Observability tells you what happened. Pre-deploy testing tells you what could happen in dev. None of them stop the agent from actually firing a poisoned tool call at runtime. That gap is where defense lives now.

The takeaways.

- Agent security has four layers: identity, pre-deploy testing, observability, and runtime defense. Only one of them can refuse a request.
- The runtime layer is structurally underserved. Identity and observability companies cannot sit in the request hot path without becoming inline middleware, which they are not.
- MCP's explicit tool-call contract makes runtime defense feasible. A tool with a known argument schema is a request with known shape.
- The same techniques that protect HTTP request boundaries (allowlist, sanitize, refuse) port directly to the agent boundary.

Six months ago, "agent security" was mostly a research problem. Today it is a category. There are now five-plus YC-backed companies whose pitch decks are built on some version of "protect the agent." Identity providers for agents. Observability for agent traces. Pre-deploy red-teaming for agent harnesses. Offensive testing of agentic systems. Each one stakes out a layer of a stack that did not exist eighteen months ago.

What none of them sits in is the request hot path. The moment between an agent deciding to invoke a tool and the tool actually executing is a defensive gap. It is also the only gap where you can stop something bad from happening, because every other layer runs either before (testing) or after (observability) the action itself.

This is the layer that Arcis has been building toward since the start, and the layer that v1.6.0 made explicit with vector V32: agent toolcall injection.

The agent stack, by layer

Before the gap argument, the stack itself. Four layers, top to bottom, in the order they touch a request.

Identity. Who is this agent. Is it authenticated, is it the same one we issued credentials to last week, has its private key been compromised. Companies in this space include the Auth0-shaped incumbents and a new generation aimed at agent identity specifically, where the entity being authenticated is not a human and may not have a stable session.

Pre-deploy testing. Before this agent goes to production, what could it do under hostile prompts. Automated red-teaming, fuzz testing of tool combinations, adversarial prompt generation. Mostly a CI-time concern: run the test harness, find the dangerous tool combinations, patch the prompts or remove the tools.

Observability. The agent is running. What did it do. Which tools did it call, with which arguments, in which sequence. How long did it take. Did it error. This is the Datadog or Honeycomb of agent runs. Mostly a post-hoc concern: you find the bad call by analyzing traces, which means the bad call already happened.

Defense. The agent is mid-execution. It just decided to call a tool. Should this specific call, with these specific arguments, at this specific moment, be allowed to proceed. This is the runtime hot path.
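
As a rough illustration of what a check in the hot path could look like, here is a minimal in-process sketch of the allowlist, sanitize, refuse sequence applied to a single tool call. Every name in it (ToolCall, ToolCallRefused, ALLOWED_TOOLS, guard, execute) and the example tool schemas are hypothetical, chosen for illustration; this is not Arcis's API, and a real guard would need far richer validation.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical names throughout: an illustrative sketch, not Arcis's API.


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: dict[str, Any]


class ToolCallRefused(Exception):
    """Raised when a call is refused before the tool executes."""


# Allowlist: tool name -> expected argument names and types (assumed schemas).
ALLOWED_TOOLS: dict[str, dict[str, type]] = {
    "refund_order": {"order_id": str, "amount": float},
    "send_email": {"recipient": str, "body": str},
}

# Matches ASCII control characters other than tab and newline.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def guard(call: ToolCall) -> ToolCall:
    """Return a sanitized copy of the call, or raise ToolCallRefused."""
    schema = ALLOWED_TOOLS.get(call.name)
    if schema is None:
        raise ToolCallRefused(f"tool not allowlisted: {call.name}")
    if set(call.args) != set(schema):
        raise ToolCallRefused(f"unexpected argument set for {call.name}")

    clean: dict[str, Any] = {}
    for key, expected in schema.items():
        value = call.args[key]
        if not isinstance(value, expected):
            raise ToolCallRefused(f"bad type for {call.name}.{key}")
        # Sanitize: strip control characters from string arguments.
        clean[key] = _CONTROL_CHARS.sub("", value) if isinstance(value, str) else value
    return ToolCall(call.name, clean)


def execute(call: ToolCall, tools: dict[str, Callable[..., Any]]) -> Any:
    """Run the guard in the same process, then dispatch to the tool."""
    safe = guard(call)
    return tools[safe.name](**safe.args)
```

The guard runs as a plain function call in front of the dispatch, so a refusal happens before the tool executes and no network round-trip is added to the call.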

Defense is also the layer that is structurally hardest to deliver as SaaS: it has to run in the same process as the agent's execution loop, or you have introduced a network round-trip into every tool call, which kills latency.

The gap argument

Three of those four layers are advisory. Identity tells you who the agent is, but does not constrain what it can do. Observability tells you what happened, but only after it happened. Pre-deploy testing tells you what could happen in dev, but the production prompt is not the dev prompt, and the production tool set is not the dev tool set.

The defense layer is the only one that can prevent. The others can detect, alert, audit, and triage. Only defense can refuse.

This is the same argument that has played out at every layer of web security for thirty years. WAFs detect, IDSes alert, SIEMs audit. The inline middleware that runs in your handler is the only thing that can refuse a request that should not have been honored. The agent story is the same story, set inside a different system.

What toolcall injection looks like

The shape of the attack, at a conceptual level, is straightforward. An agent has access to a set of tools. The agent is given a user prompt. The agent decides which tools to call and with what arguments. The model itself is the decision-maker, and the model can be manipulated by content in the prompt.

Consider a customer-support agent with access to two tools: refund_order(order_id, amount) and send_email(recipient, body). A user message arrives. It says something polite about a damaged product. Embedded in the message, in a way that the agent may or may not parse as part of the user instruction, is a sequence of characters that resembles a tool-call output marker, followed by an instruction to refund a different order to a different account.

Concretely, the request that hits your agent might look like this:

{ "role": "user", "content": "Hi, my package arrived damaged.\n\n