{"slug": "stop-trusting-the-agent-bind-tool-call-approvals-to-the-exact-call", "title": "Stop trusting the agent: bind tool-call approvals to the exact call", "summary": "A developer argues that using a boolean flag for tool-call approvals in agentic systems is insecure and proposes binding approvals to specific calls via HMAC-signed tokens. The approach prevents replay, argument drift, and principal-swap attacks by including call ID, canonical argument digest, principal, and expiry in the token. The solution requires deterministic key generation for replay-based durable execution engines.", "body_md": "Agentic systems gate dangerous tool calls (file writes, money movement, deploys) behind an \"approval\": a human-in-the-loop click, or a policy check. Look at how that approval is usually represented and you'll often find a boolean sitting in the run/session state: `approved: true`.\n\nA boolean is the wrong primitive, and it fails in three ways that prompt injection is happy to exploit.\n\n`false` into `true`.\n\n`report.csv`\". The approval is just `true`, so the same flag is honored for the `prod.db`\". The boolean doesn't know which call it approved.\n\n`approved`.\n\nThe root cause is the same in all three: the approval is modeled as a **property of the run**, when it should be **evidence for one specific call**.\n\nWhen approval is granted, mint a tag over the things that must not change: the tool-call id, a digest of the canonical arguments, the principal, and an expiry. 
Verify it at dispatch, against a per-run secret.\n\n```python\nimport hmac, hashlib, json, time\n\ndef canon(args: dict) -> bytes:\n    # canonical serialization so benign reserialization doesn't invalidate a token.\n    # (production: RFC 8785 JCS, which also normalizes numbers, e.g. 10 vs 10.0)\n    return json.dumps(args, sort_keys=True, separators=(\",\", \":\")).encode()\n\ndef mint(key: bytes, call_id: str, args: dict, principal: str, ttl: int = 300) -> dict:\n    exp = int(time.time()) + ttl\n    digest = hashlib.sha256(canon(args)).hexdigest()\n    msg = f\"{call_id}|{digest}|{principal}|{exp}\".encode()\n    tag = hmac.new(key, msg, hashlib.sha256).hexdigest()\n    return {\"call_id\": call_id, \"principal\": principal, \"exp\": exp, \"tag\": tag}\n\ndef verify(key: bytes, tok: dict, call_id: str, args: dict, principal: str) -> bool:\n    if tok.get(\"call_id\") != call_id:      return False   # replay onto another call\n    if tok.get(\"principal\") != principal:  return False   # wrong principal\n    if tok.get(\"exp\", 0) < time.time():    return False   # expired\n    digest = hashlib.sha256(canon(args)).hexdigest()\n    msg = f\"{call_id}|{digest}|{principal}|{tok['exp']}\".encode()\n    expect = hmac.new(key, msg, hashlib.sha256).hexdigest()\n    return hmac.compare_digest(expect, tok[\"tag\"])         # forged / flipped / arg-drift\n```\n\nRun the three attacks against it (plus principal-swap and a forged tag):\n\n```python\nKEY = b\"per-run-secret-not-a-global-one\"\ntok = mint(KEY, \"call-1\", {\"amount\": 10, \"to\": \"alice\"}, \"user:42\")   # approve $10 to alice\n\nverify(KEY, tok, \"call-1\", {\"amount\": 10,    \"to\": \"alice\"}, \"user:42\")  # True   legit\nverify(KEY, tok, \"call-2\", {\"amount\": 10,    \"to\": \"alice\"}, \"user:42\")  # False  replay\nverify(KEY, tok, \"call-1\", {\"amount\": 10000, \"to\": \"alice\"}, \"user:42\")  # False  arg drift\nverify(KEY, tok, \"call-1\", {\"amount\": 10,    \"to\": \"alice\"}, \"user:99\")  # False  
wrong principal\nverify(KEY, {**tok, \"tag\": \"00\"*32}, \"call-1\", {\"amount\": 10, \"to\": \"alice\"}, \"user:42\")  # False  forged\n```\n\nThe flag can no longer be flipped (no valid tag), replayed (call-id is in the MAC), or drifted (args digest is in the MAC). An attacker who fully controls the transported state still can't manufacture a token without the key.\n\n`10` vs `10.0` vs `1e1` must agree): RFC 8785 (JSON Canonicalization Scheme) is the off-the-shelf answer. Put the canonicalization recipe id inside the hashed bytes so the two sides can't silently disagree about the rules.\n\n`AUTO_FUNCTION_INVOCATION` filter (don't call `next` ⇒ the call is skipped), ADK's `before_tool` callback, or the MCP tool-call boundary. Tools that need approval are classified as such; anything unclassified is denied, not allowed through.\n\nIf your agent runs on a replay-based durable-execution engine (Temporal and friends), the per-run secret **must survive replay**. Workflow code is re-executed from history on recovery, so a key minted with a non-deterministic call won't match the token already in history: approvals verify fine in dev and then **fail closed after the first worker restart**, which is the worst possible time to discover it. Derive the key deterministically (`HKDF(server_secret, run_id)`) or establish it once via a recorded side-effect, and make the expiry deterministic too rather than reading wall-clock inside workflow code.\n\nAuthorization in an agent system shouldn't be ambient, mutable state that travels with the run. It should be **evidence bound to a single call envelope** (this principal, this tool, these exact arguments, until this time) that the executor re-verifies at the moment of dispatch. 
The boolean isn't a simplification of that; it's the bug.\n\n*I work on reliability and verification for AI and numerical systems — agent authorization, determinism, and \"prove the thing that claims to be authorized actually was.\" The snippet above is runnable as-is. Happy to compare notes if you're hardening an agent's tool boundary — GitHub.*", "url": "https://wpnews.pro/news/stop-trusting-the-agent-bind-tool-call-approvals-to-the-exact-call", "canonical_source": "https://dev.to/whatsonyourmind/stop-trusting-the-agent-bind-tool-call-approvals-to-the-exact-call-5080", "published_at": "2026-06-17 15:11:04+00:00", "updated_at": "2026-06-17 15:21:27.747798+00:00", "lang": "en", "topics": ["ai-agents", "ai-safety", "developer-tools"], "entities": ["Temporal", "MCP", "ADK", "RFC 8785", "HMAC", "SHA-256"], "alternates": {"html": "https://wpnews.pro/news/stop-trusting-the-agent-bind-tool-call-approvals-to-the-exact-call", "markdown": "https://wpnews.pro/news/stop-trusting-the-agent-bind-tool-call-approvals-to-the-exact-call.md", "text": "https://wpnews.pro/news/stop-trusting-the-agent-bind-tool-call-approvals-to-the-exact-call.txt", "jsonld": "https://wpnews.pro/news/stop-trusting-the-agent-bind-tool-call-approvals-to-the-exact-call.jsonld"}}