cd /news/ai-agents/stop-trusting-the-agent-bind-tool-ca… · home topics ai-agents article
[ARTICLE · art-31280] src=dev.to ↗ pub= topic=ai-agents verified=true sentiment=↑ positive

Stop trusting the agent: bind tool-call approvals to the exact call

A developer argues that using a boolean flag for tool-call approvals in agentic systems is insecure and proposes binding approvals to specific calls via HMAC-signed tokens. The approach prevents replay, argument drift, and principal-swap attacks by including call ID, canonical argument digest, principal, and expiry in the token. The solution requires deterministic key generation for replay-based durable execution engines.

read3 min views1 publishedJun 17, 2026

Agentic systems gate dangerous tool calls — file writes, money movement, deploys — behind an "approval": a human-in-the-loop click, or a policy check. Look at how that approval is usually represented and you'll often find a boolean sitting in the run/session state: approved: true

.

A boolean is the wrong primitive, and it fails in three ways that prompt injection is happy to exploit.

false

into true

.report.csv

". The approval is just true

, so the same flag is honored for the prod.db

". The boolean doesn't know which call it approved.approved

.The root cause is the same in all three: the approval is modeled as a property of the run, when it should be evidence for one specific call.

When approval is granted, mint a tag over the things that must not change: the tool-call id, a digest of the canonical arguments, the principal, and an expiry. Verify it at dispatch, against a per-run secret.

import hmac, hashlib, json, time

def canon(args: dict) -> bytes:
    return json.dumps(args, sort_keys=True, separators=(",", ":")).encode()

def mint(key: bytes, call_id: str, args: dict, principal: str, ttl: int = 300) -> dict:
    exp = int(time.time()) + ttl
    digest = hashlib.sha256(canon(args)).hexdigest()
    msg = f"{call_id}|{digest}|{principal}|{exp}".encode()
    tag = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return {"call_id": call_id, "principal": principal, "exp": exp, "tag": tag}

def verify(key: bytes, tok: dict, call_id: str, args: dict, principal: str) -> bool:
    if tok.get("call_id") != call_id:      return False   # replay onto another call
    if tok.get("principal") != principal:  return False   # wrong principal
    if tok.get("exp", 0) < time.time():    return False   # expired
    digest = hashlib.sha256(canon(args)).hexdigest()
    msg = f"{call_id}|{digest}|{principal}|{tok['exp']}".encode()
    expect = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expect, tok["tag"])         # forged / flipped / arg-drift

Run the three attacks against it (plus principal-swap and a forged tag):

KEY = b"per-run-secret-not-a-global-one"
tok = mint(KEY, "call-1", {"amount": 10, "to": "alice"}, "user:42")   # approve $10 to alice

verify(KEY, tok, "call-1", {"amount": 10,    "to": "alice"}, "user:42")  # True   legit
verify(KEY, tok, "call-2", {"amount": 10,    "to": "alice"}, "user:42")  # False  replay
verify(KEY, tok, "call-1", {"amount": 10000, "to": "alice"}, "user:42")  # False  arg drift
verify(KEY, tok, "call-1", {"amount": 10,    "to": "alice"}, "user:99")  # False  wrong principal
verify(KEY, {**tok, "tag": "00"*32}, "call-1", {"amount": 10, "to": "alice"}, "user:42")  # False  forged

The flag can no longer be flipped (no valid tag), replayed (call-id is in the MAC), or drifted (args digest is in the MAC). An attacker who fully controls the transported state still can't manufacture a token without the key.

10

vs 10.0

vs 1e1

must agree) — RFC 8785 (JSON Canonicalization Scheme) is the off-the-shelf answer. Put the canonicalization recipe id inside the hashed bytes so the two sides can't silently disagree about the rules.AUTO_FUNCTION_INVOCATION

filter (don't call next

⇒ the call is skipped), ADK's before_tool

callback, or the MCP tool-call boundary. Tools that need approval are classified as such; anything unclassified is denied, not allowed through.If your agent runs on a replay-based durable-execution engine (Temporal and friends), the per-run secret must survive replay. Workflow code is re-executed from history on recovery, so a key minted with a non-deterministic call won't match the token already in history — approvals verify fine in dev and then fail closed after the first worker restart, which is the worst possible time to discover it. Derive the key deterministically (HKDF(server_secret, run_id)

) or establish it once via a recorded side-effect, and make the expiry deterministic too rather than reading wall-clock inside workflow code.

Authorization in an agent system shouldn't be ambient, mutable state that travels with the run. It should be evidence bound to a single call envelope — this principal, this tool, these exact arguments, until this time — that the executor re-verifies at the moment of dispatch. The boolean isn't a simplification of that; it's the bug.

I work on reliability and verification for AI and numerical systems — agent authorization, determinism, and "prove the thing that claims to be authorized actually was." The snippet above is runnable as-is. Happy to compare notes if you're hardening an agent's tool boundary — GitHub.

── more in #ai-agents 4 stories · sorted by recency
── more on @temporal 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/stop-trusting-the-ag…] indexed:0 read:3min 2026-06-17 ·