{"slug": "traces-show-what-your-agent-did-a-decision-ledger-shows-what-it-was-allowed-to", "title": "Traces show what your agent did - a decision ledger shows what it was allowed to do", "summary": "A developer introduces a decision ledger for agent observability that records not just what happened but what was authorized. The ledger uses hash-bound records and a chain structure to enable verifiers to prove that executed actions match authorized decisions, catching failures like dropped entries or swapped outcomes. The approach also enforces a bijection between execution traces and decision records to ensure no tool runs off-ledger.", "body_md": "Agent observability has gotten good at answering **what happened**: OpenTelemetry spans for each model call and tool execution, structured event logs, replayable traces. If a run misbehaves, you can reconstruct the sequence.\n\nBut for anything that has to stand up to an incident review or a compliance ask, \"what happened\" isn't the question. The question is **what was authorized**:\n\nEvery one of those passes through a decision point in your agent runtime — a policy callback, a confirmation gate, a per-tool auth check. But traces describe **execution**; almost nothing writes down the **authority**. That's the gap a decision ledger fills.\n\nHere's the part that took me a while to get right: a decision ledger that's just \"more events\" buys you nothing. To be *auditable* rather than merely verbose, it has to support a verifier that can prove **executed == authorized** without trusting the agent's own narration. That decomposes into three layers, and each catches a failure the others can't.\n\nEach decision and each outcome is a well-formed, canonicalized, hash-bound record. 
The load-bearing field is on the *outcome*: it must commit to the decision that authorized it.\n\n```\ndecision_event = { decision_id, action_ref, principal, auth_mode,\n                   policy_version, decision_state, args_digest, ts }\n\noutcome_event  = { action_ref,\n                   decision_digest = SHA256(JCS(decision_event)),\n                   result_digest, terminal_state, ts }\n```\n\n`action_ref` answers *\"are these two events about the same intended action?\"* — make it content-derived (e.g. `SHA256(JCS({agent_id, action_type, scope, ts}))`) so any verifier can recompute it from the intent alone, with no shared runtime state.\n\n`decision_digest` answers a *different* question: *\"did this outcome commit to the exact decision that authorized it?\"* Keep the two separate — collapsing them loses your ability to catch a **swapped outcome** (a result re-attributed to the wrong decision).\n\nLayer 1 can only reason about entries that *exist*. It cannot see an entry that was **never written** — and that's the highest-stakes failure for incident response, because a tool call that bypassed the policy path (or a crash between authority-grant and ledger-write) looks like *silence*, not a malformed row.\n\nClose it by chaining: each entry carries `prev_digest` pointing at the prior ledger head, and each turn/session close records the current `ledger_head_digest`. Now the ledger is an append-only chain, and a dropped entry shows up as a **broken chain** — detectable without trusting the writer.\n\nThis catches two things Layer 1 can't:\n\n`allowed`, the handler then raises or times out, and no outcome is ever written. Indistinguishable from \"allowed and silently succeeded\" `allowed`.\n\n⚠️ Concurrency gotcha. If your agent runs tool calls in parallel (most frameworks do), a naive `prev_digest` chain forks: two appends both chain to head `H`, and a fork becomes indistinguishable from a drop. 
Two fixes: serialize the append (single-writer per session: a lock or a monotonic sequence, even while the tools themselves run concurrently), or model the ledger as an explicit DAG where each entry records a parent set and the head is a Merkle root over the closed frontier. Pick one, and make sure the verifier knows which shape it's checking: a linear verifier must reject forks; a DAG verifier must accept shared parents.\n\nThe final layer ties the ledger back to the execution trace you already emit. Require a **bijection at the action boundary**:\n\nevery executed tool span maps to exactly one `allowed` decision and exactly one terminal outcome — and vice versa.\n\nThe trace proves execution *happened*; the ledger proves it was *authorized*; the bijection between them is the \"**no tool executes off-ledger**\" invariant. It's the omission detector that Layer 1's per-entry rules structurally cannot express, because it reasons across two independent systems.\n\nPut together, the invariant a verifier can now assert is:\n\nNothing executed unauthorized, and nothing authorized vanished.\n\nThat's the actual compliance property — and you cannot get it from logging alone, no matter how thorough. 
Per-entry conformance proves each record is well-formed and bound; the chain proves the *set* is complete; the bijection proves the set matches reality.\n\nThe deeper principle is one I keep coming back to: a step that *reasons* can only ask you to trust it; a step that emits a **re-checkable artifact** — a content hash, a solver's optimality certificate, a recomputable digest — turns \"we logged it\" into \"anyone can re-run it and get the same answer.\" Move the factual, state-changing parts of an agent through deterministic tools that leave certificates, and the audit stops being a leap of faith.\n\n(That re-checkable-certificate idea is what I've been building into [OraClaw](https://github.com/Whatsonyourmind/oraclaw) — deterministic decision tools that return verifiable results — but the three-layer ledger above is framework-agnostic; it's worth wiring into whatever runtime you're on.)\n\nIf you're building agents that will ever face an auditor, the cheapest time to add the ledger is before you need it.", "url": "https://wpnews.pro/news/traces-show-what-your-agent-did-a-decision-ledger-shows-what-it-was-allowed-to", "canonical_source": "https://dev.to/whatsonyourmind/traces-show-what-your-agent-did-a-decision-ledger-shows-what-it-was-allowed-to-do-18b5", "published_at": "2026-06-25 12:11:20+00:00", "updated_at": "2026-06-25 12:12:52.948011+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "ai-safety"], "entities": ["OpenTelemetry"], "alternates": {"html": "https://wpnews.pro/news/traces-show-what-your-agent-did-a-decision-ledger-shows-what-it-was-allowed-to", "markdown": "https://wpnews.pro/news/traces-show-what-your-agent-did-a-decision-ledger-shows-what-it-was-allowed-to.md", "text": "https://wpnews.pro/news/traces-show-what-your-agent-did-a-decision-ledger-shows-what-it-was-allowed-to.txt", "jsonld": "https://wpnews.pro/news/traces-show-what-your-agent-did-a-decision-ledger-shows-what-it-was-allowed-to.jsonld"}}