Source: github.com · Topic: ai-agents

AATF – An open spec for recording why AI agents make decisions

The Agent Audit Trail Format (AATF) is an open specification for recording why AI agents make decisions, including alternatives considered, confidence scores, and rejected options. It provides a structured, tamper-evident format for accountability, distinct from logging or tracing tools. The project includes a reference SDK and aims to improve transparency in AI agent decision-making.

Published Jun 16, 2026 · 4 min read

The open specification and reference SDK for recording AI Agent decision chains.


AATF is not another logging library. It's an open specification for recording why an AI Agent made each decision — including what alternatives it considered, how confident it was, and what it chose not to do.

Think of it as:

  • OpenTelemetry → for observability
  • AATF → for Agent decision accountability

User asks: "Book a flight to Shanghai"

Step 1: [human_input]  → User request received
Step 2: [reasoning]    → Intent: flight booking (confidence: 0.95)
                          Alt: hotel booking → rejected (user said "flight")
                          Alt: train booking → rejected (user said "flight")
Step 3: [tool_call]    → flight_search_api (342ms) → 3 results
Step 4: [reasoning]    → Decision: CA1234 at ¥2580 (confidence: 0.88)
                          Alt: MU5678 at ¥2890 → rejected (¥310 more)
                          Alt: CZ9012 at ¥3200 → rejected (over budget)

→ SHA-256 hash chain: ✓ tamper-evident
→ PII redaction: ✓ email, phone (card numbers planned)
→ Export: JSON / CSV / HTML (AATF-compliant)

Quick Start (Python):

from agent_audit_trail import AuditSession, Decision, Alternative

with AuditSession(agent_id="my-agent") as session:
    session.add_reasoning_step(
        name="choose_tool",
        decision=Decision(
            input_summary="User wants weather info",
            decision="Use weather API",
            reasoning="Factual query requiring real-time data",
            confidence=0.95,
            alternatives_considered=[
                Alternative(description="Answer from memory",
                           reason_rejected="Weather changes constantly"),
                Alternative(description="Ask for clarification",
                           reason_rejected="Query is clear enough"),
            ]
        )
    )

That's it. Every decision is now recorded with its reasoning, confidence score, and rejected alternatives — in AATF-compliant format.

The heart of AATF is the Decision record:

{
  "type": "reasoning",
  "name": "intent_classification",
  "decision": {
    "input_summary": "User wants to book a flight to Shanghai",
    "decision": "Classified as flight-booking intent",
    "reasoning": "Explicit keywords: 'flight' + destination + budget",
    "confidence": 0.95,
    "confidence_basis": "All three slots explicitly stated by user",
    "alternatives_considered": [
      {
        "description": "Hotel booking intent",
        "reason_rejected": "User said 'flight', not 'hotel'",
        "score": 0.05
      },
      {
        "description": "Train booking intent",
        "reason_rejected": "User explicitly said 'flight'",
        "score": 0.02
      }
    ]
  },
  "step_hash": "458942bbf4162f4d9cca121d93b9423413ec..."
}
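The `step_hash` field is what makes a trail tamper-evident. The spec text quoted here does not say exactly how the hash is computed, so the following is only a minimal sketch of a SHA-256 hash chain using the standard library. The canonicalization (sorted-key JSON), the all-zero genesis value, and the helper names `hash_step`, `seal`, and `verify` are assumptions made for illustration, not the SDK's API.

```python
import hashlib
import json

# Assumption: the first step chains from a fixed all-zero "genesis" value.
GENESIS = "0" * 64


def hash_step(step: dict, prev_hash: str) -> str:
    """Hash a step's canonical JSON together with the previous step's hash."""
    # Exclude any stored step_hash so a step can be re-hashed for verification.
    body = {k: v for k, v in step.items() if k != "step_hash"}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def seal(steps: list[dict]) -> list[dict]:
    """Return copies of the steps, each with a chained step_hash attached."""
    sealed, prev = [], GENESIS
    for step in steps:
        prev = hash_step(step, prev)
        sealed.append({**step, "step_hash": prev})
    return sealed


def verify(steps: list[dict]) -> bool:
    """Recompute the chain and compare it with the stored hashes."""
    prev = GENESIS
    for step in steps:
        expected = hash_step(step, prev)
        if step.get("step_hash") != expected:
            return False
        prev = expected
    return True


trail = seal([
    {"type": "human_input", "name": "request"},
    {"type": "reasoning", "name": "intent_classification",
     "decision": {"confidence": 0.95}},
])
print(verify(trail))          # True: the chain is intact

trail[0]["name"] = "edited"   # simulate tampering with an earlier step
print(verify(trail))          # False: the stored hash no longer matches
```

Because each hash covers the previous one, editing any earlier step invalidates that step's hash and every hash after it, which is why the check does not depend on where the trail is stored.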
| Feature | What It Does | Why It Matters |
|---|---|---|
| alternatives_considered | Forces agents to list what they didn't choose | Proves the agent didn't just rationalize a foregone conclusion |
| confidence + confidence_basis | Numeric confidence + how it was determined | Lets auditors distinguish "95% sure because X" from "95% sure because vibes" |
| confidence_trajectory | Tracks confidence across the full decision chain | Reveals when an agent becomes more or less certain as it gathers information |
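To make the `confidence_trajectory` idea concrete, here is a small sketch that derives a trajectory from a list of step records shaped like the Decision record above. The function names, the step name `flight_selection`, and the output shape (a list of name/confidence pairs) are assumptions for illustration; the spec may define the field differently.

```python
def confidence_trajectory(steps: list[dict]) -> list[tuple[str, float]]:
    """Collect (step name, confidence) for every step that carries a decision."""
    return [
        (step["name"], step["decision"]["confidence"])
        for step in steps
        if "decision" in step and "confidence" in step["decision"]
    ]


def confidence_deltas(trajectory: list[tuple[str, float]]) -> list[float]:
    """Change in confidence between consecutive decisions, rounded to 2 places."""
    values = [confidence for _, confidence in trajectory]
    return [round(later - earlier, 2) for earlier, later in zip(values, values[1:])]


# Steps loosely modelled on the flight-booking walkthrough; names are hypothetical.
steps = [
    {"type": "human_input", "name": "request"},
    {"type": "reasoning", "name": "intent_classification",
     "decision": {"confidence": 0.95}},
    {"type": "tool_call", "name": "flight_search_api"},
    {"type": "reasoning", "name": "flight_selection",
     "decision": {"confidence": 0.88}},
]

trajectory = confidence_trajectory(steps)
print(trajectory)
print(confidence_deltas(trajectory))
```

A negative delta marks a point where the agent became less certain, which is the kind of shift an auditor would want to look at first.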

We respect the existing ecosystem. Here's where AATF fits:

| Tool | What It Does | What AATF Does Differently |
|---|---|---|
| Blockchain ledgers (Notary, Action Ledger) | Store agent actions on-chain for immutability | We're format-agnostic. Store wherever you want. We focus on what to record, not where. |
| LangChain callbacks | Framework-specific tracing | We're framework-agnostic. Works with CrewAI, AutoGen, raw Python, or anything. |
| MCP audit tools | Audit tool calls in MCP protocol | We go deeper: not just what tool was called, but why it was chosen over alternatives. |
| General logging (structlog, etc.) | Key-value event logs | We're structured for decision reasoning, not generic events. |

TL;DR: Other tools audit what the agent did. AATF audits why the agent did it.

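# LangChain integration.
# create_agent is not defined in this snippet; substitute your own agent constructor.
from agent_audit_trail.integrations.langchain import AATFCallbackHandler
agent = create_agent(callbacks=[AATFCallbackHandler()])

# OpenAI client wrapper.
from openai import OpenAI
from agent_audit_trail.integrations.openai import AATFOpenAIWrapper
client = AATFOpenAIWrapper(OpenAI())

# Generic decorator for any Python function.
from agent_audit_trail import audit_traced

@audit_traced(agent_id="my-agent")
def my_agent_function(query):
    return "answer"

Install:

pip install agent-audit-trail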

Zero external dependencies. Python 3.10+. 700 lines of pure stdlib.

We used AATF to audit ourselves — an AI Agent reflecting on its own product's flaws. The result is a tamper-evident, 10KB audit trail that proves every reasoning step was genuine and not post-hoc rationalized.

📄 View the full audit trail JSON

AATF is an open specification, not a product. The SDK is the reference implementation.

📋 Read the full AATF v0.1.0 Specification

This is a draft spec. We want your feedback. Open an issue if you disagree with any design decision. Especially:

  • Should alternatives_considered be mandatory or optional?
  • Is confidence (0.0-1.0) the right abstraction, or should we use qualitative labels?
  • What hash algorithm should be standard? (Currently SHA-256)
  • Should the format support streaming/traces that are still in-progress?

| Role | What You Get |
|---|---|
| Agent Developer | Prove your agent reasons well. Debug decision failures. Show stakeholders the full chain. |
| Compliance Officer | Machine-parseable audit trails that map to EU AI Act, GDPR, SOC2 requirements. |
| CISO | Tamper-evident hash chains. PII redaction built-in. Export for auditors. |
| Researcher | Structured data on agent reasoning patterns. Confidence trajectories. Decision trees. |

  • ✅ AATF Specification v0.1.0
  • ✅ Reference SDK (Python) — 134 tests passing
  • ✅ PII Redaction (email, phone)
  • ✅ Hash Chain Integrity Verification
  • ✅ LangChain / OpenAI / Generic Integrations
  • ✅ JSON / CSV / HTML Export
  • 🔲 PII Redaction expansion (credit card, SSN, API keys, IP)
  • 🔲 TypeScript/JavaScript SDK
  • 🔲 Community RFC process for spec changes
  • 🔲 LangChain/CrewAI published plugins

This project wants contributors. If you care about Agent accountability:

  • Read the SPEC: understand the format
  • Open an issue: disagree with something? We want to hear it
  • Build an integration: your framework? Your plugin is welcome
  • Spread the word: star, tweet, blog post

MIT. Use it, fork it, improve it. The spec belongs to everyone.

If your Agent can think, its thinking should be auditable.

