{"slug": "cryptographic-forensics-for-ai-coding-agent-sessions", "title": "Cryptographic Forensics for AI Coding Agent Sessions", "summary": "DEPOSE is a cryptographic forensic framework designed to create verifiable audit trails for AI coding agent sessions (like Claude Code or Codex CLI). It addresses the problem that standard session logs (JSONL files) are not trustworthy evidence because they can be altered by anyone with shell access. DEPOSE bundles session events into a directory with a hash chain, Ed25519 signatures, and RFC 3161 timestamps, allowing a separate, static Go verifier to confirm that no events were tampered with, even if the original machine is untrusted.", "body_md": "A Claude Code or Codex CLI session writes a JSONL file to disk. If the agent runs `rm -rf` on a training-data directory or `terraform destroy -auto-approve` on production, that file is where an incident review starts.\n\nA JSONL file is not evidence. Anyone with shell access can rewrite it. To a third party who doesn't trust the machine it came from, it proves nothing.\n\nThat gap matters once agents have credentials to real infrastructure. Most agent observability tooling is built for debugging and quality, not for the moment after damage is done. This post is about the three cryptographic properties that turn a transcript into something an auditor or regulator can verify, and how the DEPOSE project wires them together.\n\n## Three properties\n\nAssume the machine that produced the bundle can't be trusted. Three things need to hold at once:\n\n- **Tamper-evident.** Any byte change has to be detectable. Hash chain over events: change a byte, replay fails.\n- **Authenticated.** The record has to be bound to a key the producer controls and publishes a fingerprint for. Ed25519 signatures over a manifest.\n- **Anti-backdated.** A party other than the producer has to anchor the record in time. RFC 3161 tokens from a public TSA.\n\nThe primitives are old and well understood. 
The hard part is wiring them through a normalized event schema and shipping a verifier that doesn't depend on the producer's runtime.\n\n## No LLM in the signed path\n\nEvery event is captured at execution time or normalized from the session JSONL, then committed to the hash chain. The human-readable narrative is generated separately, from deterministic Handlebars templates over the signed events. It's excluded from the root hash.\n\nIf generated prose became part of the signed record, verification would depend on model behavior staying stable and reproducible. DEPOSE avoids that dependency. The signed record is event data and hashes. The prose is templated commentary with `[#evt-<ulid>]` citations back to the signed events. You can rewrite the narrative without affecting verification. Change an event and verification fails.\n\n## What's in a bundle\n\nA DEPOSE bundle is a directory, not an opaque archive:\n\n```\nincident-01JABC.../\n├── manifest.json            bundleId, rootHash, eventsJsonlSha256, sigs, timestamps\n├── events.jsonl             every event in canonical JSON, byte-pinned by manifest\n├── rules/destructive.yaml   ruleset used at reconstruction time\n├── narrative.md / .html     templated prose with per-event citations\n├── verify.txt               human-readable verification summary\n├── artifacts/               captured file diffs, payloads\n├── attestations/            Ed25519 signatures, RFC 3161 timestamp tokens\n└── raw/                     source JSONL, shell history, capture records\n```\n\nChange a byte of `events.jsonl`, `manifest.json`, or `rules/destructive.yaml` and verification fails. Canonical JSON follows RFC 8785 (JCS), which is what lets a Go verifier check a TypeScript-produced bundle without either side trusting the other's serializer.\n\n## Two binaries\n\nThe producer is TypeScript. The verifier is a separately-built static Go binary, `depose-verify`. 
The separation is deliberate: you hand the binary to whoever needs to check the bundle (auditor, opposing counsel, regulator, a customer's security team) and they run it on their own machine. No producer stack required.\n\nA passing run prints:\n\n```\nparse        OK\nsignature    OK\nchain-replay OK\nartifacts    OK\ntimestamp    OK\nPASS  bundleId=...  rootHash=...\n```\n\nThe cryptography here is mostly off-the-shelf. The actual engineering work is in normalization: getting Go and Node to serialize identically, getting timing and ordering right across capture sources, deciding what counts as one event versus two. Canonical JSON is the unsexy part. Float formatting, key ordering, unicode escapes: Go and Node have to agree byte-for-byte or the verifier rejects a bundle the producer thinks is fine. That's what the cross-language conformance vectors in `tests/conformance/` are for.\n\nVerifiers can pin a producer's expected key fingerprint and consult a revocation list, both at the command line. The RFC 3161 timestamp does double duty here: a bundle stamped before a key is revoked stays time-anchored, so \"when was this signed\" remains answerable even if the key is later compromised.\n\n## Capture modes\n\nTwo modes, different coverage.\n\n**Reconstruction** reads the Claude Code session JSONL after the fact, compares it against shell history (bash, zsh, fish) and git reflog where available, and builds a bundle. Lower-bound mode. It can verify integrity after packaging. It can't prove the original session file was complete before capture.\n\n**Active capture** installs a Claude Code `PreToolUse` hook and POSIX shell shims for the binaries that tend to do destructive things: `terraform`, `aws`, `gh`, `kubectl`, `psql`, `gcloud`, `railway`, `rm`. Records land under `~/.depose/captures/` at execution time. 
A later `depose package` merges them with the session JSONL so every covered event has a verified pre-execution intent on record.\n\nDEPOSE can prove integrity of captured events. It can't prove an uninstrumented system captured everything. An agent that shells out to a binary not in the shim list, or hits an API directly, still shows up in the JSONL but won't have an active-capture record. The coverage matrix is in the repo.\n\nmacOS and Linux only. Windows isn't supported (POSIX 0600 on the key store, POSIX shell scripts for the shims). WSL2 works.\n\n## Release pipeline\n\nReleases ship with SBOMs, provenance attestations, and signed checksums. The specifics: CycloneDX for both halves, SLSA L3 provenance, and `SHA256SUMS` signed via cosign keyless. CI rebuilds the two checked-in example bundles (an `rm -rf` on training data, a `terraform destroy` on infrastructure) on every push and runs three semantic tamper rejections to confirm the verifier fails closed.\n\nRight now most coding-agent session logs are treated like disposable debug output. That assumption gets weaker the moment an agent can modify infrastructure.", "url": "https://wpnews.pro/news/cryptographic-forensics-for-ai-coding-agent-sessions", "canonical_source": "https://dev.to/aftermathtech/cryptographic-forensics-for-ai-coding-agent-sessions-2oaa", "published_at": "2026-05-20 14:58:25+00:00", "updated_at": "2026-05-20 15:05:05.205551+00:00", "lang": "en", "topics": ["cybersecurity", "artificial-intelligence", "developer-tools", "data", "research"], "entities": ["Claude Code", "Codex CLI", "DEPOSE"], "alternates": {"html": "https://wpnews.pro/news/cryptographic-forensics-for-ai-coding-agent-sessions", "markdown": "https://wpnews.pro/news/cryptographic-forensics-for-ai-coding-agent-sessions.md", "text": "https://wpnews.pro/news/cryptographic-forensics-for-ai-coding-agent-sessions.txt", "jsonld": "https://wpnews.pro/news/cryptographic-forensics-for-ai-coding-agent-sessions.jsonld"}}