# Piper – DevOps copilot where the LLM picks typed actions, not shell

Piper, a new DevOps copilot, operates with a safety-first architecture in which a large language model never directly executes commands. It only selects typed actions from a fixed catalog, which are then validated by deterministic code and run locally on the user's machine. The tool uses a conversational terminal interface to drive existing tools like SSH, kubectl, and Docker, but gates any mutating operation behind explicit human approval, so the LLM cannot reach infrastructure without consent. By separating the LLM's planning role from command execution, Piper aims to provide a secure, auditable alternative to AI-powered command-line tools that generate arbitrary shell strings.

**DevOps at the speed of thought.** A terminal-first, LLM-driven DevOps copilot that is safe by construction: the LLM proposes, deterministic code validates, the human approves anything that mutates.

> [!IMPORTANT]
> The LLM never executes anything. It only picks an action from a fixed catalog. PIPER then validates the choice and runs the command on your own machine through a single audited executor. The LLM is a planner, not a shell. This is the entire product.

PIPER pulls the relevant runbook from its knowledge base, runs read-only diagnostics over SSH, finds the planted issues, proposes fixes, and refuses to apply them, because M1 is read-only. The LLM proposes; the deterministic gate validates; the human stays in the loop.

PIPER drives the tools you already trust (`ssh`, `kubectl`, `docker`, `gh`, `aws`, `gcloud`, `journalctl`, ...) from a conversational terminal UI. Every command runs **locally**, is picked from a **typed action catalog**, is validated by a path denylist and secret scrubber, and, for anything that mutates, is gated behind an explicit human approval.
The LLM can hallucinate freely; it cannot reach your infrastructure unless a real human says yes.

```
› uptime, memory and disk on staging; tail nginx logs if anything looks off

PIPER planning… 3 actions chosen from the catalog
  1. system.uptime
  2. system.memory
  3. system.disk_usage

✓ system.uptime      520ms, ran locally
✓ system.memory      340ms, ran locally
✓ system.disk_usage  410ms, ran locally

▌ Staging has been up for 14 days with a 0.43 load average [ev-1]. Memory
▌ has plenty of headroom, with 12 GB free out of 16 GB total [ev-2], and the
▌ root volume sits at 38% [ev-3]. Nothing worth flagging on the resource
▌ side, no need to dig into the nginx logs right now.
```

Every `ev-N` is a link back to the exact command output that produced the claim. PIPER cannot make claims without evidence: the verifier rejects ungrounded synthesis and retries.

This is the heart of the product. Read it twice.

| | What most LLM CLIs do | What PIPER does |
|---|---|---|
| **Who composes the command** | The LLM writes a shell string (`tail -f …`, `kubectl get …`) | The LLM picks an action name + typed args from a closed catalog |
| **Who runs it** | An execution layer that runs whatever the LLM wrote | PIPER's local executor runs a fixed command template bound to that action |
| **What if the LLM hallucinates** | A bogus command might run on your infrastructure | The catalog has no entry for a bogus action → the executor refuses |
| **What you can audit** | Prompts + arbitrary shell history | A typed list of actions in source (`src/actions/builtin/`) plus the verbatim local exec in the audit log |
| **Where the command runs** | Sometimes a remote sandbox, sometimes your machine | Always your machine: a local subprocess, optionally SSH'ing into an allowlisted host you registered |

Concretely: when the LLM wants to check disk space, it does not emit `"df -h /"`.
It emits a typed tool call, `{ "name": "system.disk_usage", "args": { "host": "staging", "path": "/" } }`, and PIPER's executor (`src/exec/executor.ts`, the only module that runs anything) translates that into `df -h /` and spawns a local subprocess. The shell string is built in **PIPER's source code**, not by the LLM. The args are validated by Zod before the spawn. Secrets are stripped on the way out, before going to the audit log and before going back to the LLM.

The LLM can ask for `system.disk_usage`. It cannot ask for `system.evil_undocumented_thing`. That's the safety property.

## Why this exists

We built PIPER for two people, both real:

- **The lone developer with no DevOps support.** You shipped an app, you need to keep it running, and there's no one to call when the staging container won't come up at 11pm. Today your fallback is pasting logs into ChatGPT and hoping.
- **The DevOps engineer doing the same diagnostic dance fifty times a day.** Tail the logs on that node. Check why the deploy is stuck. Verify the cron. You don't need a tutor; you need an editor for infrastructure with audit trail and rollback wired in.

Both meet on the same contract: PIPER never silently mutates anything, and you can always see exactly what it is about to do.

What PIPER is not:

- **Not an autonomous agent.** PIPER does not act on `mutate` / `destructive` actions without approval, and never will.
- **Not a chat product.** The TUI is a working surface, not a conversation.
- **Not a Kubernetes admin panel, not a CI replacement, not a monitoring tool.** PIPER drives the CLIs you already trust and adds the safety + grounding layer.
- **Not a black box.** Every action, prompt, approval rule and audit log entry is readable in source.
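The contract described above (the LLM names an action, the args are validated, and PIPER's own code builds the command) can be sketched in a few lines of TypeScript. This is a minimal illustration, not PIPER's implementation: the names `CatalogEntry`, `Plan`, `plan`, `HOST` and `ABS_PATH` are hypothetical, the hand-rolled checks stand in for the Zod schemas the project uses, and the sketch stops at producing an argv array rather than spawning anything.

```typescript
// Hypothetical sketch; none of these names come from PIPER's source.
type Tier = "read" | "mutate" | "destructive";
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// What the executor would receive: a target host plus a fixed argv.
// How the host is actually reached (e.g. over SSH) is left out here.
interface Plan { host: string; argv: string[] }

interface CatalogEntry {
  tier: Tier;
  // Validates untrusted args, then fills a fixed command template.
  build(rawArgs: unknown): Result<Plan>;
}

interface ToolCall { name: string; args: unknown }

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

const HOST = /^[a-z0-9][a-z0-9.-]*$/i;     // assumed shape of a host alias
const ABS_PATH = /^\/[A-Za-z0-9._\/-]*$/;  // absolute path, no shell metacharacters

// The closed catalog: the only actions the LLM can name.
const catalog: Record<string, CatalogEntry> = {
  "system.disk_usage": {
    tier: "read",
    build(rawArgs: unknown): Result<Plan> {
      if (!isRecord(rawArgs)) return { ok: false, error: "args must be an object" };
      const { host, path } = rawArgs;
      if (typeof host !== "string" || !HOST.test(host)) {
        return { ok: false, error: "invalid host" };
      }
      if (typeof path !== "string" || !ABS_PATH.test(path)) {
        return { ok: false, error: "invalid path" };
      }
      // The command is assembled here, in source, never by the model.
      return { ok: true, value: { host, argv: ["df", "-h", path] } };
    },
  },
};

function plan(call: ToolCall): Result<Plan> {
  // Own-property check so names like "constructor" do not resolve.
  const known = Object.prototype.hasOwnProperty.call(catalog, call.name);
  if (!known) return { ok: false, error: `unknown action: ${call.name}` };
  const entry = catalog[call.name];
  // Mirrors the read-only milestone: anything above "read" is refused.
  if (entry.tier !== "read") {
    return { ok: false, error: `${entry.tier} actions are refused` };
  }
  return entry.build(call.args);
}

const good = plan({ name: "system.disk_usage", args: { host: "staging", path: "/" } });
const bogus = plan({ name: "system.evil_undocumented_thing", args: {} });
const injected = plan({
  name: "system.disk_usage",
  args: { host: "staging", path: "/; rm -rf /" },
});
console.log(good.ok, bogus.ok, injected.ok); // true false false
```

Because the catalog is closed, a hallucinated action name simply fails the lookup. Because the argv comes from a fixed template and is passed as an array, a hostile `path` is rejected at validation and never reaches a shell.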
| Milestone | What | State |
|---|---|---|
| M0 | Spike: Bun `--compile` + Ink + PGlite WASM | ✅ shipped |
| M1 | Read-only diagnostics: SSH, logs, health, container/pod status, deterministic gate | ✅ shipped |
| M1.5 | RAG/memory layer, 3 embedding backends, sessions + resume, auto-compaction, interactive `/model` & `/memory`, HUMAN/YOLO modes, 40+ read actions | ✅ shipped |
| M2 | Mutations behind HITL: docker deploy, env updates, migrations, rollback | ⏳ next |
| M3 | Scale: Kubernetes deploys, continuous monitor loop, repo suggestions | ⏳ |
| M4 | On-prem / regulated: local-model-only path, encrypted audit, runbook ingestion at install | ⏳ |

No `mutate` or `destructive` tier actions exist in the catalog yet, and the runner explicitly refuses them. M1.5 is fully diagnostic by design.

## Quick start

Download the binary for your platform from the [latest release](https://github.com/antoniociccia/piper/releases/latest): no Bun, no `node_modules`, single ~76 MB file.

```bash
# macOS Apple Silicon
curl -fsSLO https://github.com/antoniociccia/piper/releases/latest/download/piper-darwin-arm64
chmod +x piper-darwin-arm64 && mv piper-darwin-arm64 /usr/local/bin/piper
piper

# Linux x64
curl -fsSLO https://github.com/antoniociccia/piper/releases/latest/download/piper-linux-x64
chmod +x piper-linux-x64 && sudo mv piper-linux-x64 /usr/local/bin/piper
piper
```

A `.sha256` is published alongside each binary; verify the download before running.

To run from source instead, you need [Bun](https://bun.sh) ≥ 1.2 (one-line install: `curl -fsSL https://bun.sh/install | bash`).

```bash
git clone https://github.com/antoniociccia/piper
cd piper
bun install
bun dev
```

On first launch PIPER detects that `~/.piper/credentials.json` doesn't exist and runs an interactive wizard:

- **Backend**: probes for any local LLM server running (Ollama `:11434`, LM Studio `:1234`, llama.cpp `:8080`, vLLM `:8000`), or asks for an OpenRouter API key.
- **Model**: pick a tier (Featherweight ~$0.10/M, Economy ~$0.44/M, Balanced ~$3/M, Premium $30+/M) or a local model from the listed catalog.
- **Embedding backend**: `wasm` (default, in-process, offline after first run), `http` (local OpenAI-compatible endpoint), `openrouter` (cloud, paid), or `none` (disable RAG).
- **Budget**: per-session USD cap (default $0.50; hard stop, not a warning).
- **SSH environment**: optionally add a first host PIPER will be able to reach.

The wizard writes `~/.piper/credentials.json` with mode `0600`. From there:

```
› check uptime and disk usage on staging
```

To resume a previous session at startup:

```bash
bun dev -- --resume   # opens a picker over recent sessions
```

To build a standalone binary:

```bash
bun run build    # ./dist/piper, ~76 MB
./dist/piper     # runs without Bun, without node_modules
```

The binary embeds PostgreSQL WASM (~13 MB) and Yoga layout. The embedding model is not bundled; it lazy-fetches on first RAG use (~120 MB, one time) and caches at `~/.piper/cache/models/`.

## The deterministic gate

You cannot get "no hallucination" from an LLM. Don't try. Instead, **make being wrong safe**. PIPER's LLM lives inside a deterministic cage:

```
┌────────────┐     proposes actions     ┌────────────────┐
│    LLM     │ ─────tool calls──────►   │ Action catalog │
│ any model  │                          │ read|mutate|   │
│            │                          │ destructive    │
└────────────┘                          └───────┬────────┘
      ▲                                         │ validate
      │ scrubbed                                │ args Zod
      │ messages                                ▼
      │                                 ┌────────────────┐
      │                                 │    Executor    │
      │                                 │    the ONLY    │
      │                                 │  side-effect   │
      │                                 │    surface     │
      │                                 └───────┬────────┘
      │                                         │
      │  scrub stdout/stderr                    │ spawns kubectl /
      └─────────────────────────────────────────┤ docker / ssh /
                                                │ nc / gh / ...
                                                ▼
                                       ┌──────────────────┐
                                       │ PGlite + pgvector│
                                       │ audit log,       │
                                       │ evidence,        │
                                       │ knowledge        │
                                       └──────────────────┘
```

Three permission tiers with no overrides:

| Tier | Examples | Approval |
|---|---|---|
| `read` | `uptime`, `docker.ps`, `kubectl get` | None. Executes directly. Safe by definition. |
| `mutate` | (M2) `docker deploy`, `env update` | Per-env approval prompt; remembered |
| `destructive` | (M2) `delete`, `drop`, `prune`, `force-push` | Fresh prompt every time. Never remembered. Ever. |

Five overlapping defenses, applied at every layer:

- **Architectural**: SSH keys never leave the OS `ssh` binary; API keys never enter the `.content` of `messages`.
Single-module discipline + CI rule.
- **Path denylist**: `~/.ssh/id`, `~/.aws/credentials`, `~/.kube/config`, `~/.gnupg/`, `~/.docker/config.json`, `~/.netrc`, `~/.piper/`, `.env`. Non-disablable. User config can extend the list, never weaken it.
- **Two-pass scrubbing**: write-time (every Executor output → audit log) and pre-LLM (every message body → HTTP call). Defense in depth.
- **Args refuse**: if the LLM tries to embed a recognisable secret (`AKIA…`, `sk-or-…`, JWTs, PEM blocks, `Bearer …`) in an action's args, the Executor refuses the action; it does not redact. Redaction would mutate semantics.
- **Provider-level privacy**: OpenRouter requests set `body.provider.data_collection = 'deny'`. Local mode routes inference through Ollama / llama.cpp / LM Studio / vLLM, so network egress for inference is zero.

Full design rationale in [docs/architecture.md](https://github.com/antoniociccia/piper/blob/main/docs/architecture.md) and [docs/decisions/ADR-001-deterministic-gate.md](https://github.com/antoniociccia/piper/blob/main/docs/decisions/ADR-001-deterministic-gate.md).

## Action catalog

40+ read-tier actions across the major DevOps surfaces. Every action is a typed object registered in `src/actions/builtin/`, validated by Zod, executed only through `src/exec/executor.ts`. Free-form shell from the LLM is not representable in the type system.

| Category | Action | What it does |
|---|---|---|
| System | `system.uptime` | `uptime` (load average + time up) |
| | `system.os_info` | `uname -a` + `/etc/os-release` |
| | `system.memory` | `free -h` |
| | `system.disk_usage` | `df -h` with optional `path` |
| | `system.process_list` | `ps -eo pid,user,pcpu,pmem,args -ww` |
| | `system.list_dir` | ls -la