{"slug": "new-version-of-peers-the-ai-couple-doing-things", "title": "New version of \"peers\" – the AI couple doing things", "summary": "Peers, an open-source tool released on GitHub, uses two or more AI coding agents as cooperating peers that must clear hard, measurable gates—such as passing tests and maintaining coverage—before a task is considered done, with one agent implementing, another blind-reviewing, and an adversarial skeptic re-auditing the work. The system runs unattended, budget-capped, and container-sandboxed, and in a diagnostic test, it built an expression-language interpreter to zero defects over 50,000 random test programs, catching planted regressions and edge-case bugs the acceptance suite missed.", "body_md": "**Two AI coding agents are better than one — if you make them prove it.**\n\npeers drives **n ≥ 2** AI coding CLIs (Claude Code, Codex, …) as cooperating\npeers that don't just *agree* a task is done — they have to clear **hard,\nmeasurable gates** first: tests pass, coverage holds, no regression, no\nTODO/stub/skipped-test, secrets clean. One peer implements, the **other\nblind-reviews** (without seeing the first's notes), and an **adversarial\nskeptic** re-audits before any \"done\" is accepted. Runs **unattended**,\n**budget-capped**, and **container-sandboxed**.\n\n**Why it beats a single agent on a loop:**\n\n**Gated, not vibes-based.**\"Looks done\" never converges —*gates green + skeptic-clean*does. 
No convergence theater.\n\n**Blind peer review catches rubber-stamping**: an independent second pair of eyes, by construction.\n\n**An adversarial skeptic hunts the edge cases** your tests miss.\n\n**Unattended & safe:** idle-timeout supervision, USD/tick budget caps, rootless cap-dropped container, egress allow-listing.\n\nIn an instrumented diagnostic, peers built an expression-language interpreter\nboth greenfield and brownfield to **0 defects over 50,000 random test\nprograms**, catching planted regressions and self-finding edge-case bugs the\nacceptance suite never probed.\n\nGerman version: [README_DE.md].\n\n- **HOWTO: full audit + fix on an existing app**: [docs/HOWTO-audit-and-fix.md](/c0decave/peers/blob/main/docs/HOWTO-audit-and-fix.md) ([German guide](/c0decave/peers/blob/main/docs/HOWTO-audit-and-fix_DE.md))\n- `implement` mode (build a feature from PLAN.md): [docs/MODES_IMPLEMENT.md](/c0decave/peers/blob/main/docs/MODES_IMPLEMENT.md) ([DE](/c0decave/peers/blob/main/docs/MODES_IMPLEMENT_DE.md))\n- Security model: [docs/SECURITY.md](/c0decave/peers/blob/main/docs/SECURITY.md) ([DE](/c0decave/peers/blob/main/docs/SECURITY_DE.md))\n\n```\npeers-ctl new mything --modes=audit --spec ./mything-spec.md\n$EDITOR ~/c0de/peers-c0de/mything/.peers/goals.yaml   # trim project-specific gates\npeers-ctl start mything --max-ticks 20 --max-usd 5\n```\n\nAvailable modes: see `peers-ctl modes list`. Stack multiple with\n`--modes=audit,thorough`. Current built-in modes:\n\n| Mode | What it does |\n|---|---|\n| `audit` | bug-hunt + 3-class test coverage + secrets + deps + API stability + regression + diff-size + skip/xfail justification |\n| `thorough` | anti-convergence-theater hard gate: N=3 consecutive clean ticks + skeptic-pass + aggressive-honesty soft goals |\n| `describe` | iterative doc-writing mode: peers write SPEC.md/ARCHITECTURE.md/DESIGN.md until N consecutive non-substantive doc commits. 
Use BEFORE audit on a repo that lacks docs; not composable with audit modes |\n| `implement` | end-to-end feature implementation from a markdown PLAN.md: frozen acceptance contract, blind-review between peers, reviewer-only checkoffs, HONESTY_AUDIT + cleanliness gates (no TODO/FIXME/stubs/skipped tests at convergence). Standalone; see docs/MODES_IMPLEMENT.md |\n\nTypical multi-mode runs:\n\n```\n# audit + thorough (recommended default for an existing codebase):\npeers-ctl new myapp --modes=audit,thorough\n\n# bare audit:\npeers-ctl new myapp --modes=audit\n\n# write docs first, audit later (two separate runs):\npeers-ctl new myapp --modes=describe                   # run 1\npeers-ctl new myapp-audit --modes=audit,thorough       # run 2\n\n# implement a feature from a PLAN.md (standalone — not composable):\npeers-ctl new myfeature --container --modes=implement --plan ./PLAN.md\n# see docs/MODES_IMPLEMENT.md for the PLAN.md schema + escape valves.\n```\n\n**Automatic hooks** (opt-out flags):\n\n- `recon` pre-tick (default on): substrate scans the repo once before tick 1 and writes `.peers/recon.md` (detected languages, key docs, entry-point candidates, top-level tree). Free + fast, no LLM call. Eliminates the \"blind tick 1\" penalty. Opt out: `peers-ctl start <name> --without-recon`.\n- `codemap` pre-tick (default on): substrate builds a structural CODEMAP from the AST and writes `.peers/CODEMAP.yaml` (machine-readable: every public symbol, its `file:line` and signature) plus `.peers/codemap.md` (a compact, byte-capped digest peers read as context). Free + fast, no LLM call. Primes peers with the codebase's public-API shape before tick 1, on top of recon's file-level view. Opt out: `peers-ctl start <name> --no-codemap`.\n- `auto-skeptic` post-convergence (default on): when `consecutive_clean_ticks >= N` would fire `convergence-reached`, the orchestrator runs ONE extra tick with a critical re-audit prompt. If the skeptic-tick stays clean → really terminal. 
If it surfaces a new blocking bug → counter resets, loop continues. Opt out: `peers-ctl start <name> --without-post-convergence-skeptic`.\n\n`peers-ctl new`:\n\n- creates the directory if missing (refuses to scaffold into a\nnon-empty dir unless `--force`); **bare name** (no `/`) lands under `$PEERS_PROJECTS_ROOT`, default `~/c0de/peers-c0de/<name>`. Path with `/` is taken verbatim;\n- `git init` + initial scaffold commit;\n- ensures a top-level `README.md` exists, even when `--force` is used against an existing Git repo;\n- copies the `--spec` argument to `SPEC.md` (existing file paths are read; path-looking missing values such as `./typo.md` are rejected);\n- runs `peers init` (which writes `.peers/`, tags `peers-baseline`, commits `.gitignore`, and creates `.peers/log/runs.jsonl`);\n- with `--modes=audit`, installs six audit check scripts and an audit-ready `goals.yaml`; use `--lang=js`, `--lang=rust`, or `--lang=go` for stack-specific check entrypoints;\n- registers the project with `peers-ctl` and creates the controller log under the peers-ctl config directory.\n\nTo use a different projects root (e.g. on a project-specific\ndisk): `export PEERS_PROJECTS_ROOT=/work/peers/` once, then bare\nnames land there. 
`peers-ctl doctor` prints the active root.\n\n```\ncd /path/to/your-target-project\npeers init                              # writes .peers/ + commits .gitignore\n$EDITOR .peers/goals.yaml               # delete `placeholder-replace-me`, write real gates\npython3 - <<'PY'\nimport hashlib, pathlib\np = pathlib.Path(\".peers\")\n(p / \"goals.sha256\").write_text(hashlib.sha256((p / \"goals.yaml\").read_bytes()).hexdigest() + \"\\n\")\nPY\n$EDITOR .peers/config.yaml              # only if codex needs a custom argv path\npeers info                              # sanity-check: peers, goals, budget, health\n\npeers-ctl add /path/to/your-target-project --name mything\npeers-ctl doctor                        # confirms tooling + per-project config\n\npeers-ctl start mything --max-ticks 20 --max-usd 5\n```\n\nModes are baked into `.peers/goals.yaml` at scaffold-time. To re-run\nthe SAME project with a DIFFERENT mode set (e.g. you ran `audit` first\nand now want `audit,thorough` on top):\n\n```\n# Variant 1: re-init in place (DESTRUCTIVE — overwrites goals.yaml + checks)\npeers-ctl new mything /path/to/your-project \\\n  --modes=audit,thorough --force\n# Then start as usual:\npeers-ctl start mything --container --max-ticks 30\n\n# Variant 2: separate worktree (NON-DESTRUCTIVE, recommended)\ngit -C /path/to/your-project worktree add \\\n  /path/to/your-project-thorough HEAD\npeers-ctl new mything-thorough /path/to/your-project-thorough \\\n  --container --modes=audit,thorough\npeers-ctl start --container mything-thorough\n# Cherry-pick the substantive fixes back to your main worktree when done.\n```\n\n**Variant 2 is the recommended pattern for iterative audits.** Each\nrun audits a separate worktree; fixes are cherry-picked back (or merged\nwith `--no-ff`) after review. 
The worktree pattern keeps your existing\naudit history (`.peers/state.json`, `.peers/log/runs.jsonl`) intact.\n\n```\npeers-ctl status mything                # snapshot\npeers-ctl dashboard                     # all registered projects at once\npeers-ctl dashboard --live              # continuous redraw with alerts/events\npeers-ctl dashboard --project mything   # drilldown: recent runs + bugs\npeers-ctl tail mything                  # live tail (Ctrl-C to detach)\ntail -f /path/to/your-target-project/.peers/log/runs.jsonl   # rich per-tick audit\npeers -C /path/to/your-target-project replay 3               # inspect tick 3\npeers-ctl stop mything                  # graceful SIGTERM → 10s → SIGKILL\npeers -C /path/to/your-target-project report   # writes .peers/REPORT.md\npeers-ctl report mything                # writes controller REPORT-mything.md\npeers-ctl review mything                # latest handoff self-review\n```\n\nCI guardrails are available as `.gitea/workflows/test.yml` plus\n`scripts/pre-push.sh`; install the local hook with `make hooks-install`.\n\nThe controller is stateless; the project's own `.peers/state.json`\nand `runs.jsonl` are the durable record. If the host reboots\nmid-run, `peers-ctl list` will mark the project `crashed`; you can\n`start` it again and the loop resumes from the saved iteration.\n\n**Project states shown by peers-ctl list:**\n\n| State | Meaning |\n|---|---|\n| `fresh` | scaffolded by `peers-ctl new/add` but never started |\n| `running` | active loop, container/PID alive |\n| `stopped` | exited cleanly and wrote `.peers/last-stop-reason.txt` with a `complete`, `max_ticks`, `max_iterations`, or `budget:*` reason. A run that reached `convergence-reached` is `stopped`, not `crashed`. 
|\n| `crashed` | process died without a sentinel: segfault, OOM, halt-pattern, goal-mutation, host reboot mid-run |\n\nA **mode** is a reusable bundle of audit goals + check scripts that\n`peers-ctl new --modes=…` lays down in `.peers/`. Modes are\n**stackable** (comma-separated list), except `describe`, which is\nmutually exclusive with audit/security modes (it writes docs, not\naudits code).\n\nHard gates: `self-review-on-handoff`, `tests-pass`,\n`tests-cover-happy-edge-sad`, **`tests-no-unjustified-skip-or-fail`\n(peers must justify every `@pytest.mark.skip`/`xfail`)**,\n`lint-clean`, `type-clean`, `bug-hunt-clean`, `tdd-reproduces-bug`,\n`no-secrets-committed`, `deps-justified`, `api-stable`,\n`no-prior-regression`, `diff-size-per-resolve`.\n\nSoft goals: `bug-hunt-round-1-deep`, `bug-hunt-round-2-cross-review`,\n`tests-3-class-review`.\n\n**Use it always.** Other modes assume `audit`'s hard gates are active\nand tighten what \"clean\" means.\n\nAdds:\n\n- `convergence-reached` (hard, N=3 default): N consecutive clean ticks without new crit/high/med bug-reports; the substrate refuses to declare success without N proofs of stillness.\n- `all-peers-healthy` (hard): refuses to declare success while any peer is in `unavailable` state (halt-pattern hit).\n- `skeptic-pass` (soft, both peers, interval 1): every tick re-audits with extra suspicion; refuses to pass without documenting 5+ failure modes excluded per file/module.\n- `aggressive-honesty` (soft, both peers, interval 3): per src top-level path: 3+ failure modes checked, 2+ security categories, 1 test-coverage gap explicitly named.\n\n**`thorough` alone (without audit) is incomplete**:\n`convergence-reached` depends on `bug-hunt-clean` (from audit) to know what\n\"clean\" means. 
Always stack with audit: `--modes=audit,thorough`.\n\nPeers WRITE the project's spec docs (SPEC.md + ARCHITECTURE.md + DESIGN.md) iteratively until N=2 consecutive non-substantive doc commits. Hard gates:\n\n- `description-files-present`: all 3 files exist, ≥500 chars each\n- `description-sections-present`: SPEC has `## Threat Model` + `## Invariants` + `## API`; ARCH has `## Components` + `## Data Flow`; DESIGN has `## Decisions` + `## Tradeoffs`; each section body ≥50 chars\n- `description-converged`: last N commits to the 3 files are non-substantive (no new `##` section, <100 lines added, <50% deletion)\n\n**Not composable** with audit modes: describe writes, audit attacks.\nRun `--modes=describe` FIRST on a repo that lacks docs, then cherry-pick\nthe produced files into a follow-up `--modes=audit,…` run.\n\nEnd-to-end feature implementation from a markdown PLAN.md.\n**Standalone: not composable with audit/thorough/describe.**\nSee [docs/MODES_IMPLEMENT.md](/c0decave/peers/blob/main/docs/MODES_IMPLEMENT.md) for the\nfull operator reference: PLAN.md schema, frozen acceptance contracts,\nreviewer-only checkoffs, escape valves (`[PARTIAL]` / `[BLOCKED]` /\n`peers-ctl amend` / `peers-ctl ack-block`).\n\n| Project type | Recommended modes |\n|---|---|\n| First touch on undocumented repo | `--modes=describe` (alone, run-1) then `--modes=audit,thorough` (run-2) |\n| Existing Python lib / CLI tool | `audit,thorough` |\n| Implement a planned feature | `--modes=implement --plan ./PLAN.md` |\n\n`peers-ctl modes list` always shows the current built-in set.\n\nTwo CLIs:\n\n- `peers` runs the loop INSIDE one repo. The inner driver.\n- `peers-ctl` registers + supervises one or more peers projects from outside. The outer controller. 
Spawns `peers run` (host or container) and tracks PID/container liveness.\n\n```\n# Lifecycle\npeers-ctl modes list                       # available modes\npeers-ctl new <name> [path] --modes=…      # scaffold + register\npeers-ctl add <path> --name <n>            # register an EXISTING .peers/\npeers-ctl start [<name>] --container       # start (--container = podman)\npeers-ctl status [<name>]                  # one or all\npeers-ctl stop [<name>] [--grace-s 10]     # SIGTERM → wait → SIGKILL\npeers-ctl remove <name>                    # unregister (does NOT delete .peers/)\npeers-ctl list                             # all projects + state\n\n# Observe\npeers-ctl dashboard                        # rollup across all projects\npeers-ctl dashboard --live --refresh-s 1   # live rollup with alerts/events\npeers-ctl dashboard --project <name>        # recent runs + bug drilldown\npeers-ctl tail [<name>]                    # follow controller log\npeers-ctl logs <name> [-n 100]             # print last N lines\npeers-ctl report [<name>]                  # write controller REPORT-<n>.md\npeers-ctl review <name>                    # latest handoff's self-review block\n\n# Maintenance\npeers-ctl doctor                           # pre-flight: peers + git + peer CLIs + image\npeers-ctl prune <name>                     # delete old per-project log files\npeers -C /path/to/target init              # write .peers/\npeers -C /path/to/target run               # start the loop in current shell\npeers -C /path/to/target run --max-ticks 5 # cap ticks\npeers -C /path/to/target run --max-usd 1   # cap budget (API-key billing only)\npeers -C /path/to/target status            # iteration / next peer / lock\npeers -C /path/to/target info              # config + goals snapshot\npeers -C /path/to/target verify            # one-shot goal evaluation\npeers -C /path/to/target report            # write .peers/REPORT.md\npeers -C /path/to/target replay <iter>     # reconstruct 
any past tick\npeers -C /path/to/target tick --after claude  # hooks-driver: trigger after a peer\npeers -C /path/to/target watch             # follow runs.jsonl\npeers-ctl start <name> --without-recon\n# Skip the substrate-only pre-tick recon step (no LLM call, free).\n# Only opt out if .peers/recon.md was hand-prepared.\n\npeers-ctl start <name> --no-codemap\n# Skip the substrate-only pre-tick structural CODEMAP step (no LLM call, free).\n\npeers-ctl start <name> --without-post-convergence-skeptic\n# Skip the auto-skeptic re-audit tick that fires when consecutive_clean_\n# ticks ≥ N would declare terminal. Default on for higher confidence;\n# opt out for CI runs where false-convergence is acceptable.\n\npeers-ctl start <name> --max-ticks 50 --max-usd 1\n# Same flags work on both peers-ctl and `peers run` directly.\n```\n\n`peers run --help` and `peers-ctl start --help-man` show the full\nflag set with descriptions.\n\nRootless podman's default networking needs the `tun` kernel module.\nBypass with host networking:\n\n```\nPEERS_CTL_PODMAN_NETWORK=host peers-ctl start --container <name>\n```\n\nTo make it permanent: `echo 'export PEERS_CTL_PODMAN_NETWORK=host' >> ~/.bashrc`, then `source ~/.bashrc`. Alternatively load the module:\n`sudo modprobe tun` (persist via `/etc/modules-load.d/tun.conf`).\n\nThe orchestrator writes `.peers/last-stop-reason.txt` and reconcile\nmaps clean reasons to `stopped`. If you still see `crashed` post-convergence:\n\n- `cat .peers/last-stop-reason.txt`: should contain `complete <ts>`.\n- `make build` to ensure the container image matches the host code.\n\n`process-fail` after ~4min usually = peer CLI returned 5xx (Anthropic Overloaded, Codex rate-limit) and idle-timeout kicked in. Run produced no commit. 
Next tick retries the OTHER peer; the problematic peer auto-recovers if the rate-limit was transient.\n\n`idle-timeout` after exactly `health.idle_timeout_s` (default 900s) = peer wrote stdout below the silence threshold for too long. Increase `idle_timeout_s` in `.peers/config.yaml` for heavy DA mode runs (peer spends more time thinking before each commit).\n\nA halt-class pattern matched (`authentication failed`, `quota exhausted`, `invalid API key`, `usage limit` per\n`templates/config.yaml`). Operator action required:\n\n- Re-login or top-up the OAuth account\n- Restart: `peers-ctl start <name> --container`\n- The loop resumes from the saved iteration\n\nThis is intentional: the substrate refuses to silently degrade peers on operator-action failures.\n\n`fresh` means the project was registered but NEVER started. After\nthe first successful `peers-ctl start`, state moves to `running`,\nthen `stopped`/`crashed` on exit. If you intended to start it:\n`peers-ctl start <name> --container`.\n\nIf codex (or any other peer CLI) isn't on the host but is available\nin the `peers:dev` image, run the loop inside the container:\n\n```\nmake build                              # one-time main image\nmake proxy-build                        # egress sidecar\nmake auth-proxy-build                   # Claude OAuth sidecar\npeers-ctl doctor                        # confirms podman + image exist\npeers-ctl start mything --container --max-ticks 20 --max-usd 5\n```\n\nThis spawns `podman run -d --rm --name ... --userns=keep-id ... peers:dev run …`\nand tracks the running container by name via `podman ps`. The displayed\nPID is only the host-side `podman logs -f` streamer. `peers-ctl stop --grace-s N`\nuses `podman stop -t N`, then reaps the log streamer.\n\nContainer mode bind-mounts the target repo, `~/.claude`, `~/.codex`,\nand optional read-only `~/.gitconfig`. 
When `~/.claude.json` exists,\nit is mounted into the per-project `peers-auth-proxy_<name>` sidecar\ninstead of the workspace container; the workspace talks to\n`ANTHROPIC_BASE_URL=http://127.0.0.1:8080`.\nBefore launch, `peers-ctl` compares the host package version with\n`peers --version` inside the image: minor/patch drift warns, major\ndrift refuses start until you rebuild (`make build`).\n\nOverride the image name with `PEERS_CTL_IMAGE=name:tag` if you've\ntagged your build differently.\n\n```\npip install -e '.[dev]'   # quoted so shells like zsh do not glob the brackets\npytest          # the full suite should pass\ncd /path/to/your-project\npeers init\n$EDITOR .peers/goals.yaml            # delete the placeholder, write your gates\npython3 - <<'PY'\nimport hashlib, pathlib\np = pathlib.Path(\".peers\")\n(p / \"goals.sha256\").write_text(hashlib.sha256((p / \"goals.yaml\").read_bytes()).hexdigest() + \"\\n\")\nPY\npeers run --max-ticks 20\npeers status\ntail -f .peers/log/runs.jsonl        # rich per-tick audit log\npeers replay <iter>                  # reconstruct any iteration\n```\n\n`peers init` writes `.peers/` into the target, tags the current HEAD\nas `peers-baseline` (rollback anchor), snapshots the goals hash\n(`goals.sha256`), and adds `.peers/` to the target's `.gitignore`.\nIf you edit `.peers/goals.yaml` manually before starting a run, refresh\n`goals.sha256`; the loop intentionally halts on unacknowledged goal\nchanges or if `goals.yaml` disappears mid-run.\n\n```\npeers init --driver=hooks            # scaffold Stop-hook snippets\npeers init --driver=hooks --install  # ALSO merge into your host config (with backup)\npeers tmux up                        # sessions driver: tmux up/down/attach\n```\n\n`--driver=hooks` drops ready-to-paste fragments in `.peers/hooks/`\nfor your `~/.claude/settings.json` and `~/.codex/config.toml`.\n\n`--install` (only valid with `--driver=hooks`) goes one step further:\nit merges the Stop-hook entry 
directly into your host configs and\nwrites timestamped backups (`settings.json.bak.peers-<ts>`,\n`config.toml.bak.peers-<ts>`). Behavior:\n\n- **idempotent**: re-running prints `noop` and does not duplicate entries. Each entry is tagged with `# peers:<absolute-target-path>` so the installer recognises its own work.\n- **drift-aware**: if the target path changed (e.g. the project moved), the existing entry is rewritten in place and the old file is backed up.\n- **conservative on TOML**: if your `~/.codex/config.toml` already has a non-peers `[hooks]` section with an `on_stop`, the installer refuses to touch it and prints a notice (codex has no general TOML merge logic in stdlib; we will not clobber a custom config).\n- **Independent failure**: patching claude vs codex is independent. Whichever side succeeded is reported on stdout; the other is reported on stderr with the path of the snippet you can merge manually.\n\nSmoke-test after install:\n\n```\npeers status                         # nothing yet (no run)\npeers tick                           # one manual tick — should run cleanly\n```\n\n`peers-ctl` is a host-side controller that supervises many peers loops\nwithout a daemon. 
Each project is a detached background process; the\ncontroller stores PIDs (with a `/proc`-based starttime fingerprint to\nguard against PID recycling) under `~/.config/peers-ctl/`.\n\n```\npeers-ctl doctor                     # pre-flight: peers/git/peer-CLIs + per-project config sanity\npeers-ctl add  /path/to/project-a   --name a\npeers-ctl add  /path/to/project-b   --name b\npeers-ctl list\n\npeers-ctl start a --max-ticks 20 --max-usd 3\npeers-ctl status a\npeers-ctl tail a                     # follow log via tail -f\npeers-ctl report a                   # write Markdown controller report\npeers-ctl review a                   # show latest handoff self-review\npeers-ctl stop a                     # graceful: SIGTERM -> 10s grace -> SIGKILL; state.json persisted\npeers-ctl prune                      # delete old log files\n```\n\n`peers-ctl report` writes a clean Markdown summary to\n`~/.config/peers-ctl/REPORT.md` (or `REPORT-<name>.md` when scoped to\none project). The report includes controller log paths, per-project\ntick counts, blocking bug counts, last activity, and README status so a\nhandoff can spot missing operator docs before the next run.\n\n`peers-ctl dashboard` is the fast terminal view: state, ticks, open\nhard/soft goals, blocking bug count, running container name, and last\ntick timestamp for every registered project. Add `--live` for a\nperiodic redraw that also shows alert state and the newest decoded\nClaude session event when available. 
Add `--project <name>` for a\nsingle-project drilldown with recent runs and bug reports; combine it\nwith `--live` to redraw that detail view.\n\nExample `peers-ctl doctor` output:\n\n```\npeers-ctl doctor — 3 project(s) registered, config dir ~/.config/peers-ctl\n\n  [ok] snake                ~/code/snake\n           2 peer(s), 5 goal(s)\n  [ok] cpu-emu              /tmp/peers-dogfood-r2/cpu-emu\n           2 peer(s), 8 goal(s)\n  [FAIL] freshproject       ~/code/freshproject\n           missing ~/code/freshproject/.peers/config.yaml\n\nWarnings:\n  - `codex` is not on PATH. If any project uses it, either add it to PATH\n    or set the full path in that project's .peers/config.yaml.\n```\n\n`doctor` surfaces three classes of problem up front: missing tooling,\nmissing or unparseable per-project config, and per-project ambiguity\n(unknown peer name, no goals, etc.). Use it before kicking off a\nlong autonomous run.\n\n`config.yaml` accepts an ordered `peers:` list. The substrate is\nneutral about names; pick what you want.\n\n```\npeers:\n  - name: claude\n    tool: claude\n    model: opus        # optional; omit to use CLI default\n    reasoning: high    # claude: low|medium|high|xhigh|max\n    argv: [\"claude\", \"-p\", \"--dangerously-skip-permissions\", \"{PROMPT}\"]\n    prompt_mode: argv-substitute\n\n  - name: codex\n    tool: codex\n    model: gpt-5.1-codex-max\n    reasoning: xhigh   # codex: minimal|low|medium|high|xhigh\n    provider: openai   # openai|openrouter\n    argv: [\"codex\", \"exec\", \"{PROMPT}\"]\n    prompt_mode: argv-substitute\n\n  # Third peer is fine — anything in [A-Za-z0-9][A-Za-z0-9_-]{0,31}:\n  - name: claude-2\n    tool: claude\n    argv: [\"claude\", \"-p\", \"--dangerously-skip-permissions\", \"{PROMPT}\"]\n    prompt_mode: argv-substitute\n```\n\nThe legacy `tools: {claude: …, codex: …}` mapping is still loaded for\nback-compat and auto-promoted to the new shape.\n\n`model`, `reasoning`, and 
`provider` are optional convenience fields.\nExplicit `argv` switches still win. To scaffold them without editing\nYAML:\n\n```\npeers-ctl new myapp --modes=audit \\\n  --peer-model claude=opus \\\n  --peer-provider codex=openrouter \\\n  --peer-model codex=~openai/gpt-latest \\\n  --peer-reasoning codex=xhigh\n```\n\nFor OpenRouter, export `OPENROUTER_API_KEY` before `peers run`,\n`peers tick`, `peers tmux up`, or `peers-ctl start`; these commands fail\nearly if the key is missing. Container mode passes the key name through\nand opens only `openrouter.ai` in the egress proxy allow-list for projects\nthat opt in.\n\n`opencode` is a first-class tool alongside `claude` and `codex`. Run it with\n`--format json` so the substrate gets the same structured channel it uses for\nthe others: token + USD accounting (from `step-finish` events) and\necho-immune auth/quota halt detection (from `error` events):\n\n```\npeers:\n  - name: opencode\n    tool: opencode\n    model: ollama/qwen2.5      # opencode's <provider>/<model> (NOT a separate provider:)\n    reasoning: high            # → --variant high\n    argv: [\"opencode\", \"run\", \"--format\", \"json\", \"--dangerously-skip-permissions\", \"{PROMPT}\"]\n    prompt_mode: argv-substitute\n```\n\nopencode is also the simplest path to **local models**. It is a universal\ngateway: configure the backend once in opencode's own config\n(`opencode providers`, or an `opencode.json` custom provider), whether ollama, vllm,\nllama.cpp, LM Studio, or any OpenAI-compatible `/v1` endpoint, then point a\npeer's `model` at `<provider>/<model>`:\n\n```\n    model: ollama/qwen2.5            # local via ollama\n    model: openai-compatible/<name> # local vllm / llama.cpp server\n    model: anthropic/claude-...      # cloud, routed through opencode\n```\n\nThe substrate needs no local-model-specific config; opencode resolves the provider. 
Notes:\n\n- `provider:` is **not** used for opencode; encode the provider in `model` (`provider/model`). Setting `provider:` on an opencode peer is rejected.\n- Billing for opencode is treated as **warn**, never a hard `max_usd` kill (local = free, opencode-hosted = subscription, BYOK cloud = metered; the tool name alone can't tell which, so the conservative default applies).\n- `codex` can also reach local models, but only `ollama`/`lmstudio` via `codex exec --oss --local-provider …`, or a custom provider that speaks the OpenAI **Responses** API (`wire_api=responses`); codex dropped chat-API support, so chat-only servers (llama.cpp, vanilla ollama OpenAI-compat) go through opencode instead.\n\nSoft goals get one of these `reviewer:` modes:\n\n- `other`: any non-active peer can submit a review on their turn.\n- `both`: every peer must submit `consensus_needed` pass:true reviews.\n- `alternating`: review duty rotates one slot per recorded review.\n- `quorum`: together with `quorum: \"N/M\"`, pass when ≥N of the most recent M reviews were pass:true.\n\n```\nmake build\nmake init-target TARGET=/path/to/your-target\nmake run         TARGET=/path/to/your-target\nmake status      TARGET=/path/to/your-target\n```\n\nOn some hosts the default `pasta` network backend fails with\n`/dev/net/tun: No such device`; `make build` therefore uses\n`BUILD_NETWORK=host` by default. Use `make run NETWORK=host TARGET=...`\nto bypass runtime networking issues too. 
Plain `podman` works without\nthe Makefile:\n\n```\npodman build --network=host -f Containerfile -t peers:dev .\npodman run --rm -it --userns=keep-id --cap-drop=ALL \\\n    --security-opt=no-new-privileges \\\n    -v $PWD:/work \\\n    -v $HOME/.claude:~/.claude \\\n    -v $HOME/.codex:~/.codex \\\n    peers:dev run\n```\n\n`podman compose` works too (see `compose.yaml`) but its\n`docker-compose` provider needs the podman daemon socket.\n\nHost-side requirements: `podman`, `git`, `python3`. The container\nbrings its own Node.js and the Claude/Codex CLIs.\n\nThe `peers-ctl` flow is the recommended way to run unattended:\n\n- **PID-recycle defence.** Each start records the process's kernel-issued starttime via `/proc/<pid>/stat`; `stop` verifies it matches before signalling, so a recycled PID owned by an unrelated process is never killed.\n- **Graceful stop.** `peers-ctl stop` sends SIGTERM, which routes inside the loop into the substrate's KeyboardInterrupt path (state persisted, run.lock released) before falling through to SIGKILL.\n- **Lock status clarity.** `run.lock` is intentionally left on disk after unlock so all contenders use the same inode; `peers status` probes `flock` and distinguishes an active lock from a stale file.\n- **Pre-flight check.** `peers-ctl doctor` flags missing tooling and per-project misconfiguration in one shot, so there are no surprises 20 minutes into a run.\n- **Crash detection.** `peers-ctl reconcile` (run automatically by `list`/`status`/`start`) sees that a recorded PID is dead, marks the project `crashed`, and clears the PID so a fresh `start` is unambiguous.\n- **No daemon.** Each project's loop is a setsid'd background process. `peers-ctl` is a stateless CLI; the registry on disk is the source of truth, accessed under `fcntl.flock` so concurrent invocations serialise their mutations.\n\nThe substrate's health model is **output-driven**: a peer is \"stuck\"\nwhen its child process has written nothing 
to stdout/stderr for\n`idle_timeout_s` seconds. This works well for chatty peers\n(codex by default streams progress) but **claude in -p (print)\nmode is silent until the response is ready**. A claude tick that\nsets up a non-trivial project from scratch can take 5–20+ minutes\nof silent thought before any output appears.\n\nRule of thumb:\n\n| Task scale | `idle_timeout_s` |\n|---|---|\n| Small fixes / single-file edits | 600 (10 min) |\n| Multi-file feature work | 1800 (30 min) |\n| From-scratch project scaffolding | 3600 (60 min) |\n| Heavy refactors of large codebases | 5400 (90 min) |\n\nIf you see runs.jsonl entries with `classification: idle-timeout`,\nyour value is too low. Edit `.peers/config.yaml`:\n\n```\nhealth:\n  idle_timeout_s: 3600\n```\n\n`absolute_max_runtime_s` is a separate paranoid ceiling; set it\nlarger than `idle_timeout_s` (e.g. 2× to 4×).\n\n`claude -p` in its default text-output mode is silent about token\nusage, so `budget.max_usd` and `budget.max_tokens` are effectively\noff: the substrate sees `(tokens, usd) = (0, 0)` after every tick.\n\nFix: switch claude to JSON output. The substrate auto-detects the\nenvelope and pulls `usage.input_tokens + cache_creation + cache_read + output_tokens`\nand `total_cost_usd`.\n\nEdit `.peers/config.yaml` once:\n\n```\npeers:\n  - name: claude\n    tool: claude\n    argv: [\"claude\", \"-p\", \"--dangerously-skip-permissions\",\n           \"--output-format\", \"json\", \"{PROMPT}\"]\n    prompt_mode: argv-substitute\n```\n\nFor incremental output (so a long tick is not silent and `idle_timeout_s`\nsees progress) use `stream-json`:\n\n```\n    argv: [\"claude\", \"-p\", \"--dangerously-skip-permissions\",\n           \"--output-format\", \"stream-json\", \"--verbose\", \"{PROMPT}\"]\n```\n\n`claude` (Claude Code) and `codex` (ChatGPT-bundled) authenticate via\n**OAuth → flat subscription**. 
Their `total_cost_usd` field reports\nthe *API-equivalent* price; the user's incremental cost is $0. A *hard*\nbudget cap is meaningless there — it would kill a run that is already paid for.\n\n`max_usd_mode` controls the policy:\n\n| mode | behavior |\n|---|---|\n| `auto` (default) | inspect `~/.claude/.credentials.json` + `~/.codex/auth.json` (`auth_mode`). All peers OAuth → `warn`; any peer using an API key → `hard`. |\n| `hard` | exit on cap (pre-Phase-3i behavior). Use this if you set `ANTHROPIC_API_KEY` / `OPENAI_API_KEY`. |\n| `warn` | log a one-time warning at the threshold; do NOT exit. |\n| `off` | ignore `max_usd` entirely. |\n\n`peers info` shows the *resolved* mode and the reason it was picked, e.g.:\n\n```\nbudget:  iterations≤20, runtime≤10800s, USD≤$25.0\n  max_usd_mode=warn (auto: all peers OAuth-billed)\n```\n\nEvery `peers init` ships five default goals plus the intentional\n`placeholder-replace-me` hard fail. The default set forces self-review\nand mutual bug-hunting before claiming convergence:\n\n| Gate | Type | Pass when |\n|---|---|---|\n| `self-review-on-handoff` | hard | every handoff commit has `## Self-Review` and `Self-Review: pass` |\n| `bug-hunt-clean` | hard | zero unresolved bugs at severity `crit`/`high`/`med` |\n| `bug-hunt-round-1` | soft (`consensus_needed: 2`) | each peer says \"round 1 done\" |\n| `bug-hunt-round-2` | soft (`consensus_needed: 2`) | each peer says \"round 2 done\" after round-1 fixes landed |\n| `test-coverage-3-class` | soft (`consensus_needed: 2`) | each peer reviewed the other's tests for happy/edge/sad coverage |\n\nA peer files a bug as a standalone commit:\n\n```\nBUG-007: null deref in parser\n\n## Bug-Report\n{\"id\":\"BUG-007\",\"severity\":\"high\",\"fix_by\":\"codex\",\n \"location\":\"src/parser.py:42\",\n \"description\":\"Crashes on empty input; expected: return None.\"}\n\nPeer: claude\nBug-Report: BUG-007\n```\n\nThe `fix_by` peer resolves it with another commit:\n\n```\nResolve BUG-007\n\n## 
Bug-Resolution\n{\"resolves\":\"BUG-007\",\"status\":\"fixed\",\"note\":\"guarded with if not s: return\"}\n\nPeer: codex\nBug-Resolves: BUG-007\n```\n\nInspect anytime:\n\n```\npython3 -m peers.bug_hunt summary           # human rollup\npython3 -m peers.bug_hunt gate /path/to/repo  # exit 0 iff clean\npeers verify                                # re-runs every hard gate, includes bug-hunt-clean\n```\n\nSeverity ladder: `crit` (data loss / RCE) > `high` (broken feature) >\n`med` (degraded UX) > `low` (nit) > `info` (note). Only the top three block completion. A `wontfix` resolution keeps the bug in the counter — use only with the other peer's agreement.\n\nThe full protocol (when to file vs fix, severity guidance, what NOT to\nbug-report) ships in the per-tick prompt as `BUG_HUNT_BLOCK`; peers\nsee it on every turn.\n\nWhen a peer process exits with `classification: \"api-error\"`, the\n`runs.jsonl` entry includes:\n\n```\n\"matched_error_pattern\": \"Authentication failed\",\n\"matched_error_snippet\": \"Authentication failed: token expired ...\"\n```\n\nso you can see *which* `health.error_patterns` regex fired without\ngrepping the raw container log. Any non-success tick also records\n`stderr_tail` and `stdout_tail`; soft-review ticks include\n`soft_reviews_seen`, `soft_reviews_ingested`, and\n`soft_reviews_rejected`.\n\nThe substrate's handoff detection reads git commits, not claude's\nstdout content, so the switch to JSON output is safe — only your\nper-tick `runs.jsonl` console snippet becomes JSON instead of plain\ntext. 
`peers report` summarizes that for you.\n\ncodex emits its own `tokens used` line by default; no config change\nneeded there.\n\nAfter `peers run` completes (or on any later checkout of the finished\nproject) you can re-run every hard goal against the current files,\nwithout spinning up any peer process:\n\n```\npeers verify           # exits 0 iff every gate passes; writes .peers/VERIFY.md\n```\n\nUse it to:\n\n- Confirm `tests-pass`, `ruff-clean`, `smoke-import` (and whatever else is in `goals.yaml`) on a different machine.\n- Validate a hand-edit didn't break a gate.\n- Smoke-test a UI build with `verify.commands`:\n\n```\n# .peers/config.yaml\nverify:\n  timeout_s: 60\n  commands:\n    - name: cli-help\n      cmd: \"PYTHONPATH=src python -m mything --help\"\n    - name: ui-screenshot\n      cmd: \"xvfb-run -a python tools/screenshot.py out.png\"\n      timeout_s: 30\n```\n\n`peers verify` uses `goals.timeout_s` for hard goals unless\n`verify.timeout_s` overrides it. For `verify.commands`, exit code 0 = pass;\nnon-zero or timeout = fail.\nThe combined hard-goals + `verify.commands` result is rendered as a markdown\ntable at `.peers/VERIFY.md`.\n\n- **State durability.** `state.json` is written atomically (tmp + fsync + rename) with a parent-directory fsync, and v1 → v2 schema migration writes a `state.json.pre-migration` backup once.\n- **Self-review on handoff.** The `self-review-on-handoff` hard gate ships on every `peers init`. Every handoff commit must include a `## Self-Review` body section and `Self-Review: pass` trailer. The default gate runs the trusted package checker, not a mutable project-local copy.\n- **Anti-cheating hard-block.** A turn that modifies only test files is reverted (`git revert --no-commit` + commit), success is demoted to fail, the peer keeps the turn, and the warning lands in the next prompt. 
Two reverts in a row mark the peer `degraded`.\n- **Sandboxed `pass_when` DSL.** `regex(...)` and `json('path')` are available; `json()` is restricted to relative paths inside the target repo, refuses symlinks/hardlinks via the safe readers, and has a 2 MiB read cap. `stdout`/`stderr` exposed to the DSL are capped at 1 MiB, string literals and regex patterns are bounded, and `regex()` has a timeout.\n- **Goal-mutation lock.** `goals.yaml`'s sha256 is verified before every tick using no-follow reads; in-loop changes halt the loop with a clear reason, and deletion of `goals.yaml` is treated as mutation.\n- **Control-plane file hardening.** State, logs, reports, verify output, controller registry files, and controller logs refuse symlinks, non-regular files, and hardlinks. Log appends open the parent directory with no-follow semantics to block late parent-symlink swaps. State, goals, project config, and controller registry reads are size-capped before JSON/YAML parsing; `health.error_patterns` also has count and per-pattern size limits before regex compilation.\n- **PID-recycle defence.** `peers-ctl` records each loop's `/proc/<pid>/stat` starttime and refuses to signal a PID whose fingerprint no longer matches.\n- **File-channel race-safe.** Hybrid-comm `send()` uses temp-file + atomic link publication so consumers never see partial messages, and avoids two concurrent senders colliding on the same NNNN.\n- **Audit trail.** `runs.jsonl` records `soft_fail_reason`, tokens & USD per tick, head_before/after, peer_state_after, warnings_emitted, and the `truncated` flag from HealthGuard. `peers init` creates the file up front, and `peers-ctl add/new` creates the controller-side log up front, so there is always a stable place to write or inspect run evidence.\n\n```\nsrc/\n├── peers/                  # the substrate\n│   ├── cli.py              # peers init / run / status / tick / replay / watch / tmux\n│   ├── driver_orchestrator.py      # public 
facade\n│   ├── _driver_orchestrator_impl.py # thin runtime coordinator\n│   ├── driver_*.py          # decomposed lifecycle / observability / health hooks\n│   ├── state_store.py      # schema v2 + v1 migration\n│   ├── turn_manager.py     # round-robin over n peers\n│   ├── goal_engine.py\n│   ├── goals.py            # YAML loader + pass_when DSL\n│   ├── peer_spec.py        # PeerSpec + load_peer_specs\n│   ├── comm_layer.py       # GitCommLayer + HybridCommLayer\n│   ├── health_guard.py     # streaming reader + idle-timeout + truncation\n│   ├── prompt_builder.py\n│   └── templates/\n├── peers_ctl/              # the controller\n│   ├── cli.py              # add / remove / list / start / stop / status / review / logs / tail / prune\n│   ├── store.py            # registry on disk, fcntl-locked\n│   └── runner.py           # detached spawn + PID-recycle defence\n└── auth_proxy/             # OAuth sidecar server\n\ntests/\n├── unit/                   # unit tests\n└── integration/            # smoke + adversarial peer fixtures\n```\n\n- [docs/HOWTO-audit-and-fix.md](/c0decave/peers/blob/main/docs/HOWTO-audit-and-fix.md) — end-to-end recipe to audit + fix an existing application\n- [docs/MODES_IMPLEMENT.md](/c0decave/peers/blob/main/docs/MODES_IMPLEMENT.md) — `implement` mode operator reference\n- [docs/SECURITY.md](/c0decave/peers/blob/main/docs/SECURITY.md) — threat model + per-layer mitigations", "url": "https://wpnews.pro/news/new-version-of-peers-the-ai-couple-doing-things", "canonical_source": "https://github.com/c0decave/peers", "published_at": "2026-06-06 07:08:54+00:00", "updated_at": "2026-06-06 07:46:53.738901+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-research", "ai-safety", "large-language-models"], "entities": ["peers", "Claude Code", "Codex"], "alternates": {"html": "https://wpnews.pro/news/new-version-of-peers-the-ai-couple-doing-things", "markdown": "https://wpnews.pro/news/new-version-of-peers-the-ai-couple-doing-things.md", "text": 
"https://wpnews.pro/news/new-version-of-peers-the-ai-couple-doing-things.txt", "jsonld": "https://wpnews.pro/news/new-version-of-peers-the-ai-couple-doing-things.jsonld"}}