An adversarial agent-pair harness for shipping code with AI. One model
drives a task through a pipeline (plan → implement → review); a
different model reviews the work adversarially; the output is a draft PR you
review before merge. The collision of two independent model perspectives is the
point: automatic self-agreement is what it exists to prevent. (A single-model
TDD mode that front-loads write_test → verify_test is also available; see
Phases by mode.)
Status: early. redteam was built as one project's internal harness and then extracted into this standalone repo, which owns it going forward; it has driven real, merged pull requests. (Its early git history reflects that origin, including cross-repo coordination from the parent project.) APIs and layout may still move.
Quick install (Claude Code), two commands:
/plugin marketplace add https://github.com/AscendyProject/redteam
/plugin install redteam@ascendy-redteam
Not on Claude Code? Vendor it into any repo; see Install.
Given a batch of tasks (each a short input.md brief), the orchestrator walks
every task through a fixed pipeline. It persists state.json after each phase,
so a run is fully resumable, and it retries implement when the reviewer returns
CHANGES_REQUESTED:
flowchart TD
PO[plan_outcome]:::worker --> PRV[plan_review]:::rev
PRV --> IMPL[implement]:::worker
IMPL --> RC[review_code]:::rev
RC -->|APPROVED| CPR[create_pr → draft PR]:::worker
RC -->|CHANGES_REQUESTED| IMPL
RC -. blocker persists .-> RES[rescue]:::rev
RES --> HGR[human_gate_rescue] --> CPR
CPR --> DONE([done]):::done
classDef worker fill:#e3f2fd,stroke:#1976d2,color:#0d47a1;
classDef rev fill:#fce4ec,stroke:#c2185b,color:#880e4f;
classDef done fill:#e8f5e9,stroke:#388e3c,color:#1b5e20;
Blue = worker model (writes) · pink = reviewer model (adversarial, fresh).
This is the default agent-pair flow. By design it runs with no human gates in the common path: the adversarial pair plus verification is the trust, and the output is a draft PR (your existing human checkpoint before merge), not an auto-merge. Human gates are something you add back for risky changes, not the default tax on every change; see When to use it.
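The persist-after-each-phase behaviour can be pictured with a minimal sketch. It assumes a hypothetical state.json shape (a single `completed` list) and a caller-supplied `run_phase` callback; the real orchestrator's schema, retry loop and rescue handling are not shown here and may differ.

```python
import json
from pathlib import Path
from typing import Callable

# Default agent-pair phases, taken from the flowchart above.
PHASES = ["plan_outcome", "plan_review", "implement", "review_code", "create_pr"]


def load_state(state_path: Path) -> dict:
    """Return the saved state, or a fresh one if no state.json exists yet."""
    if state_path.exists():
        return json.loads(state_path.read_text())
    return {"completed": []}  # hypothetical schema, for illustration only


def run_task(task_dir: Path, run_phase: Callable[[str], None]) -> dict:
    """Run every phase not yet recorded as completed, saving after each one."""
    state_path = task_dir / "state.json"
    state = load_state(state_path)
    for phase in PHASES:
        if phase in state["completed"]:
            continue  # finished in an earlier run, so skip on resume
        run_phase(phase)  # may raise; nothing is recorded for a failed phase
        state["completed"].append(phase)
        state_path.write_text(json.dumps(state, indent=2))
    return state
```

Because the file is rewritten only after a phase returns, a crash mid-phase leaves the previous state intact, and calling `run_task` again picks up at the phase that failed.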
The mode setting (agent-pair by default, or tdd) decides which phases run. The
authority is _phase_order() in orchestrator.py (AGENT_PAIR_PHASE_ORDER /
TDD_PHASE_ORDER); if you drive the pipeline manually, follow the row for the
declared mode, not the prose:
| Mode | Core phases |
|---|---|
| agent-pair (default) | plan_outcome → plan_review → implement → review_code → create_pr |
| tdd | plan_outcome → write_test → verify_test → implement → review_code → create_pr |
The agent-pair worker writes its tests inside implement; there is no separate
test-authoring phase. The second perspective is the adversarial reviewer
(review_code), and the plan is independently checked by plan_review. The TDD
mode instead drops plan_review and front-loads a write_test → verify_test pair
before implement. So write_test / verify_test (the test-author / test-verifier
sub-agents) run in TDD mode only; inserting them into an agent-pair task runs a
phase the mode excludes.
(The table shows the worker + reviewer phases. A rescue slot is entered only if
a blocker persists across review rounds; in the untiered default a rescue is then
human-reviewed (human_gate_rescue) before the PR. A plan-approval gate is opt-in
per tier profile.)
Each phase is run by a focused sub-agent with its own prompt and tool scope
(.claude/agents/*.md): an outcome-planner, implementer, code-security-reviewer, and pr-author, plus a test-author / test-verifier pair used only in TDD mode. The reviewer is a fresh agent that only sees the diff and the project's security checklist; it never sees the implementer's reasoning.
A plain "two-model" setup stops at "a second model takes a second look." redteam makes that separation structural and then acts on it:
- Findings are tiered, not pass/fail. The reviewer emits findings with a severity (blocker / major / minor), and the orchestrator tracks each one across review rounds (a carry-over count); a review is not a single thumbs up/down.
- Persistent problems escalate on a ladder. A blocker that survives multiple rounds climbs: retry the worker → a heavier rescue pass → hand to a human (ask_user). So one rejection doesn't kill a run, and a stubborn real bug doesn't get rubber-stamped after a single retry.
- The reviewer is blind to the writer. It's a fresh agent (and a configurably different model) that sees only the diff and the security checklist, never the implementer's reasoning, so self-justification can't cross the boundary.
- The draft PR is the human checkpoint; the default common path has no gates. The pair plus verification is the automated trust, so the output is a draft PR you review before merge; it never autonomously merges. Blocking human gates (plan approval, etc.) are opt-in per tier for risky changes, not the default.
- Either model on either side, zero runtime deps. See Model freedom and Install.
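The carry-over count and the escalation ladder can be sketched as follows. The `Finding` shape, the stable `key` used to recognise the same finding across rounds, and the two thresholds are all assumptions made for illustration; the source does not state how many rounds trigger each rung.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    key: str             # hypothetical stable identifier for "the same problem"
    severity: str        # "blocker", "major" or "minor"
    carry_over: int = 0  # how many earlier rounds already reported it


def carry_forward(previous: dict[str, Finding],
                  current: list[Finding]) -> dict[str, Finding]:
    """Bump the carry-over count of findings that were also in the last round."""
    tracked = {}
    for finding in current:
        seen = previous.get(finding.key)
        count = seen.carry_over + 1 if seen else 0
        tracked[finding.key] = Finding(finding.key, finding.severity, count)
    return tracked


def blocker_step(carry_over: int, rescue_after: int = 2,
                 ask_user_after: int = 3) -> str:
    """Map a blocker's carry-over count to a rung (thresholds are assumed)."""
    if carry_over >= ask_user_after:
        return "ask_user"
    if carry_over >= rescue_after:
        return "rescue"
    return "retry"
```

A finding that disappears from a round drops out of the tracked set, so if it reappears later its count starts again at zero.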
Roles bind to providers through a small adapter registry, not hardcoded calls. Today Claude and Codex can each take either side:
| role | providers implemented |
|---|---|
| worker (planner / implementer) | claude , codex |
| reviewer / rescue | codex , claude |
You choose per role in .redteam/config.toml [models]. A reviewer value that
isn't an adapter (e.g. "human") falls back to the manual flow (you paste the review and touch the sentinel). Adding another provider is one adapter file plus one registry line.
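To make the adapter-registry idea concrete, here is a minimal sketch of a name-to-callable registry with a manual fallback. Every name in it (`REGISTRY`, `register`, `resolve`, the stub adapters) is hypothetical; the real adapters invoke the provider CLIs and their interface is not documented in this section.

```python
from typing import Callable, Optional

Adapter = Callable[[str], str]  # assumed shape: prompt in, model output out

REGISTRY: dict[str, Adapter] = {}


def register(name: str) -> Callable[[Adapter], Adapter]:
    """Decorator that records an adapter under a provider name."""
    def decorator(fn: Adapter) -> Adapter:
        REGISTRY[name] = fn
        return fn
    return decorator


@register("claude")
def run_claude(prompt: str) -> str:
    # Stub: a real adapter would call the claude CLI here.
    return f"[claude would receive] {prompt}"


@register("codex")
def run_codex(prompt: str) -> str:
    # Stub: a real adapter would call the codex CLI here.
    return f"[codex would receive] {prompt}"


def resolve(value: str) -> Optional[Adapter]:
    """Return the adapter for a configured value, or None for the manual flow."""
    return REGISTRY.get(value)
```

With this shape, a value such as "human" resolves to `None`, which the caller can treat as "wait for a pasted review", and a new provider needs only one more decorated function.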
The goal is minimal human intervention, without losing trust: the adversarial pair is the automated trust, so the common path has no human gates. But not every change needs the same weight: a typo shouldn't pay for a full agent-pair, and an auth change shouldn't ship with only the light path. So you scale the response to the risk of the change:
| change | response |
|---|---|
| trivial: non-behavior-changing (rename, comment, formatting) | single-agent, no review |
| routine: small, local, reversible | single-agent loop; review optional |
| guarded: behavior change with real blast radius (auth, storage, concurrency, public API, migrations) | the adversarial pair + verification (the default) |
| strategic / production-critical: architectural, irreversible, or changes prod posture | the pair plus human gates (and a rollback plan you require) |
Tier-aware routing lets the harness apply this automatically (opt-in via
config.toml). You define tier profiles as declarative toggles, and a deterministic classifier picks each task's tier:
[tiers.0] # trivial
review = false # single-agent, no adversarial pair
models = { implementer = "claude-haiku-4-5" } # cheap model
[tiers.2] # guarded (a sensible default)
review = true # the adversarial pair; no human gate
[tiers.4] # production-critical
review = true
gates = ["outcome", "pr"] # add human checkpoints back here
[tier_triggers]
"**/auth/**" = 4 # touching auth floors the task at tier 4
default = 2 # unclassified → safe default
The binding tier is max(declared, path-triggered, default): a task can be
raised but never lowered below what its paths demand, and an unclassified task
falls to the mandatory safe default. With no [tiers] section, routing is off and
every task takes the default pipeline (fully backward-compatible).
Two levers also work on their own, without tiers:
- Model per role ([models]): a cheaper implementer for routine work, a frontier reviewer for guarded work; either provider on either side.
- The escalation ladder: a blocker finding that survives review rounds climbs retry → rescue, concentrating effort where a problem actually persists.
Trigger globs are git-pathspec-style: * matches within a path segment, **
matches across directories (so **/auth/** matches auth/x at any depth).
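The two wildcard rules can be approximated by translating a glob into a regular expression. This is a simplified sketch covering only `*` and `**`, not git's full pathspec grammar and not the harness's own matcher; `glob_to_regex` and `path_triggered_tiers` are hypothetical names.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate a glob using only * and ** into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # stays within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def path_triggered_tiers(paths: list[str], triggers: dict[str, int]) -> list[int]:
    """Tiers of every trigger glob that matches at least one declared path."""
    tiers = []
    for pattern, tier in triggers.items():
        if pattern == "default":
            continue  # the default is a fallback value, not a glob
        regex = glob_to_regex(pattern)
        if any(regex.fullmatch(path) for path in paths):
            tiers.append(tier)
    return tiers
```

Under this translation `**/auth/**` matches both `auth/x` and `src/auth/login.py`, but not `src/oauth/x.py`, because the optional directory prefix must end on a slash.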
Scope note: v1 path triggers match the paths a task declares in its front-matter, and tier profiles vary review/gates/models over the canonical pipeline (not arbitrary phase orders). Re-checking the real committed diff and richer profiles are tracked on issue #13.
This repo doubles as a single-plugin marketplace, so two commands install it:
/plugin marketplace add https://github.com/AscendyProject/redteam
/plugin install redteam@ascendy-redteam
The HTTPS URL works everywhere, including behind firewalls that block SSH (port 22). The
AscendyProject/redteam shorthand also works if you have GitHub SSH keys configured.
That registers the six sub-agents and the /redteam:* commands. Run the
redteam-install command (also exposed as a redteam-install tool on PATH) from
your project root to vendor the harness in, then use the others as needed:
/redteam:redteam-install # vendor .redteam/ into the current repo
/redteam:redteam-new-task # scaffold the next task-NNN dir + input.md from the template
/redteam:redteam-review # one-shot cross-model review of the current branch diff
/redteam:redteam-config # choose the per-role models (writer / reviewer / rescue)
/redteam:redteam-status # show the pipeline status for a batch
python3 .redteam/scripts/install.py /path/to/your/project
python3 .redteam/scripts/install.py /path/to/your/project --dry-run
Useful flags: --overwrite (refresh harness-owned files; never touches your
config.toml / docs/* / batches/), --protect-config (opt-in: add Claude Code
Edit/Write deny rules for .redteam/config.toml to the consumer's
.claude/settings.json, add-only; the runtime pairing guard is the backstop
regardless), and --check (report whether a vendored install is behind this
harness version, then exit; writes nothing).
Either way it's the same vendoring model: the harness ships inside your project
tree (.redteam/) because the engine resolves your repo root from its own file
location. Harness-owned files (workflows/, prompts/, templates/, agent
skeletons) are re-vendored on each run (--overwrite to refresh); project-owned
files (config.toml, docs/*, verify.sh, your batches/) are seeded once and never
overwritten.
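Resolving the repo root from the engine's own location could work roughly like this. It is a sketch under the assumption that the engine always sits somewhere below a directory named .redteam; `repo_root_from` is a hypothetical helper, and the real engine may resolve symlinks or use a fixed parent depth instead.

```python
from pathlib import Path


def repo_root_from(engine_file: Path) -> Path:
    """Walk upward from an engine file to .redteam/ and return its parent."""
    for parent in engine_file.absolute().parents:
        if parent.name == ".redteam":
            return parent.parent
    raise RuntimeError(f"{engine_file} is not inside a .redteam/ directory")


# Inside a vendored module this would typically be called as:
#   ROOT = repo_root_from(Path(__file__))
```

This is why the engine has to be vendored rather than imported from elsewhere: moving the file out of the project tree would change the root it computes.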
The installer does not vendor the harness's own unit tests, so a consumer
never runs (or maintains) them; your verify.sh runs your tests, not the
engine's. The vendored .redteam/ engine follows the harness's own style, so
exclude .redteam/ from your project's linter/formatter (e.g. ruff's
extend-exclude, an eslint ignore) to avoid it flagging code you don't own.

- Python 3.11+ (stdlib only, zero runtime pip dependencies).
- The model CLIs you configure, installed and authenticated: claude and/or codex.
A vendored install is a copy of the engine in your repo, so it doesn't update
itself; you re-vendor when a new version ships. --overwrite refreshes only
harness-owned trees (workflows/, prompts/, templates/, scripts/install.py,
the six agent skeletons, and the .redteam/.redteam-version stamp); your existing
project-owned files (config.toml, docs/*, verify.sh) and your task content
under batches/ are never overwritten (the installer only ensures an add-only
batches/.gitignore rule there, leaving your files intact).
--check compares the source side against your vendored stamp, so it's only meaningful when the source is the newer one; run it from an updated plugin (redteam-install …) or a fresh clone. Running your repo's own vendored .redteam/scripts/install.py against that same repo compares the stamp to itself, so it can't reveal an upstream release (it just echoes the vendored version, or unknown if the stamp is missing). Exit codes: 0 current/ahead · 1 outdated · 2 cannot determine. It writes nothing.
The plugin ships the engine and puts redteam-install on PATH, so updating has two layers. Refresh the plugin first, then re-vendor the engine it carries:
/plugin marketplace update ascendy-redteam # refresh the cached marketplace
/plugin update redteam@ascendy-redteam # update the plugin to the latest
/plugin list # confirm the new version
/reload-plugins # apply updated commands/agents (no restart needed)
Then re-vendor the engine into your repo and confirm. Because redteam-install
self-locates the plugin's (now-updated) source, its --check meaningfully
compares that against your repo's vendored stamp:
redteam-install . --check # plugin source vs your vendored stamp: 1 = outdated
redteam-install . --overwrite # re-vendor the new engine into .redteam/
redteam-install . --check # expect "verdict: up-to-date."
bash .redteam/scripts/verify.sh # your gate still passes
Pull the latest of this repo (your clone), then run the clone's installer against your project so the source side is the updated one:
python3 /path/to/redteam-clone/.redteam/scripts/install.py /path/to/your/project --check
python3 /path/to/redteam-clone/.redteam/scripts/install.py /path/to/your/project --overwrite
Do the update on a branch and open a PR (don't push the engine bump straight to
your default branch), and keep .redteam/ excluded from your linter as in Install.
Edit .redteam/config.toml for your stack (paths, verify_command,
branch_prefix, role→model), then fill the three project docs the sub-agents read:
- .redteam/docs/project-context.md: stack + hard rules
- .redteam/docs/security-checklist.md: the reviewer's hard lines
- .redteam/docs/test-conventions.md: how your test suite is wired
Two complete examples to copy the shape from: examples/fastapi-like/ (Python:
FastAPI + Celery + Postgres + a vector DB) and examples/nuxt-like/ (JS/TS:
Nuxt 3 + Vue + Vitest).
python3 .redteam/workflows/orchestrator.py new .redteam/batches/<batch> <slug> [--title "..."]
python3 .redteam/workflows/orchestrator.py start .redteam/batches/<batch>
python3 .redteam/workflows/orchestrator.py resume .redteam/batches/<batch>
python3 .redteam/workflows/orchestrator.py status .redteam/batches/<batch>
A batch is a directory of tasks/<task-id>/input.md briefs. The new subcommand
scaffolds the next task-NNN directory with a template input.md (or use
/redteam:redteam-new-task); fill in the brief, then run start. The orchestrator
creates a per-task branch (<branch_prefix>/<task-id>), runs the pipeline, and
stops at each human gate until you touch the sentinel file it names.
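Picking the next task-NNN name can be sketched as below. The three-digit zero padding is inferred from the NNN placeholder, and `next_task_id` is a hypothetical helper; the real scaffolder also writes the template input.md, which is left out here.

```python
import re
from pathlib import Path


def next_task_id(tasks_dir: Path) -> str:
    """Return 'task-NNN' one above the highest existing task directory."""
    numbers = [
        int(match.group(1))
        for entry in tasks_dir.iterdir()
        if entry.is_dir() and (match := re.fullmatch(r"task-(\d+)", entry.name))
    ]
    return f"task-{max(numbers, default=0) + 1:03d}"
```

Taking the maximum rather than the directory count keeps numbering monotonic even if an earlier task directory was deleted.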
One-shot review (no batch). To run just the adversarial reviewer over your current branch diff (a different provider than whoever wrote the code, read-only):
python3 .redteam/workflows/orchestrator.py review
It reviews git diff <base>...HEAD and exits 0 / 1 / 2 (approved /
changes requested / reviewer failed), so it can gate CI. Exposed as
/redteam:redteam-review in Claude Code. Fail-closed: it refuses if the configured reviewer would collapse to the worker's own provider (self-review).
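A CI step could wrap the review command along these lines. Only the command line and the three exit codes come from the text above; the helper names are hypothetical, and treating any unexpected code as a failure is a choice made here to keep the gate fail-closed.

```python
import subprocess

# Exit codes as documented above.
VERDICTS = {0: "approved", 1: "changes requested", 2: "reviewer failed"}


def verdict_for(returncode: int) -> str:
    """Human-readable verdict; unknown codes count as a reviewer failure."""
    return VERDICTS.get(returncode, "reviewer failed")


def may_merge(returncode: int) -> bool:
    """Only an explicit approval lets the gate pass."""
    return returncode == 0


def run_review_gate() -> bool:
    """Run the one-shot review from the repo root and report the outcome."""
    result = subprocess.run(
        ["python3", ".redteam/workflows/orchestrator.py", "review"],
        check=False,  # inspect the exit code ourselves instead of raising
    )
    print(f"redteam review: {verdict_for(result.returncode)}")
    return may_merge(result.returncode)
```

A CI job would call `run_review_gate()` and exit non-zero when it returns `False`, so both "changes requested" and "reviewer failed" block the merge.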
Issues and PRs welcome. See CONTRIBUTING.md for the dev setup and the gate
(bash .redteam/scripts/verify.sh), and the Code of Conduct. The engine stays project-agnostic and stdlib-only; those two invariants drive most review feedback. To report a vulnerability, see SECURITY.md (don't open a public issue).
Apache License 2.0 (LICENSE). Contributions are accepted under the Contributor License Agreement, which keeps provenance clean and preserves the option of offering the project under other terms.