Show HN: Pantheon – AI vs AI: one writes the code, the other attacks it

A developer released Pantheon, an open-source multi-agent harness for Claude Code that runs coding tasks through parallel implementations and adversarial verification to catch bugs that single-pass models miss. The tool uses a pipeline of planning, parallel implementation, adversarial testing by a second AI, and synthesis, and is available in two versions: pantheon (Claude-only) and pantheon-x (cross-model with GPT-5.5).

Two Claude Code skills that run a hard coding task through a multi-agent harness instead of a single model pass: plan → N parallel implementations → adversarial verification → judge.

The point isn't a smarter model. It's that a second and third implementation, plus an independent reviewer whose job is to break the result, catch bugs a single pass ships green. It's a packaging of well-worn techniques (best-of-N sampling, tool-integrated self-correction, and LLM-as-judge / adversarial verification) wired into one `/pantheon` command so you don't reassemble them by hand each time.

This is scaffolding around the model, not a change to it: it won't rescue a task the model fundamentally can't reason about, but it reliably tightens correctness on coding work whose answer you can express as tests.

The harness runs a deterministic pipeline:

Plan ──▶ Implement ×N (parallel) ──▶ Verify (adversarial ×V) ──▶ Synthesize

| Stage | Agents and role |
|---|---|
| Plan | 1 planner |
| Implement ×N (parallel) | N builders; each self-corrects against its own tests (T1) |
| Verify (adversarial ×V) | reviewers; try to BREAK each green build |
| Synthesize | judge picks winner + grafts best ideas |

- Plan: derive a tight spec, a test plan that defines correctness, and N distinct strategies (before any code).
- Implement: N builders implement different strategies in parallel; each runs its own tests and self-corrects on failure (tool-integrated self-verification, up to 5 iterations).
- Verify: independent adversarial reviewers try to break each green build; a build refuted by a majority is dropped.
- Synthesize: a judge picks the winner and lists superior ideas worth grafting from the runners-up.

The value: a build can pass its own tests yet still be wrong. The adversarial layer catches defects the self-written tests miss, instead of rubber-stamping a green build.

| Skill | Adversarial verifier | Requirements |
|---|---|---|
| pantheon | Claude itself (independent agents) | Paid Claude Code plan + Workflows (see below) |
| pantheon-x | GPT-5.5 via Codex plugin (cross-model) | Above + OpenAI Codex plugin (`codex:codex-rescue`) |

pantheon-x is the stronger setting: the implementation written by Claude is attacked by a different model, which shrinks single-model blind spots (the same mistake slipping past a same-model verifier). If you don't have Codex/GPT-5.5, use pantheon. Both skills share the same harness (`pantheon-class.js`); they differ only in the `crossModelVerify` flag.

These skills drive Claude Code's Workflow orchestration engine, so a stock/Free setup is not enough:

- Claude Code ≥ v2.1.154 on a paid plan: Pro, Max, Team, or Enterprise (also Bedrock / Vertex / Foundry). Not available on the Free tier.
  - On Pro, enable it once: `/config` → turn on Dynamic workflows.
- pantheon-x only: the cross-model verifier runs as the `codex:codex-rescue` subagent, which ships in OpenAI's Codex plugin, not stock Claude Code. A logged-in `codex` CLI alone does not register it. Install the plugin:
  - `/plugin marketplace add openai/codex-plugin-cc`
  - `/plugin install codex@openai-codex`

  plus a ChatGPT subscription or `OPENAI_API_KEY`, and the `codex` CLI on PATH. If `codex:codex-rescue` isn't installed, use pantheon instead; pantheon-x would otherwise silently skip the adversarial pass and pass every build.

Skills and subagents themselves are stock Claude Code features; no extra setup beyond the above.
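
To make the majority-drop rule concrete, and to show why a skipped adversarial pass lets every build through, here is a minimal Python sketch. It is not taken from `pantheon-class.js`: `Build`, `refuted_by_majority`, `surviving_builds`, and `require_reviews` are hypothetical names chosen for illustration, and treating a tied vote as "not refuted" is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Build:
    """One green build from a builder (hypothetical structure)."""
    name: str
    # One entry per reviewer: True means that reviewer refuted (broke) the build.
    refutations: list[bool] = field(default_factory=list)


def refuted_by_majority(build: Build) -> bool:
    """True when strictly more than half of the reviewers refuted the build.

    Assumption: a tie does not count as a majority.
    """
    votes = build.refutations
    return sum(votes) * 2 > len(votes)


def surviving_builds(builds: list[Build], require_reviews: bool = True) -> list[Build]:
    """Drop every build refuted by a majority of its reviewers.

    With require_reviews=True, a build that received no reviews raises an
    error instead of passing by default.
    """
    survivors = []
    for build in builds:
        if require_reviews and not build.refutations:
            raise RuntimeError(f"{build.name}: no adversarial review ran")
        # With zero reviews there are zero refutations, so the build is kept.
        if not refuted_by_majority(build):
            survivors.append(build)
    return survivors


reviewed = [
    Build("strategy-a", [True, True, False]),   # refuted by 2 of 3 reviewers
    Build("strategy-b", [False, True, False]),  # refuted by 1 of 3 reviewers
]
print([b.name for b in surviving_builds(reviewed)])
```

Under these assumptions only `strategy-b` survives. Calling `surviving_builds([Build("strategy-c")], require_reviews=False)` keeps the unreviewed build, which mirrors the silent-skip case; the default `require_reviews=True` raises instead, one possible way to make a missing verifier fail loudly.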

Clone into your Claude Code skills directory (personal install):

    git clone https://github.com/lolu1032/pantheon-skills.git
    cp -R pantheon-skills/pantheon ~/.claude/skills/pantheon
    cp -R pantheon-skills/pantheon-x ~/.claude/skills/pantheon-x

Or for a single project, copy into