{"slug": "i-stopped-trusting-claude-s-code-reviews-so-i-built-a-skill-that-puts-my-code-on", "title": "I stopped trusting Claude's code reviews, so I built a skill that puts my code on trial", "summary": "A developer built Tribunal, an adversarial Claude skill that reviews code diffs by pitting biased agents against each other to produce more honest feedback. The system uses per-file 'haters' that attack the diff on technical merits and a cross-module bug hunter, with a judge that validates each accusation against actual code intent. The skill is open-source, portable across Claude Code and Claude Cowork, and supports multiple programming languages.", "body_md": "Every time I asked Claude to review my branch, I got one of two answers: a cheerful **\"Looks good! 👍\"** or a vague list where I couldn't tell a real bug from a matter of taste. The model wants to please you. That's exactly the problem.\n\nSo I built **Tribunal** — a Claude skill that reviews your diff *adversarially*, in stages, where the honest signal comes from agents fighting each other instead of one polite model.\n\nA single model told to \"be critical\" still hedges — it's trained to be agreeable. So instead of one balanced reviewer, Tribunal runs **one-sided roles that collide**:\n\nOne agent per file, deliberately biased. It tears the diff apart as if a clueless amateur wrote it — focused only on what changed. But strictly on the merits: correctness, races, leaks, edge cases, security. No style nitpicks.\n\nPer-file haters are blind to cross-module bugs. A separate agent hunts exactly those: a changed function signature whose caller still calls the old way, a return shape a consumer no longer matches, invariants out of sync across files.\n\nFor each accusation, the judge digs into the actual code and decides honestly: was this **deliberate and justified**, or **genuinely weak**? 
It's allowed to use docs and comments as evidence of intent, the opposite of the hater, who ignores them as excuses.\n\nThe final stage keeps only the spots the judge **couldn't defend**, or conceded are weak even while defending the choice. Everything else is moved to a full transcript.\n\nThe balance doesn't live inside any single agent; it comes from the clash *between* them. A hater that can **only** attack, meeting a judge that **only** looks for justification, produces a sharper, more honest signal than one model trying to be \"balanced\" on its own.\n\nAnd the hater is allowed to return nothing. On a clean diff it's not forced to invent problems: empty is a valid, honest result.\n\nThe output is a ranked report written to `docs/reviews/`, plus a short chat summary: what to actually fix, by severity (critical → major → minor), with a concrete fix for each.\n\nIt's **portable**: pure Claude sub-agents (the `Agent` tool), no external runtime, no dependencies. Works in **Claude Code** and **Claude Cowork**, in any language (Python, JS/TS, Go, Rust, Java… one config line to add yours).\n\nIt's MIT and free: [https://github.com/hekman316/claude-skill-tribunal](https://github.com/hekman316/claude-skill-tribunal)\n\nInstall is one paste: ask Claude to fetch the `SKILL.md` from the repo and drop it in `~/.claude/skills/`. Then in any repo just say `/tribunal`.\n\nI'm genuinely curious what people think of the adversarial-roles approach. Does forcing the model into one-sided roles actually beat just asking it to be harsh? 
Would love feedback — or attempts to break it.", "url": "https://wpnews.pro/news/i-stopped-trusting-claude-s-code-reviews-so-i-built-a-skill-that-puts-my-code-on", "canonical_source": "https://dev.to/hekman316/i-stopped-trusting-claudes-code-reviews-so-i-built-a-skill-that-puts-my-code-on-trial-1ll0", "published_at": "2026-06-13 07:32:35+00:00", "updated_at": "2026-06-13 07:47:38.940289+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "large-language-models", "ai-products"], "entities": ["Claude", "Tribunal", "GitHub", "Claude Code", "Claude Cowork"], "alternates": {"html": "https://wpnews.pro/news/i-stopped-trusting-claude-s-code-reviews-so-i-built-a-skill-that-puts-my-code-on", "markdown": "https://wpnews.pro/news/i-stopped-trusting-claude-s-code-reviews-so-i-built-a-skill-that-puts-my-code-on.md", "text": "https://wpnews.pro/news/i-stopped-trusting-claude-s-code-reviews-so-i-built-a-skill-that-puts-my-code-on.txt", "jsonld": "https://wpnews.pro/news/i-stopped-trusting-claude-s-code-reviews-so-i-built-a-skill-that-puts-my-code-on.jsonld"}}