# /architect: Reduce Fable tokens by 80%, Fable orchestrates/reviews, Codex builds

> Source: <https://github.com/DanMcInerney/architect-loop>
> Published: 2026-06-12 20:33:22+00:00

**Claude Fable is the architect — it designs every slice, freezes the
acceptance gates, and judges the results. GPT-5.5 Codex is the builder and
researcher — it does all the engineering and all the web research, in
parallel, unattended, for hours.** Two Claude Code skills that run this
cross-vendor loop on the flat-rate subscriptions you already have — no API
keys, no token bills.

```
git clone https://github.com/DanMcInerney/architect-loop
cd architect-loop && ./install.sh        # Windows: .\install.ps1
npm i -g @openai/codex@latest            # the builder (Codex CLI >= 0.133)
```

`./install.sh --project` installs to the current repo only instead of
globally. You need [Claude Code](https://claude.com/claude-code) on any paid
plan and the Codex CLI signed into a ChatGPT plan.

```
/architect                                      # the build loop
/architect-research <what you're considering>   # the research loop
```

`/architect` runs one work block: judge the last run, spec the next slice,
dispatch builders. `/architect-research` is for when you're still deciding
*what* to build; its cited report feeds the build loop's PRD.

One short Fable session per work block — judgment only, it never writes code:

- **Spec + gates first.** Fable specs a one-PR slice, splits it into 1–4 lanes with provably disjoint file sets, and commits the acceptance gates to `docs/gates/` *before* any builder starts. Gates are read-only; a builder edit to a gate file fails the slice automatically.
- **Parallel isolated builders.** One fresh `codex exec` (xhigh) per lane, each in its own git worktree. Builders must argue with the spec before building (silent compliance = defect), build only their declared files, and report raw results; they physically can't commit (the sandbox protects `.git`).
- **Fable judges and integrates.** It runs the gate commands itself (builder claims are hearsay), reads the diff against the spec's intent (passing tests ≠ mergeable work), then commits and merges passing lanes. Judgment happens in a fresh session: cross-context review measurably beats same-session review.
- **The repo is the only memory.** `docs/HANDOFF.md` (a short table of contents, pruned every session), `docs/gates/`, `docs/lanes/`, git history. Not in the repo = didn't happen.
- **Supervision built in.** Liveness checks on dispatched runs, stall triage (diagnose the child process tree, kill the narrowest thing), explicit timeouts on every long command.
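The "provably disjoint file sets" requirement in the spec step can be checked mechanically before any builder is dispatched. Below is a minimal sketch in Python, assuming each lane is declared as a name plus a set of repo-relative paths. The `find_lane_overlaps` helper and the lane names are hypothetical and do not come from the skill files.

```python
from itertools import combinations


def find_lane_overlaps(lanes: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return every pair of lanes that declares at least one file in common.

    `lanes` maps a (hypothetical) lane name to the set of repo-relative
    paths that lane is allowed to edit. An empty result means the lanes
    are pairwise disjoint and can be built in parallel worktrees.
    """
    overlaps = {}
    # Sort by lane name so the reported pairs are deterministic.
    for (name_a, files_a), (name_b, files_b) in combinations(sorted(lanes.items()), 2):
        shared = files_a & files_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# Hypothetical lane declarations for one slice.
lanes = {
    "api": {"src/api/routes.py", "src/api/schema.py"},
    "cli": {"src/cli/main.py"},
    "docs": {"README.md", "src/api/schema.py"},  # collides with "api"
}

overlaps = find_lane_overlaps(lanes)
for pair, shared in overlaps.items():
    print(f"{pair[0]} / {pair[1]} overlap on: {sorted(shared)}")
```

This compares exact paths only. Lanes declared as directories or globs would need to be expanded to concrete files first.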

Scout-first, like the production deep-research systems — no fixed lane taxonomy:

- **A cheap Codex scout maps the topic** (~10 searches): canonical terminology, the load-bearing systems and papers, the named people, the topic's natural fault lines. Skipped for comparisons and fact-finds.
- **Fable designs 3–6 topic-specific lanes** from the scout's map, drawing per-source-class tactics from a library (academic citation snowballing, dependents-not-stars repo evidence, emerging-vs-hype gating, production pattern mining, expert tracking), checked for overlap and gaps before dispatch.
- **Parallel Codex researchers** run under hard budgets: search caps, ≤5 subjects per lane, saturation stop, strict findings discipline (URL + date + quote + confidence tag; NOT FOUND beats inference; no recommendations). Expert opinion runs as a second wave, roster-seeded by the first.
- **Fable verifies and writes.** ≥2 independent sources per load-bearing claim, adversarial falsification searches, citations only from URLs actually fetched; then one author writes one decision-oriented report. Gathering parallelizes; synthesis never does.
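The findings discipline and the two-source rule can be made concrete with a small record type. This is a sketch under stated assumptions: the `Finding` fields mirror the URL, date, quote, and confidence tag named above, but the field names, the confidence values, and the use of distinct hostnames as a stand-in for "independent sources" are all illustrative choices, not the skill's actual format.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Finding:
    claim: str       # the claim this evidence supports
    url: str         # page actually fetched
    date: str        # publication or access date, as reported
    quote: str       # verbatim supporting text
    confidence: str  # hypothetical tags: "high", "medium", "low"


def under_sourced(findings: list[Finding], min_sources: int = 2) -> list[str]:
    """Return claims backed by fewer than `min_sources` distinct hosts.

    Distinct hostnames are a rough proxy for independence (assumption):
    two pages on the same site count as one source.
    """
    hosts: dict[str, set[str]] = {}
    for f in findings:
        hosts.setdefault(f.claim, set()).add(urlparse(f.url).netloc.lower())
    return sorted(claim for claim, seen in hosts.items() if len(seen) < min_sources)


# Hypothetical findings, for illustration only.
findings = [
    Finding("X scales linearly", "https://a.example/post", "2025-01-10", "scales linearly", "high"),
    Finding("X scales linearly", "https://b.example/paper", "2024-11-02", "linear scaling", "medium"),
    Finding("Y is deprecated", "https://a.example/notes", "2025-03-01", "deprecated", "low"),
    Finding("Y is deprecated", "https://a.example/faq", "2025-03-04", "no longer supported", "low"),
]
print(under_sourced(findings))
```

Here the second claim is flagged because both of its findings come from the same host, even though there are two of them.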

Each piece is there because evidence put it there (full citations in
[DESIGN.md](/DanMcInerney/architect-loop/blob/main/DESIGN.md)):

- Weak planners hurt more than weak executors — so the strongest model does the design, and builders get exhaustive specs.
- Manager + worktree-isolated workers is the measured-best topology for shared-artifact software work; naive shared-file coordination collapses throughput.
- Frozen external gates beat trusting the agent — but agents game visible tests and their passing PRs are frequently unmergeable, so the architect also reads the diff.
- Memory files rot — so the handoff stays a short map, and detail lives in linked gate/lane files.
- Every production deep-research system uses planner-designed decomposition, none uses fixed lanes — so research lanes are designed per topic, after a scout pass.

| File | What it is |
|---|---|
| [skills/architect/SKILL.md](/DanMcInerney/architect-loop/blob/main/skills/architect/SKILL.md) | |
| [skills/architect/dispatch.md](/DanMcInerney/architect-loop/blob/main/skills/architect/dispatch.md) | `codex exec` commands, builder block, worktree fan-out, stall triage |
| [skills/architect/research.md](/DanMcInerney/architect-loop/blob/main/skills/architect/research.md) | |
| [skills/architect/HANDOFF.template.md](/DanMcInerney/architect-loop/blob/main/skills/architect/HANDOFF.template.md) | |
| [skills/architect-research/SKILL.md](/DanMcInerney/architect-loop/blob/main/skills/architect-research/SKILL.md) | |
| [skills/architect-research/lanes.md](/DanMcInerney/architect-loop/blob/main/skills/architect-research/lanes.md) | |
| [tests/validate_skills.py](/DanMcInerney/architect-loop/blob/main/tests/validate_skills.py) | |

**Do I need API keys?** No. Claude Code runs on your Claude plan; Codex CLI
on your ChatGPT plan.

**What does a run cost?** Builder/researcher runs draw on your ChatGPT
plan's 5-hour and weekly quotas; a multi-hour run is a meaningful fraction
of a weekly window. Fable's architect sessions are minutes, not hours.

**What if a builder wrecks things?** Nothing reaches a branch until the
architect's tamper, boundary, and gate checks pass — worktrees are
discarded and re-dispatched from the freeze commit.
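The tamper and boundary checks amount to comparing a lane's changed paths against what it declared. Below is a hedged sketch, assuming the changed paths have already been collected (for example with `git diff --name-only` against the freeze commit). The `check_lane` function and its return shape are illustrative, not the skill's actual implementation.

```python
from pathlib import PurePosixPath

GATES_DIR = PurePosixPath("docs/gates")  # gate files are read-only for builders


def check_lane(changed: list[str], declared: set[str]) -> dict[str, list[str]]:
    """Classify a lane's changed paths into tamper and boundary violations.

    tamper:   any edit under docs/gates/ (fails the slice automatically)
    boundary: any edit outside the lane's declared file set
    """
    tamper, boundary = [], []
    for raw in changed:
        path = PurePosixPath(raw)
        if GATES_DIR in path.parents:
            tamper.append(raw)
        elif raw not in declared:
            boundary.append(raw)
    return {"tamper": tamper, "boundary": boundary}


# Hypothetical lane: declared two files, touched three.
result = check_lane(
    changed=["src/cli/main.py", "docs/gates/cli.md", "src/api/routes.py"],
    declared={"src/cli/main.py", "src/cli/args.py"},
)
print(result)
```

A lane with any entry in either list would be discarded rather than merged.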

**Can I watch a run?** Yes. Every dispatch prints the builder block, so you
can paste it into an interactive `codex` session with `/goal` instead.

**Why two skills?** Research-grade fan-out costs ~15× chat-level tokens — it
should be a deliberate act, not a side-effect of the build loop.

MIT