Show HN: LoopFlow – loop engineering for Claude Code


Stop prompting your coding agent. Design the loop that prompts it.

LoopFlow turns Claude Code into a system that runs itself: you declare a goal, a pipeline of agents, and a verification gate in one YAML file — LoopFlow iterates until the gate passes, the budget runs out, or the attempt limit is hit. One agent writes, a different agent checks, and a memory file makes every run smarter than the last.

$ loopflow run test-and-fix

Iteration 1/3
  ▸ fix …
    done · $0.31 · resume: claude --resume 072f1abb…
  ▸ review (gate) …
    gate FAIL · $0.12
    │ The date parser fix only handles ISO strings; the failing test
    │ also feeds epoch millis. Root cause not addressed.

Iteration 2/3
  ▸ fix …
    done · $0.28
  ▸ review (gate) …
    done · $0.11

✓ success · 2 iteration(s) · $0.82

[Figure: LoopFlow demo, release-check loop, 2 iterations]

For two years the workflow was: write a prompt, read the output, write the next prompt. You held the tool the whole time.

That's changing. As Boris Cherny (creator of Claude Code) put it: "I don't prompt Claude anymore. I have loops running that prompt Claude."

A loop is a recursive goal: you define what "done" looks like, and the agent iterates until it gets there. But doing this raw has three sharp edges:

- Agents grade their own homework. The model that wrote the fix will happily declare it works.
- Unattended loops burn money. A loop running itself is also a loop making mistakes, and spending tokens, unattended.
- The agent forgets everything between runs. Every run re-derives what the last run already learned.

LoopFlow is a small, sharp tool built around exactly those three problems:

Problem	LoopFlow answer
Self-grading	Gates — a separate agent, with a separate persona, must output `VERDICT: PASS` before the loop ends
Runaway cost	Budgets — a hard USD ceiling enforced twice: by the runner and by Claude Code's own `--max-budget-usd` on every step
Amnesia	Memory — a plain Markdown file per loop, appended after every run, injected into every prompt. The agent forgets; the repo doesn't
Collisions	Worktrees — opt-in git worktree isolation, so loops never fight you (or each other) for the working tree
Auditability	Every step logs a session id — `claude --resume <id>` drops you into the full transcript of any step, any time
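
The double budget enforcement in the table can be pictured as follows. This is a minimal sketch, not LoopFlow's actual code: `BudgetState`, `nextStepCeiling`, and `claudeArgs` are hypothetical names, and handing each step the *remaining* budget as its ceiling is an assumption. Only `-p` and `--max-budget-usd` come from this README.

```typescript
// Hypothetical sketch; these names are illustrative, not LoopFlow's API.
interface BudgetState {
  maxUsd: number;   // budget.max_usd from the loop file
  spentUsd: number; // cost of every step so far, all iterations included
}

// First enforcement (the runner): decide whether another step may start.
// Returns the ceiling for the next step, or null when the budget is used up.
// Works in whole cents to avoid floating-point drift.
function nextStepCeiling(state: BudgetState): number | null {
  const remainingCents =
    Math.round(state.maxUsd * 100) - Math.round(state.spentUsd * 100);
  return remainingCents > 0 ? remainingCents / 100 : null;
}

// Second enforcement (Claude Code itself): pass the ceiling to the step,
// so a single runaway step cannot overshoot what is left.
function claudeArgs(prompt: string, ceilingUsd: number): string[] {
  return ["-p", prompt, "--max-budget-usd", ceilingUsd.toFixed(2)];
}
```

With `max_usd: 2.00` and $0.82 already spent, the next step would be capped at $1.18; once spending reaches $2.00, the runner would stop before invoking anything.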

No API keys, no daemon, no cloud. If `claude` works in your terminal, `loopflow` works.

npm install -g @loopflow/cli   # or: npx @loopflow/cli

cd your-project
loopflow init             # scaffolds .loopflow/ with three starter loops
loopflow run test-and-fix --dry-run   # see exactly what each agent will be told
loopflow run test-and-fix             # run it for real

Requirements: Node 18+, Claude Code installed and authenticated.

name: test-and-fix
description: Run the test suite, fix failures, verify the fix.

budget:
  max_usd: 2.00        # hard ceiling for the whole run, all iterations included
  max_iterations: 3    # how many attempts the gate may reject

worktree: false        # set true to run in an isolated git worktree

defaults:
  permission_mode: acceptEdits

steps:
  - id: fix
    role: >            # persona — appended to Claude's system prompt
      You are a careful maintainer. You make the smallest change that fixes
      the problem, and you never weaken a test to make it pass.
    prompt: |
      Run this project's test suite. Diagnose and fix the root cause of any
      failure. Re-run to confirm. Summarize what you changed and why.

  - id: review
    gate: true         # ← the loop cannot succeed until this step says PASS
    role: >
      You are a skeptical senior engineer reviewing a change you did not
      write. You trust nothing without evidence.
    prompt: |
      A previous agent claims to have fixed failing tests. Inspect the diff,
      re-run the suite yourself, and check no test was weakened or deleted.

            ┌──────────────────────────────────────────────┐
            │              iteration (≤ max)               │
            │                                              │
 memory ──▶ │  step: fix ──▶ step: review (gate) ──┐       │
   ▲        │      ▲                               │       │
   │        │      └── reviewer feedback ◀── FAIL ─┤       │
   │        └──────────────────────────────────────┼───────┘
   │                                               │ PASS
   └──────────────── run record ◀──────────────────┘
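
The control flow in the diagram can be sketched roughly like this. It is a simplified, synchronous model under stated assumptions, not the real engine: `Step`, `StepResult`, `Outcome`, and `iterate` are hypothetical names, and each step is a stub function instead of a headless Claude Code run.

```typescript
// Hypothetical types; the real engine's internals are not shown in this README.
type Outcome = "success" | "gate_exhausted" | "budget_exhausted";

interface StepResult {
  costUsd: number;
  output: string;
  verdict?: "PASS" | "FAIL"; // only meaningful for gate steps
}

interface Step {
  id: string;
  gate: boolean;
  // feedback is null on the first iteration, the gate's output on retries
  run: (feedback: string | null) => StepResult;
}

function iterate(
  steps: Step[],
  maxIterations: number,
  maxUsd: number,
): { outcome: Outcome; iterations: number; costUsd: number } {
  let spent = 0;
  let feedback: string | null = null;

  for (let i = 1; i <= maxIterations; i++) {
    let passed = true;
    for (const step of steps) {
      // The budget is checked before every step, not once per iteration.
      if (spent >= maxUsd) {
        return { outcome: "budget_exhausted", iterations: i, costUsd: spent };
      }
      const result = step.run(feedback);
      spent += result.costUsd;
      // Anything other than an explicit PASS from a gate restarts the loop.
      if (step.gate && result.verdict !== "PASS") {
        feedback = result.output;
        passed = false;
        break;
      }
    }
    if (passed) {
      return { outcome: "success", iterations: i, costUsd: spent };
    }
  }
  return { outcome: "gate_exhausted", iterations: maxIterations, costUsd: spent };
}
```

Feeding this the costs from the transcript at the top (fix $0.31, review FAIL $0.12, fix $0.28, review PASS $0.11) ends in `success` after two iterations, with the reviewer's complaint handed to the second `fix` run.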

- Each step is one headless Claude Code run (`claude -p`). Steps see the loop's **memory**, the **outputs of earlier steps** in the iteration, and, on retries, the **gate's feedback**.
- A gate must end with `VERDICT: PASS` or `VERDICT: FAIL`. No verdict counts as FAIL: an unverified pass is not a pass.
- On FAIL, the loop starts over with the reviewer's feedback injected into every prompt.
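
The verdict rule above is easy to get subtly wrong, so here is one way it could be written. This is a sketch under assumptions, not LoopFlow's parser: `parseVerdict` is a hypothetical helper, and requiring the verdict on the last non-empty line is one strict reading of "must end with".

```typescript
// Hypothetical helper: illustrates the rule, not LoopFlow's actual parsing.
type Verdict = "PASS" | "FAIL";

function parseVerdict(gateOutput: string): Verdict {
  // Only the last non-empty line counts: the gate must *end* with its verdict.
  const lines = gateOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const last = lines[lines.length - 1] ?? "";

  const match = /^VERDICT:\s*(PASS|FAIL)$/.exec(last);
  // A missing or malformed verdict counts as FAIL.
  return match !== null && match[1] === "PASS" ? "PASS" : "FAIL";
}
```

Defaulting to FAIL means a reviewer that crashes, rambles, or buries "VERDICT: PASS" mid-output can never end the loop by accident.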

Every run appends a record to `.loopflow/memory/<loop>.md` (outcome, cost, and the final summary), which the next run reads.
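
To make the memory mechanism concrete, here is a sketch of what appending a record and injecting memory into a prompt might look like. The record layout, the `date` field, the section headings, and the names `RunRecord`, `formatRecord`, and `composePrompt` are all assumptions made for illustration; the README only says a record holds outcome, cost, and the final summary.

```typescript
// Hypothetical shapes; the actual memory file layout is not documented here.
interface RunRecord {
  date: string;    // assumed field, e.g. an ISO date
  outcome: string; // e.g. "success"
  costUsd: number;
  summary: string; // the final summary produced by the run
}

// Renders one record as a Markdown section to append to the memory file.
function formatRecord(record: RunRecord): string {
  return [
    `## ${record.date}: ${record.outcome}`,
    `- cost: $${record.costUsd.toFixed(2)}`,
    "",
    record.summary.trim(),
    "",
  ].join("\n");
}

// Prepends the accumulated memory to a step prompt; a first run has none.
function composePrompt(memory: string, prompt: string): string {
  const trimmed = memory.trim();
  if (trimmed.length === 0) return prompt;
  return `# Memory from earlier runs\n${trimmed}\n\n# Task\n${prompt}`;
}
```

Because the file is plain Markdown, it can be reviewed, edited, or pruned by hand like any other file in the repo.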

Here's what that looks like in a real run — a release-check loop catching a debug artifact the fix step missed:

`loopflow init` gives you three loops designed to be stolen from:

- `test-and-fix`: fixer + skeptical reviewer gate. The canonical write/verify pair.
- `debt-audit`: a discovery loop. Maintains `.loopflow/reports/debt-audit.md` and uses memory to track what got fixed, what's new, and what keeps being ignored.
- `docs-sync`: finds documentation that drifted from the code, fixes it in an isolated worktree, and a gate verifies every claim against the source.

Got a loop of your own? Contribute it to the cookbook — community loops live in loops/.

LoopFlow deliberately ships no daemon. Use the scheduler you already have:

0 9 * * 1  cd /path/to/project && loopflow run debt-audit

schtasks /create /tn "debt-audit" /sc weekly /d MON /st 09:00 ^
  /tr "cmd /c cd /d C:\path\to\project && loopflow run debt-audit"

CI works too: a GitHub Action that runs `loopflow run docs-sync` weekly and opens a PR from the kept worktree branch is ~20 lines.

A loop changes the work — it doesn't delete you from it. LoopFlow's design assumes three things stay true:

- Verification is still on you. Gates catch the obvious failures, but `--verbose` and `claude --resume <session-id>` exist so you can read what the loop actually did. Read it.
- Comprehension debt is real. The faster a loop ships code you didn't write, the faster the gap grows between what exists and what you understand. Memory files and kept worktrees are designed to be read by humans, not just machines.
- The comfortable posture is the dangerous one. When the loop runs itself, it's tempting to stop having an opinion. Design the loop with judgment, then keep judging the output.

Build the loop. But build it like someone who intends to stay the engineer, not just the person who presses go.

Command	What it does
`loopflow init [--force]`	Scaffold `.loopflow/` with starter loops
`loopflow list`	List loops with steps, gates, and budgets
`loopflow validate [name]`	Validate loop definitions (all by default)
`loopflow run <name>`	Run a loop
`--dry-run`	Print every composed prompt; invoke nothing
`-i, --iterations <n>`	Override `budget.max_iterations`
`-b, --budget <usd>`	Override `budget.max_usd`
`-v, --verbose`	Print full step outputs

Exit codes: `0` success · `1` loop failed (gate exhausted, budget, error) · `2` configuration error. Cron- and CI-friendly.
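
The exit-code contract above amounts to a three-way mapping. A minimal sketch, assuming hypothetical outcome names (`Outcome` and `exitCodeFor` are not part of the documented API; only the codes 0, 1, and 2 and their meanings come from this README):

```typescript
// Hypothetical outcome names; only the numeric codes come from the README.
type Outcome =
  | "success"
  | "gate_exhausted"   // every allowed attempt was rejected by the gate
  | "budget_exhausted" // the USD ceiling was hit
  | "error"            // a step crashed or could not run
  | "config_error";    // the loop definition is invalid

function exitCodeFor(outcome: Outcome): 0 | 1 | 2 {
  switch (outcome) {
    case "success":
      return 0;
    case "config_error":
      return 2; // fix the YAML; retrying will not help
    default:
      return 1; // the loop ran but did not reach a passing gate
  }
}
```

Keeping configuration errors on their own code lets a scheduler tell "the loop failed, look at the work" apart from "the loop never started, look at the config".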

Everything the CLI does is exported:

import { loadLoop, runLoop } from "@loopflow/cli";

const loop = loadLoop(process.cwd(), "test-and-fix");
const result = await runLoop(loop, { root: process.cwd() });
console.log(result.outcome, result.costUsd);

`loopflow daemon`: built-in scheduler with cron expressions in `loop.yaml`

Parallel steps (fan-out across worktrees)
Structured gate verdicts via --json-schema
Loop run history & loopflow logs
Adapters for other headless agents (Codex CLI, …)

The most valuable contribution is a loop that solved a real problem for you (see CONTRIBUTING.md). Code contributions: the engine is ~600 lines of typed, tested TypeScript; `npm test` runs in under a second.

source & further reading

github.com — original article
