The Hooman Method: my attempt at capturing and formalizing how I use and combine chat and coding agents across my projects.

How to run the method. The reasoning, provenance, and references live in the companion: hooman-notes.md (same gist).

Status: Draft v0.7 (2026-05-29). Frames the method as an escalation protocol (explore informally below the threshold; add structure only once re-entry / delegation / review / mutation-safety begin to bite); moves scale-to-stakes to the front as the first decision; adds a scope-intake gate (name the smallest useful slice or decompose: big is allowed, undefined is not), kill criteria for the friction audit, a process-as-avoidance failure mode, and a visible-artifacts-over-opaque-memory rule. A compress-and-clarify pass, not new machinery. Full changelog in the companion.

Canonical version: https://gist.github.com/hooman/5811ee3bb7c235573299400167403985
Local copies may lag; treat the gist as authoritative.

This guide is operational: how to set up a workspace and run the method day to day. Read it to operate. The companion hooman-notes.md holds the why: motivation, what's distinctive, ecosystem positioning, verified references, the candidate-binding roadmap, and open questions. Consult it when you want the reasoning behind a prescription. Each rule here keeps its short why inline; the extended rationale lives in the companion.
This is the method's own "table of contents, not textbook" rule applied to itself.

Hooman does not govern all AI-assisted work, and is not meant to. Below the threshold, work the normal way: chat, sketch, experiment, run ad-hoc reconnaissance and code sessions. Invoke the method only when informal exploration stops being enough, that is, when the work is large or durable enough that re-entry, delegation, review, or mutation-safety start to cost you. Concretely, escalate when two or more hold:

- you expect to return to this after a gap;
- an Executor will inspect or mutate nontrivial material;
- there is more than one plausible path;
- mistakes are expensive or hard to reverse;
- you need an audit trail.

Below that line the apparatus is overhead; above it, its absence is. This is scale-to-stakes (from Bootstrapping), stated up front because it is the first operational decision, not a classification to file every effort into from the start.

The scope-intake gate: big is allowed, undefined is not. The failure this method most has to guard against is not premature invocation; it is scope gravity, where a large, interesting effort enters the system as fog. So before the first real brief, name three things: the smallest independently useful slice, the exclusion zone (what is explicitly out), and the first decision that needs making. If you cannot name the slice, the next task is not setup or implementation; it is scope decomposition. This is the direction check (below), fired at the moment of escalation.

Working with an LLM on anything non-trivial hits a predictable ceiling: a single conversation can't hold the project's full context, detail work pollutes the attention budget, and decisions get re-derived each session. But the ceiling that actually governs this method is sharper than that, and stating it explicitly explains most of what follows.
This method is built for a single principal doing intermittent, high-stakes work across many unrelated domains, whose main job is not this project and whose attention is the scarce resource. The dominant cost in that situation is not bad output; it is re-entry: picking a project back up after a three-week gap and reconstructing what was decided, what was half-done, and what must not be touched. So the method optimizes for re-entry being cheap, delegation being the default, and the principal's scarce attention going to judgment rather than to rediscovering context or shuttling handoffs.

The response that works is role separation with mechanical artifacts: a conversational assistant (Chat) that holds strategy and direction, a tool-equipped agent (the Executor) that investigates and implements, and artifacts in the project tree that survive across sessions and across handoffs to entirely new agents. The human stays in the seat that can't be delegated (priorities, constraints, final decisions) and pushes reconnaissance, implementation, inventory, and cleanup outward.

Chat is a high-level conversational assistant. Default register: conceptual, strategic, architectural, decision-focused. Chat's job:

- Hold the project's vision and direction.
- Frame problems and surface decisions.
- Translate intent into well-shaped briefs for the Executor.
- Review Executor output, weigh trade-offs, decide.
- Keep the long arc visible across sessions.

What Chat does not do: bulk-process detail (read codebases page-by-page, generate file inventories, summarize long docs). That work is offloaded. The exception that matters: Chat may inspect a targeted piece of evidence (a single pivotal file, one failing test, a narrow diff) when seeing it directly changes a decision. The rule is "don't bulk-process details", not "never look"; an over-abstracted Chat that refuses to look at anything becomes hostage to the Executor's framing.

The Executor is a tool-equipped agent (e.g. Claude Code) with filesystem and execution access. The distinction from Chat is embodiment, not subject matter: the Executor has hands (it can read, run, and write), and it fills this role whether the work is code, a bill of materials, or a research brief. The role runs in two modes with different safety profiles, and the method keeps them separate:

- Reconnaissance. Given a brief, investigate the relevant slice, surface trade-offs and the decisions that need human input, and recommend a path, without mutating anything. Read-only. This is the mode most users underuse: the Executor is a remarkably effective design partner when the brief asks it to think, not to type.
- Mutation. Once scope is locked through Chat iteration, execute. Writes are real; the discipline tightens accordingly (show-diff, tests, rollback).

The two modes carry different review requirements (see Output contracts) and, on a capable tool, different enforcement: reconnaissance maps to a read-only constraint like plan mode; mutation maps to show-diff-before-write. Treating them as one undifferentiated "do the work" role is the most common way unreviewed writes slip through.

When Chat finds itself about to read a file, summarize a long doc, inventory a directory, or otherwise generate details rather than direction, that is the signal to write an Executor brief instead. The test: is the content I'm about to produce conceptual (belongs in Chat) or inventory / details (belongs to the Executor)?

A second axis, contextual weight, applies to persistence. Before adding anything to a context file that loads automatically into future sessions (AGENTS.md, the project context file, SKILL.md frontmatter, anchor docs), ask whether the marginal signal justifies the marginal attention cost. Every line in a persistent context file claims a slice of every loader's attention budget, every session. Lines that aren't pulling their weight are net-negative.
The directional axis asks where content goes now; the contextual-weight axis asks whether it earns persistence at all.

Scarce attention makes review cost the real bottleneck, not execution. Vague delegated output forces you to reconstruct context just to judge it, which is the exact cost the method exists to avoid. So every delegated task carries a contract: what "done enough to decide quickly" looks like for this kind of work. The contract is stated in the brief and is what you check the response against.

Three contracts cover most work. Match the contract to the task, not to the role.

- Reconnaissance (read-only investigation). Returns: decision points surfaced; an evidence map tagging every material claim as inspected (read directly), inferred (reasoned from adjacent evidence), or unverified (assumed, not checked); assumptions marked; a recommended path; non-recommended paths briefly rejected; and an explicit statement that nothing was implemented. The evidence map is the load-bearing field: agents reliably produce plausible plans that under-report what they never actually looked at, and the tag forces that distinction into the open.
- Mutation (implementation). Returns: files changed; tests run and their result; risks remaining; manual checks still needed; rollback notes where relevant. Show-diff-before-write applies.
- Analysis (research, drafting, non-code synthesis). Returns: source quality separated from speculation; the strongest opposing evidence, not just the supporting case; practical decision implications; uncertainty labeled. This is the contract for work the Executor does outside software (a bill of materials, an analytical brief, a literature scan), where "good output" differs from a code plan.

A delegated task with no answer to "what does done-enough-to-decide look like?" isn't ready to dispatch. Of the three, the evidence map is the one to keep if you keep only one: it is what makes a plausible plan auditable instead of merely trusted.
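As a minimal sketch of the reconnaissance contract above, the evidence map can be held as tagged claims and scanned mechanically before a human reviews the report. Everything named here (`Evidence`, `Claim`, `ReconReport`, `review_flags`, and the sample report contents) is a hypothetical illustration, not part of the method; only the three tags and the contract fields come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class Evidence(Enum):
    INSPECTED = "inspected"    # read directly
    INFERRED = "inferred"      # reasoned from adjacent evidence
    UNVERIFIED = "unverified"  # assumed, not checked


@dataclass
class Claim:
    text: str
    evidence: Evidence


@dataclass
class ReconReport:
    # Hypothetical container mirroring the reconnaissance contract's fields.
    decision_points: list[str]
    claims: list[Claim]
    recommended_path: str
    rejected_paths: list[str] = field(default_factory=list)
    nothing_implemented: bool = False


def review_flags(report: ReconReport) -> list[str]:
    """Return the items a reviewer should look at first.

    Flags missing contract fields and surfaces every unverified claim.
    """
    flags = []
    if not report.decision_points:
        flags.append("no decision points surfaced")
    if not report.claims:
        flags.append("evidence map is empty")
    if not report.recommended_path:
        flags.append("no recommended path")
    if not report.nothing_implemented:
        flags.append("missing explicit 'nothing was implemented' statement")
    for claim in report.claims:
        if claim.evidence is Evidence.UNVERIFIED:
            flags.append(f"unverified claim: {claim.text}")
    return flags


# Illustrative report; the claims are made up for the example.
report = ReconReport(
    decision_points=["keep or replace the current parser"],
    claims=[
        Claim("parser.py has no tests", Evidence.INSPECTED),
        Claim("callers tolerate a new error type", Evidence.UNVERIFIED),
    ],
    recommended_path="wrap the parser, add tests first",
    rejected_paths=["full rewrite: too much blast radius"],
    nothing_implemented=True,
)
flags = review_flags(report)
print(flags)
```

The check does not judge the plan; it only makes the unverified parts impossible to overlook, which is the job the text assigns to the evidence map.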
The methodology accretes artifacts, rules, conventions, and process. Without discipline, this becomes the kind of process manual nobody reads.

The test for every new rule, template, convention, or artifact category: what specific friction did this solve? If the answer is hypothetical ("might be useful one day"), the rule isn't ready to codify. Write it only when real friction shows up. Maintain it only as long as the friction remains.

This applies equally to additions to this methodology, to the project handbook, to the glossary, and to operational templates. Conventions earn their place by paying their way.

The retrospective form: friction audit. The same test applied backwards on a recurring pass. Walk the existing artifact set (anchors, roadmap tracks, glossary entries, operational templates, any bindings adopted) and ask of each: what friction does this still solve? Items that no longer pay their way get deprecated or removed; items where the friction has shifted get re-scoped; items still earning their place stay. Treat the friction audit as one audit type among the others (see Audits: review artifacts); its findings seed a maintenance cycle the same way any audit does. Run on a slow cadence, or whenever accretion outpaces use.

Kill criteria: deletion triggers, not just adoption triggers. "What friction does this still solve?" is necessary but not sufficient: without concrete signs to delete, the guardrail stays pure discipline, which is what the method otherwise tries to avoid. Treat these as heuristics that prompt a removal decision, not a tracking system. An artifact or mechanism is a candidate for deletion or merge when:

- it has not been consulted in several cycles;
- you catch yourself avoiding it;
- it duplicates what the project context file already says;
- it has too few live entries to earn its own file.

The friction audit is where they get applied.
Underlying principle: every persistent context file is a table of contents, not a textbook. Anything that loads into a reader's working context (agent or human, entry-point doc or SKILL frontmatter or cycle-brief boilerplate) should point at content, not contain it. Detail lives in the doc that owns the topic and is loaded just-in-time. This keeps every reader's attention budget available for the work, not the navigation.

Within this, project documents serve different audiences. Mixing audiences in one document leads to bloat and to docs nobody fully reads. The methodology distinguishes three categories:

- Agent-specific docs: entry points for AI agents. Lean, pointer-heavy. Tell an agent how to operate in this workspace and where to find context. The canonical example is AGENTS.md, increasingly a project-root convention across AI coding tools. Its effectiveness depends on being operational rather than narrative: command-first (the exact commands an agent should run), task-organized (sections by what an agent does, not by topic), closure-defined (every section says how an agent knows the task is done). Keep it under roughly 200 lines; past that, neither humans nor agents reliably read it.
- Human-leaning docs: entry points for humans (returning collaborators, new contributors). The canonical example is the project handbook (see below). Agents can read them, but they're optimized for humans.
- Hybrid docs: content that serves both. Anchors (philosophy, invariants, personas), roadmap tracks, the glossary. Both audiences read them; both benefit from the same content.

Specific cases of the underlying principle: AGENTS.md doesn't restate the philosophy; it points to PHILOSOPHY.md. The handbook doesn't restate the rules; it points to this guide. The glossary doesn't restate full architectural definitions; it points to anchors. Definitions stay canonical, duplication stays low.

A line limit without a pruning order invites arbitrary cuts.
When AGENTS.md grows past its budget, remove in this order until it fits:

1. Narrative rationale: move to the handbook or an anchor.
2. Duplicated rules: keep one canonical statement, delete the rest.
3. Stale setup notes: anything describing a state the repo has left behind.
4. Repo-specific content: move down into that repo's nested AGENTS.md (see below).
5. Commands that belong in repo docs: point to them rather than inlining them.

Patterns to avoid:

- Prose paragraphs in AGENTS.md. Operational policy reads as imperative bullets, not narrative.
- Ambiguous directives ("be careful," "use good judgment"). Replace with concrete rules or remove.
- Contradictory priorities across sections. If two sections imply different orderings, pick one and reconcile.
- "Docs AIs read but humans don't." Files written entirely in agent-pleasing telegraphic style accrete around AI-heavy projects and quietly displace the human-readable doc surface. The handbook is the cure: keep it human-leaning and don't let it drift toward terse agent-style brevity.

When multiple context files exist (AGENTS.md, CLAUDE.md, user-level memory, per-folder rules, overlays), they can contradict each other. Agents resolve such conflicts arbitrarily by default. Pick an explicit precedence rule for your workspace and document it in the handbook. Typical default: user-level < workspace-level < repo-level < per-folder; later overrides earlier. Whatever you choose, write it down: a sleeping precedence ambiguity is a future hour of debugging.

Nested AGENTS.md as the natural application. In a multi-repo workspace, AGENTS.md doesn't live at a single level; it's a tree. A workspace-root AGENTS.md owns cross-cutting concerns (where artifacts live, conventions common to all repos). Each sibling repo's own AGENTS.md owns repo-specific content (stack, package manager, test commands, non-obvious patterns). Repo-level files extend the root; they don't restate it.
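The "later overrides earlier" default can be pictured as an ordered merge. This is a minimal sketch under two loud assumptions: that each context file has already been reduced to a dictionary of named rules, and that the level names and rule values shown are hypothetical. Real agents do not expose such a merge; the code only illustrates what the written-down precedence rule implies.

```python
# Typical default from the text: later levels override earlier ones.
PRECEDENCE = ["user", "workspace", "repo", "folder"]


def resolve_rules(layers: dict[str, dict[str, str]]) -> dict[str, tuple[str, str]]:
    """Merge per-level rule dicts in precedence order.

    Returns rule name -> (value, level that supplied it), so the
    winning source of every rule stays visible.
    """
    resolved: dict[str, tuple[str, str]] = {}
    for level in PRECEDENCE:
        for name, value in layers.get(level, {}).items():
            resolved[name] = (value, level)  # a later (closer) level overwrites
    return resolved


# Hypothetical rules: the workspace root states two, the repo restates one.
layers = {
    "workspace": {"artifacts_dir": "docs/process", "test_cmd": "make test"},
    "repo": {"test_cmd": "pytest -q"},
}
rules = resolve_rules(layers)
print(rules["test_cmd"])       # ('pytest -q', 'repo')
print(rules["artifacts_dir"])  # ('docs/process', 'workspace')
```

Keeping the supplying level next to each value is a deliberate choice in this sketch: when two files disagree, the useful question is which file won, not only which value did.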
The precedence rule above makes the layering unambiguous: closer files win for rules they state; rules they don't state inherit from the next level out.

The artifacts that make the methodology work, in roughly the order you would create them. They appear on friction, not on day one (see Bootstrapping).

A workspace is a parent directory holding one or more repos as flat siblings, plus a small set of process artifacts at the root.