Show HN: Gave Claude Code ADHD.. Now it thinks 3x better
Researchers introduced ADHD, a method that fans out parallel divergent branches under different cognitive frames to prevent premature convergence in large language model agents. Across six open-ended engineering problems, ADHD won 5/6 against a single-shot baseline, achieving mean improvements of +5.17 in novelty, +4.17 in breadth, and +7.67 in trap detection on a 0–10 rubric. The method addresses the failure mode where models default to the first plausible answer, making it particularly valuable for creative and design-shaped tasks where the goal is to escape high-probability responses.
Preprint · v0.1 · 2026-05-25
Large language model agents exhibit premature convergence: when asked to ideate on an open-ended design problem they default to the first plausible candidate and polish it, producing competent but forgettable output. We introduce ADHD, a method that fans out N parallel divergent branches under structurally different cognitive frames (e.g. regulator, speedrunner, biology, $0 budget), with no cross-branch context, then converges via a separate critic pass that scores, clusters, and deepens only the top-K survivors. ADHD differs from Chain-of-Thought along three load-bearing axes: branches are isolated rather than shared, branching is driven by vantage-point reframing rather than next-step variation, and the generator/critic split is enforced mechanically (separate LLM calls with opposite system prompts) rather than promised by a single context. Across six open-ended engineering problems judged by an independent LLM-as-judge, ADHD wins 5/6 against a single-shot baseline at the same model, with mean improvements of +5.17 in novelty, +4.17 in breadth, and +7.67 in trap detection on a 0–10 rubric. We argue ADHD is the right inference-time structure for creative, interdisciplinary, and design-shaped tasks where the failure mode is not wrong but obvious.
A modern LLM, prompted with "give me a few ways to do X", will almost always produce the same three answers a senior practitioner would. This is not a bug at the token level (those are the high-probability completions), but it is a failure at the task level whenever the user's purpose is to escape the high-probability answer. We call this failure mode premature convergence: the model evaluates as it generates, the early tokens anchor the late tokens, and the output is the centroid of the training distribution dressed up as a recommendation.
Premature convergence is most costly in exactly the regimes where ideation matters most: architecture decisions, API and SDK design, debugging fuzzy intermittent failures, refactor planning, naming, positioning, and any task whose deliverable is a set of viable options rather than a single answer. In these tasks the textbook answer is often the trap, and the interesting answer lives in what the original divergent-ideation skill calls "the awkward middle, past the first three" [1].
Existing inference-time methods address adjacent problems. Chain-of-Thought (CoT) [2] makes one head reason more slowly along one path, exposing the intermediate steps so the model does not skip them.
We propose ADHD: a method that produces such a range by structurally preventing the generator from converging during divergence, and only converging in a separate, posterior critic pass. ADHD borrows the tree structure of Tree-of-Thought (ToT) but replaces its branching driver (next-step search) with vantage-point reframing, and replaces ToT's intermingled generator/evaluator with two strictly separated LLM calls. The result, on the evaluations we report below, is a method that wins clearly against a single-shot baseline on novelty, breadth, and trap detection, the dimensions premature convergence destroys.
CoT makes one head think slower. ToT makes one head search wider. ADHD makes many heads think differently, in parallel, then has a critic pick.
Chain-of-Thought [2] elicits intermediate reasoning by prompting or fine-tuning the model to "think step by step". It is decisively useful on multi-step problems with verifiable answers (arithmetic, symbolic reasoning), but it is a single linear trace: each step is conditioned on the previous, which is precisely the anchoring dynamic ADHD is designed to break.
Tree-of-Thought [3] generalises CoT to a tree of intermediate "thoughts" with explicit search (BFS or DFS) and an evaluator function that scores partial states. ToT is the closest neighbour of ADHD, and ADHD can be described as a ToT variant. The differences are not cosmetic: (i) ToT's branches share a single conversational context, so anchoring still occurs across steps, and (ii) ToT's branching driver is next-step search rather than vantage-point reframing.
Multi-Agent Debate [6] has multiple instances critique each other across rounds; this can improve factuality but converges aggressively toward consensus, which is the opposite of what ideation needs.
A separate strand of work assigns the model a role ("you are an expert X") to bias output style or domain knowledge. ADHD's cognitive frames superficially resemble this but differ in intent: frames are chosen not for expertise but for structural distortion. The "10-year-old" frame is not asked to be correct; it is asked to ignore convention. The "speedrunner" frame is not asked to be authoritative; it is asked to look for glitches. Frames are vantage-point operators, not credentials.
ADHD operationalises a written skill on divergent ideation [1] that prescribes a divergence/convergence loop with explicit anti-patterns ("convergence disguised as divergence", "weird-for-weird's-sake with no convergence", "refusing to commit"). Our contribution is to turn that prose into a mechanically enforceable runtime: separate LLM calls, isolated branches, and scoring-then-deepening rather than scoring-during-generating.
ADHD is a two-phase loop with a hard mechanical separation between phases.
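The two-phase separation can be illustrated with a minimal TypeScript sketch. It is not the package's implementation: the `LLM` function type, `branchPrompt`, `selectSurvivors`, `runTwoPhase`, the prompt strings, and the "index: score" reply format are all hypothetical names and conventions chosen for illustration, and the clustering and deepening steps are left out. What it tries to show is only that each generator call sees a single frame and nothing from other branches, and that scoring happens in a separate call with an opposite system prompt.

```typescript
// Hypothetical names throughout: this is not the ADHD package's actual API.
// `LLM` stands in for any function that makes one isolated model call.
type LLM = (systemPrompt: string, userPrompt: string) => Promise<string>;

interface Idea { frame: string; text: string }
interface ScoredIdea extends Idea { score: number }

// Opposite system prompts for the two phases (illustrative wording).
const GENERATOR_SYSTEM = "Generate ideas only. Do not evaluate, rank, or filter them.";
const CRITIC_SYSTEM = "Evaluate ideas only. Do not propose new ones.";

// Build the user prompt for one branch. It contains the problem and a single
// frame, and nothing produced by any other branch.
function branchPrompt(problem: string, frame: string, k: number): string {
  return `Frame: ${frame}\nProblem: ${problem}\nList ${k} ideas, one per line.`;
}

// Parse critic output of the assumed form "index: score" and keep the top-K
// ideas. Unparseable lines are ignored; unscored ideas default to 0.
function selectSurvivors(pool: Idea[], criticReply: string, topK: number): ScoredIdea[] {
  const scores: number[] = pool.map(() => 0);
  for (const line of criticReply.split("\n")) {
    const match = /^\s*(\d+)\s*:\s*(\d+(?:\.\d+)?)\s*$/.exec(line);
    if (match && Number(match[1]) < pool.length) {
      scores[Number(match[1])] = Number(match[2]);
    }
  }
  return pool
    .map((idea, i) => ({ ...idea, score: scores[i] ?? 0 }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Phase 1 (diverge): one generator call per frame, run in parallel.
// Phase 2 (converge): one separate critic call over the pooled ideas.
async function runTwoPhase(
  llm: LLM, problem: string, frames: string[], k: number, topK: number,
): Promise<ScoredIdea[]> {
  const branches = await Promise.all(
    frames.map(async (frame): Promise<Idea[]> => {
      const reply = await llm(GENERATOR_SYSTEM, branchPrompt(problem, frame, k));
      return reply
        .split("\n")
        .map((line) => line.trim())
        .filter((line) => line.length > 0)
        .slice(0, k)
        .map((text) => ({ frame, text }));
    }),
  );
  const pool = ([] as Idea[]).concat(...branches);
  const listing = pool.map((idea, i) => `${i}: ${idea.text}`).join("\n");
  const criticReply = await llm(
    CRITIC_SYSTEM,
    `Problem: ${problem}\nIdeas:\n${listing}\nReply with one "index: score" line per idea, scores 0 to 10.`,
  );
  return selectSurvivors(pool, criticReply, topK);
}
```

Because `llm` is passed in as a parameter, the sketch can be exercised with a stub that returns canned text, which also makes it easy to check that no branch prompt ever contains another branch's frame or output.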
Given a problem p, we select N frames F_1, …, F_N from a library of 15 (e.g. regulator, speedrunner, biology, $0 budget). Each frame drives one generator call that returns k ideas. Critically, the N calls do not share context. The regulator branch never reads what the speedrunner branch produced. Anchoring is eliminated by construction, not by prompting.
The frame library is tagged code, design, general, and wild. When codeMode is enabled (the default), we bias selection toward engineering-relevant tags but always reserve one slot for a wild frame to preserve range.
With the pool of N × k ideas in hand, we run three further calls. The final output is the wide set (clustered), a 2–4 idea shortlist with the non-obvious-but-viable pick flagged explicitly, the trap list, the deepened sketches with their child ideas, and one wildcard provocation drawn from the highest-novelty leaf.
Three invariants are load-bearing. Removing any of them collapses ADHD into a method that already exists.
We implement ADHD as a Node/TypeScript library on top of the Claude Agent SDK [8]. The package ships a CLI adhd "