The successor to vibe coding is ultracoding: let agents programmatically spawn copies of themselves via code execution. Dynamically spin up multi-agent hierarchies in a task-dependent manner and, in doing so, scale up to previously unheard-of tasks. Ride the dragon of exponential productivity.
It feels like the jump from single-threaded scripts to MapReduce and Spark: fan out across many workers, add reduce/verify steps, and get orders of magnitude higher throughput.
This is what the future of building software is going to look like. Meta-harnesses like Claude workflows are the path to scaling up to massive multi-agent hierarchies capable of a fundamentally new category of tasks, in software and beyond.
# Operating at a new scale

This pattern of LLMs recursively invoking themselves has previously demonstrated impressive results on academic benchmarks - see RLMs.
Recently, however, several impressive demonstrations have appeared in the wild in rapid succession, specifically large code refactors and 0-1 projects:
- Bun's refactor from [Zig to Rust](https://github.com/oven-sh/bun/pull/30412)
- [Monty refactor](https://github.com/pydantic/monty/pull/500) to subprocess pool
- Cursor [building a browser from scratch](https://cursor.com/blog/scaling-agents) with a swarm of agents
Exact implementation details for the above are light, but we can infer that each was accomplished via a swarm of agents working in parallel, managed by a small number of humans in a custom harness. The commonality is that each task has high test coverage and therefore lends itself to horizontally-scalable "ralph-loops" (now a first-class primitive in tools like Codex's /goal) and human verification.
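To make the loop shape concrete, here is a minimal sketch of a test-gated retry loop fanned out across independent tasks. It reflects one reading of the "ralph-loop" idea (re-invoke the agent on the same goal until a check passes), not the implementation used in any of the projects above. `ralph_loop`, `run_swarm`, `fake_agent` and `fake_tests` are hypothetical names; the stand-ins replace a real agent call and a real test suite.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def ralph_loop(task: str,
               agent_step: Callable[[str, int], str],
               passes: Callable[[str], bool],
               max_iters: int = 10) -> Tuple[str, int]:
    """Re-invoke the agent on the same task until the check passes."""
    for attempt in range(1, max_iters + 1):
        result = agent_step(task, attempt)
        if passes(result):  # the test suite is the stopping condition
            return result, attempt
    raise RuntimeError(f"{task!r} did not pass after {max_iters} attempts")


def run_swarm(tasks: List[str],
              agent_step: Callable[[str, int], str],
              passes: Callable[[str], bool],
              workers: int = 4) -> Dict[str, Tuple[str, int]]:
    """Fan independent loops out across a thread pool, one loop per task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {t: pool.submit(ralph_loop, t, agent_step, passes) for t in tasks}
        return {t: f.result() for t, f in futures.items()}


# Stand-ins (assumptions): a real harness would call an LLM agent and run tests.
def fake_agent(task: str, attempt: int) -> str:
    return f"{task}:ok" if attempt >= 3 else f"{task}:failing"


def fake_tests(result: str) -> bool:
    return result.endswith(":ok")


results = run_swarm(["module_a", "module_b"], fake_agent, fake_tests)
```

Because each loop only needs its own task and a pass/fail signal, the loops do not coordinate with each other, which is what makes the pattern horizontally scalable when test coverage is high.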
# Code Mode as a Multi-agent Substrate

A key enabler for this emerging pattern is agent proficiency at "code mode" - programmatically invoking tools via code execution.
The latest generation of LLMs is RL'd to operate specifically in this manner. It's a more efficient way to act on the world: the agent can compose bespoke bulk actions at runtime instead of issuing one tool call at a time, which effectively lets agents assemble their own tools.
This pattern was introduced by Voyager, and Perplexity/Cloudflare/many others have since introduced code mode-oriented interfaces. OpenAI and Anthropic even expose this tool calling method in their APIs via simple config (1, 2).
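As a rough illustration of what "code mode" means in practice, the sketch below exposes two tools as plain Python functions and runs a single agent-authored script against them. Everything here is an assumption for illustration: `list_files`, `read_file`, `run_code_mode` and the in-memory `FILES` dict are hypothetical, and bare `exec` stands in for a real sandboxed execution environment.

```python
# In-memory stand-in for a workspace (assumption, for illustration only).
FILES = {
    "a.py": "x = 1  # TODO\n# TODO again",
    "b.py": "y = 2",
    "notes.md": "TODO",
}


def list_files():
    """Hypothetical tool: list paths in the workspace."""
    return sorted(FILES)


def read_file(path):
    """Hypothetical tool: return the contents of one file."""
    return FILES[path]


TOOLS = {"list_files": list_files, "read_file": read_file}

# What the agent might write: one script that composes many tool calls
# into a single bulk action, instead of one model round-trip per file.
AGENT_SCRIPT = '''
counts = {p: read_file(p).count("TODO") for p in list_files() if p.endswith(".py")}
result = {p: n for p, n in counts.items() if n}
'''


def run_code_mode(script, tools):
    """Execute an agent-authored script with the tools in scope.

    NOTE: exec() is not a sandbox; a real harness would isolate this.
    """
    namespace = dict(tools)
    exec(script, namespace)
    return namespace.get("result")


todo_report = run_code_mode(AGENT_SCRIPT, TOOLS)
```

The point of the sketch is the shape of the interaction: the filter, the loop over files, and the aggregation all live in the script, so the model only sees the final `result`.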
Historically, multi-agent harnesses have been hard-coded and established an explicit hierarchy of agents with different roles and communication patterns. Ultracoding, like workflows, cedes this territory to the bitter lesson and acknowledges that agents can dynamically determine the best meta-harness at runtime. Infra-wise, this only requires the addition of a "spawn agent" tool within an existing (persistent) code mode execution environment.
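A toy sketch of how a single "spawn agent" tool can yield a task-dependent hierarchy: the stand-in agent below decides at runtime whether to handle its work directly or split it and spawn children, then reduces their results. The `spawn_agent` signature, the split-in-half policy and the depth cap are all assumptions made for illustration; a real agent would make the split decision itself rather than follow a fixed rule.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def spawn_agent(files: List[str], depth: int = 0, max_depth: int = 3) -> Dict:
    """Hypothetical 'spawn agent' tool with a stand-in agent policy.

    Small jobs are handled directly; large jobs are split and delegated
    to child agents, so the tree shape depends on the task size.
    """
    if len(files) <= 2 or depth >= max_depth:
        # Leaf agent: do the work itself (here, just report the files).
        return {"done": list(files), "agents": 1, "depth": depth}

    # Fan out: split the job and spawn one child per chunk, in parallel.
    mid = len(files) // 2
    chunks = [files[:mid], files[mid:]]
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        children = list(
            pool.map(lambda c: spawn_agent(c, depth + 1, max_depth), chunks)
        )

    # Reduce: merge child results and count the agents used.
    return {
        "done": [f for child in children for f in child["done"]],
        "agents": 1 + sum(child["agents"] for child in children),
        "depth": max(child["depth"] for child in children),
    }


small = spawn_agent(["a.py"])
large = spawn_agent([f"f{i}.py" for i in range(8)])
```

With this policy a one-file job stays a single agent, while the eight-file job becomes a three-level tree of seven agents, without any hierarchy being declared up front.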
This ability, to spin up a harness in a task-dependent manner at runtime, has radically reduced the barrier to entry and means you can realistically chat your way to a massive refactor or ambitious 0-1 project.
# Scaling Human-Agent Hierarchies: The UX

Massive multi-agent hierarchies are unlocked from a capabilities perspective. Now the major barrier to widespread adoption is the lack of good UX for humans in the loop.
As Swyx has noted, the UX patterns for ultracoding are nascent. There's no established way to view/triage incremental outputs. The two patterns that have dominated thus far are agent lists and Kanban boards, but this is clearly not a terminal state.
I think we will imminently move towards a model where the agent expresses a UI for human oversight as part of the meta-harness. This may look like hooking into an existing UI like ClickUp or Linear, or alternatively writing bare HTML in a completely bespoke workflow when the human needs bulk approvals or triage.
In the fullness of time, agents will dynamically code oversight applications for human orchestrators, directly hooked up to "workflows" and with bespoke approval and triage flows baked in.
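As a sketch of the bare-HTML end of that spectrum, the snippet below renders a list of agent outputs into a single bulk-approval form. The item fields (`id`, `title`, `summary`), the function name `render_approval_page` and the `/approve` endpoint are hypothetical, chosen only to show the idea; a real workflow would generate whatever fields and actions the task calls for.

```python
from html import escape
from typing import Dict, List


def render_approval_page(items: List[Dict[str, str]]) -> str:
    """Render agent outputs as one HTML form with a checkbox per item."""
    rows = "\n".join(
        "<tr>"
        f'<td><input type="checkbox" name="approve" value="{escape(item["id"])}"></td>'
        f'<td>{escape(item["title"])}</td>'
        f'<td>{escape(item["summary"])}</td>'
        "</tr>"
        for item in items
    )
    return (
        "<!doctype html>\n"
        # "/approve" is a hypothetical endpoint the harness would serve.
        '<form method="post" action="/approve">\n'
        "<table>\n"
        "<tr><th>Approve</th><th>Task</th><th>Summary</th></tr>\n"
        f"{rows}\n"
        "</table>\n"
        '<button type="submit">Approve selected</button>\n'
        "</form>"
    )


sample = [
    {"id": "t1", "title": "Rename <Foo> to Bar", "summary": "3 files changed"},
    {"id": "t2", "title": "Delete dead code", "summary": "1 file changed"},
]
page = render_approval_page(sample)
```

Escaping every agent-supplied string matters here, since the page is assembled from model output that a human will open in a browser.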
# Ultrawork

I think about agents for general-purpose knowledge work and the analogies to code. From what I know of our customers at ClickUp it's obvious: this same pattern applies to many workstreams that emerge in recruiting, sales, project management, legal services, accounting, etc.
This pattern of dynamic multi-agent hierarchies will wash over knowledge work more generally. Instead of babysitting chat loops, you spin up a bespoke app for the task on the spot, with a UI built to verify the task in aggregate. The stuff that lives in a spreadsheet today becomes an application the agent assembles for you.
Knowledge work lags code, so adoption will be incremental for everyone rather than a sudden flip. But the payoff is steep: the efficiencies are large, and once it works at scale you can take on a fundamentally new scope of work, not just the same tasks done faster.
The unlock is that UI. Give a human the right way to participate in and verify a large-scale job and the horizon of what an LLM can take on extends dramatically, not just inside code where the review flow is obvious, but everywhere. Excited to see conventions established for this in the back half of 2026.