I know you're arguing with your wife over 'one more prompt,' and here's why

A developer argues that AI coding tools like Cursor accidentally create addictive gamification loops, citing a METR study showing developers refused to work without AI even when paid $150/hour, despite evidence it slowed them down. The author maps the experience to seven of eight core drives in Yu-kai Chou's Octalysis framework, explaining why 'one more prompt' feels compulsive.

The Accidental Gamification of Vibe Coding

Why "one more prompt" feels like a slot machine for builders.

I build gamification infrastructure for a living. Then, I realized Cursor accidentally shipped a better gamification system than anything I have built in the last year. Then I realized why I had not stopped for 11 hours.

Here is the detail that made me sit up. In July 2025, METR ran a careful randomized trial on experienced developers and found that AI coding tools made them 19% slower while they believed they were 20% faster. A 39 point gap between feeling and reality. In February 2026 they tried to run it again with newer tools and could not. The reason is the most honest sentence anyone has written about this whole phenomenon: developers refused to participate, because they would not work without AI even for a handful of tasks in a paid research setting, even at $150 an hour.

A productivity tool that people will not put down for one afternoon, even when paid to, even when the data says it slows them down, is not behaving like a productivity tool. It is behaving like a slot machine.

At first I assumed this was just normal developer flow. You get an idea, you build, you debug, you ship. Nothing new.

But vibe coding feels different because the loop is compressed. Describe intent. Watch code appear. Run it. See what breaks. Prompt again.

The weird part is that the most addictive moment is not when the AI works perfectly. It is when it almost works. That "almost" creates the loop. One more prompt. One more fix. One more regeneration. One more deploy. One more "actually, change this tiny thing".

I started mapping this against Yu-kai Chou's Octalysis framework, the standard model for gamification motivation. It explains the feeling more cleanly than I expected. Vibe coding accidentally hits seven out of eight core drives. Most products that ship gamification on purpose hit two or three.
Quick context if you have not seen it before. Yu-kai Chou published Octalysis in 2015 after a decade studying what makes games and consumer products feel compulsive. He arranged eight core human motivations as an octagon. The top four (Epic Meaning, Accomplishment, Empowerment, Ownership) are white-hat drives that produce sustained, fulfilling engagement. The bottom four (Social Influence, Scarcity, Unpredictability, Loss Avoidance) are black-hat drives that produce urgency and compulsion. The framework also splits drives between left-brain (extrinsic motivation, driven by external reward) and right-brain (intrinsic motivation, driven by the activity itself). It is the standard lens product designers use to analyze why a system feels engaging or addictive. Tencent, LEGO and eBay have used it explicitly in their product work.

Development & Accomplishment

Every successful run is a micro level-up. A passing test. A working button. A green deploy. An "implementation complete" message. Traditional coding has these too, but AI compresses the distance between desire and reward. You used to wait hours to see progress. Now you wait seconds. This is the same drive that makes Duolingo's lesson completion screen work. Vibe coding tools just run that loop faster.

Empowerment of Creativity & Feedback

This is the magical part. You describe a thing and watch it become real. The feedback loop is immediate. It feels less like programming and more like sculpting reality with language. The cost of "what if" experiments drops to near zero, which means you run more of them, which means you discover things you would not have explored before. This is why the first hour of vibe coding feels qualitatively different from the first hour of normal coding. You are not coding. You are exploring.
Ownership & Possession

Even when the codebase gets messy, it is your messy codebase. Your app. Your agent workflow. Your weird half-working product. The more prompts you invest, the harder it gets to stop. Not because it is rational, but because you have invested attention, prompts, credits, commits, and a piece of your identity. This is sunk cost wearing the clothes of pride.

Social Influence & Relatedness

The internet amplifies the loop. "Built this in a weekend." "Solo founder with agents." "30 days, 70k lines." "Shipped an MVP in 48 hours." Even if you are not competing directly, you absorb the pace. Stopping starts to feel like falling behind.

This drive is strong enough that the people at the top of the industry openly admit they are inside it. Garry Tan, CEO of Y Combinator, publicly said he became so dependent on Claude Code that he stayed awake for 19 hours and now sleeps four hours a night. Andrej Karpathy, who coined the term vibe coding in February 2025, said he has "never felt so behind as a programmer". When the people defining the field say they cannot keep up with the field, the social pressure on everyone else is structural, not optional.

Scarcity & Impatience

Rate limits. Credits. Context windows. Fast models. New model drops. Scarcity makes the session feel more valuable. You do not just want to code. You want to use the good model while you still have access.

But there is a sub-mechanic inside this drive that deserves its own attention, because once you see it you cannot unsee it. Most subscription AI tools, Claude included, do not just rate-limit you. They allocate a window of tokens that resets at a fixed time, and unused tokens evaporate when the window closes.

This is no longer scarcity. It is "use it or lose it", which is one of the most aggressive loss-framing mechanics in behavioral science. Airline miles do it. Cigarette company promotions did it for decades. Cellphone "minutes that expire monthly" did it.
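To make the mechanic concrete, here is a minimal Python sketch of a fixed-window allowance. It is an illustration under assumptions, not any vendor's actual implementation: the class name `FixedWindowAllowance`, the 1,000-token pool, and the five-hour window are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FixedWindowAllowance:
    """Hypothetical 'use it or lose it' token pool (not a real vendor API)."""

    tokens_per_window: int
    window_seconds: int
    window_start: float = 0.0
    used: int = 0
    forfeited_total: int = 0

    def _roll(self, now: float) -> None:
        # Move forward to the window that contains `now`.
        # Whatever was left unused in a closed window is gone for good.
        while now >= self.window_start + self.window_seconds:
            self.forfeited_total += self.tokens_per_window - self.used
            self.used = 0
            self.window_start += self.window_seconds

    def remaining(self, now: float) -> int:
        self._roll(now)
        return self.tokens_per_window - self.used

    def spend(self, tokens: int, now: float) -> bool:
        self._roll(now)
        if tokens > self.tokens_per_window - self.used:
            return False  # blocked until the next reset
        self.used += tokens
        return True


# Illustrative numbers only: 1,000 tokens per five-hour window.
plan = FixedWindowAllowance(tokens_per_window=1000, window_seconds=5 * 3600)
plan.spend(300, now=0)
print(plan.remaining(now=4 * 3600))  # 700 left, one hour before the reset
print(plan.remaining(now=5 * 3600))  # 1000: the pool refilled at the reset
print(plan.forfeited_total)          # 700: the unused tokens did not carry over
```

The number worth watching is `forfeited_total`. A plain rate limit only tells you when to stop. A resetting pool also keeps a running tally of what you failed to use, and that tally is what loss framing feeds on.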
Use it or lose it works because it stacks three black-hat drives at once. You feel the scarcity because the token pool is finite. You feel the loss because unused tokens are a visible deletion at window close. You feel the sunk cost because you paid for the subscription and any unused token reads as money on the floor.

The result is something more specific than "I want to use the tool". It is "I should use the tool before the window closes". That second motivation is what turns a 2 hour session into a 6 hour one, and it is what produces the four-hours-of-sleep pattern Garry Tan and others have publicly described. They are not optimizing for output. They are optimizing for window utilization. The product is scheduling their lives.

Vibe coding tools did not invent any of these mechanics. They inherited them from infrastructure economics and discovered they work even harder on developers than they did on slot players.

Unpredictability & Curiosity

This is the big one. Every prompt is uncertain. Sometimes the model breaks everything. Sometimes it fixes the bug. Sometimes it solves the problem in a way you would never have considered. That variable reward is what turns the loop from "tool use" into "slot machine for builders". You are not only asking "will this work?". You are asking "what will happen this time?". That second question is what keeps you pulling the lever.

Losses Disguised as Wins

There is a sub-mechanic inside this drive worth pulling out, because once you see it you cannot unsee it. Slot machine researchers have a term called Losses Disguised as Wins (LDWs). The player bets 10 units. The machine returns 5. The screen flashes, music plays, the win animation runs. The player has lost 5 units, but the brain encodes it as a win.

Average slot players see roughly 1000 spins per hour. 680 are real losses. 140 are real wins. 180 are LDWs. The LDWs are what keep people at the table.

Most of what we call "successful" AI code generations are LDWs. The compile is green. The tests pass.
The button works. But the implementation uses an end-of-life dependency. Or a library with a known CVE. Or a structure that will be impossible to refactor in three months. Or a pattern that quietly violates the rest of the codebase.

You felt a win. You actually bought deferred debt. The session feels productive because the animation runs. The cost shows up later, when someone else, often a future you, tries to extend the code.

Loss & Avoidance

The hardest time to stop is not after a clean success. It is when the app is broken but feels close. You cannot quit on a broken build. You cannot leave the bug unresolved. You cannot waste the last six hours. You cannot stop when the next prompt might fix it.

This is the Zeigarnik effect with a slot machine attached. Open loops occupy mental space until closed, and vibe coding tools rarely give you a clean closing moment. There is always one more prompt that might resolve the loop. That is how a productive session becomes a compulsive session.

Epic Meaning & Calling

The eighth drive is softer but still present. Non-coders become builders. Developers become solo teams. Ideas that used to be impossible now feel reachable. That identity shift is real and powerful. It also makes the activity feel like a calling, which makes it harder to step away from on a Saturday afternoon.

The white-hat to black-hat slide

Octalysis splits these eight drives along a vertical axis. The top four (Epic Meaning, Accomplishment, Empowerment, Ownership) are called white-hat. They produce sustained motivation, fulfillment, and growth. The bottom four (Social Influence, Scarcity, Unpredictability, Loss Avoidance) are called black-hat. They produce urgency, compulsion, and short-term action.

Vibe coding starts in the top half. You feel creative, capable, in flow. Then debugging starts. The longer the session goes, the more the bottom drives take over. By hour six you are not building anymore.
You are avoiding loss, chasing variable reward, and protecting sunk cost. It starts as flow. It becomes an open-loop machine.

When one agent becomes a swarm

The dynamic gets worse, not better, as the tools get more capable. In 2026 the frontier is no longer one developer plus one agent. It is one developer orchestrating four or five agents in parallel. One writes code, one updates documentation, one hunts bugs, one refactors. You stopped being a craftsman. You became an air traffic controller for algorithmic workers moving faster than your attention can track.

METR's July 2025 trial compared experienced developers working with AI tools and without them. With AI, they were 19% slower at task completion. They felt 20% faster. That is a 39 point gap between perceived productivity and actual productivity, and it is the size of the addiction.

Steve Yegge's "Gas Town" experiment is the canonical case. 50 named agents, 189,000 lines of Go in 12 days, no human reviewing the architecture in any sustained way. At one point an agent called Deacon, designed to clean up "stale" worker processes, started killing live workers mid-task and froze the system for days. They called the resulting incident the Murder Mystery bug.

Armin Ronacher, the creator of Flask, has a name for this whole class of failure mode. He calls it Agent Psychosis. The pattern is consistent. Developers stop questioning their agents. The agents reinforce each other's mistakes. The output keeps compiling. The system collapses on a timeline measured in weeks.

So what is the conclusion?

I do not think the answer is "vibe coding is bad". The honest conclusion is that motivational architecture is a design choice, and most products do not make it consciously. Cursor, Bolt, v0 and Claude Code did not set out to build a slot machine. They optimized for fast feedback, broad capability, and visible progress. The white-hat drives showed up first because those are the ones that produce magic.
The black-hat drives showed up later because they are the natural shape of compressed feedback loops with variable rewards.

Most products want the top half of Octalysis and ship the bottom half by accident. The companies that did the work to keep their products in the top half (Duolingo until 2023, Strava, Notion in its earlier years) generally did it on purpose, with explicit choices about where to refuse engagement leverage.

I build Hatched.live (https://hatched.live) because I think this should be an explicit choice, not an emergent one. If you ship a product, the question is not whether you ship gamification. Compressed feedback plus variable reward plus visible progress already counts. The question is which half of the framework you ship.

If you want to see where your current product sits, the audit at https://hatched.live/audit will score your product against the eight drives for free. It takes about 30 seconds and gives you a Flow Architecture Score plus three concrete recommendations.

The most dangerous part of vibe coding is not that AI is bad. It is that AI is sometimes brilliant. That uncertainty is what keeps you pulling the lever. The same is true for whatever you build next. The lever pulls people. The question is whether the pulling moves them forward or just keeps them at the table.

Disclosure: I build Hatched.live, gamification infrastructure for product teams. This piece grew out of a weekend I could not stop coding.