The pitch was impressive. AI tools would make developers faster, reduce headcount costs, and pay for themselves many times over. Companies that moved early would have a structural advantage over those that waited.
Microsoft believed it. So did Uber. Both pushed hard on AI coding tool adoption across their engineering teams. Both are now dealing with the same problem: the faster their employees embraced the tools, the faster the bills grew. In some cases those bills have started exceeding what the same work would have cost with human labor.
The problem is what happens to the economics when thousands of employees use something that charges per unit of thought.
The token trap nobody planned for
AI models charge per token, the basic unit of text the model processes and generates.
When Uber’s CTO disclosed that the company had burned through its entire 2026 AI coding budget in four months, the detail that got less attention was how it happened. Uber had been actively pushing adoption, running internal leaderboards to rank teams by AI tool usage. More encouragement meant more usage. More usage meant more tokens. More tokens meant more compute. The budget math that looked reasonable in January looked catastrophic by April.
Amazon has been telling staff to “tokenmaxx,” meaning use as many tokens as possible. Meta built an internal tracking tool called Claudeonomics to monitor which employees were using AI most heavily. These are companies treating token consumption as a metric to maximize, which is exactly backwards if the goal is cost efficiency.
The paradox is structural. Agentic AI systems, the ones that work autonomously across multiple steps, consume more tokens per task than standard models. Goldman Sachs forecasts a 24-fold increase in enterprise token consumption by 2030 as agentic deployments scale. Gartner projects that inference costs will fall nearly 90% by the same year. But Gartner also warned that cheaper tokens will not produce cheaper bills, because consumption growth will outpace price declines and AI providers are unlikely to pass through the full benefit of cost reductions to business customers.
Cheaper per token. Higher total bill. The more you use it, the worse the math gets.
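To see why the two forecasts point to a bigger bill rather than a smaller one, it helps to multiply them out. The sketch below is a back-of-the-envelope illustration, not a model either firm published. It assumes both forecasts share the same baseline year, that per-token prices fall by exactly the projected 90%, and that the full decline reaches customers, which Gartner itself doubts. The function names are hypothetical.

```python
def relative_bill(consumption_multiplier: float, price_decline: float) -> float:
    """Return the future bill as a multiple of today's bill.

    consumption_multiplier: how many times more tokens are consumed (24 for a 24-fold rise).
    price_decline: fractional drop in per-token price (0.90 for a 90% fall).
    """
    if not 0.0 <= price_decline <= 1.0:
        raise ValueError("price_decline must be between 0 and 1")
    return consumption_multiplier * (1.0 - price_decline)


def break_even_price_decline(consumption_multiplier: float) -> float:
    """Return the price decline needed to keep the bill flat at a given usage growth."""
    if consumption_multiplier <= 0:
        raise ValueError("consumption_multiplier must be positive")
    return 1.0 - 1.0 / consumption_multiplier


# Assumption: the 24-fold and 90% figures are applied to the same starting point.
bill = relative_bill(24.0, 0.90)
needed = break_even_price_decline(24.0)

print(f"Bill relative to today: {bill:.1f}x")
print(f"Price decline needed to hold the bill flat: {needed:.1%}")
```

Under those assumptions the bill comes out at roughly 2.4 times today's, and prices would have to fall by about 96% just to keep spending flat. If providers pass through less than the full cost reduction, the multiple is higher.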
When compute costs more than the employee
The most uncomfortable acknowledgment of where this is heading came from Bryan Catanzaro, Vice President of applied deep learning at Nvidia, the company that supplies the chips powering essentially all of this infrastructure.
“For my team, the cost of compute is far beyond the costs of the employees,” he said.
That statement carries weight because of who said it. Nvidia has more financial interest in AI compute spending than almost any other company on earth. When its own executive acknowledges that compute costs are exceeding labor costs for his team, it is not a bearish take on AI. It is an honest description of the current economics from someone with no incentive to understate them.
Microsoft’s situation illustrates the same point from a different angle. The company cancelled most of its direct Claude Code licences after thousands of employees adopted the tool faster than anyone anticipated. The move doesn’t touch Microsoft’s $5 billion investment in Anthropic or its commercial relationship with the company. It’s a pure cost control decision on a tool its own engineers had grown to depend on. When the company that built GitHub Copilot, owns the dominant AI coding platform, and made one of the largest AI bets in the industry pulls back on AI coding spend, the economics are the only explanation that makes sense.
Where the math actually works
MIT research found AI is only economically viable in a limited number of job roles at current pricing. The tasks where it clears the bar tend to share common characteristics: well-defined scope, high repetition, low need for judgment across long sessions. Boilerplate generation, test scaffolding, documentation, straightforward refactors. Tasks where a developer might spend twenty minutes doing something mechanical and the AI does it in thirty seconds.
The tasks where the math breaks down are the ones that require sustained context, iterative judgment, and long agentic sessions. Those are also the tasks the industry has been most aggressively promoting AI for. The gap between where AI is cost-effective and where it is being deployed is where the Microsoft and Uber problem lives.
AI coding tools are currently better described as expensive productivity multipliers for specific task types than as wholesale replacements for engineering labor. The companies that match the tools to the tasks where they pay off, rather than encouraging blanket maximum adoption, will likely see the economics work. The ones that ran internal leaderboards rewarding token consumption are learning that lesson the hard way.
The bill is coming due
AI was sold as the great labor cost reduction play. The early returns from two companies that believed that pitch hardest suggest the reality is more complicated.
The tools work. The economics at scale don’t, at least not yet. Cheaper tokens haven’t produced cheaper bills. Encouraged adoption has produced budget crises. And the executive most invested in AI compute spending just admitted his compute costs exceed his payroll.
Jensen Huang has said he imagines 100 AI agents working alongside every human employee at Nvidia one day. That future may still arrive. But if token consumption keeps rising faster than unit costs fall, it will arrive with a price tag nobody has fully reckoned with yet. Microsoft and Uber just got the first invoice.