cd /news/artificial-intelligence/your-ai-bill-is-out-of-control-googl… Β· home β€Ί topics β€Ί artificial-intelligence β€Ί article
[ARTICLE Β· art-17406] src=businessinsider.com pub= topic=artificial-intelligence verified=true sentiment=Β· neutral

Your AI bill is out of control. Google has been waiting for this moment.

Google released its Gemini 3.5 Flash AI model, positioning it as a cost-effective alternative to frontier models as companies face ballooning token expenses from AI agents. CEO Sundar Pichai said top Google Cloud customers could save over $1 billion annually by shifting 80% of AI workloads to a mix of Flash and other models. The move capitalizes on growing corporate sticker shock over AI costs, shifting the competitive advantage from raw model capability to infrastructure and inference efficiency.

read4 min publishedMay 29, 2026

While Anthropic hypes its unreleased Mythos AI model as dangerously powerful, Google is changing the conversation β€” to cost and speed. Google says its latest Gemini 3.5 Flash model rivals frontier offerings, while saving money for companies that are racking up huge bills by churning through billions of tokens, the core unit of AI usage.

"Companies are already blowing through their annual token budgets and it's only May," Google CEO Sundar Pichai said recently. "If companies used a mix of Flash and other frontier models they could save a lot of money."

The timing of Google's new model is no coincidence. As companies embrace token-hungry AI agents they're also paying closer attention to their bills. Meanwhile, smaller AI companies under pressure to generate revenue are cranking up the cost of their products, pushing customers to reconsider their AI spend.

That presents an opportunity to win on value rather than raw capability. It's also where Google has an edge that will be hard for rivals to replicate β€” one it's been working on for a quarter-century.

Flash Sale #

For the first three or so years, the generative AI wars were largely about who had the biggest and smartest model. Now, as the performance gaps between labs shrink, the advantage is shifting to infrastructure and inference, or how models are run. As OpenAI President Greg Brockman recently declared: "the model alone is no longer the product."

A big reason for that shift is agents, which are becoming more useful and expensive to operate.

Google knows how high the token burn is getting. Pichai recently noted that monthly usage of its AI products has increased sevenfold to 3.2 quadrillion tokens since

. He also said that if the top customers of Google Cloud moved 80% of their AI workloads to a combination of Gemini 3.5 Flash and other frontier models, they could save more than $1 billion a year.

__last year__Companies are noticing how much AI is adding up. Uber's COO recently said it was becoming harder to justify the company's ballooning AI costs. Venture capitalist Chamath Palihapitiya said in March that his company, 8090, was moving away from using Cursor because it was spending too much money on tokens.

"As AI agents become more complex, long-running processes have become the norm," Dan Morgan, an analyst at Synovus Trust, told Business Insider. "This has created sticker shock at many organizations."

Cost and ROI go hand in hand because it's hard make a profit in this space, said Morgan. For some companies, having access to models on the absolute frontier may no longer be necessary. Good enough may be good enough.

This is where Google is well positioned. It has tighter control over the cost and speed of AI than most rivals because it owns the full stack β€” chips, data centers, cloud, the models, and many of the big applications on top.

Google pays around 50% less (and possibly as much as 75% less) for its internal AI compute than rivals because it uses its own TPU chips and sources components directly from manufacturers, analysts at William Blair estimated earlier this month.

OpenAI, on the other hand, pays Microsoft, Oracle, and other cloud giants a margin on every ChatGPT and Codex request, and those providers pay Nvidia for the GPUs that run it all. In fact, just about every company that isn't a hyperscaler is having to pay someone else for infrastructure right now.

The Search playbook #

If compute is destiny, as OpenAI CEO Sam Altman likes to say, Google has spent more than 25 years sealing its fate.

In 2006, Google Search commanded more than 40% of the market and was pulling away fast β€” not only because its results were good, but because Google was making its search engine faster and cheaper to serve. Google liked to boast about this by showing you the exact number of miliseconds it took to return answers.

Rather than invest in expensive servers, Google built custom systems using cheap, off-the-shelf parts to maximize speed and keep costs down. Meanwhile, data from all those searches β€” which increased as Google became more popular β€” improved the engine, creating a flywheel that slowly strangled rivals like Yahoo.

Google's results didn't need to be the absolute best, they just needed to be fast enough β€” and cheap enough to serve β€” that users would keep coming back for more.

Google is creating a similar flywheel with Gemini, except now it also has a hugely successful search advertising business that can subsidize its AI efforts as rivals like OpenAI and Anthropic race for more money and compute.

The search race was really an infrastructure race in disguise. Google is betting the AI race will be similar.

Have something to share? Contact this reporter via email at hlangley@businessinsider.com or Signal at 628-228-1836. Use a personal email address and a non-work device; here's our guide to sharing information securely.

── more in #artificial-intelligence 4 stories Β· sorted by recency
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain β€” perfect for shipping the agent you just read about.

$git push zahid main
β†’ Live at https://your-agent.zahid.host βœ“
Get free account β†’ Pricing
from €0/mo Β· no card required
LIVE [news/your-ai-bill-is-out-…] indexed:0 read:4min 2026-05-29 Β· β€”