Nvidia paid Groq $20 billion and took its top engineers. Now Groq is raising $650 million for what’s left.

Nvidia paid Groq $20 billion in December to license its chip technology and hire several senior engineers, leaving the AI startup to rebuild around its inference cloud business. Groq is now raising $650 million from existing investors, with Disruptive and Infinitium guaranteeing the round, to fund its inference-as-a-service model. The raise tests whether Groq's purpose-built LPU hardware can maintain a cost advantage against Nvidia's advancing inference capabilities and model providers' aggressive pricing.

TL;DR: After Nvidia’s $20B not-acqui-hire, Groq is raising $650M from existing investors for its inference cloud. Two backers guarantee the round. The December deal paid out investors and licensed Groq's chip technology to Nvidia. The company is now rebuilding around its inference neocloud business.

Groq is raising $650 million (https://techcrunch.com/2026/05/29/after-nvidias-20b-not-acqui-hire-ai-chip-startup-groq-reportedly-raising-650m/) from existing investors to fund its inference cloud business, Axios reported. The raise comes six months after Nvidia struck a $20 billion not-acqui-hire that paid out Groq’s investors in cash, took several senior engineers, and licensed Groq’s hardware technology.

The same investors who were cashed out in December have now been asked to reinvest. Disruptive and Infinitium have agreed to fill the round if other existing investors decline their pro-rata shares. The funding is, in effect, guaranteed.

The company is being led on an interim basis by CEO Adam Winter and CFO Matt Eng. Several top-level senior employees departed to Nvidia as part of the December deal.
What remains is Groq’s inference cloud business, which lets developers and enterprises host inference-heavy applications on Groq’s proprietary Language Processing Unit hardware.

Inference, the processing that happens after an AI prompt is submitted, is now a much larger market than model training. Every ChatGPT query, every Claude response, every AI agent action requires inference compute. The economics favour purpose-built silicon that can deliver tokens at lower cost and higher speed than general-purpose GPUs.

Groq’s LPU architecture was designed specifically for this workload. The company has shipped its chips to multiple model providers and cloud customers. Its inference speed, measured in tokens per second, has consistently benchmarked above Nvidia’s GPU-based inference at comparable price points.

The $20 billion December deal was unusual. It was not a full acquisition. Nvidia paid Groq’s investors in cash at what would have been Nvidia’s largest-ever purchase price. It licensed Groq’s chip technology. It took senior engineers. But it did not absorb the company. The result is a Groq that has been financially reset, technically depleted at the senior level, and now raising to rebuild around a narrower but potentially lucrative inference-as-a-service model.

The inference chip market is attracting capital at an extraordinary rate. Cerebras went public at a $95 billion valuation on an inference-optimised pitch (https://thenextweb.com/news/cerebras-ipo-spacex-openai-anthropic-listings). Fractile raised $220 million in London (https://thenextweb.com/news/fractile-220m-inference-chip) for inference chips that put compute and memory on the same die. Google is shipping millions of Ironwood TPUs designed specifically for inference. DeepSeek permanently cut its V4 Pro pricing by 75% this week (https://thenextweb.com/news/deepseek-v4-pro-75-percent-price-cut-permanent), compressing the revenue-per-token economics that inference cloud providers depend on.
Groq’s business model requires that its hardware delivers tokens cheaply enough to compete with both GPU-based inference and the model providers’ own API pricing. The DeepSeek price cut makes that competition harder.

The $650 million is a bet that purpose-built inference hardware has a durable advantage over GPUs even as Nvidia pushes its own inference capabilities with each new architecture. Nvidia’s Blackwell and upcoming Vera Rubin platforms are designed to close the inference performance gap that gave companies like Groq their opening.

Whether Groq can rebuild its engineering leadership, scale its inference cloud, and maintain a cost advantage against both Nvidia’s hardware improvements and model providers’ aggressive price cuts is the question the $650 million is supposed to answer.

The investors who got cashed out at $20 billion are being asked to bet again on a smaller, leaner version of the same company. Two of them have agreed to guarantee the round. That is either conviction or obligation.