{"slug": "a-startup-says-it-cracked-the-maths-bottleneck-holding-back-ai-it-finally-has", "title": "A startup says it cracked the maths bottleneck holding back AI. It finally has the receipts.", "summary": "Miami startup Subquadratic claims to have solved a decade-old bottleneck in AI models by replacing dense attention with sparse attention, achieving 56x speed gains and dramatic cost reductions. Independent tests by Appen validated the performance, but skepticism remains due to limited public access and reliance on existing open-weight models.", "body_md": "A Miami startup says it has cracked a maths problem that has made AI models slow and power-hungry for almost a decade. The claim was bold enough to draw comparisons with Theranos. Now, though, the company has independent test results that back much of it up.\n\nThe startup is called Subquadratic. It came out of stealth in May with $29mn in seed funding and a new language model named SubQ. According to the company, SubQ is faster, cheaper, and far less energy-hungry than today’s leading models. It can also read up to 12 times as much text at once.\n\n## The decade-old bottleneck\n\nTo see why that matters, it helps to know how most large language models work. At their core sits a “transformer”, introduced by Google researchers in 2017. The transformer runs a process called dense attention.\n\nDense attention is thorough, but it is expensive. It compares every word in a text with every other word. So when you double the length of the text, the work roughly quadruples. That “quadratic” scaling is the main reason LLMs [guzzle so much compute and power](https://thenextweb.com/news/ai-researchers-say-weve-squeezed-nearly-as-much-out-of-modern-computers-as-we-can).\n\n## Subquadratic’s fix\n\nSubquadratic’s answer is to drop dense attention for “sparse attention”. Instead of comparing every word with every other, sparse attention keeps only the pairs that matter. The idea is old, and plenty of teams have tried it. 
Until now, however, none had matched dense attention’s quality.\n\nThe company says its version finally does. Crucially, it picks which words to focus on dynamically, based on the content rather than a fixed pattern. “That’s kind of where the secret sauce is,” says co-founder and chief technology officer Alex Whedon.\n\n## The receipts\n\nAt first, the claims rested on a handful of self-published scores. Naturally, the reaction was sceptical. One AI engineer summed it up on X: SubQ is “either the biggest breakthrough since the Transformer … or it’s AI Theranos”.\n\nSo the company brought in a third party. It asked Appen, a firm that evaluates other companies’ models, to run the tests. The results were striking. On a raw speed test, SubQ ran 56 times faster than FlashAttention, a leading existing method. On a tough coding benchmark, it scored 89.7 per cent, close to the best models around.\n\nThe cost gap looks just as wide. By the startup’s account, running one long-context test on Anthropic’s top model costs about $2,600. On SubQ, it says, the same test cost eight dollars.\n\n## Still too good to be true?\n\nEven so, there are reasons for caution. Benchmarks are not the same as real-world use. SubQ is also not widely available yet. Tens of thousands have joined the waitlist, but only a handful have access.\n\nThere is a wrinkle in the origin story, too. Rather than train SubQ from scratch, Subquadratic started from an existing open-weight model and swapped in its new attention method. That is common practice. However, it sits awkwardly next to the claim of fully reinventing how LLMs work.\n\n“They may have built something real and useful,” says Will Depue, an independent researcher who used to work at OpenAI. “But the public evidence does not yet justify the stronger claim that they have solved the quadratic attention bottleneck.”\n\n## Why it matters\n\nIf the results hold, the payoff is large. 
Cheaper, faster long-context models could read entire codebases, contract sets, or document troves in one pass. They would also cut the cost and energy of running AI.\n\nThat prize is one the whole industry is chasing. AI already strains against [the spiralling economics of AI agents](https://thenextweb.com/news/github-copilot-signup-pause-agentic-ai-usage-limits), and other startups, such as [Thomas Reardon’s Flourish](https://thenextweb.com/news/flourish-reardon-brain-inspired-ai-efficiency), are attacking efficiency from other angles. Subquadratic, though, is betting the whole field will follow it. “We don’t think anybody will be building on transformers in a few years,” says chief executive Justin Dangel.", "url": "https://wpnews.pro/news/a-startup-says-it-cracked-the-maths-bottleneck-holding-back-ai-it-finally-has", "canonical_source": "https://thenextweb.com/news/subquadratic-subq-sparse-attention-llm-bottleneck", "published_at": "2026-06-19 15:27:38+00:00", "updated_at": "2026-06-19 16:12:18.506600+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-startups", "ai-products", "ai-infrastructure"], "entities": ["Subquadratic", "SubQ", "Appen", "Anthropic", "Alex Whedon", "Will Depue", "FlashAttention", "OpenAI"], "alternates": {"html": "https://wpnews.pro/news/a-startup-says-it-cracked-the-maths-bottleneck-holding-back-ai-it-finally-has", "markdown": "https://wpnews.pro/news/a-startup-says-it-cracked-the-maths-bottleneck-holding-back-ai-it-finally-has.md", "text": "https://wpnews.pro/news/a-startup-says-it-cracked-the-maths-bottleneck-holding-back-ai-it-finally-has.txt", "jsonld": "https://wpnews.pro/news/a-startup-says-it-cracked-the-maths-bottleneck-holding-back-ai-it-finally-has.jsonld"}}