{"slug": "logarithmic-math-fuels-bold-tensordyne-inference-claim", "title": "Logarithmic Math Fuels Bold Tensordyne Inference Claim", "summary": "Startup Tensordyne claims its new Napier AI chip, using logarithmic math to replace multipliers with adders, can run large language models four times faster using one-fifth the power of a comparable Nvidia GB300 system. The company has sent plans for its first chip to manufacturing, with commercial sales of a 72-chip system expected in the second half of 2027. Real-world performance data will not be available until the end of the year.", "body_md": "If simulations are to be believed, startup [Tensordyne’s](https://www.tensordyne.ai/) new AI chip could crush the performance of market leader [Nvidia](https://www.nvidia.com/en-us/) in terms of [energy efficiency](https://spectrum.ieee.org/tag/energy-efficiency) and latency for inferencing. The company just sent the plans for its first chip to be manufactured, with commercial sales of a 72-chip system scheduled for the second half of 2027. Tensordyne claims its 72-chip system can run large LLMs four times as fast using one-fifth the power compared to a 72-Nvidia [GB300](https://spectrum.ieee.org/mlperf-inference-51) system. However, real systems won’t be around to back these figures up until the end of the year.\n\nThe not-so-secret sauce behind the outsized efficiency of Tensordyne’s new chip, Napier, is how it does matrix multiplication, the main math of AI. It takes advantage of the fact that the logarithm of A times B equals the logarithm of A plus the logarithm of B.\n\n“We’ve turned multipliers into adders,” explains [Gilles Backhus](https://www.linkedin.com/in/gillesbackhus/), a Tensordyne founder and vice president of AI. Adders are smaller and more energy-efficient [logic circuits](https://spectrum.ieee.org/tag/logic-circuits) than those that do multiplication, he says. 
So Napier can pack more compute into a smaller area and still save on power.\n\n## New Kinds of Numbers\n\nThat this was possible has long been known, but there was no practical way to exploit it, because converting back and forth between logarithmic numbers and the [floating point](https://spectrum.ieee.org/tag/floating-point) numbers that describe [neural networks](https://spectrum.ieee.org/tag/neural-networks) took too much time and energy and introduced too many inaccuracies. Not anymore, according to Backhus.\n\n“So far no one has figured out how to do the linear to logarithm and logarithm to linear conversion as we have,” he says. “And that’s actually the crux of that whole thing. Our engineers have figured out ways to do this very elegantly and very very accurately and cheaply on silicon.”\n\nThe importance of number formats hasn’t been lost on the AI industry. Speaking at [IEEE Hot Chips](https://hotchips.org/) in 2023, [Nvidia](https://spectrum.ieee.org/tag/nvidia) chief scientist Bill Dally attributed the majority of the [improvement in the company’s GPUs](https://spectrum.ieee.org/nvidia-gpu) at the time to the use of shorter [number formats](https://spectrum.ieee.org/nvidia-blackwell) and the smaller circuits they require.\n\nResearchers have also worked on circuits to compute with alternative formats, such as the [logarithm-like posit](https://spectrum.ieee.org/floating-point-numbers-posits-processor) and, more recently, its scientific-computing counterpart, the [takum](https://spectrum.ieee.org/number-formats-ai-scientific-computing). 
However, these formats have not reached mainstream adoption, mainly because their hardware implementation is so different from traditional floating point.\n\n## Inference Demands Influence Architecture\n\nMarket trends, including the rise of [AI agents](https://spectrum.ieee.org/tag/agentic-ai), mean that inference, the execution of neural network models, is becoming more important than training new [large language models](https://spectrum.ieee.org/tag/large-language-models) (LLMs). Factors like cost and the speed at which answers are delivered are starting to dominate, and that’s led AI companies to look for system architectures better suited to those demands.\n\nTensordyne executives say they saw this coming and engineered their computers to meet it.\n\nTensordyne’s Napier AI chip includes 144 gigabytes of HBM, but the real power comes from its unusual math. Credit: Tensordyne\n\nThere are two main parts to executing an LLM: prefill and decode. In the prefill stage the model takes in the input text and turns it into tokens, the basic units it can work with, and builds a kind of working memory about the input, called the key-value cache. It’s a computationally heavy task.\n\nDecode is where the LLM generates its output tokens, the answer or response to your input. Each new token is predicted using the previous token and the key-value cache. This sequential nature can make decode a slower process, and it’s more dependent on memory and network latency than computing power.\n\nSo AI chip makers are starting to build systems with those two different demands in mind. Nvidia is touting a system where a server rack full of B300 [GPUs](https://spectrum.ieee.org/tag/gpus) handles prefill and several racks of its [Groq 3 processors](https://spectrum.ieee.org/nvidia-groq-3) do the decode. 
[Amazon Web Services](https://aws.amazon.com/) is [combining](https://www.aboutamazon.com/news/aws/aws-cerebras-ai-inference) a rack of its Trainium [AI chips](https://spectrum.ieee.org/tag/ai-chips) for prefill with several racks of [Cerebras’s](https://www.cerebras.ai/) [wafer-scale computers](https://spectrum.ieee.org/cerebrass-giant-chip-will-smash-deep-learnings-speed-barrier) for decode.\n\nTensordyne says its system can handle both jobs. “We’re optimizing for two hard challenges here at the same time,” says [R.K. Anand](https://www.linkedin.com/in/r-k-anand/), chief product officer and co-founder of Tensordyne. “We’re the first company proving that you can do both without going to multiple vendors and multiple racks.”\n\nThe dense compute needed for prefill comes from the logarithmic math. The needs of decode are met by 144 gigabytes of high-bandwidth memory and a custom 1-microsecond-latency network called Tensordyne Napier Link.\n\nIn a “pod” system that fits in one-quarter of a standard rack, Tensordyne packs in 72 Napier chips, 8 [Intel](https://spectrum.ieee.org/tag/intel) Xeon CPUs, and 64 terabytes of solid-state storage. A four-pod rack working on a 2-trillion-parameter LLM would deliver 1,300 tokens per second per user at a cost of US $11 for 1 million tokens, while consuming 120 kilowatts of power, the company claims, with one pod crunching out prefill and three working on decode. To get similar tokens-per-second-per-user numbers, a nine-rack Rubin and Groq 3 system would likely consume 1.5 megawatts, according to Tensordyne.\n\nWhether these numbers hold up won’t be known until later in the year. Tensordyne plans to have a beta version available through the cloud for customers to work with. 
It expects to begin shipping systems to customers about a year from now.\n\n[Samuel K. Moore](https://spectrum.ieee.org/u/samuel-k-moore)\n\n[Samuel K. Moore](https://twitter.com/SamuelKMoore) is the senior editor at *IEEE Spectrum* in charge of semiconductors coverage. An IEEE member, he has a bachelor's degree in biomedical engineering from Brown University and a master's degree in journalism from New York University.", "url": "https://wpnews.pro/news/logarithmic-math-fuels-bold-tensordyne-inference-claim", "canonical_source": "https://spectrum.ieee.org/tensordyne-inference-claim", "published_at": "2026-06-16 12:36:19+00:00", "updated_at": "2026-06-16 12:49:08.065007+00:00", "lang": "en", "topics": ["ai-chips", "ai-infrastructure", "large-language-models", "ai-products"], "entities": ["Tensordyne", "Nvidia", "Napier", "GB300", "Gilles Backhus", "IEEE Hot Chips", "Bill Dally"], "alternates": {"html": "https://wpnews.pro/news/logarithmic-math-fuels-bold-tensordyne-inference-claim", "markdown": "https://wpnews.pro/news/logarithmic-math-fuels-bold-tensordyne-inference-claim.md", "text": "https://wpnews.pro/news/logarithmic-math-fuels-bold-tensordyne-inference-claim.txt", "jsonld": "https://wpnews.pro/news/logarithmic-math-fuels-bold-tensordyne-inference-claim.jsonld"}}