Faster AI, lower costs: DSpark eases inference bottlenecks and chip strain, says DeepSeek

wpnews.pro


Source: scmp.com | Published: 2026-06-28T11:00Z | Topic: artificial-intelligence

Chinese AI startup DeepSeek unveiled DSpark, a speculative decoding framework that speeds up AI inference by up to 85% while reducing costs and chip strain. The module uses a lightweight draft model and semi-autoregressive generation to accelerate response generation, addressing GPU underutilization and high latency in AI serving.

Start-up unveils speculative decoding framework that speeds up inference by up to 85 per cent amid China’s push to overcome US AI curbs

DeepSeek presented the framework as a way of reducing serving costs and enhancing user experience, and of reducing AI systems’ reliance on larger, more powerful chip infrastructure.

AI models’ conventional token-by-token output often slowed when responses were lengthy, leading to low utilisation of graphics processing units (GPUs) and long user-perceived waiting times, which was a “primary bottleneck in serving AI”, the company said in research published on Saturday.

DeepSeek said the DSpark module accelerated AI response generation, also known as AI inference, which refers to serving a trained model to respond to user queries. It did so by using a lightweight draft model to propose candidate responses and then verifying them in batches with a larger model, speeding up output.
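The draft-then-verify idea can be illustrated with a minimal sketch of generic greedy speculative decoding. This is not DSpark's code: the names `toy_target`, `toy_draft`, `greedy_decode` and `speculative_decode` are hypothetical, and both "models" are toy functions over integer tokens. A real system would score all drafted positions in one batched forward pass of the larger model; the sketch loops over them and counts each verification round as one pass.

```python
from typing import List, Tuple


def toy_target(context: List[int]) -> int:
    """Hypothetical 'large' model: always counts up modulo 10."""
    return (context[-1] + 1) % 10


def toy_draft(context: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except after token 2."""
    return 9 if context[-1] == 2 else (context[-1] + 1) % 10


def greedy_decode(prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(toy_target(out))
    return out


def speculative_decode(
    prompt: List[int], max_new_tokens: int, k: int = 4
) -> Tuple[List[int], int]:
    """Draft k tokens cheaply, then keep the prefix the target agrees with."""
    out = list(prompt)
    target_passes = 0
    while len(out) - len(prompt) < max_new_tokens:
        # 1. The draft model proposes k candidate tokens.
        draft: List[int] = []
        for _ in range(k):
            draft.append(toy_draft(out + draft))

        # 2. One verification round (batched in a real system).
        target_passes += 1
        accepted: List[int] = []
        for i in range(k):
            expected = toy_target(out + draft[:i])
            if expected == draft[i]:
                accepted.append(draft[i])
            else:
                # First mismatch: take the target's token and stop.
                accepted.append(expected)
                break
        else:
            # Every draft token matched, so the target adds one more token.
            accepted.append(toy_target(out + draft))
        out.extend(accepted)
    return out[: len(prompt) + max_new_tokens], target_passes


tokens, passes = speculative_decode([0], max_new_tokens=12, k=4)
```

In this toy setup the output is identical to plain greedy decoding with the target model, but the target is consulted in far fewer rounds than there are generated tokens. That is the source of the speed-up when the draft model is usually right.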

DSpark further refined the approach with a semi-autoregressive generation method, allowing the model to produce small chunks of tokens rather than strictly one at a time.
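As a rough illustration of producing small chunks rather than single tokens, the sketch below counts how many generation steps are needed for different chunk sizes. It assumes a hypothetical chunk-emitting function, `toy_chunk`, standing in for a drafter that returns several tokens per call. The article does not describe DSpark's actual drafter design, so this shows only the general shape of semi-autoregressive generation.

```python
from typing import Callable, List, Tuple


def toy_chunk(context: List[int], size: int) -> List[int]:
    """Hypothetical drafter: returns `size` tokens in a single call."""
    last = context[-1]
    return [(last + i + 1) % 10 for i in range(size)]


def generate_in_chunks(
    chunk_fn: Callable[[List[int], int], List[int]],
    prompt: List[int],
    max_new_tokens: int,
    chunk_size: int,
) -> Tuple[List[int], int]:
    """Generate tokens chunk by chunk and count the number of drafter calls."""
    out = list(prompt)
    steps = 0
    while len(out) - len(prompt) < max_new_tokens:
        remaining = max_new_tokens - (len(out) - len(prompt))
        out.extend(chunk_fn(out, min(chunk_size, remaining)))
        steps += 1
    return out, steps


# chunk_size=1 is strictly token-by-token; chunk_size=4 is semi-autoregressive.
one_by_one, steps_ar = generate_in_chunks(toy_chunk, [0], 10, chunk_size=1)
chunked, steps_semi = generate_in_chunks(toy_chunk, [0], 10, chunk_size=4)
```

With ten tokens to generate, the token-by-token setting needs ten drafter calls while the chunked setting needs three, for the same output in this toy case.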

It also introduced a confidence-based scheduling system that dynamically adjusted how much verification was applied based on computing demand, helping balance speed and output quality.
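One way to picture such a scheduler is a policy that sends fewer draft tokens to the verifier when the system is busy or the draft model is unsure. The sketch below is an assumption for illustration only: the function `verification_length`, its `load` input, and the threshold and budget formulas are all hypothetical and are not taken from DeepSeek's research.

```python
from typing import Sequence


def verification_length(
    confidences: Sequence[float],
    load: float,
    max_len: int = 8,
    base_threshold: float = 0.3,
) -> int:
    """Return how many leading draft tokens to send to the verifier.

    confidences: draft-model probability for each drafted token, in order.
    load: current computing demand, from 0.0 (idle) to 1.0 (saturated).
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    if not confidences:
        return 0

    # Hypothetical policy: higher load means a smaller budget and a stricter bar.
    budget = max(1, round(max_len * (1.0 - load)))
    threshold = base_threshold + (1.0 - base_threshold) * load

    joint = 1.0  # estimated chance that the whole prefix is accepted
    count = 0
    for confidence in confidences[:budget]:
        joint *= confidence
        if joint < threshold:
            break
        count += 1
    # Always verify at least one token so generation keeps moving.
    return max(1, count)


draft_confidences = [0.95, 0.9, 0.9, 0.6, 0.5, 0.4]
idle = verification_length(draft_confidences, load=0.0)
busy = verification_length(draft_confidences, load=0.5)
saturated = verification_length(draft_confidences, load=1.0)
```

Under this policy the verified prefix shrinks as load rises, which trades some potential speed-up for less wasted verification work when computing resources are scarce.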

Read original on scmp.com → www.scmp.com/tech/big-tech/article/3358647/faste…
