GitHub DeepSeek-AI/DeepSpec

wpnews.pro


[ARTICLE · art-42057] src=github.com ↗ pub=2026-06-27T20:16Z topic=machine-learning verified=true sentiment=↑ positive


DeepSeek-AI released DeepSpec, an open-source codebase for training and evaluating draft models for speculative decoding. It supports three draft model algorithms (DSpark, DFlash, Eagle3), and preparing the target cache takes roughly 38 TB of storage in the default Qwen/Qwen3-4B setting. The project, licensed under MIT, targets faster inference through speculative decoding with target models from the Qwen3 and Gemma families.

2 min read · published Jun 27, 2026


DeepSpec is a full-stack codebase for training and evaluating draft models for speculative decoding. It contains data preparation utilities, draft model implementations, training code, and evaluation scripts.

Install the Python dependencies:

python -m pip install -r requirements.txt

Data preparation additionally requires an inference engine to serve the target model when regenerating answers; see scripts/data/README.md for details.

Run the stages in order — each stage's output feeds the next:

Data Preparation: download prompts, regenerate target answers, and build the target cache.
Training: train a draft model against the cached target outputs.
Evaluation: measure speculative-decoding acceptance on benchmark tasks.

See scripts/data/README.md for the step-by-step data pipeline:

download and split training data,
regenerate answers,
prepare the target cache (storage warning: this can be very large, roughly 38 TB for the default Qwen/Qwen3-4B setting).
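Given the size of the target cache, it may be worth checking free disk space before starting the data pipeline. The following is a minimal sketch using only the Python standard library; it is not part of DeepSpec. The function name `has_room_for_cache`, the decimal-terabyte convention, and the example path are assumptions made for illustration.

```python
import shutil

# Assumption: 1 TB = 10**12 bytes. The source does not say whether "38 TB"
# means decimal terabytes or binary tebibytes.
TB = 10**12


def has_room_for_cache(cache_dir: str, required_tb: float = 38.0) -> bool:
    """Return True if the filesystem holding cache_dir has enough free space."""
    free_bytes = shutil.disk_usage(cache_dir).free
    return free_bytes >= required_tb * TB


if __name__ == "__main__":
    path = "."  # hypothetical: replace with the intended target cache directory
    verdict = "enough" if has_room_for_cache(path) else "not enough"
    print(f"{path}: {verdict} free space for a ~38 TB cache")
```

The 38 TB figure applies to the default Qwen/Qwen3-4B setting; other target models or dataset sizes would presumably need a different `required_tb`.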

bash scripts/train/train.sh

train.sh launches train.py, which spawns one worker per visible GPU. Select the algorithm and target model by pointing config_path at one of the configs under config/ (e.g. config/dspark/dspark_qwen3_4b.py); see the script header for the full list of configs, how to override config_path / target_cache_dir, and how to use --opts to override individual config fields. Checkpoints are written to ~/checkpoints/<project_name>/<exp_name>/step_*.
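To pick a checkpoint for evaluation, a small helper can locate the most recent numbered step directory. This is a sketch, not DeepSpec code: it assumes step directories are named `step_<integer>`, which the source does not confirm (the only concrete name it shows is `step_latest`), and the helper name `latest_numbered_step` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: numbered checkpoints are directories named step_<integer>.
STEP_RE = re.compile(r"^step_(\d+)$")


def latest_numbered_step(exp_dir: Path) -> Optional[Path]:
    """Return the step_<N> subdirectory with the largest N, or None if absent."""
    if not exp_dir.is_dir():
        return None
    best: Optional[Path] = None
    best_step = -1
    for child in exp_dir.glob("step_*"):
        match = STEP_RE.match(child.name)
        # Names such as step_latest do not match the pattern and are skipped.
        if match and child.is_dir() and int(match.group(1)) > best_step:
            best, best_step = child, int(match.group(1))
    return best


if __name__ == "__main__":
    # Path components taken from the evaluation example in this article.
    exp = Path.home() / "checkpoints" / "deepspec" / "dspark_block8_qwen3_4b"
    print(latest_numbered_step(exp))
```

Comparing the parsed integers rather than sorting the names avoids the lexicographic trap where `step_900` would sort after `step_1000`.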

Hardware: the default configs and scripts assume a single node with 8 GPUs. For fewer GPUs, list fewer devices in CUDA_VISIBLE_DEVICES.
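Because one worker is spawned per visible GPU, the worker count follows from CUDA_VISIBLE_DEVICES. The sketch below parses that variable to show how many devices a launch would see. It is an illustration, not DeepSpec code: the helper name `visible_device_ids` is hypothetical. When the variable is unset, all GPUs on the node are visible and the count cannot be derived from the environment alone, so the helper returns None.

```python
import os
from typing import List, Mapping, Optional


def visible_device_ids(env: Mapping[str, str] = os.environ) -> Optional[List[str]]:
    """Parse CUDA_VISIBLE_DEVICES into a list of device ids.

    Returns None when the variable is unset (all GPUs are visible) and an
    empty list when it is set to an empty string (no GPUs are visible).
    """
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    # Example: restrict a run to four GPUs before launching training.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="0,1,2,3")
    ids = visible_device_ids(env)
    print(f"{len(ids)} visible GPUs: {ids}")
    # The modified env could then be passed to the launcher, for example:
    # subprocess.run(["bash", "scripts/train/train.sh"], env=env, check=True)
```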

bash scripts/eval/eval.sh

eval.sh runs eval.py against a trained draft checkpoint over the speculative-decoding benchmarks in eval_datasets/ (gsm8k, math500, aime25, humaneval, mbpp, livecodebench, mt-bench, alpaca, arena-hard-v2). Set:

target_name_or_path: the target model the draft was trained against (e.g. Qwen/Qwen3-4B),
draft_name_or_path: the draft checkpoint, e.g. ~/checkpoints/deepspec/dspark_block8_qwen3_4b/step_latest.
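The source does not spell out how acceptance is computed, so the following is a generic sketch of the idea under greedy decoding rather than DeepSpec's own metric. In each verification step the draft proposes a block of tokens, the target model scores them in one pass, and the longest prefix that matches the target's greedy choices is accepted; the target then contributes one more token of its own. The function names and the toy token ids are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def accepted_prefix_len(draft: Sequence[int], target: Sequence[int]) -> int:
    """Count leading draft tokens that equal the target's greedy tokens."""
    accepted = 0
    for draft_tok, target_tok in zip(draft, target):
        if draft_tok != target_tok:
            break
        accepted += 1
    return accepted


def mean_tokens_per_step(steps: List[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average tokens emitted per verification step.

    Each step emits the accepted draft tokens plus one token from the target
    (a correction at the first mismatch, or a bonus token if all matched).
    """
    if not steps:
        return 0.0
    total = sum(accepted_prefix_len(d, t) + 1 for d, t in steps)
    return total / len(steps)


if __name__ == "__main__":
    # Toy token ids: (draft proposal, target greedy tokens at the same positions
    # plus one extra position).
    steps = [
        ([5, 7, 9, 2], [5, 7, 1, 4, 8]),  # 2 accepted, then target emits 1
        ([3, 3], [3, 3, 6]),              # 2 accepted, then bonus token 6
        ([8], [9, 0]),                    # 0 accepted, then target emits 9
    ]
    print(f"mean tokens per step: {mean_tokens_per_step(steps):.3f}")
```

With sampling instead of greedy decoding, verification typically uses a rejection-sampling rule rather than exact token matching, so this sketch covers only the greedy case.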

Currently, DeepSpec includes three draft models: DSpark, DFlash and Eagle3.

DeepSpec is released under the MIT License. It includes code adapted from third-party projects under their own licenses; see NOTICE for the full attribution.

DeepSpec builds on the ideas and code of several excellent open-source projects:

SpecForge (Apache-2.0): the overall training framework and Eagle3 implementation; portions of the Eagle3 modeling, loss, optimizer, attention, and evaluation code are adapted from it. Adapted files carry an in-file attribution comment, and the full notice is recorded in NOTICE.
DFlash (MIT): the DFlash draft-model design and training recipe.
Qwen3 and Gemma: the target model families supported in this repo.

We thank the authors and maintainers of these projects. Contributions of new algorithms are welcome.

source & further reading


Read original on github.com → github.com/deepseek-ai/DeepSpec
