Dustin

mentions: 1 | type: Organization

// recent coverage (1 mention)

04:00

2026-06-25

arxiv.org

large-language-models

Dustin: Draft-Augmented Sparse Verification for Efficient Long-Context Generation with Speculative Decoding

Researchers propose Dustin, a sparse verification framework for long-context speculative decoding in LLMs, achieving a 27.85x speedup in self-attention and 9.17x end-to-end decoding speedup at 32k seq…

// co-occurs with top 3 entities

Qwen2.5-72B (1), PG-19 (1), LongBench (1)