20:00
2026-06-22
haoailab.com
large-language-models
JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting
Researchers introduced JetSpec, a speculative decoding method that trains a causal parallel draft head on a frozen target model to draft entire speculative trees in one pass while preserving autoregre…