HumanEval

mentions: 1 | type: Organization

// recent coverage: 1 mention

2026-06-29 04:00 | arxiv.org | large-language-models

EntMTP: Accelerating LLM Inference with Entropy Guided Multi Token Prediction

Researchers propose EntMTP, a training-free scheduler that dynamically adjusts multi-token prediction depth based on local entropy, achieving up to 1.36x speedup over Medusa baselines in LLM inference…

// co-occurs with: top 6 entities

EntMTP (1), Hydra (1), Medusa (1), ShareGPT (1), GSM8K (1), Litbench (1)