Zhijian Liu

mentions: 1 | type: Person

// recent coverage (1 mention)

2026-06-15 17:54

runtimewire.com

large-language-models

SGLang adds DFlash to push Qwen 3.5 397B-A17B inference up to 4.3x faster

Jian Chen, Yesheng Liang, and Zhijian Liu integrated Z Lab's DFlash block diffusion speculative decoding method into SGLang, collaborating with Modal and LMSYS. The team reports up to 4.31x higher thr…

// co-occurs with top 7 entities

SGLang (1), DFlash (1), Z Lab (1), Modal (1), LMSYS (1), Qwen 3.5 (1), Jian Chen (1)

// topics top 3 topics

large language models (1), ai infrastructure (1), ai research (1)