grep -l @llada.cpp /news/*.json | wc -l → 1

llada.cpp

mentions 1 type Organization

// recent coverage 1 mention

04:00
2026-06-15
arxiv.org
large-language-models

Efficient On-Device Diffusion LLM Inference with Mobile NPU

Researchers introduced llada.cpp, the first NPU-aware inference framework for accelerating diffusion large language models on smartphones, achieving 17x-42x latency reduction over CPU baselines while …

// co-occurs with top 4 entities