EdgeSync-LLM

mentions: 1 · type: Organization

// recent coverage (1 mention)

2026-06-30 14:10 · github.com · large-language-models

EdgeSync-LLM – KV cache fragment engine for on-device LLM inference (Go/Android)

EdgeSync-LLM, a new KV cache fragment engine for on-device LLM inference, stores and retrieves transformer KV tensors via HNSW approximate nearest-neighbor search, enabling exact hits at ~8ms TTFT and…

// co-occurs with top 7 entities

llama.cpp (1), MLC-LLM (1), ONNX Runtime (1), ARM (1), Android (1), HNSW (1), MiniLM-L6-v2 (1)