22:29
2026-06-14
arxiv.org
large-language-models
Still: Amortized KV Cache Compaction in a Single Forward Pass
Researchers introduced Still, a per-layer Perceiver model that compacts KV cache in a single forward pass, enabling efficient long-context language model deployment. On Qwen and Gemma models, Still ou…