21:33
2026-07-03
discuss.huggingface.co
large-language-models
Presenting TIS (Token Importance Scoring) - A new way to compress KV cache
A developer released TIS (Token Importance Scoring), a learned method for compressing the KV cache in large language models, achieving 100% accuracy on synthetic retrieval at 50% cache budget. The app…