20:52
2026-06-29
arxiv.org
machine-learning
Simplified Sparse Attention via Gist Tokens
Researchers introduced Simplified Sparse Attention (SSA), a method that uses gist tokens to achieve sparse attention without architectural changes, outperforming baselines on LongBench and improving rโฆ