Tensor Cache: Eviction-conditioned Associative Memory for Transformers


Researchers introduced Tensor Cache, a two-level memory system for Transformer models that compresses key-value pairs evicted from a sliding-window cache into a fixed-size outer-product fast-weight memory. A learned scalar gate fuses exact local attention with reads from the compressed memory. The authors also show that the common chunked-mean training shortcut introduces spurious cross-token outer products, and they replace it with a parallel weighted-sum scan that matches per-token writes. According to the abstract, Tensor Cache improves the memory-quality frontier over bounded-state baselines across systems scaling, associative recall, long-context language modeling, and memory-capacity diagnostics.
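The two-level design summarized above can be sketched in a few lines of plain Python. This is a minimal single-head illustration, not the authors' implementation: the class name `TwoLevelCacheSketch`, its parameters, the fixed (rather than learned) gate, the convex-combination form of the fusion, and the omission of the usual attention scaling are all assumptions made for the example.

```python
import math
from collections import deque


def outer(k, v):
    """Outer product of k and v as a len(k) x len(v) nested list."""
    return [[ki * vj for vj in v] for ki in k]


def vec_mat(q, A):
    """Row vector times matrix: returns q A."""
    return [sum(q[i] * A[i][j] for i in range(len(q))) for j in range(len(A[0]))]


class TwoLevelCacheSketch:
    """Hypothetical single-head cache: exact window plus a matrix memory."""

    def __init__(self, d_k, d_v, window, decay=0.99, write_rate=1.0, gate=0.5):
        self.window = deque()                        # recent (k, v) pairs, kept exactly
        self.capacity = window
        self.A = [[0.0] * d_v for _ in range(d_k)]   # fixed-size compressed memory
        self.decay, self.write_rate, self.gate = decay, write_rate, gate

    def append(self, k, v):
        self.window.append((k, v))
        if len(self.window) > self.capacity:
            k_old, v_old = self.window.popleft()     # eviction from the window ...
            update = outer(k_old, v_old)             # ... becomes a memory write
            self.A = [[self.decay * a + self.write_rate * u
                       for a, u in zip(row_a, row_u)]
                      for row_a, row_u in zip(self.A, update)]

    def read(self, q):
        # Exact softmax attention over the window (assumes it is non-empty).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k, _ in self.window]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        d_v = len(self.A[0])
        local = [sum(w * v[j] for w, (_, v) in zip(weights, self.window)) / z
                 for j in range(d_v)]
        # Compressed memory read: a single vector-matrix product.
        remote = vec_mat(q, self.A)
        g = self.gate                                # assumed convex fusion
        return [g * a + (1.0 - g) * b for a, b in zip(local, remote)]


cache = TwoLevelCacheSketch(d_k=2, d_v=2, window=1, decay=1.0, gate=0.0)
cache.append([1.0, 0.0], [5.0, 7.0])   # evicted by the next append
cache.append([0.0, 1.0], [2.0, 3.0])
print(cache.read([1.0, 0.0]))          # gate=0.0 reads the memory only: [5.0, 7.0]
```

With the gate at 0.0 the query recovers the evicted value from the matrix memory even though the pair has left the window; with the gate at 1.0 the same query would see only the pair still held in the window.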

Published May 25, 2026

arXiv:2605.22884v1. Abstract: Autoregressive Transformer KV caches grow linearly with context length; sliding-window caching bounds memory but discards evicted tokens entirely, so relevant evidence outside the window becomes inaccessible. We introduce Tensor Cache, a two-level cache that pairs sliding-window softmax attention as a first-level cache (L1) with a fixed-size outer-product fast-weight memory as a second-level cache (L2) fed by KV pairs evicted from the window. Recent tokens remain in exact local attention; evicted pairs are compressed into a per-layer matrix $A$ and read by future queries through a single matrix multiplication, exploiting the linear-attention identity $q_t(k_i \otimes v_i)=\langle q_t,k_i\rangle v_i$. A learned scalar gate fuses the L1 and L2 outputs, and per-head decay and write-rate parameters are trained end-to-end. The outer-product memory and the read identity are well-known; our contribution is their use as an L2 cache fed exclusively by sliding-window evictions, plus identifying that the common chunked-mean training shortcut $A \leftarrow \lambda A + \eta(\bar k \otimes \bar v)$ silently introduces $C^2 - C$ spurious cross-token outer products per chunk, and closing the gap with a parallel weighted-sum scan equivalent to per-token writes within float32 epsilon. Across systems scaling, controlled associative recall, long-context language modeling, and memory-capacity diagnostics, Tensor Cache improves the memory-quality frontier over bounded-state baselines.
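The gap between per-token writes and the chunked-mean shortcut is easy to see on a toy chunk. The sketch below is an illustration under stated assumptions, not the paper's code: the function names are hypothetical, the per-token rule is assumed to be $A \leftarrow \lambda A + \eta(k_i \otimes v_i)$ applied once per token, and a sequential loop stands in for the parallel scan, whose actual formulation is not given in the abstract. Expanding $\bar k \otimes \bar v$ for a chunk of $C$ tokens gives $C^2$ terms $k_i \otimes v_j$, of which only $C$ pair a key with its own value.

```python
def outer(k, v):
    """Outer product of k and v as a nested list."""
    return [[ki * vj for vj in v] for ki in k]


def axpy(alpha, A, beta, B):
    """Return alpha * A + beta * B for equally shaped nested lists."""
    return [[alpha * a + beta * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def per_token_writes(A, keys, values, lam, eta):
    """Reference behavior (assumed rule): one decayed write per token, in order."""
    for k, v in zip(keys, values):
        A = axpy(lam, A, eta, outer(k, v))
    return A


def weighted_sum_write(A, keys, values, lam, eta):
    """Closed form of the loop above for a chunk of C tokens:
    A <- lam^C * A + eta * sum_i lam^(C-1-i) * outer(k_i, v_i).
    The summands are independent, so they could be computed in parallel."""
    C = len(keys)
    acc = [[0.0] * len(values[0]) for _ in keys[0]]
    for i, (k, v) in enumerate(zip(keys, values)):
        acc = axpy(1.0, acc, lam ** (C - 1 - i), outer(k, v))
    return axpy(lam ** C, A, eta, acc)


def chunked_mean_write(A, keys, values, lam, eta):
    """The shortcut from the abstract: one write of mean(k) outer mean(v) per chunk."""
    C = len(keys)
    k_bar = [sum(col) / C for col in zip(*keys)]
    v_bar = [sum(col) / C for col in zip(*values)]
    return axpy(lam, A, eta, outer(k_bar, v_bar))


def cross_term_count(C):
    """Number of (i, j) pairs with i != j in the expansion of mean(k) outer mean(v)."""
    return sum(1 for i in range(C) for j in range(C) if i != j)


keys = [[1.0, 0.0], [0.0, 1.0]]      # two orthogonal keys
values = [[1.0, 0.0], [0.0, 1.0]]    # each key has its own value
A0 = [[0.0, 0.0], [0.0, 0.0]]

exact = per_token_writes(A0, keys, values, lam=1.0, eta=1.0)
shortcut = chunked_mean_write(A0, keys, values, lam=1.0, eta=1.0)
print(exact[0])      # row read by q = [1, 0]: [1.0, 0.0]
print(shortcut[0])   # same read through the shortcut: [0.25, 0.25]
```

In this toy case the per-token memory returns the first key's own value, while the chunked-mean memory returns a blend that includes the second token's value, which is the cross-token contamination the abstract describes. `weighted_sum_write` reproduces `per_token_writes` for the same inputs, and `cross_term_count(C)` equals $C^2 - C$.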


Read original on arxiv.org → arxiv.org/abs/2605.22884
