04:00
2026-06-29
arxiv.org
large-language-models
Supersede: Diagnosing and Training the Memory-Update Gap in LLM Agents
A paper posted on arXiv identifies a memory-update gap in LLM agents: accuracy drops from 92% to 77% when agents use bounded memory instead of full context, even with frontier models such as gpt-5.4. The g…