04:00
2026-06-17
arxiv.org
large-language-models
MemTrace: Probing What Final Accuracy Misses in Long-Term Memory
Researchers introduced MemTrace, a benchmark that evaluates LLM agents' long-term memory by probing individual knowledge points across dimensions like memory age, question type, and evidence condition…