cd /news/large-language-models/the-7-types-of-agent-memory-a-techni… · home topics large-language-models article
[ARTICLE · art-35946] src=marktechpost.com ↗ pub= topic=large-language-models verified=true sentiment=· neutral

The 7 Types of Agent Memory: A Technical Guide for AI Engineers

Large language models are stateless by default, but agents require memory to retain context across steps. A new technical guide identifies seven types of agent memory—working, semantic, episodic, procedural, retrieval, parametric, and prospective—each varying by timescale and storage location. Engineers must understand these types to build agents that can plan, learn, and act over time.

read6 min views1 publishedJun 21, 2026
The 7 Types of Agent Memory: A Technical Guide for AI Engineers
Image: MarkTechPost

Large language models are stateless by default. Each API call starts fresh. The model forgets your last message once the response returns. That is fine for a single question. It breaks the moment you build an agent.

Agents plan, call tools, and run across many steps. They need to remember. Memory is the infrastructure that fixes this. It turns a stateless model into a system that retains context. That system can learn from experience and act over time.

What is Agent Memory

Memory is any mechanism that carries information across a model’s reasoning. Some of it lives inside the context window. Some of it lives outside, in databases or model weights. Each type stores a different class of information for a different duration.

Memory varies by form and by time. Form means parametric, stored in weights, or non-parametric, stored as text. Time means short-term or long-term. The seven types below map onto those two axes.

The Seven Types of Agent Memory

1. In-Context / Working Memory (Short-Term): This is everything the model can currently see inside its context window. It includes the system prompt, recent messages, tool outputs, and reasoning steps. Think of it as RAM. It is fast and essential, but temporary and size-limited. Every other memory type competes for space here.

2. Semantic Memory (Long-Term): This is a persistent store of facts, preferences, and domain knowledge. It holds entries like “the user prefers Python over JavaScript.” The knowledge is decoupled from when it was learned. It is the agent’s organized encyclopedia about a user or topic.

3. Episodic Memory (Long-Term): This logs specific past events, full conversations, and task runs. It records what worked and what failed. The agent uses it to learn from experience. Systems like Reflexion and ExpeL write verbal post-mortems and store conclusions for future runs.

4. Procedural Memory (Long-Term): This is the agent’s knowledge of how to do things. It covers skills, tool usage patterns, workflows, and behavioral rules. A support agent handling its hundredth password reset does not re-reason the workflow. It executes a learned procedure instead.

5. External / Retrieval Memory (Short-Term + Long-Term): This is knowledge stored outside the model in a vector database. It is pulled into context at inference time using similarity search. This is RAG applied to agent history or documents. Retrieval quality becomes the bottleneck fast.

6. Parametric Memory (Long-Term): This is knowledge baked directly into the model’s weights during training. It holds language, reasoning patterns, and general world knowledge. The model does not look anything up. It generates from learned associations. The tradeoff is that this memory is frozen at training time.

7. Prospective Memory (Short-Term + Long-Term): This is the agent’s ability to remember future intentions and scheduled goals. It tracks things the agent planned but has not yet executed. It is critical for long-horizon and multi-step planning agents. Without it, an agent forgets its own commitments.

Side-by-Side: How the Seven Compare

The table below maps each type to its timescale, location, and typical implementation.

Memory type Timescale Where it lives What it stores Common implementation
Working / In-context Short-term Context window Prompt, messages, tool outputs Native to the LLM
Semantic Long-term External store Facts, preferences, domain knowledge Vector DB or profile schema
Episodic Long-term External store Past events, task runs, outcomes Vector DB plus event logs
Procedural Long-term Prompt or weights Skills, workflows, behavioral rules System prompt or fine-tune
Retrieval / External Both Vector database Documents, history chunks RAG pipeline
Parametric Long-term Model weights World knowledge, language, reasoning Pre-training or fine-tuning
Prospective Both State store Future intentions, scheduled goals Task queue or scheduler

Interactive Explainer

Use Cases: Which Memory Solves Which Problem

Each type answers a concrete product need. Map the need to the memory.

  • A coding assistant inside one session uses working memory. It tracks the open files and recent edits in context. Close the session and that state is gone.
  • A personal assistant that remembers you needs semantic memory. It stores “allergic to gluten” and recalls it next week. The fact survives across sessions.
  • A research agent that improves over time needs episodic memory. It recalls that risk sections landed well last month. It repeats what worked and avoids what failed.
  • A travel-booking agent needs procedural memory. It knows the flow: search flights, compare, reserve, confirm. The sequence is a learned skill, not a fresh plan.
  • A documentation chatbot needs retrieval memory. It embeds the docs and pulls relevant chunks per query. The answer stays grounded in retrieved text.
  • A long-horizon agent managing a week-long project needs prospective memory. It remembers to send the Friday report. The intention persists until execution.

A Combined Example: All Seven in One Agent

Consider an autonomous market-analysis agent. One task exercises every memory type at once.

Parametric memory supplies the base reasoning and language. Retrieval memory pulls current market data from a vector store. Semantic memory provides the user’s preferred report format. Episodic memory recalls which sources proved reliable before. Procedural memory drives the section order: sizing, then landscape, then risk. Prospective memory schedules the follow-up draft for next week. Working memory assembles all of it into the active context.

Remove any one layer and the agent gets weaker. Each handles a job the others cannot.

Implementation: A Minimal Memory Stack

Here is a stripped-down sketch in Python. It shows working, semantic, episodic, and procedural memory as separate stores.

from datetime import datetime

semantic_memory = {"diet": "vegetarian", "language_pref": "Python"}

episodic_memory = [
    {"timestamp": datetime.now(),
     "event": "recipe_request",
     "result": "user liked a 20-minute meal"},
]

def suggest_recipe(diet):
    return f"a quick {diet} recipe"

procedural_memory = {"suggest_recipe": suggest_recipe}

def build_context(query):
    diet = semantic_memory["diet"]
    last = episodic_memory[-1]["result"]
    skill = procedural_memory["suggest_recipe"]
    return (
        f"Query: {query}\n"
        f"Semantic: user is {diet}\n"
        f"Episodic: last time, {last}\n"
        f"Procedural: returning {skill(diet)}"
    )

print(build_context("suggest dinner"))

In production, the long-term stores move to a vector database. The pattern stays the same. Write to long-term memory, retrieve into working memory, then reason.

How to Layer Them: A Practical Build Order

Do not build all seven at once. Add memory only when a real need justifies the complexity.

  • Start with working memory. It ships with the model. Most simple agents need nothing more.
  • Add semantic memory when users expect the agent to remember them across sessions. This is the first long-term layer most products require.
  • Layer in episodic, procedural, and prospective memory later. Add them only when your agent must plan ahead, learn from failure, and adapt over time.
  • Parametric and retrieval memory are often already present. Parametric memory is the base model itself. Retrieval memory arrives the moment you add RAG.

Sources: CoALA framework (Princeton, arXiv:2309.02427); “Memory in the Age of AI Agents” survey (arXiv:2512.13564); “From Human Memory to AI Memory” survey (arXiv:2504.15965); LangChain LangMem, MongoDB, Redis, and Neo4j agent-memory documentation; original concept notes on the seven memory types.

── more in #large-language-models 4 stories · sorted by recency
── more on @reflexion 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/the-7-types-of-agent…] indexed:0 read:6min 2026-06-21 ·