{"slug": "103-agent-memory-short-term-long-term-and-episodic", "title": "103. Agent Memory: Short-Term, Long-Term, and Episodic", "summary": "An engineer built a four-tier memory architecture for AI agents, implementing working, semantic, episodic, and procedural memory systems from scratch. The system enables agents to retain context across sessions, recall past conversations and user preferences, and build persistent knowledge rather than starting fresh with each interaction. The implementation uses vector stores for semantic memory, structured databases for episodic memory, and in-context windows for working memory.", "body_md": "Every conversation with an LLM starts from zero.\n\nYou explain your project. You explain your preferences. You explain your constraints. You spend five minutes providing context. You come back tomorrow. You do it all again.\n\nThe model remembers nothing between sessions. The context window closes. The state is gone. Every interaction is the agent's first day on the job.\n\nHuman productivity depends on memory. We remember what worked last time. We build on past experience. We know our tools, our colleagues, our recurring problems. We do not start from scratch daily.\n\nAgents with memory do this. They remember past conversations. They recall relevant facts. They store successful strategies. 
They build up a model of the user's preferences and project context over time.\n\nThis post builds all four types of agent memory from scratch.\n\n``` python\nimport os\nimport json\nimport time\nimport hashlib\nfrom typing import List, Dict, Optional, Any, Tuple\nfrom dataclasses import dataclass, field\nfrom datetime import datetime\nfrom pathlib import Path\nimport anthropic\nimport numpy as np\n\nprint(\"The Four Types of Agent Memory:\")\nprint()\n\nmemory_types = {\n    \"Working Memory (In-Context)\": {\n        \"speed\":       \"Instant\",\n        \"capacity\":    \"Limited by context window (~200K tokens)\",\n        \"persistence\": \"Session only — gone when conversation ends\",\n        \"best_for\":    \"Current conversation, active task state\",\n        \"implementation\": \"messages list in API call\",\n    },\n    \"Semantic Memory (Vector Store)\": {\n        \"speed\":       \"Fast (milliseconds)\",\n        \"capacity\":    \"Millions of embeddings\",\n        \"persistence\": \"Persistent across sessions\",\n        \"best_for\":    \"Knowledge base, past conversations, documents\",\n        \"implementation\": \"ChromaDB, Pinecone, FAISS\",\n    },\n    \"Episodic Memory (Structured Store)\": {\n        \"speed\":       \"Fast (key-value lookup)\",\n        \"capacity\":    \"Unlimited\",\n        \"persistence\": \"Persistent across sessions\",\n        \"best_for\":    \"User preferences, facts, past actions, outcomes\",\n        \"implementation\": \"SQLite, Redis, JSON files\",\n    },\n    \"Procedural Memory (Weights)\": {\n        \"speed\":       \"Instant (baked in)\",\n        \"capacity\":    \"Model-dependent\",\n        \"persistence\": \"Requires fine-tuning to update\",\n        \"best_for\":    \"Skills, domain knowledge, behavioral patterns\",\n        \"implementation\": \"Fine-tuning, LoRA adapters\",\n    },\n}\n\nfor name, info in memory_types.items():\n    print(f\"  {name}:\")\n    for key, val in info.items():\n       
 print(f\"    {key:<18}: {val}\")\n    print()\n\n\nclass WorkingMemory:\n    \"\"\"\n    Short-term memory that lives in the context window.\n    Trims the oldest turns so the window stays within the turn and token budgets.\n    \"\"\"\n\n    def __init__(self, max_turns: int = 20, max_tokens: int = 50000):\n        self.turns:      List[Dict] = []\n        self.max_turns   = max_turns\n        self.max_tokens  = max_tokens\n\n    def add(self, role: str, content: str):\n        self.turns.append({\n            \"role\":      role,\n            \"content\":   content,\n            \"timestamp\": datetime.utcnow().isoformat(),\n            \"tokens\":    int(len(content.split()) * 1.3)  # rough word-count estimate, not a real tokenizer\n        })\n        self._trim_if_needed()\n\n    def _estimated_tokens(self) -> int:\n        return sum(t[\"tokens\"] for t in self.turns)\n\n    def _trim_if_needed(self):\n        # Drop the oldest messages while either budget is exceeded (always keep the newest one).\n        while len(self.turns) > 1 and (\n                len(self.turns) > self.max_turns * 2\n                or self._estimated_tokens() > self.max_tokens):\n            self.turns.pop(0)\n        # Keep the window starting on a user message so roles still alternate user/assistant.\n        while len(self.turns) > 1 and self.turns[0][\"role\"] != \"user\":\n            self.turns.pop(0)\n\n    def get_messages(self) -> List[Dict]:\n        return [{\"role\": t[\"role\"], \"content\": t[\"content\"]} for t in self.turns]\n\n    def get_recent(self, n_turns: int = 5) -> List[Dict]:\n        recent = self.turns[-(n_turns * 2):]\n        return [{\"role\": t[\"role\"], \"content\": t[\"content\"]} for t in recent]\n\n    def summarize_old(self, keep_last: int = 5) -> str:\n        \"\"\"Compress old turns into a summary to free context space.\"\"\"\n        if len(self.turns) <= keep_last * 2:\n            return \"\"\n        old_turns = self.turns[:-(keep_last * 2)]\n        summary_parts = []\n        for turn in old_turns:\n            if turn[\"role\"] == \"user\":\n                summary_parts.append(f\"User asked about: {turn['content'][:50]}\")\n        return \"Previous conversation summary: \" + \"; \".join(summary_parts)\n\n    def clear(self):\n        self.turns = []\n\n    def __len__(self):\n        return len(self.turns) // 2\n\nwm = WorkingMemory(max_turns=10)\nwm.add(\"user\",      \"My name is Rahul and I am building a recommendation 
system.\")\nwm.add(\"assistant\", \"Great! What type of recommendations? User-based or item-based?\")\nwm.add(\"user\",      \"User-based collaborative filtering for an e-commerce platform.\")\nwm.add(\"assistant\", \"For user-based CF, you will need a user-item interaction matrix...\")\n\nprint(\"Working Memory Demo:\")\nprint(f\"  Current turns:  {len(wm)}\")\nprint(f\"  Messages in context: {len(wm.get_messages())}\")\nprint()\nprint(\"  Recent context:\")\nfor msg in wm.get_messages():\n    print(f\"    [{msg['role']:<10}]: {msg['content'][:60]}...\")\n\nfrom sentence_transformers import SentenceTransformer\nfrom sklearn.metrics.pairwise import cosine_similarity\n\nclass SemanticMemory:\n    \"\"\"\n    Long-term memory stored as embeddings.\n    Retrieves relevant past context by semantic similarity.\n    Think of this as the agent's 'searchable journal.'\n    \"\"\"\n\n    def __init__(self, embed_model: str = \"all-MiniLM-L6-v2\",\n                 persist_path: str = \"./agent_memory\"):\n        self.embedder      = SentenceTransformer(embed_model)\n        self.persist_path  = Path(persist_path)\n        # parents=True so nested paths such as ./memory_<agent_id>/semantic can be created\n        self.persist_path.mkdir(parents=True, exist_ok=True)\n\n        self._entries:      List[Dict]  = []\n        self._embeddings:   Optional[np.ndarray] = None\n        self._load()\n\n    def remember(self, content: str, memory_type: str = \"conversation\",\n                 metadata: Optional[Dict] = None):\n        \"\"\"Store a memory with embedding.\"\"\"\n        entry = {\n            \"id\":          hashlib.md5(content.encode()).hexdigest()[:8],\n            \"content\":     content,\n            \"type\":        memory_type,\n            \"timestamp\":   datetime.utcnow().isoformat(),\n            \"metadata\":    metadata or {}\n        }\n        self._entries.append(entry)\n\n        new_emb = self.embedder.encode([content])\n        self._embeddings = (\n            new_emb if self._embeddings is None\n            else np.vstack([self._embeddings, new_emb])\n   
     )\n        self._save()\n\n    def recall(self, query: str, top_k: int = 3,\n               memory_type: Optional[str] = None,\n               min_score: float = 0.3) -> List[Dict]:\n        \"\"\"Retrieve most relevant memories for a query.\"\"\"\n        if not self._entries:\n            return []\n\n        query_emb   = self.embedder.encode([query])\n        scores      = cosine_similarity(query_emb, self._embeddings)[0]\n        ranked_idxs = np.argsort(scores)[::-1]\n\n        results = []\n        for idx in ranked_idxs:\n            if len(results) >= top_k:\n                break\n            entry = self._entries[idx]\n            score = float(scores[idx])\n\n            if score < min_score:\n                continue\n            if memory_type and entry[\"type\"] != memory_type:\n                continue\n\n            results.append({**entry, \"relevance_score\": round(score, 4)})\n\n        return results\n\n    def forget(self, memory_id: str):\n        \"\"\"Remove a specific memory.\"\"\"\n        idx = next((i for i, e in enumerate(self._entries)\n                    if e[\"id\"] == memory_id), None)\n        if idx is not None:\n            self._entries.pop(idx)\n            self._embeddings = np.delete(self._embeddings, idx, axis=0)\n            self._save()\n\n    def _save(self):\n        data_path = self.persist_path / \"memories.json\"\n        with open(data_path, \"w\") as f:\n            json.dump(self._entries, f, indent=2)\n\n        if self._embeddings is not None:\n            np.save(self.persist_path / \"embeddings.npy\", self._embeddings)\n\n    def _load(self):\n        data_path = self.persist_path / \"memories.json\"\n        emb_path  = self.persist_path / \"embeddings.npy\"\n\n        if data_path.exists():\n            with open(data_path) as f:\n                self._entries = json.load(f)\n\n        if emb_path.exists():\n            self._embeddings = np.load(emb_path)\n\n    def __len__(self):\n        return 
len(self._entries)\n\nsm = SemanticMemory(persist_path=\"./test_agent_memory\")\n\nsm.remember(\"User is building a recommendation system for e-commerce\", \"preference\")\nsm.remember(\"User prefers Python and PyTorch over TensorFlow\", \"preference\")\nsm.remember(\"Previous session: debugged a cosine similarity bug in the recommendation engine\", \"episode\")\nsm.remember(\"User's company uses PostgreSQL for the main database\", \"fact\")\nsm.remember(\"User struggled with cold-start problem for new users\", \"episode\")\nsm.remember(\"Solved cold-start by using content-based features initially\", \"solution\")\n\nprint(\"Semantic Memory Demo:\")\nprint(f\"  Stored memories: {len(sm)}\")\nprint()\n\nqueries = [\n    \"What database does this user use?\",\n    \"Has this user had problems with new users?\",\n    \"What tools does this user prefer?\",\n]\n\nfor query in queries:\n    results = sm.recall(query, top_k=2)\n    print(f\"  Query: '{query}'\")\n    for r in results:\n        print(f\"    [{r['relevance_score']:.3f}] ({r['type']}) {r['content'][:60]}\")\n    print()\n\nimport sqlite3\nfrom contextlib import contextmanager\n\nclass EpisodicMemory:\n    \"\"\"\n    Structured memory for facts, preferences, and past events.\n    Uses SQLite for persistence. 
Think of it as the agent's 'fact file.'\n    \"\"\"\n\n    def __init__(self, db_path: str = \"./agent_episodes.db\"):\n        self.db_path = db_path\n        self._init_db()\n\n    def _init_db(self):\n        with self._conn() as conn:\n            conn.executescript(\"\"\"\n                CREATE TABLE IF NOT EXISTS facts (\n                    key         TEXT PRIMARY KEY,\n                    value       TEXT NOT NULL,\n                    category    TEXT DEFAULT 'general',\n                    confidence  REAL DEFAULT 1.0,\n                    created_at  TEXT,\n                    updated_at  TEXT,\n                    source      TEXT\n                );\n\n                CREATE TABLE IF NOT EXISTS episodes (\n                    id          INTEGER PRIMARY KEY AUTOINCREMENT,\n                    action      TEXT NOT NULL,\n                    result      TEXT,\n                    success     INTEGER DEFAULT 1,\n                    context     TEXT,\n                    timestamp   TEXT,\n                    session_id  TEXT\n                );\n\n                CREATE TABLE IF NOT EXISTS preferences (\n                    key         TEXT PRIMARY KEY,\n                    value       TEXT NOT NULL,\n                    updated_at  TEXT\n                );\n            \"\"\")\n\n    @contextmanager\n    def _conn(self):\n        conn = sqlite3.connect(self.db_path)\n        conn.row_factory = sqlite3.Row\n        try:\n            yield conn\n            conn.commit()\n        finally:\n            conn.close()\n\n    def store_fact(self, key: str, value: str,\n                   category: str = \"general\",\n                   confidence: float = 1.0,\n                   source: str = \"\"):\n        now = datetime.utcnow().isoformat()\n        with self._conn() as conn:\n            conn.execute(\"\"\"\n                INSERT OR REPLACE INTO facts\n                VALUES (?, ?, ?, ?, COALESCE((SELECT created_at FROM facts WHERE key=?), ?), ?, ?)\n   
         \"\"\", (key, value, category, confidence, key, now, now, source))\n\n    def get_fact(self, key: str) -> Optional[Dict]:\n        with self._conn() as conn:\n            row = conn.execute(\n                \"SELECT * FROM facts WHERE key = ?\", (key,)).fetchone()\n            return dict(row) if row else None\n\n    def get_facts_by_category(self, category: str) -> List[Dict]:\n        with self._conn() as conn:\n            rows = conn.execute(\n                \"SELECT * FROM facts WHERE category = ? ORDER BY updated_at DESC\",\n                (category,)).fetchall()\n            return [dict(r) for r in rows]\n\n    def log_episode(self, action: str, result: str = \"\",\n                     success: bool = True, context: str = \"\",\n                     session_id: str = \"\"):\n        with self._conn() as conn:\n            conn.execute(\"\"\"\n                INSERT INTO episodes (action, result, success, context, timestamp, session_id)\n                VALUES (?, ?, ?, ?, ?, ?)\n            \"\"\", (action, result, int(success), context,\n                  datetime.utcnow().isoformat(), session_id))\n\n    def get_recent_episodes(self, n: int = 10,\n                             success_only: bool = False) -> List[Dict]:\n        query = \"SELECT * FROM episodes\"\n        if success_only:\n            query += \" WHERE success = 1\"\n        query += \" ORDER BY timestamp DESC LIMIT ?\"\n        with self._conn() as conn:\n            return [dict(r) for r in conn.execute(query, (n,)).fetchall()]\n\n    def set_preference(self, key: str, value: str):\n        with self._conn() as conn:\n            conn.execute(\n                \"INSERT OR REPLACE INTO preferences VALUES (?, ?, ?)\",\n                (key, value, datetime.utcnow().isoformat()))\n\n    def get_preference(self, key: str, default: str = \"\") -> str:\n        with self._conn() as conn:\n            row = conn.execute(\n                \"SELECT value FROM preferences WHERE key = 
?\", (key,)).fetchone()\n            return row[\"value\"] if row else default\n\n    def get_all_preferences(self) -> Dict[str, str]:\n        with self._conn() as conn:\n            rows = conn.execute(\"SELECT key, value FROM preferences\").fetchall()\n            return {r[\"key\"]: r[\"value\"] for r in rows}\n\nem = EpisodicMemory(db_path=\"./test_episodes.db\")\n\nem.store_fact(\"user_name\",        \"Rahul\",           category=\"identity\")\nem.store_fact(\"user_role\",        \"ML Engineer\",      category=\"identity\")\nem.store_fact(\"project_type\",     \"recommendation\",   category=\"project\")\nem.store_fact(\"db_technology\",    \"PostgreSQL\",        category=\"tech_stack\")\nem.store_fact(\"preferred_lang\",   \"Python\",            category=\"preference\")\nem.store_fact(\"preferred_ml_lib\", \"PyTorch\",           category=\"preference\")\n\nem.log_episode(\"Helped debug cosine similarity\", \"Fixed shape mismatch\",\n               success=True, session_id=\"sess_001\")\nem.log_episode(\"Explained collaborative filtering\", \"User understood\",\n               success=True, session_id=\"sess_001\")\nem.log_episode(\"Tried matrix factorization approach\", \"Memory error on large data\",\n               success=False, session_id=\"sess_002\")\n\nem.set_preference(\"response_style\", \"concise with code examples\")\nem.set_preference(\"explanation_depth\", \"intermediate\")\n\nprint(\"Episodic Memory Demo:\")\nprint()\nprint(\"  User Facts:\")\nfor fact in em.get_facts_by_category(\"identity\"):\n    print(f\"    {fact['key']}: {fact['value']}\")\n\nprint()\nprint(\"  Recent Episodes:\")\nfor ep in em.get_recent_episodes(3):\n    status = \"✓\" if ep[\"success\"] else \"✗\"\n    print(f\"    {status} {ep['action'][:50]}: {ep['result'][:40]}\")\n\nprint()\nprint(\"  Preferences:\")\nfor key, val in em.get_all_preferences().items():\n    print(f\"    {key}: {val}\")\nclass MemoryAgent:\n    \"\"\"\n    A complete agent with all four memory types 
integrated.\n    Personalizes responses based on accumulated memory.\n    \"\"\"\n\n    def __init__(self, agent_id: str = \"agent_default\",\n                 model: str = \"claude-3-5-haiku-20241022\"):\n        self.agent_id = agent_id\n        self.client   = anthropic.Anthropic(api_key=os.environ.get(\"ANTHROPIC_API_KEY\"))\n        self.model    = model\n\n        self.working_memory  = WorkingMemory(max_turns=15)\n        self.semantic_memory = SemanticMemory(\n            persist_path=f\"./memory_{agent_id}/semantic\")\n        self.episodic_memory = EpisodicMemory(\n            db_path=f\"./memory_{agent_id}/episodic.db\")\n\n        self._session_id = hashlib.md5(\n            str(time.time()).encode()).hexdigest()[:8]\n\n    def _build_memory_context(self, query: str) -> str:\n        \"\"\"Assemble relevant memories into a context block.\"\"\"\n        parts = []\n\n        prefs = self.episodic_memory.get_all_preferences()\n        if prefs:\n            parts.append(\"User preferences: \" +\n                         \"; \".join(f\"{k}={v}\" for k, v in prefs.items()))\n\n        key_facts = self.episodic_memory.get_facts_by_category(\"identity\")\n        key_facts += self.episodic_memory.get_facts_by_category(\"project\")\n        if key_facts:\n            facts_str = \"; \".join(f\"{f['key']}={f['value']}\" for f in key_facts[:5])\n            parts.append(f\"Known facts: {facts_str}\")\n\n        relevant_memories = self.semantic_memory.recall(query, top_k=3)\n        if relevant_memories:\n            mem_str = \"\\n\".join(\n                f\"- [{m['type']}] {m['content']}\" for m in relevant_memories)\n            parts.append(f\"Relevant past context:\\n{mem_str}\")\n\n        recent_episodes = self.episodic_memory.get_recent_episodes(3, success_only=True)\n        if recent_episodes:\n            ep_str = \"; \".join(ep[\"action\"][:40] for ep in recent_episodes)\n            parts.append(f\"Recent successful actions: {ep_str}\")\n\n        
return \"\\n\\n\".join(parts) if parts else \"\"\n\n    def chat(self, user_message: str, verbose: bool = False) -> str:\n        self.working_memory.add(\"user\", user_message)\n\n        memory_context = self._build_memory_context(user_message)\n\n        system = f\"\"\"You are a helpful AI assistant with memory of past interactions.\nUse the provided context to personalize your responses.\n\n{f'Memory context:{chr(10)}{memory_context}' if memory_context else ''}\n\nAdapt your response to the user's known preferences and expertise level.\"\"\"\n\n        response = self.client.messages.create(\n            model      = self.model,\n            max_tokens = 800,\n            system     = system,\n            messages   = self.working_memory.get_messages()\n        )\n        answer = response.content[0].text\n        self.working_memory.add(\"assistant\", answer)\n\n        self.semantic_memory.remember(\n            f\"User asked: {user_message[:100]}\",\n            memory_type = \"conversation\",\n            metadata    = {\"session\": self._session_id}\n        )\n        self.episodic_memory.log_episode(\n            action     = f\"Answered: {user_message[:50]}\",\n            result     = \"Success\",\n            session_id = self._session_id\n        )\n\n        if verbose:\n            used_memories = len(self.semantic_memory.recall(user_message, top_k=3))\n            print(f\"  [Memory] Used {used_memories} relevant memories, \"\n                  f\"{len(self.working_memory)} conversation turns in context\")\n\n        return answer\n\nmem_agent = MemoryAgent(agent_id=\"rahul_session\")\n\nmem_agent.episodic_memory.store_fact(\"user_name\", \"Rahul\", \"identity\")\nmem_agent.episodic_memory.store_fact(\"project\",   \"e-commerce recommender\", \"project\")\nmem_agent.episodic_memory.set_preference(\"explanation_depth\", \"intermediate\")\nmem_agent.semantic_memory.remember(\n    \"User previously struggled with cold-start problem\", 
\"episode\")\n\nprint(\"\\nMemory-Augmented Agent Demo:\")\nprint(\"=\" * 60)\n\nquestions = [\n    \"Can you remind me where we left off with my recommendation system?\",\n    \"What approach did we decide to use for new users?\",\n    \"I want to add diversity to the recommendations. Any ideas?\",\n]\n\nfor q in questions:\n    print(f\"\\nUser: {q}\")\n    answer = mem_agent.chat(q, verbose=True)\n    print(f\"Agent: {answer[:200]}...\")\nclass MemoryManager:\n    \"\"\"Handles memory maintenance: summarization, pruning, importance scoring.\"\"\"\n\n    def __init__(self, semantic_memory: SemanticMemory,\n                 episodic_memory: EpisodicMemory):\n        self.semantic = semantic_memory\n        self.episodic = episodic_memory\n\n    def summarize_session(self, session_id: str,\n                           llm_client=None) -> str:\n        \"\"\"Compress a full session into a summary memory.\"\"\"\n        episodes = [\n            ep for ep in self.episodic.get_recent_episodes(50)\n            if ep.get(\"session_id\") == session_id\n        ]\n\n        if not episodes:\n            return \"\"\n\n        session_text = \"\\n\".join(\n            f\"- {ep['action']}: {ep['result']}\" for ep in episodes)\n\n        summary = (\n            f\"Session {session_id}: \" +\n            \"; \".join(ep[\"action\"][:30] for ep in episodes[:5])\n        )\n\n        self.semantic.remember(\n            summary,\n            memory_type = \"session_summary\",\n            metadata    = {\"session_id\": session_id}\n        )\n        return summary\n\n    def get_memory_stats(self) -> Dict:\n        return {\n            \"semantic_memories\":     len(self.semantic),\n            \"total_episodes\":        len(self.episodic.get_recent_episodes(1000)),\n            \"successful_episodes\":   len(self.episodic.get_recent_episodes(1000, success_only=True)),\n            \"stored_preferences\":    len(self.episodic.get_all_preferences()),\n            
\"stored_facts\":          len(self.episodic.get_facts_by_category(\"identity\") +\n                                         self.episodic.get_facts_by_category(\"project\")),\n        }\n\nmm = MemoryManager(mem_agent.semantic_memory, mem_agent.episodic_memory)\n\nprint(\"\\nMemory Statistics:\")\nstats = mm.get_memory_stats()\nfor key, value in stats.items():\n    print(f\"  {key:<30}: {value}\")\nprint(\"\\nAgent Memory Reference Links:\")\nprint()\n\nrefs = {\n    \"Papers\": [\n        (\"MemGPT: Memory in LLM OS\",           \"arxiv.org/abs/2310.08560\"),\n        (\"Generative Agents (Stanford)\",        \"arxiv.org/abs/2304.03442\"),\n        (\"Memory-Augmented LLM Survey\",         \"arxiv.org/abs/2312.17512\"),\n        (\"Cognitive Architectures for LLMs\",    \"arxiv.org/abs/2309.02427\"),\n        (\"Reflexion: Verbal Reinforcement\",     \"arxiv.org/abs/2303.11366\"),\n    ],\n    \"Implementations\": [\n        (\"MemGPT GitHub\",                        \"github.com/cpacker/MemGPT\"),\n        (\"LangChain Memory docs\",                \"python.langchain.com/docs/modules/memory\"),\n        (\"LlamaIndex Memory module\",             \"docs.llamaindex.ai/en/stable/module_guides/storing/index_stores\"),\n        (\"Zep: Long-term memory for agents\",     \"getzep.com\"),\n        (\"Mem0: Memory layer for AI\",            \"mem0.ai\"),\n    ],\n    \"Tutorials\": [\n        (\"Building agents with memory (Anthropic)\", \"github.com/anthropics/anthropic-cookbook\"),\n        (\"LangGraph memory persistence\",             \"langchain-ai.github.io/langgraph/how-tos/persistence\"),\n        (\"Vector memory with ChromaDB\",              \"docs.trychroma.com/usage-guide\"),\n    ],\n    \"Cheat Sheets\": [\n        (\"SQLite Python reference\",              \"docs.python.org/3/library/sqlite3.html\"),\n        (\"Sentence Transformers quickstart\",     \"sbert.net/docs/quickstart.html\"),\n        (\"NumPy array operations\",               
\"numpy.org/doc/stable/reference/routines.array-manipulation\"),\n    ],\n}\n\nfor category, links in refs.items():\n    print(f\"  {category}:\")\n    for name, url in links:\n        print(f\"    • {name:<48} {url}\")\n    print()\n```\n\nCreate `agent_memory_practice.py`\n\n.\n\nPart 1: build the four-type memory system from this post. Initialize WorkingMemory, SemanticMemory, and EpisodicMemory. Run a 5-turn conversation. After each turn, store the exchange in both semantic (embedding) and episodic (SQLite) memory. Verify both stores contain the data.\n\nPart 2: test cross-session recall. Start a new conversation. Without providing any prior context, ask the agent something that requires remembering a fact from the previous session. Does it retrieve the relevant memory and personalize the response?\n\nPart 3: memory retrieval comparison. Take 10 queries. For each, retrieve top 3 results from semantic memory. Also retrieve results from episodic memory by category. Compare what each memory type surfaces. When is each one more useful?\n\nPart 4: memory decay. Add a \"recency weight\" to semantic memory recall: recent memories score higher than old ones. Implement this by multiplying the cosine similarity score by a decay factor based on age. Does it change which memories get retrieved?\n\nAgents with memory are powerful. Agents that can write and execute code are transformative. The next post covers code agents: agents that write Python, run it, observe the output, and iteratively improve their code until it solves the problem. 
Agent modes in coding tools such as GitHub Copilot and Cursor follow a similar write-run-observe loop.", "url": "https://wpnews.pro/news/103-agent-memory-short-term-long-term-and-episodic", "canonical_source": "https://dev.to/yakhilesh/103-agent-memory-short-term-long-term-and-episodic-1n98", "published_at": "2026-05-31 06:53:59+00:00", "updated_at": "2026-05-31 07:11:12.245846+00:00", "lang": "en", "topics": ["ai-agents", "large-language-models", "artificial-intelligence", "neural-networks", "machine-learning"], "entities": [], "alternates": {"html": "https://wpnews.pro/news/103-agent-memory-short-term-long-term-and-episodic", "markdown": "https://wpnews.pro/news/103-agent-memory-short-term-long-term-and-episodic.md", "text": "https://wpnews.pro/news/103-agent-memory-short-term-long-term-and-episodic.txt", "jsonld": "https://wpnews.pro/news/103-agent-memory-short-term-long-term-and-episodic.jsonld"}}