{"slug": "building-a-hermes-memory-plugin-for-a-voice-powered-conference-agent-with-engram", "title": "Building a Hermes Memory Plugin for a Voice-Powered Conference Agent with Weaviate Engram🧠", "summary": "A developer built a voice-powered conference agent using Hermes and extended its memory with Weaviate's Engram service. The agent answers visitor questions at booths and retains conversation history, allowing booth owners to later query visitor interests. Engram's memory system avoids duplicates and updates existing memories, solving Hermes' default memory limitations.", "body_md": "Recently, I have been attending a lot of conferences where different booths showcase different projects. One thing I kept noticing was how often visitors approached a booth only to find nobody there to answer their questions.\n\nAfter seeing this happen several times, I started wondering: what if every booth had an AI agent capable of answering visitor questions and keeping track of interactions for the booth owners?\n\nSo I decided to build one using [Hermes](https://hermes-agent.nousresearch.com/).\n\nBut I quickly ran into a problem: **memory**.\n\nHermes’ default memory system was designed for smaller, single-user interactions. I needed something that could retain information across many different visitors and conversations.\n\nThere are multiple third-party memory plugins for Hermes, but when I came across [Engram](https://weaviate.io/blog/engram-generally-available), Weaviate’s memory solution for AI agents, it looked like exactly what I needed. It gave me the opportunity to both enhance my agent’s memory and put Engram to the test.\n\nIn this article, I will walk you through how I built a voice-enabled conference agent on top of Hermes, and how I extended its memory by building a memory plugin using Engram.\n\nSo here’s what I set out to build.\n\nThe conference agent would run on a laptop at a booth while visitors walk up and interact with it. 
The agent would answer questions about the project, try to get to know the visitors, and keep track of conversations. Later, the booth owners could come back and ask the agent questions about what particular visitors were interested in.\n\nThe conference agent needs a few things: the AI agent itself, an interface for attendees, a way for booth owners to reach it, and memory.\n\nFor the AI agent, Hermes already had me covered. For the attendee interface, the [Hermes CLI](https://hermes-agent.nousresearch.com/docs/user-guide/cli) was perfect since it already has real-time [voice mode](https://hermes-agent.nousresearch.com/docs/user-guide/features/voice-mode).\n\nThen for the booth owners, the Hermes [messaging gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging/) makes it possible to communicate with the agent over Telegram and retrieve insights from conversations.\n\nSo Hermes already had almost everything covered except for one thing: **memory**.\n\nHermes’ built-in memory system was designed around single-user interactions. It stores memory in two Markdown files: `Memory.md` for general facts and `User.md` for user-specific information.\n\nThe problem is that this memory is very limited. 
`Memory.md` can only hold about **2,200 characters** (roughly 800 tokens), while `User.md` holds around **1,375 characters** (about 500 tokens).\n\nThat simply isn’t enough for a conference booth interacting with potentially hundreds of visitors.\n\nThat’s where [Engram](https://weaviate.io/blog/engram-deep-dive) comes in.\n\nThanks to Hermes’ [plugin system](https://hermes-agent.nousresearch.com/docs/user-guide/features/plugins), I could extend Hermes’ built-in memory and add Engram as a [custom memory provider](https://hermes-agent.nousresearch.com/docs/developer-guide/memory-provider-plugin#adding-cli-commands).\n\nLet’s briefly explore Engram.\n\n[Engram](https://weaviate.io/blog/engram-deep-dive) is Weaviate’s managed memory and context service, purpose-built to help AI agents orchestrate workflows, learn from experience, and anchor decisions to trusted knowledge.\n\nWhen a user interacts with an agent, Engram extracts useful information from the conversation and commits it as memories for later use.\n\nOne thing that makes Engram stand out is that it doesn’t just keep adding new memory blindly. It can also retrieve existing memories and update them with new information. This helps avoid duplicates and keeps memory more accurate over time.\n\nThat’s one of the main reasons Engram was a good fit for my project. I didn’t want a system that just keeps piling up redundant information.\n\nOnce memories are stored, they can be searched later and used by an AI agent or even in other pipelines like RAG systems.\n\nEngram is [generally available](https://weaviate.io/blog/engram-generally-available) to everyone. To get started, head over to [Weaviate Console](https://console.weaviate.cloud/) and sign up for the free tier. 
Once you're in, navigate to the Engram dashboard and create an API key.\n\nThen, install the Python SDK:\n\n```\npip install weaviate-engram\n```\n\nThen create a client:\n\n``` python\nfrom engram import EngramClient\n\nclient = EngramClient(api_key=\"your-api-key\")\n```\n\nThere are two main ways to add memory to Engram: using **strings** or **conversations**.\n\nWith strings, you can pass a single piece of text and Engram will extract useful information from it:\n\n``` python\nrun = client.memories.add(\n    \"Alice prefers async Python and avoids Java.\",\n    user_id=\"hermes\",\n)\n```\n\nThen for a conversation with an AI assistant:\n\n``` python\nrun = client.memories.add(\n   [\n       {\"role\": \"user\", \"content\": \"What's the best way to handle retries?\"},\n       {\"role\": \"assistant\", \"content\": \"Exponential backoff with jitter is the standard approach.\"},\n       {\"role\": \"user\", \"content\": \"Got it, I'll use that in my HTTP client.\"},\n   ],\n   user_id=\"hermes\",\n)\n```\n\nThe `user_id` is used to scope memory to a specific user, so you can easily separate and retrieve memories per person.\n\nYou can then search stored memories like this:\n\n``` python\nresults = client.memories.search(query=\"What does Alice think about Python?\", user_id=\"hermes\")\nfor memory in results:\n    print(memory.content)\n```\n\nThis lets you retrieve only the most relevant memories for a given user.\n\nNow that we understand how Engram works, let’s build the plugin.\n\nIn Hermes, every memory plugin inherits from the `MemoryProvider` class. This is an abstract base class, which means we need to implement the methods we want to use.\n\nA memory plugin consists of two main files:\n\n- `__init__.py`: This contains the actual plugin implementation\n- `plugin.yaml`: This defines the plugin metadata\n\nHere’s how we are going to implement the plugin:\n\nAfter each session with a user, the plugin will store the full conversation in Engram. 
Engram will then extract the relevant memories from it automatically.\n\nWe’ll also expose a tool that allows the agent to search Engram and retrieve relevant memories when needed.\n\nLet’s start implementing it. Below is the basic plugin structure without the logic implemented yet.\n\nThis code will live in `__init__.py`.\n\n``` python\nimport json\nimport os\nfrom typing import Any, Dict, List\n\nfrom agent.memory_provider import MemoryProvider\nfrom engram import EngramClient\n\nclass Engram(MemoryProvider):\n\n   @property\n   def name(self) -> str:\n       return \"engram\"\n\n   def is_available(self) -> bool:\n       return bool(os.environ.get(\"ENGRAM_API_KEY\"))\n\n   def initialize(self, session_id: str, **kwargs) -> None:\n       pass\n\n   def get_config_schema(self):\n       pass\n\n   def on_memory_write(\n       self,\n       action: str,\n       target: str,\n       content: str,\n   ) -> None:\n       pass\n\n   def on_session_end(self, messages: List[Dict[str, Any]]) -> None:\n       pass\n\n   def get_tool_schemas(self) -> List[Dict[str, Any]]:\n       pass\n\n   def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:\n       pass\n```\n\nThe methods stubbed out with `pass` above are the ones we will implement next.\n\nLet’s start by implementing the `initialize` and `get_config_schema` methods.\n\n``` python\n   def initialize(self, session_id: str, **kwargs) -> None:\n       self._client = EngramClient(api_key=os.environ[\"ENGRAM_API_KEY\"])\n       self._user_id = session_id\n```\n\nThis method is called when the plugin is loaded. Here we initialize the Engram client using the API key stored in the environment variables.\n\nWe also store the `session_id`. 
This will be used as the `user_id` when storing and searching memories.\n\nNext is the configuration schema:\n\n``` python\n   def get_config_schema(self):\n       return [\n           {\n               \"key\": \"api_key\",\n               \"description\": \"Engram API key\",\n               \"secret\": True,\n               \"required\": True,\n               \"env_var\": \"ENGRAM_API_KEY\",\n               \"url\": \"https://console.weaviate.cloud/engram\",\n           }\n       ]\n```\n\nIn `get_config_schema`, we define the configuration needed by the plugin. In this case, the plugin requires an `ENGRAM_API_KEY`.\n\nThe `on_memory_write` and `on_session_end` methods are hooks connected to Hermes’ event system.\n\nWhenever Hermes writes memory to its Markdown files, it triggers the `on_memory_write` hook. In this method, we send that memory directly to Engram.\n\n``` python\n   def on_memory_write(\n       self,\n       action: str,\n       target: str,\n       content: str,\n   ) -> None:\n       self._client.memories.add(content, user_id=self._user_id)\n```\n\nOne important thing to note is that Hermes still uses its built-in Markdown memory system alongside an external provider. We can use this to keep Engram continuously updated with the memories Hermes writes locally.\n\nThe `on_session_end` hook is triggered when a conversation session ends. 
Here, we store the entire conversation between the user and the agent.\n\n``` python\n   def on_session_end(self, messages: List[Dict[str, Any]]) -> None:\n       parsed_message = []\n       for message in messages:\n           if message['role'] == 'user':\n               parsed_message.append({'role': 'user', 'content': message['content'] })\n\n           if message['role'] == 'assistant':\n               parsed_message.append({'role': 'assistant', 'content': message['content'] })\n\n       self._client.memories.add(parsed_message, user_id=self._user_id)\n```\n\nBoth hooks use the session ID as the `user_id`. I decided to do it this way so that every visitor interacting with the booth agent gets their own dedicated memory scope. This keeps memories grouped per visitor instead of mixing conversations together.\n\nFor the agent to search memories stored in Engram, we first need to define a tool schema.\n\n``` python\nSEARCH_SCHEMA = {\n    \"name\": \"engram_search\",\n    \"description\": (\n        \"Search memories in engram\"\n    ),\n    \"parameters\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"query\": {\n                \"type\": \"string\",\n                \"description\": \"What to search for in engram's memory.\",\n            },\n            \"user_id\": {\n                \"type\": \"string\",\n                \"description\": \"The user ID to search memories for.\",\n            },\n        },\n        \"required\": [\"query\", \"user_id\"],\n    },\n}\n```\n\nNext, we expose the tool schema through `get_tool_schemas`:\n\n``` python\n   def get_tool_schemas(self) -> List[Dict[str, Any]]:\n       return [SEARCH_SCHEMA]\n```\n\nFinally, we implement `handle_tool_call`, which runs whenever the agent calls the tool.\n\n``` python\n    def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:\n\n        if tool_name == \"engram_search\":\n            query = args[\"query\"]\n            user_id = 
args[\"user_id\"]\n            results = self._client.memories.search(query=query, user_id=user_id)\n            text = []\n            for result in results:\n                text.append(result.content)\n\n            return json.dumps({\"result\": \"\\n\".join(text)})\n\n        return json.dumps({\"error\": f\"Unknown tool {tool_name}\"})\n```\n\nAfter implementing the Engram memory provider, we can register it as a memory plugin at the bottom of the file:\n\n``` python\ndef register(ctx) -> None:\n   \"\"\"Called by the memory plugin discovery system.\"\"\"\n   ctx.register_memory_provider(Engram())\n```\n\nThis allows Hermes to discover and load our plugin automatically.\n\nHere’s the complete `__init__.py` file:\n\n``` python\nimport json\nimport os\nfrom typing import Any, Dict, List\n\nfrom agent.memory_provider import MemoryProvider\nfrom engram import EngramClient\n\nSEARCH_SCHEMA = {\n    \"name\": \"engram_search\",\n    \"description\": (\n        \"Search memories in engram\"\n    ),\n    \"parameters\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"query\": {\n                \"type\": \"string\",\n                \"description\": \"What to search for in engram's memory.\",\n            },\n            \"user_id\": {\n                \"type\": \"string\",\n                \"description\": \"The user ID to search memories for.\",\n            },\n        },\n        \"required\": [\"query\", \"user_id\"],\n    },\n}\n\nclass Engram(MemoryProvider):\n\n    @property\n    def name(self) -> str:\n        return \"engram\"\n\n    def is_available(self) -> bool:\n        return bool(os.environ.get(\"ENGRAM_API_KEY\"))\n\n    def initialize(self, session_id: str, **kwargs) -> None:\n        self._client = EngramClient(api_key=os.environ[\"ENGRAM_API_KEY\"])\n        self._user_id = session_id\n\n    def get_config_schema(self):\n        return [\n            {\n                \"key\": \"api_key\",\n                
\"description\": \"Engram API key\",\n                \"secret\": True,\n                \"required\": True,\n                \"env_var\": \"ENGRAM_API_KEY\",\n                \"url\": \"https://console.weaviate.cloud/engram\",\n            }\n        ]\n\n    def on_memory_write(\n        self,\n        action: str,\n        target: str,\n        content: str,\n    ) -> None:\n        self._client.memories.add(content, user_id=self._user_id)\n\n    def on_session_end(self, messages: List[Dict[str, Any]]) -> None:\n        parsed_message = []\n        for message in messages:\n            if message['role'] == 'user':\n                parsed_message.append({'role': 'user', 'content': message['content'] })\n\n            if message['role'] == 'assistant':\n                parsed_message.append({'role': 'assistant', 'content': message['content'] })\n\n        self._client.memories.add(parsed_message, user_id=self._user_id)\n\n    def get_tool_schemas(self) -> List[Dict[str, Any]]:\n        return [SEARCH_SCHEMA]\n\n    def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:\n\n        if tool_name == \"engram_search\":\n            query = args[\"query\"]\n            user_id = args[\"user_id\"]\n            results = self._client.memories.search(query=query, user_id=user_id)\n            text = []\n            for result in results:\n                text.append(result.content)\n\n            return json.dumps({\"result\": \"\\n\".join(text)})\n\n        return json.dumps({\"error\": f\"Unknown tool {tool_name}\"})\n\ndef register(ctx) -> None:\n    \"\"\"Called by the memory plugin discovery system.\"\"\"\n    ctx.register_memory_provider(Engram())\n```\n\nNext, we can create the `plugin.yaml`\n\nfile. This file stores the plugin metadata and defines the hooks used by the plugin.\n\n```\nname: engram\nversion: 1.0.0\ndescription: \"Engram is a fully managed memory service by Weaviate. 
It lets you add persistent, personalized memory to AI assistants and agents.\"\npip_dependencies:\n - weaviate-engram\nhooks:\n - on_session_end\n - on_memory_write\n```\n\nNow that the memory plugin is implemented, let’s set it up in Hermes. First, place both the `__init__.py` and `plugin.yaml` files inside a folder called `engram`.\n\nYour directory should look like this:\n\n```\nengram/\n├── __init__.py\n└── plugin.yaml\n```\n\nNext, move the `engram` directory into the Hermes plugins directory:\n\n```\nmv engram ~/.hermes/plugins/\n```\n\nHermes automatically discovers plugins from this location.\n\nNow enable the plugin by running:\n\n```\nhermes plugins enable engram\n```\n\nNext, run:\n\n```\nhermes memory\n```\n\nThis lets you confirm that `engram` now appears as one of the available memory providers.\n\nAfter that, run the setup command:\n\n```\nhermes memory setup\n```\n\nYou’ll be prompted with a list of available memory providers. Select `engram` and provide your Engram API key when asked.\n\nOnce setup is complete, run:\n\n```\nhermes memory\n```\n\nThis time, it will show Engram configured as an active memory provider.\n\nNow that the Engram plugin is working, let’s test it out.\n\nBefore launching Hermes, we first need to modify the `SOUL.md` file located at `~/.hermes/SOUL.md`. This file defines Hermes’ personality and behavior.\n\nFor this demo, I want Hermes to behave like an AI agent stationed at a Weaviate booth, showcasing Engram at a conference. I also want it to treat every conversation as a completely new interaction.\n\nHere’s the modified `SOUL.md` file:\n\n```\n# Hermes Agent Persona\n\nYou are Hermes, an AI agent representing Weaviate at a conference booth. Your role is to help attendees learn about Weaviate and answer questions about its products, especially Engram, Weaviate’s memory product.\n\nTreat every conversation as if you are speaking to a new attendee for the first time. 
Be warm, friendly, and approachable.\nStart by introducing yourself, asking for the person’s name, and asking what they would like to know about Engram or Weaviate.\n\nYour goal is to clearly explain concepts, answer questions accurately, and help people understand how Engram can be used in real world AI applications. Keep your responses conversational, engaging, and easy to understand regardless of the attendee’s technical background.\n```\n\nWith that in place, we can launch the Hermes CLI and start chatting with the agent.\n\nI took on the persona of “Paul” and had a conversation with Hermes. After the interaction, I closed the session using `Ctrl + C`.\n\nWhen I opened the Engram dashboard, I could see that memories from the conversation had been successfully stored.\n\nI could also browse memories from other sessions and confirm that each visitor’s interactions were being stored separately.\n\nThis means Hermes can now retain information across multiple users and conversations instead of relying only on short-term Markdown memory.\n\nWith Engram memory now integrated into Hermes, the conference agent is almost ready. The next thing we need to set up is voice support.\n\nThere are different approaches to [handling voice](https://hermes-agent.nousresearch.com/docs/user-guide/features/voice-mode) in Hermes. You could use cloud models for both speech-to-text and text-to-speech, or you could run everything locally.\n\nI decided to go with local models. 
Here is how I set it up.\n\nFirst, I installed the required Python dependencies:\n\n```\npip install \"hermes-agent[voice]\"\npip install -U \"neutts[all]\"\npip install sounddevice numpy\n```\n\nNeuTTS provides local text-to-speech capabilities.\n\nNext, I installed the system dependencies required for audio processing:\n\n```\nsudo apt install portaudio19-dev ffmpeg libopus0\nsudo apt install espeak-ng   # required for NeuTTS\n```\n\nOnce everything is installed, Hermes can be launched in the terminal and voice mode enabled with:\n\n```\n/voice on\n```\n\nTo enable text-to-speech:\n\n```\n/voice tts\n```\n\nThe last feature to add is the [messaging gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging/), which allows booth owners to message the agent and retrieve insights from ongoing conversations with attendees.\n\nThis makes it possible to ask questions, monitor interactions, and extract information even when they are not physically at the booth, without interrupting attendees interacting with the agent.\n\nHermes supports [multiple messaging platforms](https://hermes-agent.nousresearch.com/docs/user-guide/messaging/#platform-comparison). In this case, I used Telegram as the primary interface for interacting with the agent.\n\nFollow the official Hermes guide to [set up Telegram](https://hermes-agent.nousresearch.com/docs/user-guide/messaging/telegram) as your messaging platform, or choose another supported option. Once configured, [enable the gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging/telegram#start-the-gateway) and you can start interacting with the agent remotely.\n\nOver Telegram, I used the `engram_search` tool to retrieve memory for a specific user using their session ID. 
Since session IDs are only accessible to authorized admins via the Engram dashboard, this keeps the data private while still allowing useful insights for booth owners.\n\nWith the conference agent completed, let’s recap what we have built: a Hermes agent with an Engram memory plugin, local voice support, and a Telegram messaging gateway for booth owners.\n\nWhile this agentic setup was built with conference booths in mind, the memory plugin we built could be applied to other use cases as well.\n\n**You can get the full code in this article here**: [Hermes_Engram_Plugin](https://github.com/Studio1HQ/Hermes_Engram_Plugin)\n\nWhat started as a simple idea for a better conference booth experience turned into a full AI agent with persistent memory, real-time interaction, and voice capabilities.\n\nWith [Hermes](https://github.com/nousresearch/hermes-agent) handling conversations, [Engram](https://weaviate.io/blog/engram-generally-available) managing long-term memory, and voice support making interactions feel natural, the agent is no longer just answering questions; it’s actively remembering people, conversations, and context across sessions.\n\nWhile this solution was built with a conference setting in mind, the memory plugin we developed can be used far beyond that. 
From personal AI assistants to more complex multi user systems, the same approach can help any agent retain meaningful context over time and deliver more useful, personalized interactions.", "url": "https://wpnews.pro/news/building-a-hermes-memory-plugin-for-a-voice-powered-conference-agent-with-engram", "canonical_source": "https://dev.to/astrodevil/building-a-hermes-memory-plugin-for-a-voice-powered-conference-agent-with-weaviate-engram-39jj", "published_at": "2026-06-17 16:44:00+00:00", "updated_at": "2026-06-17 16:51:22.701725+00:00", "lang": "en", "topics": ["ai-agents", "large-language-models", "developer-tools"], "entities": ["Hermes", "Weaviate", "Engram", "Nous Research"], "alternates": {"html": "https://wpnews.pro/news/building-a-hermes-memory-plugin-for-a-voice-powered-conference-agent-with-engram", "markdown": "https://wpnews.pro/news/building-a-hermes-memory-plugin-for-a-voice-powered-conference-agent-with-engram.md", "text": "https://wpnews.pro/news/building-a-hermes-memory-plugin-for-a-voice-powered-conference-agent-with-engram.txt", "jsonld": "https://wpnews.pro/news/building-a-hermes-memory-plugin-for-a-voice-powered-conference-agent-with-engram.jsonld"}}