This is a submission for the Hermes Agent Challenge
I installed Hermes Agent for the first time expecting "ChatGPT with tools."
Thirty minutes later it was interviewing me. How much pushback I want. Whether I prefer the "what" or the "how." How autonomous it should be. What principles should never be violated.
Then it wrote a file that permanently changes how the agent behaves around me.
That was the moment I realized AI agents are a completely different category of tool.
I have been writing code for years. I have used ChatGPT. I have used Claude. I have tried GitHub Copilot. But I had never actually run an AI agent on my own machine.
I kept seeing the word "agent" everywhere. I kept scrolling past it. Then this challenge showed up and I figured it was time to stop scrolling and actually try one.
This is what happened.
Before I get into the setup, let me answer the question I had before I started. What is an AI agent, and how is it different from the AI tools most developers already use?
ChatGPT, Claude, Gemini: you type, they respond. Some have memory. Some have tools. But they are still fundamentally reactive. You prompt them. They answer. The session ends and they wait for the next message.
Hermes is different in three specific ways.
First, it runs as a process on your machine or a server, not in a browser tab. You install it once and it stays running. Many people run it on a VPS or home server and talk to it from Telegram or Discord. It is not tied to your laptop.
Second, it has one brain across many interfaces. Whether you are in the terminal, on Telegram, or in Discord, the same memory, skills, and preferences are active. You are not starting fresh across different surfaces. The agent is singular even if the ways you reach it are many.
Third, it has persistent memory across sessions. It can recall things you told it weeks ago when they are saved into its memory files or history. It builds a model of who you are over time: your preferences, your projects, your environment.
Beyond those three, Hermes is built around five core pillars: persistent memory, reusable skills it creates from experience, a SOUL.md personality layer, scheduled cron automations, and a self-improvement loop that runs across all of them. You do not bolt these on as optional plugins. They are built into the runtime by default, and you refine them rather than assembling them from scratch.
It also runs terminal commands, reads and writes files in its environment, browses the web, and sends messages across platforms like Telegram and Discord. It is less like an AI you chat with and more like a process that works alongside you.
I came to Hermes because of this challenge. I wanted to try an AI agent properly for the first time and this was the push I needed.
But I already had Claude Code installed. And I had set up OpenClaw before. So the question worth answering is: how does Hermes actually compare to the tools you might already know?
I already use Claude Code. So this is less about choosing between them and more about understanding where each one actually sits.
Claude Code is a deep coding agent. It lives inside your projects. You cd into a repo and it reads the whole codebase, writes and refactors code across multiple files, runs tests, and manages git. It is very good at that specific job.
Its memory is project-centric. It reads a CLAUDE.md file at the start of each session for project context and saves session history locally. That works well inside a repo. What it does not have is a user-level learning loop. It does not build a model of who you are across sessions. It does not create reusable skills from experience automatically. If you want those things, you add them via plugins.
Hermes is the inverse. It does not live inside a repo. It runs as a long-lived process on your machine or server. It builds a model of you over time. It can create reusable skill documents from repeated workflows and feedback. It runs scheduled jobs, sends messages, browses the web, and handles workflows that have nothing to do with code.
A useful mental model: Claude Code is a contractor who sits beside you at the desk while you are working. When you close the laptop, the contractor goes home. Hermes is the agent that keeps running after you walk away. Different jobs. Different design entirely.
There is also the cost. Claude as a chat product has a free tier, but the terminal coding agent does not: Claude Code needs a paid plan, starting at $20 per month, or pay-as-you-go API billing. Hermes is open source and free to install. You bring your own API key and pay only for the tokens you actually use. Alternatively, you can run it against a local LLM.
Many users who run both treat them as complementary. Claude Code inside the repo for serious coding work. Hermes outside the repo for everything else: automations, scheduling, research, and memory that follows you across projects and sessions.
I had set up OpenClaw before starting this challenge. I got it running. But I never got far with it.
The difference between the two comes down to philosophy.
OpenClaw is gateway-first. It focuses on connecting AI to messaging platforms like Telegram and WhatsApp, then letting you assemble your own flows, tools, and behaviors. That is genuinely useful. But the agent logic is something you define yourself. You wire up the pieces.
Hermes is agent-first. Memory, user modeling, skill creation, cron jobs, and long-running workflows are already built into the runtime. You refine the system rather than assembling it from scratch.
Recent OpenClaw versions have added their own memory systems, so memory alone is no longer the main difference. The bigger distinction is how Hermes treats skills and long-term procedural learning as part of the core architecture rather than something you configure separately.
That difference became much clearer once I started building the SOUL.md workflow.
Installation was one command. I used the curl installer, which is the recommended path on Mac. It tracks the main branch directly, so you get the latest version rather than waiting for a tagged pip release.
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
It runs through checks and installs all the required libraries automatically. Once that finishes, an interactive setup wizard starts.
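If piping a remote script straight into bash makes you uneasy, one common alternative is to download it first, read it, and only then run it. The sketch below assumes nothing about the installer beyond the URL shown above; the helper name fetch_installer and the /tmp path are hypothetical, chosen only for illustration.

```shell
# Installer URL copied from the command above.
INSTALL_URL="https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh"

# Download a script to a local file instead of piping it into bash.
# fetch_installer is a hypothetical helper name, not part of Hermes.
fetch_installer() {
  url="$1"
  dest="$2"
  curl -fsSL "$url" -o "$dest"
}

# Usage (left commented out so nothing runs by accident):
# fetch_installer "$INSTALL_URL" /tmp/hermes-install.sh
# less /tmp/hermes-install.sh      # read what it will do
# bash /tmp/hermes-install.sh      # run it once you are satisfied
```

The end result should be the same as the one-liner; the only difference is that you get a chance to inspect the script before it runs.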
The wizard asks how you want to set up Hermes. I chose Quick Setup to keep things simple.
From there it walks you through four decisions:
Provider: I chose OpenRouter. One API key gives you access to hundreds of models across multiple providers. You are not locked into one company.
Model: This is where I hit a small snag. I wanted google/gemini-3-flash-preview but it was not in the list.
The fix was straightforward: I chose "Custom Model" and typed the model string in manually.
google/gemini-3-flash-preview
Terminal backend: I selected "Keep current (local)" since I wanted to run it directly on my laptop.
Messaging platform: I skipped this for now.
That completes the Quick Setup.
You can always connect Telegram, Discord, or other platforms later with:
hermes setup gateway
Setup complete. I reloaded my shell:
source ~/.zshrc
Note: Mac defaults to zsh. Use source ~/.zshrc, not ~/.bashrc.
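If you are not sure which file applies to you, a small check against the SHELL environment variable can tell you. This is a minimal sketch that only covers zsh and bash; the function name rc_file_for is hypothetical, and other shells will need their own startup file.

```shell
# Print the startup file for a given shell path (zsh and bash only).
# rc_file_for is a hypothetical helper name used for illustration.
rc_file_for() {
  case "$(basename "$1")" in
    zsh)  printf '%s\n' "$HOME/.zshrc" ;;
    bash) printf '%s\n' "$HOME/.bashrc" ;;
    *)    return 1 ;;
  esac
}

# Suggest the right reload command for the current login shell.
if rc="$(rc_file_for "${SHELL:-}")"; then
  echo "reload with: source $rc"
else
  echo "unrecognized shell: check its documentation for the startup file"
fi
```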
Then type hermes to start the agent.
hermes
One note on the model. google/gemini-3-flash-preview costs $0.50 per million input tokens on OpenRouter. It has reasoning support and handles multi-turn conversation well.
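To put that price in perspective, here is a rough back-of-the-envelope calculation. It is a sketch under stated assumptions: the token counts are made-up examples, the function name input_cost_usd is hypothetical, and it covers input tokens only, since the output price is not quoted here.

```shell
# Estimate the input-token cost in USD.
#   $1 = number of input tokens (hypothetical example values below)
#   $2 = price per million input tokens (defaults to the $0.50 quoted above)
input_cost_usd() {
  tokens="$1"
  price_per_million="${2:-0.50}"
  awk -v t="$tokens" -v p="$price_per_million" \
    'BEGIN { printf "%.4f\n", t * p / 1000000 }'
}

input_cost_usd 200000     # prints 0.1000
input_cost_usd 1000000    # prints 0.5000
```

In other words, a session that sends 200,000 input tokens would cost about ten cents on the input side at that rate.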
I had just installed Hermes for the first time. Before I could use it for anything serious, I realized I needed to do something first.
Hermes is only as useful as the context it has about you. Without that context, every session starts generic. It does not know your stack. It does not know how you think. It does not know whether you want a detailed walkthrough or just the answer. You would spend the first part of every conversation re-establishing who you are and how you work.
I did not want that. I wanted to set the foundation properly before building anything on top of it.
While reading through the Hermes docs, I found exactly how to do that.
There is a file called SOUL.md. You store it at ~/.hermes/SOUL.md. Without it, Hermes uses a default persona. With it, every session starts loaded with your thinking style, your communication preferences, and your rules for how it should operate around you. Everything that follows builds on top of it.
It is not a fun first experiment. It is the starting line.
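Since the plan is to let the agent write this file, it can be worth keeping a copy of whatever is already there before you start. The sketch below is a plain file copy and does not rely on any Hermes feature; the function name backup_soul and the timestamped suffix are hypothetical choices.

```shell
# Copy an existing SOUL.md to a timestamped backup next to it.
# backup_soul is a hypothetical helper name; the default path is the
# ~/.hermes/SOUL.md location mentioned above.
backup_soul() {
  src="${1:-$HOME/.hermes/SOUL.md}"
  if [ ! -f "$src" ]; then
    echo "no file at $src, nothing to back up" >&2
    return 0
  fi
  dest="$src.bak.$(date +%Y%m%d-%H%M%S)"
  # -p keeps the original modification time on the copy.
  cp -p "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage (commented out so nothing runs by accident):
# old="$(backup_soul)"
# ...let the agent rewrite the file...
# diff -u "$old" "$HOME/.hermes/SOUL.md"
```

The diff at the end gives you your own view of what changed, independent of whatever the agent reports.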
I could have written SOUL.md myself. But I knew I would write what sounds good rather than what is actually true about how I work. An interview flips that. Hermes asks the questions, I answer naturally, and it picks up things I would not have thought to include.
But I also knew the interview needed to be guided by what the docs say SOUL.md is actually for. Not tech stack. Not project conventions. Identity, tone, communication style, and how to handle uncertainty and disagreement. So I gave Hermes the docs first and let them shape the questions.
Here is the prompt I used:
Please read these two reference pages fully before we start:
https://hermes-agent.nousresearch.com/docs/user-guide/features/personality
https://hermes-agent.nousresearch.com/docs/guides/use-soul-with-hermes
Then interview me to build my SOUL.md. Let the docs guide what you ask and what structure you use for the output.
Ask one question at a time. Confirm what you understood in one line before moving on.
Write the final file to ~/.hermes/SOUL.md. This will be the personality of my primary agent, the orchestrator and first point of contact before any other agents or automations run.
Then I waited.
The first thing Hermes did was actually browse both reference pages. You can see the browser_navigate calls in the terminal before it said a single word. It read the docs, then confirmed its understanding: "SOUL.md is the primary identity of the orchestrator, defining tone and style rather than technical project details." Then it started.
Six questions in total. Each one identity-focused: the core spirit of the agent, how to handle communication, decision-making under uncertainty, technical standards, hard never-evers, and finally a name.
Each question made me stop and think. These were not surface-level questions. Hermes was asking how I want it to operate, not who I am. I spent real time on each answer before typing. By the end of six questions, I had said things about how I work that I had never written down before.
The last question I did not expect: does this partner have a name?
I told it: Chief of Staff. Speed over perfection. Ship the core, improve later. One plan, full commitment. Less words, more action.
Hermes confirmed and started writing.
It confirmed that my SOUL.md had been written. Then it showed me the diff of what changed. The old SOUL.md was replaced cleanly. What remained was identity, communication, and operating principles only. Exactly what the docs said SOUL.md is for.
I checked on my newly updated SOUL.md:
vi ~/.hermes/SOUL.md
Here is the final file:
The whole session took about 20 minutes, most of which I spent thinking before each answer.
The strange part was not that Hermes generated the file. Any LLM can generate markdown. The strange part was recognizing myself in it.
Some of the rules it captured were things I had been doing for years but had never explicitly written down before. Hermes was not just summarizing answers. It was extracting operational patterns from the conversation and turning them into working instructions.
That was the first moment where this stopped feeling like "ChatGPT with tools" and started feeling like something else entirely.
A few things stood out that I did not expect going in.
The first was that Hermes actually browsed both reference pages before asking a single question. You can see the browser_navigate calls in the terminal. It did not guess at what SOUL.md should contain. It read the docs, summarized what it understood, confirmed its interpretation, then started the interview. That is not how a chatbot behaves. That is an agent preparing its approach before executing a task.
The second was the quality difference between this session and a generic interview prompt. Because Hermes read the docs first, every question was about identity and operating principles, not about tech stack or project workflow. Question 1 asked for the core spirit of the agent. Question 3 asked how it should handle uncertainty and decisions. Question 5 asked for the hard never-evers. None of those would have appeared without the doc reference guiding the structure.
The third was the naming question. After five questions, Hermes asked: does this partner have a name? That was not something I prompted for. It came from understanding that SOUL.md is an identity, not just a config file. The result, "Chief of Staff," gave the whole thing a weight that a nameless config file would not have had.
Then I saw the diff of the changes to the SOUL.md file. Hermes had read the docs carefully enough to know that technical project details do not belong in SOUL.md. They belong in AGENTS.md. It made that call on its own without me saying anything about it.
One more thing I did not expect: running hermes dashboard opens a web interface at http://localhost:9119. Sessions, memory, skills, and cron jobs are all laid out in a browser view. After spending the whole time in the terminal, seeing the full picture of what Hermes tracks was the clearest sign that there is a lot more here than what I had touched in a single session.
I started this challenge thinking "AI agent" just meant "ChatGPT with tools."
After this session, I think the real difference is not intelligence. It is not even continuity. It is control.
Let me be honest about what the SOUL.md project does and does not prove. You could have a similar conversation with Claude or ChatGPT and ask them to summarize your working style. Both have memory features that persist across sessions. So the interview itself is not what makes Hermes different.
What makes Hermes different comes down to five things I did not fully appreciate until I started digging in.
You own everything. The SOUL.md file lives at ~/.hermes/SOUL.md on your machine. You edit it directly. You version control it. You know exactly what context the agent is using. With ChatGPT or Claude memory, you are trusting a black box you cannot inspect.
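One way to act on the "version control it" point is to snapshot the file into a small dedicated git repository each time it changes. This is a sketch, not a Hermes feature: the function name snapshot_soul and the ~/soul-history directory are hypothetical, and it copies the file out rather than turning ~/.hermes itself into a repository, since I have not checked what else lives in that directory.

```shell
# Copy SOUL.md into a dedicated git repo and commit it if it changed.
#   $1 = source file (defaults to ~/.hermes/SOUL.md)
#   $2 = snapshot repo directory (hypothetical default: ~/soul-history)
#   $3 = commit message
snapshot_soul() {
  src="${1:-$HOME/.hermes/SOUL.md}"
  repo="${2:-$HOME/soul-history}"
  msg="${3:-Update SOUL.md}"
  [ -f "$src" ] || { echo "no file at $src" >&2; return 1; }
  mkdir -p "$repo" || return 1
  git init -q "$repo" || return 1        # safe to repeat on an existing repo
  cp "$src" "$repo/SOUL.md" || return 1
  git -C "$repo" add SOUL.md || return 1
  # Commit only when the staged copy differs from the last snapshot.
  if git -C "$repo" diff --cached --quiet; then
    echo "no changes since last snapshot"
  else
    git -C "$repo" commit -q -m "$msg"
  fi
}

# Usage (commented out so nothing runs by accident):
# snapshot_soul "$HOME/.hermes/SOUL.md" "$HOME/soul-history" "After first interview"
# git -C "$HOME/soul-history" log --oneline
```

Over time the log becomes a history of how your instructions to the agent evolved, which is exactly the kind of visibility a hosted memory feature does not give you.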
You are not locked to any model. Today I ran Gemini via OpenRouter. Tomorrow I could swap to Claude Sonnet, a local model running on Ollama, or any other supported model with a simple config change or command. No chat app gives you that. ChatGPT only runs on OpenAI models. Claude only runs on Anthropic models. Hermes wraps around whatever model you choose.
Scheduled autonomous tasks run without you. You can set Hermes to run a nightly GitHub sync, send you a morning briefing, or monitor something and alert you via Telegram. These jobs continue running whether you are at your laptop or not.
Some chat apps have started experimenting with reminders and scheduled actions, but Hermes treats long-running automation as part of the core local-agent workflow rather than an add-on feature.
The value compounds over time. Hermes is designed so completed tasks can become reusable skills, and that loop is part of the default architecture, not a separate plugin you bolt on. You refine it with edits and approvals as you go. After enough sessions in a domain, you start seeing it get faster and more accurate because it is drawing on skills it built from your work.
Terminal, files, browser, and messaging work natively. Running commands on your machine, editing files, scraping a site, sending a message to your Telegram: these are not features of any chat app. They are built into Hermes from the start.
The SOUL.md session was 30 minutes. But it was not the project. It was the setup before the project. I baked my developer DNA into the agent first: how I think, how I communicate, what I build with, how autonomous I want it to be. Now everything that follows runs on top of that foundation.
The Chief of Staff is set. What comes next is building on top of it: custom skills for recurring tasks, specialized agents for specific workflows, cron jobs that run without me, and eventually a team of agents where each one has a focused job and the Chief of Staff orchestrates them. None of that is possible without the foundation being right first. That is the point of starting here.
I have not touched the cron jobs yet. I have not built the automations. I have not connected the messaging gateway. All of that comes next. But none of it would feel right without the foundation being set first.
That is what I did not understand about agents before this. You do not just start using them. You configure yourself into them. Then they start working for you.
One honest expectation to set: users who run Hermes heavily estimate it takes roughly two weeks of active use before it starts feeling genuinely useful. The memory needs to accumulate. The skills need to build up. The first session is the foundation, not the payoff.
It stopped feeling like a smarter chat window and started feeling like a system designed to work alongside me over time.
Give it 30 minutes. Set the foundation. See what you build on top of it.