How I Deployed Hermes Agent on AWS

This post describes how I deployed Nous Research's open-source Hermes Agent on AWS using a zero-inbound security group architecture, relying on AWS Systems Manager Session Manager for admin access and Telegram long-polling for bot communication. The setup runs on a single m7g.medium EC2 instance with persistent agent data on EFS and costs approximately $35 per month.

My EC2 instance has a public IP address. It has zero inbound firewall rules. And yet I can reach my AI agent from my phone on Telegram, pull up a full web workspace in my browser, and run shell commands on it, all without opening a single port, without a VPN, and without SSH.

The latest version also splits storage deliberately: persistent agent data stays on EFS, while the Hermes install and Python venv moved to the root EBS volume. That change keeps `pip install` / `hermes update` I/O off EFS and brings always-on infra to a highly predictable ~$35/mo.

That's the setup this post is about.

[Hermes Agent](https://hermes-agent.nousresearch.com/docs) is an open-source AI agent from Nous Research. It's not a chatbot wrapper. It has persistent memory, skills, a file system, a sandboxed terminal backend, and a full web workspace UI. You point it at a model provider and it runs as a daemon, `hermes-gateway`, serving an OpenAI-compatible API. The web workspace looks like a proper IDE: chat panel, file browser, terminal, job queue. The Telegram integration is a long-polling bot that connects to the same gateway: no extra server, no webhook, no public URL.

I wanted this running on AWS, backed by Amazon Bedrock (no API keys to rotate, IAM role handles auth), with my agent's memory surviving instance replacements.
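Long-polling is what lets a bot receive messages without a webhook or public URL. The following is a minimal sketch of the pattern against the Telegram Bot API's `getUpdates` method, not the code Hermes actually runs; the helper names (`build_get_updates_url`, `next_offset`, `poll_forever`) and the timeout values are assumptions chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.telegram.org"


def build_get_updates_url(token: str, offset: int, timeout_s: int = 30) -> str:
    """Build a getUpdates URL; a timeout above zero makes the call a long poll."""
    query = urllib.parse.urlencode({"offset": offset, "timeout": timeout_s})
    return f"{API_BASE}/bot{token}/getUpdates?{query}"


def next_offset(updates: list[dict], current: int) -> int:
    """Acknowledge a batch: the next offset is the highest update_id plus one."""
    if not updates:
        return current
    return max(u["update_id"] for u in updates) + 1


def poll_forever(token: str) -> None:
    """Keep one outbound HTTPS request open until Telegram has updates to return."""
    offset = 0
    while True:
        url = build_get_updates_url(token, offset)
        # The client-side timeout is longer than the server-side long-poll timeout.
        with urllib.request.urlopen(url, timeout=40) as resp:
            payload = json.load(resp)
        updates = payload.get("result", [])
        for update in updates:
            print(update.get("message", {}).get("text"))
        offset = next_offset(updates, offset)
```

Every request in this loop is initiated by the instance, so the pattern works behind a security group with no inbound rules.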
```
Your phone (Telegram)
  └─► Telegram servers ──► hermes-gateway long-poll (outbound HTTPS only)

Your laptop (browser)
  └─► aws ssm start-session ──► SSM port-forward :3000
        └─► hermes-workspace (loopback only)

EC2 m7g.medium · public subnet · ZERO inbound SG · dynamic public IP
│
├─ hermes-gateway    :8642  (127.0.0.1 only)
│    ├─ Bedrock inference via IAM role (no API keys)
│    ├─ Telegram long-poll (outbound HTTPS)
│    └─ OpenAI-compatible API
│
├─ hermes-dashboard  :9119  (127.0.0.1 only)
└─ hermes-workspace  :3000  (127.0.0.1 only)
│
├── EFS /mnt/efs/hermes  (RETAIN · encrypted · uid=10000 access point)
│     .env · config.yaml · sessions · skills · SOUL.md · logs · state DBs
│     ↑ persistent agent data, survives instance replacement
│
├── EBS root volume
│     /opt/hermes-agent      ← hermes venv (pip I/O stays off EFS)
│     /opt/hermes-workspace  ← workspace UI
│
└── Secrets Manager hermes/runtime
      API_SERVER_KEY · TELEGRAM_BOT_TOKEN · TELEGRAM_ALLOWED_USERS
```

Three CDK stacks, deployed in order:

| Stack | What it provisions |
|---|---|
| `HermesNetworkStack` | VPC (1 AZ), public subnet, IGW, S3 gateway endpoint, security groups |
| `HermesStorageStack` | EFS (RETAIN, encrypted, uid=10000 access point), Secrets Manager |
| `HermesComputeStack` | EC2 (`m7g.medium`), IAM (Bedrock-scoped), bootstrap user-data, systemd units |

The instinct when deploying anything on AWS is to reach for a private subnet, a NAT Gateway, and VPC interface endpoints. That's the enterprise posture. It's also ~$88/mo in endpoint costs alone before your instance even starts.

For a personal deployment the actual security boundary is not the subnet type. It's what's listening on the instance. All three services bind to `127.0.0.1` only. The Security Group has zero inbound rules. The public IP on the instance rejects every connection attempt because there is nothing behind it.
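The difference between binding to loopback and binding to all interfaces can be checked locally. This is a small illustrative sketch using only the standard library; `bind_listener` and `is_loopback_only` are hypothetical helpers, not part of Hermes or the CDK stacks.

```python
import ipaddress
import socket


def bind_listener(host: str) -> socket.socket:
    """Open a TCP listener on an OS-assigned port for the given host address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 lets the OS pick a free port
    sock.listen(1)
    return sock


def is_loopback_only(sock: socket.socket) -> bool:
    """True when the socket is bound to a loopback address, not all interfaces."""
    bound_host = sock.getsockname()[0]
    return ipaddress.ip_address(bound_host).is_loopback


if __name__ == "__main__":
    for host in ("127.0.0.1", "0.0.0.0"):
        listener = bind_listener(host)
        print(host, "loopback only:", is_loopback_only(listener))
        listener.close()
```

A listener bound to `127.0.0.1` accepts connections only from the machine itself, which is why a public IP in front of it has nothing to expose. A listener bound to `0.0.0.0` would depend on the security group alone.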
```python
# network_stack.py: the entire inbound surface of the instance
# (excerpt from the stack constructor)
from aws_cdk import aws_ec2 as ec2

self.instance_security_group = ec2.SecurityGroup(
    self, "InstanceSg",
    vpc=self.vpc,
    description="Hermes EC2 - zero inbound; egress via IGW. Admin via SSM.",
    allow_all_outbound=True,
)
# No add_ingress_rule() calls. Ever.
```

Admin access is via AWS Systems Manager Session Manager: outbound HTTPS to the SSM service endpoint, no inbound port required. SSM also handles port-forwarding, which is how the workspace reaches your browser.

Telegram uses long-polling. The gateway opens an outbound connection to Telegram's servers and holds it. Telegram pushes messages down that connection. Again: zero inbound.

The result: there is no attack surface on the public IP. Shodan can scan it all day.

Persistent agent data (`SOUL.md`, skills, session history, state DBs, the `.env` with all secrets, the `config.yaml`) lives on an EFS volume mounted at `/mnt/efs/hermes`. The hermes binary and venv live on the root EBS volume at `/opt/hermes-agent` instead.

Why split? EFS Elastic Throughput charges per GB accessed. Moving the venv to EBS removes that install/update path from EFS, keeping steady-state EFS I/O costs around ~$1/mo instead of paying for heavy throughput during dependency updates. See `docs/STORAGE.md` for the full reference.

The EFS has `RemovalPolicy.RETAIN`. The access point locks the path to UID 10000. Automatic backups are on with a 35-day window.
```python
# storage_stack.py: the persistence layer
# (excerpt from the stack constructor)
from aws_cdk import RemovalPolicy, aws_efs as efs

self.file_system = efs.FileSystem(
    self, "HermesEfs",
    vpc=vpc,
    encrypted=True,
    removal_policy=RemovalPolicy.RETAIN,  # survives cdk destroy
    lifecycle_policy=efs.LifecyclePolicy.AFTER_30_DAYS,
    throughput_mode=efs.ThroughputMode.ELASTIC,
    enable_automatic_backups=True,
)

self.access_point = self.file_system.add_access_point(
    "HermesAccessPointUid10000",
    path="/hermes",
    create_acl=efs.Acl(owner_uid="10000", owner_gid="10000", permissions="0750"),
    posix_user=efs.PosixUser(uid="10000", gid="10000"),
)
```

What this means in practice: if the EC2 instance develops a problem, you run `cdk deploy` and get a fresh one. The new instance mounts the same EFS, reads the same `.env`, reinstalls the venv to EBS via user-data, and all three systemd services start with the agent's full memory intact. No manual data migration, no re-configuration.

The EC2 root EBS is flagged `delete_on_termination=True`. Agent data is on EFS (RETAIN); install artifacts on EBS are recreated automatically on each deploy.

Hermes connects to Bedrock via the [Hermes Bedrock guide](https://hermes-agent.nousresearch.com/docs/guides/aws-bedrock). The EC2 instance has an IAM role scoped to `bedrock:InvokeModel`, `bedrock:Converse`, and the streaming variants, on specific inference-profile and foundation-model ARNs only.

No API keys anywhere. No key rotation. If the instance is compromised, the blast radius is bounded to Bedrock inference on two specific models. The role cannot touch S3, DynamoDB, other accounts, or anything else.

Two models run in this stack:

| Model | Role | Why |
|---|---|---|
| `us.anthropic.claude-sonnet-4-6` | Primary (all main agent tasks) | Best reasoning for the price on Bedrock |
| `us.amazon.nova-lite-v1:0` | Auxiliary (5 background slots) | ~85× cheaper than Sonnet for web extraction, vision, summarisation |

The `us.` prefix is the cross-region inference profile. Bedrock routes to `us-east-1`, `us-east-2`, or `us-west-2` automatically for throughput.
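Scoping the role to specific ARNs with a cross-region profile means listing the inference profile itself plus the foundation model in each region the profile can route to. The sketch below builds such a policy document as plain JSON. It is an illustration, not the stack's actual IAM code: the account ID is a placeholder, the helper names are hypothetical, and it assumes that dropping the `us.` prefix from a profile ID yields the foundation-model ID. The action list is trimmed to the two `InvokeModel` actions; the role described above also names `bedrock:Converse`.

```python
import json

# Regions the post says the `us.` profile can route to.
US_PROFILE_REGIONS = ["us-east-1", "us-east-2", "us-west-2"]


def bedrock_resource_arns(profile_id: str, account_id: str, home_region: str) -> list[str]:
    """Return the inference-profile ARN plus one foundation-model ARN per region."""
    # Assumption: stripping the geographic prefix ("us.") gives the model ID.
    model_id = profile_id.split(".", 1)[1]
    arns = [f"arn:aws:bedrock:{home_region}:{account_id}:inference-profile/{profile_id}"]
    arns += [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for region in US_PROFILE_REGIONS
    ]
    return arns


def bedrock_policy(profile_ids: list[str], account_id: str, home_region: str = "us-east-1") -> dict:
    """Build an IAM policy document allowing inference on the given profiles only."""
    resources: list[str] = []
    for profile_id in profile_ids:
        resources += bedrock_resource_arns(profile_id, account_id, home_region)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    policy = bedrock_policy(
        ["us.anthropic.claude-sonnet-4-6", "us.amazon.nova-lite-v1:0"],
        account_id="123456789012",  # placeholder account ID
    )
    print(json.dumps(policy, indent=2))
```

Because the statement names no wildcard resources, a compromised instance could call only the listed models, which matches the blast-radius argument above.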
You enable both models once in the [Bedrock Model Access console](https://us-east-1.console.aws.amazon.com/bedrock/home#/modelaccess) and never touch it again.

| Component | Detail | ≈ Monthly |
|---|---|---|
| EC2 `m7g.medium` (Graviton, 1 vCPU / 4 GiB) | 730 hrs × $0.0404/hr | ~$29.50 |
| EBS gp3 root (30 GiB, encrypted) | venv + workspace on EBS | $2.40 |
| EFS Standard (~64 MB agent data) | $0.30/GiB-mo storage | ~$0.02 |
| EFS Elastic throughput I/O | venv/deps on EBS; steady-state session/state access only | ~$1 |
| EFS automatic backups | ~$0.05/GiB-mo | ~$0.50 |
| Secrets Manager | 1 secret × $0.40 | $0.40 |
| CloudWatch Logs + metrics | ingestion + custom metrics | ~$2 |
| NAT Gateway / VPC endpoints | none | $0 |
| **Infra total (always-on)** | | **≈ $35/mo** |

No NAT Gateway. No interface VPC endpoints. The EC2 routes outbound directly through the Internet Gateway. That single architectural decision (public subnet and zero-inbound SG instead of private subnet + NAT) is 58% cheaper than the equivalent private-subnet setup with six VPC endpoints.

You can also stop the instance when you don't need it:

```bash
aws ec2 stop-instances --instance-ids <InstanceId> --region us-east-1
```

EC2 compute billing stops immediately, and most EFS data-access I/O should stop with the services. EFS storage, EBS, Secrets Manager, and CloudWatch keep billing at ~$8/mo. When you start it again, SSM is ready in ~60 seconds and all three `hermes-*` systemd units restart automatically. No re-bootstrapping, no re-configuration, agent memory fully intact.

Floor: ~$8/mo when off. ~$35/mo when always-on.

Bedrock token usage is billed on top of the infrastructure:

| Model | Rate | Typical personal use |
|---|---|---|
| Claude Sonnet 4.x | ~$3/M in · $15/M out | $10–50/mo |
| Nova Lite (aux slots) | ~$0.06/M in · $0.24/M out | < $2/mo |

ChatGPT Plus is $20/mo. You get no persistent agent filesystem, no terminal backend, no Telegram long-polling, and far less control over where memory and logs live.
The Hermes setup is more infrastructure to own, but that is the point: you own the memory, the skills, the `SOUL.md` that shapes the agent's persona, the logs, and the conversation history. Stop the instance today, redeploy in six months, and the agent picks up from the same EFS-backed state.

```bash
cdk deploy --all

# … hermes/runtime, sync to EFS, restart gateway

# Access the workspace from your laptop
aws ssm start-session --target <InstanceId> \
  --document-name AWS-StartPortForwardingSession \
  --parameters '{"portNumber":["3000"],"localPortNumber":["3000"]}' \
  --region us-east-1

open http://localhost:3000
```

After step 4, Telegram just works. Message your bot, get a reply. No additional setup.

I started with a private subnet, a NAT Gateway, and VPC interface endpoints for SSM, Bedrock, Secrets Manager, EFS, and CloudWatch. It's what every AWS security guide recommends. It's also ~$88/mo in endpoint costs before a single token is processed.

The insight that unlocked this architecture: the security boundary for a personal agent isn't the subnet. It's what's reachable on the instance. With zero inbound SG rules and all services bound to loopback, the public IP is inert. SSM and Telegram's long-polling handle the two access patterns (admin shell / bot messages) over outbound HTTPS. No VPN, no bastion, no open ports.

The most secure design for this use case turned out to be the simplest one.

Built with Hermes Agent · AWS CDK (Python) · Amazon Bedrock · SSM Session Manager