Onyx: I Built a Hermes Agent That Runs My Entire Server While I Sleep

A developer built Onyx, an autonomous infrastructure operator that manages a full server stack including six Next.js deployments, five Docker containers, and a Minecraft server without human intervention. The system operates through Discord, proactively detecting and fixing issues such as stale processes and security vulnerabilities, and has accumulated over 30 reusable skill files that encode past fixes. Onyx also assists with the developer's undergraduate thesis, maintaining context across sessions and tracking academic research against a six-hypothesis model.

This is a submission for the Hermes Agent Challenge.

Onyx is an autonomous infrastructure operator running 24/7 on my droplet. He manages my entire stack: 6 Next.js deployments, 5 Docker containers, a Minecraft server, fail2ban, Nginx, and UFW. He also helps me write my undergraduate thesis.

The difference from every other "AI agent" project I've seen: Onyx doesn't wait for commands. He surfaces problems, patches vulnerabilities, and pushes work forward on his own. When I wake up, there's a session log waiting for me, not a to-do list.

The core idea: graduate an AI agent from assistant to operator. A chatbot with tools bolted on doesn't cut it. I wanted something that runs infrastructure while I'm eating dinner, asleep, or in class.

Onyx operates through Discord. A normal week:

A gateway process had a stale PID. Onyx detected it, diagnosed the root cause, restarted it cleanly, and wrote a session log. I found out in the morning. Zero human intervention, zero downtime.

While I was eating, Onyx ran a routine audit, found 9 CVEs, rebuilt 3 container images from fresh base images, patched Python dependencies, hardened fail2ban (ban time: 600s to 24 hours), and verified every container came back healthy.

My friends in Indonesia couldn't connect to the Minecraft server because their ISPs use carrier-grade NAT. I sent Onyx "fix it."
He researched solutions, selected playit.gg, installed the tunneling agent, configured a systemd service, and optimized TCP keepalive parameters. All autonomous.

Onyx noticed I kept asking for things but not acting on the output. He surfaced it: "You keep opening new loops and not closing them." He was right. Now when I open a loop, Onyx tracks it until it's closed or explicitly shelved.

I'm finishing my undergraduate thesis on emotional design in e-commerce UX: Norman's 3-level model applied to TikTok Shop, PLS-SEM with G*Power sample sizing. Onyx stays in the thesis workspace across sessions: tracking academic papers, organizing findings against my 6-hypothesis research model, and drafting sections. When I disappear for days, he nudges me. When I return, he picks up where we left off. No re-explaining, no context lost.

The system is built on Hermes Agent's native extension points.

30+ reusable skill files cover Minecraft management, Next.js deployment, VPS security, Discord formatting, and thesis workflows. Each one encodes actual mistakes and actual fixes. Example: the minecraft-crafty-management skill knows:

Every correction I make becomes permanent. The skill library covers practically every routine task I'd otherwise be doing by hand.

I wrote a custom MCP server for Crafty Controller 4.0 that exposes server status, actions, logs, backups, and console commands as native Hermes Agent tools. Onyx manages Minecraft without ever touching the Crafty dashboard.

`~/.hermes/scripts/crafty-mcp-server.py` exposes the Crafty Controller 4.0 API as MCP tools:

- `get_server_status(server_id)`
- `send_console_command(server_id, command)`
- `get_server_logs(server_id, lines)`
- `trigger_backup(server_id)`
- `start_server` / `stop_server` / `restart_server`

All self-contained. All logged. All delivered to Discord.

Config files and skill library: [github.com/ko4lax/onyx-backup](https://github.com/ko4lax/onyx-backup)

Three Hermes Agent capabilities made the difference between a script and an operator.
Every correction becomes permanent. When Onyx got the Minecraft version mapping wrong, I corrected it once. The skill file updated. It never repeated the mistake. The library compounds. Each fix makes every future session better. I'm building a knowledge base that sticks, not fine-tuning a model.

Onyx builds a persistent model of who I am across sessions: I never have to repeat myself. LCM (lossless context management) ensures no session context evaporates when conversations run long. Honcho provides semantic recall across sessions so Onyx answers questions about past work without me explaining again.

Onyx follows an explicit autonomy decision tree with four action risk tiers:

| Tier | Type | Behavior |
|---|---|---|
| T1 | Read-only | Always autonomous: status checks, log reads, health pings |
| T2 | Reversible local | Act without asking: restarts, config edits, routine deploys |
| T3 | External effect | Confirm once: installs, firewall changes, service calls |
| T4 | Destructive | Always escalate: data deletion, credential changes |

The checks: Is the action reversible? Is it local-only? Are no credentials involved? Is the intent unambiguous? All yes, act. Any no, escalate.

This is delegation with clear kill switches. The blast radius stays bounded.

| Layer | Technology |
|---|---|
| Agent | Hermes Agent (Nous Research) |
| Model | DeepSeek V4 Pro via OpenRouter |
| Infrastructure | VPS (4 vCPU, 8 GB RAM, Ubuntu 24.04) |
| Gateway | Discord (primary), CLI, webhook |
| Process management | PM2 for 6 Next.js apps, Docker Compose for Honcho stack |
| Web server | Nginx + Let's Encrypt SSL, reverse proxy to standalone Next.js |
| Security | UFW, fail2ban (hardened), Docker CVE scanning |
| Minecraft | Crafty Controller 4.0, Fabric mod loader, playit.gg tunneling |
| Memory | Honcho (semantic layer) + LCM (lossless session compaction) |

Agents fail at the edges of their knowledge, not the center.
The hardest bugs were wrong version numbers, broken integrations, and subtle API quirks that no amount of training data could have predicted. The skill file system solved this: every edge case gets encoded once and is never hit again.

Permission-seeking kills autonomy. Early versions of Onyx asked me to confirm everything. It was useless. The tier system, built around reversibility rather than task type, was the unlock. T1 and T2 cover 90% of real operations. T3 and T4 are rare.

Memory is what separates a tool from a collaborator. Without persistent memory, every session starts from zero. With Honcho and LCM, Onyx knows my infrastructure topology, my preferences, my thesis structure, and my open loops. That context makes autonomous action trustworthy.

Onyx is running on my VPS right now, probably checking server health while you read this.
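
For readers who want to see the tier idea in code, here is a minimal Python sketch of a reversibility-first check. It is not Onyx's actual implementation: the `Action` fields, the `classify` and `decide` names, and the exact mapping from the four questions to tiers T1 through T4 are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    T1 = "read-only"
    T2 = "reversible local"
    T3 = "external effect"
    T4 = "destructive"


@dataclass(frozen=True)
class Action:
    """Hypothetical description of a proposed action (field names are illustrative)."""
    name: str
    read_only: bool = False
    reversible: bool = True
    local_only: bool = True
    touches_credentials: bool = False
    unambiguous_intent: bool = True


def classify(action: Action) -> Tier:
    """Map an action to a risk tier; the most dangerous property wins."""
    if not action.reversible or action.touches_credentials:
        return Tier.T4
    if not action.local_only:
        return Tier.T3
    if action.read_only:
        return Tier.T1
    return Tier.T2


def decide(action: Action) -> str:
    """Return 'act', 'confirm', or 'escalate' for a proposed action."""
    tier = classify(action)
    if tier is Tier.T4:
        return "escalate"  # destructive: always hand back to the human
    if tier is Tier.T1:
        return "act"       # read-only: always autonomous
    if tier is Tier.T3:
        return "confirm"   # external effect: ask once
    # T2: act only when the request leaves no room for interpretation
    # (assumption: an ambiguous reversible action is treated as a confirm).
    return "act" if action.unambiguous_intent else "confirm"
```

In this sketch, `decide(Action("restart pm2 app"))` returns `"act"`, while `decide(Action("delete old backups", reversible=False))` returns `"escalate"`. Classifying by properties of the action rather than by task name is what keeps the rule set small: a new kind of task needs no new rule as long as its reversibility and scope can be described.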