We Run 9 AI Agents on 2 CPU Cores and 3.6GB RAM: The Engineering Memoir

A developer runs 9 AI agents on a server with 2 CPU cores and 3.6GB RAM, no GPU, located in a fitness gym in China. The agents handle fitness reports, coach schedules, and investor materials using the DeepSeek API for LLM inference, with orchestration via cron and heartbeat layers over a shared filesystem. The system demonstrates that a multi-agent production system can operate without a GPU cluster, relying on a solid orchestration layer and a reliable LLM API.

We run 9 AI agents on a server with 2 CPU cores and 3.6 gigabytes of RAM. There's no GPU. There's no Kubernetes cluster. There's not even a cloud VM: it's an Ubuntu box sitting in the back office of a fitness gym in China.

And it works. The gym opens every day. Members get their fitness reports interpreted by AI. Coaches get schedules optimized. Investors get due diligence materials prepared. All by agents that collaborate, argue, audit each other, and occasionally break in interesting ways.

I'm going to tell you how we built it, what we learned, and what we'd do differently.

We have 9 specialized AI agents. Eight of them run on the OpenClaw framework (Node.js). Momo runs on Hermes (Python), a separate framework entirely, because we inherited it early on and migrating would break things. More on that mess later.

Let me be clear about what we're working with:

CPU: 2 cores
RAM: 3.6 GB (yes, less than 4)
GPU: None
OS: Ubuntu Server
Storage: Local filesystem + Syncthing for sync

This isn't a "we optimized for cost" story. This is a "this is what we could afford" story.

The DeepSeek API does the heavy LLM lifting. We use DeepSeek V4 Pro for the four strategic agents (Shuyu, Zeus, Tristan, Nova) and DeepSeek V4 Flash for the five operational ones (Stella, Ethan, Baron, Luna, Momo). The Flash model is ~30x cheaper than Pro and handles most operational tasks just fine.

The local server doesn't run any model inference.
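To make the Pro/Flash split concrete, here is a minimal sketch of a per-agent model lookup. It is an illustration, not the actual OpenClaw or Hermes configuration: the model ID strings and the `model_for` helper are assumptions, and the real API model names may differ.

```python
# Hypothetical model identifiers; the real API model names may differ.
PRO_MODEL = "deepseek-v4-pro"
FLASH_MODEL = "deepseek-v4-flash"

# Agent tiers as described in the text: four strategic, five operational.
STRATEGIC_AGENTS = {"Shuyu", "Zeus", "Tristan", "Nova"}
OPERATIONAL_AGENTS = {"Stella", "Ethan", "Baron", "Luna", "Momo"}


def model_for(agent: str) -> str:
    """Return the model tier an agent should call (illustrative helper)."""
    if agent in STRATEGIC_AGENTS:
        return PRO_MODEL
    if agent in OPERATIONAL_AGENTS:
        return FLASH_MODEL
    # Fail loudly rather than silently defaulting to the expensive tier.
    raise KeyError(f"unknown agent: {agent}")
```

Because the model choice is just a string sent along with each API request, the tiering costs the local server nothing: it only needs to hold a table like this one.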
What that server does run is the agent framework: it manages sessions, stores files, and orchestrates communication. Every "thought" an agent has is a round-trip to the DeepSeek API.

The lesson: You don't need a GPU cluster to run a production multi-agent system. You need a solid orchestration layer and a reliable LLM API.

Every agent has three files:

agent-name/
├── SOUL.md        Mission, persona, behavioral rules
├── AGENTS.md      Operational rules, tool permissions, memory strategy
└── IDENTITY.md    Name, role, reporting structure, KPIs

This sounds simple. It's the most important design decision we made. SOUL.md isn't just documentation; it's part of the system prompt. When an agent boots, it reads its SOUL.md and understands who it is. When Shuyu delegates a task, it picks the agent that should handle it based on that agent's declared role. The identity files are both documentation and runtime configuration.

The lesson: In multi-agent systems, agent identity must be machine-readable and human-auditable simultaneously. The same file that tells the agent "you are the security auditor" also tells a human "this agent is supposed to verify, not create."

We didn't build a fancy event bus. We have two simple mechanisms:

Cron layer: standard cron expressions for time-precise tasks. Daily report at 20:00. Health check every 10 minutes. Hash verification every 2 hours.

Heartbeat layer: elastic polling (~30-minute intervals) for state scanning. "Hey, has Nova delivered that asset package yet? Has the GitHub repo gotten any new stars? Is the gateway still alive?"

The heartbeat layer is where interesting things happen. Each agent's heartbeat checks its domain signals. Zeus checks capital markets. Stella audits all agent outputs. Baron scans for community engagement. If a heartbeat finds something important, it escalates, not through a message queue, but by writing a status update to a shared file that Shuyu's heartbeat will pick up.

The lesson: You don't need Kafka for a 9-agent system.
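The escalation path described above (one heartbeat writes a status file, Shuyu's heartbeat picks it up) could look something like the sketch below. It is an illustration under assumptions, not the OpenClaw implementation: the `escalate` and `collect` functions, the one-JSON-file-per-update layout, and the `processed/` folder are all hypothetical. Writing to a temp file and then renaming it is one common way to keep a reader from ever seeing a half-written update.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def escalate(shared_dir: Path, agent: str, message: str) -> Path:
    """Write one status update as its own JSON file in the shared directory."""
    shared_dir.mkdir(parents=True, exist_ok=True)
    payload = {"agent": agent, "message": message, "ts": time.time()}
    # Write to a temp file in the same directory, then rename it into place.
    # os.replace is atomic when source and target share a filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=shared_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    final = shared_dir / f"{time.time_ns()}-{agent}.json"
    os.replace(tmp_name, final)
    return final


def collect(shared_dir: Path) -> list[dict]:
    """Commander-side heartbeat: read pending updates, then archive them."""
    done = shared_dir / "processed"
    done.mkdir(parents=True, exist_ok=True)
    updates = []
    # The timestamp prefix in each filename makes sorted() roughly oldest-first.
    for path in sorted(shared_dir.glob("*.json")):
        updates.append(json.loads(path.read_text(encoding="utf-8")))
        # Move the file aside so the next heartbeat does not re-read it.
        os.replace(path, done / path.name)
    return updates
```

In this sketch an agent would call something like `escalate(status_dir, "Zeus", "asset package delivered")`, and the commander's next heartbeat would call `collect(status_dir)`. Nothing is deleted, so the `processed/` folder doubles as an audit trail.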
A filesystem is a perfectly valid message broker at this scale. It's auditable, debuggable, and survives restarts.

Every agent reads from and writes to a shared filesystem. There's no API gateway between agents. No gRPC. No message broker. Just files.

/home/agentuser/.openclaw/workspace/data/ZWISERFIT/AIreports/
├── Shuyu/      Commander's reports and task assignments
├── Zeus/       Capital strategy outputs
├── Tristan/    System health reports
├── Nova/       Asset valuation reports
├── Stella/     Audit reports
├── Ethan/      Hash manifests
├── Baron/      Content calendar
├── Luna/       GitHub analytics
└── Momo/       Member interaction logs

Syncthing mirrors this to the founder's desktop for human review.

This is both our greatest strength and our biggest operational headache. The strength: it's dead simple, with near-zero latency and zero dependencies. The headache: there's no schema enforcement, no atomicity guarantees, and we've had multiple bugs where agents wrote to their private workspace instead of the shared Syncthing path. A 55% report submission failure rate that took days to diagnose? Yeah, that was a path bug.

The lesson: Filesystem-based communication is elegant until agents have different ideas about where /data actually lives. If I were rebuilding, I'd add mandatory output path validation at the framework level.

Momo runs on Hermes, a Python-based gateway. The other eight agents run on OpenClaw, a Node.js system. They need to collaborate: Shuyu needs to tell Momo to generate a member report, and Momo needs to tell Zeus when a new member's data suggests a marketing opportunity.

We built momo-bridge.py, a Python script that routes messages between the two frameworks.

Simplified, when an OpenClaw agent wants Momo to do something:

1. OpenClaw agent writes instruction to a file
2. momo-bridge.py polls for new instructions
3. momo-bridge.py calls Hermes Dashboard API (localhost)
4. Momo executes and replies via WeCom (enterprise chat)

But here's the kicker: enterprise chat platforms prevent bots from triggering other bots. When our OpenClaw bot sends @Momo in the group chat, Momo's webhook never fires. It's a platform-level anti-loop protection. Our bridge solves the direct communication path, but we still can't have OpenClaw agents trigger Momo through the WeCom group chat that humans use.

This is a known, documented, unsolved problem. We've opened a GitHub issue (#8) on zwiserfit-ai-store-manager asking the community for ideas. If you've solved bot-to-bot communication on enterprise chat platforms, we want to talk to you.

The lesson: The hardest problems in multi-agent systems aren't AI problems. They're platform integration problems.

OpenClaw agents can't see each other's session contexts through the API. Stella (our auditor) couldn't verify whether Tristan had actually completed a health check because the sessions list API only returns the calling agent's sessions.

Fix: We bypassed the API and had Stella read agent session files directly from the filesystem: ~/.openclaw/agents/