{"slug": "i-rebuilt-zo-computer-s-seven-subsystems-in-800-lines-of-python-here-s-the-the-i", "title": "I rebuilt Zo Computer's seven subsystems in 800 lines of Python — here's the architecture, the tradeoffs, and what I cut", "summary": "A developer rebuilt Zo Computer's seven subsystems in 775 lines of Python, creating an open-source package called ZoClone that replicates the AI workspace's agent manager, skills registry, memory engine, scheduler, compute pool, BYOK client, and headless browser without daemons, Docker, or Postgres. The project demonstrates that a single Python package on a laptop can provide 80% of the same functionality using only ~800 lines of dependency-light code.", "body_md": "I've been using [Zo Computer](https://zo.computer) as my primary AI workspace for a few months. The piece I kept coming back to wasn't the model — it was the *substrate*: the agent manager that spawns parallel sessions, the skills registry that auto-loads `SKILL.md`\n\nfiles, the memory engine that compresses old context, the rrule-based scheduler, the compute pool that turns idle machines into workers, the BYOK client that swaps between Groq/OpenAI/Anthropic, and the headless browser that actually clicks things.\n\nSo I asked the obvious question: how much of that is *concept* and how much is platform glue? Could a single Python package on a laptop give a developer 80% of the same shape?\n\n[ZoClone](https://github.com/AmSach/ZoClone) is my answer. Seven files in `src/`\n\n, ~800 lines of dependency-light Python, and every subsystem above is wired up. 
No daemon, no Docker, no Postgres: just `~/.zoclone/*.db` and a `ThreadPoolExecutor`.\n\nHere's the architecture, what I learned about which parts are easy to clone and which ones are doing real work, and the shortcuts I had to take to fit the whole thing in a single repo.\n\n```\nZoClone/\n├── src/\n│   ├── zo.py              # top-level orchestrator + ask() loop\n│   ├── agent_manager.py   # parallel async agents via Zo /zo/ask\n│   ├── skills.py          # SKILL.md auto-loader + handler dispatch\n│   ├── memory.py          # TF-IDF fallback embeddings + context recall\n│   ├── automation.py      # rrule scheduler with minute/hour/day cadences\n│   ├── compute_pool.py    # node registry + priority FIFO dispatch\n│   ├── browser.py         # Playwright headless + navigate/screenshot/eval\n│   ├── byok.py            # key vault for Groq/OpenAI/Anthropic/Ollama\n│   ├── zo_client.py       # OpenAI-compatible chat() abstraction\n│   └── services.py        # process supervisor (start/stop/logs)\n```\n\nTotal LoC: **775**. No `__init__.py` magic, no metaclass tricks, no plugin discovery beyond a directory scan. 
The constraint forced every interface to be a plain function or a class with three methods.\n\n`zo.py`\n\nEverything threads through a single `ZoClone` class that owns the DB connection, a thread pool, and an `AIClient` that's lazily constructed on the first call to `ask()`.\n\n``` python\nclass ZoClone:\n    def __init__(self):\n        self.db = init_db()\n        self.executor = ThreadPoolExecutor(max_workers=10)\n        self.ai_client = None\n        self.pool = pool        # module-level singleton\n        self.hosting = hosting  # module-level singleton\n        self.memory = memory\n        self.scheduler = scheduler\n\n    def ask(self, conv_id: str, message: str, provider: str = \"groq\",\n            model: str = \"\", tools: list[dict] = None) -> dict:\n        if not self.ai_client:\n            key = get_key(provider)\n            m = model or PROVIDERS[provider][\"models\"][0]\n            self.ai_client = AIClient(provider, m, key)\n\n        messages = self.memory.get_context(conv_id)\n        messages.append({\"role\": \"user\", \"content\": message})\n        system = f\"You are Sentience, an advanced AI running locally. Workspace: {os.getcwd()}.\"\n\n        resp = self.ai_client.chat(\n            [{\"role\": \"system\", \"content\": system}] + messages[-20:],\n            tools or [],\n        )\n        # ... persist + return\n```\n\nThe trick is `AIClient`: it's the *only* piece that has to be OpenAI-compatible, because nearly every modern provider (Groq, Together, OpenRouter, Ollama, LM Studio) has converged on the chat completions schema. Anthropic needed a tiny shim, but Groq works out of the box.\n\n`SKILL.md`\n\nThis is the part I'm proudest of. 
The directory scan is ten lines:\n\n``` python\ndef load_all_skills():\n    global SKILLS\n    SKILLS = {}\n    if not SKILL_DIR.exists():\n        return\n    for item in SKILL_DIR.iterdir():\n        if item.is_dir() and (item / \"SKILL.md\").exists():\n            skill = load_skill(item.name, item / \"SKILL.md\")\n            if skill:\n                SKILLS[skill.name] = skill\n```\n\nThe interesting bit is the SKILL.md parser. It accepts the same frontmatter shape as the Agent Skills spec (`name`, `description`, and comma-separated `triggers`) and looks for `scripts/<name>.py` to find a `run()` or `execute()` callable. That's the entire plugin API. There's no registration, no decorator, no manifest; drop a folder in `skills/` and the next `import` picks it up.\n\nThe price: there's no versioning, no dependency declaration, no per-skill sandbox. If you want a skill to be hermetic, you have to do that yourself. For a single-user laptop, that's fine. For a multi-tenant platform, it's not.\n\n`aiohttp` over `/zo/ask`\n\nI cheated here, and I'm fine with it. The original \"spawn a parallel agent\" primitive is *itself* a remote call to a model, and Zo's `/zo/ask` endpoint is open to anyone with a token. 
So:\n\n``` python\nasync def spawn(self, agent_id: str, prompt: str, callback=None):\n    async with aiohttp.ClientSession() as session:\n        async with session.post(\n            \"https://api.zo.computer/zo/ask\",\n            headers={\"authorization\": self.api_token, \"content-type\": \"application/json\"},\n            json={\"input\": prompt, \"model_name\": \"vercel:minimax/minimax-m2.7\"},\n        ) as resp:\n            return {\"agent_id\": agent_id, \"output\": (await resp.json())[\"output\"]}\n\nasync def spawn_all(self, agents: list) -> list:\n    return await asyncio.gather(*[self.spawn(a[\"id\"], a[\"prompt\"]) for a in agents])\n```\n\n`spawn_all` fires N concurrent requests, `asyncio.gather` waits for the slowest, and you get a list of outputs back. A `ThreadPoolExecutor(max_workers=10)` is the sync equivalent for callers that don't want to be async. In practice the bottleneck is the model, not the network: 10 parallel calls saturate the rate limiter long before they saturate `asyncio`.\n\nI'll be honest: this is the weakest subsystem. `embed_tfidf` hashes tokens into a 512-dim vector, `cosine` does the math, and `recall()` returns the top-k nodes whose embedding has the highest similarity. It works for short prompts and small corpora, but it is *not* semantic: `database` and `sql` don't cluster the way they would with a real embedding model.\n\nThe reason I shipped it anyway: a real embedding model (sentence-transformers, or a remote call) is one swap away, and the *interface* (`memorize(content, meta) -> nid` and `recall(query, top_k) -> [{id, content, meta}]`) doesn't change. When I get around to plugging in `nomic-embed-text` via Ollama, nothing in `zo.py` needs to move. The trick was defining the right shape first and being honest about which fields the placeholder is faking.\n\nThe rrule spec is a 50-page document. I needed three frequencies and a count. 
So:\n\n``` python\ndef parse_rrule(rrule: str) -> dict:\n    result = {\"interval\": 86400, \"count\": 0}  # default daily\n    if \"FREQ=DAILY\" in rrule: result[\"interval\"] = 86400\n    elif \"FREQ=HOURLY\" in rrule: result[\"interval\"] = 3600\n    elif \"FREQ=MINUTELY\" in rrule: result[\"interval\"] = 60\n    if \"COUNT=\" in rrule:\n        m = re.search(r\"COUNT=(\\d+)\", rrule)\n        if m: result[\"count\"] = int(m.group(1))\n    return result\n```\n\nA daemon thread wakes once a minute, asks SQLite for `WHERE enabled=1 AND next_run <= now`, fires each one's `handler`, and bumps `next_run` by the interval. That's the entire automation system. It's missing timezones, exceptions, and DST handling, but for \"run this every hour\" it is correct and reliable.\n\n`ComputePool` keeps `self.jobs` and `self.nodes` as in-memory dicts protected by a `threading.Lock`. Heartbeats update `last_heartbeat`; dispatch sorts pending jobs by `-priority` and assigns the top one to the next polling node. No leader election, no Raft, no gossip protocol.\n\n``` python\ndef assign_job(self, node_id: str) -> dict | None:\n    with self.lock:\n        pending = [j for j in self.jobs.values() if j[\"status\"] == \"pending\"]\n        if not pending: return None\n        pending.sort(key=lambda x: -x[\"priority\"])\n        job = pending[0]\n        job[\"status\"] = \"assigned\"\n        job[\"assigned_node\"] = node_id\n        if node_id in self.nodes:\n            self.nodes[node_id][\"status\"] = \"busy\"\n        return job\n```\n\nThis is a real footgun: in-process state means a process restart loses every pending job. For a *real* grid you'd want this in Postgres with row-level locks. 
But for \"let me run a job on my second laptop\", `pip install`\n\nis the whole onboarding.\n\nThree things are *not* in the package and probably never will be:\n\n`zo`\n\nand call `zo.ask(...)`\n\nfrom a Flask route, a Tk window, a Discord bot, a cron job.`whoami()`\n\nreturns the local username. If you want a team plan, fork the repo.`nomic-embed-text`\n\n(private, free, runs on the same box) and the interface stays the same.\n\n```\ngit clone https://github.com/AmSach/ZoClone\ncd ZoClone && pip install aiohttp playwright\npython -m playwright install chromium\npython -c \"from src.zo import zo; print(zo.ask('test-conv', 'hi'))\"\n```\n\nIf you want a skill added, drop a folder in `skills/`\n\nwith a `SKILL.md`\n\n+ `scripts/foo.py`\n\nand open a PR. I merge in 24 hours. If you find a real bug in one of the seven subsystems, open an issue with a minimal repro — there are only 775 lines to search.\n\n*Seven files, one Python process, no cloud dependency. The shape matters more than the scale.*", "url": "https://wpnews.pro/news/i-rebuilt-zo-computer-s-seven-subsystems-in-800-lines-of-python-here-s-the-the-i", "canonical_source": "https://dev.to/aman_sachan_126d19c4a2773/i-rebuilt-zo-computers-seven-subsystems-in-800-lines-of-python-heres-the-architecture-the-2757", "published_at": "2026-06-13 01:48:38+00:00", "updated_at": "2026-06-13 02:17:17.811629+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "artificial-intelligence", "ai-infrastructure"], "entities": ["Zo Computer", "ZoClone", "Groq", "OpenAI", "Anthropic", "Ollama", "Playwright", "AmSach"], "alternates": {"html": "https://wpnews.pro/news/i-rebuilt-zo-computer-s-seven-subsystems-in-800-lines-of-python-here-s-the-the-i", "markdown": "https://wpnews.pro/news/i-rebuilt-zo-computer-s-seven-subsystems-in-800-lines-of-python-here-s-the-the-i.md", "text": "https://wpnews.pro/news/i-rebuilt-zo-computer-s-seven-subsystems-in-800-lines-of-python-here-s-the-the-i.txt", "jsonld": 
"https://wpnews.pro/news/i-rebuilt-zo-computer-s-seven-subsystems-in-800-lines-of-python-here-s-the-the-i.jsonld"}}