I rebuilt Zo Computer from scratch in 775 lines of Python — here's what stuck and what snapped

wpnews.pro

Zo Computer gives you an AI agent, a skills registry, a compute pool, browser automation, file hosting, scheduled automations, and persistent memory — all on a personal server. I wanted to understand every seam, so I rebuilt the whole thing in vanilla Python 3 with no web framework and no Docker. The result is ZoClone: 10 modules, 775 lines, 4 SQLite tables, one ThreadPoolExecutor. This is what the architecture actually looks like when you strip out the platform.

The main module is ZoClone.__init__, and that's the entire dependency graph. Each subsystem is an attribute:

class ZoClone:
    def __init__(self):
        self.db = init_db()
        self.executor = ThreadPoolExecutor(max_workers=10)
        self.ai_client = None
        self.pool = pool              # ComputePool singleton
        self.hosting = hosting        # HostingService singleton
        self.memory = memory          # SQLite-backed memory
        self.scheduler = scheduler    # cron-like automations

No DI container, no event bus, no message queue. Every tool is a method on the same object. If you're coming from a microservice background, this is going to look like a 2014 Django app — and that's the point. When you can fit the whole mental model on one screen, you stop second-guessing where a bug lives.
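As a rough illustration of "every tool is a method on the same object", here is a minimal sketch of name-based dispatch. MiniAgent, the tool_ prefix, and call_tool are hypothetical names invented for this example and are not taken from the ZoClone source.

```python
from concurrent.futures import ThreadPoolExecutor


class MiniAgent:
    """Hypothetical stand-in for ZoClone: every tool is a plain method."""

    def __init__(self):
        self.executor = ThreadPoolExecutor(max_workers=2)
        self.notes = {}

    def tool_remember(self, key: str, value: str) -> str:
        self.notes[key] = value
        return f"stored {key}"

    def tool_recall(self, key: str) -> str:
        return self.notes.get(key, "")

    def call_tool(self, name: str, **kwargs):
        # Dispatch by attribute lookup; unknown names fail loudly.
        method = getattr(self, f"tool_{name}", None)
        if method is None:
            raise ValueError(f"unknown tool: {name}")
        # Run on the shared pool, then block for the result.
        return self.executor.submit(method, **kwargs).result()


agent = MiniAgent()
stored = agent.call_tool("remember", key="editor", value="vim")
recalled = agent.call_tool("recall", key="editor")
```

With this shape, a bug in any tool is one attribute lookup away from the call site, which is the debugging benefit the paragraph above describes.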

Four tables. No ORM. No migrations. The schema is in a single executescript block:

CREATE TABLE IF NOT EXISTS conversations(id TEXT PRIMARY KEY, title TEXT, updated_at INTEGER);
CREATE TABLE IF NOT EXISTS messages(id TEXT PRIMARY KEY, conv_id TEXT, role TEXT, content TEXT, tools TEXT, created_at INTEGER);
CREATE TABLE IF NOT EXISTS memory(id TEXT PRIMARY KEY, key TEXT UNIQUE, value TEXT, updated_at INTEGER);
CREATE TABLE IF NOT EXISTS files(id TEXT PRIMARY KEY, path TEXT UNIQUE, content TEXT, encoding TEXT, updated_at INTEGER);

IDs are SHA-256 hashes of (timestamp, content) truncated to 24 chars. The tools column on messages is a freeform JSON blob. The memory table is a key-value store with UNIQUE on key, so there is one row per key; paired with an upsert-style write, that gives last-write-wins semantics. When your entire data model is four tables, schema design becomes a five-minute conversation instead of a five-day one.
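A minimal sketch of how the ID scheme and last-write-wins memory could look. The exact string fed to the hash, and the helper names make_id and remember, are assumptions made for this example; the upsert syntax needs SQLite 3.24 or newer.

```python
import hashlib
import sqlite3
import time


def make_id(content: str, ts=None) -> str:
    """SHA-256 over (timestamp, content), truncated to 24 hex chars.

    The "ts:content" encoding is an assumption, not the original's format.
    """
    ts = time.time_ns() if ts is None else ts
    return hashlib.sha256(f"{ts}:{content}".encode("utf-8")).hexdigest()[:24]


def remember(db: sqlite3.Connection, key: str, value: str) -> None:
    """Upsert on the UNIQUE key column so the newest value wins."""
    db.execute(
        "INSERT INTO memory(id, key, value, updated_at) VALUES (?, ?, ?, ?) "
        "ON CONFLICT(key) DO UPDATE SET value = excluded.value, "
        "updated_at = excluded.updated_at",
        (make_id(key + value), key, value, int(time.time())),
    )
    db.commit()


db = sqlite3.connect(":memory:")
db.executescript(
    "CREATE TABLE IF NOT EXISTS memory("
    "id TEXT PRIMARY KEY, key TEXT UNIQUE, value TEXT, updated_at INTEGER);"
)
remember(db, "editor", "vim")
remember(db, "editor", "helix")  # same key: overwrites instead of failing
rows = db.execute("SELECT key, value FROM memory").fetchall()
```

Without the ON CONFLICT clause, the second write would raise sqlite3.IntegrityError, so the UNIQUE constraint alone rejects duplicates rather than resolving them.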

Skills in Zo are a folder with a SKILL.md (frontmatter) and a scripts/<name>.py (handler). I auto-discover them at import time:

import importlib.util
from pathlib import Path

def load_skill(name: str, path: Path) -> Skill:
    # path points at the skill's SKILL.md; frontmatter is flat "key: value" lines
    md_content = path.read_text()
    frontmatter = {}
    if md_content.startswith("---"):
        end = md_content.find("---", 3)
        for line in md_content[3:end].strip().split("\n"):
            if ":" in line:
                k, v = line.split(":", 1)
                frontmatter[k.strip()] = v.strip()

    # Import scripts/<name>.py as a module and pick its entry point
    py_file = path.parent / "scripts" / f"{name}.py"
    spec = importlib.util.spec_from_file_location(name, py_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    handler = getattr(module, "run", getattr(module, "execute", None))
    return Skill(name=name, description=..., triggers=..., handler=handler)

No registry service, no API call to discover skills. The filesystem is the registry. Drop a folder, restart, it's loaded. The triggers field in frontmatter is just a comma-separated string: the LLM gets all skill descriptions in its system prompt and decides which one to call. There's no embedding-based retrieval because, at 30 skills, exact-match triggers work fine.
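To make "the filesystem is the registry" concrete, here is a small sketch that scans a directory for SKILL.md files and splits the triggers string. The function names, the weather skill, and the <root>/<skill>/SKILL.md layout are assumptions made for this example. It also treats an unterminated frontmatter block as absent, a case the loader above does not guard against.

```python
import tempfile
from pathlib import Path


def parse_frontmatter(text: str) -> dict:
    """Parse flat `key: value` lines between the first two `---` markers."""
    if not text.startswith("---"):
        return {}
    end = text.find("---", 3)
    if end == -1:  # unterminated frontmatter: treat as absent
        return {}
    meta = {}
    for line in text[3:end].strip().splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(root: Path) -> dict:
    """Map each skill folder name to its frontmatter, with triggers split."""
    skills = {}
    for md in sorted(root.glob("*/SKILL.md")):
        meta = parse_frontmatter(md.read_text(encoding="utf-8"))
        raw = meta.get("triggers", "")
        triggers = [t.strip() for t in raw.split(",") if t.strip()]
        skills[md.parent.name] = {**meta, "triggers": triggers}
    return skills


# Build a throwaway skill folder to show the scan end to end.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "weather"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: weather\ndescription: Look up a forecast\n"
        "triggers: forecast, weather\n---\nBody text.\n",
        encoding="utf-8",
    )
    found = discover_skills(Path(tmp))
```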

The peer-to-peer compute mesh in ZoClone is a dict of jobs, a dict of nodes, and one threading.Lock:

def assign_job(self, node_id: str) -> Optional[Dict]:
    with self.lock:
        pending = [j for j in self.jobs.values() if j["status"] == "pending"]
        if not pending:
            return None
        pending.sort(key=lambda x: -x["priority"])
        job = pending[0]
        job["status"] = "assigned"
        job["assigned_node"] = node_id
        return job

That's it. The hub polls, picks the highest-priority pending job, marks it as assigned to the requesting node, and returns the work. No Redis Streams, no RabbitMQ, no Kafka. The trade-off is obvious: this is a single-process orchestrator, not a horizontally scalable scheduler. But for a 50-node grid running nightly ML batch jobs, you don't need Kafka. You need a lock and a sort.
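A self-contained sketch of the same lock-and-sort idea, small enough to run as-is. MiniPool, submit, and the job names are hypothetical; only the assign_job logic mirrors the snippet above, with max() standing in for the sort since only the top job is needed.

```python
import threading
from typing import Dict, Optional


class MiniPool:
    """Hypothetical in-process job pool guarded by a single lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.jobs: Dict[str, Dict] = {}

    def submit(self, job_id: str, priority: int) -> None:
        with self.lock:
            self.jobs[job_id] = {
                "id": job_id,
                "priority": priority,
                "status": "pending",
                "assigned_node": None,
            }

    def assign_job(self, node_id: str) -> Optional[Dict]:
        with self.lock:
            pending = [j for j in self.jobs.values() if j["status"] == "pending"]
            if not pending:
                return None
            # Highest priority wins; ties go to the earliest submitted job.
            job = max(pending, key=lambda j: j["priority"])
            job["status"] = "assigned"
            job["assigned_node"] = node_id
            return job


pool = MiniPool()
pool.submit("nightly-train", priority=5)
pool.submit("thumbnail", priority=1)
first = pool.assign_job("node-a")
second = pool.assign_job("node-b")
third = pool.assign_job("node-c")  # nothing left to hand out
```

Because the status flip happens inside the lock, two nodes polling at the same time cannot be handed the same job.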

GPU tier multipliers, regional pricing, and reputation decay are all JSON fields in the nodes dict. When you need to add a new pricing rule, you change one line of assign_job. Compare that to a Kubernetes operator with custom resource definitions, admission webhooks, and reconciler loops.

Zo has a /zo/ask API that spawns child agent invocations. The clone just calls it:

async def spawn(self, agent_id: str, prompt: str):
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "https://api.zo.computer/zo/ask",
            headers={"authorization": self.api_token, "content-type": "application/json"},
            json={"input": prompt, "model_name": self.model}
        ) as resp:
            return {"agent_id": agent_id, "output": (await resp.json()).get("output", "")}

async def spawn_all(self, agents: list):
    return await asyncio.gather(*[self.spawn(a["id"], a["prompt"]) for a in agents])

Five agent invocations in parallel is asyncio.gather. No Celery, no RQ, no Dask. The model_name is hardcoded: there's exactly one LLM driver, and it's whatever Zo gives you. If you want a different model, change one string.
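One detail worth sketching: by default asyncio.gather raises the first exception to the caller, so one failed agent hides the results of the others. The example below swaps the HTTP call for a hypothetical stub (fake_ask) so it runs offline, and uses return_exceptions=True to keep per-agent results. It is an illustration of the fan-out pattern, not the original spawn code.

```python
import asyncio


async def fake_ask(agent_id: str, prompt: str) -> dict:
    """Hypothetical stand-in for the HTTP call; fails on an empty prompt."""
    await asyncio.sleep(0.01)
    if not prompt:
        raise ValueError("empty prompt")
    return {"agent_id": agent_id, "output": prompt.upper()}


async def spawn_all(agents: list) -> list:
    results = await asyncio.gather(
        *(fake_ask(a["id"], a["prompt"]) for a in agents),
        return_exceptions=True,
    )
    # gather preserves input order, so results line up with agents.
    return [
        {"agent_id": a["id"], "error": str(r)} if isinstance(r, Exception) else r
        for a, r in zip(agents, results)
    ]


agents = [{"id": "a1", "prompt": "summarize"}, {"id": "a2", "prompt": ""}]
results = asyncio.run(spawn_all(agents))
```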

What snapped:

- No sandbox: run_command is subprocess.run(cmd, shell=True). The agent can rm -rf ~ and it will. Production Zo wraps this in gVisor; I don't.
- Memory search is a LIKE '%query%' scan. Fine at 1k rows, embarrassing at 100k.
- The chat() call is blocking. You see the full response or nothing.
- set_key() writes API keys to a flat JSON file in ~/.zoclone/. Multi-user means multi-disaster.
- The only test harness is an if __name__ == "__main__" block that prints the pool status.

What I'd fix first:

- Wrap run_command in a gVisor container, or at minimum a chroot + seccomp.
- Swap the memory table for SQLite-vec0 and do real semantic recall.
- Add an Authorization header check on every API endpoint. Even internal services.

The real lesson wasn't "look how short the code is"; it was "look how much of the platform is just a thin layer over a database, a thread pool, and a few HTTP calls." The parts that are genuinely hard (the LLM orchestration loop, the skill discovery) are maybe 100 lines. The rest is plumbing, and most of the plumbing doesn't need to exist.
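Returning to the run_command problem listed above: short of a real sandbox, a cheap first step is to stop passing strings to a shell. The sketch below is an assumption-laden illustration, not the original code. The allowlist, the return shape, and the timeout are invented for this example, and shlex splitting assumes POSIX-style command lines. It is not a substitute for gVisor or seccomp.

```python
import shlex
import subprocess
import sys


def run_command(cmd: str, allowed: set, timeout: float = 10.0) -> dict:
    """Run cmd without a shell, only if its program is on the allowlist."""
    argv = shlex.split(cmd)
    if not argv or argv[0] not in allowed:
        return {"ok": False, "error": "command not allowed"}
    try:
        # A list argv with the default shell=False: ';', '|' and '~' reach
        # the program as plain text and are never interpreted by a shell.
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": "timeout"}
    except OSError as exc:  # e.g. program missing or not executable
        return {"ok": False, "error": str(exc)}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


# Hypothetical allowlist: only the current Python interpreter may run.
allowed = {sys.executable}
good = run_command(shlex.join([sys.executable, "-c", "print('hi')"]), allowed)
bad = run_command("rm -rf ~", allowed)
```

An allowlisted interpreter can still do anything the process user can, so this narrows the blast radius of a bad prompt without containing it.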

Repo: github.com/AmSach/ZoClone

License: MIT

Stack: Python 3.10+, SQLite, requests, aiohttp, no web framework

If you've built a personal-AI clone of your own, drop the repo link in the comments. I want to see how other people split the agent loop from the storage layer.

Source: dev.to (original article)