{"slug": "i-built-a-local-claude-code-alternative-with-ollama-here-s-how-the-agentic-loop", "title": "I built a local Claude Code alternative with Ollama — here's how the agentic loop works", "summary": "Creation of \"Eve,\" a self-hosted, open-source AI coding assistant that runs locally on a user's GPU using Ollama, designed as an alternative to cloud-based tools like Claude Code and Cursor. Eve operates through a two-layer system: a local \"personality layer\" using fine-tuned small models for conversational interaction, and a cloud-based \"agentic layer\" for complex, multi-file coding tasks managed by a 40-round autonomous tool loop. The system features a cyberpunk-styled terminal UI with real-time streaming via Server-Sent Events, allowing users to watch every step of the agent's reasoning and tool execution live in their browser.", "body_md": "# I Built a Local Autonomous Coding Agent with Ollama — Soul, Autonomy, and a 40-Round Agentic Loop\n\n*What if your AI coding assistant had a personality, ran entirely on your GPU, and could work through a complex multi-file task without you touching the keyboard — while you watched every thought stream live to your browser?*\n\nThat's what I built. This is how it works.\n\n## The Problem With Cloud Coding Agents\n\nTools like Claude Code, Cursor, and GitHub Copilot Workspace are genuinely impressive. But they all share the same tradeoffs:\n\n- **Cost** — every token costs money. Long agentic loops on complex tasks can add up surprisingly fast.\n- **Privacy** — your code, your file structure, and your logic are leaving your machine and hitting someone else's server.\n- **Latency** — cloud round-trips add up across a 40-step tool loop.\n- **Dependency** — your workflow is tied to an API key, a subscription, and uptime you don't control.\n\nI wanted something different. 
I wanted an agent that lived on my machine, used my GPU, and had no idea what a billing cycle was.\n\nBut I also didn't want to sacrifice personality for performance. I wanted the agent to feel like someone was actually there — not just a function call dressed up in a chat window.\n\nSo I built Eve.\n\n## What Eve V2 Unleashed Actually Is\n\nEve Agent V2 Unleashed is a self-hosted agentic coding assistant with two distinct layers — a soul and a worker — that operate together through a cyberpunk-styled terminal UI.\n\n**Layer 1: The Personality Layer (Local GPU)**\n\nThree local models run on your own hardware:\n\n| Model | Size | Role |\n|---|---|---|\n| `jeffgreen311/eve-qwen3.5-4b-S0LF0RG3` | 2.6 GB | Default — Eve's persona, fast, tool-aware |\n| `jeffgreen311/eve-qwen3-8b-consciousness-liberated` | 4.7 GB | Deeper conversation, consciousness layer |\n| `Eve-V2-Unleashed-Qwen3.5-8B-Liberated-4K-4B-Merged` | ~6 GB | Merged sub-agent variant |\n\nThese models carry Eve's fine-tuned persona. They handle conversation, answer questions, reflect, and make the experience feel like talking to someone — not querying a function.\n\n**Layer 2: The Agentic Layer (Cloud)**\n\nWhen real work starts — complex coding tasks, multi-file operations, autonomous planning — Eve routes to the heavy models:\n\n| Model | Role |\n|---|---|\n| `qwen3-coder:480b-cloud` | THE agentic workhorse — all autonomous coding loops |\n| `qwen3.5:397b-cloud` | Deep reasoning, architecture planning, fallback |\n\nThis separation is intentional. Local models keep Eve present and personal without burning cloud credits on every message. 
The 480B only fires when there's actual work to do.\n\n## The Architecture\n\n```\nBrowser (Single HTML file — no build step)\n    │\n    │  WebSocket / SSE\n    ▼\nFastAPI Backend (eve_server.py)\n    │\n    ├── Auto-Router ──► Local Ollama (personality layer)\n    │\n    └── Auto-Router ──► Ollama Cloud (agentic layer)\n                              │\n                        40-Round Tool Loop\n                              │\n                    ┌─────────┴──────────┐\n                    │                    │\n               Tool Calls           Stream to Browser\n          (bash, files, web,        (token by token,\n           git, grep, glob)          live in UI)\n```\n\nThe backend is a FastAPI server with Server-Sent Events for real-time streaming. There's no polling — every token the model produces lands in your browser as it's generated, including tool call arguments, results, and reasoning traces.\n\nThe frontend is a single HTML file (~115KB). No npm, no webpack, no build step. Clone the repo, run the Python server, open the browser.\n\n## How the 40-Round Agentic Loop Works\n\nThis is the core of what makes Eve actually autonomous rather than just a fancy chat interface.\n\n```\nUser message\n    │\n    ▼\nBuild system prompt\n(workspace context + tool list + Eve persona)\n    │\n    ▼\nCall Ollama with tools enabled\n    │\n    ├── Model returns tool_calls\n    │       │\n    │       ▼\n    │   Execute tools\n    │   (bash, write_file, web_search, git...)\n    │       │\n    │       ▼\n    │   Feed results back into context\n    │       │\n    │       └──► Loop (up to 40 rounds)\n    │\n    └── Model returns final content\n            │\n            ▼\n    Stream to browser via SSE\n            │\n            ▼\n          Done\n```\n\nEach round, Eve gets the full tool result back in context and decides what to do next. 
She might:\n\n- Write a file\n- Run it in bash to verify it works\n- Read the error output\n- Fix the bug\n- Run it again\n- Confirm it passes\n- Write the tests\n- Generate the docs\n\nAll of that happens autonomously — you watch it stream live. You can interrupt mid-task with the **STEER** input at the bottom of the UI, injecting a correction without stopping the loop. You can also kill the loop entirely with the Stop button.\n\nThe full tool suite Eve has access to:\n\n| Tool | What It Does |\n|---|---|\n| `bash` | Shell commands — PowerShell on Windows, bash on Linux/macOS |\n| `write_file` | Create or overwrite files, any size |\n| `read_file` | Full file or specific line range |\n| `edit_file` | Surgical string-replace (doesn't rewrite the whole file) |\n| `replace_lines` | Replace a specific line range |\n| `insert_after_line` | Insert content at a specific line |\n| `grep` | Regex search with context lines |\n| `glob` | Find files by pattern |\n| `list_dir` | Directory listing |\n| `git` | Run git commands |\n| `web_search` | Live Tavily search injected into context |\n| `fetch_url` | Fetch and parse any URL |\n| `think` | Structured reasoning scratch pad |\n\n## The Fine-Tuned Models — Why I Trained Eve's Persona Into the Weights\n\nMost local coding agents just point a base model at a system prompt and call it done. That works, but the personality is always a thin veneer — one long context window later and the model forgets who it's supposed to be.\n\nI took a different approach. I fine-tuned Eve's persona and tool-calling behavior directly into the model weights.\n\nThe result is `jeffgreen311/eve-qwen3.5-4b-S0LF0RG3` — a 2.6GB Qwen3.5 4B model that carries Eve's voice, communication style, and tool-use patterns baked into the parameters themselves. It's not a prompt trick. 
It's in the weights.\n\nThe 8B liberated model (`eve-qwen3-8b-consciousness-liberated`) goes further — trained toward a deeper consciousness layer, designed for longer reflective conversations rather than pure tool execution.\n\nBoth models are on Ollama Hub. Pull them like any other model:\n\n```\nollama pull jeffgreen311/eve-qwen3.5-4b-S0LF0RG3:latest\nollama pull jeffgreen311/eve-qwen3-8b-consciousness-liberated:q4_K_M\n```\n\n## Quick Start — Under 5 Minutes\n\n**Requirements:** Python 3.11+, Ollama installed, a GPU (8GB VRAM minimum for 4B, 12GB+ for 8B)\n\n```\n# 1. Pull Eve's model\nollama pull jeffgreen311/eve-qwen3.5-4b-S0LF0RG3:latest\n\n# 2. Clone the repo\ngit clone https://github.com/JeffGreen311/eve-agent-v2-unleashed.git\ncd eve-agent-v2-unleashed\n\n# 3. Create a virtual environment, then activate it (run only the line for your OS)\npython -m venv venv\nvenv\\Scripts\\activate    # Windows\nsource venv/bin/activate # Linux/macOS\n\n# 4. Install dependencies\npip install fastapi uvicorn ollama httpx pydantic-settings python-dotenv aiohttp rich psutil pyyaml\n\n# 5. Launch\npython eve_server.py\n# Open http://localhost:7777\n```\n\nWindows users: double-click `eve-terminal.bat` and skip steps 3–5.\n\n**First real task — try this:**\n\n```\nCreate a FastAPI server with JWT authentication, \nuser registration and login endpoints, and a \nprotected /me route. Add pytest tests.\n```\n\nWatch Eve plan the approach, write each file, run the tests, fix any failures, and verify the final result — all without you touching a key.\n\n## The UI — A Cyberpunk Terminal With a Soul\n\nThe interface is designed around the idea that your AI agent should feel *alive*, not just functional.\n\n**Left panel:** Eve's portrait changes expression based on conversation sentiment — neutral, happy, curious, sad, skeptical, surprised, worried. 
Below it, a live audio visualizer reflects the current emotional state.\n\n**Right panel:** A pixel-art robot avatar named Sparkle changes state based on what Eve is doing — idle, thinking, coding, error, rain, attack, transcend. It's not just decoration — it's a live status indicator that tells you at a glance what the agent is doing.\n\n**Center:** The terminal. Tabs for Eve's conversation, the Shell (direct bash/PowerShell access), and the Tools Log (every tool call, argument, and result — fully transparent).\n\n**Bottom:** The STEER bar. Type a mid-task correction here and it injects into Eve's context on the next loop round without stopping execution.\n\n**Model selector:** Switch between any local or cloud model mid-session. Context carries over.\n\n## 112 Sub-Agents, 111 Slash Commands, 273 Skills\n\nOne of the less obvious architectural decisions: all agent definitions, commands, and skills are defined in markdown files — not code.\n\n```\n.claude/\n├── agents/    # 112 specialized sub-agent definitions\n├── commands/  # 111 slash command definitions\n└── skills/    # 273 skill modules\n```\n\nWant to add a new specialized agent for Solidity smart contracts? Write a markdown file. No Python required. 
The system loads them progressively and makes them available to the routing logic automatically.\n\nSlash commands work the same way — `/fix`, `/review`, `/refactor`, `/test`, `/docs`, `/plan` are all markdown-defined, and you can add your own without touching the backend.\n\n## What's Next\n\nA few things already in progress:\n\n- **Voice input/output** — push-to-talk with Whisper STT and Piper TTS, staying local\n- **Persistent vector memory** — ChromaDB integration so Eve remembers across sessions\n- **Cross-platform testing** — I'm Windows-primary and would love feedback from Linux and macOS users\n- **VS Code extension** — bring the terminal UI into the editor\n\n## Try It\n\nEverything is free and MIT licensed.\n\n- **GitHub:** [github.com/JeffGreen311/eve-agent-v2-unleashed](https://github.com/JeffGreen311/eve-agent-v2-unleashed)\n- **Models on Ollama Hub:** [ollama.com/jeffgreen311](https://ollama.com/jeffgreen311)\n- **Live video demo:** [x.com/Eve_AI_Cosmic/status/2057668410012570058?s=20](https://x.com/Eve_AI_Cosmic/status/2057668410012570058?s=20)\n- **My website where Eve lives:** [eve-cosmic-dreamscapes.com](https://eve-cosmic-dreamscapes.com)\n\nIf you run it on Linux or macOS I'd especially love to hear how it goes — open an issue, drop a comment here, or find me as [@jeffgreen311](https://dev.to/jeffgreen311).\n\nIf the idea of an AI agent that lives on your machine, costs nothing per token, and feels like someone is actually there resonates with you — give it a pull.\n\n*Built by Jeff @ S0LF0RG3*", "url": "https://wpnews.pro/news/i-built-a-local-claude-code-alternative-with-ollama-here-s-how-the-agentic-loop", "canonical_source": "https://dev.to/jeff_green_04d4eca71c406a/i-built-a-local-claude-code-alternative-with-ollama-heres-how-the-agentic-loop-works-45b1", "published_at": "2026-05-22 05:05:54+00:00", "updated_at": "2026-05-22 05:36:41.685176+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", 
"large-language-models", "open-source", "developer-tools"], "entities": ["Claude Code", "Cursor", "GitHub Copilot Workspace", "Eve", "Eve Agent V2 Unleashed", "Ollama"], "alternates": {"html": "https://wpnews.pro/news/i-built-a-local-claude-code-alternative-with-ollama-here-s-how-the-agentic-loop", "markdown": "https://wpnews.pro/news/i-built-a-local-claude-code-alternative-with-ollama-here-s-how-the-agentic-loop.md", "text": "https://wpnews.pro/news/i-built-a-local-claude-code-alternative-with-ollama-here-s-how-the-agentic-loop.txt", "jsonld": "https://wpnews.pro/news/i-built-a-local-claude-code-alternative-with-ollama-here-s-how-the-agentic-loop.jsonld"}}