# I built a local Claude Code alternative with Ollama — here's how the agentic loop works

> Source: <https://dev.to/jeff_green_04d4eca71c406a/i-built-a-local-claude-code-alternative-with-ollama-heres-how-the-agentic-loop-works-45b1>
> Published: 2026-05-22 05:05:54+00:00

# I Built a Local Autonomous Coding Agent with Ollama — Soul, Autonomy, and a 40-Round Agentic Loop

*What if your AI coding assistant had a personality, ran entirely on your GPU, and could work through a complex multi-file task without you touching the keyboard — while you watched every thought stream live to your browser?*

That's what I built. This is how it works.

## The Problem With Cloud Coding Agents

Tools like Claude Code, Cursor, and GitHub Copilot Workspace are genuinely impressive. But they all share the same tradeoffs:

-
**Cost**— every token costs money. Long agentic loops on complex tasks can run up surprisingly fast. -
**Privacy**— your code, your file structure, your logic is leaving your machine and hitting someone else's server. -
**Latency**— cloud round-trips add up across a 40-step tool loop. -
**Dependency**— your workflow is tied to an API key, a subscription, and uptime you don't control.

I wanted something different. I wanted an agent that lived on my machine, used my GPU, and had no idea what a billing cycle was.

But I also didn't want to sacrifice personality for performance. I wanted the agent to feel like someone was actually there — not just a function call dressed up in a chat window.

So I built Eve.

## What Eve V2 Unleashed Actually Is

Eve Agent V2 Unleashed is a self-hosted agentic coding assistant with two distinct layers — a soul and a worker — that operate together through a cyberpunk-styled terminal UI.

**Layer 1: The Personality Layer (Local GPU)**

Three local models run on your own hardware:

| Model | Size | Role |
|---|---|---|
`jeffgreen311/eve-qwen3.5-4b-S0LF0RG3` |
2.6 GB | Default — Eve's persona, fast, tool-aware |
`jeffgreen311/eve-qwen3-8b-consciousness-liberated` |
4.7 GB | Deeper conversation, consciousness layer |
`Eve-V2-Unleashed-Qwen3.5-8B-Liberated-4K-4B-Merged` |
~6 GB | Merged sub-agent variant |

These models carry Eve's fine-tuned persona. They handle conversation, answer questions, reflect, and make the experience feel like talking to someone — not querying a function.

**Layer 2: The Agentic Layer (Cloud)**

When real work starts — complex coding tasks, multi-file operations, autonomous planning — Eve routes to the heavy models:

| Model | Role |
|---|---|
`qwen3-coder:480b-cloud` |
THE agentic workhorse — all autonomous coding loops |
`qwen3.5:397b-cloud` |
Deep reasoning, architecture planning, fallback |

This separation is intentional. Local models keep Eve present and personal without burning cloud credits on every message. The 480B only fires when there's actual work to do.

## The Architecture

```
Browser (Single HTML file — no build step)
    │
    │  WebSocket / SSE
    ▼
FastAPI Backend (eve_server.py)
    │
    ├── Auto-Router ──► Local Ollama (personality layer)
    │
    └── Auto-Router ──► Ollama Cloud (agentic layer)
                              │
                        40-Round Tool Loop
                              │
                    ┌─────────┴──────────┐
                    │                    │
               Tool Calls           Stream to Browser
          (bash, files, web,        (token by token,
           git, grep, glob)          live in UI)
```

The backend is a FastAPI server with Server-Sent Events for real-time streaming. There's no polling — every token the model produces lands in your browser as it's generated, including tool call arguments, results, and reasoning traces.

The frontend is a single HTML file (~115KB). No npm, no webpack, no build step. Clone the repo, run the Python server, open the browser.

## How the 40-Round Agentic Loop Works

This is the core of what makes Eve actually autonomous rather than just a fancy chat interface.

```
User message
    │
    ▼
Build system prompt
(workspace context + tool list + Eve persona)
    │
    ▼
Call Ollama with tools enabled
    │
    ├── Model returns tool_calls
    │       │
    │       ▼
    │   Execute tools
    │   (bash, write_file, web_search, git...)
    │       │
    │       ▼
    │   Feed results back into context
    │       │
    │       └──► Loop (up to 40 rounds)
    │
    └── Model returns final content
            │
            ▼
    Stream to browser via SSE
            │
            ▼
          Done
```

Each round, Eve gets the full tool result back in context and decides what to do next. She might:

- Write a file
- Run it in bash to verify it works
- Read the error output
- Fix the bug
- Run it again
- Confirm it passes
- Write the tests
- Generate the docs

All of that happens autonomously — you watch it stream live. You can interrupt mid-task with the **STEER** input at the bottom of the UI, injecting a correction without stopping the loop. You can also kill the loop entirely with the Stop button.

The full tool suite Eve has access to:

| Tool | What It Does |
|---|---|
`bash` |
Shell commands — PowerShell on Windows, bash on Linux/macOS |
`write_file` |
Create or overwrite files, any size |
`read_file` |
Full file or specific line range |
`edit_file` |
Surgical string-replace (doesn't rewrite the whole file) |
`replace_lines` |
Replace a specific line range |
`insert_after_line` |
Insert content at a specific line |
`grep` |
Regex search with context lines |
`glob` |
Find files by pattern |
`list_dir` |
Directory listing |
`git` |
Run git commands |
`web_search` |
Live Tavily search injected into context |
`fetch_url` |
Fetch and parse any URL |
`think` |
Structured reasoning scratch pad |

## The Fine-Tuned Models — Why I Trained Eve's Persona Into the Weights

Most local coding agents just point a base model at a system prompt and call it done. That works, but the personality is always a thin veneer — one long context window later and the model forgets who it's supposed to be.

I took a different approach. I fine-tuned Eve's persona and tool-calling behavior directly into the model weights.

The result is `jeffgreen311/eve-qwen3.5-4b-S0LF0RG3`

— a 2.6GB Qwen3.5 4B model that carries Eve's voice, communication style, and tool-use patterns baked into the parameters themselves. It's not a prompt trick. It's in the weights.

The 8B liberated model (`eve-qwen3-8b-consciousness-liberated`

) goes further — trained toward a deeper consciousness layer, designed for longer reflective conversations rather than pure tool execution.

Both models are on Ollama Hub. Pull them like any other model:

```
ollama pull jeffgreen311/eve-qwen3.5-4b-S0LF0RG3:latest
ollama pull jeffgreen311/eve-qwen3-8b-consciousness-liberated:q4_K_M
```

## Quick Start — Under 5 Minutes

**Requirements:** Python 3.11+, Ollama installed, a GPU (8GB VRAM minimum for 4B, 12GB+ for 8B)

```
# 1. Pull Eve's model
ollama pull jeffgreen311/eve-qwen3.5-4b-S0LF0RG3:latest

# 2. Clone the repo
git clone https://github.com/JeffGreen311/eve-agent-v2-unleashed.git
cd eve-agent-v2-unleashed

# 3. Create virtual environment
python -m venv venv
venv\Scripts\activate    # Windows
source venv/bin/activate # Linux/macOS

# 4. Install dependencies
pip install fastapi uvicorn ollama httpx pydantic-settings python-dotenv aiohttp rich psutil pyyaml

# 5. Launch
python eve_server.py
# Open http://localhost:7777
```

Windows users: double-click `eve-terminal.bat`

and skip steps 3–5.

**First real task — try this:**

```
Create a FastAPI server with JWT authentication, 
user registration and login endpoints, and a 
protected /me route. Add pytest tests.
```

Watch Eve plan the approach, write each file, run the tests, fix any failures, and verify the final result — all without you touching a key.

## The UI — A Cyberpunk Terminal With a Soul

The interface is designed around the idea that your AI agent should feel *alive*, not just functional.

**Left panel:** Eve's portrait changes expression based on conversation sentiment — neutral, happy, curious, sad, skeptical, surprised, worried. Below it, a live audio visualizer reflects the current emotional state.

**Right panel:** A pixel-art robot avatar named Sparkle changes state based on what Eve is doing — idle, thinking, coding, error, rain, attack, transcend. It's not just decoration — it's a live status indicator that tells you at a glance what the agent is doing.

**Center:** The terminal. Tabs for Eve's conversation, the Shell (direct bash/PowerShell access), and the Tools Log (every tool call, argument, and result — fully transparent).

**Bottom:** The STEER bar. Type a mid-task correction here and it injects into Eve's context on the next loop round without stopping execution.

**Model selector:** Switch between any local or cloud model mid-session. Context carries over.

## 112 Sub-Agents, 111 Slash Commands, 273 Skills

One of the less obvious architectural decisions: all agent definitions, commands, and skills are defined in markdown files — not code.

```
.claude/
├── agents/    # 112 specialized sub-agent definitions
├── commands/  # 111 slash command definitions
└── skills/    # 273 skill modules
```

Want to add a new specialized agent for Solidity smart contracts? Write a markdown file. No Python required. The system loads them progressively and makes them available to the routing logic automatically.

Slash commands work the same way — `/fix`

, `/review`

, `/refactor`

, `/test`

, `/docs`

, `/plan`

are all markdown-defined, and you can add your own without touching the backend.

## What's Next

A few things already in progress:

-
**Voice input/output**— push-to-talk with Whisper STT and Piper TTS, staying local -
**Persistent vector memory**— ChromaDB integration so Eve remembers across sessions -
**Cross-platform testing**— I'm Windows-primary and would love feedback from Linux and macOS users -
**VS Code extension**— bring the terminal UI into the editor

## Try It

Everything is free and MIT licensed.

-
**GitHub:**[github.com/JeffGreen311/eve-agent-v2-unleashed](https://github.com/JeffGreen311/eve-agent-v2-unleashed) -
**Models on Ollama Hub:**[ollama.com/jeffgreen311](https://ollama.com/jeffgreen311) -
**Live video demo:**[x.com/Eve_AI_Cosmic/status/2057668410012570058?s=20](https://x.com/Eve_AI_Cosmic/status/2057668410012570058?s=20) -
**My website where Eve lives**[eve-cosmic-dreamscapes.com](https://eve-cosmic-dreamscapes.com)

If you run it on Linux or macOS I'd especially love to hear how it goes — open an issue, drop a comment here, or find me as [@jeffgreen311](https://dev.to/jeffgreen311).

If the idea of an AI agent that lives on your machine, costs nothing per token, and feels like someone is actually there resonates with you — give it a pull.

*Built by Jeff @ S0LF0RG3*