# Show HN: Coding agent with algebraic memory (VSA) instead of RAG

> Source: <https://github.com/vitaliyfedotovpro-art/raidho>
> Published: 2026-06-14 23:44:56+00:00

**A coding agent that plans with one model, executes with another, and remembers what it learns.**

Most coding agents are one model in a tool loop. Raidho splits the work: use a
**smart, expensive model to reason and plan**, a **cheap, fast model to execute**,
and a **durable memory** that carries facts across runs — all
provider-agnostic, with your own API key.

The name is the rune Raidho (ᚱ): "journey / movement".

Status: alpha. Tested end-to-end live against both backends (DeepSeek and Claude through the official Anthropic SDK, including the agentic tool-loop). A reproducible real-API benchmark ships in `benchmarks/real_task_opus.py` with full evidence (`evidence/2026-06-11_opus_vs_raidho/`). Same task, same model: deterministic procedure $0.05 / context-first hybrid $0.116 / pure tool-loop $0.301; the hybrid matched the loop's report quality at ×2.6 less cost. APIs may change before 1.0.
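The cost multipliers follow directly from the three per-run figures quoted in the status note. A small sketch of the arithmetic (the `cost_ratios` helper is illustrative, not part of Raidho):

```python
# Per-run costs in USD, copied from the status note.
COSTS = {
    "deterministic procedure": 0.05,
    "context-first hybrid": 0.116,
    "pure tool-loop": 0.301,
}

def cost_ratios(costs, baseline="pure tool-loop"):
    """Return how many times cheaper each variant is than the baseline."""
    base = costs[baseline]
    return {name: base / cost for name, cost in costs.items()}

ratios = cost_ratios(COSTS)
for name, ratio in ratios.items():
    print(f"{name}: x{ratio:.1f} cheaper than the pure tool-loop")
```

The hybrid's ratio rounds to the ×2.6 quoted in the status note.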

- **Reasoning ≠ execution.** `text` mode (reasoning, no tools) and `code` mode (agentic tool loop) can run on **different providers**. Plan on Claude, grind on DeepSeek: you choose where the expensive thinking happens and where the cheap doing happens.
- **Council mode.** Have two providers *debate* a question and a neutral pass distill the consensus (points of agreement, residual disagreements, recommendation), e.g. Claude vs DeepSeek. Depersonalized and provider-pluggable; no built-in personas.
- **Durable, structural memory (persists across runs).** The agent remembers `(subject, relation, object)` facts and recalls the relevant ones into its prompt each turn; it saves new ones itself via a `remember` tool, and **council verdicts are distilled into facts automatically**. Memory is written to disk per project (`<workdir>/.raidho/memory`) and reloaded next run, so a decision reached today resurfaces tomorrow, recalled only when relevant (cheap; no history bloat) and across languages (a Russian query finds an English fact). It's a Vector Symbolic Architecture (VSA), not RAG: facts are composed algebraically, similarity is bit-packed (32× less RAM than float, identical ranking). You don't need to know any of that to use it; see [docs/MEMORY.md](/vitaliyfedotovpro-art/raidho/blob/main/docs/MEMORY.md) if you want to.
- **Gets cheaper with repetition (opt-in).** Turn on auto-distillation and a successful read-only tool-loop is captured as a deterministic procedure: the next similar task replaces the multi-iteration LLM loop with deterministic data-collection + one synthesis call. Heavily gated for safety (read-only commands and pipelines only, a safety-verify pass, neutral fitness that sinks a bad procedure; writes always stay on the LLM path). Measured live (deepseek-chat, `evidence/2026-06-12_autodistill_curve/`): the win scales with **iteration overhead**, not task size. A repeated multi-step task over small data dropped **×9.6 per repeat (70% over 5 runs)**, while a data-heavy audit (cost dominated by file contents, few iterations to cut) saved ~nothing. Honest rule: it removes repeated per-iteration context cost, not the cost of the data itself.
- **Tiny and hackable.** The memory core depends only on `numpy`; the whole agent is a handful of files. Swap providers, tools, or the embedder without fighting a framework.
- **Bring your own key.** Claude (default), DeepSeek, OpenAI, or any OpenAI-compatible endpoint.
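To make "facts are composed algebraically" and "bit-packed, identical ranking" concrete, here is a minimal sketch of one common VSA flavour (binary hypervectors, XOR binding, majority bundling). It is an illustration under assumptions, not Raidho's actual memory code: the dimensionality, the role/symbol codebooks and every function name below are hypothetical, and the real design is described in docs/MEMORY.md.

```python
import numpy as np

D = 8192                          # hypervector width in bits (illustrative choice)
rng = np.random.default_rng(0)

def random_hv():
    """Random dense binary hypervector, one uint8 (0 or 1) per bit."""
    return rng.integers(0, 2, size=D, dtype=np.uint8)

def bind(a, b):
    """Binding = elementwise XOR; it is its own inverse."""
    return np.bitwise_xor(a, b)

def bundle(*hvs):
    """Bundling = per-bit majority vote (use an odd number of inputs)."""
    votes = np.stack(hvs).sum(axis=0, dtype=np.int64)
    return (votes * 2 > len(hvs)).astype(np.uint8)

def hamming_packed(pa, pb):
    """Hamming distance between two bit-packed (np.packbits) vectors."""
    return int(np.unpackbits(np.bitwise_xor(pa, pb)).sum())

def to_bipolar(hv):
    """Float32 +1/-1 view of the same vector, for the float comparison."""
    return (1.0 - 2.0 * hv).astype(np.float32)

# Hypothetical codebooks: one random vector per role and per symbol.
ROLES = {n: random_hv() for n in ("subject", "relation", "object")}
SYMBOLS = {n: random_hv() for n in ("dependencies", "pinned", "exact", "ranges", "auth")}

def encode_fact(subject, relation, obj):
    """Compose a (subject, relation, object) triple into one hypervector."""
    return bundle(
        bind(ROLES["subject"], SYMBOLS[subject]),
        bind(ROLES["relation"], SYMBOLS[relation]),
        bind(ROLES["object"], SYMBOLS[obj]),
    )

fact = encode_fact("dependencies", "pinned", "exact")

# Unbinding the object role yields a noisy copy of the object symbol.
query = bind(fact, ROLES["object"])
names = list(SYMBOLS)
packed_query = np.packbits(query)
ham = np.array([hamming_packed(packed_query, np.packbits(SYMBOLS[n])) for n in names])
recovered = names[int(ham.argmin())]
print("object of the fact:", recovered)

# Same ranking from float dot products, at 32x the memory per vector.
dots = np.array([to_bipolar(query) @ to_bipolar(SYMBOLS[n]) for n in names])
rank_packed = np.argsort(ham, kind="stable")
rank_float = np.argsort(-dots, kind="stable")
ram_ratio = to_bipolar(query).nbytes // packed_query.nbytes
print("rankings identical:", bool((rank_packed == rank_float).all()), "| RAM ratio:", ram_ratio)
```

For ±1 vectors the dot product equals `D - 2 * hamming`, so ordering by Hamming distance on packed bits and ordering by float dot product cannot disagree; float32 spends 32 bits per element where the packed form spends one.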

**Guided (recommended)** — one interactive script that explains every step,
verifies your API key live, runs a real smoke test and shows how to use the
agent (concept: [MavKa](https://github.com/MozgAI/MavKa) by MozgAI):

```
bash install.sh
```

**Manual:**

```
pip install -e '.[anthropic]'      # Claude backend (official Anthropic SDK)
pip install -e '.[openai-compat]'  # DeepSeek / OpenAI-compatible (httpx)
pip install -e '.[embed]'          # semantic memory (sentence-transformers)
pip install -e '.[dev]'            # + pytest
```

Python ≥ 3.11.

```
# Quick start: one provider for everything
export CODER_PROVIDER=deepseek
export DEEPSEEK_API_KEY=sk-...
coder "create a FastAPI hello-world app and run it"

# Split setup: the cheap provider executes, the smart provider reasons
export CODER_PROVIDER=deepseek          # execution (code mode, tool loop)
export DEEPSEEK_API_KEY=sk-...
export CODER_REASON_PROVIDER=anthropic  # reasoning (text mode)
export ANTHROPIC_API_KEY=sk-ant-...
coder                                    # REPL: /text plans on Claude, /code executes on DeepSeek
```

The expensive model is used only where it earns its keep; the token-heavy tool loop runs on the cheap one.

```
coder                 # interactive REPL (default mode: code)
coder "<task>"        # headless: run one task, print result, exit
```

In the REPL: `/code` agentic coding, `/text` reasoning chat, `/ctx` toggle context-first, `/learn` toggle auto-distill, `/council <question>` two-provider debate → consensus, `/quit` to exit. Memory persists per project at `<workdir>/.raidho/memory`; the REPL shows how many facts it loaded on start.

``` python
import asyncio
from agent.providers import get_provider
from agent.loop import Session
from agent.memory import AgentMemory

reason = get_provider({"provider": "anthropic", "api_key": "sk-ant-..."})        # smart
execute = get_provider({"provider": "deepseek",  "api_key": "sk-...",            # cheap
                        "model": "deepseek-chat"})

# path=... makes memory persist across runs (omit it for an in-RAM, ephemeral memory)
memory = AgentMemory(path=".raidho/memory")
session = Session(execute, workdir=".", memory=memory, reason_provider=reason)

asyncio.run(session.chat("plan how to add auth to this app"))   # → reason provider
asyncio.run(session.code("implement the plan and add a test"))  # → execution provider
# facts the agent stored are now on disk; a new AgentMemory(path=...) reloads them
```

Omit `reason_provider` and both modes use the single provider.

``` python
import asyncio
from agent.council import Council

# `reason`, `execute` and `session` are the objects built in the previous example.
async def main():
    council = Council(reason, execute, name_a="claude", name_b="deepseek")
    result = await council.consensus("pin exact deps or use ranges?", rounds=2)
    print(result["verdict"])      # points of agreement / residual disagreements / recommendation
    # result["transcript"] holds the full exchange

    # Via a Session with memory, the verdict is auto-distilled into facts and stored:
    res = await session.council("pin exact deps or use ranges?")
    print(res["remembered"])      # e.g. [("dependencies", "pinned", "exact")], recalled later

asyncio.run(main())
```

Or `Session(...).council("...")`, which seats `reason_provider` vs `provider`.
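The debate-then-distill flow behind council mode can be pictured roughly as follows. This is a minimal sketch, not Raidho's implementation: `run_council`, the `ask_*` callables and the stub providers are hypothetical stand-ins so the example runs offline; only the `verdict` and `transcript` result keys are taken from the example above.

```python
import asyncio

async def run_council(ask_a, ask_b, distill, question, rounds=2):
    """Alternate two debaters for `rounds` rounds, then distill a verdict."""
    transcript = []      # list of (speaker, message) pairs
    previous = None      # the opponent's latest message; None on the first turn
    for _ in range(rounds):
        for speaker, ask in (("a", ask_a), ("b", ask_b)):
            message = await ask(question, previous)
            transcript.append((speaker, message))
            previous = message
    # A neutral pass sees the whole exchange and summarizes it.
    verdict = await distill(question, transcript)
    return {"verdict": verdict, "transcript": transcript}

# Stub "providers": real ones would call an LLM API instead.
async def stub_a(question, previous):
    return "pin exact versions for reproducible builds"

async def stub_b(question, previous):
    if previous is None:
        return "ranges ease upgrades"
    return f"partly agree with: {previous}"

async def stub_distill(question, transcript):
    return f"{len(transcript)} turns reviewed; recommendation: pin exact, review regularly"

result = asyncio.run(
    run_council(stub_a, stub_b, stub_distill, "pin exact deps or use ranges?", rounds=2)
)
print(result["verdict"])
```

Keeping the distill step as a separate callable is what makes the final pass "neutral": it can run on either provider, or a third one, without taking part in the debate itself.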

| Variable | Meaning | Default |
|---|---|---|
| `CODER_PROVIDER` | execution provider: `anthropic`, `deepseek`, `openai`, or `openai-compat` | `anthropic` |
| `CODER_MODEL` | override execution model | provider default |
| `CODER_REASON_PROVIDER` | optional separate provider for `text` / reasoning | = `CODER_PROVIDER` |
| `CODER_REASON_MODEL` | reasoning model | provider default |
| `CODER_BASE_URL` | endpoint URL for `openai-compat` | none |
| `CODER_CONTEXT_FIRST` | `1` packs the workspace into the first call (fewer tool iterations) | off |
| `CODER_AUTODISTILL` | `1` learns read-only procedures from successful runs (gets cheaper with repetition) | off |
| `CODER_MEMORY` | memory file path; `off` disables persistence | `<workdir>/.raidho/memory` |
| `ANTHROPIC_API_KEY` / `DEEPSEEK_API_KEY` / `OPENAI_API_KEY` / `CODER_API_KEY` | API keys (provider-specific first, then `CODER_API_KEY`) | none |
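The defaults and fallbacks in the table can be read as a small resolution procedure. The sketch below is an assumption-based illustration, not Raidho's config loader: `resolve_settings`, the `KEY_VARS` mapping and the returned dictionary keys are hypothetical, and `openai-compat` is assumed to rely on `CODER_API_KEY` only.

```python
from pathlib import Path

# Assumed mapping from provider name to its provider-specific key variable.
KEY_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "openai": "OPENAI_API_KEY",
}

def resolve_settings(env, workdir="."):
    """Apply the table's defaults to a mapping of environment variables."""
    provider = env.get("CODER_PROVIDER", "anthropic")
    # The reasoning provider falls back to the execution provider.
    reason_provider = env.get("CODER_REASON_PROVIDER", provider)

    def api_key(name):
        # Provider-specific key first, then the generic CODER_API_KEY.
        specific = env.get(KEY_VARS.get(name, ""))
        return specific or env.get("CODER_API_KEY")

    memory = env.get("CODER_MEMORY")
    if memory == "off":
        memory_path = None                      # persistence disabled
    else:
        memory_path = memory or str(Path(workdir) / ".raidho" / "memory")

    return {
        "provider": provider,
        "reason_provider": reason_provider,
        "api_key": api_key(provider),
        "reason_api_key": api_key(reason_provider),
        "context_first": env.get("CODER_CONTEXT_FIRST") == "1",
        "autodistill": env.get("CODER_AUTODISTILL") == "1",
        "memory_path": memory_path,
    }

settings = resolve_settings({
    "CODER_PROVIDER": "deepseek",
    "DEEPSEEK_API_KEY": "sk-demo",
    "CODER_REASON_PROVIDER": "anthropic",
    "CODER_API_KEY": "sk-fallback",
})
print(settings)
```

In real use the mapping passed in would be `os.environ`; taking it as a parameter keeps the function easy to test.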

See [docs/PROVIDERS.md](/vitaliyfedotovpro-art/raidho/blob/main/docs/PROVIDERS.md) for adding a provider and the auth hook.

- [docs/ARCHITECTURE.md](/vitaliyfedotovpro-art/raidho/blob/main/docs/ARCHITECTURE.md): components and data flow.
- [docs/MEMORY.md](/vitaliyfedotovpro-art/raidho/blob/main/docs/MEMORY.md): the VSA memory model and bit-packing.
- [docs/OPENWEBUI.md](/vitaliyfedotovpro-art/raidho/blob/main/docs/OPENWEBUI.md): drive Raidho from Open WebUI (chat / council / code as selectable models).

- Broader benchmark coverage (success rate on a task set vs. single-model baseline; SWE-bench-style eval). A first real-API cost benchmark with evidence is already in `benchmarks/` + `evidence/`.
- Streaming responses in the Open WebUI plugin (currently the reply lands at once).

Recently shipped: persistent memory across runs · council verdicts saved as facts · context-first mode · auto-picked semantic embedder · automatic Open WebUI setup.

The `bash` tool runs **unsandboxed** in the working directory; in `code` mode the model decides which commands to run. Use Raidho only on code and tasks you trust, ideally inside a container or a throwaway directory. See [SECURITY.md](/vitaliyfedotovpro-art/raidho/blob/main/SECURITY.md).
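One way to follow the "throwaway directory" advice is to hand the agent a scratch copy of the project instead of the original. A minimal sketch using only the standard library; `throwaway_copy` is a hypothetical helper, not something Raidho ships:

```python
import shutil
import tempfile
from pathlib import Path

def throwaway_copy(project_dir):
    """Copy project_dir into a fresh temp directory and return the copy's path."""
    scratch = Path(tempfile.mkdtemp(prefix="raidho-scratch-"))
    target = scratch / "workdir"
    shutil.copytree(project_dir, target)   # target must not exist yet
    return target

# Usage: copy the current project, then run the agent from inside the copy.
# workdir = throwaway_copy(".")
# print(f"cd {workdir} && coder '<task>'")
```

A scratch copy only protects the original project files. Because the `bash` tool is unsandboxed, commands can still reach anything else your user account can, which is why a container remains the stronger option.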

Dual-licensed: **AGPL-3.0-or-later** for open-source / research / non-commercial use,
or a commercial license — see [COMMERCIAL.md](/vitaliyfedotovpro-art/raidho/blob/main/COMMERCIAL.md).

See [CONTRIBUTING.md](/vitaliyfedotovpro-art/raidho/blob/main/CONTRIBUTING.md). Issues and pull requests welcome.

- [Open WebUI](https://github.com/open-webui/open-webui): the web interface Raidho plugs into. It's an excellent, polished chat UI and a perfect fit for this agent; rather than reinvent it, Raidho ships a Pipe plugin and the installer can wire itself in automatically. Thanks to the Open WebUI team.
- [Oles Lytvyn (MozgAI)](https://github.com/MozgAI): this project's critic throughout its path. His reviews shaped the retry layer, the embedder honesty, the history budget and more. The guided installer (`install.sh`) follows the concept he pioneered in [MavKa](https://github.com/MozgAI/MavKa): an installer that explains everything out of the box ("AI installs itself").
