# ICECODE — Self-hosted AI Agent Platform: Multi-Agent Swarm, Local RAG, 26-Page Web UI, 34 Platform Gateways

> Source: <https://dev.to/soros_02159c95a2582796088/icecode-self-hosted-ai-agent-platform-multi-agent-swarm-local-rag-26-page-web-ui-34-platform-ig3>
> Published: 2026-05-21 05:36:50+00:00

ICECODE is a self-hosted AI agent platform I built by unifying three open-source projects (Hermes, OpenCode, ClawX) into one cohesive system. It runs entirely on your machine — no data leaves your network.

## What it does

- **Multi-Agent Swarm**: pipeline and parallel orchestration; agents collaborate, pass context, and vote on answers
- **Local RAG**: FAISS + sentence-transformers, 100% offline; index any file, search semantically
- **26-Page Web UI**: dashboard, chat, kanban, goals, swarm, knowledge, MCP, benchmark, and more
- **Cost Optimizer**: semantic cache (skips duplicate API calls), context compressor, smart model router
- **174 REST API routes** with auto-docs (FastAPI)
- **34 platform gateways**: WhatsApp, Telegram, Discord, Slack, Email, WeChat, Feishu, and 27 more
- **14 LLM providers**: Anthropic, OpenAI, Ollama, Gemini, Mistral, Bedrock, Azure, OpenRouter...
- **90+ agent tools**: file, web, browser, terminal, vision, kanban, MCP, code execution
- **Self-learning skills**: agents learn new skills at runtime, stored across sessions
- **Reinforcement learning** environment for agent improvement
- **WebSocket chat**: bidirectional with cancel support
- **Token tracking + cost**: per-session usage across all providers

## Quick start

    ╔══════════════════════════════════════════════════════╗
    ║ ICECODE Super-Agent Network — Installer ║
    ║ v2.0.0 ║
    ╚══════════════════════════════════════════════════════╝

    → Checking Python version...
    ✓ Python 3.12 found
    ✓ Virtual environment already exists
    → Installing Python dependencies...
    Starting ICECODE server on http://localhost:13210...
    23:36:33 | INFO | React UI served at /desktop/
    INFO: Started server process [423207]
    INFO: Waiting for application startup.
    23:36:33 | INFO | ============================================================
    23:36:33 | INFO | ICECODE Super-Agent Network v2.0.0 starting...
    23:36:33 | INFO | Port: 13210
    23:36:33 | INFO | DB: ~/.icecode/data/icecode.db
    23:36:33 | INFO | Home: /home/claudiu/.icecode
    23:36:33 | INFO | ============================================================
    23:36:33 | INFO | [✓] Database initialized at ~/.icecode/data/icecode.db
    23:36:33 | INFO | [✓] Self-learning system ready
    23:36:33 | INFO | [✓] Cron scheduler ready
    23:36:33 | INFO | [✓] Goals system (Ralph Loop) ready
    23:36:33 | INFO | [✓] Knowledge auto-index task started
    23:36:33 | INFO | All ICECODE systems online.
    INFO: Application startup complete.
    ERROR: [Errno 98] error while attempting to bind on address ('0.0.0.0', 13210): address already in use
    INFO: Waiting for application shutdown.
    23:36:33 | INFO | ICECODE shutting down...
    INFO: Application shutdown complete.
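
The run captured above ends with `[Errno 98] ... address already in use`, which usually means another process (often an earlier ICECODE instance) is still listening on port 13210. Here is a minimal sketch for checking the port before starting the server. The helper name `port_is_free` is my own and not part of ICECODE.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind to (host, port), False otherwise.

    Hypothetical helper for illustration; ICECODE does not ship this function.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Errno 98 (EADDRINUSE) on Linux lands here, as do permission errors.
            return False
        return True


if __name__ == "__main__":
    # 13210 is the port shown in the startup log above.
    print("port 13210 free:", port_is_free(13210))
```

If this prints `False`, stop the process holding the port before launching again.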

## Architecture

## Cost Optimizer (new in v2)

The cost optimizer has three components that work together automatically:

- **Semantic Cache**: uses sentence-transformer embeddings and cosine similarity (threshold ≥ 0.92). If you ask a question semantically similar to a previous one, it returns the cached answer instantly, with zero API tokens consumed.
- **Context Compressor**: when conversation history exceeds 3000 tokens, older messages are summarized instead of sent verbatim. The last 6 messages are kept intact.
- **Smart Model Router**: analyzes prompt complexity (score 1-10) and routes to the cheapest capable model. Simple questions go to cheap models; complex reasoning goes to powerful ones.
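
To make the three components concrete, here is a rough sketch of how they could work. It is not ICECODE's actual code. The 0.92 threshold, the 3000-token budget, and the 6-message tail come from the description above. Everything else is an assumption for illustration: the class and function names, the whitespace token counter, the caller-supplied `embed` and `summarize` functions, the model names, and the routing cutoff of 5.

```python
import math
from dataclasses import dataclass, field
from typing import Callable

SIMILARITY_THRESHOLD = 0.92  # from the article
MAX_HISTORY_TOKENS = 3000    # from the article
KEEP_LAST_MESSAGES = 6       # from the article


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class SemanticCache:
    """Keeps (embedding, answer) pairs; returns the best match above the threshold."""
    embed: Callable[[str], list]  # assumed: any text -> vector function
    entries: list = field(default_factory=list)

    def get(self, prompt):
        query = self.embed(prompt)
        best_score, best_answer = 0.0, None
        for vector, answer in self.entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= SIMILARITY_THRESHOLD else None

    def put(self, prompt, answer):
        self.entries.append((self.embed(prompt), answer))


def count_tokens(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def compress_history(messages, summarize):
    """Replace older messages with one summary once the token budget is exceeded."""
    total = sum(count_tokens(m) for m in messages)
    if total <= MAX_HISTORY_TOKENS or len(messages) <= KEEP_LAST_MESSAGES:
        return list(messages)
    older = messages[:-KEEP_LAST_MESSAGES]
    recent = messages[-KEEP_LAST_MESSAGES:]
    return [summarize(older)] + recent


def route_model(complexity, cheap="cheap-model", powerful="powerful-model", cutoff=5):
    """Pick a model from a 1-10 complexity score; names and cutoff are placeholders."""
    if not 1 <= complexity <= 10:
        raise ValueError("complexity must be between 1 and 10")
    return cheap if complexity <= cutoff else powerful
```

In a real setup, `embed` would wrap a sentence-transformers model and `summarize` would call a cheap LLM. The complexity score is taken as an input here because the article does not describe how it is computed.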

## Multi-Agent Swarm

Two orchestration modes:

- **Pipeline**: agents run sequentially, each building on the previous output.
- **Parallel**: all agents receive the same input simultaneously, and the results are merged.

Built-in templates: Research & Write, Code Review, Brainstorm, Security Audit.
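
The two orchestration modes can be sketched with plain `asyncio`. This is an illustration of the idea, not ICECODE's swarm API. The function names, the `make_agent` stand-in, and the newline-join merge step are all assumptions.

```python
import asyncio


def make_agent(name):
    """Build a stand-in agent that tags its input; a real agent would call an LLM."""
    async def agent(text):
        return f"{name}({text})"
    return agent


async def run_pipeline(agents, task):
    """Pipeline mode: each agent receives the previous agent's output."""
    output = task
    for agent in agents:
        output = await agent(output)
    return output


async def run_parallel(agents, task, merge="\n".join):
    """Parallel mode: every agent gets the same input, then results are merged."""
    # gather() runs the agents concurrently and keeps results in agent order.
    results = await asyncio.gather(*(agent(task) for agent in agents))
    return merge(results)


if __name__ == "__main__":
    agents = [make_agent("research"), make_agent("write")]
    print(asyncio.run(run_pipeline(agents, "topic")))
    print(asyncio.run(run_parallel(agents, "topic")))
```

The merge step is where the two modes differ most in practice. A newline join is the simplest option; a voting or summarizing merge could be passed in as `merge` instead.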

## Local RAG

No API keys needed. No data leaves your machine.

- Supports , , , , , , ,
- Chunk strategy: sliding window (512 tokens, 50 overlap)
- Embeddings: (90MB, runs locally)
- Vector store: FAISS IndexFlatL2
- Persistence:
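
The chunking and search steps listed above can be sketched in pure Python. The 512-token window and 50-token overlap come from the list. The function names are hypothetical, tokens are treated as simple list items, and `l2_search` is a brute-force stand-in for the exact squared-L2 search that FAISS `IndexFlatL2` performs, so the example runs without FAISS installed.

```python
CHUNK_SIZE = 512  # tokens per chunk, from the article
OVERLAP = 50      # tokens shared by neighbouring chunks, from the article


def sliding_window_chunks(tokens, size=CHUNK_SIZE, overlap=OVERLAP):
    """Split a token list into windows of `size` that overlap by `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


def l2_search(query, vectors, k=3):
    """Return indices of the k vectors closest to `query` by squared L2 distance."""
    scored = [
        (sum((q - v) ** 2 for q, v in zip(query, vec)), idx)
        for idx, vec in enumerate(vectors)
    ]
    return [idx for _, idx in sorted(scored)[:k]]


if __name__ == "__main__":
    tokens = ("lorem ipsum " * 500).split()  # 1000 whitespace tokens
    chunks = sliding_window_chunks(tokens)
    print(len(chunks), [len(c) for c in chunks])
```

With the default settings each new window starts 462 tokens after the previous one, so a sentence cut at a chunk boundary still appears whole in the neighbouring chunk.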


## Tech stack

- **Backend**: Python 3.12 · FastAPI · Uvicorn · SQLite · Pydantic v2 · FAISS · sentence-transformers · Loguru
- **Frontend**: Single HTML file, no build step · Pure JS ES2022 · CSS variables · Server-Sent Events + WebSocket
- **TypeScript**: pnpm workspaces · turbo · CLI with Ink TUI · 14 LLM provider protocols · MCP client
- **Desktop**: Electron 33
- **Infrastructure**: Docker · GitHub Actions CI · pytest (108 tests) · ruff

## Links

- GitHub: [https://github.com/iceslim409/icecode](https://github.com/iceslim409/icecode)
- Demo GIF in README shows all major pages
- MIT-style non-commercial license (ICECODE-NC-1.0)

If you try it, I'd love to hear what you think. Issues and PRs welcome.