ICECODE — Self-hosted AI Agent Platform: Multi-Agent Swarm, Local RAG, 26-Page Web UI, 34 Platform Gateways

ICECODE is a self-hosted AI agent platform that integrates three open-source projects (Hermes, OpenCode, ClawX) into a single system running entirely on a local machine, ensuring no data leaves the user's network. It features a multi-agent swarm with pipeline and parallel orchestration modes, local RAG capabilities, a 26-page web UI, and support for 34 platform gateways. The platform includes a cost optimizer with semantic caching, context compression, and a smart model router, and requires no API keys to operate.

ICECODE is a self-hosted AI agent platform I built by unifying three open-source projects (Hermes, OpenCode, ClawX) into one cohesive system. It runs entirely on your machine; no data leaves your network.

What it does

- Multi-Agent Swarm: pipeline and parallel orchestration; agents collaborate, pass context, and vote on answers
- Local RAG: FAISS + sentence-transformers, 100% offline; index any file, search semantically
- 26-Page Web UI: dashboard, chat, kanban, goals, swarm, knowledge, MCP, benchmark, and more
- Cost Optimizer: semantic cache (skip duplicate API calls), context compressor, smart model router
- 174 REST API routes with auto-docs (FastAPI)
- 34 platform gateways: WhatsApp, Telegram, Discord, Slack, Email, WeChat, Feishu, and 27 more
- 14 LLM providers: Anthropic, OpenAI, Ollama, Gemini, Mistral, Bedrock, Azure, OpenRouter...
- 90+ agent tools: file, web, browser, terminal, vision, kanban, MCP, code execution
- Self-learning skills: agents learn new skills at runtime, stored across sessions
- Reinforcement learning environment for agent improvement
- WebSocket chat: bidirectional with cancel support
- Token tracking + cost: per-session usage across all providers

Quick start

    ╔══════════════════════════════════════════════════════╗
    ║       ICECODE Super-Agent Network — Installer        ║
    ║                        v2.0.0                        ║
    ╚══════════════════════════════════════════════════════╝
    → Checking Python version...
    ✓ Python 3.12 found
    ✓ Virtual environment already exists
    → Installing Python dependencies...
    Starting ICECODE server on http://localhost:13210 ...
    23:36:33 | INFO | React UI served at /desktop/
    INFO: Started server process [423207]
    INFO: Waiting for application startup.
    23:36:33 | INFO | ============================================================
    23:36:33 | INFO | ICECODE Super-Agent Network v2.0.0 starting...
    23:36:33 | INFO | Port: 13210
    23:36:33 | INFO | DB: ~/.icecode/data/icecode.db
    23:36:33 | INFO | Home: /home/claudiu/.icecode
    23:36:33 | INFO | ============================================================
    23:36:33 | INFO | ✓ Database initialized at ~/.icecode/data/icecode.db
    23:36:33 | INFO | ✓ Self-learning system ready
    23:36:33 | INFO | ✓ Cron scheduler ready
    23:36:33 | INFO | ✓ Goals system Ralph Loop ready
    23:36:33 | INFO | ✓ Knowledge auto-index task started
    23:36:33 | INFO | All ICECODE systems online.
    INFO: Application startup complete.
    ERROR: [Errno 98] error while attempting to bind on address ('0.0.0.0', 13210): address already in use
    INFO: Waiting for application shutdown.
    23:36:33 | INFO | ICECODE shutting down...
    INFO: Application shutdown complete.
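
The log above ends with Errno 98: another process was already bound to port 13210, so this run shut down right after startup. As a rough sketch, the port can be checked before launching with nothing but the Python standard library. The helper name `port_is_free` is hypothetical and not part of ICECODE; only the port number comes from the log.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts a TCP connection on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 when a listener accepted the connection,
        # and a non-zero error code (e.g. connection refused) otherwise.
        return probe.connect_ex((host, port)) != 0


if __name__ == "__main__":
    port = 13210  # port number taken from the startup log
    state = "free" if port_is_free(port) else "already in use"
    print(f"Port {port} is {state}")
```

If the port is reported as in use, the likely cause is an earlier ICECODE instance that is still running; stopping it before starting a new one should avoid the bind error.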
Architecture

Cost Optimizer (new in v2)

The cost optimizer has three components that work together automatically:

Semantic Cache: uses sentence-transformer embeddings + cosine similarity (≥0.92 threshold). If you ask a question that is semantically similar to a previous one, it returns the cached answer instantly, with zero API tokens consumed.

Context Compressor: when conversation history exceeds 3000 tokens, older messages are summarized instead of sent verbatim. Keeps the last 6 messages intact.

Smart Model Router: analyzes prompt complexity (score 1-10) and routes to the cheapest capable model. Simple questions go to cheap models, complex reasoning goes to powerful ones.

Multi-Agent Swarm

Two orchestration modes:

Pipeline: agents run sequentially, each building on the previous output.

Parallel: all agents receive the same input simultaneously, results are merged.

Built-in templates: Research & Write, Code Review, Brainstorm, Security Audit.

Local RAG

No API keys needed. No data leaves your machine.

- Supports , , , , , , ,
- Chunk strategy: sliding window, 512 tokens, 50 overlap
- Embeddings: 90MB, runs locally
- Vector store: FAISS IndexFlatL2
- Persistence:

Index a directory:

    {"detail":[{"type":"json_invalid","loc":["body",0],"msg":"JSON decode error","input":{},"ctx":{"error":"Expecting value"}}]}

Tech stack

Backend: Python 3.12 · FastAPI · Uvicorn · SQLite · Pydantic v2 · FAISS · sentence-transformers · Loguru

Frontend: Single HTML file, no build step · Pure JS (ES2022) · CSS variables · Server-Sent Events + WebSocket

TypeScript: pnpm workspaces · turbo · CLI with Ink TUI · 14 LLM provider protocols · MCP client

Desktop: Electron 33

Infrastructure: Docker · GitHub Actions CI · pytest (108 tests) · ruff

Links

- GitHub: https://github.com/iceslim409/icecode
- Demo GIF in README shows all major pages
- MIT-style non-commercial license (ICECODE-NC-1.0)

If you try it, I'd love to hear what you think. Issues and PRs welcome.
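
For readers curious how the semantic cache rule from the Cost Optimizer section might look in code, here is a minimal sketch, not ICECODE's actual implementation. It assumes embeddings are plain lists of floats; in the real system they would come from a sentence-transformers model, which is replaced here by hand-written toy vectors so the example runs without any downloads. The class `SemanticCache` and its methods are hypothetical names; only the 0.92 cosine threshold is taken from the description above.

```python
import math
from typing import Optional

SIMILARITY_THRESHOLD = 0.92  # cutoff quoted in the Cost Optimizer section


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by embedding similarity, not exact text."""

    def __init__(self, threshold: float = SIMILARITY_THRESHOLD) -> None:
        self.threshold = threshold
        self._entries: list[tuple[list[float], str]] = []

    def put(self, embedding: list[float], answer: str) -> None:
        """Store an answer together with the embedding of its question."""
        self._entries.append((embedding, answer))

    def get(self, embedding: list[float]) -> Optional[str]:
        """Return the best-matching cached answer, or None on a cache miss."""
        best_score = -1.0
        best_answer: Optional[str] = None
        for stored, answer in self._entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


# Toy 3-dimensional vectors standing in for real sentence embeddings.
cache = SemanticCache()
cache.put([1.0, 0.0, 0.2], "cached answer")

near_duplicate = [0.9, 0.1, 0.2]  # points almost the same way as the stored vector
unrelated = [0.0, 1.0, 0.0]       # orthogonal to the stored vector

print(cache.get(near_duplicate))  # hit: similarity is above the threshold
print(cache.get(unrelated))       # miss: prints None
```

A linear scan like this is only reasonable for a small cache. Since FAISS is already part of the stack, a larger cache could plausibly use a FAISS index for the lookup instead, though whether ICECODE does so is not stated in the post.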