Show HN: Phlox – Open-source self-hosted agentic web chat

wpnews.pro

Phlox is a self-hostable chat application with an agentic harness, document RAG, code execution, and MCP integration — running over any model provider: AWS Bedrock or any OpenAI-compatible endpoint (OpenAI, Ollama, vLLM, LiteLLM, LM Studio, local models).

💬 Streaming chat with conversation history, rename/delete, search & export, message edit/regenerate, markdown with highlighted/copyable code,Mermaid diagrams andLaTeX math. - 🤖 Agentic harness(inspired by PI Coder): the model uses tools in a loop — filesystem (read_file

/write_file

/edit_file

/glob

/grep

),run_shell

,execute_python

/execute_node

,search_documents

,web_fetch

, plusplanning(update_todos

),sub-agents(spawn_subagent

),memory(save_memory

), andcheckpoints— each scoped to a per-conversation sandboxed workspace. - 🤝 Human-in-the-loop approvals— on sensitive tools, approve/deny, resume. - 🧰 Code execution with captured output andartifacts shown inline + aWorkspace Files panel to browse/download everything the agent created. - 🗂️ Workspace checkpoints— git-backed snapshots with one-click restore. - 📚 Documents / RAG— upload PDF/DOCX/TXT/MD/code;** hybrid (dense+sparse) searchover Qdrantwith reranking + citations; global or per-conversation scoping. Works offline via a fallback embedder. - 🧠 Cross-conversation memory— durable facts recalled across chats. - 🖼️ Multimodal— attach images to messages for vision models. - 🔌 MCP integration— connect Model Context Protocol servers; their tools join automatically. - 🔀 Any provider— named profiles for Bedrock / OpenAI-compatible endpoints, switchable live, with a connection tester. - 🏠 Runs fully local— point at Ollama**,** LM Studio**, or** vLLM**(any OpenAI-compatible server) for offline, self-hosted inference with no cloud API key; RAG embeddings can run locally too. - 🔐 Auth & multi-user— local accounts (or** Entra ID SSO**),user

/admin

roles, per-user data isolation, anadmin panel(users, MCP, tools, auth). Seedocs/AUTH.md. - 💵 Usage & cost accounting— per-message token/cost in the UI, plus an admin** chargebackview: usage by month × user × department × model**, CSV export for finance, and a durable ledger that keeps a departed user's costs billable after their account is deleted. Seedocs/OBSERVABILITY.md. - ⚙️ Live admin configuration— edit provider profiles (keys write-only), model pricing, resilience, generation defaults, and sandbox limits from an admin-onlyConfiguration panel, applied without a server restart.config.yml

remains the seed. - 📦 Container sandbox— run code in an isolated** Podman/Docker**container with resource limits + network isolation. Seedocs/SANDBOX.md. - 🎨 Theming— Phlox Dark (default) + Phlox Light/Light/Dark/Fred Hutch/Hutch Night/Sandstone, instant switching. Seedocs/THEMING.md. - 🛡️ Per-tool permissions—auto | ask | deny

, with an "Agent mode" toggle.

Doc	What it covers

start heredocs/ROADMAP.md docs/AUTH.mdEntra ID SSO setupdocs/SANDBOX.mdPodman/Docker container code-execution sandboxdocs/OBSERVABILITY.md docs/MCP.md docs/THEMING.md docs/ADDING_A_TOOL.md·docs/ADDING_A_PROVIDER.md AGENTS.mdTwo processes: a FastAPI backend (LLM orchestration, agent harness, MCP, RAG, code exec, auth, SQLite persistence) and a React/Vite frontend. Full details in ** docs/ARCHITECTURE.md**.

backend/   FastAPI app (app/), config.yml, SQLite + Qdrant under data/
frontend/  React + Vite + Tailwind SPA
docs/      ARCHITECTURE, ROADMAP, AUTH, SANDBOX, MCP, THEMING, ADDING_A_*
scripts/   dev.ps1 / dev.sh

Prerequisites: Python 3.11+ with uv,

Node 18+, and a model provider (a local

Ollamais the easiest).

cd backend
uv sync
cp config.yml.example config.yml        # edit: set your provider profile(s)
uv run uvicorn app.main:app --reload --port 8000

cd frontend
npm install
npm run dev                              # open http://localhost:5173

On Windows you can run both with ./scripts/dev.ps1

; on macOS/Linux ./scripts/dev.sh

.

Edit backend/config.yml

(full examples in config.yml.example

). Any OpenAI-compatible server works with type: openai

— just point endpoint

at it. That covers the popular local runtimes, so Phlox can run entirely offline with no cloud API key:

default_profile: local-ollama
profiles:
  local-ollama:
    type: openai
    label: "Ollama (local)"
    endpoint: http://localhost:11434/v1
    api_key: ollama            # required by the client, ignored by Ollama
    model: qwen3.6:35b
    models: [qwen3.6:35b, glm-4.7-flash:latest]
    supports_tools: true       # set false for models without tool-calling

  lmstudio:
    type: openai
    label: "LM-Studio (local)"
    endpoint: http://localhost:1234/v1
    api_key: none            # required by the client, ignored by LM-Studio
    model: qwen/qwen3.6-27b
    models: [qwen/qwen3.6-27b]
    supports_tools: true       # set false for models without tool-calling

The same type: openai

shape also covers OpenAI, LiteLLM, and any other OpenAI-compatible gateway — set the endpoint

and api_key

. For AWS Bedrock, use type: bedrock

with a model

id and aws_region

(credentials resolve via the standard AWS chain; for temporary STS creds also set aws_session_token

).

Define as many profiles as you like and switch between them live in Settings → Model (there's a built-in connection tester). Embeddings for document RAG can also run locally — e.g. Ollama's nomic-embed-text

— so the whole stack stays offline.

Edit config without a restart.config.yml

is the seed; an admin can edit provider profiles, model pricing, resilience, generation defaults, and sandbox limitsliveinSettings → (Admin) Configuration(overrides are stored in the DB and applied immediately). API keys there are write-only/masked. Bootstrap-sensitive settings (auth

,vector_store

, the sandbox runnertype, OTel) stay file-only and need a restart. See[docs/AUTH.md]§admin config.

Auth is on by default with a seeded admin: ** admin / admin**. Manage users, reset passwords, and view/configure SSO under

Settings → (Admin) Users / Authentication.

Change the default admin password and set a real before sharing access — see

auth.jwt_secret

docs/AUTH.md. To run single-user with no login, set

auth.enabled: false

.By default code runs in a local subprocess (fast, trusts the host). For isolation, set sandbox.runner: container

to run each execution in an ephemeral Podman/Docker container with CPU/memory/PID limits and network isolation — see docs/SANDBOX.md.

cd frontend && npm run build      # outputs frontend/dist
cd ../backend && uv run uvicorn app.main:app --port 8000

FastAPI serves the built SPA from frontend/dist

at /

.

The backend has a pytest suite (unit + FastAPI TestClient

API tests + scripted-provider agent-loop/fallback tests); the frontend is verified by a production build. The same checks run in GitHub Actions CI (.github/workflows/ci.yml

) on every push/PR.

cd backend
uv sync --extra dev          # installs ruff + pytest
uv run ruff check app tests
uv run pytest                # or: uv run pytest -k usage   to run a subset

cd ../frontend && npm run build

The tests run against an in-memory/temp SQLite DB with auth.enabled

off (a synthetic dev admin), so no provider credentials or network are needed — agent-loop tests use a built-in scripted "test" provider. Coverage includes the chargeback ledger surviving user deletion (tests/test_api.py::test_usage_ledger_survives_user_deletion

).

backend/evals/run_evals.py

exercises the agent against a real configured provider (tool use, RAG, multi-step). It needs a working config.yml

profile and is not part of CI:

cd backend && uv run python -m evals.run_evals

Auth: change the seededadmin

/admin

and set a strongauth.jwt_secret

(envPHLOX_JWT_SECRET

) before any shared use. Data is isolated per user; admin features are role-gated.Sandbox: the local runner trusts the host (fine for single-user/local). For untrusted/multi-user execution usesandbox.runner: container

(docs/SANDBOX.md).- Mutating/execution tools default to the permission policy; "Agent mode" auto-approves for a turn.ask

Sensitive data (PHI): Postgres, audit logging, secrets management, and data governance are tracked asTier 5 in theroadmapand are required before any deployment touching sensitive data.

source & further reading

github.com — original article

Show HN: Phlox – Open-source self-hosted agentic web chat

Run your AI side-project on zahid.host