Show HN: Phlox – Open-source self-hosted agentic web chat

Phlox, an open-source self-hosted agentic web chat application, has been released on GitHub. It supports any model provider, including AWS Bedrock and OpenAI-compatible endpoints, and features agentic tool use, document RAG, code execution, MCP integration, and multi-user authentication with cost accounting.

Phlox is a self-hostable chat application with an agentic harness, document RAG, code execution, and MCP integration, running over any model provider: AWS Bedrock or any OpenAI-compatible endpoint (OpenAI, Ollama, vLLM, LiteLLM, LM Studio, local models).

- 💬 Streaming chat with conversation history, rename/delete, search & export, message edit/regenerate, markdown with highlighted/copyable code, Mermaid diagrams and LaTeX math.
- 🤖 Agentic harness (inspired by PI Coder): the model uses tools in a loop, including filesystem (`read_file` / `write_file` / `edit_file` / `glob` / `grep`), `run_shell`, `execute_python` / `execute_node`, `search_documents`, `web_fetch`, plus planning (`update_todos`), sub-agents (`spawn_subagent`), memory (`save_memory`), and checkpoints, each scoped to a per-conversation sandboxed workspace.
- 🤝 Human-in-the-loop approvals: pause on sensitive tools, approve/deny, resume.
- 🧰 Code execution with captured output and artifacts shown inline, plus a Workspace Files panel to browse/download everything the agent created.
- 🗂️ Workspace checkpoints: git-backed snapshots with one-click restore.
- 📚 Documents / RAG: upload PDF/DOCX/TXT/MD/code; hybrid dense+sparse search over Qdrant with reranking + citations; global or per-conversation scoping. Works offline via a fallback embedder.
- 🧠 Cross-conversation memory: durable facts recalled across chats.
- 🖼️ Multimodal: attach images to messages for vision models.
- 🔌 MCP integration: connect Model Context Protocol servers; their tools join automatically.
- 🔀 Any provider: named profiles for Bedrock / OpenAI-compatible endpoints, switchable live, with a connection tester.
- 🏠 Runs fully local: point at Ollama, LM Studio, or vLLM (any OpenAI-compatible server) for offline, self-hosted inference with no cloud API key; RAG embeddings can run locally too.
- 🔐 Auth & multi-user: local accounts or Entra ID SSO, `user` / `admin` roles, per-user data isolation, an admin panel (users, MCP, tools, auth). See [docs/AUTH.md](/robert-mcdermott/phlox/blob/main/docs/AUTH.md).
- 💵 Usage & cost accounting: per-message token/cost in the UI, plus an admin chargeback view: usage by month × user × department × model, CSV export for finance, and a durable ledger that keeps a departed user's costs billable after their account is deleted. See [docs/OBSERVABILITY.md](/robert-mcdermott/phlox/blob/main/docs/OBSERVABILITY.md).
- ⚙️ Live admin configuration: edit provider profiles (keys write-only), model pricing, resilience, generation defaults, and sandbox limits from an admin-only Configuration panel, applied without a server restart. `config.yml` remains the seed.
- 📦 Container sandbox: run code in an isolated Podman/Docker container with resource limits + network isolation. See [docs/SANDBOX.md](/robert-mcdermott/phlox/blob/main/docs/SANDBOX.md).
- 🎨 Theming: Phlox Dark (default) + Phlox Light/Light/Dark/Fred Hutch/Hutch Night/Sandstone, instant switching. See [docs/THEMING.md](/robert-mcdermott/phlox/blob/main/docs/THEMING.md).
- 🛡️ Per-tool permissions: `auto | ask | deny`, with an "Agent mode" toggle.
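
The per-tool permission model in the list above (`auto | ask | deny`, plus the "Agent mode" toggle) can be pictured as a small decision function. The sketch below is illustrative only and is not taken from the Phlox codebase: the `Policy`, `Decision`, and `decide` names are hypothetical, the tool names are examples, and the choice to let `deny` win over Agent mode is an assumption rather than documented behavior.

```python
from enum import Enum


class Policy(str, Enum):
    AUTO = "auto"   # run without asking
    ASK = "ask"     # pause for human approval
    DENY = "deny"   # never run


class Decision(str, Enum):
    RUN = "run"
    NEEDS_APPROVAL = "needs_approval"
    BLOCKED = "blocked"


def decide(
    tool: str,
    policies: dict[str, Policy],
    agent_mode: bool = False,
    default: Policy = Policy.ASK,
) -> Decision:
    """Resolve what happens when the model requests `tool`.

    Assumption: an explicit `deny` is never overridden, while Agent mode
    upgrades `ask` to an automatic approval for the current turn.
    """
    policy = policies.get(tool, default)
    if policy is Policy.DENY:
        return Decision.BLOCKED
    if policy is Policy.AUTO or agent_mode:
        return Decision.RUN
    return Decision.NEEDS_APPROVAL


if __name__ == "__main__":
    policies = {"read_file": Policy.AUTO, "run_shell": Policy.ASK}
    for name in ("read_file", "run_shell"):
        print(name, decide(name, policies).value)
```

Falling back to `ask` for tools with no explicit policy keeps a newly added tool from running unattended, which fits the human-in-the-loop approval flow described above.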

| Doc | What it covers |
|---|---|
| | start here |
| [docs/ROADMAP.md](/robert-mcdermott/phlox/blob/main/docs/ROADMAP.md) | |
| [docs/AUTH.md](/robert-mcdermott/phlox/blob/main/docs/AUTH.md) | Entra ID SSO setup |
| [docs/SANDBOX.md](/robert-mcdermott/phlox/blob/main/docs/SANDBOX.md) | Podman/Docker container code-execution sandbox |
| [docs/OBSERVABILITY.md](/robert-mcdermott/phlox/blob/main/docs/OBSERVABILITY.md) | |
| [docs/MCP.md](/robert-mcdermott/phlox/blob/main/docs/MCP.md) | |
| [docs/THEMING.md](/robert-mcdermott/phlox/blob/main/docs/THEMING.md) | |
| [docs/ADDING_A_TOOL.md](/robert-mcdermott/phlox/blob/main/docs/ADDING_A_TOOL.md) · [docs/ADDING_A_PROVIDER.md](/robert-mcdermott/phlox/blob/main/docs/ADDING_A_PROVIDER.md) | |
| [AGENTS.md](/robert-mcdermott/phlox/blob/main/AGENTS.md) | |

Two processes: a FastAPI backend (LLM orchestration, agent harness, MCP, RAG, code exec, auth, SQLite persistence) and a React/Vite frontend. Full details in `docs/ARCHITECTURE.md`.

```
backend/    FastAPI app (app/), config.yml, SQLite + Qdrant under data/
frontend/   React + Vite + Tailwind SPA
docs/       ARCHITECTURE, ROADMAP, AUTH, SANDBOX, MCP, THEMING, ADDING_A_*
scripts/    dev.ps1 / dev.sh
```

Prerequisites: Python 3.11+ with [uv](https://docs.astral.sh/uv/), Node 18+, and a model provider (a local [Ollama](https://ollama.com) is the easiest).

```bash
# 1. Backend
cd backend
uv sync
cp config.yml.example config.yml   # edit: set your provider profile(s)
uv run uvicorn app.main:app --reload --port 8000

# 2. Frontend (separate terminal)
cd frontend
npm install
npm run dev                        # open http://localhost:5173
```

On Windows you can run both with `./scripts/dev.ps1`; on macOS/Linux, `./scripts/dev.sh`.

Edit `backend/config.yml` (full examples in `config.yml.example`). Any OpenAI-compatible server works with `type: openai`: just point `endpoint` at it.
That covers the popular local runtimes, so Phlox can run entirely offline with no cloud API key:

```yaml
default_profile: local-ollama
profiles:
  local-ollama:
    type: openai
    label: "Ollama (local)"
    endpoint: http://localhost:11434/v1
    api_key: ollama            # required by the client, ignored by Ollama
    model: qwen3.6:35b
    # Optional: restrict/seed the model dropdown. If omitted, /api/providers
    # tries to list models from the endpoint.
    models: [qwen3.6:35b, glm-4.7-flash:latest]
    supports_tools: true       # set false for models without tool-calling

  # LM Studio (local): enable its server under the "Developer" tab (default port 1234).
  lmstudio:
    type: openai
    label: "LM-Studio (local)"
    endpoint: http://localhost:1234/v1
    api_key: none              # required by the client, ignored by LM-Studio
    model: qwen/qwen3.6-27b
    # Optional: restrict/seed the model dropdown. If omitted, /api/providers
    # tries to list models from the endpoint.
    models: [qwen/qwen3.6-27b]
    supports_tools: true       # set false for models without tool-calling
```

The same `type: openai` shape also covers OpenAI, LiteLLM, and any other OpenAI-compatible gateway: set the `endpoint` and `api_key`. For AWS Bedrock, use `type: bedrock` with a `model_id` and `aws_region` (credentials resolve via the standard AWS chain; for temporary STS creds also set `aws_session_token`). Define as many profiles as you like and switch between them live in Settings → Model (there's a built-in connection tester).

Embeddings for document RAG can also run locally, e.g. Ollama's `nomic-embed-text`, so the whole stack stays offline.

Edit config without a restart. `config.yml` is the seed; an admin can edit provider profiles, model pricing, resilience, generation defaults, and sandbox limits live in Settings → Admin Configuration (overrides are stored in the DB and applied immediately). API keys there are write-only/masked. Bootstrap-sensitive settings (`auth`, `vector_store`, the sandbox runner type, OTel) stay file-only and need a restart. See docs/AUTH.md §admin config.
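
One way to picture the seed-plus-overrides behavior described above is a deep merge in which DB overrides win over `config.yml`, except for the file-only settings. This is a minimal sketch under stated assumptions, not Phlox's actual implementation: `apply_overrides`, `FILE_ONLY_PATHS`, and the key names `sandbox.runner`, `sandbox.timeout_s`, and `otel` are hypothetical and only illustrate the idea.

```python
import copy
from typing import Any

# Hypothetical paths for settings that stay file-only and need a restart.
FILE_ONLY_PATHS = {("auth",), ("vector_store",), ("sandbox", "runner"), ("otel",)}


def apply_overrides(
    seed: dict[str, Any],
    overrides: dict[str, Any],
    _path: tuple[str, ...] = (),
) -> dict[str, Any]:
    """Return the effective config: seed values with live overrides on top.

    Overrides that target a file-only path are ignored, so those settings
    can only change by editing the file. Neither input is mutated.
    """
    merged = copy.deepcopy(seed)
    for key, value in overrides.items():
        path = _path + (key,)
        if path in FILE_ONLY_PATHS:
            continue  # file-only: keep whatever the seed says
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value, path)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    seed = {"sandbox": {"runner": "local", "timeout_s": 30}, "auth": {"enabled": True}}
    live = {"sandbox": {"runner": "container", "timeout_s": 60}, "auth": {"enabled": False}}
    print(apply_overrides(seed, live))
```

Recomputing the effective config from an untouched seed on every change means that removing an override restores the file's value.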

Auth is on by default with a seeded admin: `admin` / `admin`. Manage users, reset passwords, and view/configure SSO under Settings → Admin Users / Authentication. Change the default admin password and set a real `auth.jwt_secret` before sharing access; see [docs/AUTH.md](/robert-mcdermott/phlox/blob/main/docs/AUTH.md). To run single-user with no login, set `auth.enabled: false`.

By default code runs in a local subprocess (fast, trusts the host). For isolation, set `sandbox.runner: container` to run each execution in an ephemeral Podman/Docker container with CPU/memory/PID limits and network isolation; see [docs/SANDBOX.md](/robert-mcdermott/phlox/blob/main/docs/SANDBOX.md).

```bash
cd frontend && npm run build      # outputs frontend/dist
cd ../backend && uv run uvicorn app.main:app --port 8000
```

FastAPI serves the built SPA from `frontend/dist` at `/`.

The backend has a pytest suite (unit + FastAPI TestClient API tests + scripted-provider agent-loop/fallback tests); the frontend is verified by a production build. The same checks run in GitHub Actions CI (`.github/workflows/ci.yml`) on every push/PR.

```bash
# Backend: lint + tests (from backend/)
cd backend
uv sync --extra dev               # installs ruff + pytest
uv run ruff check app tests
uv run pytest                     # or: uv run pytest -k usage  to run a subset

# Frontend: the CI check is the build (from frontend/)
cd ../frontend && npm run build
```

The tests run against an in-memory/temp SQLite DB with `auth.enabled` off (a synthetic dev admin), so no provider credentials or network are needed; agent-loop tests use a built-in scripted "test" provider. Coverage includes the chargeback ledger surviving user deletion (`tests/test_api.py::test_usage_ledger_survives_user_deletion`).

`backend/evals/run_evals.py` exercises the agent against a real configured provider (tool use, RAG, multi-step).
It needs a working `config.yml` profile and is not part of CI:

```bash
cd backend && uv run python -m evals.run_evals
```

- Auth: change the seeded `admin` / `admin` and set a strong `auth.jwt_secret` (env `PHLOX_JWT_SECRET`) before any shared use. Data is isolated per user; admin features are role-gated.
- Sandbox: the `local` runner trusts the host (fine for single-user/local). For untrusted/multi-user execution use `sandbox.runner: container` ([docs/SANDBOX.md](/robert-mcdermott/phlox/blob/main/docs/SANDBOX.md)).
- Mutating/execution tools default to the `ask` permission policy; "Agent mode" auto-approves for a turn.
- Sensitive data (PHI): Postgres, audit logging, secrets management, and data governance are tracked as Tier 5 in the [roadmap](/robert-mcdermott/phlox/blob/main/docs/ROADMAP.md) and are required before any deployment touching sensitive data.

Licensed under the Apache License, Version 2.0; see [LICENSE](/robert-mcdermott/phlox/blob/main/LICENSE). Copyright © 2026 Robert McDermott <robert.c.mcdermott@gmail.com>.