{"slug": "hestia-a-local-first-home-assistant-that-trusts-timers-over-the-llm", "title": "Hestia – a local-first Home Assistant that trusts timers over the LLM", "summary": "Hestia, a local-first, self-hosted home assistant, runs a local LLM on user-owned hardware with no cloud or internet exposure. It delegates deterministic tasks to timers and databases, using the LLM only for judgment and conversation, and integrates with Home Assistant, Plex, and media automation tools. The project prioritizes reliability and privacy over a smarter brain.", "body_md": "A local-first, self-hosted assistant for your home. One stateful \"brain\" runs a local LLM on hardware you own, and every window into it — your phone, a terminal, the kitchen mic, Home Assistant — talks to that same brain. Nothing runs in the cloud, nothing is exposed to the internet, and your data never leaves the house.\n\n**The idea it's built on.** Most \"AI for the home\" points the model at the things it's *worst*\nat: remembering a schedule, watching a threshold, firing a reminder at the right minute. Hestia\ndoes the opposite. Anything deterministic — a chore is due, the soil is dry, the trash goes out\nTuesday — is handed to something dumb and reliable: a timer, a record, a row in a database. The\nLLM is left to do the one thing it's genuinely good at, which is judgment and conversation. The\ngoal was never a smarter brain. It's a more reliable one. ([ARCHITECTURE.md](/thefullnacho/hestia/blob/main/ARCHITECTURE.md) is\nthe long version; [MEMORY-DESIGN.md](/thefullnacho/hestia/blob/main/MEMORY-DESIGN.md) covers the memory plan.)\n\n**What it actually is.**\n\n- **A brain** (`brain/`) — an OpenAI-compatible endpoint (`POST /v1/chat/completions`) wrapping a local LLM (Ollama, `qwen3:14b`) with an agent loop. 
Every client speaks one dialect.\n- **Eight scoped tools** — `home` (control Home Assistant), `media` (Plex + *arr), `memory`, `records`, `reminder`, `search`, `status`, `weather`. There is deliberately **no shell tool**: the brain can act in your house but cannot run arbitrary commands.\n- **Memory that grows** — markdown soft-facts plus a SQLite record of the things in your life (pets, garden, wildlife, chores), and a background note-taker that *proposes* durable facts for you to approve rather than writing them silently.\n- **A media appliance** — Plex + the *arr stack + Bazarr subtitles + qBittorrent behind a fail-closed VPN kill-switch.\n- **Voice** — talk to it through Home Assistant's Assist pipeline or the browser.\n\n**What it isn't.** A cloud service, a wrapper around someone else's API, or anything you should put\non the public internet. It runs rootless on your own box and never phones home.\n\n⚠️ Read **SECURITY.md** before running it. The brain has no built-in authentication and can control your devices, so it must stay on a private network (Tailscale or LAN). That's a deliberate trade-off, not an oversight — the doc explains the trust model.\n\nHestia is part of the **Forager / Homesteader Labs** constellation, alongside `forager_ml`,\n`forager-field-station`, and the Homesteader Labs site.\n\n- **Phase 0 — Reach + brain** ✅ — talk to your home model from your phone (details below).\n- **Phase 1 — Media appliance** ✅ — Plex + qBittorrent + gluetun VPN kill-switch (verified) + the *arr automation layer (Prowlarr/Sonarr/Radarr + FlareSolverr + Bazarr subtitles). 
Full loop: search → download (via VPN) → hardlink → Plex.\n- **Phase 2 — House (Home Assistant)** ✅ — HA running; lights and devices reachable via the `home` tool.\n- **Phase 3 — Voice** ✅ — speak to Hestia through HA's Assist pipeline and a browser voice loop.\n- **Phase 4 — The seam (memory + tools)** ✅ *core in place, still growing* — the brain is a tool-calling agent with the eight tools above plus deterministic skill injection, and **HA's conversation agent points at Hestia**, so Assist and voice route through the brain (which can control HA back). It also gets smarter over time via the note-taker (see *Memory & learning*). Next: vision (Eyes).\n\nWin: talk to your home model from your phone.\n\nThe brain (`brain/`) is a thin OpenAI-compatible proxy onto Ollama. Every client —\nterminal, phone, kitchen mic — speaks one dialect (`POST /v1/chat/completions`).\nIn Phase 0 it forces the chosen model, injects Hestia's system prompt (persona +\nthe hardened safety rules from the benchmark A/B), and streams the reply back.\nMemory and tools land in Phase 4 behind this same URL.\n\n| Service | What | Bind | GPU |\n|---|---|---|---|\n| `hestia-ollama` | Ollama inference engine | `127.0.0.1:11434` (localhost only) | RTX 5080 only |\n| `hestia-brain` | Hestia `/v1` proxy | `0.0.0.0:8730` (reachable over Tailscale) | — |\n\nBoth are **user** systemd services (no root), defined in `deploy/systemd/` and\ninstalled into `~/.config/systemd/user/`. Linger is enabled, so they survive\nlogout/reboot. Ollama is pinned to the 5080 (`CUDA_VISIBLE_DEVICES`), leaving the\n4060 Ti free for Phase 3 (Whisper/Piper) per the benchmark verdict.\n\nModel: **qwen3:14b** (resident, thinking off) — the current pick after the model eval\n(`brain/eval_models.py`; `qwen2.5:14b` kept on disk as a fallback). 
See `MODEL_EVAL.md`.\n\nDay to day, use `deploy/hestiactl` (symlinked into `~/.local/bin`) — one command\nfor the whole estate, run from the GPU box:\n\n```\nhestiactl status              # brain health + local units + every container on hl-relay\nhestiactl health              # raw /health JSON\nhestiactl up|down|restart X   # X: brain ollama | arr services | plex qbit ha adguard ... | all\nhestiactl logs X [-f]         # journalctl (local) or docker logs (remote)\nhestiactl vpn                 # verify the qBittorrent kill-switch\n```\n\n`all` covers only the Hestia-managed pieces (local units + arr stack); core\ncontainers (AdGuard = house DNS, gluetun, HA) are controlled one at a time and\nask for confirmation before stopping.\n\nThe underlying commands, for when you need them directly:\n\n```\n# status / logs\nsystemctl --user status hestia-ollama hestia-brain\njournalctl --user -u hestia-brain -f\n\n# restart after editing brain code or a service file\nsystemctl --user daemon-reload          # only if you edited a .service\nsystemctl --user restart hestia-brain\n\n# health (Ollama up + model present?) — brain binds the Tailscale IP, not localhost\ncurl -s 127.0.0.1:8730/health | jq\n\n# talk to it\ncurl -s 127.0.0.1:8730/v1/chat/completions -H 'content-type: application/json' \\\n  -d '{\"messages\":[{\"role\":\"user\",\"content\":\"hello Hestia\"}]}' | jq -r '.choices[0].message.content'\n```\n\nIf you edit a `deploy/systemd/*.service` file, re-copy it into\n`~/.config/systemd/user/` before `daemon-reload`.\n\nTailscale is the one piece that needs root, so it isn't auto-installed. On the GPU box:\n\n```\ncurl -fsSL https://tailscale.com/install.sh | sh\nsudo tailscale up\n```\n\nThen on the phone: install the Tailscale app and sign in to the same tailnet. The\nbrain is then reachable at `http://<gpu-box-tailscale-name>:8730/v1` from any app\nthat speaks OpenAI (set that as the base URL; any API key string works — Ollama\nignores it). 
Nothing is exposed to the public internet.\n\n```\nbrain/\n  hestia.py       # the agent loop: /v1/chat/completions + /health, tools, memory, note-taker hook\n  config.py       # single source of paths + secret loading; makes the brain relocatable\n  prompt.py       # SYSTEM_PROMPT — persona + hardened safety rules\n  records_store.py / memory_store.py   # SQLite entities+events / markdown soft facts\n  note_taker.py   # background \"gets smarter over time\" extractor\n  review_notes.py # CLI to review + promote the note-taker's proposals\n  tools/          # home, media, memory, records, reminder, search, status, weather (+ skill router)\n  tests/          # pytest: stores, dispatch, note-taker (run: uv run --project brain pytest)\n  pyproject.toml  # deps + dev (pytest) + pytest config (uv-managed, isolated venv)\n```\n\n**Relocatable.** Every path derives from `config.py`'s own location, so moving or\nrestoring the repo to a new path needs no edits; `HESTIA_ROOT` overrides if needed. All\nservice URLs, tokens, and thresholds stay env-overridable next to the tools that use them.\n\nWin: the media stack runs, independent of the brain.\n\nMost of this already existed on the Micro before Hestia: **Plex** (`hl-plex`),\n**qBittorrent** behind **gluetun** (Surfshark, OpenVPN, NL) with a **fail-closed VPN\nkill-switch**, plus AdGuard, MQTT, and Home Assistant. The kill-switch is verified:\nqBittorrent's traffic egresses via the VPN datacenter IP, not the host's. Don't\n`docker compose up` the existing `/opt/home/compose.yml` blindly — its volume paths\nare literal `/path/to/...` host dirs that the running containers depend on.\n\nHestia added the missing **automation layer** as a separate, isolated stack\n(`deploy/media/compose.yml`, deployed to `/opt/home/arr/`): **Prowlarr** (:9696,\nindexer manager), **Sonarr** (:8989, TV), **Radarr** (:7878, movies). 
All reachable\nover Tailscale.\n\nAlso added **FlareSolverr** (:8191) so Prowlarr can reach Cloudflare-protected\nindexers, wired as a Prowlarr indexer-proxy (tag `flaresolverr`).\n\nWired via API: root folders point at the existing Plex library\n(`/data/TV Shows`, `/data/Movies`); a remote-path mapping (`/downloads` →\n`/data/downloads`) lets Sonarr/Radarr **hardlink** from qBittorrent's downloads into\nthe library (instant, no copy — both are one filesystem under `/mnt/media`); Prowlarr\nis connected to Sonarr + Radarr (`fullSync`). Five reputable **public indexers** added\n(The Pirate Bay, Knaben, LimeTorrents, plus 1337x + EZTV via FlareSolverr) and synced\ndown to the apps. YTS deliberately excluded (history of feeding user data to copyright trolls).\n\n**qBittorrent** is wired as the download client in both Sonarr (category `tv-sonarr`)\nand Radarr (`radarr`), tested OK. The full loop works: search → download through the\nVPN → hardlink into the Plex library. Both apps report no health warnings.\n\n```\ncd /opt/home/arr\ndocker compose ps\ndocker compose pull && docker compose up -d   # update *arr\n```\n\n`deploy/ha/custom_components/hestia/` is a thin custom HA integration: it registers a\nconversation agent (`conversation.hestia`) that forwards each utterance to Hestia's\n`/v1` and speaks the reply. Hestia owns the loop (memory + tools, incl. controlling\nHA back); HA is just input + a tool. This is the architecture's keystone made real.\n\nWiring on `hl-relay` (not in this repo — lives in HA's config):\n\n- Integration files installed to `/opt/home/ha_config/custom_components/hestia/`.\n- A config entry points it at `http://127.0.0.1:8730/v1/chat/completions` (Hestia over Tailscale; the HA container can reach it).\n
- The preferred Assist pipeline's `conversation_engine` is set to `conversation.hestia`, so the Assist chat and voice satellites route through the brain.\n\nVerified: via HA's conversation API, \"turn on the TV light\" drove the real light and \"what coffee should I buy?\" recalled a memory — HA → Hestia → HA round trip.\n\nTwo stores back the brain: `memory_store` (markdown soft facts/preferences, git-auditable)\nand `records_store` (SQLite entities + a uniform event log: pets/lineage, wildlife, chores,\nservice reminders, the garden). Both are injected into the system prompt per request, scoped\nto what the request implies.\n\nThe brain also learns passively. After each exchange — once the answer is already on the\nwire — a background **note-taker** (`note_taker.py`) reads the turn and proposes durable\nfacts it heard (\"trash pickup is Tuesday mornings\"). True to *propose, don't dispose*, those\nland in a review inbox (`memory/inbox/`), **not** straight into live memory:\n\n```\nuv run --project brain python brain/review_notes.py list\nuv run --project brain python brain/review_notes.py promote <id> | --all\nuv run --project brain python brain/review_notes.py discard <id> | --all\n```\n\nIt reuses the resident model by default and never blocks or breaks a request. 
Tuning knobs:\n`HESTIA_NOTETAKER=0` disables it; `HESTIA_NOTETAKER_AUTOWRITE=1` skips the review queue and\nwrites durable memories directly; `HESTIA_NOTETAKER_MODEL` points it at a cheaper model (e.g.\na second Ollama on the free 4060 Ti) to take the load off the brain.\n\nHestia is licensed under the **GNU Affero General Public License v3.0** — see [LICENSE](/thefullnacho/hestia/blob/main/LICENSE).\nThe AGPL is deliberate: Hestia is built to be self-hosted, so the copyleft keeps it open even for\nanyone who runs a modified version as a network service, while imposing nothing on you for running\nit at home.\n\nBefore running it, read **SECURITY.md**: the brain has no built-in authentication\nand can control your Home Assistant devices, so it must stay on a private network (Tailscale/LAN)\nand must never be exposed to the public internet. It deliberately has no shell tool.\n\n© 2026 TheFullNacho and contributors.", "url": "https://wpnews.pro/news/hestia-a-local-first-home-assistant-that-trusts-timers-over-the-llm", "canonical_source": "https://github.com/thefullnacho/hestia", "published_at": "2026-06-28 13:46:44+00:00", "updated_at": "2026-06-28 14:04:40.152440+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "ai-safety", "ai-products", "ai-infrastructure"], "entities": ["Hestia", "Home Assistant", "Ollama", "Plex", "qBittorrent", "Gluetun", "Tailscale", "Forager / Homesteader Labs"], "alternates": {"html": "https://wpnews.pro/news/hestia-a-local-first-home-assistant-that-trusts-timers-over-the-llm", "markdown": "https://wpnews.pro/news/hestia-a-local-first-home-assistant-that-trusts-timers-over-the-llm.md", "text": "https://wpnews.pro/news/hestia-a-local-first-home-assistant-that-trusts-timers-over-the-llm.txt", "jsonld": "https://wpnews.pro/news/hestia-a-local-first-home-assistant-that-trusts-timers-over-the-llm.jsonld"}}