{"slug": "building-an-offline-first-bushfire-response-platform-with-hermes-agent", "title": "Building an Offline-First Bushfire Response Platform With Hermes Agent", "summary": "A developer rebuilt Project Haven, an offline-first bushfire response platform, integrating Hermes Agent as the core AI engine for emergency guidance, scheduled fire-risk briefings, and recovery grant research. The platform, originally a 46-hour GovHack 2024 winner, now runs event-driven microservices with a fully offline-capable React PWA, using Hermes to power real-time chat grounded in Australian emergency protocols and autonomous web searches for government grants. The entire stack, including a local LLM, deploys with a single Docker command and requires no external inference API.", "body_md": "*This is a submission for the Hermes Agent Challenge: Build With Hermes Agent*\n\n**Project Haven** is an AI-powered emergency response platform for bushfire preparedness — evacuation routing, tiered real-time alerts, government recovery grant discovery, and offline-first PWA support for when mobile networks go down mid-crisis.\n\nThe platform was originally a 46-hour hackathon build (GovHack 2024 — we won). This year I brought it back from the dead and [rebuilt](https://dev.to/ujja/from-govhack-win-to-something-that-actually-matters-2mmi) it properly: event-driven microservices, contract-first OpenAPI specs, a prediction engine based on XGBoost weights from our historical bushfire notebooks, and a fully offline-capable React PWA.\n\nThe one piece that was always a hollow mock was the **AI Assistant** — the in-app emergency guidance chat. It had a `setTimeout` pretending to think and a big `switch` statement of canned responses. 
Every time I looked at it I felt embarrassed.\n\nHermes fixed that.\n\nHermes Agent is now the live brain behind three things in Project Haven:\n\n**In-app emergency guidance** — the AI Assistant page calls Hermes via its OpenAI-compatible `/v1/chat/completions` API, grounded with a system prompt that constrains it to verified Australian emergency protocols and instructs it to escalate emergency situations to 000.\n\n**Scheduled fire-risk briefings** — Hermes runs a natural-language cronjob that fires every morning at 6am during fire season, calls the Bureau of Meteorology and NSW RFS feeds, synthesises a risk summary, and publishes it as an event into the alert pipeline.\n\n**Recovery grant research** — when a user marks themselves as \"in recovery\", Hermes autonomously searches for current government grant programs (NDRA, state schemes, Services Australia), compares them against the user's declared situation, and adds matched recommendations to the recommendation service DB.\n\n🔗 GitHub Repository: [project-haven]\n\n```\ncp .env.example .env\ndocker compose up --build\n```\n\nThat's it. One command spins up 6 microservices, an API gateway, PostgreSQL instances, RabbitMQ, and the React PWA at `http://localhost:3000`.\n\nTo trigger the full prediction → alert pipeline:\n\n```\ncurl -X POST http://localhost:8080/weather \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"lat\":-33.87,\"lng\":151.21,\"temperature\":42,\"windSpeed\":80,\"humidity\":10,\"season\":\"summer\",\"vegetationDensity\":0.9}'\n```\n\nThat simulates an extreme weather event near Sydney, runs it through the prediction engine, and fires a CRITICAL alert through the system within seconds.\n\n**AI Assistant — Hermes-powered emergency guidance**\n\nThe AI Assistant page now sends messages to Hermes via the api-gateway (`/assistant/v1/chat/completions`). 
Hermes has persistent memory across sessions via `X-Hermes-Session-Key`, so if a user opened the app two days ago and said \"I'm in the Blue Mountains\", Hermes still knows that when they say \"the fire is getting closer\" today.\n\n**Scheduled briefings appearing in the alert feed**\n\nEvery morning during fire season, Hermes fetches live fire danger ratings from the NSW RFS API, synthesises a 3-sentence risk summary grounded in real data, and injects it as a `feed.created` event. The event propagates through RabbitMQ to the alert service and appears in-app within seconds.\n\n**Recovery grant matching**\n\nA user marks the \"Recovery\" scenario. Hermes is asked to research grants available for their postcode and situation. It uses its web search tool to check current Services Australia pages — bypassing the staleness problem of any static dataset — and returns structured results that get persisted back to the recommendations table.\n\nThe full stack — Haven microservices + Hermes Agent + a local LLM — runs entirely on your machine. No external inference API needed.\n\n**Requirements:** Docker Desktop 27+, 8 GB RAM minimum (16 GB recommended), macOS / Linux / WSL2 on Windows.\n\nOllama, the model, all backend services, and the frontend are all managed by Docker Compose. No host-level installation needed beyond Docker itself.\n\n```\ngit clone https://github.com/ujja/project-haven.git\ncd project-haven\ncp .env.example .env\ndocker compose up --build\n```\n\nOn first start, the `ollama-init` container pulls `nous-hermes2` (~4.5 GB) automatically. The api-gateway waits for the pull to complete before starting. Subsequent starts are fast — the model is cached in a Docker volume.\n\nProgress is streamed to the compose log. 
Once you see `api-gateway | api-gateway listening on port 8080`, everything is ready.\n\n```\n# Simulate an extreme weather event near Sydney → triggers prediction → alert\ncurl -X POST http://localhost:8080/weather \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"lat\":-33.87,\"lng\":151.21,\"temperature\":42,\"windSpeed\":80,\"humidity\":10,\"season\":\"summer\",\"vegetationDensity\":0.9}'\n\n# Ask the AI assistant directly\ncurl -X POST http://localhost:8080/assistant/v1/chat/completions \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"model\":\"nous-hermes2\",\"messages\":[{\"role\":\"user\",\"content\":\"There is a bushfire near me. What do I do right now?\"}]}'\n\n# Search live recovery grants\ncurl -X POST http://localhost:8080/recommendations/research \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"postcode\":\"2750\",\"situation\":\"home destroyed by bushfire\"}'\n```\n\n```\nReact PWA (port 3000)\n        ↓\nAPI Gateway (port 8080)\n   ├── /assistant/*  →  Ollama /v1/chat/completions (nous-hermes2)\n   ├── /recommendations/research  →  Ollama (structured JSON)\n   └── node-cron @ 6am  →  Ollama → feed-service\n        ↓\nOllama container (port 11434, haven-net)\n        ↓\nnous-hermes2 (cached in ollama-data volume)\n```\n\nEverything runs inside Docker Compose. `ollama-init` pulls the model once on first start; the `ollama-data` volume persists it across restarts.\n\n| Model | Size | Best for |\n|---|---|---|\n| `nous-hermes2` | ~4.5 GB | Default — strong instruction following, good JSON |\n| `gemma2:9b` | ~5.4 GB | Superior JSON adherence |\n| `gemma2:2b` | ~1.6 GB | Low-RAM machines, faster responses |\n| `llama3:latest` | ~4.7 GB | General-purpose alternative |\n\nSet `HERMES_MODEL` in `.env` to swap. 
Also update the `ollama-init` entrypoint in `docker-compose.yml` to pull the new model.\n\nAvoid 70B-class models unless you have GPU hardware with 40+ GB VRAM.\n\n**First start is slow:** The `ollama-init` container has to download the `nous-hermes2` model (~4.5 GB). Subsequent starts skip this.\n\n**Model too slow on CPU:** Edit `.env` and set `HERMES_MODEL=gemma2:2b` (1.6 GB, much faster). Also update the `ollama-init` entrypoint in `docker-compose.yml` to pull the new model.\n\n**WSL2 networking problems:** Volume mounts and bridge networking have edge cases — increasing Docker's memory allocation in Docker Desktop settings usually resolves them.\n\n**Linux GPU acceleration:** Uncomment the `deploy` block in the `ollama` service in `docker-compose.yml` and ensure `nvidia-container-toolkit` is installed.\n\n| Layer | Technology |\n|---|---|\n| AI backbone | Nous Hermes 2 via Ollama (OpenAI-compatible, locally hosted) |\n| Frontend | React 18, TypeScript, Vite, Workbox PWA, Leaflet |\n| Backend | Node.js 20, Express, TypeScript — 6 microservices + API gateway |\n| Messaging | RabbitMQ (event-driven: weather → prediction → alert pipeline) |\n| Databases | PostgreSQL (per-service) |\n| ML | XGBoost (Python notebooks → TypeScript heuristic engine) |\n| Data | Digital Atlas / Geoscience Australia ArcGIS REST APIs |\n| Infra | Docker Compose, multi-stage builds, shared `@haven/shared` npm package |\n\nOllama exposes `POST /v1/chat/completions` exactly like the OpenAI SDK expects. It runs as a Docker Compose service (`ollama`) on the internal `haven-net` network. The api-gateway proxies `/assistant/*` directly to `http://ollama:11434`. 
No separate agent container, no extra port, no API key management between services.\n\nIn `AIAssistant.tsx`, the `getAIResponseHermes()` function — previously a `setTimeout` + `switch` statement — became a real API call:\n\n```\nasync function getAIResponseHermes(userMessage: string): Promise<string> {\n  const res = await fetch('/assistant/v1/chat/completions', {\n    method: 'POST',\n    headers: {\n      'Content-Type': 'application/json',\n    },\n    body: JSON.stringify({\n      model: 'nous-hermes2',\n      messages: [\n        {\n          role: 'system',\n          content: HAVEN_SYSTEM_PROMPT,  // emergency protocols, escalate to 000, AU context\n        },\n        { role: 'user', content: userMessage },\n      ],\n      max_tokens: 512,\n    }),\n  });\n  if (!res.ok) throw new Error(`Ollama responded with ${res.status}`);\n  const data = await res.json();\n  return data.choices[0].message.content;\n}\n```\n\nThe api-gateway proxy strips any client-side auth before forwarding to Ollama. A stable session key is stored in `localStorage` per browser for memory scoping, ready in case a stateful layer is added upstream.\n\nHermes's cron scheduling is genuinely one of its most underrated features. Instead of writing a Node worker with `node-cron`, a BOM API client, a response parser, and an event publisher, I wrote this in the Hermes config:\n\n```\njobs:\n  - name: fire-risk-briefing\n    schedule: \"0 6 * * * (Oct-Mar)\"   # 6am daily, fire season only\n    prompt: |\n      Check the current fire danger ratings for NSW, VIC, SA, and WA from the\n      Bureau of Meteorology and RFS feeds. Synthesise a 3-sentence morning\n      briefing — severity level, highest-risk regions, and one action\n      recommendation. Keep it under 80 words. 
Respond with JSON:\n      { \"severity\": \"LOW|MEDIUM|HIGH|EXTREME|CATASTROPHIC\", \"summary\": \"...\" }\n    deliver: http://api-gateway:8080/feeds\n    skills: [web_search]\n```\n\nHermes handles the scheduling, the web fetch, the summarisation, and the HTTP delivery. The `/feeds` endpoint in the feed service receives the JSON payload and publishes it as a `feed.created` event. The entire pipeline — external data → summary → in-app alert — runs without any new code.\n\nThe recommendation service seeds static government programs at startup. But government grants change — new schemes open after disasters, eligibility criteria shift, application portals go down and come back up.\n\nFor the recovery scenario, I added a route in the api-gateway that delegates a research task to Hermes:\n\n```\n// api-gateway: POST /recommendations/research\n// HERMES_BASE = http://ollama:11434 (set via env in docker-compose)\n// HERMES_MODEL = nous-hermes2 (default, overridable via HERMES_MODEL env var)\napp.post('/recommendations/research', express.json(), async (req, res) => {\n  const { postcode, situation } = req.body;\n\n  const hermesRes = await fetch(`${HERMES_BASE}/v1/chat/completions`, {\n    method: 'POST',\n    headers: {\n      'Content-Type': 'application/json',\n    },\n    body: JSON.stringify({\n      model: HERMES_MODEL,\n      messages: [\n        {\n          role: 'system',\n          content: 'You are an Australian emergency recovery specialist. Return ONLY a valid JSON array — no markdown fences, no commentary, just the array.',\n        },\n        {\n          role: 'user',\n          content: `Research current Australian government disaster recovery grants\n            available for postcode ${postcode}. 
Situation: ${situation}.\n            Return JSON array of { title, provider, description, applicationUrl, eligibilitySummary }.\n            Only include currently open programs with verified URLs.`,\n        },\n      ],\n      max_tokens: 1024,\n    }),\n  });\n\n  if (!hermesRes.ok) {\n    // Surface upstream failures instead of crashing on an unexpected body\n    res.status(502).json({ error: `Model backend responded with ${hermesRes.status}` });\n    return;\n  }\n\n  const data = await hermesRes.json();\n  const content = data.choices[0].message.content;\n  // Strip accidental markdown fences if the model wraps the JSON\n  const cleaned = content.replace(/^```(?:json)?\\n?/, '').replace(/\\n?```$/, '').trim();\n  let grants: unknown;\n  try {\n    grants = JSON.parse(cleaned);\n  } catch {\n    // The model returned something that is not valid JSON\n    res.status(502).json({ error: 'Model did not return valid JSON' });\n    return;\n  }\n  res.json({ grants });\n});\n```\n\nHermes uses its web search tool to check current Services Australia, state government, and Red Cross pages. It returns structured JSON. The gateway upserts the results into the recommendation service — live, verified, current — not frozen in a seed file.\n\nThe key agentic capability here is **web search grounded in the real prompt**: Hermes significantly reduced stale or hallucinated grant recommendations by grounding responses in live web results rather than parametric knowledge. It fetches the actual pages, checks them, and only returns what it finds. For emergency advice, that correctness bar is non-negotiable.\n\nBefore landing on Hermes, I experimented with a few other locally-run models. Here's what that comparison actually looked like for an emergency-context application.\n\nLlama is the obvious first stop for local inference. I ran Llama 3.1 8B and 3.2 3B through Ollama and pointed the same system prompt at them.\n\n**What worked:** Ollama's OpenAI-compatible `/v1/chat/completions` endpoint made drop-in testing easy. Llama 3.2 3B is genuinely fast on Apple Silicon — response latency was better than Hermes on equivalent hardware. 
For simple question-answering against a fixed system prompt, the outputs were reasonable.\n\n**What didn't:** Neither model had built-in tool execution, persistent session memory, or a scheduler — all three capabilities I needed. To replicate Hermes's cron briefings with Llama, I would have written a Node cron worker, a separate BOM API client, a response parser, an event publisher, and a memory store. That's five components Hermes replaced with a YAML block. Llama also had a notable tendency toward elaboration — answers were often 3-4× too long for a crisis UX where brevity is safety-critical. Prompt engineering helped, but it was a constant battle.\n\n**Verdict:** Great for straight inference tasks where you're bringing your own orchestration layer. Not the right fit when you need the agent capabilities without the plumbing.\n\nGoogle's Gemma 2 9B was the most pleasant surprise in terms of raw instruction-following. It respected format constraints (JSON output, word limits) more reliably than anything else I tested at this size.\n\n**What worked:** JSON output adherence was excellent. When I told it to return `{ \"severity\": \"...\", \"summary\": \"...\" }`, it almost always did — no markdown wrapping, no prose preamble. That's unusually good at 9B parameters. It also handled the Australian emergency context well without needing extensive ground-truth examples in the system prompt.\n\n**What didn't:** Same infrastructure gap as Llama — no native tool use, no session memory, no scheduling. Gemma 2's knowledge cutoff also made it confidently wrong about post-2024 government programs (it cited DRFA schemes that had been superseded). For a platform that needs to tell people where to apply for grants right now, that's a hard failure mode. Web search grounding isn't optional here.\n\n**Verdict:** Best raw model for structured output tasks at the 7-9B tier. 
If I ever strip Hermes out and build the orchestration layer myself, Gemma 2 is what I'd put under it.\n\nI tried Mistral 7B Instruct and Mistral-Nemo (12B) briefly. Both are fast and capable general-purpose models. But in the emergency context, Mistral had one consistent problem: it over-hedged.\n\nEvery answer about what to do in a bushfire came back wrapped in \"I am not a qualified emergency services professional and this should not be taken as official advice…\" disclaimers that would fill half the screen on a mobile device. I understand why models are trained to do this. But during an actual emergency, a response that leads with three sentences of disclaimer before telling someone to leave is arguably worse than no AI at all. Getting Mistral to drop the hedging without also dropping the actual safety guardrails required more system prompt engineering than the ROI justified.\n\n**Verdict:** Capable model, wrong default behaviour for a high-stakes UX. Taming it is possible but expensive in prompt tokens and iteration time.\n\nAgainst this field, here's where Hermes genuinely pulls ahead:\n\n**Native tool execution with no orchestration layer.** Web search, HTTP calls, file reads — built in. No LangChain, no custom function-calling wrapper, no managing tool schemas manually. For the grant research workflow (fetch → parse → structure → return), this is a week of orchestration code that just isn't written.\n\n**Persistent session memory across requests.** Every other model I tested was stateless per-request. Hermes's session-scoped memory model is simple and works reliably in practice. For a crisis scenario that plays out over days, that matters.\n\n**Cron scheduling with natural language task definitions.** Nothing else offers this at the infrastructure level. 
The fire-season briefing scheduler is 12 lines of YAML.\n\n**Stays on the wire it's told to stay on.** Hermes with the Haven system prompt stays grounded in Australian emergency protocols, is instructed to escalate emergency situations to 000, and significantly reduced stale or hallucinated grant recommendations by grounding responses in live web results. The safety system prompt feels sticky in a way that required more reinforcement with Llama and Mistral.\n\nI also evaluated broader autonomous-agent platforms — particularly OpenClaw-style systems built around persistent personal agents, plugins, and wide capability surfaces.\n\nThose systems are impressive. But for Project Haven, they solved a different problem than the one I actually had.\n\nProject Haven is not trying to build:\n\nIt needs something much narrower and more reliable:\n\nThat distinction ended up mattering a lot.\n\nPlatforms like OpenClaw optimise for maximum flexibility — plugins, integrations, autonomous behaviours, evolving capability graphs. Hermes feels more opinionated. In practice, that was a benefit.\n\nFor an emergency-response application, I cared more about:\n\nThe tighter execution model made Hermes easier to reason about architecturally. The memory model was simpler. The scheduling primitives were built in. And the default operational surface felt significantly safer for a high-stakes context.\n\nThat tradeoff means Hermes currently has a smaller ecosystem, fewer integrations, less community tooling, and less flexibility than broader agent platforms. But for Project Haven, that narrower scope was exactly the point.\n\nI didn't need an AI operating system. I needed a dependable emergency-response runtime that could live entirely inside my infrastructure stack and keep working when the situation around it stopped being normal.\n\nBeing honest about the gaps matters:\n\n**Cold start time.** Hermes in Docker takes noticeably longer to reach a ready state than a model served via Ollama. 
On my MacBook Pro M2, Ollama with Llama 3.2 3B is ready in under 5 seconds. Hermes takes 15-25 seconds to initialise its memory backend and tool registry. In a production deployment this is a non-issue (it starts once). In development, it slows iteration loops.\n\n**Raw inference speed at the same model size.** Hermes's overhead — memory management, tool routing, session handling — costs tokens per request. For a simple in-context QA task with no tools, Llama through Ollama will answer faster. For the agentic tasks (web search, multi-step research), that comparison flips because Hermes automates and coordinates multi-step tool execution.\n\n**Documentation gaps.** The job scheduler YAML schema isn't well-documented — I spent more time than I'd like reading source to understand what `deliver:` accepted and how session keys scope memory. Llama/Gemma via Ollama have significantly more community documentation. Stack Overflow has nothing for Hermes-specific issues yet; you're on the GitHub issues list and Discord.\n\n**Model selection is less flexible.** Hermes currently offers less model flexibility than a pure Ollama workflow. If you want Gemma 2 9B's superior JSON adherence under Hermes's tool/memory layer, that combination isn't always straightforward depending on your configuration. With Ollama, you can swap models in one command. This matters if you're trying to tune cost/quality tradeoffs.\n\n**Windows support is rough.** Docker on Windows with WSL2 works but the volume mounting and networking for Hermes's memory backend had edge cases I had to work around. On macOS and Linux it was smooth.\n\nA few things made Hermes specifically well-suited here over rolling a custom LLM integration:\n\n**It lives on your server.** Project Haven is explicitly designed for scenarios where internet connectivity is degraded. Hermes runs in the same Docker Compose stack as everything else — no dependency on an external inference API during the crisis window. 
The model runs locally.\n\n**Persistent memory that scales with the crisis.** Emergency situations evolve over hours and days. A user's context from this morning matters when they're asking questions tonight. Hermes's session memory handles this automatically.\n\n**Scheduled autonomy fits the \"prevention not reaction\" model.** The best emergency outcome is a user who prepared before the fire arrived. Hermes's cron scheduler lets Project Haven push proactive briefings during fire season without any always-on polling infrastructure.\n\n**OpenAI-compatible API means zero new client code.** The React frontend, the Node gateway, anything that already speaks the OpenAI format works against Hermes unmodified. No new SDK. No vendor lock-in.\n\n**Local inference eliminates API dependency risk.** One thing that became obvious very quickly while testing hosted AI APIs was how fast autonomous workflows amplify usage. A single user interaction can become multiple model calls — retrieval, summarisation, reasoning, formatting, follow-up clarification. For a crisis-response platform, depending entirely on external inference APIs introduced operational and cost dependencies I wasn't comfortable with. Running Hermes locally shifted the tradeoff toward compute and infrastructure complexity instead of per-token billing and rate limits — which was the right trade for this project.\n\nProject Haven started as a 46-hour hackathon build, sat frozen for two years, and became the platform it was always supposed to be during this rebuild.\n\nThe AI Assistant was always the most important feature and the least real. 
Hermes made it real — not just more capable, but genuinely appropriate for the stakes of an emergency context: server-local, memory-persistent, verifiably grounded, and invisible enough to get out of the way when someone needs an answer fast.\n\nIf you're building anything where the AI output actually matters — where a wrong answer has real consequences — the pattern of running Hermes locally with a tight system prompt and opaque tool execution is one I'd recommend seriously.\n\n*Built with TypeScript, React, Node.js, PostgreSQL, RabbitMQ, Docker, Hermes Agent, and a lot of respect for the people who actually work bushfire emergencies.*\n\n*Originally born at GovHack 2024. Finally given room to grow.*", "url": "https://wpnews.pro/news/building-an-offline-first-bushfire-response-platform-with-hermes-agent", "canonical_source": "https://dev.to/ujja/building-an-offline-first-bushfire-response-platform-with-hermes-agent-4m0a", "published_at": "2026-05-28 01:51:59+00:00", "updated_at": "2026-05-28 02:22:31.986549+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-agents", "ai-products", "ai-tools", "natural-language-processing"], "entities": ["Hermes Agent", "Project Haven", "GovHack", "Bureau of Meteorology", "XGBoost", "React", "OpenAI"], "alternates": {"html": "https://wpnews.pro/news/building-an-offline-first-bushfire-response-platform-with-hermes-agent", "markdown": "https://wpnews.pro/news/building-an-offline-first-bushfire-response-platform-with-hermes-agent.md", "text": "https://wpnews.pro/news/building-an-offline-first-bushfire-response-platform-with-hermes-agent.txt", "jsonld": "https://wpnews.pro/news/building-an-offline-first-bushfire-response-platform-with-hermes-agent.jsonld"}}