{"slug": "tlamatini-local-first-ai-dev-assistant-with-68-agents-and-hybrid-rag", "title": "Tlamatini – Local-first AI dev assistant with 68 agents and hybrid RAG", "summary": "Tlamatini, a locally-deployed AI developer assistant named after the Nahuatl word for \"one who knows,\" now features 68 drag-and-drop agents following its v1.9.0 release. The tool combines a hybrid RAG pipeline with a multi-turn orchestration layer and ACPX delegation to external coding agents, operating local-first by default with cloud LLMs as opt-in only. The latest update introduces the STM32er agent, a zero-config firmware bridge that scaffolds, builds, flashes, and observes STM32 microcontrollers while refusing to produce or flash mis-targeted firmware.", "body_md": "*\"one who knows\" — a locally-deployed AI developer assistant*\n\n**Tlamatini** (Nahuatl for *\"one who knows\"*) is a locally-deployed AI developer assistant that pairs a hybrid [RAG pipeline](#82-rag) (FAISS + BM25, metadata extraction, context budgeting) with a [Multi-Turn](#35-tutorial-the-multi-turn-toggle) tool-orchestration layer, [ACPX](#5-acpx--external-coding-agent-clis-as-tools) delegation to external coding-agent CLIs ([Claude Code](https://docs.anthropic.com/en/docs/claude-code/overview), [Cursor](https://cursor.com), [Codex](https://github.com/openai/codex), [Gemini](https://github.com/google-gemini/gemini-cli), [Qwen](https://github.com/QwenLM/qwen-code), …), and a [visual workflow designer](#4-visual-workflow-designer-agentic_control_panel) with **68 drag-and-drop agents**.\n\nLocal-first by default: the full RAG pipeline, the Multi-Turn execution loop, and every workflow agent run on your machine — embeddings and chat are driven by your local [Ollama] install. Cloud LLMs (Claude API, Ollama Pro/Max) and ACPX delegation to cloud CLIs are opt-in per-request, never the default. 
Sensitive code never leaves the box unless you explicitly route it out.\n\nLatest — v1.9.0 (2026-05-26): STM32er, zero-config firmware bridge. A new STM32er agent (canvas node + Multi-Turn tool `chat_agent_stm32er`) brings the catalog to 68 agents. It bridges the [STM32 Template Project MCP] to scaffold, build, flash, observe (serial / SWD), and reset STM32 firmware. Zero-config auto-bootstrap means the user only installs STM32CubeIDE + Tlamatini — STM32er downloads, installs, and validates the MCP server on first use. A critical-mission safety preflight validates the toolchain and a positively-confirmed connected ST-LINK probe before flashing, and refuses rather than producing or flashing mis-targeted firmware. Three new catalog demos ship in migration `0103` (STM32 GENESIS / BLINKY / HIL OBSERVATORY). See [§3.15](#315-tutorial-build-and-flash-stm32-firmware-from-chat-chat_agent_stm32er).\n\n[🌐 Website](https://xaiht.org) · [▶️ One-minute teaser](https://www.youtube.com/watch?v=4MyRXBahHuU&t=41s) · [**📖 Long-form docs**](/XAIHT/Tlamatini/blob/main/BookOfTlamatini.md) · [**🏷️ Versioning**](/XAIHT/Tlamatini/blob/main/VERSIONING.md) · **🎬 More demos**\n\n- [1. Overview](#1-overview)\n- [2. Quickstart (source mode)](#2-quickstart-source-mode)\n- [3. Using the Chat (`/agent/`)](#3-using-the-chat-agent)\n  - [3.1. Chat layout in 30 seconds](#31-chat-layout-in-30-seconds)\n  - [3.2. Setting code as context](#32-setting-code-as-context)\n  - [3.3. Tutorial: a one-shot question (no toggles)](#33-tutorial-a-one-shot-question-no-toggles)\n  - [3.4. Tutorial: the **internet** toggle](#34-tutorial-the-internet-toggle)\n  - [3.5. Tutorial: the **Multi-Turn** toggle](#35-tutorial-the-multi-turn-toggle)\n  - [3.6. Tutorial: the **Exec Report** toggle](#36-tutorial-the-exec-report-toggle)\n  - [3.7. Tutorial: the **ACPX** toggle](#37-tutorial-the-acpx-toggle)\n  - [3.8. From chat to flow: the **Create Flow** button](#38-from-chat-to-flow-the-create-flow-button)\n  - [3.9. Why Chat-created flows are safer now](#39-why-chat-created-flows-are-safer-now)\n  - [3.10. The **DB** menu — Backup, Set DB, and the start-up swap-in](#310-the-db-menu--backup-set-db-and-the-start-up-swap-in)\n  - [3.11. The **ACPX-Skills** menu — Browse, Configure, Diagnostics, Reload](#311-the-acpx-skills-menu--browse-configure-diagnostics-reload)\n  - [3.12. Tutorial: command a window from chat (`chat_agent_windower`)](#312-tutorial-command-a-window-from-chat-chat_agent_windower)\n  - [3.13. Tutorial: drive a browser from chat (`chat_agent_playwrighter`)](#313-tutorial-drive-a-browser-from-chat-chat_agent_playwrighter)\n  - [3.14. Tutorial: run Kali Linux tools from chat (`chat_agent_kalier`)](#314-tutorial-run-kali-linux-tools-from-chat-chat_agent_kalier)\n  - [3.15. Tutorial: build and flash STM32 firmware from chat (`chat_agent_stm32er`)](#315-tutorial-build-and-flash-stm32-firmware-from-chat-chat_agent_stm32er)\n- [4. Visual Workflow Designer (`/agentic_control_panel/`)](#4-visual-workflow-designer-agentic_control_panel)\n  - [4.1. Canvas anatomy](#41-canvas-anatomy)\n  - [4.2. Tutorial: your first flow (3 agents)](#42-tutorial-your-first-flow-3-agents)\n  - [4.3. Saving and loading `.flw` files](#43-saving-and-loading-flw-files)\n  - [4.4. Validate and Start now compile the live canvas](#44-validate-and-start-now-compile-the-live-canvas)\n  - [4.5. Pause, Resume, Stop](#45-pause-resume-stop)\n  - [4.6. FlowHypervisor (watchdog)](#46-flowhypervisor-watchdog)\n  - [4.7. FlowCreator (let an LLM design the flow)](#47-flowcreator-let-an-llm-design-the-flow)\n  - [4.8. Parametrizer (chain outputs into the next agent's config)](#48-parametrizer-chain-outputs-into-the-next-agents-config)\n  - [4.9. Gatewayer (external triggers)](#49-gatewayer-external-triggers)\n- [5. ACPX — External Coding-Agent CLIs as Tools](#5-acpx--external-coding-agent-clis-as-tools)\n- [6. Unreal MCP — Driving Unreal Engine 5 from Tlamatini](#6-unreal-mcp--driving-unreal-engine-5-from-tlamatini)\n  - [6.1. What Unreal MCP is](#61-what-unreal-mcp-is)\n  - [6.2. The MCP plugin source (the **MCP git location**)](#62-the-mcp-plugin-source-the-mcp-git-location)\n  - [6.3. Installing and enabling the plugin inside your UE5 project](#63-installing-and-enabling-the-plugin-inside-your-ue5-project)\n  - [6.4. The command catalog (up to 53 commands across 9 categories)](#64-the-command-catalog-up-to-53-commands-across-9-categories)\n  - [6.5. Using Unreal MCP from the chat (`chat_agent_unrealer`)](#65-using-unreal-mcp-from-the-chat-chat_agent_unrealer)\n  - [6.6. Using Unreal MCP on the canvas (the visual **Unrealer** node)](#66-using-unreal-mcp-on-the-canvas-the-visual-unrealer-node)\n  - [6.7. What the agent actually does, end-to-end](#67-what-the-agent-actually-does-end-to-end)\n  - [6.8. Exec Report integration](#68-exec-report-integration)\n  - [6.9. Bullet-proof checklist for Unreal Engine users](#69-bullet-proof-checklist-for-unreal-engine-users)\n  - [6.10. Troubleshooting Unreal MCP](#610-troubleshooting-unreal-mcp)\n- [7. Building a Frozen Distribution](#7-building-a-frozen-distribution)\n- [8. Configuration (`Tlamatini/agent/config.json`)](#8-configuration-tlamatiniagentconfigjson)\n- [9. Architecture at a Glance](#9-architecture-at-a-glance)\n- [10. Embedding-Memory Pre-Flight Guard (GPU hosts)](#10-embedding-memory-pre-flight-guard-gpu-hosts)\n- [11. Orphan-Process Cleanup (`conhost.exe` reaper)](#11-orphan-process-cleanup-conhostexe-reaper)\n- [12. Troubleshooting](#12-troubleshooting)\n- [13. Versioning](#13-versioning)\n- [14. Contributing & License](#14-contributing--license)\n\n**Tlamatini** (Nahuatl for *\"one who knows\"*) is a Django/Channels app you run on your own machine. It packages a hybrid RAG pipeline, a Multi-Turn tool-calling LLM loop, an ACPX runtime that spawns external coding-agent CLIs as child processes, an **Unreal MCP** client that drives Unreal Engine 5 from chat or canvas, and a drag-and-drop workflow designer with 68 agent types — into one local install. 
Backends: **Ollama** (local), **Anthropic Claude** (cloud), **Qwen vision** (Ollama).\n\nLicense: **GPL-3.0** · Repo: [https://github.com/XAIHT/Tlamatini.git](https://github.com/XAIHT/Tlamatini.git) · Platform tested: Windows 11 (cross-platform for source mode).\n\n- **Real RAG over your code** — FAISS + BM25 hybrid retrieval, code-aware metadata extraction, Reciprocal Rank Fusion, context budgeting, OOM fallback.\n- **Multi-Turn mode** — the LLM becomes an *operator*: shell, Python, APIs, SQL, file ops, screenshots, keyboard/mouse automation, email, Telegram, WhatsApp, STM32 firmware build/flash — chained in one conversation.\n- **ACPX** — delegate sub-tasks to external CLIs (`claude`, `cursor-agent`, `codex`, `gemini`, `qwen-code`, plus 8 more) and relay output between them.\n- **Visual workflow designer** — design `.flw` flows once, run them unattended, schedule with Croner, watch them with FlowHypervisor.\n- **Self-aware** — a first-person self-knowledge map (`Tlamatini.md`) is injected into the LLM's prompt on every chain, so Tlamatini can answer accurately about her own architecture, runtime modes, ports, and pages. Builds packaged with `--self-modify` ship her own source tree (`TlamatiniSourceCode/`) so she can read, inspect, and modify herself.\n\nEverything runs locally. 
The whole app packages into a one-click Windows `.exe` distribution (Part [§7](#7-building-a-frozen-distribution)).\n\n- [First system-usage walkthrough](https://www.youtube.com/watch?v=CkvDPSd_c-g)\n- [Loading a complete project and summarizing its source code](https://www.youtube.com/watch?v=Lrpbt_dPIXw)\n- [Installing OpenCV end-to-end in Multi-Turn](https://www.youtube.com/watch?v=bBlqbZVK-Wk)\n- [Uninstalling Poco — Exec Report and matching flow](https://www.youtube.com/watch?v=E5vi0q5FxXQ)\n- [Implementing a FlowCreator-aided agentic flow](https://www.youtube.com/watch?v=3Pno6s4xVsE)\n- [A complete Cybersec enhancement with Tlamatini!!!](https://www.youtube.com/watch?v=4MyRXBahHuU&t=41s)\n\nThis is the fastest way to be productive: clone, install, run. No installer, no admin, no frozen build. Five minutes.\n\n| Requirement | Recommended | Notes |\n|---|---|---|\n| Python | 3.12.10 | The only version Tlamatini is fully tested on. |\n| OS | Windows 11 | Linux/macOS work for chat + designer; Mouser/Keyboarder are Windows-leaning. |\n| RAM | 16 GB+ | 32 GB comfortable for bigger embedding models. |\n| Disk | ~10 GB | Most is local LLM models. |\n| LLM server | Ollama | Default. Cloud Claude/Gemini also supported. 
|\n\nYou do **not** need administrator rights for any of the steps below.\n\nOpen PowerShell **normally** (do not Run as Administrator), then:\n\n```\n$env:OLLAMA_INSTALL_DIR = \"$env:LOCALAPPDATA\\Programs\\Ollama\"\nirm https://ollama.com/install.ps1 | iex\n```\n\nClose the window, open a fresh PowerShell, and verify:\n\n```\nollama --version\nollama serve     # leave running in its own window if it's not already up\nInvoke-WebRequest http://127.0.0.1:11434/api/tags -UseBasicParsing\n```\n\nTlamatini expects Ollama at `http://127.0.0.1:11434`.\n\n```\nollama pull Nomic-Embed-Text:latest\nollama pull glm-5:cloud\nollama pull qwen3.5:cloud\nollama pull gpt-oss:120b-cloud\nollama pull qwen3.5:397b-cloud\nollama pull llama3.2-vision:11b\n```\n\n| Tag | Used for |\n|---|---|\n| `Nomic-Embed-Text:latest` | RAG embeddings (default — small VRAM footprint, ~600 MB resident) |\n| `glm-5:cloud` | Default chat + Multi-Turn unified-agent + MCP file-search |\n| `qwen3.5:cloud` | Default vision (Image-Interpreter) |\n| `gpt-oss:120b-cloud` | Several workflow-agent templates (Monitor-Log, Notifier, Prompter, Summarizer, …) |\n| `qwen3.5:397b-cloud` | Default FlowCreator |\n| `llama3.2-vision:11b` | Local vision fallback |\n\nYou can substitute any tag — just edit `Tlamatini/agent/config.json` (see [§8.1](#81-llm-and-unified-agent)) or the relevant agent's `config.yaml`.\n\nOptional: swap to a higher-detail embedding model. If your retrieval quality on dense, technical corpora is not good enough with the default, you can switch to `qwen3-embedding:8b` from the Config → Models menu inside the app (or by editing `embeding-model` in `config.json` and reconnecting). Use with caution: `qwen3-embedding:8b` is roughly 10× heavier in VRAM than `Nomic-Embed-Text:latest` (~6.24 GB resident vs ~600 MB on a Q4_K_M quant) and will trip the embedding-memory pre-flight guard (see [§10](#10-embedding-memory-pre-flight-guard-gpu-hosts)) on 8 GB consumer GPUs. 
Pull it first with `ollama pull qwen3-embedding:8b`.\n\nFour of the six default model tags in [§2.3](#23-pull-the-default-models) carry a cloud suffix (`:cloud` or `-cloud`) — `glm-5:cloud`, `qwen3.5:cloud`, `gpt-oss:120b-cloud`, and `qwen3.5:397b-cloud`. Those are **Ollama Cloud** models: they live on Ollama's servers, not on your machine, and `ollama pull` only registers a stub that proxies inference to the cloud. Reaching that cloud requires a logged-in Ollama account and a subscription tier that allows the workload you intend to run.\n\nThe plan structure (prices are deliberately omitted from this README because they change — check **https://ollama.com/pricing** for the current numbers):\n\n| Plan | Cloud-model access | Why it matters for Tlamatini |\n|---|---|---|\n| Free | 1 cloud model concurrently, light usage. Local open-weights models are unlimited. | Enough to try a single cloud model for a one-shot chat. Not enough for Tlamatini's default config, which pins different cloud models for chat (`glm-5:cloud`), FlowCreator (`qwen3.5:397b-cloud`), several workflow agents (`gpt-oss:120b-cloud`), and vision (`qwen3.5:cloud`) — so a real Multi-Turn run typically needs 2–3 cloud models loaded at once. |\n| Pro | 3 concurrent cloud models, ~50× the Free monthly quota, access to the larger cloud-only models, ability to upload / share private models. | The realistic minimum for running Tlamatini out-of-the-box with its shipped cloud-model defaults — Multi-Turn + Exec Report + occasional Image-Interpreter calls. |\n| Max | 10 concurrent cloud models, ~5× the Pro quota, designed for sustained heavy agentic workloads. | Recommended for long-running ACPX relays, FlowHypervisor-supervised flows, and Croner-driven unattended runs that chain many cloud calls per hour. |\n\n**If you do not want to subscribe**, you can run Tlamatini entirely on local open-weights models. 
Edit `Tlamatini/agent/config.json` (`chained-model`, `unified_agent_model`, `mcp_file_search_model`, `flow_creator_model`, `image_interpreter_model`) and every agent `config.yaml` that names a `:cloud` tag, and swap them for a model you have pulled locally (for example, `llama3.1:8b`, `qwen2.5-coder:14b`, `mistral-nemo:12b`). Performance and quality will scale with your GPU/CPU — Multi-Turn and ACPX both work fine on a sufficiently large local model.\n\n**API keys are separate.** This subscription only governs `*:cloud` Ollama models. The ACPX runtime can additionally spawn external coding-agent CLIs that bring their own credentials (Anthropic API key for `claude`, OpenAI key for `codex`, Google key for `gemini`, etc.) — those are configured in `Tlamatini/agent/config.json` under `acpx.agents.<id>.env` and are unaffected by your Ollama plan. See [§5.6](#56-api-key-setup-the-easy-button) for the easy-button setup. (Unreal MCP is *not* part of ACPX — it's its own MCP surface, documented in [§6](#6-unreal-mcp--driving-unreal-engine-5-from-tlamatini).)\n\n```\ngit clone https://github.com/XAIHT/Tlamatini.git\ncd Tlamatini\n\npython -m venv venv\n# Windows:\nvenv\\Scripts\\activate\n# Linux/macOS:\nsource venv/bin/activate\n\npip install -r requirements.txt\n\npython Tlamatini/manage.py migrate\npython Tlamatini/manage.py createsuperuser\npython Tlamatini/manage.py collectstatic --noinput\npython Tlamatini/manage.py runserver --noreload\n```\n\n`--noreload` is important: Daphne's auto-reloader does not coexist well with the wrapped-runtime subprocess pool.\n\nThe console title becomes `Tlamatini` and stdout/stderr are tee'd into `Tlamatini/tlamatini.log` (truncated on every start). When debugging, `tlamatini.log` is the first thing to read.\n\nOpen `http://127.0.0.1:8000/` and log in with the superuser you just created. 
Then:\n\n- `/agent/` — the chat (Part [§3](#3-using-the-chat-agent))\n- `/agentic_control_panel/` — the visual designer (Part [§4](#4-visual-workflow-designer-agentic_control_panel))\n- `/admin/` — Django admin (change passwords, manage users)\n\nIf you used the installer (Part [§7](#7-building-a-frozen-distribution)) instead of cloning, the default credentials are `user` / `changeme`. Change them at first login via `/admin/`.\n\n```\n┌───────────────────────────────────────────────────────────────────────────────┐\n│ Tlamatini  [Context ▼] [Open in… ▼] [MCPs ▼] [Tools ▼] [Agents ▼] [Config ▼] [DB ▼] │ ← top nav\n├───────────────────────────────────────────────────────────────────────────────┤\n│  Multi-Turn ☐   Exec Report ☐   ACPX ☐   internet ☐    Clear ⌫               │ ← four toggles\n├───────────────────────────────────────────────────────────────────────────────┤\n│  ┌──── chat ────────────────┐   ┌──── code canvas ────────────────┐          │\n│  │  conversation history    │   │  syntax-highlighted, with copy  │          │\n│  └──────────────────────────┘   └─────────────────────────────────┘          │\n├───────────────────────────────────────────────────────────────────────────────┤\n│  Type your prompt here…                                              [Send]   │\n└───────────────────────────────────────────────────────────────────────────────┘\n```\n\nThe **four toolbar toggles** are independent. Tick whatever combination fits the task — each one is its own tutorial section below.\n\nNewer builds also expose a **Config** dropdown in the same navbar. `Config -> Models` edits the most common model-name fields, and `Config -> URLs` edits the Ollama / unified-agent / MCP endpoint values through validated dialogs instead of hand-editing JSON. 
The chat/canvas divider was also tightened so resizing the right-hand canvas feels more predictable during long editing sessions.\n\nThe newest entry in that navbar is the **DB** dropdown: `DB -> Backup database` snapshots the live SQLite file to a directory you pick, and `DB -> Set DB` stages a `db.sqlite3` file of your choice for the **next session** — Tlamatini swaps it in before Django opens the database, archives the previous one under `DB/Older/<timestamp>/`, then continues normal start-up. Full walkthrough in [§3.10](#310-the-db-menu--backup-set-db-and-the-start-up-swap-in).\n\nClick **Context** in the top nav:\n\n| Menu entry | What it does |\n|---|---|\n| Set directory as context | Loads a folder. Tlamatini reads every text file, splits, embeds, builds FAISS+BM25, grounds answers in your code. |\n| Set file as context | Single-file scope. |\n| Set canvas as context | Use the code currently shown in the canvas (handy for iterative editing). |\n| Clear context | Drops the loaded context. |\n\n**Set directory as context** now loads a project at **any depth** under the app root. The old browser `showDirectoryPicker()` only exposed the leaf folder name, so deeply-nested projects could not be reached; it was replaced by a backend native Win32 folder picker (`views.pick_context_directory_view`, route `pick_context_directory/`) that returns the real absolute path. `path_guard.is_within_application_root()` then accepts the application root or any descendant of it, and `agent_page_init.js` falls back to manual path entry on non-Windows hosts.\n\nA green banner at the top shows the current context path. If embedding runs out of memory, Tlamatini packs the source files as a fallback context — retrieval quality drops, access to your code does not.\n\nIf you refresh the browser and Tlamatini restores a saved context automatically, the input now stays disabled until the contextual RAG chain has actually finished rebuilding. 
That closes the old \"restored banner arrived before the context was really ready\" race on the first load stage.\n\nLeave every checkbox unticked. Type:\n\n\"Write a Python function that validates an email address with a regex. Just the function.\"\n\nThe bot answers in one shot. Code lands in the right-hand canvas with copy/save buttons. This is the legacy chat path — fast, no tools, no internet.\n\nTick **internet** when the question genuinely needs fresh web data:\n\n\"What is the latest stable version of FastAPI right now?\"\n\nTlamatini classifies the prompt with a small LLM call (\"does this need the web?\"), then DuckDuckGo-searches, summarizes the top results, and inlines the summary into the LLM's context. Leave it **unticked** for everything else (the round-trip adds latency).\n\nThis is the big one. Multi-Turn turns Tlamatini from *answerer* into **operator**:\n\n- The planner picks the relevant subset of Tlamatini's **75 Multi-Turn tools** — 20 core Python tools (`execute_command`, `agent_starter`, `googler`, the image-analysis pair, the `chat_agent_run_*` lifecycle helpers, …), 43 wrapped chat-agent tools, and 12 ACPX/Skill tools — binding at most `max_selected_tools` per request (default cap: **20**).\n- The unified-agent loop runs **up to 4096 iterations** (the `unified_agent_max_iterations` default) — call tool, see result, decide next, chain.\n- Wrapped sub-agents run in headless background runtimes (no console pop-ups).\n\n**Try this:** tick **Multi-Turn**, send\n\n\"Take a screenshot of my desktop and save it to `C:\\Tlamatini-test\\shot.png`.\"\n\nWatch the chat. The LLM picks `chat_agent_shoter`, calls it with the right args, reads the JSON result, and replies \"Done — saved to C:\\Tlamatini-test\\shot.png.\" Open the file. The screenshot is there.\n\n| Symptom | Fix |\n|---|---|\n| LLM says \"Tool X is not available\" | The planner did not bind it. 
Check `[Planner._select]` console lines; add matching keywords to your prompt or raise `max_selected_tools`. |\n| Same tool fired twice with identical args | Suppressed by the dedup guard — the second call returns \"skipped — duplicate\". |\n| 4096 iterations exhausted | You probably hit a polling loop. Use `chat_agent_sleeper` instead of busy-polling. |\n\nMulti-Turn stacks with Set-Context: the LLM reasons over your code *and* runs tools on the result.\n\nBelow the prose answer, Tlamatini appends **per-agent execution tables** — one HTML table per *kind* of state-changing agent that fired. Each row = one real tool call + ✓/✗.\n\nTick **Multi-Turn + Exec Report** and send:\n\n\"Create `C:\\test\\hello.txt` with `Hi from Tlamatini`, then read it back and tell me its size.\"\n\nAfter the prose, you see:\n\n```\n─── List of File Creator Operations ───\n #  │ Command                                        │ ✓/✗\n 1  │ filepath='C:\\test\\hello.txt' content='Hi …'    │  ✓\n─── List of Executer Operations ───\n #  │ Command                                        │ ✓/✗\n 1  │ type C:\\test\\hello.txt                         │  ✓\n```\n\nWhat gets a table: state-changing tools only (`execute_command`, `execute_file`, `unzip_file`, `decompile_java`, every `chat_agent_*` that touches the system, all five `acp_*` lifecycle tools — merged into one \"List of ACPx Operations\" — and `invoke_skill`). Read-only tools (Crawler, Googler, Prompter, Summarizer, File-Interpreter/Extractor, Image-Interpreter, Shoter, Sleeper, monitor_*, run_*, `window_present`) are intentionally absent. **Tables persist into chat history** — reload the page and they are still there.\n\nACPX lets the chat **delegate** to external coding-agent CLIs running on your box. 
Picture it:\n\n```\nYou ─► Tlamatini chat ─► acp_doctor → acp_spawn(claude) → acp_send_and_wait\n                                  │\n                                  ▼ subprocess.Popen\n                                claude CLI / gemini / cursor / codex / qwen / …\n```\n\nWhen **ACPX is ticked**, the planner sees the 12 ACPX/Skill tools. When **unticked**, those tools are filtered out — the chat behaves like legacy Multi-Turn. (Implemented in `agent/acpx/__init__.py::filter_acpx_tools()`.)\n\n**Prereq:** at least one external CLI on `PATH`. The simplest:\n\n```\nnpm install -g @anthropic-ai/claude-code\nclaude --version\n```\n\nThen drop your key in `Tlamatini/agent/config.json` (or use the [`setup-new-acpx-key` skill](#56-api-key-setup-the-easy-button) — much easier).\n\nTick **Multi-Turn + ACPX + Exec Report** and send:\n\n\"Use ACPX to spawn the claude CLI in `C:/Development/Tlamatini`, ask it to summarize CLAUDE.md in 5 bullet points, harvest the answer, and kill the session.\"\n\nYou see: `acp_doctor` (always first) → `acp_spawn(agent_id=\"claude\", task=…)` → `acp_send_and_wait` → `acp_kill`. The 5 bullets land in the prose, and the Exec Report shows a \"List of ACPx Operations\" table with all four rows.\n\nACPX deep dive in Part [§5](#5-acpx--external-coding-agent-clis-as-tools).\n\nWhen a Multi-Turn run **succeeds** and used at least one state-changing tool, Tlamatini renders a **Create Flow** button on the message header. Click → download a `.flw` JSON file mirroring the exact tool sequence, laid out left-to-right, ready to load in the visual designer:\n\n```\nStarter ─► Crawler ─► File Creator ─► Ender\n```\n\nYou can re-open it in `/agentic_control_panel/` and run it as an unattended workflow. 
The LLM is no longer in the loop.\n\nThe button gates on four conditions: Multi-Turn was on, ≥1 mappable tool succeeded, an LLM-based classifier marked the answer SUCCESS (fails open on internal error), and the user is logged in.\n\nOlder Chat-created `.flw` files were generated almost entirely in the browser. That worked for simple chains, but it meant the browser had to remember many backend facts:\n\n- what each agent is called on disk;\n- which config field means \"my input\";\n- which config field means \"my output\";\n- which agents are special, like Ender or Parametrizer;\n- which values are safe to save into a portable `.flw` file.\n\nThat is a lot of responsibility for a button.\n\nNow the browser still builds the first draft, but the backend normalizes it through the **Agent Contract Registry** before the file is downloaded. In plain English: Tlamatini checks the flow against the same agent rules that ACP uses to run it.\n\nWhat this means for you:\n\n- Repeated tools stay repeated. If Multi-Turn ran Executer five times, the flow contains five Executer nodes, not one overwritten node.\n- Agent names are normalized. Names like `Kyber-KeyGen`, `kyber_keygen`, and `Kyber Keygen` are resolved to the right template.\n- Secrets are protected where the contract knows about them. Remote chat super-agents such as TeleTlamatini and WhatsTlamatini have credential-like fields redacted on export.\n- The `.flw` file stays portable. It does not store `C:/Development/...` or the installed app path beside `Tlamatini.exe`.\n\nIf backend normalization is temporarily unavailable, the old browser generator remains as a fallback so the button does not become useless.\n\nThe whole of Tlamatini — chat history, agents, Tool/MCP toggles, sessions, your user — lives in a single SQLite file. 
The **DB** dropdown gives you a safe, GUI-first way to handle that file: a read-only **Backup** path, a destructive-but-deferred **Set DB** path, and a built-in audit trail under `DB/Older/`.\n\n`DB -> Backup database` opens a dialog with one input — the **target directory**. The path is live-validated (350 ms debounce): the page asks `GET /agent/check_backup_directory/?path=…` as you type and colors the status line green / amber / red:\n\n| State | Status | Meaning |\n|---|---|---|\n| 🟢 | `Directory exists. db.sqlite3 will be saved here.` | Ready to back up. |\n| 🟠 | `A filename was specified — please specify the directory only.` | You typed a file path; the output is always named `db.sqlite3` so it can be loaded back later. |\n| 🔴 | `Directory does not exist.` | Missing on disk. |\n\nClick **Backup** → Tlamatini calls `POST /agent/backup_db/`, resolves the live database path via `settings.DATABASES['default']['NAME']` (so source / frozen behave identically), and copies it with `shutil.copy2` to `<your-dir>/db.sqlite3`. The live database stays open and unchanged.\n\n`DB -> Set DB` goes the opposite direction: it replaces the database on the **next start-up**. Same dialog idiom, stricter validation. The input is the **full path to a db.sqlite3 file**; the page asks `GET /agent/check_set_db_file/?path=…` as you type:\n\n| State | Status | Meaning |\n|---|---|---|\n| 🟢 | `File exists. It will be loaded on the next start-up.` | Real `db.sqlite3` with a valid SQLite header. |\n| 🟠 | `File found, but its name is not \"db.sqlite3\". Tlamatini will still stage it as db.sqlite3.` | Snapshot-style names (`db_2026-05-14.sqlite3`) work — the staging step renames. |\n| 🟠 | `Specify the full path to a db.sqlite3 file, not a directory.` | You typed a directory; Set DB needs a file. |\n| 🔴 | `The selected file does not look like a SQLite database.` | First 16 bytes don't match the `SQLite format 3\\x00` magic. |\n| 🔴 | `File does not exist.` | Missing on disk. 
|\n\nClick **Set** → `POST /agent/set_db/` copies your file into `<base>/DB/ToLoad/db.sqlite3` (where `<base>` is the executable directory in frozen mode, the inner Django project directory in source mode). The live database is **not** touched — SQLite is held open by Django, so the actual replacement must wait for a process restart.\n\nImmediately after staging succeeds, the dialog is replaced by a **yellow ⚠ warning panel** with a single **OK** button:\n\nThe selected database will be loaded the next time Tlamatini starts. If you want it loaded immediately, you must restart Tlamatini completely so the swap-in can run BEFORE Django opens the live database.\n\nIf you click **Cancel** instead of **Set**, the staging dialog closes and nothing is written.\n\nThe actual replacement lives at the very top of `Tlamatini/manage.py`, in `_apply_pending_db_swap()`. It runs **before any Django import** so Django's SQLite connection pool is never holding a stale file descriptor at the moment of the swap:\n\n```\nmanage.py main()\n    │\n    ▼\n_apply_pending_db_swap()\n    │\n    ▼\n[ DB/ToLoad/db.sqlite3 exists? ]\n    │\n    ├─ NO  ──► return (no-op, normal start-up continues)\n    │\n    └─ YES ──► [1] mkdir DB/Older/<YYYY-MM-DD_HHMMSS>/\n               [2] shutil.move(live db.sqlite3 → Older/<timestamp>/db.sqlite3)\n               [3] shutil.move(DB/ToLoad/db.sqlite3 → live db.sqlite3)\n               [4] return\n    │\n    ▼\nfrom django.core.management import execute_from_command_line   ← only NOW Django wakes up\n```\n\nThree guarantees:\n\n- **A Reconnect from the navbar is NOT enough.** The swap window is only open *before* the Django process opens its SQLite pool. You must **fully restart Tlamatini** (close the console / kill the exe, then launch again).\n- **Atomic moves, no copies.** Both legs use `shutil.move` (filesystem rename when possible, copy+delete across mounts). A second launch with `DB/ToLoad/` empty is automatically a no-op — no \"stuck flag\" to clear.\n- **Mode-correct path resolution.** Frozen mode reads `<exe_dir>/DB/ToLoad/db.sqlite3` (where you can browse to it in Explorer); source mode reads `<repo>/Tlamatini/DB/ToLoad/db.sqlite3` (next to `manage.py`). The live `db.sqlite3` path is computed the same way Django does — `_MEIPASS/db.sqlite3` under PyInstaller, `<manage.py dir>/db.sqlite3` in source — so the swap-in always writes to exactly the path Django will open.\n\nIf anything fails inside the swap-in (locked file on Windows, corrupt source, permission error), the function catches the exception, prints `--- [DB SWAP] Skipped due to error: …` to `tlamatini.log`, and lets Tlamatini start normally with the previous database. **A bad ToLoad file must never lock you out of your own database.**\n\nEvery successful swap-in leaves a complete record under `<base>/DB/Older/<YYYY-MM-DD_HHMMSS>/db.sqlite3`. Because Set DB *moves* (not copies) the prior live database, this archive is the only built-in recovery path:\n\n```\nDB/\n├─ ToLoad/                 ← empty most of the time; momentary home of next-session pick\n│   └─ README.md\n└─ Older/\n    ├─ 2026-05-14_153022/db.sqlite3   ← was live before swap #1\n    ├─ 2026-05-14_164410/db.sqlite3   ← was live before swap #2\n    └─ README.md\n```\n\nTo roll back, drop the archived `db.sqlite3` back into `ToLoad/` and restart — the swap-in will archive the **current** live database under a fresh timestamp and promote your roll-back pick. 
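The swap-in and archive steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's code: the function name `apply_pending_db_swap_sketch`, its two parameters, and the boolean return value are hypothetical, and the real `_apply_pending_db_swap()` in `manage.py` may be organised differently.

```python
import os
import shutil
from datetime import datetime


def apply_pending_db_swap_sketch(base_dir, live_db):
    '''Promote DB/ToLoad/db.sqlite3 to live, archiving the old file first.

    Hypothetical helper: returns True when a swap happened and False on
    the no-op and error paths, so start-up can always continue.
    '''
    staged = os.path.join(base_dir, 'DB', 'ToLoad', 'db.sqlite3')
    if not os.path.isfile(staged):
        return False  # nothing staged: normal start-up continues
    try:
        # [1] timestamped archive directory under DB/Older/
        stamp = datetime.now().strftime('%Y-%m-%d_%H%M%S')
        archive_dir = os.path.join(base_dir, 'DB', 'Older', stamp)
        os.makedirs(archive_dir, exist_ok=True)
        # [2] move the current live database into the archive
        if os.path.isfile(live_db):
            shutil.move(live_db, os.path.join(archive_dir, 'db.sqlite3'))
        # [3] promote the staged file to the live path
        shutil.move(staged, live_db)
        return True
    except OSError as exc:
        # A failed swap must not block start-up; report and carry on.
        print('--- [DB SWAP] Skipped due to error:', exc)
        return False
```

Rolling back follows the same path: copy an archived file into `DB/ToLoad/db.sqlite3` and let the function run again before Django is imported.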
Tlamatini never auto-deletes anything from `Older/`\n\n; prune by hand when the tree gets noisy, but remember each file is a full snapshot of chat history + agents + sessions + your user.\n\nBoth directories must exist on day one (the swap-in opens them with `os.makedirs(exist_ok=True)`\n\n, but having them pre-seeded with docs prevents user confusion):\n\n**Source / dev mode**:`Tlamatini/Tlamatini/DB/{ToLoad,Older}/README.md`\n\nare checked into the repo. The README files are the \"git keepers\" — without them, git would silently drop the empty directories.**Frozen mode**:`build.py`\n\nextends its`empty_dirs`\n\ntuple with`\"DB/ToLoad\"`\n\nand`\"DB/Older\"`\n\n. The PyInstaller post-build step creates both under`dist/manage/`\n\n, the`pkg.zip`\n\npackager preserves them via explicit zip entries, and end-users get the tree from the very first launch.\n\nTlamatini ships with **24 skills** — markdown SKILL.md packages under `agent/skills_pkg/`\n\nthat the LLM can invoke through `invoke_skill('<name>', '{...args...}')`\n\n. They cover everything from the canonical `acp-router`\n\n(pick the right external CLI for an intent) and `summarize`\n\n(compress text faithfully) to `setup-new-acpx-key`\n\n, `skill-creator`\n\n, `code-review`\n\n(senior-engineer git-diff review with an APPROVE/REQUEST_CHANGES verdict), `security-audit`\n\n(multi-scanner SAST/secret/dependency sweep) and `kali-pentest`\n\n(an authorized Kali Linux assessment runbook that drives the Kalier agent / MCP-Kali-Server), the `tlamatini_*`\n\naudit/lint/refactor helpers, and integration stubs for GitHub / Notion / Slack / Gmail / Jira / Todoist / Trello / Weather.\n\nBefore 2026-05-17 the only way to interact with them was through the LLM (`list_skills`\n\nto enumerate, `invoke_skill`\n\nto run). The **ACPX-Skills** navbar dropdown — added next to **Agents** and **Config** in the chat toolbar — gives you an operator-grade admin surface that does NOT require the LLM. 
Four entries:\n\nOpens a two-pane modal: a left-side list of all 24 skills (with a green/red dot for enabled / disabled and a runtime tag) and a right-side detail pane that shows the selected skill's full identity — description, runtime (in-process / acpx), `acpx_agent`\n\nif any, budgets (max_iterations · max_seconds · max_tokens), trigger keywords, `requires_tools`\n\nand `requires_mcps`\n\n, inputs and outputs (with required-field markers), and the full markdown body. A search box at the top filters by name or description as you type. Pure read — nothing is written back.\n\nBacked by `GET /agent/skills/`\n\n(list payload) and `GET /agent/skills/<name>/`\n\n(deep detail). Use it when you want to know what a skill *actually does* before you ask the LLM to call it, or when you've just authored a new SKILL.md and want to confirm it parsed correctly.\n\nA checkbox grid — one row per skill — that mirrors the existing **MCPs** and **Agents** dialogs exactly. Toggle a checkbox off, click **Continue**, and the row's `Skill.enabled`\n\nflips to `false`\n\n. Two consequences immediately:\n\n`list_skills`\n\n(the LLM's enumeration tool) filters that skill out of its returned array.`invoke_skill('<name>', ...)`\n\nreturns`{\"ok\": false, \"code\": \"SKILL_DISABLED\"}`\n\ninstead of running.\n\nToggling back to enabled restores the skill. This is the right knob when (a) you want to hide an unfinished skill from the planner, (b) you don't have the API key for an integration skill (e.g. `notion`\n\nwithout `NOTION_TOKEN`\n\n) and don't want the LLM to keep trying, or (c) you're running a demo and want a minimal tool surface.\n\nThe toggle goes over the same WebSocket channel as `set-mcps`\n\n/ `set-tools`\n\n/ `set-agents`\n\n— payload encoding `name=description=true/false,name=description=true/false,...`\n\n. 
Backend handler is in `consumers.py::receive`\n\nand calls `save_skill(name, enabled)`\n\nwhich touches only the `enabled`\n\ncolumn.\n\nA cross-check report that catches drift between the skill catalog and the rest of the system. Sections:\n\n**Missing tool dependencies**— for each skill whose`requires_tools`\n\nlists a tool that's currently**disabled** in the Tools dialog, lists the skill + the unmet tools. (A disabled tool means the skill*would*fail at runtime — Diagnostics surfaces it before the LLM tries.)**Missing MCP dependencies**— same idea against disabled`Mcp`\n\nrows.**Unknown ACPX agents**— for skills with`runtime: acpx`\n\n, flags any`acpx_agent`\n\nvalue that isn't in the`AcpAgent`\n\ntable (typo, removed CLI, etc.).**Orphan DB rows**—`Skill`\n\nrows whose SKILL.md file no longer exists on disk. Usually a sign that someone deleted a skill directory without running Reload.\n\nEach section is collapsed when clean (✓ green) and expanded with red ⚠ counts when something's wrong. Run it after editing SKILL.md files or after toggling tools/MCPs to confirm nothing is silently broken. Backed by `GET /agent/skills/_/diagnostics/`\n\n— pure read, no writes.\n\nA single-click button that re-runs the registry boot pipeline: rescan `agent/skills_pkg/`\n\n, refresh every `Skill`\n\nDB row's metadata (description, runtime, frontmatter_json, body_sha256), prune any DB row whose SKILL.md is gone. The user-toggled `enabled`\n\nfield is preserved across reload.\n\nUse this after you've authored or edited a SKILL.md on disk — no server restart needed. The success toast tells you the new skill count.\n\nBy design, the `Skill`\n\nDB table stays at \"enumeration + enable/disable\" only, exactly the way the `Tool`\n\nand `Mcp`\n\ntables work. 
Per-skill **permissions** (filesystem read/write globs, allowed shell commands, network deny/allow), **budgets** (max_iterations / max_seconds / max_tokens), and the skill's **body** all live in the SKILL.md frontmatter on disk and are the only source of truth. The admin UI deliberately does NOT let you override them from the browser — if you want to change a permission, edit the SKILL.md and click Reload. This keeps `git diff`\n\nhonest: every behavioural change to a skill shows up in a file, not in a database row that the next backup would silently archive.\n\n- HTTP endpoints:\n`agent/views.py`\n\n(`list_skills_view`\n\n,`skill_detail_view`\n\n,`reload_skills_view`\n\n,`skills_diagnostics_view`\n\n) — wired in`agent/urls.py`\n\n. - WebSocket toggle:\n`agent/consumers.py::receive`\n\n(`set-skills`\n\nbranch) →`save_skill(name, enabled)`\n\n. The connect path also calls`skill_establishment()`\n\nfor every row so the frontend's`skills = []`\n\ncache hydrates on session start, mirroring how`tools[]`\n\nand`agents[]`\n\nhydrate. - Tool-surface gating:\n`agent/acpx/tools.py::_disabled_skill_names()`\n\n— fails open on DB exception so a broken admin layer never silently hides skills. - Frontend dialogs:\n`agent/static/agent/js/skills_dialog.js`\n\n(the Configure / Browse / Diagnostics / Reload dialogs) +`agent/static/agent/css/skills_dialog.css`\n\n. - Coverage: 14 tests in\n`agent/tests.py`\n\n(`SkillsAdminEndpointTests`\n\n,`SkillsToolSurfaceGatingTests`\n\n,`SkillsNavbarTemplateContractTests`\n\n).\n\n**Windower** is the desktop **window manager** of the chat surface — the third member of the desktop-UI trio: where **Mouser** clicks *inside* a window and **Keyboarder** types *into* one, Windower commands the **window itself**. 
It is implemented self-contained on the Win32 API (pywin32 `win32gui` / `win32con` / `win32process` + `ctypes`), porting the window-management subset of [Windows-MCP](https://github.com/CursorTouch/Windows-MCP) — including the cross-process `AttachThreadInput` focus-transfer dance that lets a background process reliably raise a window to the foreground. It is **Windows-only** and **state-changing**, so it appears in the Exec Report.\n\nTick **only the Multi-Turn** checkbox (Windower is a normal Multi-Turn tool — it is **not** behind the ACPX/Skill surface, so the ACPX checkbox is *not* required). Then ask, for example:\n\n\"Open Notepad, bring it to the front and maximize it, then tell me its size.\"\n\nTlamatini will launch the app (`chat_agent_executer`), confirm it is up (`chat_agent_window_present`), then call **`chat_agent_windower`** with `action='maximize'` and `window_title='Notepad'` — and you will **watch the window come to the foreground and fill the screen**. The tool promotes its result fields (`action`, `window_title`, `matched`, `match_count`, `state`, `left`, `top`, `width`, `height`) to the top level of its JSON return, so the answer can report the live geometry without parsing logs.\n\n`action` can be any of: `list` (enumerate every open window with its geometry + state), `focus`, `minimize`, `maximize`, `restore`, `move`, `resize`, `move_resize`, `close` (by title), `topmost` / `untopmost` (always-on-top), or `arrange` (snap/tile to left/right/top/bottom halves, the four quadrants, center, or full). Matching is by `match_mode` ∈ `substring` (default) / `exact` / `regex`, with `match_index` to pick among same-titled windows. 
**Use Windower — not Mouser — whenever the goal is the window as a whole** (bring to front, tile, resize, close by title).\n\nTwo ready-made showcases live in the chat **Prompts** dropdown (the Catalog of Prompts): **WINDOW SPOTLIGHT** (basic — maximize + list + close) and **WINDOW CHOREOGRAPHY** (medium — restore → tile left → tile right → top-left quadrant → move/resize → list → close, so a single window visibly dances around the screen). Pick one, send, and watch.\n\n**Playwrighter** drives a **real browser** (Playwright — Chromium / Firefox / WebKit) through a scripted, interactive, stateful flow. Where **Crawler** does a one-shot static fetch and the `googler` tool only searches, Playwrighter clicks, fills forms, waits for elements, extracts text/attributes, screenshots, asserts, and downloads — so it can log into a site, submit a multi-step form, click through a wizard, or scrape a JavaScript-rendered single-page-app behind a login. It needs Playwright installed (`pip install playwright && playwright install`); set **`headless=false`** to *watch* it drive, and **`hold_open_seconds=N`** (alias `hold_open_ms`) to keep the browser visible for N seconds *after* the last step and *before* it closes — that's the \"wait a few seconds before closing so I can see it\" knob; just ask Tlamatini to wait and it passes it for you.\n\nTick **only the Multi-Turn** checkbox (Playwrighter, like Windower, is a normal Multi-Turn tool — ACPX is not required). 
Then ask, for example:\n\n\"Open Wikipedia in a visible browser, search for ‘Nahuatl’, and tell me the first paragraph of the article.\"\n\nTlamatini calls **`chat_agent_playwrighter`** with `start_url`, `headless='false'` (and `hold_open_seconds` if you asked it to keep the browser open), and the whole script as a single JSON string in **`steps_json`** (the flat `key=value` request grammar cannot express a list-of-dicts, so the script is passed as JSON and the agent `json.loads` it). Each step is `{\"action\": <verb>, ...}`; supported verbs are `goto`, `click`, `dblclick`, `fill`, `type`, `press`, `select`, `check` / `uncheck`, `wait_for`, `wait`, `extract_text`, `extract_attr`, `screenshot`, `assert_visible`, `assert_text`, and `download`. The run reports `status`, `final_url`, `steps_run`, `assert_result`, and the extracted values, so a downstream step (or a Forker, on the canvas) can branch on the verdict.\n\nTwo ready-made showcases live in the **Prompts** dropdown: **BROWSER SPOTLIGHT** (basic — open `example.com` with a visible browser, extract the heading, assert the link, screenshot) and **BROWSER WIZARD** (medium — a visible multi-step Wikipedia search: fill → click → wait → extract → assert → screenshot). The canvas counterpart is the visual **Playwrighter** node (see §4 and §9.5); the YAML `steps` list is its authoring form.\n\n**Kalier** bridges Tlamatini to **Kali Linux** offensive-security tooling through the [MCP-Kali-Server](https://www.kali.org/tools/mcp-kali-server/). That project runs a small Flask **API server** (`server.py`) on the Kali box exposing `/api/command`, `/api/tools/<tool>` and `/health`; Kalier talks to it directly over HTTP (Python-stdlib `urllib`, no extra packages in the agent pool), so it is the canonical tool for **AI-assisted penetration testing, recon, and CTF solving**. 
It is **state-changing**, so it appears in the Exec Report.\n\nFirst, get the MCP-Kali-Server (`server.py`) running on your Kali machine and reachable from Tlamatini. **Tlamatini is the embedded client** — you no longer need Claude Desktop's `client.py`; instead you set the Kali box URL **once** in **Config ▸ URLs → Kali server (Kalier)** (the `kali_server_url` key in `config.json`, default `http://127.0.0.1:5000`). That default already works when Kali runs in WSL2 with localhost forwarding or when you SSH-tunnel the port (`ssh -L 5000:localhost:5000 user@KALI_IP`); for a LAN Kali box set it to `http://<KALI_LAN_IP>:5000`. (See [`Tlamatini-Kali-Setup.md`](/XAIHT/Tlamatini/blob/main/Tlamatini-Kali-Setup.md) for the full zero-client walkthrough.)\n\n**Authorized targets only** — run Kalier solely against systems you own or are explicitly authorized to test (engagement, lab, CTF).\n\nTick **only the Multi-Turn** checkbox (Kalier is a normal Multi-Turn tool — not behind the ACPX/Skill surface). Then ask, for example:\n\n\"Scan 10.0.0.5 with an nmap -sCV on ports 1-1000 and summarize the open services.\"\n\nTlamatini calls **`chat_agent_kalier`** with `action='nmap'`, `target='10.0.0.5'`, `scan_type='-sCV'`, `ports='1-1000'` — and **auto-injects your configured `server_url`** from `kali_server_url`, so you never repeat the Kali box address in a prompt (the LLM only passes `server_url=` explicitly to hit a different one-off box). The `action` field selects the capability: `command` (any shell command on the Kali box), `nmap`, `gobuster`, `dirb`, `nikto`, `sqlmap`, `metasploit`, `hydra`, `john`, `wpscan`, `enum4linux`, or `health` (probe the server and which tools are installed — a good first call when you are unsure the API is reachable). 
The tool returns the Kali tool's stdout/stderr verbatim and captures an `INI_SECTION_KALIER` block (`action`, `endpoint`, `subject`, `return_code`, `success`, `timed_out`, `server_url`) for the Exec Report and Parametrizer.\n\nOn the canvas the same capability is the visual **Kalier** node (see §4 and §9.5): chain `Starter → Kalier (nmap) → Parametrizer → Kalier (gobuster) → Forker → Ender` to build a fully unattended, branch-on-result assessment pipeline. The visual node and the chat tool share the same MCP-Kali-Server contract.\n\n**STM32er** bridges Tlamatini to STM32 microcontroller firmware development through the [STM32 Template Project MCP](https://github.com/XAIHT/STM32TemplateProjectMCP) — a FastMCP server that exposes project scaffolding, build, flash, serial / SWD observation, and reset. It is **state-changing** (it compiles firmware and writes to hardware), so it appears in the Exec Report. The visual canvas counterpart is the **STM32er** node (see §4 and §9.5); both surfaces share the same MCP contract.\n\n**Zero-config auto-bootstrap — you only install STM32CubeIDE + Tlamatini.** With no on-disk `server_script` configured (the new default), STM32er **downloads the MCP** from its git repo (a shallow `git clone`, or a GitHub-zip fallback when `git` is absent) into a per-user cache (`%LOCALAPPDATA%/Tlamatini/STM32TemplateProjectMCP`), **pip-installs its deps** (`mcp` + `pyserial`) if they are missing, and **validates** the server — all on first use. Nothing to clone by hand, no path to set. 
A `bootstrap` action triggers this explicitly; `auto_bootstrap` (default `true`) does it lazily before the first real action.\n\n⚠️ **Critical-mission safety preflight.** Before it compiles or flashes anything, STM32er runs a `validate` preflight that checks the environment: `arm-none-eabi-gcc`, STM32CubeIDE, `make` / `cmake`, `STM32_Programmer_CLI`, the ST-LINK USB driver, **a positively-confirmed connected ST-LINK probe**, and the target device family. If the environment is wrong — or the request would target the wrong STM32 family — STM32er **refuses** rather than producing or flashing mis-targeted firmware. Hardware is **conditional**: compile-only actions (`build`, `list_artifacts`, `clean`, `create_project`, `write_source`) need **no board**; hardware actions (`flash`, `erase`, `reset`, `serial_*`, SWD reads, `live_*`) **require** a connected ST-LINK. The bundled MCP template is **STM32F407VG-specific**, so a cross-`STM32F`-family device mismatch is refused (a multi-family MCP fork is future work).\n\nTick **only the Multi-Turn** checkbox (STM32er is a normal Multi-Turn tool — not behind the ACPX/Skill surface). Then ask, for example:\n\n\"Scaffold a blinky project for the STM32F407, build it, and flash it to the connected board.\"\n\nTlamatini calls **`chat_agent_stm32er`** — bootstrapping the MCP if needed, running the safety preflight, then driving the build and flash. The `action` field selects one of **23 MCP tools** (project scaffold / build / flash / erase / reset / serial / SWD / observe / …), **2 composites** (`serial_session`, `live_monitor`), or **2 meta-actions** (`bootstrap`, `validate`). 
The tool captures an `INI_SECTION_STM32ER` block for the Exec Report and Parametrizer.\n\nOn the canvas the same capability is the visual **STM32er** node: chain `Starter → STM32er (validate) → Forker → STM32er (build) → STM32er (flash) → STM32er (serial_session) → Ender` to build a fully unattended, validate-gated firmware pipeline. Three ready-made [catalog demos](#35-tutorial-the-multi-turn-toggle) ship in migration `0103`: **STM32 GENESIS** (bootstrap + validate + compile, no board needed), **STM32 BLINKY** (validate + build + flash), and **STM32 HIL OBSERVATORY** (a validate-gated real-hardware flash + SWD + serial + reset). The zero-config end-to-end path (download → build → flash → reset) is verified on a real **STM32F407G-DISC1**, with 122 automated tests in `agent/test_stm32er_agent.py`.\n\n📟 **Serial output on a Discovery board needs a wire — the VCP is not bridged to the MCU's UART.** This matters specifically for the **3rd STM32er demo, STM32 HIL OBSERVATORY**, whose `serial_session` step reads the firmware's `BOOT tlamatini hil count=…` banner. The on-board ST-LINK on the **STM32F4-Discovery family (including the STM32F407G-DISC1)** provides debug (SWD) but — unlike ST **Nucleo** boards — does **not** internally route its USB **Virtual COM Port** to any of the target STM32F407's USART pins. A firmware that prints over **USART2 (PA2 = TX, PA3 = RX)** therefore shows **nothing** on the ST-LINK VCP regardless of baud or timeout: on the PCB those bytes have nowhere to go. To actually read that stream you must **bridge the port yourself with an external USB-to-UART (USB-TTL) adapter** — cross-wire adapter **RX ← PA2**, adapter **TX → PA3**, **GND ↔ GND** — and point `serial_session` at **that adapter's** COM port (not the ST-LINK VCP). 
No wiring is needed for the demo's **primary** hardware proof: the `live_monitor` step samples the live `g_blink_count` counter from the running MCU over the **SWD debug channel**, so it works on a bare, unwired board — which is exactly why STM32er treats an **empty VCP read on a Discovery board as expected, not a failure**, and why the HIL demo proves the firmware is alive over SWD first and treats the serial banner as a bonus.\n\nThe chat is great for one-off tasks. The designer is for jobs you want **scheduled**, **unattended**, or **identically reproducible**.\n\n```\n┌────────────────────────────────────────────────────────────────────────┐\n│ ▶ Start  ⏸ Pause  ⏹ Stop  ⚠ Hypervisor  💾 Save  📂 Load  ✓ Validate │\n├──────────────────┬─────────────────────────────────────────────────────┤\n│ Sidebar          │                                                     │\n│ ─ Control        │                                                     │\n│   Starter, Ender │            CANVAS (#canvas-content)                 │\n│ ─ Routing        │       (draggable agents, typed connections,         │\n│   Forker, Asker  │        green-running / red-down / yellow-paused     │\n│ ─ Logic Gates    │        LEDs)                                        │\n│   AND OR Barrier │                                                     │\n│ ─ Action / etc.  │                                                     │\n└──────────────────┴─────────────────────────────────────────────────────┘\n```\n\n- The canvas **scrolls**: viewport is `#submonitor-container`, content layer is `#canvas-content`. New canvas-level features should be children of `#canvas-content`. 
\n- **Connections are typed**: green = \"start the target after this finishes\" (`target_agents`), blue = \"monitor this source's log\" (`source_agents`).\n- **Double-click** an agent to edit its config.\n- **Right-click** for description / log / explore-dir / open-cmd / restart.\n\nGoal: run a shell command, take a screenshot, end.\n\n- Drag **Starter** onto the canvas (top-left).\n- Drag **Executer** to its right.\n- Drag **Shoter** further right.\n- Drag **Ender** at the far right.\n- Connect: Starter → Executer → Shoter → Ender (drag from the right edge of one to the left edge of the next).\n- Double-click **Executer**, set `command` to `dir C:\\` (or `ls /tmp`).\n- Double-click **Shoter**, set `output_dir` to a writable folder.\n- Leave **Ender** wiring to Tlamatini. Validate/Start will calculate Ender's `target_agents` kill list from the arrows.\n- Click **✓ Validate** — Tlamatini compiles the visible canvas, then runs structural checks (no orphans, no self-connections, terminal agents reachable).\n- Click **▶ Start**. LEDs go green, then gray. Open `output_dir` — there's a screenshot.\n\n**💾 Save** — pick a name. You get a JSON file with positions, configs, and connections. Distribute to colleagues; they **📂 Load** the same file and run the same flow. `.flw` is also what the chat's **Create Flow** button emits.\n\nA `.flw` file is meant to describe the **idea of the flow**, not the exact machine it was created on. A good `.flw` says:\n\n- \"There is a Starter here.\"\n- \"There is an Executer there.\"\n- \"Starter connects to Executer.\"\n- \"Executer uses this script.\"\n\nIt should **not** say:\n\n- \"This flow only works from `C:/Development/Tlamatini/...`.\"\n- \"This flow only works from the install folder on Angel's PC.\"\n- \"This Parametrizer mapping exists somewhere in a temporary pool directory, good luck.\"\n\nSaved flows now carry a small `schemaVersion` plus an `artifacts` section. 
The most important artifact today is Parametrizer mappings. When you save a flow with a Parametrizer, Tlamatini keeps the mapping in the `.flw`\n\n. When you load the flow later, Tlamatini recreates `interconnection-scheme.csv`\n\nfor that Parametrizer in the current session pool.\n\nFor a beginner, the practical rule is simple: **if you configured Parametrizer with the mapping dialog, Save/Load should remember that mapping.**\n\nThis is the most important reliability change in the visual designer.\n\nBefore, Validate mostly read whatever agent configs already existed in the pool directory. That could become stale:\n\n- You drag nodes around.\n- You load a\n`.flw`\n\n. - You edit a config.\n- You reconnect an edge.\n- The pool directory still contains an older\n`config.yaml`\n\n. - Validate or Start reads that older file and acts confused.\n\nNow ACP takes a fresh **snapshot of the canvas** before validation and start. The snapshot includes:\n\n- every visible node;\n- each node's position;\n- every connection;\n- input and output slot numbers;\n- current in-browser config;\n- Parametrizer mappings.\n\nThe backend then compiles that snapshot into real pool `config.yaml`\n\nfiles using the Agent Contract Registry. In beginner terms: **the picture on the screen becomes the source of truth.**\n\nAnother important nuance: if you opened an agent dialog and manually edited wiring-sensitive fields such as `source_agents`\n\nor an Ender kill list, those dialog edits now survive compilation. 
Canvas edges still contribute their live connections, but a deliberate dialog override is no longer silently discarded by Validate or Start.\n\nWhat happens when you click **✓ Validate**:\n\n- Browser captures the live canvas.\n- Backend compiles it in dry-run mode.\n- Frontend validates the compiled configs.\n- Nothing is written to disk just for validation.\n\nWhat happens when you click **▶ Start**:\n\n- Browser captures the live canvas.\n- Backend compiles it in write mode.\n- Pool folders/configs are updated.\n- Logs are cleared.\n- Starter agents launch.\n\nThis removes a whole class of \"I swear I connected it correctly, why is it running the old thing?\" problems.\n\n| Button | What happens |\n|---|---|\n| ⏸ Pause | Saves running agents into `paused_agents.reanim`, kills them, leaves logs and `reanim*` state files intact. LEDs go yellow. |\n| ▶ Resume (after pause) | Reanimates each saved agent with `AGENT_REANIMATED=1`. Each agent reads its `reanim*` files and continues from where it stopped. |\n| ⏹ Stop | Hard stop. Ender runs termination logic; reanimation files are cleared. |\n\nThis is why long-running workflows (Crawler scraping 10k URLs, Parametrizer iterating segments) survive pauses.\n\nStop is also safer in mixed flows now: the ACP cleanup path is better at terminating leftover session processes before the next run begins, so partially mixed ACP sessions are less likely to leave orphaned agents behind.\n\nClick **⚠ Hypervisor** — a system FlowHypervisor agent starts watching every running agent. It is an LLM that reads each agent's log, builds an NxN connection matrix from the canvas wiring, and emits exactly **`OK`** or **`ATTENTION NEEDED { explanation }`**. If it flags a problem, the browser pops an alert. Add custom rules to `user_instructions` in its config.\n\nDrag **FlowCreator**, double-click, and type a natural-language objective:\n\n\"Every hour, crawl our status page; if it shows ERROR, email the on-call engineer.\"\n\nClick **Generate**. 
FlowCreator reads `agentic_skill.md` (its design playbook), produces a JSON flow description, and renders agents + connections onto the canvas. Tweak and run.\n\nTlamatini agents communicate through **log files** and **`config.yaml`**. Parametrizer is the bridge: it reads structured segments from a source agent's log, injects mapped values into a target agent's `config.yaml`, runs the target, restores the config, advances the cursor, repeats.\n\nThe unified output format every Parametrizer-friendly agent emits:\n\n```\nINI_SECTION_<AGENT_TYPE><<<\nkey1: value1\nkey2: value2\n\nmulti-line body content (becomes 'response_body')\n>>>END_SECTION_<AGENT_TYPE>\n```\n\n25 source agents support this format: Apirer, Gitter, Kuberneter, Crawler, Summarizer, File-Interpreter, Image-Interpreter, File-Extractor, Prompter, FlowCreator, Kyber-KeyGen/Cipher/DeCipher, Gatewayer, Gateway-Relayer, Googler, **Playwrighter**, **ACPXer**, Shoter, Mouser, **Windower**, **Unrealer**, **Reviewer**, **Analyzer**, **Kalier**.\n\nCanonical example:\n\n```\nApirer ─► Parametrizer ─► Kyber-Cipher\n```\n\nApirer hits 3 endpoints → 3 `INI_SECTION_APIRER<<<` blocks → Parametrizer maps `response_body → buffer` → Kyber-Cipher runs 3 times, encrypting each body. No manual config editing. Pause-safe. Single-lane queue.\n\nThe mapping dialog is now part of normal flow persistence:\n\n- Connect exactly one source into Parametrizer.\n- Connect Parametrizer to exactly one target.\n- Double-click Parametrizer.\n- Click a source field on the left.\n- Click a target config field or target marker on the right.\n- Save mappings.\n- Save the `.flw`.\n\nWhen the `.flw` is loaded later, Tlamatini restores the mappings and writes the Parametrizer's `interconnection-scheme.csv` again. You do not need to remember which pool directory had the CSV.\n\nOne limitation is intentional: one Parametrizer is a **single-lane queue** from one source to one target. 
If one API response must feed Emailer and File-Creator, use two Parametrizers.\n\nTwo trigger modes:\n\n| Mode | When |\n|---|---|\n| HTTP webhook | CI server, SaaS callback, cron, curl, internal portal — anything that POSTs. Auth: `bearer` / `hmac` / `none`. Validates → dedups → queues → starts `target_agents`. |\n| Folder-drop watcher | Industrial / IoT — sensor writes JSON to a shared folder. Gatewayer polls, archives, fires. |\n\nPending events survive crashes via `reanim_queue.json`. To accept GitHub-style webhooks (which sign only the body), put the bundled **Gateway-Relayer** in front.\n\n**ACPX = Agent Communication Protocol eXtension.** It spawns external coding-agent CLIs as out-of-process child processes, talks to them over stdin/stdout, persists the conversation as NDJSON transcripts, and brokers them to the chat LLM as 12 native tools. It is a Python port of OpenClaw's ACPX plugin — `agent_id` mapping, `permissionMode` vocabulary, and `SKILL.md` frontmatter all match verbatim.\n\nDefined in `agent/acpx/agent_registry.py::DEFAULT_ACP_AGENTS`. User overrides go in `config.json` under `acpx.agents.<id>`.\n\n| `agent_id` | Default command | Transport | Prompt form |\n|---|---|---|---|\n| `claude` | `claude` | `oneshot-prompt` | `claude -p \"<task>\"` |\n| `codex` | `codex` | `oneshot-prompt` | `codex exec \"<task>\"` |\n| `cursor` | `cursor-agent` | `oneshot-prompt` | `cursor-agent -p \"<task>\"` |\n| `gemini` | `gemini` | `oneshot-prompt` | `gemini -p \"<task>\"` |\n| `qwen` | `qwen-code` | `oneshot-prompt` | `qwen-code -p \"<task>\"` |\n| `tlamatini` | `python -m agent.acpx.self_acp_server` | `json-acp` | stdin envelope |\n| `kiro / kimi / iflow / kilocode / opencode / pi / droid / copilot` | (own command) | `tui-repl` | stdin |\n\n**Transport modes:**\n\n- `oneshot-prompt` — fresh process per turn; prompt is a CLI arg; stdin closes; stdout captured to EOF. The **only** transport that reliably captures TUI agents on Windows (TUI CLIs detect a piped stdout and refuse to flush in long-lived mode).\n- `json-acp` — child speaks one JSON envelope per turn, ends with `{\"done\": true}`.\n- `tui-repl` — long-lived REPL; transport-aware idle rule fires after `startup_grace + idle_seconds` even with zero events (a silent TUI is, by definition, finished).\n\nAll return JSON envelopes. Failures: `{\"ok\": false, \"reason\": \"...\", \"code\": \"...\"}`.\n\n| Tool | What it does |\n|---|---|\n| `acp_doctor()` | Health probe + per-agent enumeration with `resolvable` and `cli_version`. Always call first. |\n| `list_acp_agents()` | Cheap enumeration without the version probe. |\n| `acp_spawn(agent_id, task, …)` | Spawn child. Returns `session_id`, `transport`, `transcript_path`, `events`. TUI agents return sub-second. |\n| `acp_send(session_id, text, …)` | Send a follow-up turn. |\n| `acp_send_and_wait(session_id, text, until_idle_seconds=10, max_wait_seconds=180)` | Send and block until child settles. Prefer this for \"wait for the full answer\". |\n| `acp_kill(session_id)` | Terminate. Returns `transcript_path` so Exec Report can cite it. |\n| `acp_transcript(session_id, max_chars, direction)` | Read the on-disk NDJSON transcript. |\n| `acp_session_status(session_id)` | `{alive, pid, transcript_size, last_event_at, closed}`. |\n| `acp_list_sessions()` | Enumerate live sessions. |\n| `acp_relay(session_id_src, session_id_dst, …)` | Single-call hand-off — replaces transcript→string→send. |\n| `invoke_skill(skill_name, args_json)` | Run a `SKILL.md` package inside `SkillHarness`. |\n| `list_skills(filter_keywords)` | List registered skills. 
|\n\n24 seed skills live under `agent/skills_pkg/` (acp-router, summarize, setup-new-acpx-key, skill-creator, code-review, security-audit, kali-pentest, 8× tlamatini-* maintenance helpers, plus OpenClaw-format ports for github / gmail / slack / jira / notion / todoist / trello / weather).\n\nTick **Multi-Turn + ACPX + Exec Report** and send:\n\n\"Spawn claude in `C:/Development/Tlamatini`, ask it to list the top-level files, harvest the answer, and kill the session.\"\n\nExpected tool sequence:\n\n```\nacp_doctor\n  → acp_spawn(agent_id=\"claude\", task=\"list top-level files\")\n    → acp_send_and_wait(session_id, \"...\")\n      → acp_kill(session_id)\n```\n\n\"Spawn claude in this dir, ask it to draft a refactor of `worker.py`. Spawn gemini, relay claude's answer to it, ask gemini to critique. Kill both.\"\n\nExpected sequence:\n\n```\nacp_doctor\n  → acp_spawn(claude, draft_task)\n    → acp_send_and_wait(session_a, …)\n      → acp_spawn(gemini, critique_template)\n        → acp_relay(session_a, session_b)     # ONE call — transform=last_assistant_text\n          → acp_kill(session_a)\n            → acp_kill(session_b)\n```\n\nWithout `acp_relay`, that hand-off is three calls (`acp_transcript` → string-manipulate → `acp_send`). 
Always prefer the dedicated tool.\n\nTwo layers of API keys exist in `config.json`:\n\n```\n{\n  \"ANTHROPIC_API_KEY\": \"sk-ant-...\",     // Layer 1: Tlamatini's own cloud calls\n  \"GEMINI_API_KEY\": \"AIza...\",\n  \"acpx\": {\n    \"agents\": {\n      \"claude\": { \"env\": { \"ANTHROPIC_API_KEY\": \"sk-ant-...\" } },   // Layer 2: spawned child env\n      \"gemini\": { \"env\": { \"GEMINI_API_KEY\": \"AIza...\", \"GOOGLE_API_KEY\": \"AIza...\" } },\n      \"codex\":  { \"env\": { \"OPENAI_API_KEY\":  \"sk-...\" } },\n      \"qwen\":   { \"env\": { \"DASHSCOPE_API_KEY\": \"sk-...\" } }\n    }\n  }\n}\n```\n\nMerge order at spawn: `{**os.environ, **spec.env}` — explicit per-agent `env` wins over an exported shell variable.\n\n**Easier path** — invoke the `setup-new-acpx-key` skill from chat (Multi-Turn + ACPX ticked):\n\n\"Use `invoke_skill` with `setup-new-acpx-key` to register my Anthropic key for the `claude` agent_id.\" (paste the key)\n\nThe skill writes `data.keys`, patches both `config.json` layers, optionally extends `regen_secrets.py`, and verifies via `acp_doctor`.\n\nSecurity: `config.json` is git-tracked. Use `python regen_secrets.py --mode push-able` to swap real keys for placeholders before commit; `--mode keyed` restores from `data.keys` (gitignored). Never commit `data.keys`.\n\n**ACPXer** is the canvas-facing version of the 12 LLM-facing tools. One ACPXer node = one full ACPX session lifecycle. It is **self-contained** — does NOT import `agent.acpx` — because pool subprocesses can't import `agent.*`. 
It mirrors the runtime's transport-aware drain inline (~120 lines), writes byte-identical NDJSON transcripts, and emits Parametrizer-compatible `INI_SECTION_ACPXER<<<` blocks.\n\nCanonical visual relay flow:\n\n```\nStarter → ACPXer(claude) → Parametrizer → ACPXer(gemini) → Parametrizer → ACPXer(cursor) → File-Creator → Ender\n```\n\nThree different LLMs argue back and forth, fully visual, fully unattended.\n\nThe **Unrealer** agent (#62 in the catalog) lets Tlamatini drive a live Unreal Engine 5 editor through the **Unreal MCP** plugin's TCP socket protocol. You spawn a `chat_agent_unrealer` call from Multi-Turn or drop an **Unrealer** node on the visual canvas; Tlamatini opens a TCP connection to `127.0.0.1:55557`, sends one JSON command (`{\"type\": <verb>, \"params\": {...}}`), captures the engine's JSON response into an `INI_SECTION_UNREALER<<<` block, and triggers downstream agents. Because the agent forwards whatever `command` + `params` you give it, the catalog is exactly whatever your connected plugin build exposes — from the base 28-command upstream release up to the **53-command, nine-category extended surface** (actor manipulation incl. viewport screenshots, Blueprint creation and graph wiring, input mappings, UMG widget building, in-editor Python/console execution, level I/O, asset import, and material authoring) shipped by Tlamatini's own plugin fork, `XAIHT/XaihtUnrealEngineMCP` (the Unreal Engine MCP modified specifically for this system — see §6.2) — without you ever leaving the chat or the canvas.\n\n**Unreal MCP** is an open-source UE5 plugin (Model Context Protocol over a TCP socket) that runs **inside the Unreal Editor process** and accepts one JSON command per TCP connection. 
Each command names a verb (`spawn_actor`, `create_blueprint`, `compile_blueprint`, `add_widget_to_viewport`, …) and a dictionary of parameters; the plugin schedules the work onto UE5's game thread, executes it, and writes a JSON response back over the same socket before closing it. The wire shape is small — `{\"type\": <command>, \"params\": {...}}` going in, `{\"status\": \"ok\"|\"error\", \"result\": {...}}` (or `{\"success\": false, \"error\": \"...\"}`) coming back — and is identical across every documented command.\n\n**Tlamatini does not embed or compile the plugin.** It is a *client* of whatever UE5 instance the user has already started. The engine must be open, the plugin must be enabled, and its in-engine listener must be bound to `127.0.0.1:55557` (the default — configurable per-call via `host` / `port`). Tlamatini contributes the calling side: one wrapped Multi-Turn tool, one visual canvas node, one Agent Contract entry, one Exec Report row family, and one Parametrizer source mapping — all built around the same `UnrealConnection` adapter at `agent/agents/unrealer/unrealer.py`.\n\n**Recommended — Tlamatini's own extended fork.** The plugin Tlamatini is built and tested against is the **Unreal Engine MCP modified specifically for this system**:\n\n- **Repository:** `https://github.com/XAIHT/XaihtUnrealEngineMCP.git`\n- **What it is:** the canonical `chongdashu/unreal-mcp` plugin forked and extended for Tlamatini. 
It ships the full **53-command, nine-category** surface this chapter documents — the base editor / blueprint / node / project / umg verbs **plus** the System / Level / Asset / Material families and the newer `take_screenshot` / `focus_viewport` / `set_pawn_properties` / `find_blueprint_nodes` verbs.\n- **Plugin folder name:** `UnrealMCP`\n- **Default plugin TCP port:** `55557` on `127.0.0.1`\n- **Supported UE versions:** Unreal Engine 5.5+\n\nIt speaks the **identical wire protocol on the identical port** as every other build below, so it is a drop-in: Tlamatini's `UnrealConnection` adapter needs no client-side changes to use it. Install this one if you want the System / Level / Asset / Material families that the seeded demos `idPrompt 60/61/62` (§6.5) exercise.\n\n**Upstream base.** The XAIHT fork is built on the canonical reference implementation Tlamatini's `UnrealConnection` adapter mirrors verbatim:\n\n- **Repository:** `https://github.com/chongdashu/unreal-mcp`\n- **License:** MIT\n- **Supported UE versions:** Unreal Engine 5.5+\n\nIf you only ever need the base 28-command surface (editor / blueprint / node / project / umg), the upstream is enough on its own.\n\n**Equivalent community forks.** Two other forks ship the same wire protocol on the same port; either works with Tlamatini's Unrealer with **no client changes**:\n\n- `https://github.com/CrispyW0nton/Unreal-MCP-Ghost`\n- `https://github.com/gingerol/vhcilab-unreal-engine-mcp`\n\nPick the build that matches your UE5 version and your team's licensing comfort. If you fork the plugin to add a new command verb, your fork is automatically usable from Tlamatini — there is no client-side allow-list of verbs (the wrapped tool forwards any `command` + `params` pair verbatim).\n\nThe plugin is a per-project install (not engine-wide). 
Steps:\n\n1. **Clone or download** the plugin — the recommended `XAIHT/XaihtUnrealEngineMCP` fork from §6.2, or any compatible build (only the `MCPGameProject/Plugins/UnrealMCP` folder matters — different forks may name the folder slightly differently; rename to `UnrealMCP` if needed).\n2. **Drop the folder** into your project's `Plugins/` directory so the final path is `<YourProject>/Plugins/UnrealMCP/UnrealMCP.uplugin`. Create the `Plugins/` folder at the project root if it does not exist.\n3. **Open the project in UE5.** The editor will detect the new plugin and offer to rebuild it for your engine version — accept. If you opened a Blueprint-only project, you will be prompted to install Visual Studio Build Tools / Xcode command-line tools first, since the plugin is C++.\n4. **Enable the plugin** via `Edit → Plugins → search \"UnrealMCP\" → tick Enabled`. Restart the editor.\n5. **Confirm the listener is bound.** With the editor running, open the **Output Log** (`Window → Developer Tools → Output Log`) and look for a line such as `LogTemp: UnrealMCP listening on 127.0.0.1:55557`. That line is your green light: the plugin is now waiting for JSON commands on the loopback interface.\n\nYou do not need to press Play (PIE). The plugin listens at editor level — actor manipulation, Blueprint creation, widget construction, etc. all work against the open project even when PIE is stopped. Some UMG operations (`add_widget_to_viewport`) physically render only after the user enters PIE, but the build steps are queued correctly either way.\n\nThe Unrealer agent forwards whatever `command` + `params` you pass it, so the exact catalog is whatever your connected plugin build exposes — there is **no client-side allow-list of verbs**. 
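A minimal Python sketch of that calling side, assuming only the wire shape this chapter describes: one JSON object with `type` and `params` per TCP connection, answered by one JSON document. The function name `send_unreal_command` and its error strings are illustrative and are not taken from Tlamatini's `UnrealConnection` adapter.

```python
import json
import socket


def send_unreal_command(command, params, host='127.0.0.1', port=55557,
                        connect_timeout=5.0, read_timeout=10.0):
    '''Send one JSON command on a fresh TCP socket and return the parsed reply.'''
    payload = json.dumps({'type': command, 'params': params}).encode('utf-8')
    try:
        with socket.create_connection((host, port), timeout=connect_timeout) as sock:
            sock.settimeout(read_timeout)
            sock.sendall(payload)  # the command is sent without a trailing newline
            buffer = b''
            while True:
                chunk = sock.recv(65536)
                if not chunk:
                    break  # the peer closed the socket
                buffer += chunk
                try:
                    # Stop as soon as the accumulated bytes parse as one JSON document.
                    return json.loads(buffer.decode('utf-8'))
                except ValueError:
                    continue  # reply not complete yet, keep reading
    except OSError as exc:
        # Covers refused connections and socket timeouts alike.
        return {'status': 'error', 'error': str(exc)}
    return {'status': 'error', 'error': 'connection closed before a complete JSON reply'}
```

With an editor running and the listener bound, `send_unreal_command('get_actors_in_level', {})` would be the simplest probe. Any other verb goes through the same call, because nothing in this sketch filters verbs on the client side.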
The canonical chongdashu/unreal-mcp release ships **28 commands across 5 categories** (rows marked `base` below); plugin builds that add the System / Level / Asset / Material command handlers — such as Tlamatini's own extended fork [XAIHT/XaihtUnrealEngineMCP](https://github.com/XAIHT/XaihtUnrealEngineMCP.git) (§6.2) — bring the total to **53 commands across 9 categories**:\n\n| Category | Commands | Tier |\n|---|---|---|\n| editor | `get_actors_in_level`, `find_actors_by_name`, `spawn_actor`, `create_actor`, `delete_actor`, `set_actor_transform`, `get_actor_properties`, `set_actor_property`, `spawn_blueprint_actor`, `focus_viewport`, `take_screenshot` | base + `focus_viewport` / `take_screenshot` |\n| blueprint | `create_blueprint`, `add_component_to_blueprint`, `set_static_mesh_properties`, `set_component_property`, `set_physics_properties`, `compile_blueprint`, `set_blueprint_property`, `set_pawn_properties` | base + `set_pawn_properties` |\n| node | `add_blueprint_event_node`, `add_blueprint_input_action_node`, `add_blueprint_function_node`, `connect_blueprint_nodes`, `add_blueprint_variable`, `find_blueprint_nodes`, `add_blueprint_get_self_component_reference`, `add_blueprint_self_reference` | base + `find_blueprint_nodes` |\n| project | `create_input_mapping` | base |\n| umg | `create_umg_widget_blueprint`, `add_text_block_to_widget`, `add_button_to_widget`, `bind_widget_event`, `add_widget_to_viewport`, `set_text_block_binding` | base |\n| system | `execute_python`, `execute_console_command`, `get_class_info`, `list_assets` | extended |\n| level | `open_level`, `save_current_level`, `save_all`, `new_level`, `get_current_level` | extended |\n| asset | `import_asset`, `duplicate_asset`, `rename_asset`, `delete_asset`, `save_asset`, `create_folder` | extended |\n| material | `create_material`, `create_material_instance`, `set_material_parameter`, `assign_material` | extended |\n\n`execute_python` is the 
**universal escape hatch** — it runs an arbitrary Python script inside the editor, so anything in UE5's `unreal` Python API (Niagara, Sequencer, landscape, audio, …) is reachable even when no dedicated verb exists. `take_screenshot` closes the observe→act loop: spawn or change something, then capture the viewport to verify it. Note that the plugin's **headless build/cook/test** tools (`build_project`, `run_automation_tests`, `run_macro`) are *not* part of this catalog — they shell out to `UnrealEditor-Cmd` as separate processes and are unreachable over the editor's TCP socket. Chain Unrealer nodes through a Parametrizer for the `run_macro` equivalent.\n\nParam shapes vary per command (e.g. `spawn_actor` wants `name` + `type` + `location` + `rotation`; `create_blueprint` wants `name` + `parent_class`; `set_material_parameter` wants `material` + `parameter` + `value`; `import_asset` wants `source_file` (a disk path) + `destination_path` (a `/Game` content path)). The Unrealer agent does not validate them — it forwards them as-is, after three defensive fixups: it normalizes `/Content/...` content paths to `/Game/...`, prunes unset placeholder params, and remaps `params.console_command` → the wire's `params.command` for `execute_console_command` (so the console line doesn't collide with the top-level `command:` selector). The plugin will reply with `{\"status\": \"error\", \"error\": \"<reason>\"}` if a param is missing or malformed, and that error lands verbatim in the `INI_SECTION_UNREALER` block so Multi-Turn / Parametrizer can branch on it.\n\nThe wrapped Multi-Turn tool `chat_agent_unrealer` is the easiest way in. 
Tick **Multi-Turn** in the toolbar, leave **Exec Report** ticked too (the Unreal calls get their own table in the answer), and send a prompt like:\n\n\"Run Unreal command with command='spawn_actor' and params.name='MyCube' and params.type='StaticMeshActor' and params.location=[0,0,150].\"\n\nThe planner picks `chat_agent_unrealer`, the wrapped runtime spawns one short-lived `unrealer.py` child under `agent/agents/pools/_chat_runs_/unrealer_<seq>_<id>/`, the child opens a TCP socket to `127.0.0.1:55557`, sends the JSON command, captures the response, emits the `INI_SECTION_UNREALER` block to its log, and exits. The Multi-Turn loop reads the run's log excerpt, parses the section, and returns the full Unreal response JSON to the LLM. The LLM then sees the engine's reply and either reports it to you, branches on it, or fires the next call.\n\nThe tool accepts the same overrides documented in `config.yaml`:\n\n- `host='10.0.0.5'` and `port=55557` to target a remote UE instance (rare; the plugin binds to loopback by default and you would need to change the bind address inside the plugin or tunnel it).\n- `connect_timeout=5` and `read_timeout=10` to widen the budgets for slow operations (e.g. `compile_blueprint` on a complex graph).\n\n**Built-in demo prompts.** Migration `0087_add_unrealer_demo_prompt.py` seeds a one-click demo into the Prompts table (`idPrompt=25`). 
Open the chat, click the **Prompts** dropdown, pick *Unreal MCP End-to-End Editor Drive*, and Tlamatini will execute ten guided steps spanning the **base** command categories — sanity-probe (`get_actors_in_level`), spawn a `StaticMeshActor` (`spawn_actor`), verify it (`find_actors_by_name`), scaffold a Blueprint (`create_blueprint`), add a `StaticMeshComponent` (`add_component_to_blueprint`), compile (`compile_blueprint`), spawn an instance (`spawn_blueprint_actor`), build a UMG HUD widget (`create_umg_widget_blueprint` → `add_text_block_to_widget` → `add_button_to_widget` → `add_widget_to_viewport`) — and render the result as a per-step HTML report table at the bottom of the answer. Use it as your smoke test the first time you wire the plugin up.\n\nMigration `0100_add_unrealer_extended_demo_prompts.py` adds **three more demos that exercise the extended (System / Level / Asset / Material) surface** the base demo never touches, at basic → hard complexity:\n\n- `idPrompt=60` — *Unreal Snapshot* (basic): the observe→act loop — `get_current_level` → `spawn_actor` → `take_screenshot` (to `C:/Temp/unreal_snapshot.png`) → `save_current_level`.\n- `idPrompt=61` — *Unreal Scene Forge* (medium): content authoring — `list_assets` → `create_folder` → `create_material` → `create_material_instance` → `set_material_parameter` → `spawn_actor` → `assign_material` → `take_screenshot` → `save_all`. 
(Note that `set_material_parameter` on a freshly-created blank material may return `status: error`; that is expected, and the error is recorded rather than aborting the run.)\n- `idPrompt=62` — *Unreal Python & Introspection* (hard): the System escape hatch — `execute_console_command` (via the agent's `params.console_command` remap) → `get_class_info` → `list_assets` → `execute_python` (a multi-line script passed as a triple-quoted `params.code`) → `take_screenshot`.\n\nAll three drive `chat_agent_unrealer` exactly like the base demo (tick only **Multi-Turn**; ACPX not required) and require the same running editor + bound plugin listener.\n\nFor unattended `.flw` workflows, drop the **Unrealer** sidebar agent onto the canvas. Each node sends exactly one Unreal command when its turn arrives. The node's `config.yaml` is the same one shipped under `agent/agents/unrealer/config.yaml`:\n\n```\nhost: 127.0.0.1\nport: 55557\ncommand: get_actors_in_level\nparams:\n  name: ''\n  type: ''\n  location: []\n  # ... (the shipped config carries empty placeholders for every param across\n  #     all 9 categories — editor/blueprint/node/project/umg/system/level/asset/material\n  #     — so the Flow Compiler's dotted `params.X` overrides always resolve into\n  #     an existing YAML leaf. Unset placeholders are pruned before the command\n  #     is sent, so add/remove keys freely to match the verb you picked.)\nconnect_timeout: 5\nread_timeout: 10\nsource_agents: []\ntarget_agents: []\n```\n\nThe agent emits an `INI_SECTION_UNREALER<<<` block to its log, which means **Parametrizer can chain Unreal calls together**. Registered source fields (`agent/services/agent_contracts.py`): `host`, `port`, `command`, `status`, `error`, `response_body`. 
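Because `status` and `error` are mappable fields, a flow can branch on them. A minimal sketch of that idea, assuming the two reply shapes this chapter describes for the plugin (`status` with `result`, and the older `success` with `error`). The helper names `normalize_reply` and `route` are hypothetical and are not Tlamatini APIs; real flows do this with Parametrizer and Raiser nodes rather than Python.

```python
def normalize_reply(reply):
    '''Return the reply in a single status/error shape.'''
    if not isinstance(reply, dict):
        return {'status': 'error', 'error': 'reply is not a JSON object'}
    if reply.get('success') is False:
        # Older shape: {'success': False, 'error': '...'}
        return {'status': 'error', 'error': str(reply.get('error', ''))}
    return reply


def route(reply, on_ok, on_error):
    '''Pick the next step from the normalized status field.'''
    normalized = normalize_reply(reply)
    return on_ok if normalized.get('status') == 'ok' else on_error
```

Here `route(reply, 'spawn_blueprint_actor', 'notifier')` would name the next node to run, which is the decision a Raiser makes on the canvas.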
Canonical multi-step canvas pattern — create a Blueprint, compile it, spawn an instance:\n\n```\nStarter → Unrealer(create_blueprint) → Parametrizer → Unrealer(compile_blueprint)\n       → Parametrizer → Unrealer(spawn_blueprint_actor) → Ender\n```\n\nEach Parametrizer copies the previous Unrealer's `response_body` (or a specific JSON field within it, via the Parametrizer dialog's interconnection-mapping UI) into the next Unrealer's `params` block. Branching on `status` (`ok` vs `error`) via a Raiser between Unrealer and the next Parametrizer gives you per-step exception handling — e.g., abort to a Notifier if `compile_blueprint` returns `status: error`.\n\nThe `unrealer.py` script (~120 lines of business logic, plus the standard pool-agent boilerplate) is **self-contained**: it does NOT import from `agent.acpx`, `agent.services`, or any other Tlamatini-internal package. Pool subprocesses run as separate Python interpreters with no `sys.path` back into the Django app, so the inline `UnrealConnection` adapter is a verbatim mirror of the upstream Unreal MCP Python client (with the FastMCP plumbing stripped out). Per execution:\n\n1. **Load** `config.yaml`, read `host`, `port`, `command`, `params`, timeouts, and `target_agents`.\n2. **Write** `agent.pid` so the orphan-process reaper (chapter 11) can track the run.\n3. **Open a fresh TCP socket** to `host:port` with `TCP_NODELAY`, `SO_KEEPALIVE`, and 64 KB send/recv buffers. 
The Unreal MCP plugin closes the socket after each command, so the agent opens a new one per turn.\n4. **Send** `json.dumps({\"type\": command, \"params\": params})` (no trailing newline) and call `recv()` until a complete JSON document has been assembled (validated by attempting `json.loads()` on the accumulated bytes after each chunk).\n5. **Normalize** the response shape — `{\"success\": false, ...}` from older plugin builds is rewritten into `{\"status\": \"error\", \"error\": ...}` so downstream Parametrizer / Multi-Turn code can rely on a single shape.\n6. **Emit one atomic** `logging.info()` call with the `INI_SECTION_UNREALER<<<` block (header: `host`, `port`, `command`, `status`, `error`; body: pretty-printed Unreal JSON response, capped at 64 KiB). Single-call emission is mandatory for the parser at `agent/agents/parametrizer/parametrizer.py`.\n7. **Trigger** `target_agents` even on error, so the canvas can route on the `status` field instead of relying on agent-level fail-stops.\n8. **Remove** `agent.pid` and exit.\n\nFailure modes the adapter handles gracefully (each turns into `{\"status\": \"error\", \"error\": \"<reason>\"}` plus a non-fatal log line, never an uncaught exception):\n\n- **Connection refused** — the plugin's TCP listener is not bound (editor not running, plugin not enabled, port mismatched).\n- **Socket timeout during receive** — UE5's game thread is busy (e.g. `compile_blueprint` on a heavy graph) and exceeded `read_timeout`. Raise `read_timeout` in `config.yaml` or in the wrapped tool call.\n- **Malformed JSON** — the plugin closed mid-write; logged as an `error` status and downstream agents still fire.\n\n`chat_agent_unrealer` is registered in `_EXEC_REPORT_TOOLS` (`agent/mcp_agent.py`) under `agent_key=\"unrealer\"` and `agent_display=\"Unrealer\"`. 
When **Exec Report** is ticked alongside **Multi-Turn**, every Unreal call shows up as one row in a dedicated **List of Unrealer Operations** table at the bottom of the answer. Columns: command (left-bordered with the Unrealer caption gradient), success (`SUCCESS` / `FAILURE` derived from the underlying tool-call verdict — the same verdict Multi-Turn already uses for dedup and `tool_calls_log`). The table styling lives in `agent/static/agent/css/agent_page.css` (caption gradient mirrors `.canvas-item.unrealer-agent` in `agentic_control_panel.css`).\n\nA short pre-flight you can copy into a sticky note before any session:\n\n| Check | How |\n|---|---|\n| UE 5.5+ open with a project loaded | `File → Project → <YourProject>` — and leave the editor focused, not minimized to the system tray |\n| Plugin enabled | `Edit → Plugins → UnrealMCP → Enabled = ✓`, restart of editor confirmed |\n| Listener bound | Output Log shows `UnrealMCP listening on 127.0.0.1:55557` (or your configured port) |\n| Port not blocked | `Test-NetConnection -ComputerName 127.0.0.1 -Port 55557` returns `TcpTestSucceeded: True` (PowerShell) |\n| Tlamatini server up | `python Tlamatini/manage.py runserver --noreload` shows the startup banner |\n| Multi-Turn ticked | The toolbar checkbox to the left of Exec Report |\n| Tool enabled | `Tools` dialog in chat shows `Chat-Agent-Unrealer` ticked (it ships ticked by default after migration `0086_add_chat_agent_unrealer_tool` runs) |\n\nThen run the seeded **Unreal MCP End-to-End Editor Drive** demo prompt (idPrompt 25) as your smoke test. A clean run leaves three artifacts in your project: actor `TlamatiniProbe_Cube`, Blueprint `BP_TlamatiniProbe` with one spawned instance `TlamatiniProbe_Spawned`, and widget `/Game/UI/WBP_TlamatiniProbeHUD`. 
Delete them via right-click in the Content Browser when you are done.\n\n- **`acp_doctor` is not relevant here.** Unreal MCP is a *workflow-agent* surface, not the *ACPX* surface — the `acp_*` tools talk to external coding-agent CLIs (claude, gemini, …), not to UE5. The corresponding \"is the channel alive?\" probe for Unrealer is to call `chat_agent_unrealer` with `command='get_actors_in_level'` and check that `status == 'ok'`.\n- **`status: error` / `Failed to connect to Unreal at 127.0.0.1:55557`.** The plugin is not listening. Check the UE5 Output Log for the `UnrealMCP listening on …` line. If the line is absent, the plugin is either disabled or failed to build (re-open the project; UE5 will re-prompt to rebuild). If the line is present but the connection still fails, your firewall is blocking loopback (rare on Windows, but `Restart-Service mpssvc` and re-test if you have aggressive endpoint security).\n- **`Timeout receiving Unreal response`.** UE5's game thread is busy. Either widen `read_timeout` (`config.yaml` or the wrapped-tool call), or split the work into smaller commands (e.g. spawn 10 actors with 10 separate calls instead of one `spawn_n_actors` macro the plugin does not actually expose).\n- **`status: error` from a Blueprint command, but the verb seems valid.** Capitalize the `parent_class` exactly as UE5 expects (`Actor`, `Pawn`, `Character`, `UserWidget`, …). The plugin does not auto-resolve `actor` → `Actor`.\n- **Widget appears in the Content Browser but is invisible in PIE.** `add_widget_to_viewport` queues the widget at editor level; you still need to press **Play** in the editor (or call `add_widget_to_viewport` from within a running PIE session) to make it render. This is an Unreal MCP plugin behavior, not a Tlamatini bug.\n- **The Output Log shows the plugin received the command but nothing happened in the level.** Most often: an actor spawn at a coordinate **inside** another object's collision volume. 
UE5 silently fails the spawn. Raise `params.location` by `[0, 0, 150]` and retry.\n- **A second instance of UE5 is bound to the same port.** Only one UnrealMCP listener can bind to `127.0.0.1:55557` per host. Close the second editor instance, or configure each instance to bind to a different port and pass `port=<n>` per Unrealer call.\n\nFor the full debugging trail: pool-agent log lives at `<pool>/unrealer_<n>/unrealer_<n>.log`; chat-wrapped runs land under `agent/agents/pools/_chat_runs_/unrealer_<seq>_<id>/unrealer_<seq>_<id>.log`. Both contain the outbound JSON command and the inbound Unreal response verbatim.\n\nThe following build chain ships a one-click Windows installer to end users.\n\n```\nbuild.py  ──►  build_uninstaller.py  ──►  build_installer.py\n   │                   │                         │\n   ▼                   ▼                         ▼\npkg.zip          Uninstaller.exe        dist/Tlamatini_Release/\n```\n\n```\npython build.py\n```\n\nInstalls deps, runs `collectstatic`, executes PyInstaller, copies required payloads (including `README.md`, the self-knowledge map `Tlamatini.md`, and bundled `jd-cli/`), runs migrations, creates the default user (`user` / `changeme`), renames the exe to `Tlamatini.exe`, copies all 68 agent templates, bundles support scripts (`register_flw.ps1`, `CreateShortcut.ps1`, `Tlamatini.ps1`, `Tlamatini.ico`), and zips it all into **pkg.zip**.\n\n`build.py` is strict: missing `README.md`, missing `jd-cli/`, or missing `jd-cli.bat` causes a non-zero exit.\n\n**Self-modify builds.** Add the `--self-modify` flag to ship Tlamatini's own source tree inside the distribution:\n\n```\npython build.py --self-modify\n```\n\nWhen the flag is present (`self_modify = \"--self-modify\" in sys.argv`), the build copies `Tlamatini/agent/TlamatiniSourceCode/` recursively to the install root next to the exe, so it resolves like `prompt.pmt` / `config.json` / `Tlamatini.md`, and 
Tlamatini can read, inspect, and modify her own code at runtime. Without the flag the directory is omitted entirely. The build prints `Self-modify build : YES` (or `no`) so you can confirm which kind of build you produced. See [§9.6](#96-self-knowledge--self-modification) for how the LLM uses it.\n\n```\npython build_uninstaller.py\n```\n\nBuilds `uninstall.py` into a single `--onefile` Tkinter exe. Output: `Uninstaller.exe` at the project root.\n\n```\npython build_installer.py\n```\n\nRequires `pkg.zip` and `Uninstaller.exe`. Builds `install.py` with `--onedir --windowed` and a splash screen, copies `pkg.zip` and `Uninstaller.exe` into `dist/Installer/`, and assembles `dist/Tlamatini_Release/` with SHA-256 verification.\n\nThe final distributable is `dist/Tlamatini_Release/` — zip the folder, share it.\n\n- Tkinter GUI to choose installation directory (no admin needed).\n- Extracts `pkg.zip` into `<install_path>/Tlamatini/`.\n- Locks agent venv permissions.\n- Writes `config.json`.\n- Copies `Uninstaller.exe`.\n- Creates desktop and Start Menu shortcuts (`Tlamatini.lnk` — falls back to user-scoped paths under restrictive Group Policies).\n- Registers `.flw` to open with Tlamatini.\n- Cleans the PyInstaller bundle path from helper subprocess environments.\n\nFrozen mode resolves `config.json` from the executable directory (or `CONFIG_PATH` env var). Template-agent discovery uses `<install_dir>/agents` in frozen mode and `Tlamatini/agent/agents/` in source mode. `_resolve_python_executable()` tries `PYTHON_HOME` → bundled `python.exe` → PATH.\n\nTlamatini has two operational modes:\n\n| Mode | What it means | Where agent templates live |\n|---|---|---|\n| Source / Not-Frozen | You run `python Tlamatini/manage.py runserver --noreload` from a cloned repo. | `Tlamatini/agent/agents/` |\n| Frozen / Installed | You run the packaged `Tlamatini.exe` from the installer. 
| `<install_dir>/agents/` |\n\nThe new Flow Compiler was built to respect both modes. It does **not** assume your repo is at `C:/Development/Tlamatini`, and it does **not** assume the installed app lives in a specific Program Files folder.\n\nThe compiler asks Tlamatini at runtime:\n\n- \"Am I frozen?\"\n- \"Where are the agent templates?\"\n- \"Where is this user's session pool?\"\n- \"Which agent contract applies to this node?\"\n\nThen it writes only into the current pool:\n\n```\nagents/pools/<session_id>/<agent_name_n>/config.yaml\n```\n\nThat path exists in both modes. In source mode it is under the repo's `Tlamatini/agent/agents/pools/`. In frozen mode it is under the installed app's `agents/pools/`.\n\nFor users, the takeaway is simpler: **a .flw saved in source mode should load in an installed build, and a .flw saved in an installed build should load back in source mode.**\n\n| Mode | Resolution order |\n|---|---|\n| Source | `Tlamatini/agent/config.json` |\n| Frozen | `<install-dir>/config.json` next to the executable |\n| Both | `CONFIG_PATH` env var wins over both |\n\n```\n{\n  \"embeding-model\": \"Nomic-Embed-Text:latest\",\n  \"chained-model\": \"glm-5:cloud\",\n  \"ollama_base_url\": \"http://127.0.0.1:11434\",\n  \"ollama_token\": \"\",\n  \"ANTHROPIC_API_KEY\": \"<ANTHROPIC_API_KEY goes here>\",\n  \"GEMINI_API_KEY\": \"<GEMINI_API_KEY goes here>\",\n  \"enable_unified_agent\": true,\n  \"unified_agent_model\": \"glm-5:cloud\",\n  \"unified_agent_base_url\": \"http://127.0.0.1:11434\",\n  \"unified_agent_temperature\": 0.0,\n  \"unified_agent_max_iterations\": 4096,\n  \"chat_agent_limit_runs\": 100\n}\n```\n\n`unified_agent_max_iterations` caps the Multi-Turn tool loop (default 4096). 
`enable_unified_agent` is the master switch for tool-calling.\n\nKey RAG knobs: `chunk_size` (3000), `chunk_overlap` (800), `k_vector` / `k_bm25` (100 each), `k_fused` (150), `enable_bm25`, `rrf_k` (60), `max_doc_chars` (150000), `max_context_chars` (250000), and a `context_budget_allocation` map (`high_relevance: 0.60, architecture: 0.20, related: 0.15, documentation: 0.05`). See `BookOfTlamatini.md` Part VII for the full schema.\n\n```\n{\n  \"acpx\": {\n    \"cwd\": \"C:/Development/Tlamatini\",\n    \"stateDir\": \"C:/Users/<you>/.tlamatini/acpx-state\",\n    \"probeAgent\": \"gemini\",\n    \"permissionMode\": \"approve-reads\",\n    \"nonInteractivePermissions\": \"deny\",\n    \"timeoutSeconds\": 180,\n    \"agents\": {\n      \"claude\": { \"command\": \"C:/Users/<you>/AppData/Roaming/npm/claude.cmd\",\n                  \"env\": { \"ANTHROPIC_API_KEY\": \"sk-ant-...\" } }\n    }\n  }\n}\n```\n\n`permissionMode` ∈ `approve-reads` (default) / `approve-all` (DANGEROUS) / `deny-all`. The whole `acpx` block is optional; on first boot of an upgrade build, `boot_acpx()` appends the documented default block atomically.\n\n- `mcp_system_server_port` (8765), `mcp_files_search_server_port` (50051) — MCP daemons.\n- `internet_classifier_model`, `web_summarizer_model`, `web_context_max_chars` — internet toggle.\n- `image_interpreter_model`, `image_interpreter_base_url` — vision.\n- `history_summary_*`, `keep_last_turns` — chat-history compression.\n- `kali_server_url` (`http://127.0.0.1:5000`) — the MCP-Kali-Server address auto-injected into `chat_agent_kalier` (see §3.14).\n- `stm32_mcp_server_script` (now `\"\"` — empty triggers zero-config auto-bootstrap), `stm32_mcp_repo_url`, `stm32_mcp_install_dir` — the STM32 Template Project MCP for STM32er (see §3.15). 
Leave`stm32_mcp_server_script`\n\nempty and STM32er downloads, installs, and validates the server itself on first use.\n\nYou no longer need to hand-edit all of those values. On `/agent/`\n\n, open `Config -> Models`\n\nor `Config -> URLs`\n\nto edit the most common runtime knobs in-place. The browser validates model strings / URLs / hosts / ports, the backend validates again, and `config_loader.save_config_updates()`\n\natomically merges only the changed keys into the active `config.json`\n\n. The same loader path is used in source mode and frozen builds, so the chat UI and the executable stop drifting onto different config copies.\n\n```\nBrowser (Chat / ACP Designer)\n    │ WebSocket\n    ▼\nDjango Channels (Daphne ASGI) → AgentConsumer\n    │\n    ├── RAG Pipeline (FAISS + BM25, RRF, context budgeting, OOM fallback)\n    ├── Unified Agent (Multi-Turn loop, planner, wrapped runtimes)\n    └── MCP Services (System-Metrics WS, Files-Search gRPC)\n    │\n    ▼\nLLM Backends: Ollama | Claude API | Qwen vision     +     ACPX Runtime → external CLIs\n```\n\n| Layer | Responsibility | Where |\n|---|---|---|\n| 1. Persisted toggles | DB rows for `Mcp` / `Tool` / `Agent` (UI enable/disable). |\n`agent/models.py` |\n| 2. Runtime MCP services | System-Metrics (WebSocket) + Files-Search (gRPC) daemons. | `agent/mcp_*` |\n| 3. Context fetcher chains | LCEL sidecars that inject system / files context. | `agent/chain_*_lcel.py` |\n| 4. Main answer chains | Basic / History-aware / Unified. `factory.py` monkey-patches `invoke()` . |\n`agent/rag/chains/` |\n| 5. Unified-agent tools | 74 synchronous `@tool` functions (20 core Python + 42 wrapped chat-agent + 12 ACPX/Skill). Active only in Multi-Turn. 
|\n`agent/tools.py` + `agent/chat_agent_registry.py` + `agent/acpx/` |\n\n```\nFrontend (toggles) → WebSocket → AgentConsumer → ask_rag() (skips prompt-shape validator)\n  → UnifiedAgentChain.invoke() → filter_acpx_tools(tools, acpx_enabled)\n    → planner picks ≤20 tools (capability scoring + history-aware boost)\n      → MultiTurnToolAgentExecutor: 1..4096 iterations of (LLM call → tool calls → ToolMessage)\n        → Exec Report HTML appended (if exec_report_enabled, BEFORE save_message)\n          → broadcast → frontend renders, shows Create Flow if all 4 gates pass\n```\n\nTlamatini has two ways to create flows:\n\n- The chat can infer a flow from Multi-Turn tool calls.\n- The ACP canvas can build a flow by dragging agents and drawing arrows.\n\nThose two paths now meet at the same backend contract layer. This is important because flow files are not just pictures. A flow must eventually become a set of real agent folders, and every folder needs a correct `config.yaml`\n\n.\n\nThe contract layer is intentionally small:\n\n| File | What it does |\n|---|---|\n`agent/services/agent_paths.py` |\nFinds the correct `agents/` and `agents/pools/` folders in both source mode and frozen mode. It also normalizes names like `TeleTlamatini` , `tele-tlamatini` , and `teletlamatini` into the same agent type. |\n`agent/services/agent_contracts.py` |\nDescribes what each agent needs: which config fields hold incoming agents, which fields hold outgoing agents, which agents are singletons, which agents are long-running, which agents should be hidden from validation, which Parametrizer fields can be mapped, and which secrets must be redacted before export. |\n`agent/services/flow_spec.py` |\nConverts old and new `.flw` shapes into one clean `FlowSpec` . It accepts legacy `sourceIndex` / `targetIndex` links and newer stable `sourceId` / `targetId` links. |\n`agent/services/flow_compiler.py` |\nConverts a `FlowSpec` into the actual pool configs. 
In dry-run mode it returns the configs for validation. In write mode it updates the current session pool before Start runs. |\n\nFor beginners, the rule is: **the canvas or chat creates a flow idea, then the Flow Compiler turns that idea into executable agent folders.**\n\nThe compiler does a few quiet but important safety jobs:\n\n- It starts from each agent template's `config.yaml`, then merges only the node's custom settings.\n- It clears and rebuilds managed connection fields, so stale arrows from an old pool do not survive by accident.\n- It understands special agent wiring such as `AND`, `OR`, `Asker`, `Forker`, `Counter`, `Ender`, `Stopper`, and `Cleaner`.\n- It writes `interconnection-scheme.csv` for Parametrizer nodes when mappings are saved in the `.flw`.\n- It keeps FlowCreator and FlowHypervisor out of runtime validation because they are helper/control agents, not normal flow workers.\n- It redacts known secrets for remote chat ingress agents such as **TeleTlamatini** and **WhatsTlamatini** when chat-created flows are exported.\n\nThis is the Pareto improvement: a small shared backend layer makes both major features safer. 
Chat-created flows and ACP-created flows now speak the same format before they touch the runtime.\n\n| Family | Members |\n|---|---|\nControl |\nStarter, Ender, Stopper, Cleaner, Sleeper, Croner |\nRouting |\nRaiser, Forker, Asker, Counter |\nLogic gates |\nOR, AND, Barrier |\nAction |\nExecuter, Pythonxer, Prompter, Summarizer, Crawler, Googler, Playwrighter, Apirer, Gitter, Ssher, Scper, Dockerer, Kuberneter, Pser, Jenkinser, Sqler, Mongoxer, Mover, Deleter, Shoter, Mouser, Keyboarder, Windower, File-Creator, File-Interpreter, File-Extractor, Image-Interpreter, J-Decompiler, De-Compresser, Telegramer, TeleTlamatini, WhatsTlamatini, ACPXer, Unrealer, Reviewer, Analyzer, Kalier, STM32er |\nCryptography |\nKyber-KeyGen, Kyber-Cipher, Kyber-DeCipher (CRYSTALS-Kyber post-quantum) |\nUtility |\nParametrizer, FlowBacker, Gatewayer, Gateway-Relayer, Node-Manager |\nTerminal / monitoring |\nMonitor-Log, Monitor-Netstat, Emailer, RecMailer, Notifier, Whatsapper, TelegramRX, FlowHypervisor |\nAI / design |\nFlowCreator |\n\nPer-agent details (config knobs, lifecycle, naming convention, log markers): see `BookOfTlamatini.md`\n\nPart IV — *The Tlamatini Bestiary*. To add a new agent, follow `Tlamatini/.agents/workflows/create_new_agent.md`\n\n(8-step checklist).\n\nTlamatini ships with a first-person **self-knowledge map** — `Tlamatini/agent/Tlamatini.md`\n\n— that the LLM reads as her own description of who and what she is: her two runtime modes (frozen vs source, and how to tell them apart), the ports she opens (`8000`\n\nfor the web app, `8765`\n\nfor the System-Metrics MCP, `50051`\n\nfor the Files-Search MCP), her main pages, her tech stack, her full capability surface, and how she can improve herself. The audience is the LLM alone, so the file deliberately does **not** follow `prompt.pmt`\n\n's HTML/contrast styling rules — it is a private self-reference, never rendered to users.\n\nThe map is injected into the system prompt at prompt-build time. 
`prompt.pmt`\n\ncarries a `<self_knowledge>{self_knowledge}</self_knowledge>`\n\nblock, and `agent/rag/config.py`\n\nfills it in: `_load_self_knowledge_block()`\n\nreads `Tlamatini.md`\n\n, brace-escapes it (`{`\n\n→ `{{`\n\n, `}`\n\n→ `}}`\n\n) so its code snippets cannot collide with the f-string template variables, and **fails open** — a missing, empty, or unreadable file degrades to a short literal notice instead of raising. The substitution happens at the single prompt-load site in `load_config_and_prompt()`\n\n, so it covers **all four chains** (basic, history-aware, unified, prompt-only) without adding a new input variable. `Tlamatini.md`\n\nis resolved from the application directory exactly like `prompt.pmt`\n\nand `config.json`\n\n(the install root next to the `.exe`\n\nin frozen mode, `Tlamatini/agent/`\n\nin source mode), and `build.py`\n\nships it both via `--add-data=…/Tlamatini.md;agent`\n\nand by copying it to the install root so frozen resolution next to the exe works. The identity rules in `prompt.pmt`\n\npoint the LLM at `Tlamatini.md`\n\nwhenever a prompt concerns who or what she is, her architecture / modes / ports / pages / internals, or improving herself.\n\nSelf-modification is a **second, independent capability axis**. The optional directory `Tlamatini/agent/TlamatiniSourceCode/`\n\n, when present, contains Tlamatini's own source code so she can read, inspect, and modify herself — present means a *self-able-modify* build; absent means a *not-self-able-modify* build (orthogonal to frozen vs source). The tree is bundled **only** when `build.py`\n\nis invoked with the new `--self-modify`\n\nflag (see [§7.2](#72-step-1--buildpy)); without the flag it is omitted entirely. 
Because the directory is optional, `prompt.pmt` instructs the LLM to **always verify the directory's presence** (for example, a Multi-Turn directory listing) before claiming she can read or edit her own code; if it is absent she says so and falls back to the injected self-knowledge plus the docs.\n\nWhen you load your own project as context (Context ▸ Set directory / Set file as context) and then ask a generic \"summarize the project / the source code / the provided context\" question, the **loaded context takes priority** over the always-injected self-knowledge, so Tlamatini summarizes *your* code, not herself. This is enforced by a `prompt.pmt` loaded-context-priority rule plus a deterministic scope header (`agent/rag/utils.py::prepend_loaded_context_scope()`) applied across all four chains.\n\nWhen you click **Set directory as context** in the Context menu, Tlamatini walks the directory, splits each file into chunks, and pushes every chunk through Ollama's embedding API to build a FAISS index. On a laptop / consumer GPU a heavy embedding model can occupy 75–95% of total VRAM by itself, and once a chat model is also resident the combined footprint exceeds available memory and the daemon thrashes RAM↔VRAM swap on every embed batch. A 30-second context-load turns into a multi-hour stall.\n\nThe **embedding-memory pre-flight guard** (`Tlamatini/agent/embedding_memory_guard.py`) catches this before the embed burst starts. It runs only when an NVIDIA GPU is detected; on CPU-only / AMD / Apple Silicon hosts it is a silent no-op and the legacy load path is unchanged.\n\nThe trigger case is the dev box this codebase is calibrated on: an **NVIDIA GeForce RTX 4070 Laptop GPU with 8,188 MiB** of VRAM. The previously-configured embedding model `qwen3-embedding:8b` (7.6 B parameters, Q4_K_M quantization) sits at **~6.24 GB resident**, which is 77.9% of total VRAM. Add the chat model and the daemon evicts something on every embed batch. 
The fix is either: switch to a smaller model (e.g. `nomic-embed-text:v1.5`\n\nat ~0.60 GB resident, 7% saturation) or accept the slow path knowingly. The guard surfaces the choice **before** the heavy work starts, so you can abort, swap the model in `config.json`\n\n, and restart Ollama — saving an hour of debugging \"why is context loading frozen?\".\n\nThe guard is wired into `agent/consumers.py::setup_contextual_rag_chain`\n\nat exactly one point: **after** the \"loading context\" banner is broadcast to the chat, **before** the heavy `asyncio.to_thread(setup_llm_with_context, …)`\n\ncall that drives the embedding burst. The flow is:\n\n```\nWebSocket \"set-directory-as-context\"\n    ↓\nconsumers.py:setup_contextual_rag_chain(path_only)\n    ↓\nbroadcast MSG_AGENT_LOADING_CONTEXT chat bubble\n    ↓\n► embedding_memory_guard.check_embedding_memory_for_directory(...)\n    │\n    ├─► returns None (no GPU / under threshold / probe failed)\n    │       → proceed silently\n    │\n    └─► returns warning dict\n            → broadcast HTML warning chat bubble\n            → proceed anyway (informational, non-blocking)\n    ↓\nasyncio.to_thread(setup_llm_with_context, ...)\n    └─► OllamaEmbeddings + FAISS.from_documents(...)   ← VRAM burst\n```\n\nThe check runs inside `asyncio.to_thread`\n\nso a slow `nvidia-smi`\n\nor cold `/api/show`\n\ncall never blocks the Channels event loop. The whole block is wrapped in `try/except Exception`\n\nso any unhandled probe error prints a one-line `[EMBED-MEM] Pre-flight check skipped (fail-open)`\n\nto `tlamatini.log`\n\nand the load continues — **a diagnostic must never block the user**.\n\nThe guard reuses the **already-cached** `nvidia-smi -L`\n\nprobe from `agent/gpu_perf.py::_has_nvidia_gpu()`\n\n(introduced for the model-pinning hook). The probe runs at most once per process; subsequent calls hit the in-module cache. 
On CPU-only Linux/Windows, AMD GPUs, and Apple Silicon the probe returns `False`\n\nonce at server start and **every subsequent call to the guard returns None immediately** — no subprocesses spawned, no HTTP calls made, no overhead.\n\nThis is the **portability guarantee**: a fresh `git pull`\n\non a no-GPU box behaves *exactly* as before the guard existed. The 28 dedicated no-GPU compatibility tests (see 10.7) lock the contract in place.\n\nWhen the GPU gate passes, the guard predicts the embedding model's resident VRAM in priority order:\n\n| Tier | Source | Trigger | Accuracy |\n|---|---|---|---|\nA |\n`GET /api/ps` `size_vram` |\nModel already resident in Ollama | Exact — verbatim daemon value |\nB |\n`POST /api/show` → `parameter_count × bits_per_weight × overhead` |\nModel pulled but unloaded | ±5% on calibrated models |\nC |\n(any probe failure) | Ollama down, model not pulled, cloud model (`:cloud` suffix) |\nReturns `None` → fail-open |\n\nTier B uses a standard llama.cpp / GGUF bits-per-weight table:\n\n| Quant | Bits/weight | Quant | Bits/weight |\n|---|---|---|---|\n`F32` |\n32.0 | `Q4_K_M` |\n4.83 |\n`F16` / `BF16` |\n16.0 | `Q4_K_S` |\n4.58 |\n`Q8_0` |\n8.5 | `Q4_0` |\n4.55 |\n`Q6_K` |\n6.56 | `Q3_K_M` |\n3.91 |\n`Q5_K_M` |\n5.69 | `Q2_K` |\n2.96 |\n\nUnknown quants fall back to a conservative `5.0`\n\nbits/weight. 
The overhead multiplier accounts for KV cache + activation buffers + GGML allocator slack:\n\n- `× 1.40` for models with **≥ 1 B parameters** (large-model regime)\n- `× 2.20` for **sub-1 B** models (proportionally larger KV/buffer overhead)\n\nCalibration against live measurements on the RTX 4070 Laptop:\n\n| Model | Params × bits/8 (raw) | Predicted (× overhead) | Measured resident | Error |\n|---|---|---|---|---|\n| `qwen3-embedding:8b` (Q4_K_M) | 4.54 GB | 6.36 GB (× 1.40) | 6.24 GB | +1.9% |\n| `Nomic-Embed-Text:latest` (F16) | 274 MB | 603 MB (× 2.20) | 600 MB | +0.5% |\n\nTier B also pulls the **embedding dimension** from `/api/show` via any architecture-prefixed `*.embedding_length` key (e.g. `qwen3.embedding_length=4096`, `nomic-bert.embedding_length=768`). Combined with a directory pre-scan that mirrors the exclusion rules of `agent/rag/factory.py::CustomTextLoader`, it reports a **projected FAISS-index RAM size** (`num_chunks × embedding_dim × 4` bytes, float32). This is RAM, not VRAM, but useful to surface on directories with hundreds of thousands of chunks.\n\nThe guard fires when **predicted_vram ≥ 0.80 × smallest-GPU total VRAM**. Why the smallest GPU (rather than the sum or the largest)? Because Ollama loads each model into a **single device**; using the max would silently under-report the constraint on heterogeneous multi-GPU rigs.\n\nWhen the threshold is crossed, the guard returns a structured dict the consumer renders as an HTML chat bubble. A real example from this dev box (artificially threshold-lowered to 70% so qwen3-embed:8b trips):\n\n```\n⚠️ Embedding-memory warning\nEmbedding model qwen3-embedding:8b needs ~6,378 MiB of VRAM\n(currently resident in VRAM), which is 77.9% of the smallest\nGPU's total (8,188 MiB) — above the safety threshold of 70%.\nProjected FAISS vector store (RAM, not VRAM): ~28 MiB across\n1,847 chunks at dim 4096.\nContext loading will continue, but expect slow embedding\nthroughput or RAM↔VRAM swap. 
To eliminate the pressure, switch\nembeding-model in config.json to a smaller model\n(e.g. nomic-embed-text:v1.5) and restart.\n```\n\nThe message is **informational and non-blocking**: context loading proceeds. The user picks whether to wait it out, hit Cancel, or change models. The phrasing names the exact `config.json` key (`embeding-model`, with the spelling preserved from the existing codebase) and a concrete alternative.\n\n| Knob | Where | Default | When to change |\n|---|---|---|---|\n| Trigger threshold | `check_embedding_memory_for_directory(..., threshold=)` | `0.80` | Pass `0.70` on smaller GPUs (6 GB cards) where 80% is already too tight. |\n| Large-model overhead | `_OVERHEAD_LARGE` constant | `1.40` | If a new model family proves the calibration off by > 10%, recalibrate against `/api/ps` and bump the constant. |\n| Small-model overhead | `_OVERHEAD_SMALL` constant | `2.20` | Same calibration story for sub-1B models. |\n| Bits/weight table | `_QUANT_BITS` dict | `Q4_K_M=4.83` (etc.) | Add new entries when a future GGUF quant ships. |\n\nWhat the guard **does NOT** do, by deliberate choice:\n\n- It does **not** abort context loading. The warning is informational. (If you want abort-on-warning behavior, wire a confirm/cancel WebSocket round-trip; the surface is described in `agent_page_init.js` near `set-dir-context`.)\n- It does **not** estimate the **chat** model's VRAM. Only the embedding model is checked, because that is the model the directory-load path forces into VRAM. The chat model is handled by `gpu_perf.pin_ollama_model` separately.\n- It does **not** persist warnings. Each context-load runs an independent check.\n- It does **not** call `nvidia-smi` on CPU-only hosts. Both gates (`_has_nvidia_gpu_cached` and the `_gpu_total_memory_bytes` query) short-circuit before any subprocess spawn.\n
- It does **not** add new dependencies. `subprocess`, `urllib.request`, and `os.walk` are the only stdlib touchpoints, the same surface `agent/gpu_perf.py` already uses.\n\nThe guard ships with **49 automated tests** in `Tlamatini/agent/test_embedding_memory_guard.py`, split into seven `SimpleTestCase` classes:\n\n| Test class | Count | What it pins |\n|---|---|---|\n| `QuantTableTests` | 2 | Known quants resolve to standard bits/weight; unknown quants fall back to the conservative default. |\n| `PredictFromShowTests` | 3 | Tier-B prediction lands within calibrated bounds for both 7.6 B and 137 M reference models. |\n| `EmbeddingDimExtractionTests` | 2 | The dim key is found regardless of architecture prefix (`qwen3.`, `nomic-bert.`, future archs). |\n| `ChunkEstimatorTests` | 4 | Directory walk honors default + user omissions, respects `max_chunks_per_file`, and handles single-file mode. |\n| `GuardEntryPointTests` | 8 | All entry-point branches: no-GPU, cloud, threshold gate, Tier A `/api/ps`, Tier B `/api/show`, probe failure. |\n| `FormatMessageTests` | 2 | HTML renders the model name, percent, threshold, and chunk count. |\n| `NoGpuCompatibilityTests` | 28 | Every no-GPU failure mode (see breakdown below). |\n\nThe `NoGpuCompatibilityTests` class is the portability proof. 
Its coverage matrix:\n\n| Failure mode | Tests |\n|---|---|\n| Module import on no-GPU host has no side effects | `test_module_imports_without_side_effects` |\n`nvidia-smi` binary missing entirely |\n`test_run_cmd_returns_127_for_real_missing_binary` , `test_total_vram_returns_none_when_nvidia_smi_missing` |\n`nvidia-smi` exists but driver unloaded |\n`test_total_vram_returns_none_when_driver_unloaded` |\n`nvidia-smi` times out / crashes |\n`test_run_cmd_absorbs_timeout` / `..._permission_error` / `..._generic_oserror` |\n`nvidia-smi` returns empty or garbage output |\n`test_total_vram_returns_none_on_empty_output` / `..._on_unparseable_output` |\n| Heterogeneous multi-GPU rig | `test_total_vram_picks_smallest_gpu_in_heterogeneous_rig` |\n`gpu_perf` module missing / its probe raises |\n`test_has_nvidia_gpu_falls_back_when_gpu_perf_unimportable` , `test_has_nvidia_gpu_returns_false_when_gpu_perf_probe_raises` |\n| Ollama daemon offline (closed port) | `test_ollama_show_returns_none_against_closed_port` , `test_ollama_ps_returns_none_against_closed_port` |\n| Malformed Ollama URLs / empty args | `test_ollama_show_returns_none_for_garbage_url` |\nModel not in `/api/ps` |\n`test_ollama_loaded_vram_returns_none_when_model_not_in_ps` , `..._when_ps_fails` |\n| Entry on a CPU-only host | `test_check_returns_none_on_cpu_only_host` |\nGPU detected but `--query-gpu` fails |\n`test_check_returns_none_when_nvidia_smi_query_fails` |\n| GPU detected but Ollama offline | `test_check_returns_none_when_ollama_offline` |\n| Pathological 0 MiB GPU reading | `test_check_returns_none_when_gpu_zero_total` |\nEmpty `ollama_base_url` in config |\n`test_check_returns_none_for_empty_base_url` |\n| Deleted / nonexistent / empty path | 3× `test_chunk_estimator_*` + `test_check_with_nonexistent_path_does_not_crash` |\n| Unreadable file inside the walked tree | `test_chunk_estimator_with_unreadable_file_skips_it` |\n| Partial warning dict (missing optional keys) | 
`test_format_warning_message_handles_missing_optional_keys` |\nLive portability proof (real subprocess + real urllib) |\n`test_real_entry_point_call_never_raises` |\n\n`test_real_entry_point_call_never_raises`\n\nis the CI gate: it makes the *actual* `subprocess.run([\"nvidia-smi\", ...])`\n\nand `urlopen(\"http://127.0.0.1:11434/...\")`\n\ncalls against whatever the runner offers, and asserts the return is **either** `None`\n\n**or** a well-formed warning dict — never an exception. The same test passes on this RTX 4070 dev box (returns `None`\n\nbecause qwen3-embed sits at 77.9%, under the 80% gate) and on a CPU-only CI image (returns `None`\n\nbecause the GPU gate fails fast).\n\nRun them yourself:\n\n```\ncd Tlamatini\npython manage.py test agent.test_embedding_memory_guard --verbosity=2\n# 49 tests in ~2.3 s, no DB setup, no GPU required.\n```\n\nTlamatini now ships a three-tier reaper (`Tlamatini/agent/orphan_reaper.py`\n\n) that cleans up the console-host children every console subprocess on Windows drags behind it. Without this pass, users were occasionally seeing `conhost.exe`\n\nprocesses lingering in Task Manager **with the Tlamatini icon** — the icon is inherited from the parent EXE that spawned the console — and reasonably concluding that Tlamatini was leaking processes.\n\nOn Windows, when a tool (`execute_command`\n\n, `chat_agent_executer`\n\n, an ACPX CLI child, an agent-pool Python subprocess, …) spawns a console child, Windows allocates a `conhost.exe`\n\ncompanion to host that console. If the immediate parent dies before the OS reaps the console pair, that `conhost.exe`\n\noutlives Tlamatini. 
Two compounding causes were closed at once:\n\n**The reaper itself** sweeps zombies and orphaned console hosts at three lifecycle points (below).**Spawn sites were hardened**—`views.py::execute_starter_agent_view`\n\n,`execute_ender_agent_view`\n\n,`restart_agent_view`\n\n,`execute_flowcreator_view`\n\n, every ACPX child in`acpx/runtime.py`\n\n, and a`subprocess.Popen.__init__`\n\nguard at the top of`agents/ender/ender.py`\n\n(mirrored across every other pool agent) now spawn with`CREATE_NO_WINDOW | DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP`\n\nand stdio piped to`DEVNULL`\n\n. No console is allocated in the first place, so no`conhost.exe`\n\ncompanion exists to orphan.\n\n| Tier | When it runs | Scope | Visibility |\n|---|---|---|---|\nTier 1 |\nAfter every Multi-Turn tool call that may have spawned a child (`execute_command` , `execute_file` , `unzip_file` , `decompile_java` , `googler` , `agent_starter/stopper/parametrizer` , every `chat_agent_*` , every `acp_*` ). Driven by `MultiTurnToolAgentExecutor._reap_after_tool()` in `agent/mcp_agent.py` . Also fires on the tool-exception path so a crashed tool still gets cleaned up. |\nDead/zombie descendants of the current PID, plus orphaned `conhost.exe` whose parent is gone. Pool-cmdline scan is skipped here (cheap path). |\nSilent. Survivors accumulate on the executor for Tier 2 to surface. |\nTier 2 |\nOnce, right after the final answer is broadcast to the user. Driven by `AgentConsumer._tier2_orphan_sweep()` in `agent/consumers.py` . Runs in a thread (so it doesn't block the WebSocket loop) and merges its survivors with Tier 1's leftovers (de-duped by PID). |\nSame as Tier 1 plus the agent-pool cmdline scan (kills processes whose `cmdline` references `agents/pools/...` but are not tracked by `AgentProcess` / `ChatAgentRun` anymore). |\nIf anything survives BOTH tiers, the consumer sends a second chat message listing every surviving `name + PID` pair so the user can end them manually from Task Manager. 
|\nTier 3 |\nAt Tlamatini.exe shutdown — `AgentConfig.ready()` registers it on the same `atexit` / SIGINT / SIGBREAK path that already cleans up pools. |\nFull sweep (self-tree + pool cmdline + console-host orphans). | Logs `--- [Tier-3 reaper] killed=… survivors=… errors=…` to `tlamatini.log` . Survivors are listed by `name (PID)` so a post-mortem reader can audit what refused to die. |\n\nA process is considered a \"Tlamatini orphan\" if **any** of the following hold:\n\n- It is a descendant of the current Tlamatini PID and its status is\n`ZOMBIE`\n\n/`DEAD`\n\n. - It is a\n`conhost.exe`\n\n/`openconsole.exe`\n\nwhose parent PID is in our process tree, OR whose parent PID no longer exists. - Its\n`cmdline`\n\nreferences the agent-pool directory (`agents/pools/...`\n\nor`agents/pools/_chat_runs_/...`\n\n) but it is no longer tracked.\n\nEach candidate is escalated `terminate → wait 1 s → kill`\n\nvia `psutil`\n\n; an \"unable-to-kill\" outcome surfaces as a survivor, never as an exception. The reaper **never raises into the caller** — a cleanup that crashes the chat path would be worse than the orphans it tries to kill.\n\nOut of scope on purpose: console hosts spawned by unrelated processes (a different IDE, your shell, another app's child) — the parentage check keeps the sweep narrow.\n\nWhen Tier 2 detects survivors, the user sees a second chat bubble immediately after the main answer:\n\n```\n⚠ Heads-up: Tlamatini tried to clean up after this request but the following\nprocess(es) refused to terminate. 
They are most likely harmless leftovers from\na tool you ran, but if you do not recognize them please end them manually from\nTask Manager so no Tlamatini-spawned child outlives the app:\n  • conhost.exe — PID 18244\n  • python.exe — PID 19108\n```\n\nThe rendering helper is `orphan_reaper.format_survivors_message()`; it returns `None` (so no extra message is sent) when the survivor list is empty, which is the common case after the spawn-site hardening landed.\n\n- \"connection refused\" → `ollama serve` in a dedicated terminal. Check `ollama_base_url`.\n- Model not found → `ollama list` to see what's pulled. Pull the missing tag.\n- Remote Ollama → set `ollama_token` for bearer auth.\n\n- Set-Context shows no green banner → check file permissions, ensure files are text not binary.\n- \"Out of memory\" during embedding → fallback mode kicks in; retrieval quality drops, files still accessible. Switch to a smaller embedding model. **See chapter 10: the embedding-memory pre-flight guard now warns you about this *before* the embed burst starts on GPU hosts.**\n- Hit `max_doc_chars` → bump it.\n- Session says it was restored after a refresh, but the input stays disabled briefly → that is expected while the contextual RAG chain rebuilds. Wait for the ready state / spinner to clear before sending the next prompt.\n\n- Did you tick **Multi-Turn**? Is `enable_unified_agent: true`?\n- \"Tool X is not available\" → the planner did not bind X. Check `[Planner._select]` console lines, add matching keywords to your prompt, or raise `max_selected_tools`.\n- 4096 iterations exhausted → likely a busy-poll loop. 
Use\n`chat_agent_sleeper`\n\n/`chat_agent_run_wait`\n\ninstead.", "url": "https://wpnews.pro/news/tlamatini-local-first-ai-dev-assistant-with-68-agents-and-hybrid-rag", "canonical_source": "https://github.com/XAIHT/Tlamatini", "published_at": "2026-05-27 14:27:43+00:00", "updated_at": "2026-05-27 14:46:32.996032+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-products", "large-language-models", "artificial-intelligence"], "entities": ["Tlamatini", "Ollama", "Claude Code", "Cursor", "Codex", "Gemini", "Qwen", "STM32"], "alternates": {"html": "https://wpnews.pro/news/tlamatini-local-first-ai-dev-assistant-with-68-agents-and-hybrid-rag", "markdown": "https://wpnews.pro/news/tlamatini-local-first-ai-dev-assistant-with-68-agents-and-hybrid-rag.md", "text": "https://wpnews.pro/news/tlamatini-local-first-ai-dev-assistant-with-68-agents-and-hybrid-rag.txt", "jsonld": "https://wpnews.pro/news/tlamatini-local-first-ai-dev-assistant-with-68-agents-and-hybrid-rag.jsonld"}}