How I Built AegisDesk: A Zero-Token Semantic IT Agent with <5ms Latency

The article describes the development of AegisDesk, an open-source, multi-agent IT helpdesk platform that replaces traditional LLM-based intent routing with a deterministic, zero-token semantic routing system. By using local sentence embeddings and cosine similarity comparisons against an offline vocabulary, the system routes queries to specialized sub-agents in approximately 4.5 milliseconds with no API costs, while incorporating dynamic few-shot learning via SQLite and robust security measures like RCE and SSRF defenses.

If you’ve built AI agents recently, you know the standard playbook: you take a user's prompt, feed it into GPT-4 or Claude alongside a massive JSON schema of available tools, and ask the LLM to figure out which tool to use.

This works for prototypes. But in an enterprise IT environment, it’s a disaster. Using an LLM for intent routing takes anywhere from 800ms to 2,000ms. It burns API tokens on every single "hello" or "my laptop is broken" message. Worse, LLMs hallucinate: if a user asks to "Provision an Azure SQL database," an overly helpful LLM might hallucinate a non-existent tool call and crash your pipeline.

I wanted to build an autonomous IT helpdesk agent that was deterministic, instant, and practically free to run. That led me to build AegisDesk, an open-source, multi-agent IT platform powered by LangGraph, SQLite, and Zero-Token Semantic Routing.

The Architecture: Zero-Token Routing

Instead of relying on a monolithic prompt, AegisDesk abandons LLM-based routing entirely. When a query enters AegisDesk, the routing step never hits the cloud. Instead, the local pipeline intercepts the query and embeds it using the BAAI/bge-small-en-v1.5 sentence-transformer model via ONNX (fastembed). This local vector is then compared via cosine similarity against an offline vocabulary of IT intents:

- network diagnostics: ping, traceroute, nmap, tcp, udp
- cloud integrations: okta, jira, aws, azure, cyberark
- web scraping: wiki, internal docs, cve lookup

The result? The query is routed to the correct, highly specialized LangGraph sub-agent in ~4.5 milliseconds for $0.00.

TIP: Enterprise Safety Net. If the semantic match confidence falls below 0.55, AegisDesk refuses to guess. It falls back to a generalized, read-only RAG (Retrieval-Augmented Generation) agent, so a low-confidence match never triggers a destructive command.

Dynamic Few-Shot Learning via SQLite

Static keywords are great, but IT environments evolve.
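
As a reference point for the static routing described above, here is a minimal sketch of cosine-similarity routing with a confidence floor. It is not AegisDesk's actual code: the `embed` function is a toy token-count stand-in for the bge-small embedding so the example runs with the standard library alone, and the names `route`, `FALLBACK`, and `THRESHOLD` are assumptions made for illustration. Only the intent vocabulary and the 0.55 cutoff come from the article.

```python
import math
import re
from collections import Counter

# Intent vocabulary copied from the article.
INTENTS = {
    "network diagnostics": "ping traceroute nmap tcp udp",
    "cloud integrations": "okta jira aws azure cyberark",
    "web scraping": "wiki internal docs cve lookup",
}
FALLBACK = "read-only rag"  # hypothetical label for the fallback agent
THRESHOLD = 0.55            # confidence floor quoted in the article


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Embed the vocabulary once at startup, not once per query.
INTENT_VECTORS = {name: embed(words) for name, words in INTENTS.items()}


def route(query: str) -> tuple[str, float]:
    """Return (agent, score); use the fallback when the best score is too low."""
    q = embed(query)
    best, score = max(
        ((name, cosine(q, vec)) for name, vec in INTENT_VECTORS.items()),
        key=lambda pair: pair[1],
    )
    return (best, score) if score >= THRESHOLD else (FALLBACK, score)
```

With a real embedding model the scores would reflect meaning rather than exact token overlap, but the control flow stays the same: score every intent, take the best, and refuse to act on it below the threshold.
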
What happens when a user types an obscure proprietary software name that isn't in our offline vocabulary? To solve this, I integrated Dynamic Few-Shot Learning directly into the routing layer using SQLite Graph Memory.

When AegisDesk initializes, it queries a routing examples table inside an ACID-compliant SQLite database. It extracts historical, successfully resolved IT tickets and embeds them dynamically into the routing corpus.

If an administrator notices the agent struggling with a query like "Run a traceroute to internal-git.corp", they can manually inject the learning directly via the CLI:

    aegisdesk teach-router "Run a traceroute to internal-git.corp" it support network diagnostics

The next time the router boots, it embeds that exact phrase. The system effectively "fine-tunes" its routing logic at each startup, achieving 90% strict-match routing accuracy without a single line of Python code being altered.

Zero-Trust Security Boundaries

Building an autonomous agent that can execute ipconfig, ping, or scrape internal HR wikis is inherently dangerous. AegisDesk implements two critical security mitigations at the tool execution layer:

- RCE (Remote Code Execution) Defense: Subprocess execution explicitly enforces shell=False. Before any command touches the OS, inputs are scrubbed using a strict regex, [^a-zA-Z0-9.-], to eliminate shell metacharacters (&, |, ;, $).
- SSRF (Server-Side Request Forgery) Defense: The Web Scraping agent is hardened against TOCTOU (Time-Of-Check to Time-Of-Use) attacks. Outbound HTTP requests undergo pre-flight DNS checks. Any resolution attempting to hit loopback (127.0.0.1) or the link-local cloud metadata address (169.254.169.254) is aborted at the socket level.

Even with these defenses, AegisDesk uses LangGraph's interrupt_before functionality to trigger Human-in-the-Loop (HITL) confirmations before executing any terminal command.

Try It Out

AegisDesk proves that you don't need massive, bloated monolithic LLMs to build intelligent enterprise agents.
By pairing lightning-fast deterministic routing with specialized LangGraph swarms, you can build systems that are safer, cheaper, and far faster at the routing step (about 4.5 milliseconds versus 800ms to 2,000ms).

You can install the CLI directly from PyPI today:

    pip install aegisdesk

Check out the full source code and documentation on GitHub: github.com/sitanshukr08/Aegisdesk

If you’re building multi-agent swarms or semantic routers, I’d love to hear your thoughts in the comments.
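
For readers who want something concrete, the Dynamic Few-Shot Learning step described earlier can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: the table name `routing_examples`, its two columns, and the helpers `teach` and `load_corpus` are all hypothetical, since the article names a routing examples table but not its schema.

```python
import sqlite3

# Hypothetical schema: the article does not specify the table's columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS routing_examples (
    phrase TEXT PRIMARY KEY,
    intent TEXT NOT NULL
)
"""


def teach(conn: sqlite3.Connection, phrase: str, intent: str) -> None:
    """Store one labeled phrase; re-teaching a phrase overwrites its intent."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO routing_examples (phrase, intent) VALUES (?, ?)",
            (phrase, intent),
        )


def load_corpus(conn: sqlite3.Connection, static_vocab: dict[str, str]) -> dict[str, list[str]]:
    """Merge static keywords with taught phrases into {intent: [texts]}."""
    corpus = {intent: [words] for intent, words in static_vocab.items()}
    rows = conn.execute("SELECT phrase, intent FROM routing_examples ORDER BY phrase")
    for phrase, intent in rows:
        corpus.setdefault(intent, []).append(phrase)
    return corpus


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
teach(conn, "Run a traceroute to internal-git.corp", "network diagnostics")
corpus = load_corpus(conn, {"network diagnostics": "ping traceroute nmap tcp udp"})
```

At startup, a router would embed every text in `corpus` and score incoming queries against all of them, so a taught phrase becomes one more reference point for its intent without any code change.
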
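
The two mitigations from the Zero-Trust Security Boundaries section can also be sketched with the standard library. Treat this as one possible shape rather than AegisDesk's code: the function names are hypothetical, the sketch rejects unsafe input instead of scrubbing it, the leading-hyphen check is an extra guard that the article does not mention, and the `-c` flag assumes a Linux or macOS `ping`.

```python
import ipaddress
import re
import socket
import subprocess

# Negated character class from the article: anything outside it is unsafe.
UNSAFE = re.compile(r"[^a-zA-Z0-9.-]")


def safe_host(value: str) -> str:
    """Reject input with disallowed characters instead of silently rewriting it."""
    # The leading "-" check (an assumption) blocks option injection such as "-f".
    if not value or value.startswith("-") or UNSAFE.search(value):
        raise ValueError(f"unsafe argument: {value!r}")
    return value


def ping(host: str) -> subprocess.CompletedProcess:
    """Run ping from an argument list; with shell=False no shell parses the input."""
    return subprocess.run(
        ["ping", "-c", "1", safe_host(host)],  # "-c" is the Linux/macOS count flag
        shell=False, capture_output=True, text=True, timeout=10, check=False,
    )


def is_blocked(ip: str) -> bool:
    """True for loopback and link-local targets, which covers 169.254.169.254."""
    addr = ipaddress.ip_address(ip)
    return addr.is_loopback or addr.is_link_local


def resolve_checked(hostname: str, port: int = 443) -> str:
    """Resolve once, vet every returned address, and return the IP to connect to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    ips = [info[4][0] for info in infos]
    if not ips or any(is_blocked(ip) for ip in ips):
        raise PermissionError(f"blocked destination for {hostname!r}")
    return ips[0]
```

The TOCTOU gap closes only if the caller connects to the IP returned by `resolve_checked` rather than resolving the hostname a second time, because a second lookup could return a different address than the one that was vetted. Private ranges such as 10.0.0.0/8 are deliberately left reachable here, since the article's scraper targets internal wikis.
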