How Sentience ships 60+ AI tools in one local desktop app — without locking you into one provider

wpnews.pro

I wanted Cursor's UX and Zo Computer's tool breadth in a desktop app I could close the laptop lid on. So I built Sentience — a PySide6 desktop AI assistant that runs entirely on your machine, brings its own browser, its own email client, its own voice controller, and exposes 60+ tool functions to the model.

The hard part was never the tools. The hard part was making 60+ tool schemas work identically across Groq, OpenAI, Anthropic, and a local Ollama instance — without writing a provider-specific tool dispatcher and without forcing the user to think about which model they happen to be on today.

This is the part of the codebase I'm actually proud of, and it's the part nobody ships as a tutorial.

OpenAI's /v1/chat/completions

format is now a de facto standard. Groq implements it. Ollama implements it. Localai implements it. So three out of four providers I wanted to support "just work" with one HTTP call — if you're willing to accept the OpenAI tool schema as ground truth.

Anthropic doesn't. The Messages API uses:

system

field instead of a system message in the arrayx-api-key

instead of Authorization: Bearer

anthropic-version: 2023-06-01

as a required headertool_use

/ tool_result

content block format on the response sideSo the question is: do you write two tool dispatchers and two execution paths, or do you write a thin adapter that lets you keep one unified tool list and one dispatcher?

I chose the adapter. The whole provider layer is 60 lines.

PROVIDERS = {
    "groq": {
        "name": "Groq",
        "base_url": "https://api.groq.com/openai/v1",
        "models": ["llama-3.3-70b-versatile", "llama-3.1-70b-versatile",
                   "llama-3.2-90b-vision-preview", "mixtral-8x7b-32768"],
        "free_tier": True,
    },
    "openai": {
        "name": "OpenAI",
        "base_url": "https://api.openai.com/v1",
        "models": ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-3.5-turbo"],
        "free_tier": False,
    },
    "anthropic": {
        "name": "Anthropic",
        "base_url": "https://api.anthropic.com/v1",
        "models": ["claude-3-5-sonnet-20241022", "claude-3-opus-20240229",
                   "claude-3-haiku-20240307"],
        "free_tier": False,
    },
    "ollama": {
        "name": "Ollama (Local)",
        "base_url": os.getenv("OLLAMA_HOST", "http://localhost:11434/v1"),
        "models": ["llama3.2", "llama3.1", "codellama", "mistral", "qwen2.5"],
        "free_tier": True,
    },
}

Free tier is a first-class field, not a comment. The settings dialog lights up the "free" badge for Groq and Ollama, and the README leads with those two for new users.

chat()

The whole class has one entry point and two private methods. The entry point picks a path based on self.provider

. That's it.

class AIClient:
    def __init__(self, provider: str, model: str, api_key: str = ""):
        self.provider = provider
        self.model = model
        self.api_key = api_key
        self.config = PROVIDERS.get(provider, PROVIDERS["groq"])

    def chat(self, messages, tools=None):
        if self.provider == "anthropic":
            return self._chat_anthropic(messages, tools)
        return self._chat_openai_compatible(messages, tools)

    def _chat_openai_compatible(self, messages, tools=None):
        headers = {"Content-Type": "application/json"}
        if self.api_key:
            headers["Authorization"] = f"Bearer {self.api_key}"
        payload = {
            "model": self.model,
            "messages": messages,
            "max_tokens": 4096,
            "temperature": 0.7,
        }
        if tools:
            payload["tools"] = tools
            payload["tool_choice"] = "auto"
        try:
            resp = requests.post(
                f"{self.config['base_url']}/chat/completions",
                headers=headers, json=payload, timeout=60,
            )
            resp.raise_for_status()
            return resp.json()
        except Exception as e:
            return {"error": str(e)}

    def _chat_anthropic(self, messages, tools=None):
        headers = {
            "Content-Type": "application/json",
            "x-api-key": self.api_key,
            "anthropic-version": "2023-06-01",
        }
        system_msg = None
        anthropic_messages = []
        for msg in messages:
            if msg["role"] == "system":
                system_msg = msg["content"]
            else:
                anthropic_messages.append(msg)
        payload = {
            "model": self.model,
            "messages": anthropic_messages,
            "max_tokens": 4096,
        }
        if system_msg:
            payload["system"] = system_msg
        if tools:
            payload["tools"] = tools
        try:
            resp = requests.post(...)
            ...

The Anthropic branch has exactly three differences from the OpenAI branch: header names, system-message location, and the path. Everything else — the tools

list, the messages

array, the max_tokens

field — is identical. So the same 60 tool schemas I registered for Groq work on Claude without rewriting a single function definition.

This means I can hand the user a dropdown that says "switch to Claude Sonnet" and the next message the user types goes to Anthropic's API with the exact same tool surface. The model can call read_file

, list_directory

, browser_navigate

, send_email

, and oauth_github_login

on any of the four providers. The dispatcher doesn't care.

The tools live in their own modules and are aggregated with a single *

spread:

from browser.automation import BROWSER_TOOLS, PLAYWRIGHT_AVAILABLE
from email_agent.client import EMAIL_TOOLS, init_email, execute_email_tool
from oauth_manager.manager import OAUTH_TOOLS, get_oauth_manager, execute_oauth_tool
from voice.controller import VOICE_TOOLS, get_voice_controller, execute_voice_tool
from skills.registry import SKILL_TOOLS, get_skill_registry, execute_skill_tool
from hosting.server import HOSTING_TOOLS, get_hosting_server

TOOLS = [
    {"type": "function", "function": {"name": "read_file", ...}},
    {"type": "function", "function": {"name": "write_file", ...}},
    {"type": "function", "function": {"name": "list_directory", ...}},
    {"type": "function", "function": {"name": "run_command", ...}},
    {"type": "function", "function": {"name": "search_files", ...}},
    *BROWSER_TOOLS,    # browser_navigate, browser_click, browser_screenshot, ...
    *EMAIL_TOOLS,      # email_read_inbox, email_send, email_search
    *OAUTH_TOOLS,      # oauth_google_login, oauth_github_login, oauth_notion_login
    *VOICE_TOOLS,      # voice_listen, voice_speak, voice_set_wake_word
    *SKILL_TOOLS,      # skill_list, skill_load, skill_run
    *HOSTING_TOOLS,    # hosting_start, hosting_stop, hosting_status, hosting_logs
]

Each submodule exports both the schema (BROWSER_TOOLS

, EMAIL_TOOLS

, etc.) and the executor (execute_browser_tool

, execute_email_tool

). The schemas are OpenAI function-calling dicts. The executors are the actual Python functions the dispatcher calls when the model invokes the tool.

def execute_tool(name: str, args: dict, workspace: str) -> dict:
    try:
        if name == "read_file":
            path = Path(args.get("path", ""))
            if not path.is_absolute():
                path = Path(workspace) / path
            if path.exists():
                return {"success": True, "content": path.read_text()[:10000]}
            return {"success": False, "error": "File not found"}

        elif name == "run_command":
            result = subprocess.run(
                args.get("command", ""), shell=True,
                capture_output=True, text=True, timeout=30,
                cwd=args.get("cwd", workspace),
            )
            return {"success": True, "stdout": result.stdout[:5000],
                    "stderr": result.stderr[:5000], "exit_code": result.returncode}

        elif name.startswith("browser_"):
            return execute_browser_tool(name, args)
        elif name.startswith("email_"):
            return execute_email_tool(name, args)
        elif name.startswith("oauth_"):
            return execute_oauth_tool(name, args)
        elif name.startswith("voice_"):
            return execute_voice_tool(name, args)
        elif name.startswith("skill_"):
            return execute_skill_tool(name, args)
        elif name.startswith("hosting_"):
            return execute_hosting_tool(name, args)
        else:
            return {"success": False, "error": f"Unknown tool: {name}"}
    except Exception as e:
        return {"success": False, "error": str(e)}

Two patterns I want to call out:

Every executor returns {"success": bool, ...}. The model gets a uniform response shape, no matter which tool blew up. The model can then decide to retry, escalate, or just tell the user "I couldn't read that file." This is what makes the system actually usable when one of the 60 tools fails mid-conversation.

All file paths are resolved against the workspace. The model never gets to specify an absolute path that the user didn't authorize. Path(workspace) / path

is a tiny line, but it's the line that means "I can run this app on a stranger's laptop and not worry about ~

-escape exploits."

The killer feature, in practice, is the settings dropdown. The user can:

llama-3.3-70b-versatile

)The tool dispatcher doesn't know or care which provider is in self.provider

. The tool list doesn't change. The chat history doesn't get a special "this is the Anthropic thread" branch. The 800-line main.py

has one AIClient

and one execute_tool

and the rest is PySide6 widgets and message-routing glue.

Two things:

The Anthropic tool result format is still different on the response side. When Claude calls a tool, the response uses tool_use

blocks and you have to send back tool_result

blocks in the next turn. I handle this in the message-routing layer, not the dispatcher. If I were starting over I'd move the tool result translation into the _chat_anthropic

method, so the dispatcher could pretend all four providers return identical shapes.

Streaming is half-done. Groq and OpenAI stream identically; Anthropic streams differently (event types, not data-only SSE). The current build buffers the full response and shows it once. That's fine for a 4k-token answer; for a 64k reasoning trace it's not. The fix is the same shape as the chat dispatcher — one streaming method per provider family, one normalized chunk iterator for the UI.

imaplib

smtplib

from stdlib

git clone https://github.com/AmSach/sentience
cd sentience
pip install -r requirements.txt
playwright install chromium
export GROQ_API_KEY=your_key_here   # or OPENAI / ANTHROPIC / OLLAMA_HOST
python src/main.py

For 100% offline use:

ollama pull llama3.2
OLLAMA_HOST=http://localhost:11434 python src/main.py

MIT licensed. PRs welcome — especially on the streaming layer, a new provider, or another tool module (calendar? GitHub? Linear?).

source & further reading

dev.to — original article I built an AI job-search agent solo — here's the full stack Yelp’s OpenAI Deal Brings Local Reviews and Business Data to ChatGPT Publishers Blocking AI Crawlers Are Reshaping the Economics of Training Data

How Sentience ships 60+ AI tools in one local desktop app — without locking you into one provider

Run your AI side-project on zahid.host