I wanted Cursor's UX and Zo Computer's tool breadth in a desktop app I could close the laptop lid on. So I built Sentience — a PySide6 desktop AI assistant that runs entirely on your machine, brings its own browser, its own email client, its own voice controller, and exposes 60+ tool functions to the model.
The hard part was never the tools. The hard part was making 60+ tool schemas work identically across Groq, OpenAI, Anthropic, and a local Ollama instance — without writing a provider-specific tool dispatcher and without forcing the user to think about which model they happen to be on today.
This is the part of the codebase I'm actually proud of, and it's the part nobody ships as a tutorial.
OpenAI's /v1/chat/completions
format is now a de facto standard. Groq implements it. Ollama implements it. Localai implements it. So three out of four providers I wanted to support "just work" with one HTTP call — if you're willing to accept the OpenAI tool schema as ground truth.
Anthropic doesn't. The Messages API uses:
system
field instead of a system message in the arrayx-api-key
instead of Authorization: Bearer
anthropic-version: 2023-06-01
as a required headertool_use
/ tool_result
content block format on the response sideSo the question is: do you write two tool dispatchers and two execution paths, or do you write a thin adapter that lets you keep one unified tool list and one dispatcher?
I chose the adapter. The whole provider layer is 60 lines.
PROVIDERS = {
"groq": {
"name": "Groq",
"base_url": "https://api.groq.com/openai/v1",
"models": ["llama-3.3-70b-versatile", "llama-3.1-70b-versatile",
"llama-3.2-90b-vision-preview", "mixtral-8x7b-32768"],
"free_tier": True,
},
"openai": {
"name": "OpenAI",
"base_url": "https://api.openai.com/v1",
"models": ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-3.5-turbo"],
"free_tier": False,
},
"anthropic": {
"name": "Anthropic",
"base_url": "https://api.anthropic.com/v1",
"models": ["claude-3-5-sonnet-20241022", "claude-3-opus-20240229",
"claude-3-haiku-20240307"],
"free_tier": False,
},
"ollama": {
"name": "Ollama (Local)",
"base_url": os.getenv("OLLAMA_HOST", "http://localhost:11434/v1"),
"models": ["llama3.2", "llama3.1", "codellama", "mistral", "qwen2.5"],
"free_tier": True,
},
}
Free tier is a first-class field, not a comment. The settings dialog lights up the "free" badge for Groq and Ollama, and the README leads with those two for new users.
chat()
The whole class has one entry point and two private methods. The entry point picks a path based on self.provider
. That's it.
class AIClient:
def __init__(self, provider: str, model: str, api_key: str = ""):
self.provider = provider
self.model = model
self.api_key = api_key
self.config = PROVIDERS.get(provider, PROVIDERS["groq"])
def chat(self, messages, tools=None):
if self.provider == "anthropic":
return self._chat_anthropic(messages, tools)
return self._chat_openai_compatible(messages, tools)
def _chat_openai_compatible(self, messages, tools=None):
headers = {"Content-Type": "application/json"}
if self.api_key:
headers["Authorization"] = f"Bearer {self.api_key}"
payload = {
"model": self.model,
"messages": messages,
"max_tokens": 4096,
"temperature": 0.7,
}
if tools:
payload["tools"] = tools
payload["tool_choice"] = "auto"
try:
resp = requests.post(
f"{self.config['base_url']}/chat/completions",
headers=headers, json=payload, timeout=60,
)
resp.raise_for_status()
return resp.json()
except Exception as e:
return {"error": str(e)}
def _chat_anthropic(self, messages, tools=None):
headers = {
"Content-Type": "application/json",
"x-api-key": self.api_key,
"anthropic-version": "2023-06-01",
}
system_msg = None
anthropic_messages = []
for msg in messages:
if msg["role"] == "system":
system_msg = msg["content"]
else:
anthropic_messages.append(msg)
payload = {
"model": self.model,
"messages": anthropic_messages,
"max_tokens": 4096,
}
if system_msg:
payload["system"] = system_msg
if tools:
payload["tools"] = tools
try:
resp = requests.post(...)
...
The Anthropic branch has exactly three differences from the OpenAI branch: header names, system-message location, and the path. Everything else — the tools
list, the messages
array, the max_tokens
field — is identical. So the same 60 tool schemas I registered for Groq work on Claude without rewriting a single function definition.
This means I can hand the user a dropdown that says "switch to Claude Sonnet" and the next message the user types goes to Anthropic's API with the exact same tool surface. The model can call read_file
, list_directory
, browser_navigate
, send_email
, and oauth_github_login
on any of the four providers. The dispatcher doesn't care.
The tools live in their own modules and are aggregated with a single *
spread:
from browser.automation import BROWSER_TOOLS, PLAYWRIGHT_AVAILABLE
from email_agent.client import EMAIL_TOOLS, init_email, execute_email_tool
from oauth_manager.manager import OAUTH_TOOLS, get_oauth_manager, execute_oauth_tool
from voice.controller import VOICE_TOOLS, get_voice_controller, execute_voice_tool
from skills.registry import SKILL_TOOLS, get_skill_registry, execute_skill_tool
from hosting.server import HOSTING_TOOLS, get_hosting_server
TOOLS = [
{"type": "function", "function": {"name": "read_file", ...}},
{"type": "function", "function": {"name": "write_file", ...}},
{"type": "function", "function": {"name": "list_directory", ...}},
{"type": "function", "function": {"name": "run_command", ...}},
{"type": "function", "function": {"name": "search_files", ...}},
*BROWSER_TOOLS, # browser_navigate, browser_click, browser_screenshot, ...
*EMAIL_TOOLS, # email_read_inbox, email_send, email_search
*OAUTH_TOOLS, # oauth_google_login, oauth_github_login, oauth_notion_login
*VOICE_TOOLS, # voice_listen, voice_speak, voice_set_wake_word
*SKILL_TOOLS, # skill_list, skill_load, skill_run
*HOSTING_TOOLS, # hosting_start, hosting_stop, hosting_status, hosting_logs
]
Each submodule exports both the schema (BROWSER_TOOLS
, EMAIL_TOOLS
, etc.) and the executor (execute_browser_tool
, execute_email_tool
). The schemas are OpenAI function-calling dicts. The executors are the actual Python functions the dispatcher calls when the model invokes the tool.
def execute_tool(name: str, args: dict, workspace: str) -> dict:
try:
if name == "read_file":
path = Path(args.get("path", ""))
if not path.is_absolute():
path = Path(workspace) / path
if path.exists():
return {"success": True, "content": path.read_text()[:10000]}
return {"success": False, "error": "File not found"}
elif name == "run_command":
result = subprocess.run(
args.get("command", ""), shell=True,
capture_output=True, text=True, timeout=30,
cwd=args.get("cwd", workspace),
)
return {"success": True, "stdout": result.stdout[:5000],
"stderr": result.stderr[:5000], "exit_code": result.returncode}
elif name.startswith("browser_"):
return execute_browser_tool(name, args)
elif name.startswith("email_"):
return execute_email_tool(name, args)
elif name.startswith("oauth_"):
return execute_oauth_tool(name, args)
elif name.startswith("voice_"):
return execute_voice_tool(name, args)
elif name.startswith("skill_"):
return execute_skill_tool(name, args)
elif name.startswith("hosting_"):
return execute_hosting_tool(name, args)
else:
return {"success": False, "error": f"Unknown tool: {name}"}
except Exception as e:
return {"success": False, "error": str(e)}
Two patterns I want to call out:
Every executor returns {"success": bool, ...}. The model gets a uniform response shape, no matter which tool blew up. The model can then decide to retry, escalate, or just tell the user "I couldn't read that file." This is what makes the system actually usable when one of the 60 tools fails mid-conversation.
All file paths are resolved against the workspace. The model never gets to specify an absolute path that the user didn't authorize. Path(workspace) / path
is a tiny line, but it's the line that means "I can run this app on a stranger's laptop and not worry about ~
-escape exploits."
The killer feature, in practice, is the settings dropdown. The user can:
llama-3.3-70b-versatile
)The tool dispatcher doesn't know or care which provider is in self.provider
. The tool list doesn't change. The chat history doesn't get a special "this is the Anthropic thread" branch. The 800-line main.py
has one AIClient
and one execute_tool
and the rest is PySide6 widgets and message-routing glue.
Two things:
The Anthropic tool result format is still different on the response side. When Claude calls a tool, the response uses tool_use
blocks and you have to send back tool_result
blocks in the next turn. I handle this in the message-routing layer, not the dispatcher. If I were starting over I'd move the tool result translation into the _chat_anthropic
method, so the dispatcher could pretend all four providers return identical shapes.
Streaming is half-done. Groq and OpenAI stream identically; Anthropic streams differently (event types, not data-only SSE). The current build buffers the full response and shows it once. That's fine for a 4k-token answer; for a 64k reasoning trace it's not. The fix is the same shape as the chat dispatcher — one streaming method per provider family, one normalized chunk iterator for the UI.
imaplib
smtplib
from stdlib
git clone https://github.com/AmSach/sentience
cd sentience
pip install -r requirements.txt
playwright install chromium
export GROQ_API_KEY=your_key_here # or OPENAI / ANTHROPIC / OLLAMA_HOST
python src/main.py
For 100% offline use:
ollama pull llama3.2
OLLAMA_HOST=http://localhost:11434 python src/main.py
MIT licensed. PRs welcome — especially on the streaming layer, a new provider, or another tool module (calendar? GitHub? Linear?).