{"slug": "i-rebuilt-siri-ai-from-scratch-and-open-sourced-it", "title": "I rebuilt Siri AI from scratch and open sourced it", "summary": "Developer wassgha released OpenDex, an open-source desktop app that turns any LLM into a hands-free voice assistant with a cinematic interface. The app supports fully offline operation on Mac with Apple Intelligence and offers customizable models, voice engines, and themes.", "body_md": "**A voice-first, open-source AI assistant for your desktop.**\nWake it, talk to it, and a tool-using agent talks back — in a cinematic interface you choose.\n\nOr browse every version on the [Releases](https://github.com/wassgha/opendex/releases) page.\n\nThe app **auto-updates**: it checks GitHub Releases on launch (and hourly), downloads new versions in the background, and prompts you to restart when one is ready.\n\nOpenDex is a desktop app that turns any LLM into a hands-free, **Iron-Man-style voice assistant**. Say the wake word (or push to talk), speak naturally, and the agent thinks, uses tools, and replies out loud — streaming its answer into a live visualization.\n\nIt's a **harness**, not a single bot: the model, the voice, the wake/transcription engines, the on-screen theme, the greeting, and the agent's skills are all configurable, and it can run **fully offline and free** (local speech in, local speech recognition, system voice out) — and on a Mac with Apple Intelligence, even the model runs on-device, so the whole loop is free with no key.\n\n- 🎙️\n**Voice-first loop**— wake word → listen → think (with tools) → speak, plus natural follow-ups and opt-in barge-in (interrupt mid-reply). - 🧠\n**Bring any model**— pick your provider in setup:** Apple Intelligence**(on-device, free, no key — macOS), your own** OpenAI**or** Anthropic**key, or the** Vercel AI Gateway**(one key → Claude, GPT, Gemini, and more).*(An OpenDex hosted subscription — sign in, no keys, cloud-synced — is coming soon.)* - 🆓\n**Free & offline option**— Vosk wake word + local Whisper transcription (WASM, no signup) and your OS's built-in voice. No data leaves the machine except the LLM call. - 🔌\n**Pluggable voice I/O**— wake via push-to-talk, Vosk, or Web Speech; transcribe via local Whisper/Vosk, OpenAI, or Web Speech; speak via ElevenLabs or system TTS. - 🎨\n**Full-interface themes**— the theme*is*the whole UI: a cinematic**Jarvis HUD** with an animated arc reactor, a minimal**Talking Dot**, or a** Typing Cursor**terminal. All react to your voice. - 🛠️\n**Agentic skills with a permission gate**— the agent can take real actions (e.g. open apps & URLs); sensitive actions pop an** Allow once / Always / Deny**prompt that's remembered per skill. - 🖥️\n**Computer-use (opt-in)**— let it*see the screen and drive the mouse & keyboard*to operate apps for you. Works with any vision model (screenshots stream back as images), and stays behind the permission gate. - 🔐\n**Secure by design**— API keys are encrypted with your OS keychain and live only in the main process, never in the UI.\n\nOpenDex is fully theamable, you can change anything about the user interface and make it yours.\n\n| First-run setup | Minimal \"typing cursor\" theme |\n|---|---|\n\n```\ngit clone https://github.com/wassgha/opendex.git\ncd opendex\npnpm install\npnpm dev            # launches the OpenDex desktop window\n```\n\nOn first launch a short **onboarding wizard** walks you through choosing a model provider, voice, wake/transcription engine, theme, and greeting. Everything is changeable later from the **Settings** gear (⚙).\n\nPick where the thinking happens — every part of the loop can be free/offline:\n\n**Model:****Apple Intelligence**(on-device, free, no key — macOS only), your own** OpenAI**/** Anthropic**key, or the** Vercel AI Gateway**(one key, any provider).** Voice out:**\"System voice\" (free) or ElevenLabs (key).** Voice in:**local** Whisper**/** Vosk**(free, offline, one-time model download) or OpenAI Whisper (key).** Wake:**push-to-talk / Vosk (free, offline) or Web Speech (browser).\n\nOn a Mac with Apple Intelligence enabled, the whole loop (model + speech + voice) runs locally with **no keys at all**.\n\nKeys are normally entered in-app and stored encrypted. For development you can seed them via `.env`\n\n(used only as a fallback):\n\n```\ncp .env.local.example .env\n```\n\n| Variable | Purpose |\n|---|---|\n`AI_GATEWAY_API_KEY` |\nchat via the Vercel AI Gateway |\n`OPENAI_API_KEY` |\nchat via OpenAI directly, and/or OpenAI Whisper transcription |\n`ANTHROPIC_API_KEY` |\nchat via Anthropic (Claude) directly |\n`ELEVENLABS_API_KEY` |\nElevenLabs TTS (skip if using the system voice) |\n`TAVILY_API_KEY` |\nweb-search tool (optional) |\n\nThe chat provider needs\n\noneof`AI_GATEWAY_API_KEY`\n\n/`OPENAI_API_KEY`\n\n/`ANTHROPIC_API_KEY`\n\n— matching the provider you select. Apple Intelligence needs none.\n\nThe agent's capabilities are **skills** — declarative tool bundles. Sensitive ones run behind a permission gate: when the model wants to act, OpenDex pauses and asks, and your choice (Allow once / Always / Never) is remembered. Built-in skills today: **Open apps & URLs**, and **Control the computer** (screen capture + mouse/keyboard — opt-in, off by default).\n\nComputer-use setup (macOS):enableControl the computerinSettings → Skills & tools, then grant OpenDexScreen RecordingandAccessibilitypermission inSystem Settings → Privacy & Security(without them, screenshots come back blank and clicks do nothing). It's powerful — keep the permission onAsk, and \"Allow once\" covers the whole task it's working on.\n\n- Electron shell + secure agent/TTS-over-IPC\n- Config, onboarding & OS-keychain key storage\n- Full-interface themes (Jarvis HUD · Talking Dot · Typing Cursor)\n- Pluggable wake-word + speech-to-text (incl. free offline Whisper & Vosk)\n- Skills + permission gate\n*(Open apps & URLs)* - Computer-use — screen capture + mouse/keyboard control, gated & opt-in\n- Pluggable model providers — Apple on-device, OpenAI/Anthropic keys, AI Gateway\n- OpenDex hosted subscription — sign in, no keys, cloud-synced settings & history\n- MCP servers + more built-in skills (shell, filesystem, …)\n- Signed GitHub releases + auto-update\n\n| Command | Description |\n|---|---|\n`pnpm dev` |\nrun the app with hot reload |\n`pnpm build` |\nbuild main/preload/renderer into `out/` |\n`pnpm start` |\nrun the built app |\n`pnpm dist` |\npackage installers (mac/win/linux) via electron-builder |\n`pnpm typecheck` |\n`tsc --noEmit` |\n`pnpm smoke:chat [briefing]` |\nexercise the agent loop without Electron |\n\nElectron · electron-vite · React 19 · Tailwind CSS 4 · Vercel AI SDK v6 · ElevenLabs · Vosk · transformers.js (Whisper) — all local speech engines are WASM. The only native module is **nut.js** (computer-use input control); it ships prebuilt N-API binaries per platform.\n\n[MIT](/wassgha/opendex/blob/main/LICENSE) — contributions welcome.", "url": "https://wpnews.pro/news/i-rebuilt-siri-ai-from-scratch-and-open-sourced-it", "canonical_source": "https://github.com/wassgha/opendex", "published_at": "2026-06-29 06:09:27+00:00", "updated_at": "2026-06-29 06:28:43.367068+00:00", "lang": "en", "topics": ["ai-agents", "large-language-models", "developer-tools"], "entities": ["OpenDex", "Apple Intelligence", "OpenAI", "Anthropic", "Vercel AI Gateway", "ElevenLabs", "Whisper", "Vosk"], "alternates": {"html": "https://wpnews.pro/news/i-rebuilt-siri-ai-from-scratch-and-open-sourced-it", "markdown": "https://wpnews.pro/news/i-rebuilt-siri-ai-from-scratch-and-open-sourced-it.md", "text": "https://wpnews.pro/news/i-rebuilt-siri-ai-from-scratch-and-open-sourced-it.txt", "jsonld": "https://wpnews.pro/news/i-rebuilt-siri-ai-from-scratch-and-open-sourced-it.jsonld"}}