100% in-browser · no server · no LLM · < 50 ms after warm-up
Intent classification with a small sentence-embedding model (MiniLM-L6-v2, 23 MB, WASM): commands are matched by cosine similarity, not by a generative language model
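As a rough illustration of the matching step, the sketch below scores a command embedding against a few example embeddings per intent and picks the best one. The names `cosineSimilarity`, `classifyIntent`, the `IntentMatch` shape, and the 0.5 threshold are assumptions made for this example, not taken from the demo's source; the vectors would come from the embedding model in practice.

```typescript
// Hypothetical sketch: names, types and the threshold are illustrative only.
type Vec = number[];

interface IntentMatch {
  intent: string | null;          // null when no intent clears the threshold
  confidence: number;             // best cosine similarity found (0 if none)
  scores: Record<string, number>; // best similarity per intent
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector.
function cosineSimilarity(a: Vec, b: Vec): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Each intent is described by one or more example embeddings; an intent's
// score is the highest similarity between the query and any of its examples.
function classifyIntent(
  query: Vec,
  intents: Record<string, Vec[]>,
  threshold = 0.5, // assumed cut-off, tune per application
): IntentMatch {
  const scores: Record<string, number> = {};
  let best: string | null = null;
  let bestScore = -Infinity;
  for (const name of Object.keys(intents)) {
    let score = -Infinity;
    for (const example of intents[name]) {
      score = Math.max(score, cosineSimilarity(query, example));
    }
    scores[name] = score;
    if (score > bestScore) {
      bestScore = score;
      best = name;
    }
  }
  if (best === null || bestScore < threshold) {
    return { intent: null, confidence: Math.max(bestScore, 0), scores };
  }
  return { intent: best, confidence: bestScore, scores };
}
```

With real embeddings, `intents` would hold the vectors of phrases such as "add milk" and "remove bread", computed once at start-up, so that each spoken command costs one embedding plus a handful of dot products.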
Click to speak
Transcript
🛒 Shopping list
- Say "add milk" or "remove bread"…
⚡ Custom actions
Intent: —
Confidence: —
Latency: —
Example commands — click to trigger with this text
Cosine similarity per intent
Waiting for first command…
Local pipeline · no server · no LLM
Web Speech API → Transcript → MiniLM embedding (WASM) → Cosine similarity → DOM action
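The stages above could be wired together roughly as follows. This is a minimal sketch under stated assumptions: `runPipeline`, `Embed`, `Action`, and the 0.5 threshold are hypothetical names and values, the embedding function is injected so that a MiniLM WASM model (or a stub) can be plugged in, and a plain callback stands in for the DOM action.

```typescript
// Hypothetical wiring of transcript -> embedding -> cosine similarity -> action.
type Embed = (text: string) => Promise<number[]>; // stands in for the WASM model
type Action = (transcript: string) => void;       // stands in for a DOM update

interface PipelineResult {
  intent: string | null; // null when no intent clears the threshold
  confidence: number;    // best cosine similarity
  latencyMs: number;     // wall-clock time for embedding + matching
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}

async function runPipeline(
  transcript: string,
  embed: Embed,
  intentVectors: Record<string, number[]>, // one precomputed vector per intent
  actions: Record<string, Action>,
  threshold = 0.5, // assumed cut-off
): Promise<PipelineResult> {
  const start = Date.now();
  const query = await embed(transcript);

  // Pick the intent whose vector is most similar to the transcript embedding.
  let intent: string | null = null;
  let confidence = 0;
  for (const name of Object.keys(intentVectors)) {
    const score = cosine(query, intentVectors[name]);
    if (score > confidence) {
      confidence = score;
      intent = name;
    }
  }

  // Only act when the best match is confident enough.
  if (intent !== null && confidence >= threshold) {
    actions[intent]?.(transcript);
  } else {
    intent = null;
  }
  return { intent, confidence, latencyMs: Date.now() - start };
}
```

In a browser, the `transcript` argument would typically come from a `SpeechRecognition` result event, and the returned `intent`, `confidence`, and `latencyMs` map onto the three readouts shown on the page.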