{"slug": "embeddings-is-all-you-need", "title": "Embeddings is all you need", "summary": "A new in-browser voice-to-action system uses a tiny embedding model (MiniLM-L6-v2) to classify intents via cosine similarity, achieving sub-50 ms latency after warm-up without any server or large language model. The pipeline runs entirely in the browser using the Web Speech API and WASM, enabling fast, private intent classification for tasks like shopping list management.", "body_md": "# Voice → Embedding → Action\n\n100% in-browser · no server · no LLM · < 50 ms after warm-up\n\nIntent classification uses a tiny embedding model (MiniLM-L6-v2, 23 MB, WASM) and cosine similarity, not a language model.\n\n🛒 Shopping list\n\n- Say \"add milk\" or \"remove bread\"…\n\n⚡ Custom actions\n\nPer-command readouts: intent, confidence, latency, and cosine similarity per intent.\n\nLocal pipeline · no server · no LLM\n\nWeb Speech API → Transcript → MiniLM embedding (WASM) → Cosine similarity → DOM action", "url": "https://wpnews.pro/news/embeddings-is-all-you-need", "canonical_source": "https://lusob.github.io/embeddings-is-all-you-need/", "published_at": "2026-06-16 18:03:26+00:00", "updated_at": "2026-06-16 18:19:40.620591+00:00", "lang": "en", "topics": ["machine-learning", "natural-language-processing", "ai-tools", "developer-tools"], "entities": ["MiniLM-L6-v2", "Web Speech API", "WASM"], "alternates": {"html": "https://wpnews.pro/news/embeddings-is-all-you-need", "markdown": "https://wpnews.pro/news/embeddings-is-all-you-need.md", "text": "https://wpnews.pro/news/embeddings-is-all-you-need.txt", "jsonld": "https://wpnews.pro/news/embeddings-is-all-you-need.jsonld"}}