Source: lusob.github.io · Topic: machine-learning

Embeddings is all you need

A new in-browser voice-to-action system uses a tiny embedding model (MiniLM-L6-v2) to classify intents via cosine similarity, achieving sub-50 ms latency after warm-up without any server or large language model. The pipeline runs entirely in the browser using the Web Speech API and WASM, enabling fast, private intent classification for tasks such as shopping list management.

Read: 1 min · Views: 3 · Published: Jun 16, 2026

100% in-browser · no server · no LLM · < 50 ms after warm-up

Intent classification using a tiny embedding model (MiniLM-L6-v2, 23 MB, WASM) — cosine similarity, not a language model
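
To make the cosine-similarity step concrete, here is a minimal sketch of how a query embedding might be matched against one reference embedding per intent. It assumes the embeddings have already been computed (for example by MiniLM-L6-v2); the names `cosineSimilarity`, `classifyIntent`, `add_item`, `remove_item`, the 0.5 threshold, and the three-dimensional toy vectors are all illustrative assumptions, not taken from the original demo.

```typescript
// Toy sketch: all names, vectors, and the threshold are hypothetical.
type EmbeddingVector = readonly number[];

interface ReferenceIntent {
  intent: string;
  embedding: EmbeddingVector; // precomputed embedding of an example phrase
}

interface IntentMatch {
  intent: string;
  confidence: number; // cosine similarity, in [-1, 1]
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: EmbeddingVector, b: EmbeddingVector): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // a zero vector matches nothing
}

// Pick the intent whose reference embedding is closest to the query.
// Scores below the (assumed) threshold are reported as "unknown".
function classifyIntent(
  query: EmbeddingVector,
  intents: readonly ReferenceIntent[],
  threshold = 0.5
): IntentMatch {
  let best: IntentMatch = { intent: "unknown", confidence: -1 };
  for (const candidate of intents) {
    const score = cosineSimilarity(query, candidate.embedding);
    if (score > best.confidence) {
      best = { intent: candidate.intent, confidence: score };
    }
  }
  return best.confidence >= threshold
    ? best
    : { intent: "unknown", confidence: best.confidence };
}

// Usage with toy vectors standing in for real sentence embeddings.
const referenceIntents: ReferenceIntent[] = [
  { intent: "add_item", embedding: [0.9, 0.1, 0.0] },
  { intent: "remove_item", embedding: [0.1, 0.9, 0.0] },
];
const sampleMatch = classifyIntent([0.8, 0.2, 0.1], referenceIntents);
// sampleMatch.intent is "add_item"
```

Because the comparison is a handful of multiply-adds per intent, the cost after warm-up is likely dominated by computing the embedding rather than by the similarity search.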

[Interactive demo, not reproduced here: a click-to-speak button with a transcript; a shopping list that responds to commands such as "add milk" or "remove bread"; custom actions; clickable example commands; and a readout of intent, confidence, latency, and cosine similarity per intent.]

Local pipeline · no server · no LLM

Web Speech API → Transcript → MiniLM embedding (WASM) → Cosine similarity → DOM action
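
The last stage of the pipeline above, turning a classified intent into a change on the page, might look roughly like the sketch below. It keeps the shopping list as a plain array and leaves the actual DOM rendering out. The intent names, the `applyCommand` and `extractItem` helpers, the confidence threshold, and the rule that the item is whatever follows the first word of the command are all assumptions made for illustration; the original demo may extract items differently.

```typescript
// Hypothetical sketch of the "action" stage; not the demo's actual code.
type Intent = "add_item" | "remove_item" | "unknown";

interface Classification {
  intent: Intent;
  confidence: number;
}

// Assumption: the item is everything after the first word, so
// "add milk" yields "milk" and "remove brown bread" yields "brown bread".
function extractItem(transcript: string): string {
  const words = transcript.trim().toLowerCase().split(/\s+/);
  return words.slice(1).join(" ");
}

// Returns a new list; the input list is never mutated.
function applyCommand(
  list: readonly string[],
  transcript: string,
  result: Classification,
  minConfidence = 0.5 // assumed cutoff for acting on a command
): string[] {
  const item = extractItem(transcript);
  if (item === "" || result.confidence < minConfidence) {
    return list.slice(); // nothing to do: unclear command or no item
  }
  switch (result.intent) {
    case "add_item":
      // Skip duplicates so repeating a command does not grow the list.
      return list.indexOf(item) >= 0 ? list.slice() : list.concat([item]);
    case "remove_item":
      return list.filter((entry) => entry !== item);
    default:
      return list.slice();
  }
}

// Usage: the classifications here are hard-coded stand-ins for the
// output of the embedding and cosine-similarity stages.
const afterAdd = applyCommand([], "add milk", { intent: "add_item", confidence: 0.9 });
const afterRemove = applyCommand(["milk", "bread"], "remove bread", {
  intent: "remove_item",
  confidence: 0.8,
});
```

Returning a fresh array each time keeps the action step a pure function, which makes it easy to test separately from speech recognition and rendering.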
