Source: huggingface.co · Topic: large-language-models

Gemma 4 E2B running in-browser at 255 tok/s

A new Hugging Face Space demonstrates Gemma 4 E2B running in-browser via WebGPU at 255 tokens per second, showcasing efficient on-device AI inference.

Published Jun 17, 2026 · 1 min read

Article URL: https://huggingface.co/spaces/webml-community/gemma-4-webgpu-kernels

Comments URL: https://news.ycombinator.com/item?id=48577195

Points: 3
