Source: xcancel.com · Topic: artificial-intelligence

Fable 5 pushed Gemma 4 to 255 tok/s on WebGPU

Fable 5, an AI agent, achieved 255 tokens per second on Gemma 4 inference using WebGPU before its access was suspended. The developer released the demo and kernels, claiming agentic kernel optimization is the future of on-device inference.

Published Jun 18, 2026 · 1 min read

Before Fable 5 was shut down, it pushed Gemma 4 to 255 tok/s on WebGPU. Some didn't believe it was real. Today we're releasing the demo and the kernels it wrote so you can see for yourself. Run it locally in your browser. Agentic kernel optimization is the future of on-device inference.

I gave Fable 5 one job: write custom WebGPU kernels for Gemma 4 inference. It climbed to 84 tok/s, then hit a wall, insisting further optimization was impossible. Hours later, Anthropic rolled back invisible LLM development safeguards, and it hit 255 tok/s. The next day, access to Fable 5 was suspended globally.

Jun 17, 2026 · 4:54 PM UTC

In case you hadn't noticed, we're working on something big. Stay tuned. 🔗 Link to the demo:

huggingface.co/spaces/webml-…

