# Local AI: 775 tok/s, DiffusionGemma (BF16) on Nvidia RTX 6000 Pro

> Source: <https://twitter.com/OrganicGPT/status/2064883777499795716>
> Published: 2026-06-11 21:33:42+00:00

Insanely Fast Local AI: 775 tokens per second! 🤯 I was able to run the new DiffusionGemma (full BF16 model) by @googlegemma on vLLM (Red Hat's fork) on an Nvidia RTX 6000 Pro. It's blazing fast at short contexts, but slows down very quickly as the context grows. At 100k context, TTFT (time to first token) is 22s!
Leave a comment for the setup and the command to run the model.
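
To put the two quoted numbers side by side, here is a rough back-of-the-envelope sketch. It assumes "100k" means 100,000 prompt tokens and that the 775 tok/s figure is the short-context generation rate; the helper `total_latency_s` is hypothetical and only gives an optimistic lower bound, since generation itself also slows down at long context.

```python
# Figures quoted in the post.
decode_tok_per_s = 775      # reported generation speed at short context
context_tokens = 100_000    # assumption: "100k" means 100,000 prompt tokens
ttft_s = 22                 # reported time to first token at that context

# Implied prompt-processing rate at 100k context.
prefill_tok_per_s = context_tokens / ttft_s


def total_latency_s(output_tokens: int) -> float:
    """Hypothetical lower bound: TTFT plus output time at the short-context rate."""
    return ttft_s + output_tokens / decode_tok_per_s


print(f"implied prompt-processing rate: {prefill_tok_per_s:.0f} tok/s")
print(f"500-token reply at 100k context: at least {total_latency_s(500):.1f} s")
```

So at 100k context almost all of the wait for a short reply would be the 22s before the first token, not the generation itself.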
