{"slug": "diffusiongemma", "title": "DiffusionGemma", "summary": "Google released DiffusionGemma, a new open-weight AI model under the Apache 2 license, available on Hugging Face. NVIDIA is hosting the model for free on its NIM cloud API, where it generated 2,409 tokens in 4.4 seconds, achieving speeds of at least 500 tokens per second.", "body_md": "That research has returned in the best possible way: as a new open weight (Apache 2 licensed) Gemma model, [google/diffusiongemma-26B-A4B-it](https://huggingface.co/google/diffusiongemma-26B-A4B-it).\n\nNVIDIA are currently [hosting the model for free](https://build.nvidia.com/google/diffusiongemma-26b-a4b-it) on their NIM cloud API. I used that API to [generate this pelican](https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2Fe5e234a6dc6eef61e209ce1629620042), which took 4.4s (according to `time uv run generate.py`) to return 2,409 tokens - so at least 500 tokens/second.\n\nVia [Hacker News](https://news.ycombinator.com/item?id=48478471)\n\nTags: [google](https://simonwillison.net/tags/google), [ai](https://simonwillison.net/tags/ai), [generative-ai](https://simonwillison.net/tags/generative-ai), [llms](https://simonwillison.net/tags/llms), [nvidia](https://simonwillison.net/tags/nvidia), [pelican-riding-a-bicycle](https://simonwillison.net/tags/pelican-riding-a-bicycle), [gemma](https://simonwillison.net/tags/gemma), [llm-release](https://simonwillison.net/tags/llm-release), [llm-performance](https://simonwillison.net/tags/llm-performance)", "url": "https://wpnews.pro/news/diffusiongemma", "canonical_source": "https://simonwillison.net/2026/Jun/10/diffusiongemma/#atom-everything", "published_at": "2026-06-10 20:00:54+00:00", "updated_at": "2026-06-11 19:15:02.098030+00:00", "lang": "en", "topics": ["large-language-models", "generative-ai", "ai-products", "ai-tools", "ai-infrastructure"], "entities": ["Google", "NVIDIA", "Gemma", "DiffusionGemma", "Apache 2", "Hugging Face", "NIM", "Simon Willison"], "alternates": {"html": "https://wpnews.pro/news/diffusiongemma", "markdown": "https://wpnews.pro/news/diffusiongemma.md", "text": "https://wpnews.pro/news/diffusiongemma.txt", "jsonld": "https://wpnews.pro/news/diffusiongemma.jsonld"}}