Neither Parallel nor Sequential: How DiffusionGemma Commits Tokens

wpnews.pro


[ARTICLE · art-30650] src=arxiv.org ↗ pub=2026-06-17T06:18Z topic=large-language-models verified=true sentiment=· neutral


Researchers at Google DeepMind instrumented DiffusionGemma 26B, a masked discrete-diffusion mixture-of-experts model, and found that its decoding is neither parallel nor block-autoregressive: it follows a partial left-to-right commit bias whose apparent strength depends on the granularity of measurement. The model commits tokens in large simultaneous batches, and its behavior varies by regime: structured JSON is committed in essentially arbitrary order, and commit confidence tracks correctness on mathematical reasoning but not on factual recall. The study also catalogs methodological pitfalls in measuring decoding order, including trailing-EOS padding and block-size sensitivity.

Published Jun 17, 2026

[Submitted on 12 Jun 2026]


[View PDF](/pdf/2606.14620)

[HTML (experimental)](https://arxiv.org/html/2606.14620v1)

Abstract: Open diffusion language models are marketed as parallel, non-autoregressive decoders, yet the order in which a shipped checkpoint actually commits its tokens is almost never measured. We instrument DiffusionGemma 26B, a masked discrete-diffusion mixture-of-experts model built on Gemma 4, hooking its sampler's accept step to record which canvas positions commit, when, and at what confidence. Across a 686-prompt, six-regime probe suite we find that its decoding is neither parallel nor block-autoregressive: it follows a partial left-to-right commit bias whose apparent strength depends almost entirely on the granularity at which you look. Order is weak token by token and strengthens smoothly as the analysis is coarsened, so the model's "block size" turns out to be an artifact of the measuring ruler rather than the architecture. The model commits in large simultaneous batches, leaving much of the within-batch order genuinely undefined rather than merely unobserved. The behaviour is regime-dependent: structured JSON is committed in essentially arbitrary order, and a position's commit confidence tracks correctness on mathematical reasoning but carries no signal on factual recall. Commitment is aggressive, finishing in a short late burst well inside the step budget, while task accuracy matches the model's autoregressive Gemma-4 sibling. Beyond these findings, our central contribution is methodological: measuring decoding order honestly demands handling trailing-EOS padding, within-regime confounding, commit non-monotonicity, block-size sensitivity, and large commit-batch ties, each of which can otherwise manufacture a decoding-order result that is not really there.
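To make the granularity effect concrete, here is a minimal sketch of how a left-to-right commit bias could be scored at different block sizes. It is not the paper's metric or code: it assumes a commit trace is already available as one commit step per canvas position, it swaps in Kendall's tau-b as an illustrative order statistic because tau-b accounts for ties (positions committed in the same batch), and the function names, the toy trace, and the `<eos>` token string are all hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def strip_trailing_eos(tokens, commit_steps, eos="<eos>"):
    """Drop the run of EOS padding at the end of the canvas (assumed token string)."""
    end = len(tokens)
    while end > 0 and tokens[end - 1] == eos:
        end -= 1
    return tokens[:end], commit_steps[:end]


def kendall_tau_b(xs, ys):
    """Kendall tau-b rank correlation; returns None when it is undefined."""
    concordant = discordant = tied_x = tied_y = total = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        total += 1
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1
        if dy == 0:
            tied_y += 1  # same commit step: order within the pair is undefined
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    denom = math.sqrt((total - tied_x) * (total - tied_y))
    return (concordant - discordant) / denom if denom else None


def order_at_granularity(commit_steps, block):
    """Correlate block index with the mean commit step of each block."""
    means = [
        mean(commit_steps[i:i + block])
        for i in range(0, len(commit_steps), block)
    ]
    return kendall_tau_b(range(len(means)), means)


# Toy trace (invented for illustration): commit step of each canvas position.
toy_steps = [2, 1, 2, 1, 4, 3, 4, 3]
for block in (1, 2, 4):
    print(block, round(order_at_granularity(toy_steps, block), 3))
```

On this toy trace the score rises from about 0.46 at single-token granularity to about 0.82 with blocks of two and 1.0 with blocks of four, which mirrors the abstract's point that the apparent strength of the ordering depends on the ruler. Stripping trailing EOS first matters for the same reason: a long run of padding committed in a single late batch would otherwise add many position pairs that look ordered.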


source & further reading

arxiv.org — original article


Read original on arxiv.org → arxiv.org/abs/2606.14620


