{"slug": "targeted-remasking-replacing-token-editing-with-token-to-mask-refinement-in", "title": "Targeted Remasking: Replacing Token Editing with Token-to-Mask Refinement in Discrete Diffusion Language Models", "summary": "Researchers have developed Token-to-Mask (T2M) remasking, a training-free, drop-in replacement for Token-to-Token (T2T) editing in discrete masked diffusion language models like LLaDA. The method resets suspected erroneous tokens back to the mask state so the diffusion process can re-predict them under cleaner context. Across 12 benchmarks, it generally improves performance on tasks requiring precise token-level output, with the largest gain of +5.92% on mathematics (CMATH). On CMATH, T2M repairs 59.4% of last-mile token corruption cases, where correct reasoning produces a corrupted final answer.", "body_md": "arXiv:2605.26436v1\nAbstract: Discrete masked diffusion language models such as LLaDA generate text through iterative denoising, where mask tokens are progressively replaced with predicted tokens. LLaDA2.1 introduced a Token-to-Token (T2T) editing mechanism that accelerates generation by directly replacing committed tokens suspected of being incorrect. However, we identify fundamental limitations of T2T editing: it couples error detection with replacement, pollutes the generation context with potentially incorrect tokens, and introduces a train-inference noise mismatch where systematic model-generated errors differ from the random perturbations seen during training. We propose Token-to-Mask (T2M) remasking, a training-free, drop-in replacement for T2T editing that resets suspected erroneous tokens back to the mask state, allowing the diffusion process to re-predict them under cleaner context. 
We design and empirically validate three complementary error detection strategies -- probability-based, trigger-mirrored, and temporal-difference-based -- and provide a unified theoretical analysis showing that T2M remasking purifies the generation context, converts systematic inference errors back to the model's native mask noise type, and enables delayed commitment for joint multi-position optimization. Comprehensive experiments across 12 benchmarks spanning knowledge, reasoning, mathematics, coding, and instruction following show that T2M generally improves performance on tasks requiring precise token-level output, with the largest gain on mathematics (+5.92% on CMATH). Error analysis on CMATH reveals that the dominant failure mode is last-mile token corruption -- where correct reasoning produces a corrupted final answer -- and that T2M repairs 59.4% of such cases.", "url": "https://wpnews.pro/news/targeted-remasking-replacing-token-editing-with-token-to-mask-refinement-in", "canonical_source": "https://arxiv.org/abs/2605.26436", "published_at": "2026-05-27 04:00:00+00:00", "updated_at": "2026-05-27 04:34:59.510563+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "large-language-models", "generative-ai", "natural-language-processing"], "entities": ["LLaDA", "LLaDA2.1", "Token-to-Token", "Token-to-Mask"], "alternates": {"html": "https://wpnews.pro/news/targeted-remasking-replacing-token-editing-with-token-to-mask-refinement-in", "markdown": "https://wpnews.pro/news/targeted-remasking-replacing-token-editing-with-token-to-mask-refinement-in.md", "text": "https://wpnews.pro/news/targeted-remasking-replacing-token-editing-with-token-to-mask-refinement-in.txt", "jsonld": "https://wpnews.pro/news/targeted-remasking-replacing-token-editing-with-token-to-mask-refinement-in.jsonld"}}