{"slug": "from-ar-to-diffusion-efficiently-adapting-large-language-models-with-strictly", "title": "From AR to Diffusion: Efficiently Adapting Large Language Models with Strictly Causal and Elastic Horizons", "summary": "Researchers have developed FLUID, a framework that adapts pre-trained autoregressive large language models to diffusion-based parallel text generation. By enforcing Strictly Causal Alignment and introducing Elastic Horizons, an entropy-driven mechanism that adjusts denoising strides, FLUID is reported to achieve state-of-the-art performance while reducing training costs by orders of magnitude and avoiding pre-training from scratch.", "body_md": "arXiv:2605.27387v1\nAbstract: Diffusion models promise efficient parallel text generation but rely on bidirectional attention, creating a structural mismatch with pre-trained Autoregressive (AR) models. This incompatibility precludes reusing robust AR priors, necessitating prohibitive pre-training from scratch. To bridge this gap, we propose FLUID, a framework that efficiently adapts AR backbones to the diffusion paradigm. By enforcing Strictly Causal Alignment, FLUID enables seamless initialization from standard GPT-style checkpoints, circumventing the need for massive pre-training. Furthermore, we introduce Elastic Horizons, an entropy-driven mechanism that dynamically modulates denoising strides based on local information density rather than fixed schedules. Experiments demonstrate that FLUID achieves state-of-the-art performance while reducing training costs by orders of magnitude, effectively reconciling established AR foundations with efficient parallel generation. 
Our code is available at https://github.com/Oli-lab-nun/FLUID/tree/main.", "url": "https://wpnews.pro/news/from-ar-to-diffusion-efficiently-adapting-large-language-models-with-strictly", "canonical_source": "https://arxiv.org/abs/2605.27387", "published_at": "2026-05-28 04:00:00+00:00", "updated_at": "2026-05-28 04:34:33.410933+00:00", "lang": "en", "topics": ["large-language-models", "generative-ai", "natural-language-processing", "machine-learning", "artificial-intelligence"], "entities": ["FLUID", "GPT", "arXiv"], "alternates": {"html": "https://wpnews.pro/news/from-ar-to-diffusion-efficiently-adapting-large-language-models-with-strictly", "markdown": "https://wpnews.pro/news/from-ar-to-diffusion-efficiently-adapting-large-language-models-with-strictly.md", "text": "https://wpnews.pro/news/from-ar-to-diffusion-efficiently-adapting-large-language-models-with-strictly.txt", "jsonld": "https://wpnews.pro/news/from-ar-to-diffusion-efficiently-adapting-large-language-models-with-strictly.jsonld"}}