{"slug": "diffusion-based-llms-that-generate-many-parallel-tokens-rather-than-one-by-one", "title": "Diffusion‑based LLMs that generate many parallel tokens rather than one‑by‑one", "summary": "Inception launched Mercury, a family of diffusion-based large language models that generate tokens in parallel rather than sequentially, achieving faster speeds and higher GPU efficiency. The models are available through AWS Bedrock and Azure Foundry, offering OpenAI API compatibility for enterprise applications.", "body_md": "Inception’s breakthrough diffusion-based approach to language generation enables the world’s fastest, most efficient AI models with best-in-class quality.\n\n## The diffusion difference. From sequential to parallel\n\nAll other LLMs generate text one token at a time. Mercury diffusion LLMs (dLLMs) generate tokens in parallel, increasing speed and maximizing GPU efficiency.\n\n## Blazing-fast performance you can notice\n\n## Build the future of AI apps with Mercury\n\nLightning fast agents\n\nAutomate complex coding and other business workflows with ultra-responsive AI.\n\nReal-time voice\n\nEngage naturally with AI in voice-powered workflows like customer support, translation, and immersive gaming.\n\nInstant code editing\n\nStay in the flow with responsive autocomplete, intelligent tab suggestions, and fast chat responses.\n\nFast, creative co-pilots\n\nSupercharge editorial and creative work: less waiting, more creating.\n\nRapid search\n\nInstantly surface the right data from across your organization’s knowledge base.\n\nFoundational models\n\n## Meet our family of diffusion models\n\nResearch\n\n## Led by visionary AI researchers\n\nOur founders pioneered diffusion modeling and invented cornerstone AI technologies.\n\n## Loved by leaders and innovators\n\nWe’re available through major cloud providers like AWS Bedrock and Azure Foundry. 
Talk with us about fine-tuning and private deployments.\n\nIntegrate in seconds\n\nOur models are OpenAI API compatible and a drop-in replacement for traditional LLMs.\n\nEnterprise AI partner\n\nWe’re available through major cloud providers like AWS Bedrock and Azure Foundry.\n\nReliability at scale\n\nGet 99.5%+ uptime and priority support with custom SLAs.", "url": "https://wpnews.pro/news/diffusion-based-llms-that-generate-many-parallel-tokens-rather-than-one-by-one", "canonical_source": "https://www.inceptionlabs.ai/", "published_at": "2026-06-20 02:20:48+00:00", "updated_at": "2026-06-20 02:37:22.961551+00:00", "lang": "en", "topics": ["large-language-models", "generative-ai", "ai-infrastructure", "ai-products"], "entities": ["Inception", "Mercury", "AWS Bedrock", "Azure Foundry", "OpenAI"], "alternates": {"html": "https://wpnews.pro/news/diffusion-based-llms-that-generate-many-parallel-tokens-rather-than-one-by-one", "markdown": "https://wpnews.pro/news/diffusion-based-llms-that-generate-many-parallel-tokens-rather-than-one-by-one.md", "text": "https://wpnews.pro/news/diffusion-based-llms-that-generate-many-parallel-tokens-rather-than-one-by-one.txt", "jsonld": "https://wpnews.pro/news/diffusion-based-llms-that-generate-many-parallel-tokens-rather-than-one-by-one.jsonld"}}