cd /news/large-language-models/diffusion-based-llms-that-generate-m… · home topics large-language-models article
[ARTICLE · art-34583] src=inceptionlabs.ai ↗ pub= topic=large-language-models verified=true sentiment=↑ positive

Diffusion‑based LLMs that generate many parallel tokens rather than one‑by‑one

Inception launched Mercury, a family of diffusion-based large language models that generate tokens in parallel rather than sequentially, achieving faster speeds and higher GPU efficiency. The models are available through AWS Bedrock and Azure Foundry, offering OpenAI API compatibility for enterprise applications.

read1 min views1 publishedJun 20, 2026
Diffusion‑based LLMs that generate many parallel tokens rather than one‑by‑one
Image: source

Inception’s breakthrough diffusion-based approach to language generation enables the world’s fastest, most efficient AI models with best-in-class quality.

The diffusion difference. From sequential to parallel #

All other LLMs generate text one token at a time. Mercury diffusion LLMs (dLLMs) generate tokens in parallel, increasing speed and maximizing GPU efficiency.

Blazing-fast performance you can notice #

Build the future of AI apps with Mercury #

Lightning fast agents

Automate complex coding and other business workflows with with ultra-responsive AI.

Real-time voice

Engage naturally with AI in voice-powered workflows like customer support, translation, and immersive gaming.

Instant code editing

Stay in-the-flow with responsive autocomplete, intelligent tab suggestions, and fast chat responses.

Fast, creative co-pilots

Supercharge editorial and creative work—less waiting, more creating.

Rapid search

Instantly surface the right data from across your organization’s knowledge base.

Foundational models

Meet our family of diffusion models #

Research

Led by visionary AI researchers #

Our founders pioneered diffusion modeling and invented cornerstone AI technologies.

Loved by leaders and innovators #

We’re available through major cloud providers like AWS Bedrock and Azure Foundry. Talk with us about fine-tuning and private deployments.

Integrate in seconds

Our models are OpenAI API compatible and a drop-in replacement for traditional LLMs.

Enterprise AI partner

We’re available through major cloud providers like AWS Bedrock and Azure Foundry.

Reliability at scale

Get 99.5%+ uptime and priority support with custom SLAs.

── more in #large-language-models 4 stories · sorted by recency
── more on @inception 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/diffusion-based-llms…] indexed:0 read:1min 2026-06-20 ·