0G trains 107B parameter decentralized model with China Mobile, a first for AI above 100 billion parameters


Source: cryptobriefing.com · Published: 2026-05-26T22:51Z · Topic: artificial-intelligence


0G Labs, in partnership with China Mobile, trained a 107-billion-parameter AI model using its DiLoCoX framework, which the company describes as the first decentralized training of a model exceeding 100 billion parameters. The project, completed in July 2025, reportedly achieved a 357x improvement in communication efficiency over traditional methods while running on standard 1 Gbps network links. If the result holds up, telecom providers could host distributed AI training networks without specialized high-bandwidth infrastructure, challenging the concentration of AI training resources.


The DiLoCoX framework achieved 357x better communication efficiency than traditional methods, all over standard 1 Gbps network links.

Training a 107-billion-parameter AI model is hard enough when you have a warehouse full of cutting-edge GPUs connected by ultra-fast networking. Doing it across decentralized clusters on a standard 1 Gbps network? That’s a fundamentally different engineering challenge. 0G Labs claims to have pulled it off.

The project, completed in July 2025 in partnership with China Mobile, represents the first successful decentralized training of an AI model exceeding 100 billion parameters, according to 0G Labs. The research paper detailing the methodology was published on arXiv on June 26, 2025, under the identifier arXiv:2506.21263.

How DiLoCoX actually works

The standard approach to data-parallel training relies on AllReduce, a collective operation in which all nodes exchange and combine gradient updates at every training step. DiLoCoX instead lets clusters of NVIDIA A800 GPUs work semi-independently, synchronizing far less frequently.
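To make the bandwidth problem concrete, here is a back-of-envelope sketch of how long one full copy of the model's values would take to cross a 1 Gbps link, and how that cost amortizes when synchronization happens only every few hundred steps. The parameter count and link speed come from the article; the 16-bit wire format and the sync intervals are assumptions chosen for illustration, and the estimate ignores latency, protocol overhead, and the extra traffic a real AllReduce generates.

```python
# Back-of-envelope sketch. Only PARAMS and LINK_BITS_PER_S come from the
# article; everything else is an illustrative assumption.

PARAMS = 107e9            # model size reported in the article
LINK_BITS_PER_S = 1e9     # 1 Gbps link reported in the article
BYTES_PER_VALUE = 2       # assumption: 16-bit values on the wire


def transfer_seconds(num_values: float,
                     bytes_per_value: int = BYTES_PER_VALUE,
                     link_bits_per_s: float = LINK_BITS_PER_S) -> float:
    """Ideal time to push one full copy of the values through the link once."""
    return num_values * bytes_per_value * 8 / link_bits_per_s


def amortized_seconds_per_step(sync_every: int) -> float:
    """Average wire time per training step when syncing every `sync_every` steps."""
    return transfer_seconds(PARAMS) / sync_every


one_copy = transfer_seconds(PARAMS)
print(f"one full copy over the link: {one_copy:.0f} s (~{one_copy / 60:.1f} min)")
for h in (1, 100, 500):   # hypothetical sync intervals, not values from the paper
    print(f"sync every {h:>3} steps -> {amortized_seconds_per_step(h):8.2f} s/step on the wire")
```

Under these assumptions a single uncompressed exchange takes roughly half an hour, which is why synchronizing at every step is impractical on such a link and why reducing both the frequency and the size of each exchange matters.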

The framework employs several technical innovations to make this possible. Pipeline parallelism breaks the model into stages that can be processed sequentially across devices. A dual optimizer policy uses different optimization strategies for local and global training steps. One-step-delay overlap allows computation to continue while synchronization happens in the background. And adaptive gradient compression squeezes down the data that needs to travel between clusters.
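The idea of separate local and global optimization steps can be illustrated with a toy example. The sketch below is not the DiLoCoX implementation: it is a generic local-update scheme on a one-dimensional problem, with plain SGD standing in for the local optimizer and a momentum step standing in for the global one. All function names, optimizer choices, and hyperparameters are hypothetical.

```python
# Toy sketch of a local/global ("dual") optimizer loop. Optimizers and
# hyperparameters are placeholders, not those used by DiLoCoX.

def local_steps(w, target, steps, lr):
    """Local optimizer: plain SGD on one worker's loss 0.5 * (w - target)**2."""
    for _ in range(steps):
        grad = w - target
        w -= lr * grad
    return w


def train(targets, rounds=50, local_h=20, inner_lr=0.1, outer_lr=0.7, momentum=0.5):
    """Global optimizer: momentum step on the averaged change across workers."""
    w_global = 0.0
    velocity = 0.0
    for _ in range(rounds):
        # Each worker starts from the shared weight and trains without communicating.
        local_ws = [local_steps(w_global, t, local_h, inner_lr) for t in targets]
        # One communication round: average how far the workers moved.
        pseudo_grad = w_global - sum(local_ws) / len(local_ws)
        velocity = momentum * velocity + pseudo_grad
        w_global -= outer_lr * velocity
    return w_global


# Three hypothetical workers, each pulling toward a different target.
w = train([1.0, 2.0, 6.0])
print(f"shared weight after training: {w:.4f}")
```

In this toy run the workers take 1,000 local steps in total but communicate only 50 times, and the shared weight still settles near the average of the three targets. Whether the same holds at 107 billion parameters is exactly what the paper's convergence claims are about.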

The result, according to 0G Labs, is a 357x improvement in communication efficiency compared to traditional AllReduce methods, without sacrificing model convergence.
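The article does not describe how DiLoCoX's adaptive gradient compression works. As a stand-in, the sketch below shows top-k sparsification with error feedback, a common generic technique for shrinking the updates sent between workers. It uses a fixed k rather than an adaptive one, and every name in it is hypothetical.

```python
# Generic top-k sparsification with error feedback. This is a swapped-in
# illustration, not the compression scheme used by DiLoCoX.

def topk_compress(values, k):
    """Keep the k largest-magnitude entries; return (indices, kept_values)."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    idx = sorted(order[:k])
    return idx, [values[i] for i in idx]


def decompress(idx, kept, length):
    """Rebuild a dense vector with zeros where entries were dropped."""
    dense = [0.0] * length
    for i, v in zip(idx, kept):
        dense[i] = v
    return dense


class ErrorFeedbackCompressor:
    """Adds the entries dropped in earlier rounds back in before compressing."""

    def __init__(self, length, k):
        self.k = k
        self.residual = [0.0] * length

    def compress(self, update):
        corrected = [u + r for u, r in zip(update, self.residual)]
        idx, kept = topk_compress(corrected, self.k)
        sent = decompress(idx, kept, len(update))
        # Whatever was not sent is remembered for the next round.
        self.residual = [c - s for c, s in zip(corrected, sent)]
        return idx, kept


comp = ErrorFeedbackCompressor(length=6, k=2)
idx, kept = comp.compress([0.1, -2.0, 0.3, 1.5, -0.2, 0.05])
print(idx, kept)  # [1, 3] [-2.0, 1.5]
```

Here only two of six values cross the network in each round, and the smaller entries are carried forward rather than discarded. Real systems typically combine sparsification like this with quantization, and the ratios involved are workload-dependent.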

Why China Mobile matters here

China Mobile is the world’s largest mobile network operator by subscribers. Its involvement signals something broader than a one-off research collaboration, because telecom companies sit on vast distributed infrastructure, including cell towers, edge data centers, and network backbone. If decentralized AI training can genuinely work over standard bandwidth links, telecom providers could host distributed training networks without the need for specialized high-bandwidth interconnects.

0G Labs CEO Michael Heinrich framed the achievement in democratization terms:

“DiLoCoX marks a pivotal step in democratizing LLM training.”

What this means for the decentralized AI landscape

DiLoCoX challenges the concentration of AI training directly. Clusters of A800 GPUs, the export-compliant version of NVIDIA’s A100 available in China, were coordinated across geographically distributed locations over 1 Gbps links to train a 100B+ parameter model.

In March 2026, 0G Labs announced plans to publicly retrain the model with full transparency, with a commitment to open-source its technologies. That would allow independent verification of the efficiency claims and enable other teams to build on the methodology.

The key risk is reproducibility. A 357x efficiency improvement is extraordinary, and independent teams will need to validate these results before the market prices in a paradigm shift. The arXiv paper provides a starting point for that scrutiny, and the planned open-source release will determine whether DiLoCoX becomes a building block for the broader ecosystem or remains an impressive but isolated demonstration.

Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.



