Huawei-led team completes post-training of DeepSeek 1.6T model

Source: letsdatascience.com · Published: 2026-06-06T13:21Z

A research group led by Huawei completed full-parameter post-training of DeepSeek's V4-Pro, a 1.6-trillion-parameter model, using a cluster of at least 1,000 Huawei Ascend 910C chips. The effort, which involved the Shenzhen Loop Area Institute and other local research bodies, updated every model weight rather than using adapter layers. Industry observers view the demonstration as a meaningful step toward reducing reliance on foreign GPUs for alignment and tuning workloads, though it does not prove the chips can pre-train a frontier model from scratch.

According to the Shenzhen municipal government, reported by Tom's Hardware, a Huawei-led research group completed full-parameter post-training of DeepSeek's V4-Pro, a 1.6-trillion-parameter model, on a cluster of at least 1,000 Huawei Ascend 910C chips. Tom's Hardware reports the effort involved Huawei, the Shenzhen Loop Area Institute, the Shenzhen campus of Harbin Institute of Technology, and the Shenzhen Research Institute of Big Data. The article cites DeepSeek documentation that places V4-Pro's pre-training corpus at more than 32 trillion tokens and notes that the team updated every model weight rather than using adapter layers. Tom's Hardware also notes that completing post-training on Ascend silicon does not demonstrate the chips can pre-train a frontier model from scratch. Editorial analysis: Industry observers will view a full-parameter post-training run on domestic accelerators as a meaningful step toward reducing reliance on foreign GPUs for alignment and tuning workloads.

What happened

According to the Shenzhen municipal government, as reported by Tom's Hardware, a research group that includes Huawei completed full-parameter post-training of DeepSeek's V4-Pro, a 1.6-trillion-parameter model, using a cluster of at least 1,000 Ascend 910C chips. Tom's Hardware reports the collaborators included the Shenzhen Loop Area Institute, the Shenzhen campus of Harbin Institute of Technology, and the Shenzhen Research Institute of Big Data. The article cites DeepSeek documentation that places V4-Pro's pre-training corpus above 32 trillion tokens. Tom's Hardware reports the team performed full-parameter post-training, meaning every weight was updated rather than only a thin adapter layer, and explicitly notes that this demonstration does not show Ascend chips can pre-train a frontier model from scratch.
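To make the distinction between full-parameter updates and adapter layers concrete, the sketch below counts trainable weights for a single weight matrix under each approach. It assumes a low-rank adapter (the LoRA-style form, one common kind of adapter); the coverage does not say which adapter method was being contrasted, and the layer shape and rank are hypothetical values chosen only for illustration, not V4-Pro's actual dimensions.

```python
def full_update_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def low_rank_adapter_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a low-rank adapter.

    The frozen matrix is left untouched; only two small matrices are trained:
    A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Hypothetical layer shape and adapter rank (assumptions, not from the source).
D_IN, D_OUT, RANK = 8192, 8192, 16

full = full_update_params(D_IN, D_OUT)
adapter = low_rank_adapter_params(D_IN, D_OUT, RANK)

print(f"full-parameter: {full:,} trainable weights")
print(f"rank-{RANK} adapter: {adapter:,} trainable weights")
print(f"adapter share of full update: {adapter / full:.2%}")
```

Under these assumed numbers the adapter trains well under 1% of the weights in the layer, which is why updating every weight of a 1.6-trillion-parameter model is a much heavier test of a cluster than an adapter-only run.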

Technical details

Tom's Hardware describes the Ascend 910C as Huawei's current flagship AI accelerator, a dual-die part that, in earlier DeepSeek testing, delivered roughly 60% of an Nvidia H100's inference performance. The report frames post-training as the tuning stage after pre-training, used for instruction-following, safety alignment, and task-specific behaviour shaping. The Shenzhen municipal government is the primary on-record source for the cluster size claim in the coverage cited.
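The reported parameter count and cluster size allow a rough, back-of-envelope estimate of how much training state each chip would have to hold. The sketch below assumes mixed-precision training with an Adam-style optimizer (2 bytes for weights, 2 for gradients, 12 for FP32 master weights and two optimizer moments, a commonly used accounting) and perfectly even sharding of that state across chips. None of these settings is confirmed by the coverage, and activations, communication buffers, and any replication are ignored, so the result is illustrative only.

```python
def training_state_bytes_per_param(
    weight_bytes: int = 2,      # assumed 16-bit weights
    grad_bytes: int = 2,        # assumed 16-bit gradients
    optimizer_bytes: int = 12,  # assumed FP32 master copy + two Adam moments
) -> int:
    """Bytes of persistent training state per parameter under the stated assumptions."""
    return weight_bytes + grad_bytes + optimizer_bytes


def per_chip_state_gb(n_params: int, n_chips: int, bytes_per_param: int) -> float:
    """Training state per chip in GB (1e9 bytes), assuming perfectly even sharding."""
    return n_params * bytes_per_param / n_chips / 1e9


N_PARAMS = 1_600_000_000_000  # 1.6 trillion parameters, as reported
N_CHIPS = 1_000               # reported lower bound on cluster size

bytes_per_param = training_state_bytes_per_param()
total_tb = N_PARAMS * bytes_per_param / 1e12
per_chip = per_chip_state_gb(N_PARAMS, N_CHIPS, bytes_per_param)

print(f"assumed state per parameter: {bytes_per_param} bytes")
print(f"total training state: {total_tb:.1f} TB")
print(f"per chip across {N_CHIPS:,} chips: {per_chip:.1f} GB")
```

Because the source gives the cluster size as "at least 1,000" chips, the per-chip figure under these assumptions is an upper bound on the evenly sharded share; a larger cluster would spread the same state more thinly.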

Editorial analysis - technical context

Demonstrations of full-parameter post-training on non-Nvidia silicon address a narrower but important slice of the ML lifecycle: alignment and fine-tuning at scale. As an industry pattern, groups that move tuning workloads off incumbent GPUs often still rely on foreign accelerators for the heavier, costlier pre-training phase. Successful post-training runs lower a practical barrier to deploying model-alignment workflows on domestic hardware, but they do not, by themselves, prove parity for pre-training at frontier scale.

Context and significance

Industry context: This report sits at the intersection of chip capability, national supply-chain strategy, and large-model engineering. Observers tracking compute sovereignty and export-control impacts will treat a verified full-parameter post-training run as a notable milestone for training-class workloads on domestic accelerators, even if it stops short of pre-training demonstrations claimed elsewhere.

What to watch

Indicators to follow include independent performance and reproducibility benchmarks for V4-Pro post-training on Ascend hardware, detailed power and throughput metrics for the Ascend 910C cluster, and any lab or open documentation from DeepSeek or participating institutes about dataset splits, optimizer settings, and wall-clock training time. Reporting that clarifies whether multiple parallel cabinets or networking fabrics were involved will also matter for assessing generalisability.

Scoring Rationale

A reported full-parameter post-training run of a 1.6-trillion-parameter model on an Ascend cluster of at least 1,000 chips is a notable infrastructure milestone for training-class workloads on domestic accelerators. It matters to practitioners assessing compute options, but it stops short of demonstrating frontier-scale pre-training parity.

source

letsdatascience.com (original article)
