{"slug": "huawei-led-team-completes-post-training-of-deepseek-1-6t-model", "title": "Huawei-led team completes post-training of DeepSeek 1.6T model", "summary": "A research group led by Huawei completed full-parameter post-training of DeepSeek's V4-Pro, a 1.6-trillion-parameter model, using a cluster of at least 1,000 Huawei Ascend 910C chips. The effort, which involved the Shenzhen Loop Area Institute and other local research bodies, updated every model weight rather than using adapter layers. Industry observers view the demonstration as a meaningful step toward reducing reliance on foreign GPUs for alignment and tuning workloads, though it does not prove the chips can pre-train a frontier model from scratch.", "body_md": "# Huawei-led team completes post-training of DeepSeek 1.6T model\n\nAccording to the Shenzhen municipal government, reported by Tom's Hardware, a Huawei-led research group completed full-parameter post-training of DeepSeek's V4-Pro, a **1.6-trillion-parameter** model, on a cluster of at least **1,000** Huawei **Ascend 910C** chips. Tom's Hardware reports the effort involved Huawei, the Shenzhen Loop Area Institute, the Shenzhen campus of Harbin Institute of Technology, and the Shenzhen Research Institute of Big Data. The article cites DeepSeek documentation that places V4-Pro's pre-training corpus at more than **32 trillion** tokens and notes that the team updated every model weight rather than using adapter layers. Tom's Hardware also notes that completing post-training on Ascend silicon does not demonstrate the chips can pre-train a frontier model from scratch. 
Editorial analysis: Industry observers will view a full-parameter post-training run on domestic accelerators as a meaningful step toward reducing reliance on foreign GPUs for alignment and tuning workloads.\n\n### What happened\n\nAccording to the Shenzhen municipal government, reported by Tom's Hardware, a research group that includes **Huawei** completed full-parameter post-training of DeepSeek's V4-Pro, a **1.6-trillion-parameter** model, using a cluster of at least **1,000** **Ascend 910C** chips. Tom's Hardware reports the collaborators included the Shenzhen Loop Area Institute, the Shenzhen campus of Harbin Institute of Technology, and the Shenzhen Research Institute of Big Data. The article cites DeepSeek documentation that places V4-Pro's pre-training corpus above **32 trillion** tokens. Tom's Hardware reports the team performed full-parameter post-training, meaning every weight was updated rather than only a thin adapter layer, and explicitly notes this demonstration does not show Ascend chips can pre-train a frontier model from scratch.\n\n### Technical details\n\nTom's Hardware describes the **Ascend 910C** as Huawei's current flagship AI accelerator, a dual-die part that, in earlier DeepSeek testing, delivered roughly **60%** of an Nvidia **H100**'s inference performance. The report frames post-training as the tuning stage after pre-training, used for instruction-following, safety alignment, and task-specific behaviour shaping. The Shenzhen municipal government is the primary on-record source for the cluster size claim in the coverage cited.\n\n### Editorial analysis - technical context\n\nDemonstrations of full-parameter post-training on non-Nvidia silicon address a narrower but important slice of the ML lifecycle: alignment and fine-tuning at scale. Industry-pattern observations: groups that move tuning workloads off incumbent GPUs often still rely on foreign accelerators for the heavier, costlier pre-training phase. 
Successful post-training runs reduce a practical barrier for domestic deployment of model-alignment workflows, but they do not, by themselves, prove parity for pre-training at frontier scale.\n\n### Context and significance\n\nIndustry context: This report sits at the intersection of chip capability, national supply-chain strategy, and large-model engineering. Observers tracking compute sovereignty and export-control impacts will treat a verified full-parameter post-training run as a notable milestone for training-class workloads on domestic accelerators, even if it stops short of pre-training demonstrations claimed elsewhere.\n\n### What to watch\n\nIndicators to follow include independent performance and reproducibility benchmarks for V4-Pro post-training on Ascend hardware, detailed power and throughput metrics for the **Ascend 910C** cluster, and any lab or open documentation from DeepSeek or participating institutes about dataset splits, optimizer settings, and wall-clock training time. Reporting that clarifies whether multiple parallel cabinets or networking fabrics were involved will also matter for assessing generalisability.\n\n## Scoring Rationale\n\nA reported full-parameter post-training of a **1.6-trillion** model on a 1,000-chip Ascend cluster is a notable infrastructure milestone for training-class workloads on domestic accelerators. 
It matters to practitioners assessing compute options, but it stops short of demonstrating frontier-scale pre-training parity.", "url": "https://wpnews.pro/news/huawei-led-team-completes-post-training-of-deepseek-1-6t-model", "canonical_source": "https://letsdatascience.com/news/huawei-led-team-completes-post-training-of-deepseek-16t-mode-26dee267", "published_at": "2026-06-06 13:21:53.159482+00:00", "updated_at": "2026-06-06 13:21:56.715343+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-chips", "ai-infrastructure", "ai-research"], "entities": ["Huawei", "DeepSeek", "Ascend 910C", "Shenzhen Loop Area Institute", "Harbin Institute of Technology", "Shenzhen Research Institute of Big Data", "Tom's Hardware", "Shenzhen municipal government"], "alternates": {"html": "https://wpnews.pro/news/huawei-led-team-completes-post-training-of-deepseek-1-6t-model", "markdown": "https://wpnews.pro/news/huawei-led-team-completes-post-training-of-deepseek-1-6t-model.md", "text": "https://wpnews.pro/news/huawei-led-team-completes-post-training-of-deepseek-1-6t-model.txt", "jsonld": "https://wpnews.pro/news/huawei-led-team-completes-post-training-of-deepseek-1-6t-model.jsonld"}}