DeepSeek AI Model Struggles With Huawei Chip Training

Summary
– DeepSeek delayed its new AI model launch after struggling to train it using Huawei’s chips, revealing challenges in China’s effort to replace US technology.
– The company was encouraged by authorities to use Huawei’s Ascend processor but faced persistent technical issues during training.
– DeepSeek ultimately used Nvidia chips for training and Huawei’s for inference; the technical issues pushed the model’s release back from its planned May debut.
– Chinese chips lag behind US rivals in performance, stability, and software, hindering China’s goal of technological self-sufficiency.
– Huawei sent engineers to assist DeepSeek, but the company still couldn’t successfully train its model on the Ascend chip.
China’s ambitious push for AI self-sufficiency faces hurdles as domestic chips struggle to match foreign alternatives. A prominent Chinese artificial intelligence firm recently postponed the launch of its latest model after encountering significant challenges while attempting to train it using Huawei’s processors. This setback underscores the ongoing difficulties in Beijing’s strategy to reduce reliance on American technology.
The company, DeepSeek, had initially planned to use Huawei’s Ascend chips to develop its R2 model following the release of its R1 version earlier this year. Chinese authorities had encouraged the shift away from Nvidia hardware, which has long dominated AI training. However, technical complications forced the firm to revert to Nvidia chips for the training phase while reserving Huawei’s processors for inference, the stage where trained models generate responses or predictions.
Sources familiar with the matter revealed that these technical roadblocks were severe enough to delay the R2 model’s scheduled May debut, allowing competitors to gain an edge. Training AI models requires immense computational power and stability, areas where Chinese chips currently fall short compared to their US counterparts. Issues such as inconsistent performance, slower communication between chips, and less mature software ecosystems have hampered progress.
In response to these challenges, Huawei dispatched a team of engineers to assist DeepSeek in optimizing its Ascend-based training process. Despite this hands-on support, the company still couldn’t achieve stable training runs on the domestic hardware. The situation highlights a broader trend in China’s tech sector, where firms are being pressured to adopt locally produced components but often face performance trade-offs.
Recent reports indicate that Chinese authorities are tightening restrictions on Nvidia chip purchases, requiring companies to justify their orders of Nvidia’s high-performance processors, such as the H20. The goal is to accelerate adoption of homegrown solutions from Huawei and Cambricon. However, as DeepSeek’s experience demonstrates, the gap in capability remains a significant barrier to full technological independence. Until domestic chips can reliably handle complex AI workloads, China’s AI ambitions may continue to depend on foreign technology, at least in the short term.
(Source: Ars Technica)