Artificial Intelligence thread

tamsen_ikard

Captain
Registered Member
AFAIK, from what is being rumored on the Chinese internet, V4 still used Nvidia cards for training and only uses Ascend cards for inference.

It's very likely still trained on Western hardware; it's just that inference now uses Chinese cards, which is a big step forward but not as big as most people are trying to make it out to be.
I thought Chinese cards were already widely used for inference. Otherwise, what is the point of buying even a single non-Nvidia card?

If DeepSeek spent all this extra time just to make inference work, that would be disappointing.
 

sunnymaxi

Colonel
Registered Member
AFAIK, from what is being rumored on the Chinese internet, V4 still used Nvidia cards for training and only uses Ascend cards for inference.
Image
 

Michael90

Senior Member
Registered Member
I think some of you guys are getting so spoiled that you believe China should magically catch up with the US in every field today. You guys forget that the US has a huge advantage and a head start of almost a century of industrialization and technological dominance, at a time when China was ravaged by wars, famines, malnutrition, etc. There was basically little to no R&D spending in China during that whole era until the 90s, to be honest. You guys don’t seem to understand how far behind China started. Even South Korea was outspending China by far for decades, forget about the US. So China has actually achieved a miracle in such a short period of time (same as the Asian tigers, which grew and industrialized spectacularly fast). China was also lucky to have a suitable environment to grow in and successful neighbours with similar traits to learn from (Korea, Japan, Taiwan, Singapore, a large Chinese diaspora in Asia that helped a lot as well, etc.). India, by contrast, didn’t have that luxury, surrounded by subpar, poor countries and the poorest/least integrated regions on earth (matched only by sub-Saharan Africa).

So I think people should be aware of where China comes from. China has done exceptionally well. It will take some time for them to really catch up in every field, but at least in new emerging technologies they have leapfrogged and taken the lead in some sectors that don’t have a legacy Western system/dominance, so they are on the right track.
So I think we should be more patient with the advancement and catch-up phase China is in. I’m not particularly surprised DeepSeek is still relying on Nvidia for training their model, with minor input from Huawei. At least inference is relying fully on Huawei now, so it’s good progress. Eventually, with time, they will move training to fully homegrown hardware as the country’s technology and AI chips/data systems improve.
 

bsdnf

Senior Member
Registered Member
What I've heard is that Flash still uses NVIDIA for pre-training, while Ascend is used for inference and post-training. Pro only uses Ascend for inference because the model is too large and unstable to train on Ascend, so they gave up on it.
 

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member
Keep in mind that the Atlas 950 supernode is not out yet, and only when it is out will DeepSeek have significant domestic compute for training. I see they have put out firmware updates for training and fine-tuning. Considering how undertrained V4 is, they can continue to iterate on it from the base model.
 

tamsen_ikard

Captain
Registered Member
If DeepSeek is only relying on Huawei now for its model inference, and it took so long because adapting to Huawei is so hard, then why did Chinese companies buy Huawei cards in quantities of 100k+ to a million in 2024 and 2025? Did they buy them just for decoration? What about Cambricon, Biren, Moore Threads and all these other cards?

The whole point of all these cards is that they can do deep learning. If training is so hard, then companies are buying them to at least do inference.

So your whole argument falls apart.

Again, the only logical reason for DeepSeek taking this long is to be able to do training on these cards, because inference has already been done for two years.
 

tokenanalyst

Lieutenant General
Registered Member
Because there are more models than DeepSeek V4, and those cards are also used for inference and for training other models that are not LLMs, such as diffusion models, CNNs and other AI models.
The issue, and the reason Jensen is so insistent, is that the only thing keeping Nvidia's monopoly in the AI industry now is its ecosystem. Once popular models like DeepSeek and others start training and running inference on other GPUs, it will lead to the creation of an alternative ecosystem and permanent market share loss for Nvidia.
 

Engineer

Major
DeepSeek's stack is highly optimized for and tightly integrated with Nvidia cards, to the point of bypassing CUDA completely and interacting directly with the driver. Switching to Huawei would basically require a rewrite of the entire stack, including all the R&D and debugging.
 