Artificial Intelligence thread

9dashline

Captain
Registered Member
It might run inference on 2048 GPUs with modified vram. Training cluster is probably of unknown size.
Read the paper: the training cluster is 2,048 H800 GPUs, which are gimped H100s due to the US export ban.

The 685B DeepSeek V3 can be run on approx. 800 GB of VRAM, which is about 10 GPUs (H100). In fact, since it's a MoE and only a fraction of the parameters is active per token, the whole thing can be run on CPU with enough system RAM.
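
For anyone who wants to check the VRAM math above, here is a rough back-of-envelope sketch. It assumes FP8 weights (1 byte per parameter) and 80 GB of memory per H100; both are my assumptions, not figures from the post. The 800 GB budget is simply the number quoted above.

```python
import math

# Figures taken from the post
PARAMS_BILLION = 685      # total parameter count, in billions
VRAM_BUDGET_GB = 800      # approximate VRAM needed, per the post

# Assumptions (not from the post)
BYTES_PER_PARAM = 1       # FP8 weights: 1 byte per parameter
GPU_VRAM_GB = 80          # H100 with 80 GB of memory

# 1e9 parameters * 1 byte = 1 GB, so billions of params map directly to GB
weights_gb = PARAMS_BILLION * BYTES_PER_PARAM

# Whatever is left of the budget goes to KV cache, activations, buffers
headroom_gb = VRAM_BUDGET_GB - weights_gb

# Number of GPUs needed to cover the whole budget
gpus = math.ceil(VRAM_BUDGET_GB / GPU_VRAM_GB)

print(f"weights: {weights_gb} GB, headroom: {headroom_gb} GB, GPUs: {gpus}")
```

At 16-bit precision the weights alone would be roughly double that, so the 10-GPU estimate only holds if the model is served at 8-bit precision or lower.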
 

9dashline

Captain
Registered Member
It’s sad how China, dancing with export ban shackles, is able to beat the U.S. at pretty much anything nowadays. From EV/self driving to next gen fighter aircraft/EWAC, from video games to LLM/AI models. If they continue at this pace not even core tech like EUV and stepmom pron are safe.
DeepSeek killed 3 birds with one stone...

First, by releasing a SOTA frontier model as open weights, it puts pressure on what OAI/Anthropic etc. can charge at the lower end of intelligence, since any inference provider can now compete directly with ChatGPT 4o and Claude 3.5 Sonnet.

Second, by pursuing algorithmic efficiencies instead of brute-force hacks (cough cough o3 cough cough), DeepSeek shows the US that its EUV chokehold and Nvidia supremacy (temporary moats at best, since China is catching up rapidly) are not as effective at curtailing Chinese AI advancement as hoped. That raises the "cost" of US sanctions by lowering their actual effectiveness.

Third, by publishing the paper with the techniques used, and showing how a SOTA model as of late 2024 can be trained for less than the nominal cost of, say, an average useless NYU grant, it truly democratizes AI. It gives power and hope back to the peoples of the world and diversifies AI away from the top US tech companies, which in the AI context are all intelligence arms of the US gov acting under orders in this new Manhattan Project of a race to AGI, etc.
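
To put a rough number on the training cost: as far as I recall, the DeepSeek-V3 technical report gives about 2.788M H800 GPU-hours for the full training run and assumes a rental price of $2 per GPU-hour. Treat both figures as assumptions to verify against the paper; the 2,048-GPU cluster size is the one mentioned earlier in the thread.

```python
# Assumed figures, recalled from the DeepSeek-V3 technical report (verify before citing)
GPU_HOURS = 2_788_000        # total H800 GPU-hours for the full training run
USD_PER_GPU_HOUR = 2.0       # rental price assumed in the report

# Cluster size mentioned earlier in the thread
CLUSTER_GPUS = 2048

# Nominal cost of the run at the assumed rental price
cost_usd = GPU_HOURS * USD_PER_GPU_HOUR

# Wall-clock time if all GPUs run in parallel for the whole job
days = GPU_HOURS / CLUSTER_GPUS / 24

print(f"nominal cost: ${cost_usd:,.0f}")
print(f"wall-clock time: {days:.1f} days")
```

Note that this covers only the GPU time of the final run at a rental rate. It does not include prior research, ablation experiments, salaries, or the cost of buying the hardware.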
 

Eventine

Junior Member
Registered Member
But that is precisely the best strategy vs closed source companies like “Open” AI. Fast follow + optimize + true open source; destroy their ability to capitalize on their billions of dollars of investment & force their financial backers to burn ever more money until the whole thing crashes down due to lack of profitability.

It’s only when research can happen on a level playing field and be shared, that innovation can benefit all humanity. The West has historically weaponized everything they've ever created. They're trying to do that again with generative AI, hoping it will bring forth a new age of Western hegemony and World Empire. Too bad for them, this time China is there to check their ambitions; and as long as China supports the open source movement, it has the moral high ground.
 

Overbom

Brigadier
Registered Member
I can only imagine that Zuck, at least, is getting a lot of heat over this.

He has repeatedly mentioned that justifying the high costs to shareholders is a challenge.

I cannot imagine how he will sit at the next shareholder meeting and try to explain why Meta needs to spend multiple billions every year on its AI models when a random Chinese company can do the same (and better) for just a few million dollars.
 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member

Tencent with critical care model.

Meta actually has a huge reason to keep this open source: it helps their entire ecosystem.

If they don't open source, then DeepSeek and Qwen will take over as the main LLMs that people use. Open models will win.
 

Overbom

Brigadier
Registered Member
I know, but if you listen to his interviews it is clear that he is being pressured by shareholders to close-source Llama because of the exponential (?) increase in AI dev costs.

DeepSeek V3 just blew a huge hole in his argument that an endlessly growing multi-billion investment is necessary.

Similarly for Sam, you can see that OpenAI will get pressured.

In the West, Google seems to be the best positioned going forward on inference and scaling up.
 