Artificial Intelligence thread

diadact

Just Hatched
Registered Member
And just why do they need a 100K H100 GPU cluster to stay competitive? Alibaba just trained its latest model, Qwen 2.5, on 18 trillion tokens

It was released just a few months after Qwen 2.0
If they had a larger cluster with 20K more GPUs, they would have crushed OpenAI, Anthropic, etc.
There are two constraints in AI right now: AI GPUs & human-labeled data for RLHF (in the future, RLHF will act as an aid for RLAIF)

How many non-duplicate tokens are out there globally that you can use? How much larger does their cluster really need to get?
Every AI firm (the big US firms, Cohere, Mistral, and every AI firm in China) is using synthetic data generation to produce more tokens for training
You just need to make sure that the synthetic data is diverse & has high entropy
Qwen hired lots of human data annotators for RLHF
OpenAI gives you free access to GPT-3.5/GPT-4o mini so that they can use user interactions with the model to train their models
In effect, they use their users for synthetic data generation
The data wall doesn't exist anymore
Chinese firms are trying to compensate for their lack of GPUs with better-quality data
OpenAI & Anthropic will have huge numbers of GPUs along with quality data
We haven't even seen the full power of o1 yet
They just forced the model to think step by step at inference and trained it on RL fine-tuned datasets
No MCTS or tree search was involved during inference (they don't have enough compute to serve MCTS to millions of users)
AI has become a compute/GPU game now
The more compute you have, the more you can experiment, the bigger the models you can make, and the more inference-time compute you can use
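
To make the "diverse & high entropy" point about synthetic data a bit more concrete, here is a rough sketch of two cheap diversity checks: unigram Shannon entropy and a distinct-n ratio. This is only an illustration under simple assumptions (whitespace tokenization, tiny made-up sample sentences, hypothetical function names `token_entropy` and `distinct_n`); it is not how any of the labs mentioned actually filter their data.

```python
import math
from collections import Counter


def token_entropy(texts):
    """Shannon entropy (bits per token) of the unigram distribution.

    Assumption: tokens are lowercased whitespace-separated words.
    """
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def distinct_n(texts, n=2):
    """Fraction of n-grams that are unique across the corpus (1.0 = no repeats)."""
    ngrams = []
    for t in texts:
        toks = t.lower().split()
        ngrams.extend(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


# Made-up samples: one batch repeats itself, the other does not.
repetitive = ["the model is good"] * 4
varied = [
    "the model is good",
    "scaling laws favor data",
    "tree search costs compute",
    "annotators label preference pairs",
]

for name, batch in [("repetitive", repetitive), ("varied", varied)]:
    print(name, token_entropy(batch), distinct_n(batch, n=2))
```

A batch that keeps repeating the same sentence scores lower on both metrics than a batch of the same size with no repeated words, which is the kind of signal you would want before adding generated text to a training mix. Real pipelines would likely use a proper tokenizer and embedding-based deduplication on top of something like this.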


 

Eventine

Junior Member
Registered Member
Seems like a fast-follower mentality, which is probably to be expected of profit-driven companies like Alibaba. What sets OpenAI above the competition is their mission-driven culture - from talking to people inside the company, money is not the main motivation; rather, it’s the deeply held belief that they’re on the edge of revolutionary breakthroughs in Artificial General Intelligence, which will fundamentally change the course of human history.

Rumor has it that much of the talent at OpenAI actually took demotions to join the company. I've heard of people who were former directors & VPs at Google, Facebook, etc. joining as just regular managers & employees. The passion & sense of mission is, from all indications, off the charts. That seems to be a missing quality among their competitors, who are looking more to just keep up / cash in.

We can fault the West for many things, but when it comes to a single-minded, obsessive-compulsive, mission-driven mentality, it feels like they still have the edge over Chinese technology leaders, who tend to be more money-motivated.
 

diadact

Eventine said: "Seems like a fast follower mentality, which is probably expected out of profit guided companies like Alibaba. What sets Open AI above the competition is their mission driven culture - from talking to people inside the company, money is not the main motivation, rather it’s the deeply held belief that they’re on the edge of revolutionary breakthroughs in Artificial General Intelligence, which will fundamentally change the course of human history."
Considering the performance of Qwen 2.5, I would disagree with that statement
If they had more compute, they would have crushed Anthropic, DeepMind/Google, and OpenAI
OpenAI can behave like that because Microsoft & other investors are willing to bankroll them

Eventine said: "Rumors are, many of the talent in Open AI actually took demotions to join the company. Heard of people who were former directors & VPs at Google, Facebook, etc. joining as just regular managers & employees. The passion & sense of mission from all indications is off the charts. That seems to be a missing quality among their competitors who are looking more to just keep up / cash in.

We can fault the West for many things but when it comes to single minded, obsessive compulsive, mission driven mentality, it feels they still have the edge over Chinese technology leaders who tend to be more money motivated."
DeepSeek & Zhipu from China have this mentality
DeepSeek bankrolls its AI development through its parent company High-Flyer, a quant trading firm
If AGI/ASI development requires billions or trillions of dollars, then you will have to generate revenue

The Chinese government will have to get involved in funding at some point in the future, just like the US government will

I personally think that if China wants to make better models than the big US labs, then they should just build a 100K H100-class GPU cluster for DeepSeek and let them do their job
 

Eventine

I’m not questioning the practical performance of Chinese models, but if you read the Qwen 2.5 researchers’ statements above, you can tell they’re not doing any groundbreaking research, just refining existing practices. The final sentence about being shocked by OpenAI’s breakthrough in chain-of-thought reasoning is especially telling. From decades in industry and academia, I'm painfully aware of East Asia's tendency to focus on practical, incremental improvements over basic research, and this is just another instance of it.

Often I wonder how much of this is due to culture. People in the West are far more intense and psychotic about their beliefs, but that same psychosis seems to drive them to pursue subjects with single-minded devotion, even if it ruins them.
 

diadact

Eventine said: "if you read the Qwen 2.5 researchers’ statements above you can tell they’re not doing any ground breaking research but just refining existing practices."
Nobody has released anything magical in AI right now (they may be doing something groundbreaking internally, but it hasn't been released)
OpenAI's o1 model is CoT with some RL fine-tuning. Every serious lab in the world (US or China) is doing something similar, and other models with this ability will be released publicly in weeks or months

OpenAI has a paper-thin moat; otherwise they wouldn't hide o1's reasoning tokens from their users

Eventine said: "The final sentence about being shocked by Open AI’s breakthrough in chain of thought reasoning is especially telling"
Junyang was being humble here
Alibaba has released papers related to MCTS & CoT
The next release will include these techniques
 