Artificial Intelligence thread

9dashline

Captain
Registered Member
Marco-o1 seems to be worse than DeepSeek. Makes sense, as it's a very low-parameter model at 7B

To be more exact, it's a Qwen2-7B-Instruct fine-tune
Seems like a different division of Alibaba altogether, not associated with Qwen. Plus they haven't done the RL part yet, so it's not going to be strong at reasoning until they do.

Sounds to me like everyone is now jumping the gun to announce something ASAP, even if it's not yet finished or still baking

Facebook is scrambling too

https://www.reddit.com/r/LocalLLaMA/comments/1gxxj4w
 

9dashline

Captain
Registered Member
Qwen needs to pump out a Q1 that's 72B in size; it will definitely surpass o1-preview

ClosedAI is dragging its feet and holding back. At this rate Qwen is going to do to o1 what Kling did to Sora
 

Overbom

Brigadier
Registered Member
Sounds to me like everyone is now jumping the gun to announce something ASAP, even if it's not yet finished or still baking

Facebook is scrambling too

https://www.reddit.com/r/LocalLLaMA/comments/1gxxj4w
But these new models, from what I can see, are not using o1-like reasoning/thinking techniques where they use inference time compute to "think".

More like Meta is throwing a bunch of stuff at the wall, and will wait to see which will stick...

Qwen needs to pump out a Q1 that's 72B in size; it will definitely surpass o1-preview
They might have something like what you're asking for internally, and they released the small preliminary version to let the community test it out. TBH, if they release bigger versions, I'm curious whether they will allow full access to the thinking process, as that would be extremely valuable data that competitors could train on.

Still a bit shocked that DeepSeek shows its full thinking process; let's see how the open-source/weights model release will look, though..


ClosedAI is dragging its feet and holding back. At this rate Qwen is going to do to o1 what Kling did to Sora
Qwen3...
As for OpenAI, the gap is closing. Great work by the Chinese AI labs. In fact, if not for OpenAI's existence, I would have said that China is leading the AI race rn
 

Hyper

Junior Member
Registered Member
But these new models, from what I can see, are not using o1-like reasoning/thinking techniques where they use inference time compute to "think".

More like Meta is throwing a bunch of stuff at the wall, and will wait to see which will stick...


They might have something like what you're asking for internally, and they released the small preliminary version to let the community test it out. TBH, if they release bigger versions, I'm curious whether they will allow full access to the thinking process, as that would be extremely valuable data that competitors could train on.

Still a bit shocked that DeepSeek shows its full thinking process; let's see how the open-source/weights model release will look, though..



Qwen3...
As for OpenAI, the gap is closing. Great work by the Chinese AI labs. In fact, if not for OpenAI's existence, I would have said that China is leading the AI race rn
DeepSeek is not interested in making money from AI. They want to be attractive to prospective talent. They want to be attractive to college graduates. Rule of Cool. Otherwise, why would an HFT firm spend money on LLMs?
 

9dashline

Captain
Registered Member

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member

China's largest data center in the central region of the country has opened. Currently it has just 2,000 PFLOPS, but it will reach 10,000P by the end of this year. The first phase will conclude at 30,000P (30 EFLOPS).
Eventually, this data center will reach 100 EFLOPS, filling a major computing gap in the middle of the country.
 