Artificial Intelligence thread

Overbom

Brigadier
Registered Member
Llama 4 is out


By their own admission, they took "inspiration" from DeepSeek, for instance adopting an MoE (mixture-of-experts) architecture instead of their classic "dense" model, but they also introduced some novelties of their own.
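For those unfamiliar with the distinction: in a dense model every token passes through the full feed-forward block, while an MoE layer routes each token to only a few "expert" blocks, so only a fraction of the weights are active per token. A minimal sketch, with all names and sizes purely illustrative (not Meta's or DeepSeek's actual code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy mixture-of-experts layer: each token is routed to its top-k experts.

    Illustrative only -- production MoE (DeepSeek, Llama 4) adds shared
    experts, load-balancing losses, and fused kernels.
    """
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)  # scores every expert per token
        self.top_k = top_k

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out  # only top_k of n_experts ran per token: fewer "active" params
```

A dense layer is the degenerate case n_experts = top_k = 1; the MoE trick is that total capacity grows with n_experts while per-token compute stays proportional to top_k.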

Their model is also natively multimodal (text, images, video), and this may explain the huge pretraining dataset of 30T tokens (about 2x compared to Qwen and DeepSeek). They pretrained on 32K GPUs.
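To get a feel for what 30T tokens on 32K GPUs means, here is a back-of-envelope estimate using the common C ≈ 6·N·D approximation. Everything except the post's 30T tokens and 32K GPUs is an assumption (17B active parameters per the Maverick discussion below, H100-class GPUs at ~40% sustained utilization), and it ignores attention cost and the multimodal encoders:

```python
# Rough pretraining cost via the C ~ 6*N*D rule of thumb (FLOPs ~ 6 x
# active params x tokens). Assumption-laden: real runs also pay for
# attention, vision encoders, rejected experiments, etc.
n_active = 17e9                  # assumed active params per token (Maverick)
tokens = 30e12                   # from the post
flops = 6 * n_active * tokens
print(f"total compute ~ {flops:.2e} FLOPs")          # ~3.1e24

gpus = 32_768                    # from the post
eff_per_gpu = 0.4 * 1e15         # assumed ~40% of ~1 PFLOP/s BF16 peak
days = flops / (gpus * eff_per_gpu) / 86_400
print(f"wall-clock ~ {days:.1f} days")               # ~2.7 days, a lower bound
```

The point is less the exact number than its shape: with that much hardware, even a 30T-token run over the active parameters is a matter of days, which is exactly the brute-force advantage described below.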

The Llama approach has always been simple architecture + brute force... not a wrong idea when you have unlimited hardware. I hope Huawei and others will soon fill the GPU gap, so as to allow Chinese labs to compete on an almost equal footing.

Anyhow, kudos to them for open-sourcing the models. They are definitely the most open among US companies: Google open-sources only its tier-3 models, and OpenAI... well, they are a joke as far as "Open" goes; at least Anthropic, the most closed one, does not pretend.
Very disappointed in this release. The only noteworthy thing, IMO, is the long context window. On everything else, it's behind the competition.

Maybe the only saving grace will come when their big Behemoth model completes training and is then distilled down to a capable smaller one, with thinking added on top.

DeepSeek basically
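For context on the distillation step: the generic technique trains a small "student" to match a big "teacher's" output distribution. A minimal sketch of classic logit distillation (Hinton et al., 2015); note that DeepSeek's published R1 recipe actually distilled by generating reasoning traces with the big model and fine-tuning small models on them, so this is the textbook version, not their exact method:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Logits: (batch, vocab). The temperature softens both distributions so
    the student also learns the teacher's relative preferences among
    near-miss tokens, not just the argmax.
    """
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    # t^2 keeps gradient magnitudes comparable across temperatures
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * t * t
```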


But then why throw billions every year at AI research if all you do is copy from other open-source projects? Hopefully after they catch up (?) they can innovate again.
 

european_guy

Junior Member
Registered Member

Sorry to disagree, but they are better than DS.

This is a non-reasoning model, so you have to compare apples with apples.

The Maverick model has 400B parameters (less than DS's 671B) with 17B active (about half of DS's 37B), and it has performance comparable to DS V3.1 (the latest one, released just a couple of weeks ago); on top of that, it also has native multimodality and a 1M context length.
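The active-parameter count is what drives per-token inference cost, which is the real story here. A quick illustrative comparison using the usual ~2 FLOPs per active weight per token approximation (parameter figures from this post; DS V3's 37B activated is from its technical report):

```python
# Per-token decode FLOPs scale with *active* params (~2 FLOPs per weight),
# not total params -- that is why MoE models are cheap to serve.
models = {
    "Llama 4 Maverick": {"total": 400e9, "active": 17e9},
    "DeepSeek V3":      {"total": 671e9, "active": 37e9},
}
for name, p in models.items():
    flops_per_token = 2 * p["active"]     # matmul-dominated approximation
    print(f"{name}: ~{flops_per_token:.1e} FLOPs/token, "
          f"{p['active'] / p['total']:.0%} of weights active")
```

Total parameters still set the memory footprint, though: Maverick needs roughly half of V3's per-token compute, but hosting it still means holding all 400B weights.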

Going into the technical details, they have introduced interesting new ideas in the attention layers, giving a powerful model that runs faster on the same hardware. Native multimodality is also a powerful feature, and many will take inspiration from it. They have also introduced new ideas in training, but unfortunately these are less documented and cannot be inferred just by looking at the sources.

In DeepSeek's favor, there is much deeper and more detailed documentation, with technical papers on every aspect of the model, including training. DeepSeek is way more open than Llama 4; nevertheless, Llama 4 is currently the top US model as far as openness goes.
 

Eventine

Junior Member
Registered Member
The version of Llama 4 they released to the public isn't performing up to par. Many people across the community are reporting bad results, even below QwQ-32B for the 400B model. We'll see if this is a parameters problem and gets fixed in the coming days.

As it is, in terms of pure performance: Google = OpenAI > Anthropic >= DeepSeek > Grok 3 > Llama 4 currently, although Grok 3 has the benefit of being uncensored, as previously stated.

All eyes are on OpenAI's full o3 / o4-mini release later this month and DeepSeek's R2 release around the same time, to see if they can take back the crown from Google. Anthropic is basically still stuck in its coding niche and increasingly threatened by other players. Meta needs to correct course quickly or risk dropping out of the race.

With more big AI labs moving into multimodal models, compute requirements will likely increase; OpenAI recently had to turn off public access to their image generation because it was taking too much GPU time.

Chinese players need to cooperate more on optimizations; there's a high compute wall for multimodal that needs another DeepSeek moment.
 

luminary

Senior Member
Registered Member
Well, the great thing about DeepSeek and Qwen making their model weights and RL process available is that anyone in the world can use their algorithms to create their own open-source reasoning model, so that we the public have the power of controlling AI and are no longer just subservient to the tech overlords in Silicon Valley, whose goal is to achieve global domination and techno-feudalism over the rest of us, whom they want to rule over in serfdom.
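Concretely, the published recipe behind the R1-style reasoning models is GRPO (group relative policy optimization, from the DeepSeekMath and R1 papers): sample a group of answers per prompt, score them with a verifiable reward, and normalize each reward against its own group, so no critic network is needed. A minimal sketch of the advantage step (illustrative, not DeepSeek's production code):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages as in GRPO.

    rewards: (n_prompts, group_size), e.g. 1.0 if a sampled answer is
    correct and 0.0 otherwise. Each sample is judged only against the
    other samples for the *same* prompt.
    """
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled answers each, graded right/wrong.
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 1.0, 0.0]])
print(grpo_advantages(rewards))
```

These advantages then feed a PPO-style clipped policy-gradient loss with a KL penalty toward the reference model; the whole pipeline is reproducible from the papers, which is exactly the point being made here.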
The true intentions of the techno culture in the SF Bay Area have been getting a lot of attention recently.
Basically white supremacy and eugenics in a nerdy, sanctimonious way:
https://www.reddit.com/r/SneerClub/comments/1jqouyl
 

european_guy

Junior Member
Registered Member

Here is the technical report:


Smaller than DS R1, with 20B active parameters (about half of R1's) and 200B in total (about 1/3 of R1's), it gives the same or better results in STEM thanks to improved RL (reinforcement learning) based on some published papers.

Fun fact: the only test where it performs way worse than R1 is called SimpleQA, a set of general-knowledge questions (though not as easy as the name might suggest).

For general knowledge you need a huge model that can memorize a lot of information, more than a smart one. R1 is 3 times bigger, and that's why it is better. Gemini 2.5 and Grok 3 also seem to be very big models, while OpenAI's o3-mini should be similar in size to this one, so roughly 100B to 200B parameters.
 

Eventine

Junior Member
Registered Member

Looks like OpenAI is gearing up for their next set of releases, o4-mini and full o3. I wonder if this will affect DeepSeek's R2 release. Clearly DeepSeek would prefer to release in April (since that's what they originally promised), but with an impending OpenAI release, they'd want to make sure they can match or beat OpenAI's new models to keep their magic reputation.

As I've said before, the race is just getting started. Now that all the big AI labs in the West have made their play, it's time for China to answer. I'm also looking to see where Grok goes next, as it's been a while since Grok 3's release (in the AI race, a few months is "a while").
 

AI Scholar

New Member
Registered Member
Since their release, I've put both Gemini 2.5 Pro and DeepSeek V3-0324 through extensive testing across programming, web design, creative writing, casual chatting, and AI development tasks. Here are my thoughts:

Front-End Design:
DeepSeek takes the lead. In my experience, it generates more visually refined and aesthetically pleasing designs compared to Gemini.

Creative Writing:
Gemini delivers more consistent writing, especially over long contexts, but DeepSeek's stories are more entertaining. DeepSeek wins thanks to a more enjoyable writing style and less censorship.

General Chatting:
DeepSeek has a more pleasant and based conversational style. I’ve found myself defaulting to it for casual discussions.

Coding/AI dev:
Gemini tends to overengineer solutions, often producing excessively complex code when simplicity would suffice. While it clearly has stronger reasoning capabilities than DeepSeek V3, I’ve achieved similar (sometimes better) results with V3 by optimizing my prompts.

Workflow-wise, Gemini has been quite frustrating to integrate, whereas DeepSeek V3 feels smoother and more intuitive for actual development.
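For anyone curious what the smoother integration looks like: DeepSeek exposes an OpenAI-compatible endpoint, so a minimal client is a few lines. The base URL and model name below match DeepSeek's docs at the time of writing, but treat them (and the prompt) as assumptions:

```python
# Minimal DeepSeek V3 call through the OpenAI-compatible API.
# Assumes the `openai` Python package and a DEEPSEEK_API_KEY env var.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",   # OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="deepseek-chat",                 # V3; "deepseek-reasoner" is R1
    messages=[
        {"role": "system", "content": "You are a concise front-end engineer."},
        {"role": "user", "content": "Draft a minimal responsive navbar in HTML/CSS."},
    ],
    temperature=0.7,
)
print(resp.choices[0].message.content)
```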

Gemini is undeniably much smarter in raw capability, and it should be better for many use cases, but DeepSeek V3 has become my preferred model for nearly all of my tasks, as it is more usable and enjoyable in practice.

If V3 already works for my needs so well, I can only imagine how useful DeepSeek R2 will be. I find that DeepSeek V3 is very underrated, and R2 will be a big shock to those not paying attention.
 