Artificial Intelligence thread

Eventine · Mar 20, 2025

MortyandRick said:
But why pay for GROK 3 when one can just use deepseek open source? One can adjust the censorship as well.

Not practical. The hardware requirements for doing inference on the 671B model are astronomical, and there are no third-party providers for Deep Seek that have both uncensored the model and offer as generous a plan as Grok 3. A better provider may eventually emerge, but right now, Grok 3 is the superior service.
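
For a rough sense of scale, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. The precisions (16, 8, and 4 bits per parameter) and the 32B comparison size are illustrative assumptions, and the estimate ignores KV cache, activations, and runtime overhead, so real requirements are higher.

```python
# Back-of-the-envelope weight memory: parameters x bytes per parameter.
# Ignores KV cache, activations, and runtime overhead (real needs are higher).

def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# Model sizes and precisions below are illustrative assumptions.
for name, size_b in [("671B model", 671), ("32B model", 32)]:
    for bits in (16, 8, 4):
        print(f"{name} @ {bits}-bit: ~{weight_memory_gb(size_b, bits):.0f} GB")
```

Even with aggressive quantization, the 671B weights alone land in the hundreds of GB, which is well beyond a single consumer GPU.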

Legume7 · Mar 20, 2025

Eventine said:
If I had to guess, it's because performing search requires a search engine, and signing a commercial contract with a search provider for English is probably not particularly high on the priority list for a research company like Deep Seek.

Anyway, now that the dust has settled a bit, a couple of observations:

The West is still at the top of the closed weights race.

The biggest winner in recent days, IMO, is Grok 3 from xAI. Not only is Grok 3 cheaper than any offering from Anthropic / Open AI, but it is uncensored. In my testing, xAI is really only worse than Open AI at Deep Research. In everything else, it is equal or better. Elon Musk has a winner here as long as he keeps it uncensored and accessible. This is the real "Open AI killer."

Claude 3.7 still dominates the practical coding domain, but that's about all it really dominates. It's a specialized model for engineers and companies looking to replace engineers. It isn't particularly superior at anything else and its high price means it's only relevant for enterprises and productivity users.

Chat GPT 4.5 was largely a bust. Few people talk about it or have found great general use cases for it. Open AI is being kept alive by Deep Research and their first mover advantage, since many enterprises originally signed with them. It's possible they'll turn it around with a new breakthrough, but with pricing as bad as it is, they're never going to be practical for the wider public.

Google still has a monopoly on the most generous closed AI service/API, but honestly hasn't improved its closed models that much since Gemini 2.0 Flash Thinking. They need to start delivering new things or risk falling behind other companies' frontier models.

Everybody else is basically irrelevant. Censorship - not political, but general - is emerging as a challenge for Chinese closed source solutions for casual AI users, as why would you go for a Tencent or Kimi or Baidu solution, when you can just get Grok 3 that has top tier performance and is uncensored and has a relatively cheap plan?

In the open weights landscape, the situation has gotten more diverse

There have been almost a dozen open weights releases since Deep Seek blew the market open and raised the bar. Where previously, the industry was increasingly closing their models in imitation of Open AI and Anthropic, now there is stronger interest across the world in developing open weights models in pursuit of wide adoption.

Deep Seek 671B is still the best over all open weights model, but is increasingly being challenged by smaller models that are better at specific tasks, particularly coding, math, and general logical reasoning.

The advantage of smaller models like QWQ 32B, the new Mistral, Gemma 3, and Nvidia Nemotron is that they can actually be run locally - and far more cheaply than Deep Seek. This is an advantage given the market expectation of cheaper costs post-Deep Seek, and for many purposes, those smaller models perform competitively with Deep Seek.

On the services & API side, none of the open weights companies have a great offering, but QWQ and Gemma 3 probably stand out as the most user friendly.

Deep Seek either really dropped the ball on this one or was forced into dropping the ball by US denial of service attacks. The slow API and search still being broken a month after they said they'd fix them has cost Deep Seek's web service much interest and adoption. That's probably "okay" from Deep Seek's perspective but not great from the perspective of Chinese AI's global adoption, since Chinese closed source models are much less known and popular than Deep Seek outside of China.

In retrospect, Deep Seek was extremely wise to open source / open weights their model. On the AI as a service side, Chinese companies are operating at a fundamental disadvantage due to several factors: 1) the lack of a strong Chinese search engine company (due to Baidu being such a pathetic failure), which means Chinese companies have to pay Google, Bing, etc. and operate at the mercy of Western service providers, 2) the strict censorship on service based solutions, either due to Chinese laws or moral concerns, which puts them at a disadvantage vs. companies like xAI that aren't afraid to go all the way.

Open source / open weights is the only way to get past these disadvantages, since the search capabilities can be provided by third party companies and the censorship can also be removed by those same third parties. If other Chinese AI companies haven't gotten the hint yet, this is the only path towards beating the West in market adoption under current conditions.

Grok 3 does not meaningfully outperform DeepSeek. It may be somewhat better, but that's with 100 times the computational resources and at least hundreds of human labelers. Even with all these advantages, DeepSeek is still way ahead in market share. The difference between DeepSeek and Grok or any USA model is not enough for the general public to care. In fact, DeepSeek narrowly trails OpenAI in market share, with everyone else not even in the running. That's not counting all of the third-party providers of DeepSeek, especially in China, where almost every company is using DeepSeek.

All USA labs are currently losing billions per year, including OpenAI, while DeepSeek is already profitable. There is simply no path to profitability for these labs unless they can produce some major breakthrough that both leaves DeepSeek in the dust in performance and significantly improves their costs. Sure, anything can happen, but I would put the probability at very low since the USA labs are coping and seething more than ever. That certainly doesn't exude confidence. Also, USA labs have not produced a single innovation in their entire existence that can meaningfully improve efficiency. On the other hand, DeepSeek has produced at least half a dozen, and they can scale compute since R1 barely used any.

Like I've said before, the USA needs to score a complete victory in AI or they lose and their stock market crashes. There is simply no justification for the current valuations of the major tech giants unless they can monopolize AI or achieve ASI, both of which are pipe dreams at this point. We already saw what happened to Tesla. Just the existence of a credible competitor to Tesla, let alone companies that surpass it, means that it can't be worth more than every car company combined. As to DeepSeek's API, they've said many times their only goal is AGI. The API is not their main priority, and they've rejected all outside investment. Even then, they are the only major AI lab that's profitable.

OptimusLion · Mar 21, 2025

Tencent Hunyuan's self-developed deep thinking model T1 is released: fast in speaking, able to reply in seconds, and good at processing ultra-long texts

Tencent Hunyuan today released its self-developed deep thinking model T1, which is not only fast in speaking and able to respond in seconds, but also good at processing very long texts, showing strong reasoning capabilities. In multiple public benchmark tests, T1's scores lead the industry, especially in the field of long text reasoning. Through large-scale reinforcement learning, combined with special optimization for science problems such as mathematics, logical reasoning, science, and code, the official version of Hunyuan T1 further improves its reasoning ability.

In common benchmarks that reflect the basic capabilities of reasoning models, such as the enhanced large language model evaluation dataset MMLU-PRO, Hunyuan T1 scored 87.2 points, second only to o1. In public benchmark tests of Chinese and English knowledge and competition-level mathematics and logical reasoning such as CEval, AIME, and Zebra Logic, Hunyuan T1's performance has also reached the level of industry-leading reasoning models.

"T1" also demonstrated very strong adaptability in multiple alignment tasks, instruction following tasks, and tool utilization tasks.

Eventine · Mar 21, 2025

The better question is, is Tencent prepared to open weights their model the way Deep Seek did? Catching up to o1 is not particularly impressive these days, but adding to the open weights movement can lead to an ecosystem advantage, as Tencent should know from its experience with Hunyuan Video and Alibaba's Wan 2.1.

Wrought · Mar 21, 2025

You know it's bad when you're getting called out for being delusional by an Indian. No wonder the CHIPS Act turned out the way it did.

https://twitter.com/i/web/status/1903196802171670893

9dashline · Mar 22, 2025

https://twitter.com/i/web/status/1903115511803809895

If true, then it's over for OpenAI/Chatgpt/Sama

ZeEa5KPul · Mar 23, 2025

It's become clear from all these model releases landing at about the same level of capability that progress has plateaued (a nicer word than "stalled"). Quelle surprise. It's almost like there were people skeptical of these stochastic parrots from the very beginning and warning against overestimating their capabilities.

It's funny watching the tech-hype switch to "agents" like we're supposed to forget that their underlying models are still trash. Where's GPT-5? We've got 4.5, o-1, o-3-mini-super-for-real-trust-me-bro, but no 5. What happened to 5?

nativechicken · Mar 23, 2025

ZeEa5KPul said:
It's become clear from all these model releases landing at about the same level of capability that progress has plateaued (a nicer word than "stalled"). Quelle surprise. It's almost like there were people skeptical of these stochastic parrots from the very beginning and warning against overestimating their capabilities.

It's funny watching the tech-hype switch to "agents" like we're supposed to forget that their underlying models are still trash. Where's GPT-5? We've got 4.5, o-1, o-3-mini-super-for-real-trust-me-bro, but no 5. What happened to 5?

When GPT-4.0 was released, all globally available training corpus for AI had been exhausted. GPT-5 attempted to generate its own training data, which led to data contamination, i.e., AI hallucinations. As a result, AI has stagnated.

In reality, all leading global AI foundational models now face this issue: the training corpus is essentially identical, so their ultimate capabilities will converge.

Currently, piling up generic corpus no longer contributes to AI advancement. Instead, specialized teams are designing finely crafted corpus to enhance intelligence.

Now, top Chinese and American AI teams cannot make progress in foundational models (humanity’s historically accumulated universal corpus has been depleted; no amount of human-AI dialogue can improve further). Thus, competition now shifts to efficiency: running large models at lower costs and achieving large-scale deployment. Hence, various AI agents have become prevalent.

ZeEa5KPul · Mar 23, 2025

nativechicken said:
When GPT-4.0 was released, all globally available training corpus for AI had been exhausted. GPT-5 attempted to generate its own training data, which led to data contamination, i.e., AI hallucinations. As a result, AI has stagnated.

In reality, all leading global AI foundational models now face this issue: the training corpus is essentially identical, so their ultimate capabilities will converge.

Currently, piling up generic corpus no longer contributes to AI advancement. Instead, specialized teams are designing finely crafted corpus to enhance intelligence.

Now, top Chinese and American AI teams cannot make progress in foundational models (humanity’s historically accumulated universal corpus has been depleted; no amount of human-AI dialogue can improve further). Thus, competition now shifts to efficiency: running large models at lower costs and achieving large-scale deployment. Hence, various AI agents have become prevalent.

The problem isn't data availability; children develop language and reason on many orders of magnitude less data. The problem is that ANNs trained by backpropagation are a dead end on the road to AGI, and no one has the first clue how to develop a reasoning system.

Eventine · Mar 23, 2025

It's too early to say we've stagnated. The only thing that's certain is that Sam and Dario are not sleeping well at night, not just because of Deep Seek, but because they sold the idea that AGI / ASI was just 2-3 years away to the US government and their investors. If they don't deliver it, their whole business model of burning investor / government money while extorting customers with high API costs will collapse.

Companies like Google, Deep Seek, Meta, Alibaba, and even xAI will survive because they have other revenue sources and more realistic business applications. As I've said before, the smarter companies have, by now, realized that generative models are a productivity multiplier, and like other productivity multipliers, having an open source ecosystem is key.

Just look at Google search. How dominant and powerful is Google search, putting much of the world's knowledge at the average person's finger tips? But it's all free*. Google doesn't charge you for web search. Of course, when it's free, you're the actual product, and that's how it will be for AI.
