Artificial Intelligence thread

Eventine

Junior Member
Registered Member
But why pay for Grok 3 when one can just use DeepSeek open source? One can adjust the censorship as well.
Not practical. The hardware requirements for running inference on the 671B model are astronomical, and there are no third-party providers for DeepSeek that have both uncensored the model and offer as generous a plan as Grok 3. A better provider may eventually emerge, but right now, Grok 3 is the superior service.
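To put a rough number on "astronomical", here is a back-of-the-envelope sketch of the memory needed just to hold 671B parameters at a few common precisions. It counts weights only (no KV cache, activations, or runtime overhead), and the 80 GB per-device figure is an assumption picked for illustration, not a statement about any particular deployment.

```python
import math

# Parameter count taken from the "671B" figure quoted in the post.
PARAMS = 671e9

# Bytes per parameter at common precisions (weights only).
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "fp8/int8": 1.0, "4-bit": 0.5}

# Assumption for illustration: accelerators with 80 GB of memory each.
DEVICE_GB = 80


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def min_devices(total_gb: float, device_gb: float) -> int:
    """Lower bound on device count; real setups need extra headroom."""
    return math.ceil(total_gb / device_gb)


for name, bpp in BYTES_PER_PARAM.items():
    gb = weight_gb(PARAMS, bpp)
    print(f"{name:>10}: {gb:7.1f} GB of weights, "
          f"at least {min_devices(gb, DEVICE_GB)} x {DEVICE_GB} GB devices")
```

Under those assumptions the weights alone come to roughly 1342 GB at 16-bit, 671 GB at 8-bit, and 335.5 GB at 4-bit, which is well beyond a single consumer GPU before any serving overhead is counted.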
 

Legume7

New Member
Registered Member
If I had to guess, it's because performing search requires a search engine, and signing a commercial contract with a search provider for English is probably not particularly high on the priority list for a research company like DeepSeek.

Anyway, now that the dust has settled a bit, a couple of observations:
  • The West is still at the top of the closed weights race.
    • The biggest winner in recent days, IMO, is Grok 3 from xAI. Not only is Grok 3 cheaper than any offering from Anthropic / OpenAI, but it is uncensored. In my testing, Grok 3 is really only worse than OpenAI at Deep Research. In everything else, it is equal or better. Elon Musk has a winner here as long as he keeps it uncensored and accessible. This is the real "OpenAI killer."
    • Claude 3.7 still dominates the practical coding domain, but that's about all it really dominates. It's a specialized model for engineers and companies looking to replace engineers. It isn't particularly superior at anything else and its high price means it's only relevant for enterprises and productivity users.
    • GPT-4.5 was largely a bust. Few people talk about it or have found great general use cases for it. OpenAI is being kept alive by Deep Research and their first mover advantage, since many enterprises originally signed with them. It's possible they'll turn it around with a new breakthrough, but with pricing as bad as it is, they're never going to be practical for the wider public.
    • Google still has a monopoly on the most generous closed AI service/API, but honestly hasn't improved its closed models that much since Gemini 2.0 Flash Thinking. They need to start delivering new things or risk falling behind other companies' frontier models.
    • Everybody else is basically irrelevant. Censorship - not political, but general - is emerging as a challenge for Chinese closed source solutions for casual AI users, as why would you go for a Tencent or Kimi or Baidu solution, when you can just get Grok 3 that has top tier performance and is uncensored and has a relatively cheap plan?
  • In the open weights landscape, the situation has gotten more diverse:
    • There have been almost a dozen open weights releases since DeepSeek blew the market open and raised the bar. Where previously the industry was increasingly closing its models in imitation of OpenAI and Anthropic, now there is stronger interest across the world in developing open weights models in pursuit of wide adoption.
    • DeepSeek 671B is still the best overall open weights model, but is increasingly being challenged by smaller models that are better at specific tasks, particularly coding, math, and general logical reasoning.
    • The advantage of smaller models like QwQ 32B, the new Mistral, Gemma 3, and Nvidia Nemotron is that they can actually be run locally - and far more cheaply than DeepSeek. This is an advantage given the market expectation of cheaper costs post-DeepSeek, and for many purposes, those smaller models perform competitively with DeepSeek.
    • On the services & API side, none of the open weights companies have a great offering, but QwQ and Gemma 3 probably stand out as the most user friendly.
    • DeepSeek either really dropped the ball on this one or was forced into dropping the ball by US denial of service attacks - the slow API & search still being broken a month after they said they'd fix it has cost DeepSeek's web service much interest & adoption. That's probably "okay" from DeepSeek's perspective but not great from the perspective of Chinese AI's global adoption, since Chinese closed source models are much less known and popular than DeepSeek outside of China.
In retrospect, DeepSeek was extremely wise to release their model as open source / open weights. On the AI as a service side, Chinese companies are operating at a fundamental disadvantage due to two factors: 1) the lack of a strong Chinese search engine company (due to Baidu being such a pathetic failure), which means Chinese companies have to pay Google, Bing, etc. and operate at the mercy of Western service providers, 2) the strict censorship on service-based solutions, either due to Chinese laws or moral concerns, which puts them at a disadvantage vs. companies like xAI that aren't afraid to go all the way.

Open source / open weights is the only way to get past these disadvantages, since the search capabilities can be provided by third party companies and the censorship can also be removed by those same third parties. If other Chinese AI companies haven't gotten the hint yet, this is the only path towards beating the West in market adoption under current conditions.

Grok 3 does not meaningfully outperform DeepSeek. It may be somewhat better, but that's with 100 times the computational resources and at least hundreds of human labelers. Even with all these advantages, DeepSeek is still way ahead in market share. The difference between DeepSeek and Grok or any USA model is not enough for the general public to care. In fact, DeepSeek narrowly trails OpenAI in market share, with everyone else not even in the running. That's not counting all of the third-party providers of DeepSeek, especially in China, where almost every company is using DeepSeek.

All USA labs are currently losing billions per year, including OpenAI, while DeepSeek is already profitable. There is simply no path to profitability for these labs unless they can produce some major breakthrough that both leaves DeepSeek in the dust in performance and significantly improves their costs. Sure, anything can happen, but I would put the probability at very low since the USA labs are coping and seething more than ever. That certainly doesn't exude confidence. Also, USA labs have not produced a single innovation in their entire existence that can meaningfully improve efficiency. On the other hand, DeepSeek has produced at least half a dozen, and they can scale compute since R1 barely used any.

Like I've said before, the USA needs to score a complete victory in AI or they lose and their stock market crashes. There is simply no justification for the current valuations of the major tech giants unless they can monopolize AI or achieve ASI, both of which are pipe dreams at this point. We already saw what happened to Tesla. Just the existence of a credible competitor to Tesla, let alone companies that surpass it, means that it can't be worth more than every car company combined. As to DeepSeek's API, they've said many times their only goal is AGI. The API is not their main priority, and they've rejected all outside investment. Even then, they are the only major AI lab that's profitable.
 

OptimusLion

Junior Member
Registered Member
Tencent Hunyuan releases its self-developed deep thinking model T1: fast output, replies within seconds, and good at processing ultra-long texts

Tencent Hunyuan today released its self-developed deep thinking model T1, which generates output quickly, responds within seconds, and handles very long texts well while showing strong reasoning capabilities. In multiple public benchmark tests, T1's scores are among the industry leaders, especially in long text reasoning. Through large-scale reinforcement learning, combined with targeted optimization for problems in mathematics, logical reasoning, science, and code, the official version of Hunyuan T1 further improves its reasoning ability.

In common benchmarks that reflect the basic capabilities of reasoning models, such as MMLU-Pro, an enhanced evaluation dataset for large language models, Hunyuan T1 scored 87.2 points, second only to o1. In public benchmark tests of Chinese and English knowledge and of competition-level mathematics and logical reasoning, such as CEval, AIME, and Zebra Logic, Hunyuan T1's performance has also reached the level of industry-leading reasoning models.

"T1" also demonstrated very strong adaptability in multiple alignment tasks, instruction following tasks, and tool utilization tasks.

