Artificial Intelligence thread

mossen

Senior Member
Registered Member
People obsess over the high end (Mythos, Opus, GPT-Pro).

But for most users, the Chinese ecosystem offers near-frontier performance at the price of the cheapest American models. Just compare Anthropic's cheapest model, Haiku, with MiMo V2.5 Pro: you get a 5x larger context window and nearly 2x the performance. And it's still somewhat cheaper!

For regular people, this fact is obscured because they tend to use monthly subscriptions. But there is a lot of room here for a Chinese frontier lab to offer a cheap monthly subscription plan that gives folks near-frontier performance and dramatically undercuts the big US labs on price. I don't think most people in the West understand the massive cost differential here, or how badly they are getting fleeced. For the Chinese players, it's a bit like the EV story again: the profits are made abroad.




meedicx

Junior Member
Registered Member
The big news today is that OpenAI missed its revenue and user targets.

The previous big use case for LLMs was chatbots, where OpenAI/ChatGPT was the leader and biggest beneficiary. However, since the end of 2025, chatbots have become "good enough" for most users. It no longer makes as much sense to buy a ChatGPT subscription when DeepSeek Instant offers amazing responses for free in most cases. Most users never need the most advanced knowledge or reasoning, so higher scores on benchmarks like MMLU-Pro or GPQA Diamond become irrelevant past a certain threshold.

ChatGPT's free-tier chat app experience is actually terrible compared to many free competitors (Gemini, DeepSeek, Doubao). So it's not surprising that ChatGPT's growth has hit a wall and they've missed their targets.

---

My hot take is that by this time next year, Anthropic will hit a growth wall just like OpenAI.

Now the trend has moved from chatbots to agentic coding / office work, where Anthropic/Claude is the current leader and biggest beneficiary. They've been growing super-fast, but other models are getting very close to the "good enough" threshold. Just like most normal consumers don't care about state-of-the-art MMLU-Pro scores, most coders and office workers won't care about performance on obscure programming languages. A model that is solid at basic data processing and CRUD app development is probably enough for 95% of companies.

I've been using DeepSeek v4 Pro heavily for coding and data processing. I've used over 200 million tokens and spent less than $4 in total. The results have been pretty good so far: not always perfect, but re-prompting with more information always fixes things.
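
To put those two figures in per-token terms, here is a minimal sketch of the arithmetic. It only uses the numbers quoted above (under $4, over 200 million tokens) and treats them as a single blended rate; the split between input, output, and cached tokens is not given, so the helper name and the blended treatment are assumptions for illustration.

```python
def cost_per_million_tokens(total_cost_usd: float, total_tokens: int) -> float:
    """Blended cost in USD per one million tokens (input and output lumped together)."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / (total_tokens / 1_000_000)


# Figures from the post: less than $4 spent on more than 200 million tokens,
# so this is an upper bound on the blended rate.
blended = cost_per_million_tokens(4.0, 200_000_000)
print(f"at most ${blended:.3f} per million tokens")
```

Because the spend is an upper bound and the token count a lower bound, the real blended rate is at or below this figure.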

If this trend keeps up, especially if Chinese memory/AI chips can drive costs down further, I think the growth wall will hit Claude as well by next year. Right now enterprises are in the naive stage and are chasing the new trend. Uber just burned through its 2026 budget in just 4 months using Claude. Companies run competitions / leaderboards on the number of tokens used instead of the value delivered. As the tech matures and the novelty wears off, these enterprises will start optimizing and be much smarter about the models they use and what they cost.

---

As an aside, my other hot take is that the trend will move from agentic coding to humanoids in 2 years, as the agentic coding threshold gets saturated.
 

playmaker1478

New Member
Registered Member
What about Cambricon, Biren, Moore Threads and all these other cards?

The whole point of all these cards is that they can do deep learning. If training is so hard, then companies are buying them at least to do inference.
This is definitely one of my main concerns with the GPGPU programming ecosystem as a whole. While I think Huawei may catch up to Nvidia by developing their own CUDA-like framework, the same cannot be said for other Chinese vendors. Multiple GPGPU companies writing their own CUDA-like frameworks instead of pooling efforts together will create a situation where no vendor comes out on top, and the market becomes more reliant on established players like Nvidia or Huawei. This may slow down the overall effort to develop an ecosystem that can break free from CUDA. I would love to be wrong, so if someone knows whether China is proposing an open framework for GPGPU programming, please let me know.

I do want to point out that the reason training is so hard isn't only because of software, but also due to hardware. Training requires broader kinds of fixed-function hardware and support for multiple data types that aren't usually available in inference cards, which in turn drives training accelerators to be more expensive to implement. In the last two years, I believe the Chinese market hasn't rolled out any dedicated training accelerator yet. I could be wrong, but they are mainly targeting inference right now due to market demand, and Huawei might roll out their own training accelerator soon.
 

lockedemosthenes1

New Member
Registered Member
I'm sorry, but I would recommend careful evaluation before commenting on cheap Chinese models, because Xiaomi's models are exactly the kind that cannot live up to their name and "score". They're benchmaxxed.
 

Wrought

Captain
Registered Member
20% of Meta's China ad revenue came from scams/pornography? Lool, is this for real? If true, then I don’t think anybody will miss that. lol

Then you need to think again. Zuckerberg personally intervened to keep the money flowing.

“As a result of Integrity Strategy pivot and follow-up from Zuck,” a late 2024 document notes, the China ads-enforcement team was “asked to pause” its work. Reuters was unable to learn the specifics of the CEO’s involvement or what the so-called “Integrity Strategy pivot” entailed.

But after Zuckerberg’s input, the documents show, Meta disbanded its China-focused anti-scam team. It also lifted a freeze it had introduced on granting new Chinese ad agencies access to its platforms. One document shows that Meta shelved yet other anti-scam measures that internal tests had indicated would be effective. The document didn’t detail the specifics of those measures.


Revenue is revenue, and it's not Facebook's problem if its users are dumb enough to fall for scams. They aren't the ones paying after all; just the ones handing over their personal data.

"People just submitted it. I don't know why. They 'trust me'. Dumb f***s."
 

Michael90

Senior Member
Registered Member
It should be their responsibility to implement the strictest measures to prevent their users from falling for scams. Obviously, being a private business, they care more about revenue/profits, which is normal, just as Alibaba, Pinduoduo, etc. have been tolerating scams and fake, counterfeit, and even hazardous goods on their platforms to prioritize revenue/profits. I don’t blame them; it’s the government's job to make sure companies adhere to rules and regulations. In this respect, I think only the EU is really trying to enforce them on tech companies.
Anyway, even if they lost that revenue, it wouldn’t be the end of the world. Facebook will continue its operations globally as before; they have no real presence in China anyway, so there's no real loss per se. They can still live without this scam ad revenue, though obviously it wouldn’t be ideal for them.
 

bsdnf

Senior Member
Registered Member
What I've heard is that Flash still uses NVIDIA for pre-training, while Ascend is used for inference and post-training. Pro only uses Ascend for inference, because the model is too large and unstable to train on Ascend, so they gave up on it.

The Ascend team released a video demonstrating post-training based on TorchTitan.

And yes, Ascend was only used for post-training in Flash; Pro's post-training is only on the TorchTitan Q2 roadmap.
 

Michael90

Senior Member
Registered Member
Still progress. We can’t expect China to catch up so soon. The country will need more years to get to that level as its chip industry and technology mature.
 

meedicx

Junior Member
Registered Member

1) 续训练 ("continued training") is continued pretraining (CPT), which is different from post-training, which refers to SFT/RL. In theory, the same code can be run for full pretraining if you have enough cards.

2) The second URL is TorchTitan support for Ascend. It's not clear that DeepSeek used this framework for training. They mentioned using TileLang in the paper, which would probably require another framework.

3) These CANN videos are half marketing for enterprise buyers. Advertising PyTorch CPT support is actually really useful for an enterprise, but may not be relevant for DeepSeek since they work directly at the kernel level.

For example, in agentic coding, if your code base is big, it can't all fit in the context even at 1M tokens, so the harness has to pick and choose the relevant parts. In theory, doing CPT on your code base to build a refined DSv4 model could give you a stronger and cheaper model than even the SOTA US ones for your specific use case. Supporting PyTorch / TorchTitan makes this process easier for enterprises using Ascend.
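
To make the "pick and choose" step concrete, here is a rough sketch of how a harness might select code under a token budget. This is not how any particular agent harness works; the `Chunk` type, the keyword-overlap score, and the 4-characters-per-token estimate are all hypothetical stand-ins (real harnesses typically use embeddings, code search, or a proper tokenizer).

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    path: str
    text: str


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def relevance(chunk: Chunk, query: str) -> int:
    # Toy score (assumption): number of query words that appear in the chunk.
    body = chunk.text.lower()
    return sum(1 for word in set(query.lower().split()) if word in body)


def select_context(chunks: list[Chunk], query: str, budget_tokens: int) -> list[Chunk]:
    """Greedily pick the most relevant chunks that still fit in the budget."""
    scored = sorted(((relevance(c, query), c) for c in chunks),
                    key=lambda pair: pair[0], reverse=True)
    picked, used = [], 0
    for score, chunk in scored:
        if score == 0:
            break  # everything after this point is irrelevant to the query
        cost = estimate_tokens(chunk.text)
        if used + cost <= budget_tokens:
            picked.append(chunk)
            used += cost
    return picked


repo = [
    Chunk("billing.py", "def charge_invoice(invoice): pass"),
    Chunk("auth.py", "def login(user): pass"),
    Chunk("invoice.py", "class Invoice: total invoice amount " * 50),
]
picked = select_context(repo, "fix invoice total", budget_tokens=20)
print([c.path for c in picked])
```

With a tiny budget, the most relevant file (`invoice.py`) is too large to include and only `billing.py` gets picked, which is the kind of trade-off that CPT on the code base would sidestep.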
 