Artificial Intelligence thread

Eventine

Junior Member
Registered Member
Stalled because there was one round of mediocre model releases in a fast-moving industry? Come on. The difference between what we have today with o1, o3-mini, Claude 3.7, DeepSeek, and Grok 3, and what we had two years ago with GPT-4, is night and day. Reasoning models are the most recent breakthrough, and they arrived less than a year ago. DeepSeek's performance breakthroughs came less than a month ago. This speed of innovation is rare in any industry, and we shouldn't assume it has "stalled" just because nobody has made a breakthrough in the last month.

Also, if you haven't already, go try the Sesame demo. That's a new frontier that was opened just yesterday in the open-source world.
 

Legume7

New Member
Registered Member
Sesame is very impressive, but it is not a frontier model. It is in a different category from the models I referred to. "Frontier model" does not mean opening up a new frontier; it means being at the frontier of performance. No US lab has made a breakthrough since o1 came out 6 months ago. Even o1 was unprofitable, and I already mentioned how OpenAI deemed o3 so commercially infeasible that they had to cancel it. GPT-4.5, the only other product they released, is also a failure. You can run the same exercise for any US lab. This absolutely qualifies as "stalling."

Of course, it's possible that one of the big US labs makes a crazy breakthrough. But the level of breakthrough needed to justify their existing valuations is so astronomical that I consider it very unlikely (though not impossible). The main reason is their attitude: they are still obsessed with stacking more and more GPUs to brute-force their way to AGI, when it is clear that avenue isn't effective anymore (see GPT-4o to GPT-4.5, and o1 to o3).

On the contrary, last month both DeepSeek and Kimi independently worked out how to use RL alone to improve reasoning. DeepSeek also made numerous breakthroughs in infrastructure, and showed today that it is profitable beyond a shadow of a doubt.
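
For anyone wondering what "RL alone" means in practice: the published idea (DeepSeek's R1 paper and Moonshot's k1.5 report) is to skip supervised reasoning traces and reward the model purely on whether its final answer verifies. Below is a minimal Python sketch of the group-relative scoring step; sample_fn and verify_fn are hypothetical placeholders, not anyone's actual training code.

Code:
import statistics

def group_relative_advantages(prompt, sample_fn, verify_fn, group_size=8):
    """Toy sketch of outcome-reward RL for reasoning (GRPO-style idea)."""
    # Sample a group of candidate solutions for the same problem.
    samples = [sample_fn(prompt) for _ in range(group_size)]
    # Reward is purely outcome-based: 1 if the final answer verifies, else 0.
    rewards = [1.0 if verify_fn(prompt, s) else 0.0 for s in samples]
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # avoid dividing by zero
    # Group-relative advantages drive the policy-gradient update;
    # no separate learned value model is needed.
    return [(s, (r - mean) / std) for s, r in zip(samples, rewards)]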
 

siegecrossbow

General
Staff member
Super Moderator
It will go to CHAD-Xi-PT.
From Hohhot in northern China’s Inner Mongolia Autonomous Region to Guangzhou in the southern Guangdong province, cities across China are rolling out DeepSeek-powered AI within cloud platforms to automate governance, handling everything from administrative paperwork to public service requests.

 

9dashline

Captain
Registered Member
Coping and seething from an OpenAI research scientist:
Every American lab has shown its hand post-R1, and none has managed to convince. At this point, the entire $10T+ house of cards is going to start coming down once R2 is released in March/April (likely April).

OpenAI is screwed the most. Their largest investor is Microsoft, whose largest source of revenue is Azure (cloud). Azure is already serving DeepSeek V3/R1 and will have to implement DeepSeek's latest optimizations as a matter of basic market economics. But doing so would increase DeepSeek's market share and damage the perception of Microsoft as an AI leader, which is what its ~$3T valuation is based on. GPT-4.5 is a flop, and OpenAI was caught lying on the model card multiple times. First, they compared GPT-4.5 to o1-mini but referred to it as the full o1. They also claimed a 10x increase in training efficiency, but removed the claim once people pointed out it was incompatible with their token costs. o3 was so computationally expensive ($3,000 per query at the highest thinking time) that they cannot release it as a standalone model and will instead fold it into GPT-5. They were also caught cheating on the FrontierMath benchmark last month.
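
The token-cost point is a simple back-of-envelope: inference compute per token scales roughly with active parameter count, so API price per token is a rough proxy for model size and serving cost. A sketch with illustrative prices (placeholders, not exact published figures):

Code:
# Back-of-envelope: per-token price as a rough proxy for per-token compute.
# Prices are illustrative placeholders (USD per million input tokens).
price_per_m_tokens = {"gpt-4o": 2.5, "gpt-4.5": 75.0}

ratio = price_per_m_tokens["gpt-4.5"] / price_per_m_tokens["gpt-4o"]
print(f"GPT-4.5 is priced ~{ratio:.0f}x higher per token than GPT-4o")
# A model roughly 30x costlier to serve points to far more compute per token,
# which is hard to square with a headline 10x training-efficiency claim.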

Anthropic is in a similar situation. Claude 3.7 is impressive, but too expensive for its performance. I assume they are also bleeding billions per year. Their biggest investor is Amazon, and the point about Azure and OpenAI applies equally to AWS and Anthropic. Also, Anthropic's CEO Dario Amodei claimed that DeepSeek smuggled GPUs and was lying about its numbers. He's now caught with his pants down given today's release. He also has a history of sketchy behavior: he admitted that he discovered scaling laws while working at Baidu, but when he moved to OpenAI and published a paper on scaling laws, he didn't cite Baidu's work. The mythology that the founders of OpenAI and Anthropic (who are ex-OpenAI) discovered scaling laws is one of the reasons they are still perceived to be leading in AI.

Meta's generative AI group will also throw in the towel soon. Llama 4 was delayed because of R1, and if they don't get it out before R2, it's game over. Even if they do, it's unlikely to outperform R1. Their ads team is already using DeepSeek instead of Llama.

xAI and Google will stay around longer because of their comparatively larger financial resources. They are willing to bleed money to serve their models at low cost. Gemini isn't impressive, and Grok barely outperformed DeepSeek on benchmarks it was specifically trained for. Also, while all American labs are bleeding cash, I suspect xAI is bleeding the most relative to its market share. xAI has the largest cluster and training costs, and nowhere near the most users. I also have inside information that xAI is paying many people thousands of dollars per week to solve math/coding problems so their solutions can be used to train Grok. This isn't factored into the publicly revealed costs, and it also means their algorithmic improvements are quite poor. I don't see how they can scale this up either.

Nvidia is also screwed. Jevons paradox doesn't apply when existing GPUs can meet all inference needs. They will still have a market for training, but even then, the only major purchasers of Blackwell are the above five companies, the exact five companies that are bleeding billions while still struggling against DeepSeek.

For Chinese companies, as mentioned earlier in this thread, four companies are alive in the race for AGI: DeepSeek, Moonshot (makers of Kimi), ByteDance, and Alibaba. These are the only labs with reasoning models. All of them have a positive outlook, and it's worth noting that Alibaba is the largest investor in Moonshot. On the other hand, the head of Qwen left for ByteDance, which suggests that ByteDance has more resources and/or is a more desirable place to work. There will also be a Qwen release next week.


Of course, other Chinese companies have very strong models for specialized tasks, but I'm only referring to AGI, since China dominates those categories already.
Heard DeepSeek R1.5 is dropping next week, as early as Monday. And yeah, R2 will come way earlier than May. Probably mid-to-late March is my guess.
 

9dashline

Captain
Registered Member
Legume7 said:
Sesame is very impressive, but it is not a frontier model. [...] No US lab has made a breakthrough since o1 came out 6 months ago. Even o1 was unprofitable, and I already mentioned how OpenAI deemed o3 so commercially infeasible that they had to cancel it. [...] This absolutely qualifies as "stalling."
The moment they announced o3, with no source code (for an "Open"AI company), no API, no chat release, and not even a demo, just bought-and-paid-for benchmarks, I knew it was over.

Spending $1,000 on a task that a human can do for $10 is not on the right side of AI progress...

And now 4.5 just seals their fate... not as smart as R1 but costs nearly 50x more.

Sora, o3, 4.5: each a bigger letdown than the last.

OpenAI is done for, I'm calling it now.
 

OptimusLion

Junior Member
Registered Member
China's first photonic AI intelligent engine was born in Nanjing


Nanzhi Optoelectronics, together with partners such as Zhiman Technology, has successfully developed China's first photonic AI intelligent engine, "OptoChat AI", and has completed internal testing. The engine is built on top domestic large language models such as DeepSeek and on semiconductor industry models, backed by more than 300,000 patent documents and industry databases, and aims to provide new technical support for research and industrial institutions worldwide and to build a "new link" for the development and production of photonic chips.

It is reported that OptoChat AI will be officially launched in March and opened to the public for free. It will provide a fully open architecture to attract partners to build on it, free of charge for domestic and foreign industry users. In the future, the Nanzhi Optoelectronics team will also build an industry-university-research data-sharing network to pool industry knowledge graphs, help research institutions, enterprises, and universities collaborate and innovate, and accelerate the transformation of technological achievements and the integration of the industrial chain.

 

iewgnem

Junior Member
Registered Member
9dashline said:
The moment they announced o3, with no source code (for an "Open"AI company), no API, no chat release, and not even a demo, just bought-and-paid-for benchmarks, I knew it was over. [...] OpenAI is done for, I'm calling it now.
The fact that they even released 4.5 is in itself another data point. There's absolutely no benefit to releasing 4.5 as it is, and as we can see there has been massive blowback over its pricing. The only reason OAI would release it is if they are extremely desperate to release something, anything.
 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member

So the things I've heard about happening in Western countries, like the automation of white-collar jobs, look like they are starting to take place in China in a big way, despite basically nothing happening on that front until recently.

AI ecosystem development and societal adoption are never talked about.
 

european_guy

Junior Member
Registered Member
9dashline said:
[...] And now 4.5 just seals their fate... not as smart as R1 but costs nearly 50x more. Sora, o3, 4.5: each a bigger letdown than the last. OpenAI is done for, I'm calling it now.

Andrej Karpathy is a famous AI researcher and ex-OpenAI. Commenting on the GPT-4.5 release, he wrote:

GPT4.5, which I had access to for a few days, and which saw 10X more pretraining compute than GPT4

We can assume he is a reliable source: (1) he knows what he is talking about, and (2) he is an OpenAI insider.

Due to the well-known LLM scaling laws (which, by the way, were empirically established by OpenAI themselves around 2020), we can assume they trained a correspondingly bigger model than 4.0, i.e. GPT-4.5 should be roughly 10x bigger than 4.0.

The fact that 4.5's benchmarks didn't skyrocket compared to 4.0 is the key outcome.

Before OpenAI, Anthropic also trained both Sonnet 3.5 and the much bigger Opus 3.5, but they never released the latter. Why? Because the performance improvement over Sonnet was small and did not justify a much bigger model with much bigger running costs.

What we are experiencing is that simply scaling up pre-training data and model size has hit a wall: you need an order of magnitude more resources to get only marginal gains. This was absolutely not the common understanding just 2-3 years ago.
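
To make the "order of magnitude more resources for marginal gains" point concrete: published scaling-law fits have loss falling as a small power of compute, so each extra 10x of compute buys a shrinking absolute improvement. A toy illustration in Python (the constants are illustrative, not OpenAI's actual fit):

Code:
# Toy power-law loss curve, L(C) = L_inf + a * C**(-alpha).
# Constants are illustrative only; real fits differ per lab and dataset.
L_INF, A, ALPHA = 1.7, 2.0, 0.05

def loss(compute):
    return L_INF + A * compute ** (-ALPHA)

for c in [1, 10, 100, 1000]:  # compute in arbitrary units (previous model = 1)
    print(f"{c:>5}x compute -> loss {loss(c):.3f}")
# Each additional 10x of compute shaves off a smaller slice of loss,
# which is why "just train a 10x bigger model" no longer moves benchmarks much.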

This new reality will have deep consequences, and it is yet another lucky break for China: the US cannot simply brute-force the Chinese competition out of the game.

To keep seriously improving performance you need something new: either a new training paradigm, like the new "reasoning model" trend, which boils down to an innovative post-training recipe, or a new LLM architecture, different from the Transformer family and its sub-variants on which all current frontier models are based.
 