Artificial Intelligence thread

Bellum_Romanum

Brigadier
Registered Member
What is really sad is they are coming out with a $2000/mo subscription today for the "o3" model.

At this rate it will be cheaper just to hire a Jia Hind and have him use QwQ

https://www.reddit.com/r/OpenAI/comments/1higq81
Sam Altman thinks he's a big time PIMP and the gullible AI ignorant masses (myself included) are his LITERAL HO-ES that will be dumb enough to HO for his $$$ ridiculous subscription fee. Lol
 

9dashline

Captain
Registered Member
Sam Altman thinks he's a big time PIMP and the gullible AI ignorant masses (myself included) are his LITERAL HO-ES that will be dumb enough to HO for his $$$ ridiculous subscription fee. Lol
Google is currently offering Gemini 2.0 Thinking Experimental (an o1 competitor) for free to everyone... so this is a price-undercutting strategy, a loss leader to gain market share / mindshare; and Qwen is coming out with QvQ, a 72B vision reasoning model, soon

There is no moat

 

Overbom

Brigadier
Registered Member
OpenAI is cooking some good stuff
[Image: o-series-performance.jpg]
OpenAI's new o3 system - trained on the ARC-AGI-1 Public Training set - has scored a breakthrough 75.7% on the Semi-Private Evaluation set at our stated public leaderboard $10k compute limit. A high-compute (172x) o3 configuration scored 87.5%.
Despite the significant cost per task, these numbers aren't just the result of applying brute force compute to the benchmark. OpenAI's new o3 model represents a significant leap forward in AI's ability to adapt to novel tasks. This is not merely incremental improvement, but a genuine breakthrough, marking a qualitative shift in AI capabilities compared to the prior limitations of LLMs. o3 is a system capable of adapting to tasks it has never encountered before, arguably approaching human-level performance in the ARC-AGI domain.

Caveats
Passing ARC-AGI does not equate to achieving AGI, and, as a matter of fact, I don't think o3 is AGI yet. o3 still fails on some very easy tasks, indicating fundamental differences with human intelligence.

Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute (while a smart human would still be able to score over 95% with no training). This demonstrates the continued possibility of creating challenging, unsaturated benchmarks without having to rely on expert domain knowledge. You'll know AGI is here when the exercise of creating tasks that are easy for regular humans but hard for AI becomes simply impossible.
 