Miscellaneous News

FairAndUnbiased · Jan 25, 2025

Eventine said:
You cannot run the full DeepSeek model on a Raspberry Pi. It is mathematically impossible to run a 671-billion-parameter model on anything less than a cluster of workstation GPUs.

What you're talking about is running the distilled versions of the model, which are significantly weaker than the full R1 and, while still impressive, are more for hobby developers than enterprise users.
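As a rough back-of-the-envelope check of the memory claim above, here is a minimal sketch. It counts only the bytes needed to hold the weights and ignores activations and cache, so it is a lower bound. The 671 billion figure comes from the post; the 8 GB device RAM and the choice of bit widths are assumptions for illustration, not numbers from the thread.

```python
PARAMS = 671e9  # parameter count quoted in the post

# Assumption for illustration only: a small single-board computer with 8 GB of RAM.
ASSUMED_DEVICE_RAM_GB = 8


def weight_memory_gb(params, bits_per_param):
    """Memory needed just to store the weights, in GB (10^9 bytes)."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    # Bit widths are hypothetical storage formats, chosen to show the trend.
    for bits in (16, 8, 4):
        need = weight_memory_gb(PARAMS, bits)
        ratio = need / ASSUMED_DEVICE_RAM_GB
        print(f"{bits}-bit weights: {need:.1f} GB, about {ratio:.0f}x the assumed device RAM")
```

Even at an aggressive 4 bits per parameter, the weights alone come to hundreds of gigabytes, which is why only the much smaller distilled models fit on hobby hardware.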

The real benefit of DeepSeek is the ~20x reduction in enterprise API costs (according to benchmarks - DeepSeek generates more thinking tokens, so it costs a bit more than it might seem just looking at cost/token vs. o1). That is a consequence of several factors, of which cheaper training costs are only a small contributor.

It took Meta ~$60 million to train Llama 3.1. But how much money do you think they spent on engineering resources within the company? A team of ~200 researchers/engineers costs >$250 million a year for a Silicon Valley company to keep around. Add that on top of all the GPUs they had to buy to build up infrastructure, the support staff, the building costs, and we're talking a billion dollars a year for a generative AI team. It is well known that OpenAI pays its researchers $1 million a year. Compared to the training costs - estimated at ~$100 million for o1 - this is a much larger expense.
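The arithmetic behind that comparison can be sketched as below. All dollar figures are the post's own rough estimates, not verified numbers, and treating the one-off training run as if it fell inside a single year of the program budget is a simplifying assumption. The function and constant names are made up for illustration.

```python
# Figures as quoted in the post (rough estimates, not verified).
TEAM_SIZE = 200                 # ~200 researchers/engineers
TEAM_COST_PER_YEAR = 250e6      # >$250 million a year, used here as a lower bound
TRAINING_COST = 60e6            # ~$60 million to train Llama 3.1
PROGRAM_COST_PER_YEAR = 1e9     # "a billion dollars a year" for the whole team


def cost_per_head(team_cost, team_size):
    """Average yearly cost per team member implied by the totals."""
    return team_cost / team_size


def share(part, whole):
    """Fraction of `whole` accounted for by `part`."""
    return part / whole


if __name__ == "__main__":
    per_head = cost_per_head(TEAM_COST_PER_YEAR, TEAM_SIZE)
    training_share = share(TRAINING_COST, PROGRAM_COST_PER_YEAR)
    print(f"Implied cost per engineer: ${per_head:,.0f} per year")
    print(f"Training run as a share of yearly program cost: {training_share:.0%}")
```

On these numbers the implied cost is $1.25 million per engineer per year, and the training run is a single-digit percentage of the yearly program cost, which is the point the post is making.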

That's the thing that DeepSeek was able to get around, not just by significantly reducing model training costs, but through superior value per unit of currency spent, which is shared by all Chinese companies relative to the West. The cost to train a model is just a fraction of the cost it takes to maintain a viable LLM product. It's everything else where the cost difference makes the most impact, and that's also why I think, longer term, China's real advantage won't be from algorithmic innovations - which will be quickly copied - it'll be from structural advantages, which are far more sustainable.

If it's labor cost then how come the same Indians that are H1Bs in the US can't do it in India?

Eventine · Jan 25, 2025

FairAndUnbiased said:
If it's labor cost then how come the same Indians that are H1Bs in the US can't do it in India?

It's not just labor cost. It's labor cost + quality. Scientific talent in India is weak - not a single Indian university is in the top 100 institutions. Their best university ranks like, 118th?

China has both the talent and the cost advantage, while the US has the talent but not the cost advantage. That's how China is able to make structural plays like charging 1/20th the API cost for comparable performance.

tygyg1111 · Jan 25, 2025

Fatty said:
View attachment 144351
Wow, so the NIH grant pause might be an even bigger deal than I thought. The new admin is actually going full anti-science. The odd thing is that NIH is actually pretty bipartisan because there are a lot of conservative scientists too… wonder if this is Musk’s doing…

https://twitter.com/i/web/status/1882893055222837321

Cue the quote "The best way to stop a civilization's progress is to kill their science". This is the nth example of "Do nothing, win" in 2025.

tygyg1111 · Jan 25, 2025

GulfLander said:
https://twitter.com/i/web/status/1882911949723447782

Trump! Trump! Trump!

FairAndUnbiased · Jan 25, 2025

Eventine said:
It's not just labor cost. It's labor cost + quality. Scientific talent in India is weak - not a single Indian university is in the top 100 institutions. Their best university ranks like, 118th?

China has both the talent and the cost advantage, while the US has the talent but not the cost advantage. That's how China is able to make structural plays like charging 1/20th the API cost for comparable performance.

But these Indians are working as H1Bs in the US so they can at least work under direction.

dingyibvs · Jan 25, 2025

Eventine said:
You cannot run the full DeepSeek model on a Raspberry Pi. It is mathematically impossible to run a 671-billion-parameter model on anything less than a cluster of workstation GPUs.

What you're talking about is running the distilled versions of the model, which are significantly weaker than the full R1 and, while still impressive, are more for hobby developers than enterprise users.

The real benefit of DeepSeek is the ~20x reduction in enterprise API costs (according to benchmarks - DeepSeek generates more thinking tokens, so it costs a bit more than it might seem just looking at cost/token vs. o1). That is a consequence of several factors, of which cheaper training costs are only a small contributor.

It took Meta ~$60 million to train Llama 3.1. But how much money do you think they spent on engineering resources within the company? A team of ~200 researchers/engineers costs >$250 million a year for a Silicon Valley company to keep around between compensation and benefits. Add that on top of all the GPUs they had to buy to build up infrastructure, the support staff, the building costs, and we're talking a billion dollars a year for a generative AI team. It is well known that OpenAI pays its researchers $1 million a year, so they're probably paying even more.

That's the thing that DeepSeek was able to get around, not just by significantly reducing model training costs, but through superior value per unit of currency spent, which is shared by all Chinese companies relative to the West. The cost to train a model is just a fraction of the cost it takes to maintain a viable LLM product. It's everything else where the cost difference makes the most impact, and that's also why I think, longer term, China's real advantage won't be from algorithmic innovations - which will be quickly copied - it'll be from structural advantages, which are far more sustainable.

Stop spewing nonsense. Salary is only a small part of the cost.

tygyg1111 · Jan 25, 2025

Bellum_Romanum said:
Why are the white American supremacist and muh innovation champs worried when they can simply replace the unthinking and copy paste Chyna with the true geniuses of our time: the Indians!!

Although, some Indians get it:

https://twitter.com/i/web/status/1883190269660778591

siegecrossbow · Jan 25, 2025

State Department propaganda went to 0 overnight. Decimated. This is what total devastation looks like.

TK3600 · Jan 25, 2025

Temstar said:
It's an interesting question being discussed in China: when Wang Yi said "好自为之" to Rubio, how would you translate that into English? Some suggestions from DeepSeek:
[Attachments 144294, 144293, 144295, 144296, 144299]
Me, I'm going with "check yo self before you wreck yourself".

It is a polite way of saying "fuck around and find out" in American English.

tygyg1111 · Jan 25, 2025

jiajia99 said:
A concept that I am going to enjoy watching China destroy bit by bit in the next few years. May China free to world from this Anglo Zionist mother friggers once and for all and do a lot of damage to the brains of these idiots so that they never recover. Never has there been a class of people that deserve to absolutely suffer, their entitlement is beyond the pale

Genghis Khan's hordes ride again.... through the minds of westoids
