Artificial Intelligence thread

diadact

New Member
Registered Member
China Telecom claims that it has trained a Llama-3.1 405B parameter model using just a domestic 10k+ GPU AI cluster. (When they say "wan ka" (万卡, literally "ten-thousand cards"), I'm not sure how big it is; it could be anywhere from 10k to 99k.)

Apparently, it has an MFU (model FLOPs utilization) of 43%.
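For context, MFU is the ratio of the FLOPs a training run actually consumes to the cluster's theoretical peak throughput. Here's a minimal sketch using the common rule of thumb that training costs ~6 FLOPs per parameter per token; all function names and numbers are illustrative assumptions, not figures from the China Telecom announcement:

```python
def mfu(params: float, tokens_per_sec: float,
        num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Estimate model FLOPs utilization for a training run.

    Assumes the common rule of thumb of ~6 FLOPs per parameter
    per token (forward + backward pass combined).
    """
    achieved_flops_per_sec = 6 * params * tokens_per_sec
    peak_flops_per_sec = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_sec / peak_flops_per_sec

# Illustrative only: a 405B-parameter model on a 10,000-GPU cluster
# with an assumed ~400 TFLOP/s peak per GPU would need to sustain
# roughly 708,000 tokens/s cluster-wide to hit 43% MFU.
```

Working backwards like this is also a quick sanity check on vendor MFU claims, since the peak-FLOPs and throughput numbers are usually public.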
They didn't train a new 405B model
They fine-tuned it on a dataset
Good for gaining experience in how to effectively utilize large domestic GPU clusters and for testing the stability of the cluster
They need to start building 100K H100 class GPU clusters if they want to stay competitive in 2025
Any news about the Ascend 910C?
Its production volume and performance?
 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member

Kling 1.5 has been released now and it looks great

They didn't train a new 405B model
They fine-tuned it on a dataset
Good for gaining experience in how to effectively utilize large domestic GPU clusters and for testing the stability of the cluster
They need to start building 100K H100 class GPU clusters if they want to stay competitive in 2025
Any news about the Ascend 910C?
Its production volume and performance?
Yes, they fine-tuned it. I was repeating what they said there, but I don't see why it wasn't obvious what I meant.

And just why do they need a 100k H100 GPU cluster to stay competitive? Alibaba just trained its latest Qwen 2.5 on 18 trillion tokens.

It got released just a few months after Qwen-2.0.

How many non-duplicate tokens are out there globally that you can actually use? How much larger do their clusters really need to get?
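To put the data-vs-compute question in rough numbers, here's a back-of-the-envelope sketch using the common ~6 * N * D total-training-FLOPs rule of thumb; every figure below is an illustrative assumption, not a number from the thread:

```python
def train_days(params: float, tokens: float, num_gpus: int,
               peak_flops_per_gpu: float, mfu: float) -> float:
    """Rough wall-clock days to train a model, assuming total
    training compute of ~6 * params * tokens FLOPs."""
    total_flops = 6 * params * tokens
    sustained_flops_per_sec = num_gpus * peak_flops_per_gpu * mfu
    return total_flops / sustained_flops_per_sec / 86400  # s per day

# Illustrative: a 405B model on 18T tokens at 40% MFU, on 100k GPUs
# with an assumed ~1 PFLOP/s peak each, finishes in roughly 12-13 days.
days = train_days(405e9, 18e12, 100_000, 1e15, 0.40)
```

The point of the exercise: once the token budget is fixed by available data, a bigger cluster mostly buys shorter wall-clock time, not a better model.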
 

antiterror13

Brigadier
I can give it twenty years and AI still wouldn't replace coding or programmers - at the end of the day you need someone to actually look through, test, and debug the output, which requires someone knowledgeable in writing and developing software.

If an AI is capable of self-developing, debugging, and testing software, it must be self-aware and actually understand what it's creating. It would therefore have ascended into an AGI, at which point we have bigger problems.

Certainly AI won't replace top-class coders (let's say the top 1%) within 10 years; however, AI is certainly able to replace >50% of coders now, it is slowly moving up, and it will get harder and harder.
 

PCK11800

Just Hatched
Registered Member
Certainly AI won't replace top-class coders (let's say the top 1%) within 10 years; however, AI is certainly able to replace >50% of coders now, it is slowly moving up, and it will get harder and harder.
Not even 5%.

Currently, even the latest and greatest LLMs (looking at you, ClosedAI o1) sometimes crash and burn even with simple standalone React components. They are completely and utterly useless at writing code for my legacy backend codebase. Unless I can fit the entire codebase into the context window, no LLM can contribute without me spoonfeeding it all the relevant context... which is like 90% of the work.

LLMs are immensely helpful and allow me to be much more productive; being able to ask specific coding questions and receive specific answers streamlines my development process enormously, but that still relies on knowing what questions to ask :)

That said, with future advancements, I would say cookie-cutter, front-end-only code monkeys are going to struggle, but more generalist developers should still be fine.

90% within 2 years
I have to ask, have you done any professional software development? Do you have a degree in Comp Sci, or write code for a living? Do you even know how to write code? Any projects, toy programs, or anything?
 