Artificial Intelligence thread

Hyper · Jun 27, 2024

PopularScience said:
https://twitter.com/i/web/status/1798776057304326458

DeepSeek and Claude 3.5 are missing. It's an incomplete benchmark.

Eventine · Jun 28, 2024

Because that list covers open-source LLMs only. Claude 3.5 is not open source, nor are ChatGPT, Gemini, etc., which are significantly more powerful than any open-source model. Not sure about DeepSeek.

tphuang · Jun 28, 2024

This is a logical comparison. 70B open-source LLMs that are available for everyone to use are something an open-source provider can compare pretty well. And it's a good thing for Hugging Face to recommend.

tokenanalyst · Jun 30, 2024

Alibaba Cloud showcases its self-developed network design for large language model training

On June 29, Alibaba Cloud announced an Ethernet network design built specifically for the ultra-large data transfers involved in training large language models (LLMs). The design has already been in production use for 8 months.
Alibaba Cloud chose Ethernet out of a desire to avoid over-dependence on a few suppliers and to leverage the "power of the entire Ethernet Alliance to achieve faster development." The decision also fits a broader trend: more and more manufacturers are beginning to support Ethernet and escape Nvidia's NVLink monopoly on cloud AI interconnection.

Alibaba's Ethernet networking plans were revealed on the GitHub page of Ennan Zhai, a senior engineer and network researcher at Alibaba Cloud. Zhai published a paper that will be presented in August at the SIGCOMM conference, the annual gathering of the Association for Computing Machinery's Special Interest Group on Data Communication.

The paper, titled “Alibaba HPN: A Datacenter Network for Large-Scale Language Model Training,” begins by noting that cloud computing traffic “…generates millions of small flows (e.g., less than 10 Gbit/s)” while large language model training “generates a small amount of periodic, bursty traffic on each host (e.g., 400 Gbit/s)”.
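To make the contrast in that quote concrete, here is a rough sketch (not from the paper) of why hashing flows onto links works well for many small flows and poorly for a handful of large ones. The link count, flow counts, and the SHA-256-based `link_for` helper are all assumptions chosen for illustration; real switches hash packet header fields in hardware.

```python
import hashlib
from collections import Counter

def link_for(flow_id: int, n_links: int) -> int:
    """Hypothetical stand-in for a switch hashing a flow onto one of n_links."""
    digest = hashlib.sha256(str(flow_id).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_links

def imbalance(n_flows: int, n_links: int) -> float:
    """Flow count on the busiest link divided by the ideal even share."""
    load = Counter(link_for(f, n_links) for f in range(n_flows))
    return max(load.values()) / (n_flows / n_links)

# Many small flows (cloud-style traffic) vs. a few large ones (LLM training).
many_small = imbalance(100_000, 8)
few_large = imbalance(12, 8)
print(f"many small flows: {many_small:.2f}x, few large flows: {few_large:.2f}x")
```

With 100,000 flows the busiest link ends up close to its fair share. With 12 equal-sized flows on 8 links, at least one link must carry two or more of them, so it runs at no less than 1.33x the even share however the hash falls.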

Equal-cost multipath routing is a commonly used method for sending packets to a single destination over multiple paths, but it is prone to hash polarization, a phenomenon that makes load balancing difficult and significantly reduces available bandwidth.
Alibaba Cloud’s home-grown alternative, called High Performance Network (HPN), “avoids hash polarization by reducing the presence of ECMP, while also greatly reducing the search space for path selection, allowing us to accurately select network paths that can accommodate large traffic flows.”
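For anyone unfamiliar with hash polarization, a minimal sketch of the effect follows. It is not Alibaba's code and says nothing about how HPN itself picks paths; it only assumes two switch tiers that each choose among 4 uplinks by hashing the flow, with a hypothetical `seed` parameter standing in for per-switch hash configuration.

```python
import hashlib

def ecmp_pick(flow_id: int, n_paths: int, seed: int) -> int:
    """Pick an uplink by hashing the flow id (stand-in for a 5-tuple hash)."""
    key = f"{seed}:{flow_id}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

def uplinks_used_at_tier2(n_flows: int, n_paths: int, tier2_seed: int) -> int:
    """Count the distinct tier-2 uplinks used by flows that tier 1 sent to uplink 0."""
    tier1_seed = 0
    arrived = [f for f in range(n_flows) if ecmp_pick(f, n_paths, tier1_seed) == 0]
    return len({ecmp_pick(f, n_paths, tier2_seed) for f in arrived})

# Same hash on both tiers: every flow reaching this tier-2 switch hashes alike.
polarized = uplinks_used_at_tier2(10_000, 4, tier2_seed=0)
# Different seed at tier 2: the flows spread over the uplinks again.
independent = uplinks_used_at_tier2(10_000, 4, tier2_seed=1)
print(polarized, independent)
```

When both tiers use the same hash, the flows arriving at the second switch are exactly the ones that already produced the same hash result, so they all pile onto one uplink and the other three sit idle. Varying the seed per tier is one common mitigation; the article says HPN instead reduces how much the network relies on ECMP in the first place.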

HPN also addresses the fact that GPUs need to work synchronously when training large language models, which makes AI infrastructure sensitive to single points of failure—especially top-of-rack switches.
As a result, Alibaba's network design uses a pair of switches—but not in the stacked configuration recommended by switch vendors.

tphuang · Jul 3, 2024

Moore Threads boasted about its KUAE GPU clusters.
They can now be scaled to tens of thousands of cards and provide 10+ EFLOPS of total compute.

They can train continuously for 15+ days.

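Moore Threads will launch three 10,000-card cluster projects: the Qinghai Zero-Carbon Industrial Park 10,000-card cluster project, the Qinghai Plateau KUAE 10,000-card cluster project, and the Guangxi ASEAN 10,000-card cluster project.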

they have 3 such clusters, 2 in Qinghai and 1 in Guangxi

sunnymaxi · Jul 3, 2024

China is now home to more than 1/3 of the world's 1,328 AI large language models and 15% of nearly 30,000 AI enterprises worldwide, according to a whitepaper released at the Global Digital Economy Conference 2024 in Beijing on Tuesday.

[…] will develop more than 50 new national and industrial standards for […] by 2026 to facilitate the high-quality development of the AI industry. This goal is part of guidelines on standardizing systems for the AI industry that were jointly issued by four State government agencies yesterday.

The country also aims to participate in the formation of more than 20 international AI standards by 2026 to promote the development of the global AI sector, according to the guidelines. Furthermore, China aims to have more than 1,000 companies adopt and advocate for these new standards.

Taiban · Jul 3, 2024

China is far ahead of other countries in generative AI inventions like chatbots, filing six times more patents than its closest rival the United States - a quarter of them were filed in 2023 alone.

tphuang · Jul 4, 2024

Beijing aims to create an AI-native city in 2 years with 45 EFLOPS of computation power to be installed by 2025

tphuang · Jul 4, 2024

Huawei Cloud is focusing on corporate customers and leaving the GenAI foundational models to other hyperscalers

Talks about applying its Pangu model to improve efficiency at Baosteel plant

caudaceus · Jul 4, 2024

tphuang said:
this is a well-known fact: they joined Ascend early on after they got entity-listed.

https://twitter.com/i/web/status/1806332328039297403

so despite the fact that Kling is in China and very few people access it, somehow we get far more Kling clips than Sora. Where is Sora btw?

Rumour is that sama prefers to exclusively give it to Hollywood / Media complex
