Artificial Intelligence thread

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
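"The boom in large models has set off massive demand for computing power. The Lingang New Area proposes that by 2025 its total computing power will exceed 5 EFLOPS (FP32), AI computing will account for 80% of that, and the overall scale of its computing industry will surpass 10 billion yuan."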
Even more on top of that. Lingang region alone is going for 5 EFLOPS of FP32 by 2025!

Keep in mind each A100 does 19.5 TFLOPS of FP32 and 312 TFLOPS of FP16 (with Tensor Cores).

So a capacity figure quoted in FP32 implies a lot more hardware than the same figure quoted in FP16 or INT16. Of course, the ratio differs from GPU to GPU.
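
To put the 5 EFLOPS target in perspective, here is a rough back-of-the-envelope sketch using the A100 peak figures quoted above. It assumes peak throughput with no utilization losses and a fleet made up only of A100s, which is not something the article states. The function name is purely illustrative.

```python
# Rough sizing: how many A100-class GPUs would a capacity target imply?
# Assumes peak (not sustained) throughput; real clusters deliver less.

A100_TFLOPS = {"fp32": 19.5, "fp16": 312.0}  # peak figures quoted in the post


def gpus_needed(target_eflops: float, precision: str) -> float:
    """Number of A100s whose combined peak equals target_eflops."""
    target_tflops = target_eflops * 1_000_000  # 1 EFLOPS = 1e6 TFLOPS
    return target_tflops / A100_TFLOPS[precision]


fp32_count = gpus_needed(5.0, "fp32")
fp16_count = gpus_needed(5.0, "fp16")

print(f"5 EFLOPS at FP32: ~{fp32_count:,.0f} A100s")
print(f"5 EFLOPS at FP16: ~{fp16_count:,.0f} A100s")
print(f"ratio: {fp32_count / fp16_count:.0f}x")
```

On these assumptions, the same 5 EFLOPS headline implies about 16 times as many A100s when it is an FP32 number as when it is an FP16 number.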

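"To date, the New Area has 8 data centers, including the SenseTime Artificial Intelligence Computing Center (AIDC), Youfu Network (有孚网络) and Xinxi Feiyu (信息飞鱼), with total computing power above 3 EFLOPS (FP32), of which nearly 80% is intelligent (AI) computing. That total is close to 20% of Shanghai's computing capacity. The New Area's computing industry has also established a presence in upstream hardware and software, midstream data centers and scheduling platforms, and downstream applications."
Currently, between SenseTime, Youfu Network, Feiyu and 5 other data centers, there are over 3 EFLOPS of FP32 in this area. That is nearly 20% of all computing power in the Shanghai area.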

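"At present, SenseTime's AIDC in Lingang has been connected to the public computing service platform. He said he hopes the participating organizations, especially the telecom operators, will take Lingang's network characteristics into account and actively build an ultra-high-speed computing bearer network in Lingang, making computing power a public service like water and electricity."
A while back, Shanghai formed its AI data exchange as a public computational service platform. SenseTime is already part of that. The goal is to make computation a public utility like water and electricity.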

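"At the conference, the China Telecom Lingang Public Intelligent Computing Service Platform and Domestic GPU Joint Innovation Base was officially launched. China Telecom has established Lingang Computing Power (Shanghai) Technology Co., Ltd. (临港算力(上海)科技有限公司) to move quickly on building the Lingang computing park, and it will deploy, in batches, 40,000 high-power racks suited to intelligent computing and supercomputing, providing public intelligent computing services to enterprises in Lingang, Shanghai and the Yangtze River Delta."
Now for the China Telecom part: it will use domestic GPUs in its new base of 40k high-power racks.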

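"That day, SenseTime was named a 'chain leader' enterprise of the New Area's intelligent computing industry chain and will take part, based on its AIDC, in the coordinated integration and clustering of that industry chain in Lingang. Xu Li said that SenseTime's AIDC currently has close to 30,000 GPUs, with phase-one computing power exceeding 5,000 PFLOPS, and can support the simultaneous training of 20 ultra-large models with hundreds of billions of parameters. 'We believe that in the future we can offer better developer efficiency and support the training of even more large models at the hundred-billion-parameter scale.'"

"SenseTime has built SenseCore, a large-scale general artificial intelligence (AGI) infrastructure combining software and hardware, and on that foundation it has constructed the 'SenseNova' (商汤日日新) large model system. While advancing its own AGI strategy, it also provides the industry with large model algorithm services, training and inference optimization, and data services. As of May this year, SenseCore had served more than 40 core customers in total, over 10 of them large model customers, covering frontier fields such as intelligent driving, biopharmaceuticals, chip design, smart commerce and university research, and it had delivered large models in more than 20 deployment scenarios."
SenseTime's current AIDC has almost 30k GPUs with over 5 EFLOPS (5,000 PFLOPS) of computation in phase 1, and it can support training 20 large models with hundreds of billions of parameters at the same time. They believe they can build it out to support even more.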
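
As a sanity check on which precision the 5,000 PFLOPS figure might refer to, one can divide it by the GPU count. This is only a sketch: it treats "close to 30,000" as exactly 30,000, assumes a homogeneous fleet, and compares the result against the A100 peak numbers mentioned earlier in the thread. The actual GPU mix is not stated in the article.

```python
# Average peak throughput per GPU implied by the article's headline numbers.
TOTAL_PFLOPS = 5000   # "exceeding 5,000 PFLOPS" per the article
GPU_COUNT = 30_000    # "close to 30,000 GPUs", rounded up (assumption)

A100_FP32_TFLOPS = 19.5   # A100 peak figures quoted earlier in the thread
A100_FP16_TFLOPS = 312.0

per_gpu_tflops = TOTAL_PFLOPS * 1000 / GPU_COUNT  # 1 PFLOPS = 1000 TFLOPS

print(f"average per GPU: {per_gpu_tflops:.1f} TFLOPS")
print(f"vs A100 FP32 peak: {per_gpu_tflops / A100_FP32_TFLOPS:.1f}x")
print(f"vs A100 FP16 peak: {per_gpu_tflops / A100_FP16_TFLOPS:.2f}x")
```

The average lands well above the A100's FP32 peak and below its FP16 peak, which hints that the headline is not a pure FP32 count, although the article does not say which precision it uses.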

SenseTime created the software/hardware integrated platform SenseCore and built the SenseNova large model system on top of it. As of May, SenseTime's platform had served over 40 core customers, including more than 10 large model customers, in ADAS, biopharma, chip design, smart commerce and university research. It has delivered large models in 20+ deployment scenarios.
 

sunnymaxi

Captain
Registered Member
More than 14 provincial regions in China have contributed to the research and development of large-scale AI models, the groundbreaking technology behind ChatGPT. Up to 38 are from Beijing, followed by 20 from Guangdong province.

China has developed at least 79 large-scale artificial intelligence models with over 1 billion parameters each, a research institute said in a rare public statement, amid the worldwide buzz created by OpenAI's artificial intelligence chatbot ChatGPT.

 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
Please, Log in or Register to view URLs content!

Turning AI computing resource into public service like electricity & water. Think about that. What a great concept. How to efficiently utilize computational resources? Make it available for all.

Keep in mind that when SenseTime started its AIDC back in Jan 2022, it was reported to have spent 5.6B RMB on construction, and the facility was designed for 3.74 EFLOPS of computation with 5,000 racks. Now it is said to have 5 EFLOPS of computation and to have already delivered large models in 20+ scenarios, with at least 10 large model customers.
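
For a sense of scale, the design figures can be broken down per rack and per unit of compute. The sketch below uses only the three numbers quoted in this post and assumes the 5.6B RMB covers the full 3.74 EFLOPS design capacity, which the reporting may not have meant literally.

```python
# Per-rack density and cost per PFLOPS implied by the AIDC design figures.
DESIGN_EFLOPS = 3.74   # design capacity quoted in the post
RACKS = 5000           # rack count quoted in the post
COST_RMB = 5.6e9       # reported construction spend

per_rack_tflops = DESIGN_EFLOPS * 1_000_000 / RACKS   # 1 EFLOPS = 1e6 TFLOPS
rmb_per_pflops = COST_RMB / (DESIGN_EFLOPS * 1000)    # 1 EFLOPS = 1000 PFLOPS

print(f"per rack: {per_rack_tflops:.0f} TFLOPS")
print(f"cost: {rmb_per_pflops / 1e6:.2f}M RMB per PFLOPS")
```

That works out to roughly 750 TFLOPS per rack and about 1.5M RMB per PFLOPS of design capacity.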

It's known that XPeng built its own 600 PFLOPS data center for ADAS training. I always wondered how other automakers were training for ADAS. This type of publicly available resource would allow very efficient usage of computational resource for ADAS & other purposes
 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
This isn't restricted to Shanghai or the Lingang area.

There is also a project like this going on in Beijing

And Shenzhen

It would seem that all the tier 1 cities will want projects like this. No one wants to be left behind and put industries in their area at a disadvantage because they moved too slowly.
 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
I think there is some really interesting stuff here.
When the initial idea of a public computation resource was put out, they also mentioned the planned 算力浦江 (Computing Power Pujiang) project, which was put into action in late April. It is quite interesting to pool resources together through these interconnects. More nodes will be added as the China Mobile, China Unicom and China Telecom data centers come online.

The China Telecom news just came out yesterday, but as seen here, the China Unicom DC at Lingang has already been announced. I'm sure China Mobile will announce one soon too, and I would think they will all use domestic chips.
Lingang has its advantages in developing AI computing network, with huge volumes of industrial data and surging demands from integrated circuit, advanced manufacture and intelligent car firms in the zone. It will boost AI algorithms' usage in more industries and create commercial value quickly, Tang added.

Currently Shanghai has announced 16 new city-level data centers, including SenseTime's AIDC in Lingang and centers constructed by China Unicom and China Telecom's local branches.

I find the idea of connecting computational resource between Shanghai, Ningxia, Hangzhou & Shenzhen to be interesting, but I'm not sure how well that works in practice (especially between Shanghai & Shenzhen) considering the distance.
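
To get a feel for the distance concern, here is a rough lower bound on network round-trip time from the speed of light in fiber. Everything in it is an assumption for illustration: the city-center coordinates are approximate, the fiber is assumed to follow the great-circle path (real routes are longer), and switching or queuing delay is ignored. Ningxia is left out because the data center location there is not given.

```python
import math

C_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBER_INDEX = 1.468        # typical refractive index of silica fiber (assumption)
EARTH_RADIUS_KM = 6371.0

# Approximate city-center coordinates (latitude, longitude), assumed.
CITIES = {
    "Shanghai": (31.23, 121.47),
    "Hangzhou": (30.27, 120.16),
    "Shenzhen": (22.54, 114.06),
}


def great_circle_km(a, b):
    """Haversine distance between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def min_rtt_ms(a, b):
    """Best-case round-trip time over straight-line fiber, in milliseconds."""
    one_way_s = great_circle_km(a, b) / (C_KM_PER_S / FIBER_INDEX)
    return 2 * one_way_s * 1000


for city in ("Hangzhou", "Shenzhen"):
    d = great_circle_km(CITIES["Shanghai"], CITIES[city])
    rtt = min_rtt_ms(CITIES["Shanghai"], CITIES[city])
    print(f"Shanghai-{city}: ~{d:.0f} km, RTT floor ~{rtt:.1f} ms")
```

Under these assumptions the Shanghai-Shenzhen floor is on the order of 12 ms round trip. That is probably fine for dispatching whole jobs to a remote pool, but it is a lot for anything that needs tight synchronization between GPUs in different cities.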

Looks like Guangzhou is also planning its own public utility service for computation.
The thing in China is that no local government wants to be seen as being left behind. Inter-municipality competition is a good thing.

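I liked this section in the Shenzhen portion:
"Build an ecosystem incubation platform. Relying on Pengcheng Cloud Brain (鹏城云脑), build a city-level AI ecosystem incubation platform that provides small and medium-sized enterprises with low-cost intelligent computing resources, along with support such as algorithms, toolkits, model libraries and adaptation certification, empowering ecosystem partners to pursue joint innovation. Implement a support program for public technology service platforms and cultivate a group of distinctive AI public technology service platforms. (Responsible units: Municipal Bureau of Industry and Information Technology, Municipal Development and Reform Commission, Municipal Science, Technology and Innovation Commission, etc.)"
Basically, it seems like a big part of their plan is the new Huawei 16 EFLOPS Pengcheng-3 data center. That's humongous. It will provide low-cost, easily accessible computation built on Huawei's Ascend/Pangu platform. So in terms of the tools they develop, basically you need to know Ascend & Pangu to be relevant.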

That to me is a great victory for Huawei. When it is the basis of all the smart city projects, people have to move off PyTorch and CUDA and learn how to use Ascend & Pangu. That will be how China replaces Nvidia products. The only people left behind are Tencent, Inspur & ByteDance.
 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
Please, Log in or Register to view URLs content!
More updates here.
A computing power innovation and development forum, hosted by MIIT and organized by CAICT, was held (in Beijing, I believe), where China Telecom and CAICT jointly released China's first national platform for scheduling across multiple computing platforms: the National Integrated Computing Power and Network Scheduling Platform 1.0 (全国一体化算力算网调度平台 1.0).

CAICT & China Telecom started this as part of the East Data, West Computing (东数西算) project.

This platform pools together general-purpose computing, AI computing and edge computing from multiple sources and uses a resource pool scheduling engine to meet the needs of different customers. It supports AI training job scheduling, with computing power scheduled across resources, pools, architectures and vendors.
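
The general idea of scheduling across pools and vendors can be illustrated with a toy example. This is not how the actual platform works, which is not described in any detail here. Every name, field and number below is hypothetical, and the policy (cheapest pool of the right kind with enough free capacity) is just one simple choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    name: str            # hypothetical pool identifier
    vendor: str          # hypothetical cloud vendor label
    kind: str            # "general", "ai", or "edge"
    free_pflops: float   # capacity currently available
    price: float         # made-up cost per PFLOPS-hour


@dataclass
class Job:
    name: str
    kind: str            # which class of computing the job needs
    pflops: float        # capacity the job requests


def schedule(job: Job, pools: list[Pool]) -> Optional[Pool]:
    """Place the job on the cheapest matching pool with enough capacity."""
    candidates = [p for p in pools
                  if p.kind == job.kind and p.free_pflops >= job.pflops]
    if not candidates:
        return None  # no single pool can host the job
    best = min(candidates, key=lambda p: p.price)
    best.free_pflops -= job.pflops  # reserve the capacity
    return best


pools = [
    Pool("east-ai-1", "vendor-a", "ai", 50.0, 1.2),
    Pool("west-ai-1", "vendor-b", "ai", 200.0, 0.7),
    Pool("edge-1", "vendor-c", "edge", 5.0, 2.0),
]

first = schedule(Job("train-1", "ai", 120.0), pools)
second = schedule(Job("train-2", "ai", 100.0), pools)
print(first.name if first else None)
print(second.name if second else None)
```

Here the first job goes to the cheaper western pool regardless of vendor, and the second is rejected because no single pool has 100 PFLOPS left. A real scheduler would also have to weigh data locality, network latency and hardware compatibility.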

So far Tianyi Cloud (China Telecom's cloud), Huawei Cloud and Alibaba Cloud are all participating, along with some others. So quite a few heavyweights.

Also, 东数西算 allows computing resources to be built in western provinces with clean energy & low electricity costs to satisfy demand on the east coast.

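More updates here.
"For example, the computing power distribution network Xirang (息壤) has been launched, combining two foundational capabilities, a self-developed computing power scheduling engine and a computing resource management platform, to offer a one-stop solution for moving to the cloud quickly and using computing power on demand. At present, China Telecom's total computing power stands at 3.1 EFLOPS, and it is expected to reach 16.3 EFLOPS by the end of the 14th Five-Year Plan."
China Telecom has 3.1 EFLOPS right now and plans to have 16.3 EFLOPS by the end of 2025. So yeah, that new Lingang data center likely will not be the only one they are building right now.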

English article here from Yicai.

The networking part of this is interesting and can be found on Huawei's website. It covers how to reduce latency and balance traffic between data centers.
 