Artificial Intelligence thread

GulfLander

Major
Registered Member
"China's smartphone maker Honor pledged $10 billion in AI investments over the next five years and announced on Sunday a deepening partnership with Google.

The investment plan, revealed at Mobile World Congress (MWC) in Barcelona, is designed to reposition the firm from a smartphone player into an "AI device ecosystem company".

Honor, meanwhile, demonstrated its new deepfake detection technology on the Magic 7 Pro smartphone, a world first.

The new Honor device also features a new AI agent, which is expected to be rolled out globally later this year.

Honor is something of an upstart in the smartphone world, having spun off from Huawei in 2020 when the Chinese tech giant was hit with US sanctions. Since then, Honor has looked to expand outside of China and push into the higher-end part of the market where Apple and Samsung play.[...]"
 

OptimusLion

Junior Member
Registered Member
ByteDance has open-sourced an optimization system called Comet, which improves the execution efficiency of Mixture-of-Experts (MoE) models by overlapping fine-grained computation and communication.
It seems similar to the DualPipe & DeepEP systems open-sourced by DeepSeek, but not as complicated. The advantage is that it can be adopted with only a few lines of code changes. ByteDance's experiments show that Comet speeds up the execution of a single MoE layer by 1.96x and end-to-end execution by 1.71x. It has been deployed in large-scale GPU clusters, saving millions of GPU hours.

Paper: arxiv.org/pdf/2502.19811
Code: github.com/bytedance/flux
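
For intuition only, here is a minimal PyTorch sketch of the general idea of overlapping MoE dispatch (all-to-all communication) with expert computation. This is not the actual Comet/flux API, and Comet's overlap is much finer-grained; the chunked pipelining below, the helper name, and the chunking scheme are illustrative assumptions.

```python
# Conceptual sketch: overlap all-to-all token dispatch with expert compute.
# NOT ByteDance's Comet/flux API -- just the pipelining idea, assuming a
# torch.distributed process group is already initialized.
import torch
import torch.distributed as dist

def moe_layer_overlapped(token_chunks, expert_mlp):
    """token_chunks: list of [n, d] tensors, already grouped per chunk."""
    outputs = []
    # Kick off the first dispatch (all-to-all) asynchronously.
    recv = torch.empty_like(token_chunks[0])
    pending = dist.all_to_all_single(recv, token_chunks[0], async_op=True)

    for i in range(len(token_chunks)):
        pending.wait()          # tokens for chunk i have now arrived
        ready = recv
        # Launch the next chunk's communication before computing on this one,
        # so the transfer runs concurrently with the expert GEMMs.
        if i + 1 < len(token_chunks):
            recv = torch.empty_like(token_chunks[i + 1])
            pending = dist.all_to_all_single(recv, token_chunks[i + 1],
                                             async_op=True)
        outputs.append(expert_mlp(ready))   # local expert computation
    return torch.cat(outputs, dim=0)
```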

 

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member

tokenanalyst

Brigadier
Registered Member

Day-0 support: Moore Threads quickly adds support for the open-source Tongyi Qianwen QwQ-32B model


On March 6, the Alibaba Cloud team officially open-sourced a new reasoning model, Tongyi Qianwen QwQ-32B. Moore Threads completed support for QwQ-32B within 2 hours of the model's release. Deployed on the high-speed LLM inference framework vLLM and the MT Transformer inference engine, QwQ-32B has demonstrated excellent inference performance and stability in operation, showing the strengths of the MUSA architecture and Moore Threads' full-featured GPUs in ecosystem compatibility and rapid support.

Moore Threads has made this work available on its model showcase, "KUAE Factory", a model demonstration hub built to let users experience model capabilities running on the Moore Threads KUAE intelligent computing cluster. Users can try out QwQ-32B's inference performance by visiting KUAE Factory.
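The announcement names vLLM as the serving framework but doesn't show Moore Threads' MUSA-specific setup. Below is the standard vLLM Python API for loading and querying QwQ-32B as a sketch of that deployment path; the Hugging Face identifier "Qwen/QwQ-32B", the tensor-parallel degree, and the sampling settings are assumptions, not values from the post.

```python
# Generic vLLM usage for QwQ-32B (standard vLLM API; the MT Transformer /
# MUSA-specific configuration used by Moore Threads is not shown in the post).
from vllm import LLM, SamplingParams

# "Qwen/QwQ-32B" is the public Hugging Face model identifier (assumed here).
llm = LLM(model="Qwen/QwQ-32B", tensor_parallel_size=4)
params = SamplingParams(temperature=0.6, max_tokens=1024)

outputs = llm.generate(["How many primes are there below 100?"], params)
print(outputs[0].outputs[0].text)
```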

 

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member
Screenshot 2025-03-07 at 7.46.01 AM.png

this guy seems to be kind of legit based on the stuff he has posted and his audience size.

Basically, he's saying that much of Nvidia's sales in Singapore, $100B over 2 years, will end up in China. Since Chinese big tech can't buy the smuggled chips directly, the chips get installed in public data centers and then rented out to the major cloud/tech providers like Alibaba, Tencent and ByteDance.

Apparently, China's major data center build-ups are in Gansu (this was the one I posted earlier in the year that said it will reach 100 EFLOPS by end of 2025) and Shanghai. My guess is that those are still underestimates.
 

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member
March 7, 2025, Beijing — With the national "Two Sessions" in full swing, hot topics such as "AI+" and "intelligent terminals" are drawing wide attention, highlighting the industry's strong focus on AI technology. Since Lenovo Group and domestic GPU leader MetaX jointly released the first domestic DeepSeek all-in-one machine solution on February 5, 2025, this hardware-software co-designed AI product has quickly become an industry focal point. As of today, cumulative shipments of the solution have exceeded 1,000 units, equipped with nearly 10,000 MetaX domestic GPU cards, covering more than ten core industries including healthcare, education and manufacturing, marking an important milestone in the deployment of the domestic AI industry.
Since Lenovo and MetaX partnered up on the domestic DeepSeek all-in-one machine, it has shipped over 1,000 units with almost 10,000 GPU cards, covering sectors such as healthcare, education and manufacturing.

The all-in-one machine equipped with the MetaX Xisi N260 domestic GPU supports local deployment of DeepSeek distilled models of various parameter sizes. Measured data show that, under the same concurrency conditions, DeepSeek-R1-Distill-Qwen-14B inference performance reaches 110%-130% of mainstream international GPUs. The flagship DeepSeek training-and-inference all-in-one machine, based on the Lenovo WenTian WA5480 G3 AI server with the MetaX Xiyun C500 domestic GPU, delivers performance on par with international first-tier levels: testing the full-strength 671B model under realistic user conditions with 4K context, total throughput reached 1575.4 tokens/s at up to 64 concurrent users, with an effective per-user throughput of 24.6 tokens/s, and an extreme test with 1024 concurrent users achieved a peak throughput of 3725.1 tokens/s. The MetaX and Lenovo engineering teams are continuing to push the performance ceiling through compiler optimization, tensor/data parallelism, MLA, FuseMoE and other techniques.
The N260 can run inference on the 14B distilled model at 110-130% of mainstream GPU performance.
The C500-based AI server can run the full 671B model at up to 3725.1 tokens/s.
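
As a quick sanity check on the reported figures, the per-user number is just the aggregate throughput divided by the concurrency; the short Python below reproduces it (the ~3.6 tokens/s per-user value at 1024 users is my own derived figure, not from the article).

```python
# Sanity check: per-user throughput at 64 concurrent users
total_tps_64 = 1575.4            # reported aggregate tokens/s at 64 users
per_user_tps = total_tps_64 / 64
print(round(per_user_tps, 1))    # 24.6, matching the reported per-user figure

# At the 1024-user stress test the aggregate is 3725.1 tokens/s,
# i.e. roughly 3.6 tokens/s available per user on average (derived, not reported).
print(round(3725.1 / 1024, 1))   # ~3.6
```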


Saudi Aramco praising its integration of DeepSeek in its data centers.
 