Moore Threads has announced its new MTT S4000 AI GPU, which the company says will work "seamlessly" with NVIDIA's CUDA framework thanks to its in-house MUSIFY translation tool.
The MTT S4000 features MTLink 1.0, a new multi-GPU interconnect that we don't have photos of, but it sounds similar to NVIDIA's in-house NVLink. The card has 48GB of GDDR6 memory with up to 768GB/sec of memory bandwidth; NVIDIA's RTX 6000 GPU also has 48GB of memory, but with 960GB/sec of memory bandwidth.
 
Moore Threads is passively cooling its new MTT S4000 AI GPU, which delivers 25 TFLOPs of FP32 single-precision compute performance, 100 TFLOPs of FP16/BF16, and 200 TOPS of INT8. The card uses a PCIe Gen5 x16 interface, too.
Moore Threads has made its new MTT S4000 AI GPU with larger customers in mind: those who want a KUAE Smart Computing Center Solution. It's a full GPU-based stack for large-scale computing that the Chinese company says it can deploy in only 30 days. The GPU server supports training Large Language Models (LLMs) with 70 billion to 130 billion parameters.
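To put those parameter counts next to the card's 48GB of memory, here is a rough back-of-the-envelope sketch. It assumes weights stored in FP16/BF16 at 2 bytes per parameter and counts only the weights themselves; gradients, optimizer state, and activations are ignored, so real training needs considerably more memory than this lower bound. The function names are ours for illustration, not anything from Moore Threads.

```python
import math

CARD_MEMORY_GB = 48  # MTT S4000 memory capacity quoted in the article


def weight_footprint_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in decimal GB.

    Assumption: 2 bytes per parameter (FP16/BF16). One billion parameters
    at one byte each is one decimal GB, so the product is already in GB.
    """
    return params_billions * bytes_per_param


def min_cards_for_weights(params_billions: float, bytes_per_param: int = 2) -> int:
    """Lower bound on the number of 48GB cards needed just to hold the weights."""
    return math.ceil(weight_footprint_gb(params_billions, bytes_per_param) / CARD_MEMORY_GB)


for size in (70, 130):
    print(size, weight_footprint_gb(size), min_cards_for_weights(size))
```

Under these assumptions, the weights alone already span several cards at either end of the quoted range, which is why models of this size are trained on multi-GPU servers and clusters rather than on a single card.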
Moore Threads explained on its website: "In the era of large models, intelligent computing power represented by the GPU is the cornerstone and the center of the generative AI world. Moore Threads, together with more than ten companies including China Mobile Beijing Company, China Telecom Beijing Branch, Lenovo, 21Vianet, Halo New Network, Zoomlion Data, Shudao Intelligent Computing, China Development Intelligence Source, Qishang Online, Nortel Digital Beijing Digital Economy Computing Power Center, Ziguang Hengyue, Ruihua Industrial Holdings (Shandong), Saier Network, Zhongke Financial, Zhongyun Intelligent Computing, and Jinzhou Yuanhang (in no particular order), jointly announced that the 'Moore Thread PES - KUAE Intelligent Computing Alliance' was established".
"The alliance will vigorously build and promote a national industrial intelligent computing platform from underlying hardware to software, tools and applications, aiming to achieve high utilization of clusters and become the first choice for large model training with easy-to-use, full-stack intelligent computing solutions".