Artificial Intelligence thread

Hyper · Feb 27, 2025

Day 5 of [link]: 3FS, Thruster for All DeepSeek Data Access

Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.

6.6 TiB/s aggregate read throughput in a 180-node cluster
3.66 TiB/min throughput on GraySort benchmark in a 25-node cluster
40+ GiB/s peak throughput per client node for KVCache lookup
Disaggregated architecture with strong consistency semantics
Training data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search & KVCache lookups for inference in V3/R1

3FS → github.com/deepseek-ai/3FS
Smallpond - data processing framework on 3FS → github.com/deepseek-ai/smallpond

9dashline · Feb 27, 2025

Seems like it's over for ChatGPT, on both performance and price.

Hyper · Feb 27, 2025

9dashline said:
Seems like it's over for ChatGPT, on both performance and price.

A board fight and a talent drain to other startups are what happened.

OptimusLion · Feb 27, 2025

Bomb No. 5 of DeepSeek Open Source Week is here! It's another cluster bomb! 3FS and smallpond!

Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.

This file system achieves an aggregate read throughput of 6.6 TiB/s in a 180-node cluster, and a peak KVCache lookup throughput of 40+ GiB/s per client node.
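
For a rough sense of scale, here is a back-of-the-envelope sketch of what the 6.6 TiB/s figure works out to per node. It assumes the load is spread evenly across all 180 nodes, which the announcement does not state; the function name is just illustrative.

```python
# Back-of-the-envelope: per-node share of the quoted aggregate read throughput.
# Assumption (not from the announcement): the aggregate is spread evenly
# across every node in the cluster.
TIB_IN_GIB = 1024


def per_node_gib_s(aggregate_tib_s: float, nodes: int) -> float:
    """Even per-node share of an aggregate TiB/s figure, in GiB/s."""
    return aggregate_tib_s * TIB_IN_GIB / nodes


read_share = per_node_gib_s(6.6, 180)
print(f"{read_share:.1f} GiB/s per node")  # prints "37.5 GiB/s per node"
```

Under that even-spread assumption, each node would be serving roughly 37.5 GiB/s of reads.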

The other release, smallpond, is a data processing framework built on 3FS!

It is powered by DuckDB for high-performance data processing and can scale to handle PB-level datasets!

3FS → github.com/deepseek-ai/3FS

Smallpond - data processing framework on 3FS→github.com/deepseek-ai/smallpond

luminary · Feb 27, 2025

The US is losing steam while China is only just gathering momentum.

https://twitter.com/i/web/status/1894515365910126884

tphuang · Feb 27, 2025

OpenAI already charges by far the most for any non-reasoning model. This is just ridiculous. There is not a chance my firm can use this. Everyone is going to look into hosting Llama or DeepSeek locally. If Gemini becomes noticeably better, then people will move to Gemini.

mossen · Feb 27, 2025

PantsFullOfAnts said:
GPT 4.5, Grok 3, and Sonnet 3.7 have shown that pretraining has hit a wall.

I'm not sure about that.

GPT 4.5 in particular seems wildly inefficient given its price.

This is just Altman's money grab. And when you understand OpenAI's situation it all makes sense.

Fatty · Feb 27, 2025

mossen said:
I'm not sure about that.

GPT 4.5 in particular seems wildly inefficient given its price.

This is just Altman's money grab. And when you understand OpenAI's situation it all makes sense.

The easy (and likely) explanation is that this was OpenAI's attempt at training their next-gen GPT5, but they ended up with lackluster results, so they just tacked the name 4.5 on and released it to satisfy investors. I think Altman has actually hinted at this before too.

Hyper · Feb 27, 2025

PantsFullOfAnts said:
GPT 4.5, Grok 3, and Sonnet 3.7 have shown that pretraining has hit a wall.
I suspect the next big breakthrough will involve parallel chains of thought, which DeepSeek should be able to do better than anyone.
With Kimi 1.6 and other recent models, it seems that China is now in the lead with LLMs, and I don't expect this to change in the future.

OpenAI is not close to Anthropic in pretraining at all.

OptimusLion · Feb 28, 2025

Hejian Software releases UDA, an AI platform for digital design that integrates DeepSeek to create a domestic one-stop intelligent EDA platform

Shanghai Hejian Industrial Software Group Co., Ltd. (hereinafter referred to as "Hejiangongsoft"), a leading Chinese digital EDA company, announced the launch of an innovative AI platform for digital design: UniVista Design Assistant (UDA). UDA extends the traditional RTL-to-GDSII design process to NL-to-GDSII (Natural Language to GDSII), becoming the first domestically developed AI platform designed specifically for RTL Verilog design. It integrates advanced large language models (LLMs) such as DeepSeek R1 with Hejiangongsoft's self-developed EDA engine, providing comprehensive AI-assisted functions, including NL-to-RTL code generation, online QoR (Quality of Result) evaluation and tuning, and functional verification and debugging, to build a one-stop domestic EDA solution.