Artificial Intelligence thread

tphuang · Jun 24, 2023

Wuhun said:
Finally... Tsinghua surpassed GPT-4 on C-Eval

Any idea about the details behind these models? With all these new models coming out, I'm curious where they are being used.

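Zhipu AI is a company spun out of technology developed at Tsinghua University's Department of Computer Science. It aims to build a new generation of general-purpose cognitive intelligence models, provide intelligent API services, connect hundreds of millions of users in the physical world, empower metaverse digital humans, serve as the foundation for embodied robots, and give machines the ability to "think" like humans.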

Looks like Zhipu AI is a Tsinghua University spin-off that is putting the university's research into real-world use. I'm curious where they got the computing power to train it.

broadsword · Jun 24, 2023

But behind in STEM.

tphuang · Jun 26, 2023

https://twitter.com/i/web/status/1673219815895449600

The updated ERNIE Bot V3.5 shows significantly improved performance over V3.

Franklin · Jun 26, 2023

tphuang said:
https://twitter.com/i/web/status/1673219815895449600

The updated ERNIE Bot V3.5 shows significantly improved performance over V3.

That's all well and good, but is ERNIE Bot or any other chatbot in China available to the general public? If not, then who has access to this technology, if anyone?

el pueblo unido · Jun 26, 2023

Franklin said:
That's all well and good, but is ERNIE Bot or any other chatbot in China available to the general public? If not, then who has access to this technology, if anyone?

Most of the LLMs developed in China that provide a web interface or a token-based API currently use lottery-based invitation systems. So in a sense the general public does have access to these LLM services, but "general public" here means only those who have been invited. This is common practice for software and hardware testing in China.

tphuang · Jun 26, 2023

Huawei will announce a major upgrade to its Pangu large model on 07/07.

In two days, China Unicom will show how Huawei is involved in its AI development.

tokenanalyst · Jun 27, 2023

https://twitter.com/i/web/status/1668973737310396416

tokenanalyst · Jun 27, 2023

Tsinghua published their new ChatGLM2 model, with the 6B-parameter version being open source and available for commercial use.

ChatGLM2-6B is the second-generation version of the open-source bilingual (Chinese-English) chat model ChatGLM-6B. It retains the smooth conversation flow and low deployment threshold of the first-generation model, while introducing the following new features:

Stronger Performance: Based on the development experience of the first-generation ChatGLM model, we have fully upgraded the base model of ChatGLM2-6B. ChatGLM2-6B uses the hybrid objective function of [link], and has undergone pre-training with 1.4T bilingual tokens and human preference alignment training. The evaluation results show that, compared to the first-generation model, ChatGLM2-6B has achieved substantial improvements in performance on datasets like MMLU (+23%), CEval (+33%), GSM8K (+571%), BBH (+60%), showing strong competitiveness among models of the same size.
Longer Context: Based on the [link] technique, we have extended the context length of the base model from 2K in ChatGLM-6B to 32K, and trained with a context length of 8K during the dialogue alignment, allowing for more rounds of dialogue. However, the current version of ChatGLM2-6B has limited understanding of single-round ultra-long documents, which we will focus on optimizing in future iterations.
More Efficient Inference: Based on the [link] technique, ChatGLM2-6B has more efficient inference speed and lower GPU memory usage: under the official implementation, the inference speed has increased by 42% compared to the first generation; under INT4 quantization, the dialogue length supported by 6G GPU memory has increased from 1K to 8K.
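To put the INT4 figure quoted above in context, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. It assumes "6B" means exactly 6 billion parameters (a rounded figure, not taken from the announcement) and ignores activations, the KV cache, and framework overhead, so treat the numbers as a lower bound rather than a measurement.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 2**30


# Assumption: "6B" is treated as exactly 6e9 parameters for illustration.
NUM_PARAMS = 6e9

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB for weights")
```

Under these assumptions the INT4 weights come to a little under 3 GiB, which would leave roughly half of a 6 GB card for the dialogue context. That is at least consistent with the claim that longer conversations fit in the same memory once per-token memory use goes down.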

luminary · Jun 28, 2023

“We all heard the sound of the starter pistol in the race. Tech companies, big or small, are all on the same starting line,” Wang, who named his startup Baichuan or “A Hundred Rivers,” said. “China is still three years behind the US, but we may not need three years to catch up.”

The number of Chinese venture deals in AI comprised more than two-thirds of the US total of about 447 in the year to mid-June, versus about 50% over the previous two years. China-based AI venture deals also outpaced consumer tech in 2022 and early 2023, according to Preqin.

Ex-Baidu President Zhang Yaqin, now dean of Tsinghua University’s Institute for AI Industry Research and overseer of a number of budding projects, told Chinese media in March that investors sought him out almost daily that month. He estimates there’re as many as 50 firms working on large language models across the country.

tphuang · Jun 28, 2023

China Unicom unveils Honghu large language model (pretty sure this uses Huawei technology)
2 versions: 1 with 800 million parameters and another with 2 billion parameters

China Unicom will build a large-scale computing base for this.
