Artificial Intelligence thread

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
Finally... Tsinghua surpassed GPT-4 on C-Eval

View attachment 114937
Any idea of the details behind these models? Just curious, with all these new models out, where they are getting used.

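Zhipu AI is a company spun out of technology developed at Tsinghua University's Department of Computer Science. It is dedicated to building a new generation of general-purpose cognitive intelligence models and providing intelligent API services, connecting hundreds of millions of users in the physical world, empowering metaverse digital humans, serving as the foundation for embodied robots, and giving machines the ability to "think" like humans.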
Looks like Zhipu AI came out of Tsinghua University and is putting its research into the real world through training. I'm curious where they got the computing power to train it.
 

el pueblo unido

Junior Member
Registered Member
That's all well and good, but is ERNIE Bot or any other chatbot in China available to the general public? If not, then who has access to this technology, if anyone?
Most of the LLMs developed in China that provide a web interface or token API currently use lottery-based invitation systems. So in a sense the general public does have access to these LLM services, but "general public" here is limited to those who were invited. This is common practice for software and hardware testing in China.
 

tokenanalyst

Brigadier
Registered Member
Tsinghua published its new ChatGLM2 model, with the 6B-parameter version open source and available for commercial use.

ChatGLM2-6B is the second-generation version of the open-source bilingual (Chinese-English) chat model. It retains the smooth conversation flow and low deployment threshold of the first-generation model, while introducing the following new features:


  1. Stronger Performance: Based on the development experience of the first-generation ChatGLM model, we have fully upgraded the base model of ChatGLM2-6B. ChatGLM2-6B uses the hybrid objective function of [link not shown], and has undergone pre-training with 1.4T bilingual tokens and human preference alignment training. The results [link not shown] show that, compared to the first-generation model, ChatGLM2-6B has achieved substantial improvements in performance on datasets like MMLU (+23%), CEval (+33%), GSM8K (+571%), BBH (+60%), showing strong competitiveness among models of the same size.
  2. Longer Context: Based on the [link not shown] technique, we have extended the context length of the base model from 2K in ChatGLM-6B to 32K, and trained with a context length of 8K during the dialogue alignment, allowing for more rounds of dialogue. However, the current version of ChatGLM2-6B has limited understanding of single-round ultra-long documents, which we will focus on optimizing in future iterations.
  3. More Efficient Inference: Based on the [link not shown] technique, ChatGLM2-6B has more efficient inference speed and lower GPU memory usage: under the official implementation, the inference speed has increased by 42% compared to the first generation; under INT4 quantization, the dialogue length supported by 6G GPU memory has increased from 1K to 8K.
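
For a rough sense of why INT4 quantization matters for the 6G GPU memory figure in point 3, here is a back-of-envelope sketch in Python. It is not taken from the ChatGLM2 release. It assumes exactly 6e9 parameters and counts weight storage only, ignoring activations, the attention cache, and framework overhead. The function name and the precision table are hypothetical, chosen for illustration.

```python
# Back-of-envelope weight-memory estimate at different numeric precisions.
# Assumptions (not from the quoted release notes): exactly 6e9 parameters,
# weights only, no activations, attention cache, or framework overhead.

BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Return approximate weight storage in GiB for the given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        gib = weight_memory_gib(6e9, precision)
        print(f"{precision}: {gib:.1f} GiB")
```

Under these assumptions the INT4 weights take a bit under 3 GiB, while FP16 weights alone would need over 11 GiB. That would leave roughly half of a 6 GB card for the conversation itself, which is presumably the headroom the quoted 8K dialogue length depends on.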

 

luminary

Senior Member
Registered Member
“We all heard the sound of the starter pistol in the race. Tech companies, big or small, are all on the same starting line,” Wang, who named his startup Baichuan or “A Hundred Rivers,” said. “China is still three years behind the US, but we may not need three years to catch up.”

The number of Chinese venture deals in AI comprised more than two-thirds of the US total of about 447 in the year to mid-June, versus about 50% over the previous two years. China-based AI venture deals also outpaced consumer tech in 2022 and early 2023, according to Preqin.

Ex-Baidu President Zhang Yaqin, now dean of Tsinghua University’s Institute for AI Industry Research and overseer of a number of budding projects, told Chinese media in March that investors sought him out almost daily that month. He estimates there are as many as 50 firms working on large language models across the country.

 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member

China Unicom unveils Honghu large language model (pretty sure this uses Huawei technology)
2 versions: 1 with 800 million parameters and another with 2 billion parameters

China Unicom will build a large-scale computing base for this.
 