Artificial Intelligence thread

gadgetcool5

Senior Member
Registered Member
I think this is the first time I am reading language as a barrier to progress in A.I.

English is a simple language to learn, and for the big teams training A.I. models in China, surely all of them can work in English too.

For the smaller teams, language may be a problem, but they face much bigger issues such as affordable chips. Language is the least of their problems.
Why can't Chinese models train on publicly available English content and translate it into Chinese, and vice versa? That's what I don't get about the language issue.
 

luminary

Senior Member
Registered Member
although imo, all of his points just make it harder for outside AI firms to beat Chinese firms in Chinese-language LLMs
That's a weird statement to make. The purported mission of the AGI industry is to make the best artificial general intelligence, not simply some kind of LLM.

I don't think anybody in Silicon Valley is remotely thinking about entering the Chinese market. Even ignoring the race to "AGI singularity", there's a real race over who will provide LLMs to the rest of the world (RoW). There the US has a natural advantage, as English is the lingua franca.



so i think a huge storm is coming for the Filipino economy actually.

A pretty large section of the Filipino workforce does call-center type work for Western countries. As AI continues to improve, automated call handling will improve with it. I really think that over the next 10 years we will see a huge reduction in the need for Asian call-center offerings, which is devastating for the Philippines and India.
They sure know it.
Philippines tops the list of countries most interested in AGI by Google search.

 

sunnymaxi

Captain
Registered Member

China recently unveiled its first open-source large language model that specializes in translating and processing ancient Chinese books. The model, which draws on a large corpus of over 2 billion Chinese characters, can effectively sort out key information from intricate ancient texts and is capable of generating ancient poems. The model is named after a renowned philosopher in ancient China, Xunzi, who lived during the Warring States Period (475-221 BC) about 2,200 years ago.

 

sanctionsevader

New Member
Registered Member

China recently unveiled its first open-source large language model that specializes in translating and processing ancient Chinese books. The model, which draws on a large corpus of over 2 billion Chinese characters, can effectively sort out key information from intricate ancient texts and is capable of generating ancient poems. The model is named after a renowned philosopher in ancient China, Xunzi, who lived during the Warring States Period (475-221 BC) about 2,200 years ago.

Here it is: [link]

And the chatbot: [link]

As someone working with older Chinese texts this is pretty cool to see.
 

tonyget

Senior Member
Registered Member
ByteDance's ChatGPT account has been suspended because they have been secretly using it to train their own LLM, which violates the OpenAI user agreement.

On December 16, Beijing time, foreign media reported that ByteDance, lagging in the generative AI race, has tried to take a shortcut: the company has been secretly using OpenAI's technology to develop its own large language model, in violation of OpenAI's terms of service. OpenAI has now suspended ByteDance's account.

According to the report, this practice is generally regarded as a faux pas in the AI field, and it also directly violates OpenAI's terms of service, which state that the output of its models may not be used "to develop any AI models that compete with our products and services." ByteDance bought its OpenAI access through Microsoft, but Microsoft has the same policy as OpenAI.

Internal ByteDance documents obtained by the outlet confirm that ByteDance relied on OpenAI's application programming interface (API) at nearly every stage of developing its foundation large language model, code-named Project Seed, including training and evaluating the model. Employees working on Project Seed were well aware of the implications. According to chat logs from Lark, the overseas version of ByteDance's internal communications platform Feishu, they discussed how to whitewash the evidence through "data desensitization." The report says ByteDance employees used OpenAI's technology so heavily that Project Seed staff often hit the OpenAI API's maximum access limit.

The internal documents show that ByteDance used OpenAI's technology mostly in the early stages of Project Seed. A few months ago, the company ordered the team to stop using GPT-generated text "at any stage of model development." Around that time, the company received approval to release its own large AI model, Doubao, bringing Project Seed online. However, ByteDance continued to use the API in ways that violate OpenAI's and Microsoft's terms of service, including to evaluate the performance of the model behind Doubao. "They say they want to make sure everything is legal, but they really just don't want to get caught," said one person with first-hand knowledge of the situation inside ByteDance.

ByteDance spokesperson Jodi Seth responded that GPT-generated data was used to annotate the model in the early development of Project Seed and was removed from ByteDance's training data around the middle of this year. "ByteDance is licensed by Microsoft to use the GPT API. We use GPT to power products and features in non-China markets, but use our self-developed model to power Doubao. Doubao is only available in China," Seth said in a statement.

OpenAI spokesperson Niko Felix issued a statement confirming that ByteDance's account has been suspended. "All API customers must adhere to our usage policies to ensure that our technology is used for good. While ByteDance's use of our API was minimal, we have suspended their account while we further investigate. If we find that their usage does not comply with our policies, we will require them to make the necessary changes or terminate their account," Felix said.

Microsoft spokesperson Frank Shaw said in a statement: "Microsoft AI solutions such as the Azure OpenAI service are part of our limited access framework, which means all customers must apply for and receive approval from Microsoft for access. We also set standards and provide resources to help our customers use these technologies responsibly and comply with our terms of service. We also have processes in place to detect abuse and to stop access when companies violate our code of conduct." Ifeng Technology's "AI Outpost" will continue to follow this story.
 

luminary

Senior Member
Registered Member
SenseTime co-founder dies suddenly and unexpectedly from unspecified illness, at age 55.

Shares of Chinese artificial intelligence company SenseTime plunged as much as 18.25% on Monday, falling to an all-time low after news of its founder’s death.

Hong Kong-listed shares of SenseTime dropped to as low as 1.03 Hong Kong dollars ($0.13) on Monday – the lowest level in the firm’s history according to LSEG data.
Shares of the AI company are down about 50% year-to-date.
SenseTime founder and AI scientist Tang Xiao’ou died on Friday at the age of 55 after succumbing to an illness, the company said on Saturday. SenseTime did not reveal the cause of his death.
“It is with a very heavy heart that we announce the sad news that our beloved founder, Tang Xiao’ou ... succumbed to an illness and left us forever at 11:45pm on December 15, 2023,” said SenseTime.
 