Can confirm. I gave up on ChatGPT when it censored itself on questions re group differences in crime rates, academic achievement, etc. At first you could get round this using the DAN mode prompt, but this got shut down pretty quickly. When it did answer in DAN mode, at first it got very weird and "talked" in a very familiar vernacular, kept calling me "hunny". Most odd. All societies have their taboos and shibboleths; in the current Western world these concern race and "gender".
1. In China, access to the GitHub equivalent for AI models is blocked because the data hosted there is deemed "too sensitive", and this is screwing Chinese developers.
-Git sites like GitHub, GitLab and China's Gitee are for storing code, not the weights of the models. I don't know what the Chinese equivalent of Hugging Face is. But Chinese-made models are pretty popular on Hugging Face; Qwen, Yi and ChatGLM have been trending a lot there.
2. In China, new AI models take 2-3 months to be approved, which is similar to the license Raj in India in the 1950s. In the AI world, 2-3 months is a very long time and this will discourage a lot of startups.
-I don't know what a "Raj" from the 50s has to do with a 2023 large AI model, but ok, two months doesn't sound that bad.
3. The Chinese language as it is used online is fundamentally more confusing due to strategies that people use to avoid censorship, and therefore it is harder to train AI models.
-All I can say is that for their size Chinese LLMs are pretty smart. WizardCode, WizardLM, ChatGLM, Yi and Qwen are among the smartest models I have tested.
-In terms of censorship, the Chinese models are probably censored on China-sensitive stuff like territorial issues, NSFW things and terrorism. But Western models are even more censored than the Chinese ones because, apart from the NSFW and terrorism stuff, you have all the good woke agenda incorporated in these models.
I just wanted to know what the hell happened to shinzo abe #whereisshinzo
why do you take these things seriously?
This guy thinks China may fall permanently behind the US in Generative AI because of onerous regulations. He gives three reasons:
1. In China, access to the GitHub equivalent for AI models is blocked because the data hosted there is deemed "too sensitive", and this is screwing Chinese developers.
2. In China, new AI models take 2-3 months to be approved, which is similar to the license Raj in India in the 1950s. In the AI world, 2-3 months is a very long time and this will discourage a lot of startups.
3. The Chinese language as it is used online is fundamentally more confusing due to strategies that people use to avoid censorship, and therefore it is harder to train AI models.
Do any of these points have validity?
I do notice that although Chinese LLMs were finally allowed to be released to the public by regulators at the end of August, none of them have been offered overseas in other languages. That means that while ChatGPT and other U.S. models are able to take advantage of inputs, training, and usage from people and organizations around the world, Chinese models cannot. They are trapped within the walls of China. Is this something that might eventually change? How would a non-Chinese person, for instance, use ErnieBot?
There is some confusion about datasets. Most big, high-quality datasets are behind the iron-curtain walls of AI companies. Facebook hasn't released their dataset; they just released the weights of their LLM and called it "open". Not the code or dataset, just the weights. The same goes for Google, Mistral, Anthropic, 01.AI, Alibaba, Baidu, Tsinghua. The "open" datasets that are "free" are much smaller, usually for academic use, and no amount of access to "github" will change that.
There is actually one major advantage Chinese LLMs have over Western ones: access to data inside China. I was talking to Taylor about this recently and he mentioned that China will probably just block Western LLMs from accessing Chinese data. And that will end it for Western LLMs in terms of training on Chinese-language content.
There hasn't, btw, been any demonstration that user input actually helps LLMs. LLMs right now improve through digesting online content or books or print media and such.
ErnieBot, according to Taylor, performed better than even GPT-4 on the tests that he ran.