To avoid derailing the "China's scientific and technological development" thread any further, I'm gonna post my thoughts here on ChatGPT, AI and the current push to make it multimodal. I think the hype and attention around ChatGPT made a lot of people overestimate the capabilities and potential of the tool, because imo it's basically just an upgraded version of a search engine.
All the new features they're planning to put into it, like image generation, video generation and voice synthesis, don't really add much to its capabilities; they just make those pre-existing generative AI models accessible within the chatbot. The use cases for ChatGPT are mostly in multimedia and, to some extent, computer programming tasks, but other than that it doesn't really help China in critical sectors like scientific discovery, semiconductors and military technology.
On the topic of AGI, I don't think the current approach of making an AI model like ChatGPT multimodal is the right direction. It looks more like they're adding those image/video generation and speech synthesis features to make it flashier, so the average Joe gets more hyped up about the project. The effort should go more into the processing side of things, like giving the AI the ability to simulate reality (e.g. physics), reason logically, or do its own scientific research and discovery.
Also, @Overbom mentioned that the issue AI development is having right now is scalability. I think the issue is that they're trying to cram and train everything into one LLM. I wonder if they could solve it by having the tool be composed of multiple specialized AI models, with one LLM acting as the interface and control unit instead, kind of like how LoRAs are used with Stable Diffusion.
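
To make the "one LLM as the control unit, several specialized models behind it" idea a bit more concrete, here's a rough Python sketch of what I have in mind. Everything in it is made up for illustration: the specialist functions are plain stubs rather than real models, and the keyword-based `route` function is only a stand-in for the controller LLM deciding where a request should go.

```python
from typing import Callable, Dict, Tuple

# Hypothetical specialists: stub functions standing in for real models.
def physics_specialist(query: str) -> str:
    return f"[physics] simulated answer for: {query}"

def image_specialist(query: str) -> str:
    return f"[image] generated picture for: {query}"

def general_chat(query: str) -> str:
    # Fallback when no specialist matches.
    return f"[chat] reply to: {query}"

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "physics": physics_specialist,
    "image": image_specialist,
}

# Made-up keyword lists; in the real idea the controller LLM would
# make this decision itself instead of matching keywords.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "physics": ("force", "velocity", "simulate"),
    "image": ("draw", "picture", "image"),
}

def route(query: str) -> str:
    """Stand-in for the controller: return a specialist name, or 'chat'."""
    lowered = query.lower()
    for name, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return name
    return "chat"

def answer(query: str) -> str:
    """Dispatch the query to the chosen specialist and return its output."""
    handler = SPECIALISTS.get(route(query), general_chat)
    return handler(query)

if __name__ == "__main__":
    for q in ("Draw a cat", "Simulate a falling ball", "hello"):
        print(answer(q))
```

The point is just that adding a new capability would mean registering another specialist instead of retraining the one big model.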