I appreciate that you cited something, even though it's a New York Times opinion article. That's quite an old article (I think from even before GPT-4 was released), and it gives examples from GPT-3, which anyone will tell you was bad (btw, not even GPT-3.5...). In any case, I consider the real pros' opinions much more credible than whatever these people are saying.

I share a lot of Chomsky's views about these systems that he outlines in this article. Yes, I anticipate all your objections: "Who's Chomsky? Does he work at Nvidia? OpenAI?" I still think his take is the correct one.
By the way, there's a simple retort to the proposition that LLMs learn deep insights about the language they were trained on: have the LLM formulate this insight into a theory of linguistics. That would be a strong indication of actual intelligence, not just autocomplete.
Your "autocomplete" claim has been directly countered by Ilya Sutskever, but because I am not of his caliber on AI, I don't have anything of value to add to his comments. It's your opinion, of course, and you know what they say about opinions: everyone has one.
But anyway, let's look at something interesting from Google's recent report:
And circling back to the original world model argument, some months ago Runway (a leading text2vid company) published this:
Introducing General World Models
by Anastasis Germanidis / Dec 11, 2023
You can think of video generative systems such as Gen-2 as very early and limited forms of general world models. In order for Gen-2 to generate realistic short videos, it has developed some understanding of physics and motion. However, it’s still very limited in its capabilities, struggling with complex camera or object motions, among other things.
To build general world models, there are several open research challenges that we’re working on. For one, those models will need to generate consistent maps of the environment, and the ability to navigate and interact in those environments. They need to capture not just the dynamics of the world, but the dynamics of its inhabitants, which involves also building realistic models of human behavior.
So now we have Nvidia, DeepMind, OpenAI and Runway all talking about world models, but somehow internet people still proclaim that they are all wrong.