Artificial general intelligence won't emerge from LLMs alone. These models are better thought of as giant simulations, where the network learns to recognize and produce the next state of a given generative system from its initial parameters. That doesn't mean they aren't game-changing - there is great power in prediction - but more than that is needed to realize artificial general intelligence: you have to actually interact with the environment, not just generalize from existing data.
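To make the "simulator" framing concrete, here's a toy sketch in plain NumPy - nothing to do with any real LLM, and the dynamical system is my own arbitrary choice. A model trained only on next-state prediction can roll a system forward from initial conditions, but it never acts on the system it imitates:

```python
# Toy illustration of prediction-as-simulation: fit a next-state predictor
# on observed transitions, then roll it forward autoregressively.
# Purely illustrative; the "system" and fitting method are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth generative system: a damped 2D rotation, x_{t+1} = A @ x_t.
theta, damping = 0.1, 0.99
A_true = damping * np.array([[np.cos(theta), -np.sin(theta)],
                             [np.sin(theta),  np.cos(theta)]])

# Collect (state, next state) pairs from random initial conditions.
states = rng.normal(size=(1000, 2))
targets = states @ A_true.T

# "Training": least-squares fit of a next-state predictor (stand-in for an LLM).
solution, *_ = np.linalg.lstsq(states, targets, rcond=None)
A_learned = solution.T

# Autoregressive rollout: feed the model's own predictions back in.
x = np.array([1.0, 0.0])          # initial parameters of the "simulation"
for _ in range(50):
    x = A_learned @ x             # next-state prediction, repeated
print("state after 50 predicted steps:", x)
```

The point of the toy: everything the model does is downstream of data it was shown; at no step does it probe or perturb the system it is imitating.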
I don't disagree that we're getting closer, though, and yes, that this is the reason the West has been hell-bent on denying China - and anyone else that isn't a Western vassal - access to the hardware foundations of this technology. But I also don't think geniuses are what will win this; the brute-force approach of current AI research suggests that R&D infrastructure is what will determine the winner. It's not about mathematical theories, because the level of complexity is already beyond what mathematicians can analyze. The ability to rapidly prototype, scale, and iterate on compute infrastructure matters much more.
While the idea that AGI won't emerge from language models alone has merit, recent developments suggest we're closer than many realize. The path forward isn't just about larger models, but about smarter integration of multimodal capabilities.
Take, for instance, the recent release of Qwen2-VL by Alibaba. This model, in a variant with a mere 2 billion parameters, can process and understand 20-minute videos, engaging in meaningful dialogue about their content. What's more, Alibaba is already working on an "Omni" version that will incorporate audio alongside vision and language, with applications ranging from virtual NPCs to physical robots.
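For context, here is roughly what driving that kind of video dialogue looks like today - a minimal sketch assuming the Hugging Face Transformers Qwen2-VL integration and the separate qwen-vl-utils helper package; the model id and video path are placeholders, and the exact API may have shifted since writing:

```python
# Minimal sketch: asking the 2B Qwen2-VL model about a local video via
# Hugging Face Transformers. Assumes transformers with the Qwen2-VL
# integration plus the `qwen-vl-utils` package; model id and video path
# are illustrative assumptions.
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "Qwen/Qwen2-VL-2B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

messages = [{
    "role": "user",
    "content": [
        {"type": "video", "video": "file:///path/to/clip.mp4"},  # placeholder path
        {"type": "text", "text": "Summarize what happens in this video."},
    ],
}]

# Build the chat prompt and extract the sampled video frames.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, videos=video_inputs,
                   padding=True, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Strip the prompt tokens before decoding the reply.
reply_ids = output_ids[:, inputs.input_ids.shape[1]:]
print(processor.batch_decode(reply_ids, skip_special_tokens=True)[0])
```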
This rapid progress in multimodal AI suggests that the current attention-based transformer architecture, when extended to take in and produce multiple sensory modalities, may indeed be sufficient to support a rudimentary theory of mind. We're not just predicting next tokens anymore; we're creating systems that can perceive, understand, and interact with complex environments.
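The mechanical point is simple: once every modality is projected into the same token space, one attention stack handles all of them. A toy sketch - the dimensions and modality set here are my own assumptions, not any particular model's layout:

```python
# Minimal "extend the same transformer to more senses" sketch: each modality
# is projected into a shared embedding space and the sequences are
# concatenated before standard self-attention.
import torch
import torch.nn as nn

class MultimodalFusion(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.text_proj   = nn.Embedding(32_000, d_model)   # token ids -> embeddings
        self.vision_proj = nn.Linear(768, d_model)          # patch features -> embeddings
        self.audio_proj  = nn.Linear(128, d_model)          # audio frames -> embeddings
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, text_ids, vision_patches, audio_frames):
        tokens = torch.cat([
            self.text_proj(text_ids),
            self.vision_proj(vision_patches),
            self.audio_proj(audio_frames),
        ], dim=1)                      # one interleaved sequence, one attention stack
        return self.encoder(tokens)

model = MultimodalFusion()
out = model(torch.randint(0, 32_000, (1, 16)),   # 16 text tokens
            torch.randn(1, 49, 768),             # 49 vision patches
            torch.randn(1, 20, 128))             # 20 audio frames
print(out.shape)  # torch.Size([1, 85, 256])
```

Real systems add positional and modality-specific encodings, causal masking, and decoder heads per output modality, but the fusion step itself is no more exotic than this.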
I posit that we already have the fundamental technology needed for AGI. What we need now is to apply these technologies in innovative ways, particularly in developing what I call an "Omni-human model." This model would extend current multimodal systems to include proprioception, motor control, and real-time environmental interaction.
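To be concrete about what "Omni-human" means at the interface level, here is a hypothetical skeleton - every name and field shape is my own assumption, offered only to show the shape of the loop:

```python
# Hypothetical interface for the "Omni-human model" idea: one policy that
# consumes vision, language, and proprioception and emits both motor
# commands and speech. All names and shapes are assumptions for illustration.
from dataclasses import dataclass
import numpy as np

@dataclass
class Observation:
    rgb: np.ndarray             # (H, W, 3) camera frame
    proprioception: np.ndarray  # joint angles and velocities
    text: str                   # latest instruction or dialogue turn

@dataclass
class Action:
    joint_targets: np.ndarray   # motor control output
    utterance: str              # language output

class OmniHumanPolicy:
    def act(self, obs: Observation) -> Action:
        # Placeholder policy: hold the current pose, echo a canned reply.
        return Action(joint_targets=obs.proprioception[:7].copy(),
                      utterance="Acknowledged: " + obs.text)

# One perceive -> act step with dummy inputs; a real system would run this
# loop against a simulator or robot at control frequency.
obs = Observation(rgb=np.zeros((224, 224, 3), dtype=np.uint8),
                  proprioception=np.zeros(14),
                  text="Wave to the camera.")
print(OmniHumanPolicy().act(obs).utterance)
```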
Imagine an AI that combines the reasoning capabilities of large language models, the perceptual understanding of vision-language models like Qwen2-VL, and the ability to learn from and interact with its environment in real time. By implementing differentiable inverse kinematics solvers and advanced reinforcement learning frameworks within this multimodal architecture, we could create AI systems that don't just predict, but actively engage with their surroundings.
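The differentiable-IK piece, at least, is not exotic. A minimal sketch for a 2-link planar arm, with the link lengths and target point chosen arbitrarily:

```python
# Differentiable inverse kinematics sketch: forward kinematics for a 2-link
# planar arm written in PyTorch, with joint angles recovered by gradient
# descent on end-effector error. Values are arbitrary assumptions.
import torch

L1, L2 = 1.0, 0.8                     # link lengths
target = torch.tensor([1.2, 0.7])     # desired end-effector position

def forward_kinematics(q):
    x = L1 * torch.cos(q[0]) + L2 * torch.cos(q[0] + q[1])
    y = L1 * torch.sin(q[0]) + L2 * torch.sin(q[0] + q[1])
    return torch.stack([x, y])

q = torch.zeros(2, requires_grad=True)            # joint angles to solve for
optimizer = torch.optim.Adam([q], lr=0.05)

for step in range(500):
    optimizer.zero_grad()
    loss = torch.sum((forward_kinematics(q) - target) ** 2)
    loss.backward()                               # gradients flow through FK
    optimizer.step()

print("joint angles:", q.detach(), "reached:", forward_kinematics(q).detach())
```

Because the whole chain is differentiable, the same error signal can in principle be backpropagated into whatever upstream network proposed the motion, which is the hook that lets motor control live inside the same training loop as perception and language.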
This Omni-human model could demonstrate unprecedented behavioral consistency, adaptability, and natural interactions in complex virtual environments. And while my focus is primarily on virtual worlds, the principles could easily extend to physical robots, bridging the gap between digital and physical realms.
The key here isn't just raw computing power or mathematical theories. It's about creating a holistic system that mirrors the human mind-body connection. With the rapid advancements we're seeing in multimodal AI, I believe we're on the cusp of achieving AGI through this integrated approach.
In essence, while traditional language models alone may not lead to AGI, the extension of these models into truly multimodal, embodied systems - as exemplified by Qwen2-VL and the proposed Omni-human model - could be the key to unlocking artificial general intelligence in the near future. We have the tools; now it's time to assemble them in the right way.