I'm going to have to try Manus out. This tool looks pretty good. If it is as good at its task and as cheap as it claims, then I need to get in on it.
I mean, the limitation for their team right now is probably inference resources. I assume that now they are popular, they will have no issue securing more compute.
My guess is Manus is an automated/autonomous orchestration layer that coordinates a series of worker models, with the orchestration layer itself trained with RL+Zero for end-to-end agentic ability.
So its orchestration isn't hardcoded but dynamic in real time: probably a hybrid model that makes state-based decisions, with a transformer/LLM backbone that allows high-level, language-based intent delegation etc. -> feedback to itself.
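To make that guess a bit more concrete, here is a rough Python sketch of what a dynamic (not hardcoded) orchestration loop could look like. This is not Manus's actual design. The worker names, the `State` class and the `policy` rule are all hypothetical stand-ins, and the rule-based `policy` is only a placeholder for whatever learned LLM/RL component would sit there.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


# Hypothetical workers: plain functions standing in for specialised models.
def search_worker(task: str) -> str:
    return f"notes on {task}"


def code_worker(task: str) -> str:
    return f"script for {task}"


WORKERS: Dict[str, Callable[[str], str]] = {
    "search": search_worker,
    "code": code_worker,
}


@dataclass
class State:
    goal: str
    # (worker name, result) pairs produced so far
    history: List[Tuple[str, str]] = field(default_factory=list)


def policy(state: State) -> Optional[str]:
    """Placeholder for the learned orchestrator.

    It picks the next worker from the current state. Here it is a trivial
    rule (use each worker once); the assumption is that a real system would
    query a trained model at this point instead.
    """
    used = {name for name, _ in state.history}
    for name in WORKERS:
        if name not in used:
            return name
    return None  # nothing left to delegate, so stop


def run(goal: str, max_steps: int = 10) -> State:
    state = State(goal)
    for _ in range(max_steps):
        choice = policy(state)
        if choice is None:
            break
        result = WORKERS[choice](state.goal)
        # Feedback to itself: the result becomes part of the next state.
        state.history.append((choice, result))
    return state
```

The point is only the shape of the loop: the next step is chosen from the current state on every iteration rather than read off a fixed pipeline, e.g. `run("flight prices").history` holds whatever sequence of worker calls the policy ended up choosing.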
Basically a general-purpose end-to-end AI that can do for computer screens what Tesla FSD did for self-driving, with video as the only modality and cameras as the only sensors.
From Photons IN -> Controls OUT
to
Pixels IN -> Actions OUT
Except, as they pointed out, it's actually headless: it doesn't need a physical monitor or screen and can work off a virtual or emulated display.
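The "Pixels IN -> Actions OUT" idea on a headless display can be sketched as a capture/decide/act loop over an in-memory framebuffer. Everything below is an assumption for illustration: `VirtualDisplay`, `Click` and the brightest-pixel `policy` are made-up names, and the policy is a toy stand-in for a learned vision model, not anything Manus has described.

```python
from dataclasses import dataclass
from typing import List, Optional

Frame = List[List[int]]  # grayscale pixels, indexed as frame[y][x]


@dataclass
class Click:
    x: int
    y: int


class VirtualDisplay:
    """In-memory framebuffer: no physical monitor involved."""

    def __init__(self, width: int, height: int) -> None:
        self.pixels: Frame = [[0] * width for _ in range(height)]

    def draw_button(self, x: int, y: int) -> None:
        self.pixels[y][x] = 255  # a "button" is just a bright pixel here

    def capture(self) -> Frame:
        return [row[:] for row in self.pixels]  # copy, like a screenshot

    def apply(self, action: Click) -> None:
        self.pixels[action.y][action.x] = 0  # clicking a button clears it


def policy(frame: Frame) -> Optional[Click]:
    """Toy stand-in for a vision policy: click the brightest pixel, if any."""
    best: Optional[Click] = None
    best_val = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value > best_val:
                best, best_val = Click(x, y), value
    return best


def run(display: VirtualDisplay, max_steps: int = 100) -> List[Click]:
    actions: List[Click] = []
    for _ in range(max_steps):
        action = policy(display.capture())  # pixels in
        if action is None:
            break
        display.apply(action)               # actions out
        actions.append(action)
    return actions
```

Nothing in the loop cares whether the frame came from a real screen or an emulated one, which is the whole appeal of running it headless.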
Once this process becomes refined enough, and with AI still scaling in intelligence, it won't be long before entire categories of jobs that are now done on a laptop or in front of a screen are replaced by AI.