Relatively few smart people are needed at the beginning, and then the AI takes over, with some human input here and there. Even data annotation itself, which is the most labour-intensive process, is increasingly being automated nowadays.
Indeed, but what I would say to you is that it’s the quality of the human that matters now. Without humans it’s just AI training AI, and ultimately those AIs were based on human annotations all the way back.
Nowadays it’s the experts who add the key annotations and train the lower-cost annotators, so in the end it’s all down to your education system.
Just to give you an idea, OpenAI has just 375 employees. If you consider their different products and administration costs, we could generously assume a maximum of 100 people are involved in their language model projects.
Of course they hired external contractors, but I wouldn't expect that number to be something crazy like 10,000. They would need their core data annotation team (20 people?) to supervise the contractors in order to ensure a consistently high-quality annotated dataset. In general, human data annotation, or supervised learning, quickly runs into big problems when the time comes to scale.