I've been saying this since last year or so. Maybe 1~3% of the internet (and that's me being generous) is pristine data; the rest could just as well be seen as synthetic brain rot.
Don't these stupendous LLM companies have data from CHYNA? I mean, why do they need Chinese data when they already have access to the true quality data provider, which is India?
just api though, not open weights afaik
Alibaba released a longer context length version of Qwen-2.5.