As far as I understand, apart from the synthetic data, DeepSeek bought their training data, so it is probably more biased towards English and Chinese. But the model is open source and can be fine-tuned on Russian data relatively easily. It seems like Chinese models need to improve their performance in other languages, Russian for example.