Alibaba released its reasoning model for Qwen.
Gave it a spin against DeepSeek's R1. Used thinking and web search for both models.
Just a simple vibe question. I gave them a specific sum from 2018 and asked them to adjust it for inflation to "today". I purposely did not give a date, to see whether they would work out that today is Feb 2025.
DeepSeek failed miserably: it assumed today was Oct 2023, so its inflation-adjusted answer was obviously wrong.
By contrast, Qwen got the date right (Feb 2025) and also approximated the inflation adjustment as far as it could. Its final answer was very close to what I calculated. Moreover, DeepSeek was very slow, whereas Qwen was much faster (but still not as fast as it should be). I really like DeepSeek, but clearly it should not rest on its laurels. The other Chinese labs are catching up fast.
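For anyone who wants to sanity-check this kind of answer, here is a minimal sketch of the usual price-index ratio adjustment. The function name and the index values are hypothetical placeholders, not real CPI data and not necessarily the exact method used above; swap in published index values for the base year and for the actual current month.

```python
def adjust_for_inflation(amount: float, cpi_base: float, cpi_target: float) -> float:
    """Scale an amount by the ratio of two price-index values."""
    if cpi_base <= 0 or cpi_target <= 0:
        raise ValueError("index values must be positive")
    return amount * cpi_target / cpi_base


# Placeholder index values for illustration only (NOT real CPI data).
CPI_2018 = 100.0
CPI_TODAY = 125.0

adjusted = adjust_for_inflation(1000.0, CPI_2018, CPI_TODAY)
print(f"{adjusted:.2f}")  # prints 1250.00
```

This is why the assumed date matters: a model that thinks "today" is Oct 2023 picks the wrong target index, so the final figure is off even if the arithmetic is right.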