So we finally have a METR estimate of Mythos.
View attachment 174646
FYI, Mythos seems to have been trained specifically to be good at software. On non-software benchmarks it didn't do much better than GPT5.5, except on HLE (likely because Mythos is a much bigger model).
Dario himself said that he expects Chinese labs to reach this capability in 6-12 months, so the rest of us shouldn't have to wait very long. There are also continual algorithmic improvements, so it's not clear you'll really need a model as big as Mythos to reach the same capability a year from now.