You have a point, to a degree. Desktop CPU clock rates have barely improved in the last 15 years, which has largely put a brake on single-thread performance, outside of architectural improvements that extract more instruction-level parallelism, together with better OoO engines and caches. It's important to understand that the latter have been helped by node improvements: higher transistor density and lower power consumption mean new processors can afford more registers, larger internal buffers, etc. That suggests the improvement in processing speed we experience comes from tweaks to the circuit design rather than from the manufacturing methods themselves.
So the 45/22/3nm process doesn't give the edge and the improvement by itself - the CPU and compiler design does.
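To make the ILP point concrete, here's a minimal sketch in plain C (the function names, the 4-way unroll factor, and the array size are mine, purely illustrative). Both loops do the same number of floating-point adds, but the first is one long dependency chain while the second gives the OoO engine four independent chains to keep in flight, so it typically runs several times faster at the same clock. Compile with -O2 but without -ffast-math, so the compiler doesn't reassociate the adds itself:

    #include <stdio.h>
    #include <time.h>

    #define N 100000000L

    static double a[1024];

    /* One long dependency chain: every add waits on the previous result,
     * so the loop is bound by FP-add latency, no matter how many ALUs
     * the core has. */
    static double chained(void) {
        double s = 0.0;
        for (long i = 0; i < N; i++)
            s += a[i & 1023];
        return s;
    }

    /* Same work split into four independent chains: the OoO engine can
     * keep several adds in flight at once, so throughput approaches the
     * number of FP ports instead of being latency-bound. */
    static double unrolled(void) {
        double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        for (long i = 0; i < N; i += 4) {
            s0 += a[(i + 0) & 1023];
            s1 += a[(i + 1) & 1023];
            s2 += a[(i + 2) & 1023];
            s3 += a[(i + 3) & 1023];
        }
        return s0 + s1 + s2 + s3;
    }

    int main(void) {
        for (int i = 0; i < 1024; i++)
            a[i] = i * 0.001;
        clock_t t0 = clock();
        double r1 = chained();
        clock_t t1 = clock();
        double r2 = unrolled();
        clock_t t2 = clock();
        printf("chained:  sum=%g  %.2fs\n", r1, (double)(t1 - t0) / CLOCKS_PER_SEC);
        printf("unrolled: sum=%g  %.2fs\n", r2, (double)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }

Same clock, same instruction count - the difference is purely in how much parallelism the microarchitecture can extract, which is exactly where the gains of the last 15 years have come from.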
At the same time, we've seen substantial improvements in aggregate performance and perf/watt. Problems that parallelize well continue to be solved faster and with less energy with each new generation of CPUs. This is nowhere more apparent than in the mobile device arena. Yet we remain many orders of magnitude above the theoretical limit in perf/watt. It is well understood that large problem sizes facilitate parallelization, and in this age of data abundance there appears to be no shortage of problems solvable by throwing ever larger parallel computers at them. Don't get me wrong: single thread is king. All else equal, one 100GHz CPU is better than ten 10GHz CPUs. But while we have hit a limit on how far we can innovate with the former, we have not yet hit one with the latter.
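To put a number on "single thread is king": Amdahl's law gives the speedup on n cores as 1 / (s + (1 - s)/n), where s is the serial fraction of the workload. A quick sketch (plain C, values chosen for illustration) comparing the hypothetical 100GHz core against ten 10GHz ones:

    #include <stdio.h>

    /* Amdahl's law: speedup on n cores for a workload whose
     * non-parallelizable (serial) fraction is s. */
    static double amdahl(double s, double n) {
        return 1.0 / (s + (1.0 - s) / n);
    }

    int main(void) {
        /* One 100GHz core is a flat 10x over a 10GHz baseline,
         * regardless of how parallel the workload is. */
        double single_fast = 10.0;

        /* Ten 10GHz cores only reach 10x when s is essentially zero. */
        double fracs[] = {0.0, 0.01, 0.05, 0.10, 0.50};
        for (int i = 0; i < 5; i++) {
            double s = fracs[i];
            printf("serial fraction %.2f: ten cores -> %.2fx, one fast core -> %.1fx\n",
                   s, amdahl(s, 10.0), single_fast);
        }
        return 0;
    }

Even a 5% serial fraction caps the ten cores at about 6.9x, while the single fast core delivers its 10x on everything - which is why large, highly parallel problem sizes matter so much for the multicore route.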
Once we approach the limit in how small we can make transistors on silicon - and we still have quite a bit to go - innovation might continue in the direction of HEMTs (high-electron-mobility transistors). I am thinking of something like the failed GaAs-based Cray-3 and Cray-4 supercomputers. We might also start seeing more fixed-function analog computing blocks, for example ones that compute the Fourier transform with optical (OFT) methods.
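For a sense of what such a fixed-function block would offload, here is the digital computation it replaces: a direct O(N^2) DFT, which an optical setup computes in effectively one shot via lenses and interference, at analog precision. A minimal sketch in C (the tiny N and the test signal are mine, for illustration only):

    #include <stdio.h>
    #include <math.h>

    #define N  8  /* tiny illustrative transform size */
    #define PI 3.14159265358979323846

    /* Direct O(N^2) DFT: X[k] = sum_n x[n] * e^(-2*pi*i*k*n/N).
     * This is the multiply-accumulate work a fixed-function analog/optical
     * FT block would perform in a single pass. */
    static void dft(const double *x, double *re, double *im) {
        for (int k = 0; k < N; k++) {
            re[k] = im[k] = 0.0;
            for (int n = 0; n < N; n++) {
                double w = -2.0 * PI * k * n / N;
                re[k] += x[n] * cos(w);
                im[k] += x[n] * sin(w);
            }
        }
    }

    int main(void) {
        double x[N], re[N], im[N];
        for (int n = 0; n < N; n++)
            x[n] = sin(2.0 * PI * n / N);  /* one full cycle: energy at k=1 */
        dft(x, re, im);
        for (int k = 0; k < N; k++)
            printf("X[%d] = %6.3f %+6.3fi\n", k, re[k], im[k]);
        return 0;
    }

The trade-off is the usual analog one: the optical path gives you the transform essentially for free in time and energy, but at limited dynamic range and precision compared to the digital version above.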