NVIDIA VP Declares CPU Scaling, Moore's Law, Dead

Bill Dally, chief scientist at NVIDIA, has written an article at Forbes alleging that traditional CPU scaling and Moore's Law are dead, and that parallel computing is the only way to maintain historic performance scaling. With six-core processors now available for $300, Dally's remarks are certainly timely, but his conclusions are a bit premature.

Will The Real Moore's Law Please Stand Up And/Or Die Already?


(Image: Moore's original representation of his now-famous law.)

Dally claims Moore's Law is dead because "CPU performance no longer doubles every 18 months." This is little more than a straw man; Moore's Law states that the number of transistors that can be built on a chip for minimal cost doubles roughly every two years. The assumption that there's a 1:1 correlation between additional transistors and additional performance neglects significant sections of Moore's work.

Dally's larger point is that we've reached the effective limit of serial computing and must switch to parallel computing, a.k.a. GPU computing, to return to historical rates of performance growth. He writes: "Every three years we can increase the number of transistors (and cores) by a factor of four. By running each core slightly slower, and hence more efficiently, we can more than triple performance at the same total power. This approach returns us to near historical scaling of computing performance."
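Whether that arithmetic holds depends heavily on how much capacitance and supply voltage keep shrinking with each process generation. Here's a rough back-of-the-envelope sketch of Dally's claim using the standard dynamic-power model (power per core scales roughly with capacitance times voltage squared times frequency); the per-generation capacitance, voltage, and frequency figures below are our own illustrative assumptions, not numbers from Dally's article.

```python
# Back-of-envelope check of the "4x cores, slightly slower, same power" claim,
# using the standard dynamic power model: P ~ C * V^2 * f per core.
# The capacitance/voltage/frequency figures are illustrative assumptions,
# not numbers taken from Dally's article.

def relative_power(cores, cap, volt, freq):
    """Total dynamic power relative to a 1-core baseline with C = V = f = 1."""
    return cores * cap * volt**2 * freq

def relative_throughput(cores, freq):
    """Idealized throughput, assuming work spreads perfectly across all cores."""
    return cores * freq

# After ~3 years (two process shrinks): 4x the cores, each with roughly half
# the switched capacitance, at 80% of the voltage, clocked at 78% of the
# original frequency ("slightly slower").
cores, cap, volt, freq = 4, 0.50, 0.80, 0.78

print(f"power vs. baseline:      {relative_power(cores, cap, volt, freq):.2f}x")
print(f"throughput vs. baseline: {relative_throughput(cores, freq):.2f}x")
# power vs. baseline:      1.00x
# throughput vs. baseline: 3.12x
```

With those assumptions the numbers do land a bit above 3x at the same power, but note how much of the gain rides on voltage and capacitance scaling rather than on the extra cores themselves, and on the assumption that software keeps all four cores busy.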

Dally characterizes modern multi-core CPU designs as akin to "putting wings on a train," and claims conventional CPUs consume too much power per instruction executed to continue scaling at historical levels. Switching to parallel computing will be difficult thanks to entrenched standard practices, the sheer number of serial programs that need to be converted, and a scarcity of programmers trained in parallel programming techniques, but in his view, it's the only solution. We're not so optimistic.

The Myth of Hard Work

Dally's explanation of the current state of parallel programming presupposes that the only thing standing between us and vast multicore processor arrays is hard work and funding. This is oversimplified almost to the point of being disingenuous. First, not all programs can be parallelized, and second, parallelism itself inevitably hits a point of diminishing marginal return. GPU computing, with its vast banks of processors, excels precisely in those rare cases where programs can scale almost linearly to take advantage of more processing cores.
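To put a number on that diminishing return, consider Amdahl's Law: the fraction of a program that must run serially caps its overall speedup no matter how many cores are available. The serial fractions in this sketch are illustrative, not measurements of any particular workload.

```python
# Amdahl's Law: speedup = 1 / (serial + (1 - serial) / cores).
# The serial fractions below are illustrative, not measured workloads.

def amdahl_speedup(serial_fraction, cores):
    """Best-case speedup for a program with the given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

core_counts = (4, 16, 64, 1024)
for serial in (0.05, 0.25, 0.50):
    row = ", ".join(f"{n} cores: {amdahl_speedup(serial, n):.1f}x"
                    for n in core_counts)
    print(f"{serial:.0%} serial -> {row}")
# 5% serial -> 4 cores: 3.5x, 16 cores: 9.1x, 64 cores: 15.4x, 1024 cores: 19.6x
# 25% serial -> 4 cores: 2.3x, 16 cores: 3.4x, 64 cores: 3.8x, 1024 cores: 4.0x
# 50% serial -> 4 cores: 1.6x, 16 cores: 1.9x, 64 cores: 2.0x, 1024 cores: 2.0x
```

Even a program that is 95 percent parallel tops out around 20x no matter how many cores you throw at it, which is exactly why GPU computing shines only on workloads with almost no serial component.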

Taking advantage of parallel cores in consumer products in the manner Dally suggests would require us to reinvent the wheel, almost literally. If the goal is to use small, simple processors, compilers would have to be designed from the ground up to handle the complexity of spreading threads across dozens of tiny cores. We can't accurately predict what performance might look like on these sorts of systems because we haven't even invented the tools we'd need to build them.

AMD and Intel seem to collectively have a better idea. Technologies like Turbo Boost, which increases the speed of one or two cores while turning the others off, provide performance the consumer can take advantage of immediately. It's currently thought that Moore's Law will hit an immutable barrier sometime around 2021, but it still has quite a ways to go. No one is denying the tremendous performance of GPU computing in the right areas, but Dally's report of the death of Moore's Law is greatly exaggerated.