Intel's Sandy Bridge, AVX Extensions On Track For Q4 2010
Sandy Bridge is a 'tock' in Intel's tick-tock model, meaning it's a new architecture delivered on an existing process technology. Many of the processor's characteristics will be familiar, such as the 256KB of L2 cache and the 8MB L3, but the chip has a few surprises up its sleeve. The GPU core will be integrated into the CPU die, unlike Clarkdale's approach of pairing a 32nm CPU die with a separate 45nm GPU die in one package, and the processor will support what Intel calls AVX (Advanced Vector Extensions). Intel reportedly projects that a Sandy Bridge CPU running x87 FPU code will be capable of up to 2GB/s of double-precision throughput per core. If the same workloads are rewritten to support AVX, theoretical maximum performance rises to 8GB/s per core (double precision), a fourfold increase. AVX doesn't just add new instructions; Intel claims it also streamlines the CPU's handling of older instructions, allowing them to execute more quickly as well.
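For a rough sense of where a fourfold figure can come from: a 256-bit AVX register holds four 64-bit doubles, so a single vector instruction can do the work of four scalar ones. The sketch below is only an illustration, not Intel sample code. The helper name `add_arrays` is our own invention, and we're assuming a compiler that provides `immintrin.h` and is invoked with AVX enabled (for example, `-mavx` on GCC). Without that flag, the code quietly falls back to the plain scalar loop.

```c
#include <stddef.h>
#ifdef __AVX__
#include <immintrin.h>  /* AVX intrinsics; only pulled in when AVX is enabled */
#endif

/* Hypothetical helper: adds two arrays of doubles element by element. */
void add_arrays(const double *a, const double *b, double *out, size_t n)
{
    size_t i = 0;
#ifdef __AVX__
    /* AVX path: each 256-bit register holds four doubles,
       so every pass through this loop handles four elements. */
    for (; i + 4 <= n; i += 4) {
        __m256d va = _mm256_loadu_pd(a + i);   /* unaligned load of 4 doubles */
        __m256d vb = _mm256_loadu_pd(b + i);
        _mm256_storeu_pd(out + i, _mm256_add_pd(va, vb));
    }
#endif
    /* Scalar path: mops up any leftover elements, or does all of the
       work when the build has no AVX support. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Calling `add_arrays(a, b, out, n)` gives the same results either way; the only difference is how many elements each instruction touches. Real-world gains depend on memory bandwidth and on how much of a workload can actually be vectorized, so the fourfold number is best read as a ceiling.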
If AVX is as potent as Intel claims, it could be the most important SIMD introduction since SSE2. For those of you who don't remember, the Pentium 4's launch performance, particularly its FPU performance, was quite weak compared to other products from both Intel and AMD. Back then, SSE itself was just creeping into the market; software support for the Pentium 4's SSE2 was almost nonexistent. As time passed, software vendors introduced new products that utilized SSE2, and the P4's comparative ranking began to change. Where a 2GHz P4 had once often been outperformed by a 1.4GHz Athlon with no SIMD support at all, it was now capable of dominating its rivals.*
If a significant share of Sandy Bridge's power efficiency comes from AVX, PC performance could rise considerably as vendors adopt the new extensions. Long-term, AVX could even have an impact on mobile and ultra-mobile devices. The fewer clock cycles it takes to perform a task, the more quickly a CPU can return to its power-saving idle mode. Intel has sunk a great deal of work into tackling power consumption by building ever more frugal processors, but there's definitely something to be said for attacking the problem from the other direction. We wouldn't expect AVX to debut on Atom anytime soon, but given time, it definitely could.
* There was always a certain argument that the P4 wouldn't have needed SSE2 so much if its performance in x87 code wasn't so weak. We acknowledge this, but 10 years later, it seems beside the point. ;)