This week, at ISSCC (International Solid-State Circuits Conference), Intel unveiled its next-generation Itanium processor, codenamed Poulson. This new eight-core processor is easily the most significant update to Itanium Intel has ever built and could upset the current balance of power at the highest end of the server / mainframe market. It may also be the Itanium that fully redeems the brand name and sheds the last vestiges of negativity that have dogged the chip since it launched ten years ago.
Here's the sneak peek
To discuss why, we'll have to flip through some history.
From Merced to Tukwila
Intel began work on what would become Itanium back in 1994 in a joint venture with HP. The two companies chose to pursue a design philosophy they termed EPIC (Explicitly Parallel Instruction Computing). As an EPIC processor, Itanium pursued a very different design philosophy compared to the Pentium Pro and the other out-of-order execution processors that followed it.
Instead of using dedicated CPU hardware to rearrange and optimally schedule instructions for execution (known as out-of-order execution, or OoOE), Itanium relies on the compiler to schedule and optimize code at compile time. This allowed the designers of Merced (the first-generation Itanium) to devote more die space to execution hardware, thus boosting theoretical performance. The weak link in the chain was the compiler itself. If it failed to detect and exploit instruction-level parallelism in the code, only a fraction of the CPU's execution units were in use at any given time.
Intel initially promised Itanium processors would debut in 1999; the first chips didn't actually hit the market until 2001. Things got worse from there: the two-year delay gave Itanium's competitors time to launch faster versions of their own chips, Itanium's much-touted 32-bit hardware compatibility was slow, and it quickly became apparent that then-modern compilers were not capable of delivering the degree of optimization Itanium required. Supporting applications, meanwhile, were few and far between.
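To make the compiler's burden concrete, here is a minimal sketch of static scheduling in Python. It is a toy model, not Intel's compiler or the actual Itanium bundle format: the function name `schedule`, the instruction names, and the dependency sets are all hypothetical. It greedily packs instructions whose inputs are already available into fixed-width issue groups, the way an EPIC compiler has to do ahead of time.

```python
def schedule(instrs, width):
    """Greedily pack instructions into issue groups of at most `width`.

    `instrs` is a list of (name, deps) pairs in program order, where
    `deps` is the set of instruction names that must finish first.
    Toy model only: every instruction is assumed to take one cycle.
    """
    done = set()            # instructions completed in earlier groups
    pending = list(instrs)
    groups = []
    while pending:
        group = []
        for name, deps in pending:
            if len(group) == width:
                break
            # Ready only if all inputs finished in an EARLIER group,
            # so members of one group never depend on each other.
            if deps <= done:
                group.append(name)
        if not group:
            raise ValueError("cyclic or unknown dependency")
        done.update(group)
        pending = [(n, d) for n, d in pending if n not in done]
        groups.append(group)
    return groups


def utilization(instrs, width):
    """Fraction of issue slots actually filled by the schedule."""
    groups = schedule(instrs, width)
    return len(instrs) / (len(groups) * width)


# Six independent instructions versus a six-long dependency chain.
independent = [(f"i{k}", set()) for k in range(6)]
chain = [("i0", set())] + [(f"i{k}", {f"i{k - 1}"}) for k in range(1, 6)]

print(utilization(independent, 6))
print(utilization(chain, 6))
```

Under these assumptions the independent instructions fill a single six-wide group, while the dependency chain needs six groups with one slot used in each, which is the "fraction of the execution units" problem in miniature.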
Merced, the First Generation Itanium Processor
This scarcity was logical given the chicken-and-egg dilemma of introducing a brand-new architecture, but it was ammunition for critics all the same. Most damaging of all was the way Itanium had been initially marketed. When Intel announced it was targeting a 1999 launch window, respected analysts were soon predicting that the chip's combination of 32-bit compatibility and advanced 64-bit execution would sweep the length and breadth of the x86 industry. The chip's initial weaknesses were significant, but its marketing was worse.
We're not kidding when we say the chip was badly misrepresented in its marketing. The predictions above were made by IDC. Original image courtesy of Wikipedia.
Over the last ten years, Intel has refreshed and updated the Itanium core multiple times. The last significant refresh, Tukwila, was built on a 65nm process with up to four cores and 24MB of L3 cache. With Poulson, Intel is leapfrogging 45nm entirely and moving Itanium to its cutting-edge 32nm process.
Poulson incorporates a number of advances in its record-breaking 3.1B (yes, billion) transistors. It's socket-compatible with the older Tukwila processors and offers up to eight cores and 54MB of on-die memory. It's assumed that Intel will eventually offer Poulson products with fewer than eight cores and/or with smaller amounts of available cache, but the company has announced no details on pricing or SKU structure.
Like Tukwila, Poulson shares a common platform with current Xeon 7500 products. Intel claims that the new chip delivers improved RAS (Reliability, Availability, Serviceability) features as compared to its predecessors and that it draws significantly less power than Tukwila would if the latter had been shrunk to a 32nm process.
The one thing Intel isn't discussing is what sort of performance boost Poulson can deliver relative to Tukwila. One of Poulson's most notable features is its doubled execution width. Up until now, all Itanium processors could only issue up to six instructions per clock cycle; Poulson boosts that to 12. In theory, Poulson's IPC (instructions per clock cycle) rate should be much higher than that of Tukwila when measured clock-for-clock.
Like all Itanium processors, however, Poulson relies much more heavily on the compiler's ability to schedule instructions for optimum execution than a standard x86 processor does. The degree of performance improvement over previous processors, therefore, will depend on whether the compiler can find enough independent instructions to take advantage of the architecture's increased issue width.
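A rough back-of-the-envelope model shows why the doubled issue width does not automatically double performance. The Python sketch below assumes, purely for illustration, that sustained IPC is capped by whichever is smaller: the issue width or the number of independent instructions the compiler can find per cycle. The function names and the sample parallelism values are hypothetical and are not Intel figures.

```python
def effective_ipc(issue_width, available_ilp):
    """Upper bound on instructions per clock in this toy model.

    The core cannot issue more than `issue_width` instructions per
    cycle, and the compiler cannot supply more than `available_ilp`
    independent instructions per cycle.
    """
    return min(issue_width, available_ilp)


def clock_for_clock_speedup(available_ilp, old_width=6, new_width=12):
    """Speedup of a 12-wide core over a 6-wide core at equal clocks."""
    return (effective_ipc(new_width, available_ilp)
            / effective_ipc(old_width, available_ilp))


# Hypothetical amounts of parallelism exposed by the compiler.
for ilp in (4, 8, 12):
    print(ilp, round(clock_for_clock_speedup(ilp), 2))
```

In this simplified model, code with four or fewer independent instructions per cycle gains nothing from the wider core, and the full 2x gain appears only when the compiler can keep all 12 slots fed.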
Even with this caveat, Poulson should offer an increase in both absolute performance and performance-per-watt when compared to previous Itanium processors. Longer term, we might even see Itanium edging slightly beyond its current niche market status. Itanium's raw performance has never been in doubt when compared to conventional x86 processors using properly optimized code. Given sufficiently intelligent compilers, Itanium could begin to make economic sense in fields that couldn't previously justify the high cost of optimizing for the chip.