Intel's Eight-Core, Heavily-Updated Poulson Itanium Breaks Cover, Heads To Market

Intel's Itanium has spent the past year in an unwelcome spotlight. The legal war between HP and Oracle, over whether the latter was obligated to support HP's servers after publicly promising to do so, dragged Intel's Itanium roadmap into public view. Ultimately, the judge found that Oracle had to live up to its contractual obligations and concluded that the case was brought for personal reasons, but the damage to HP was done. Disclosures that came to light during the trial indicated that HP had paid Intel a sizeable sum of money to continue developing Itanium past the point when Intel would otherwise have canceled the project.

There's nothing unusual about paying a company to build a processor, but under the circumstances, you might think this latest Itanium would be a simple refresh of the original microarchitecture on a smaller process node. It isn't. The Intel Itanium 9500 family, codenamed Poulson, is the most significant refresh Intel has ever done on Itanium. Just moving from 65nm to 32nm technology would've substantially reduced power consumption and increased clock speeds, but Santa Clara has overhauled virtually every aspect of the CPU.

Poulson can issue 11 instructions per cycle compared to Tukwila's six. It adds execution units and rebalances them to favor server workloads over HPC and workstation use. Its multi-threading capabilities have been overhauled, and it uses faster QPI links between CPUs. Poulson also does away with the global stalls that plagued earlier Itanium processors such as Tukwila, so dependencies and cache misses are less detrimental than they used to be.

The L3 cache design has also changed. Previous Itanium 9300 processors had a dedicated L3 cache for each core. Poulson, in contrast, has a unified L3 that's attached to all its cores by a common ring bus. Unlike the x86 design, however, Poulson's ring bus is bi-directional -- total L3 cache bandwidth is estimated to be 700GB/s or higher.
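One reason a bi-directional ring helps is that a request can travel whichever way around the ring is shorter. The following is a minimal sketch of that effect, not a model of Poulson itself: it assumes eight evenly spaced ring stops (one per core, a simplification, since the article does not describe the actual ring topology), and the function names are hypothetical.

```python
def hops_unidirectional(src: int, dst: int, stops: int) -> int:
    """Hops when traffic can only travel one way around the ring."""
    return (dst - src) % stops


def hops_bidirectional(src: int, dst: int, stops: int) -> int:
    """Hops when traffic may take the shorter of the two directions."""
    forward = (dst - src) % stops
    return min(forward, stops - forward)


def average_hops(hop_fn, stops: int) -> float:
    """Mean hop count over every ordered pair of distinct ring stops."""
    pairs = [(s, d) for s in range(stops) for d in range(stops) if s != d]
    return sum(hop_fn(s, d, stops) for s, d in pairs) / len(pairs)


STOPS = 8  # assumption: one ring stop per core
uni = average_hops(hops_unidirectional, STOPS)
bi = average_hops(hops_bidirectional, STOPS)
print(f"unidirectional: {uni:.2f} hops, bidirectional: {bi:.2f} hops")
```

Under these assumptions the average trip drops from 4 hops to roughly 2.3. Real bandwidth also depends on link width, clock speed, and arbitration, none of which this sketch attempts to model.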

Itanium's original design was driven by the philosophy that software compilers should do all the heavy lifting when it came to optimization, parallelization, and scheduling. The Intel and HP engineers who worked on the project didn't think out-of-order execution would scale particularly well on x86 processors, so they built what they believed would be a high-performance microarchitecture that would scale far more effectively. In hindsight, out-of-order execution (OOoE) designs outperformed these expectations and left Itanium high and dry.
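To make the compiler-does-the-scheduling idea concrete, here is a toy sketch of compile-time scheduling: a greedy pass that packs instructions into groups that can issue together, using only dependency information known before the program runs. It is an illustration under stated assumptions, not Intel's or HP's compiler. The instruction names, the `schedule_bundles` helper, and the issue widths are all hypothetical.

```python
from typing import Dict, List, Set


def schedule_bundles(deps: Dict[str, Set[str]], width: int) -> List[List[str]]:
    """Greedy compile-time scheduling.

    `deps` maps each instruction to the instructions whose results it needs.
    Each returned group holds at most `width` instructions that do not
    depend on one another and could therefore issue in the same cycle.
    """
    remaining = dict(deps)   # instructions not yet placed in a group
    done: Set[str] = set()   # instructions placed in earlier groups
    groups: List[List[str]] = []
    while remaining:
        # Ready means every input was produced by an earlier group.
        ready = [op for op, needs in remaining.items() if needs <= done]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        group = ready[:width]
        groups.append(group)
        done.update(group)
        for op in group:
            del remaining[op]
    return groups


# Hypothetical six-instruction program: (a + b) * c, then store the result.
program = {
    "load_a": set(),
    "load_b": set(),
    "add": {"load_a", "load_b"},
    "load_c": set(),
    "mul": {"add", "load_c"},
    "store": {"mul"},
}

wide = schedule_bundles(program, width=3)
narrow = schedule_bundles(program, width=1)
print(len(wide), "groups at width 3;", len(narrow), "groups at width 1")
```

In this example the three loads share one group, so the wide schedule needs four groups where the one-at-a-time schedule needs six. The catch is that the schedule is fixed at compile time: if a load misses the cache, a static schedule cannot reorder around it the way out-of-order hardware can.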

It's not clear whether Itanium has a long-term future. HP is the CPU's only major customer, and HP has stated that Oracle's shenanigans and FUD-spreading did significant damage to its Itanium server business. Even taken with a grain of salt, however, the Itanium 9500's performance figures are impressive. This could be the chip that breaks free of the "Itanic" moniker and establishes IA-64 as a healthy competitor for IBM's POWER7.