Intel Core i7 Processors: Nehalem and X58 Have Arrived
Intel Nehalem Microarchitecture
Intel has said that Nehalem represents the biggest platform architecture change in the company's history to date. While this new platform does usher in some big changes, the Core i7 processors are not entirely new and borrow heavily from Penryn.
Monolithic Quad-Core:
The Core i7 does differ from its predecessors in a number of ways, however. For one, Core i7 processors feature a monolithic die that houses all four execution cores. If you remember, all of Intel's previous quad-core offerings were built by linking two dual-core dies on a single package. Intel's 45nm high-k metal gate manufacturing process will be used to produce all initial Core i7 processors.
Integrated Memory Controller:
In perhaps the biggest change to the CPU's design, Core i7 processors feature an on-die DDR3 memory controller that supports three channels of DDR3 memory per socket, with up to three DIMMs per channel. The memory controller for Intel's previous desktop processors was always integrated into the Northbridge chip, which is part of the core logic chipset. By moving the memory controller on-die and increasing the number of channels, Core i7 processors offer significantly more bandwidth than their predecessors, and lower latency as well.
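To put the bandwidth claim in rough numbers, here is a minimal sketch of the peak theoretical arithmetic for a triple-channel controller. The DDR3-1066 transfer rate and the 64-bit (8-byte) channel width are our assumptions for illustration, not figures stated above; faster memory scales the result proportionally, and real-world throughput will be lower than the theoretical peak.

```python
# Peak theoretical memory bandwidth for a multi-channel DDR3 controller.
# Assumptions (not stated in the article): DDR3-1066 memory and a
# 64-bit (8-byte) data path per channel.
CHANNELS = 3
BYTES_PER_TRANSFER = 8
TRANSFERS_PER_SECOND = 1066e6


def peak_memory_bandwidth_gb_per_s(
    channels=CHANNELS,
    transfers_per_second=TRANSFERS_PER_SECOND,
    bytes_per_transfer=BYTES_PER_TRANSFER,
):
    """Return peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * bytes_per_transfer * transfers_per_second / 1e9


def max_dimms_per_socket(channels=CHANNELS, dimms_per_channel=3):
    """Return the DIMM count implied by channels x DIMMs per channel."""
    return channels * dimms_per_channel


if __name__ == "__main__":
    print(f"Triple channel: {peak_memory_bandwidth_gb_per_s():.1f} GB/s")
    print(f"Single channel: {peak_memory_bandwidth_gb_per_s(channels=1):.1f} GB/s")
    print(f"Max DIMMs per socket: {max_dimms_per_socket()}")
```

Under these assumptions, three channels work out to roughly three times the peak bandwidth of a single channel at the same memory speed, which is where most of the headline gain comes from.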
Quick Path Interconnect:
Moving the memory controller on-die also allowed Intel to design a new interconnect that resides between the CPU and chipset, dubbed QPI (Quick Path Interconnect). With the memory controller on-die, there is also no more traditional front side bus. QPI is a serial point-to-point interconnect that offers up to 25.6GB/s of bandwidth per port over 40 data lanes, 20 in each direction.
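As a rough check on the 25.6GB/s figure, the arithmetic can be sketched as follows. The 6.4 GT/s transfer rate and the split of each direction's 20 lanes into 16 payload lanes plus 4 reserved for protocol overhead are assumptions drawn from Intel's general QPI description, not numbers given above; a slower QPI speed grade would scale the result down accordingly.

```python
# Back-of-the-envelope QPI bandwidth estimate.
# Assumptions (not stated in the article): a 6.4 GT/s transfer rate, and
# 16 of the 20 lanes in each direction carrying payload data.
TRANSFERS_PER_SECOND = 6.4e9
PAYLOAD_LANES_PER_DIRECTION = 16
DIRECTIONS = 2


def qpi_bandwidth_gb_per_s(
    transfers_per_second=TRANSFERS_PER_SECOND,
    payload_lanes=PAYLOAD_LANES_PER_DIRECTION,
    directions=DIRECTIONS,
):
    """Return payload bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bits_per_second = transfers_per_second * payload_lanes * directions
    return bits_per_second / 8 / 1e9


if __name__ == "__main__":
    print(f"Per direction:   {qpi_bandwidth_gb_per_s(directions=1):.1f} GB/s")
    print(f"Both directions: {qpi_bandwidth_gb_per_s():.1f} GB/s")
```

Note that the quoted figure counts both directions together, so the bandwidth available in any one direction is half of it.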
Deeper Buffers, New Cache Structure:
Something else coming with Intel's Core i7 processors is a new cache structure. Core i7 processors feature L1, L2, and shared L3 caches, as opposed to Core 2 processors that have only L1 and L2 cache. There is a 64K L1 cache (32K Instruction, 32K Data) per core, 1MB of total L2 cache (256K per core), and a shared 8MB of L3 cache. We should note, however, that the L3 cache size may vary in future versions of the CPU.
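The per-core and total cache figures above can be tallied with a few lines of arithmetic. This is only a sketch using the sizes quoted in this article for a four-core part; the constant and function names are our own.

```python
# Tally of the Core i7 cache sizes quoted in the article (all values in KB).
CORES = 4
L1_INSTRUCTION_KB = 32
L1_DATA_KB = 32
L2_PER_CORE_KB = 256
L3_SHARED_KB = 8 * 1024


def cache_totals_kb(cores=CORES):
    """Return per-core and chip-wide cache sizes in KB."""
    l1_per_core = L1_INSTRUCTION_KB + L1_DATA_KB
    totals = {
        "l1_per_core": l1_per_core,
        "l1_total": l1_per_core * cores,
        "l2_total": L2_PER_CORE_KB * cores,
        "l3_shared": L3_SHARED_KB,
    }
    # Sum of every cache level across the whole die.
    totals["all_levels"] = (
        totals["l1_total"] + totals["l2_total"] + totals["l3_shared"]
    )
    return totals


if __name__ == "__main__":
    for name, size_kb in cache_totals_kb().items():
        print(f"{name}: {size_kb} KB")
```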
Although we don't have the specific details, Intel has also stated that Core i7 processors have "deeper buffers" than their Penryn-based counterparts, but the stages in the pipeline are largely unchanged.
Hyper-Threading Returns:
Intel is also bringing back Hyper-Threading with the new Core i7 processors. Hyper-Threading was first introduced in the Pentium 4 days and allows the Core i7's four cores to be recognized as eight virtual cores by the system's OS. While Hyper-Threading 1.0 was criticized for being energy inefficient, Intel claims this latest iteration is much more power friendly and performance should be better too.
Power Management and Turbo Mode:
Intel is also introducing new "Power Gates" with the Nehalem micro-architecture. In addition to reducing leakage power, Power Gates allow idle cores to enter a deep sleep state (C6) while other cores may be under load. Core i7 processors also feature integrated power sensors and an integrated Power Control Unit that allows the processor to perform real-time monitoring of each core's current, power, and voltage states. One of the reasons why having onboard power sensors and an integrated Power Control Unit is integral to the Core i7 is that it enables the CPU to divert power from idle cores to active cores in what Intel calls "Turbo Mode." If a particular core is being heavily taxed, it can tap into some of the power that would ordinarily be used for one of the other cores if that core is not currently in use. Turbo Mode typically increases the performance of a single core, or of the entire CPU, by one speed bin; a 3.2GHz Core i7 processor with a stock multiplier of 24, for example, will operate with a multiplier of 25 when in Turbo Mode. Through overclocking, however, these parameters can be changed, and Turbo Mode could result in further speed increases.
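The speed-bin example above can be worked through in a short sketch. The base clock is derived here from the article's own numbers (3.2GHz at a multiplier of 24, or roughly 133MHz); the function names are ours, and the model ignores the power and thermal conditions that decide whether Turbo Mode actually engages.

```python
# Core clock = base clock x multiplier, using the article's 3.2GHz example.
STOCK_CLOCK_MHZ = 3200
STOCK_MULTIPLIER = 24


def base_clock_mhz():
    """Base clock implied by the stock frequency and stock multiplier."""
    return STOCK_CLOCK_MHZ / STOCK_MULTIPLIER


def core_clock_mhz(multiplier):
    """Core frequency in MHz for a given multiplier."""
    return base_clock_mhz() * multiplier


if __name__ == "__main__":
    turbo_multiplier = STOCK_MULTIPLIER + 1  # one speed bin higher
    print(f"Base clock:  {base_clock_mhz():.1f} MHz")
    print(f"Stock clock: {core_clock_mhz(STOCK_MULTIPLIER):.0f} MHz")
    print(f"Turbo clock: {core_clock_mhz(turbo_multiplier):.0f} MHz")
```

In other words, one speed bin on this chip is worth about 133MHz, so the example processor would run at roughly 3.33GHz in Turbo Mode.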
Above we have a die shot of Nehalem with each of its major sections labeled. As you can see, the memory controller resides along the top edge of the die, with miscellaneous I/O and QPI links along either edge. The four execution cores are lined up through the middle, with an instruction queue in between, and the shared L3 cache below.
With the new Core i7 processors, Intel is also introducing a new 1366-pin socket. The new LGA 1366 socket looks and functions much like the current LGA 775 socket for Core 2 processors, but it is slightly larger. Pictured above is an LGA 1366 socket open without a CPU, and closed with a Core i7 CPU installed. Also note that the mounting holes for the CPU cooler are farther apart than those on an LGA 775 socket. More on that later.