Intel's Game Changer: One Size Fits All Haswell
Date: Sep 17, 2012
Section: Processors
Author: Joel Hruska
Introduction
Intel's next-generation CPU, codenamed Haswell, was the major star of IDF. One aspect of the chip we haven't talked about at length, however, is its emphasis on reduced power consumption. When Intel announced that its Ivy Bridge mobile products would target 17W for mainstream systems, it made headlines. Pushing Haswell down to 10W is an even greater achievement, but hitting these targets requires a great deal of collaboration and cooperation.


Intel's Dadi Perlmutter, Executive Vice President, Architecture Group with Xeon Phi and Atom CPUs

For most of the past 40 years, power consumption was treated as an afterthought at virtually every level. Unless you were building specialized hardware for particular operating environments, it made little sense to invest in clock-gating or other power conservation technology. Moore's law and Dennard scaling regularly delivered better transistor processes that leaked less and scaled more efficiently without requiring any particular effort on the engineers' part.

 
 Thermal images are of Clover Trail, Intel's 32nm SoC

That trend came to a decisive halt at the 90nm process node back in 2005. Intel had already begun to develop technologies to lower CPU power consumption by that point; the company's SpeedStep technology had debuted several years earlier. Haswell continues this work, offering fine-grained control over areas of logic that were previously either on or off, up to and including specific execution units.


Haswell and Clover Trail have implemented new sleep states that deactivate even more logic areas

These optimizations are impressive in and of themselves, particularly the fact that idle CPU power is approaching tablet levels, but they're only part of the story. Operating system changes matter as well, and Intel has teamed up with Microsoft to ensure that Windows 8 takes full advantage of current and future hardware.
 
Integration with Windows 8
In Windows 7, the maximum amount of time hardware was allowed to stay in idle mode while awake was 15.6 milliseconds. One of Microsoft's design imperatives with the new operating system was to allow for longer idle periods, group timer updates and requests together, and give the system room to go to sleep when it didn't actually need to be powered up. The two slides below summarize Windows 8's improvements (Intel characterizes them as "better hygiene") at both the CPU and network/disk level.

 

The next slide shows Windows 7's default activity, its activity when scheduled tasks are aligned with processor active cycles, and Windows 8's default scheduling.



In the Windows 7 comparisons, the black tick marks are 15.6ms apart. There are 7 ticks, for a total of 109.2 milliseconds, or roughly a tenth of a second. With default scheduling, the CPU only idles for a full 15.6ms twice. From the first tick to the second, the chip is idle roughly 40.2% of the time. Aligning poll timers and grouping deferrable requests together under Windows 7 significantly improves the situation; the CPU spends 87% of its 15.6ms window in idle mode between ticks 1 and 2.
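The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch of the numbers quoted above; the function names are ours, not anything from Intel or Microsoft.

```python
TICK_MS = 15.6  # Windows 7 timer tick spacing quoted in the article


def window_ms(ticks: int, tick_ms: float = TICK_MS) -> float:
    """Total time covered when counting `ticks` intervals of `tick_ms` each."""
    return ticks * tick_ms


def idle_ms(idle_fraction: float, tick_ms: float = TICK_MS) -> float:
    """Milliseconds spent idle inside one tick interval."""
    return idle_fraction * tick_ms


if __name__ == "__main__":
    print(f"7 ticks span {window_ms(7):.1f} ms")
    print(f"default scheduling: {idle_ms(0.402):.2f} ms idle per tick")
    print(f"aligned timers:     {idle_ms(0.87):.2f} ms idle per tick")
```

Between the first two ticks, that works out to roughly 6.3 ms of idle time with default scheduling versus about 13.6 ms once timers are aligned.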

Windows 8 jettisons the system timer altogether, discourages poll timing, and aligns activity more stringently. As a result, the system spends far more time in idle or low-power mode, even dropping into a CPU power-off state at one point. Because we're still talking about milliseconds, the end user won't even notice the difference -- but the device's battery will.
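To see why grouping deferrable requests cuts wake-ups, here is a minimal sketch of timer coalescing. It is not Windows 8's actual scheduler; it is a generic greedy grouping, and the `(due_ms, tolerance_ms)` timer format and the `coalesce` name are assumptions made for illustration. Each timer may fire anywhere between its due time and due time plus tolerance.

```python
def coalesce(timers):
    """Return the wake-up times needed to serve every timer.

    timers: list of (due_ms, tolerance_ms) pairs (hypothetical format).
    A timer is served by any wake-up in [due_ms, due_ms + tolerance_ms].
    """
    wakeups = []
    # Visit timers in order of their latest allowed firing time.
    for due, tolerance in sorted(timers, key=lambda t: t[0] + t[1]):
        if wakeups and due <= wakeups[-1]:
            continue  # the most recent wake-up already falls in this timer's window
        wakeups.append(due + tolerance)  # wake as late as this timer permits
    return wakeups


if __name__ == "__main__":
    timers = [(2, 10), (5, 10), (11, 3), (30, 0), (31, 5), (33, 2)]
    print(coalesce(timers))  # [12, 30, 35]
```

Six timers that would each wake the CPU on their own collapse into three wake-ups, leaving longer uninterrupted gaps in which the chip can stay in a low-power state.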
 
Rethinking Everything
Intel offered dozens of technical seminars this past week and they all touched on power efficiency to one degree or another. Intel has proposed power efficiency improvements to the memory bus, CPU-PCH linkages, SATA, wireless radios, displays, and much more.




Part of that rethinking process is focusing on seemingly settled interfaces and how they function. PCI-Express optimizations and the general shape of Intel's power optimization framework are shown below.



Here's Intel's next-generation platform, Shark Bay.



LTR refers to Latency Tolerance Reporting, a way of informing the platform how long a device can tolerate waiting for service, and therefore how deeply the platform can idle, without compromising responsiveness or capability. By 2013, Intel expects the majority of devices that hook into Shark Bay to support some type of additional low-power operation. Long-term, the company's goal is to turn as many of these blocks green as it can.
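The idea behind latency tolerance can be sketched as follows: each device reports how long it can wait, and the platform picks the deepest idle state whose exit latency fits within the tightest report. The state names and latency figures below are invented for illustration and do not come from Intel's slides or the PCI-Express specification.

```python
# Hypothetical platform idle states, ordered shallow to deep,
# with made-up exit latencies in microseconds.
STATES = [
    ("active", 0),
    ("light_sleep", 50),
    ("deep_sleep", 1000),
    ("power_gated", 10000),
]


def deepest_allowed_state(tolerances_us, states=STATES):
    """Pick the deepest state whose exit latency every device can tolerate.

    tolerances_us: dict of device name -> tolerated latency in microseconds.
    With no reports at all, nothing constrains the platform.
    """
    budget = min(tolerances_us.values(), default=float("inf"))
    allowed = [name for name, exit_latency in states if exit_latency <= budget]
    return allowed[-1]


if __name__ == "__main__":
    reports = {"wifi": 3000, "ssd": 1500, "audio": 20000}
    print(deepest_allowed_state(reports))  # deep_sleep
```

A single device with a tight tolerance holds the whole platform in a shallow state, which is why Intel wants as many blocks as possible to support this kind of reporting.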
 
Low Power x86? No One's Laughing Anymore
It's easy to come away from IDF with an Intel-centric view of the industry. Even after taking a few days to review supplementary materials and consider the wider ecosystem, much of what Intel demonstrated is bad news for ARM partners like Samsung, Qualcomm, and NVIDIA, as well as for AMD. When Intel launched Atom four years ago, conventional wisdom said that the company built the chip in the first place because it couldn't make a play for handheld devices with a conventional x86 processor. Plenty of people scoffed at the idea that Intel could make an x86 smartphone at all.
Medfield proved the company could.


Intel CTO Justin Rattner Holds Wafer of Rosepoint Dual-Core Atom CPUs w/ Integrated Digital WiFi Radio

Haswell's 10W target will allow the chip to squeeze into some of the convertible laptop/tablet form factors we saw at IDF, while Bay Trail -- the 22nm, out-of-order successor to Clover Trail -- arrives in 2013 as well. While most folks have been trading blows over whether or not Intel could compete with ARM's CPU power consumption, Santa Clara has been busy designing every other aspect of the system for low power consumption and saving a lot of wattage in the process.



If a device's CPU only consumes 15-20% of the total, it doesn't much matter if it's ARM or x86. Intel has positioned itself as a leader in lowering device power consumption in a way that no other company has matched to date. It's not clear that any of them have the stomach to try. AMD hasn't just sworn off cell phones; it has officially given up on competing with Intel's process technology and is emphasizing the idea of using established building blocks of IP.
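A quick back-of-the-envelope calculation shows why the CPU's share matters. The 15-20% share comes from the paragraph above; the 30% CPU power advantage is a purely hypothetical figure chosen for illustration, not a measured ARM-versus-x86 gap.

```python
def total_saving(cpu_share: float, cpu_reduction: float) -> float:
    """Fraction of total device power saved when only the CPU gets more efficient.

    cpu_share:     CPU's fraction of total device power (0.15-0.20 in the article).
    cpu_reduction: fractional cut in CPU power (hypothetical input).
    """
    return cpu_share * cpu_reduction


if __name__ == "__main__":
    for share in (0.15, 0.20):
        saving = total_saving(share, 0.30)  # 0.30 is an assumed advantage
        print(f"CPU share {share:.0%}: device-level saving {saving:.1%}")
```

Even a 30% edge at the CPU would trim only about 4.5% to 6% off the device's total draw, so the remaining 80-85% of the platform is where most of the savings have to come from.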

That approach sounded great in the late 90s and early 2000s, but IDF was just the latest example of how the pendulum is swinging back towards integrated device manufacturing. Intel is the only IDM left standing in consumer hardware or smartphones. Samsung is close, but the pro-Apple verdict of several weeks ago could disrupt the company's entire roadmap. Apple is beholden to Samsung, NVIDIA and Qualcomm are dependent on TSMC, and AMD depends on GlobalFoundries.

When Atom goes out-of-order next year, Intel will have a top-to-bottom product stack that addresses desktop platforms through smartphones. It'll have its own native implementation of everything from radios to graphics. With those advantages, the company will have what it needs to cause serious disruption in the high-end of the market, where a lot of the revenue is.

I'm willing to stick my neck out and call this one early. By the end of 2013 (assuming ship dates don't slip), Atom and Haswell could account for a significant percentage of convertible/tablet sales, even if ARM anchors itself in lower-end or cheaper devices. By the end of 2014 and the launch of Broadwell, Haswell's 14nm successor, Intel will have claimed a majority of the tablet and smartphone market, with ARM increasingly driven into lower-end, less profitable devices. By that point, we'll be talking about laptop dual-core processors with 5-7W TDPs.


Content Property of HotHardware.com