This news has been a long time coming. It's been six years since Intel first began talking about Atom. When it was initially announced, the 45nm, in-order core, based on the Bonnell microarchitecture, was a new product from the ground up. It drew on Intel's expertise in other areas and shared some design elements with the original Pentium, but Atom was its own unique design. And for more than half a decade, Intel has kept that same design.
Let's put that in perspective. In the desktop world, we were talking about Windows Vista, Core 2 Duo, and AMD's original Phenom. The Phenom II "Shanghai" refresh was still nine months away, and Hector Ruiz was still CEO of AMD. In smartphones, the Cortex-A8 -- the first modern smartphone processor and the chip at the heart of Apple's iPhone 3GS -- hadn't shipped yet. Nokia and RIM were major powerhouses in the cell phone industry and Windows Mobile's market share was still north of 35%. I was thinner, and had more hair.
For the past five years -- and it'll have been nearly six years by the time these new Atom CPUs come to market -- Intel has focused on improving power consumption, improving power consumption, and improving power consumption. Dual-core variants appeared on the desktop in fairly short order, and clock speed nudges have only bumped performance slightly higher. On the one hand, this has paid off tremendously. As someone who spent several weeks with an Intel-powered Gingerbread phone, I can honestly say that yes, you can put an Intel smartphone in your pocket, it works just fine, and battery life is decent.
But there's no hiding the fact that Atom is getting long in the tooth. AMD's Brazos outperformed it decisively in 2011 and with Kabini (second-gen Brazos) and the Cortex-A15 both coming to market, Intel has finally given Atom the re-architecting it deserved.
Here are the various parts:
Behold Bay Trail:
This new Atom is an out-of-order processor, but many of the basic blocks are identical to what we saw in Saltwell. Intel stuck with a dual-issue design with relatively limited integer and floating point pipelines. There are, however, a significant number of improvements.
Execution units have been redesigned for more efficient, lower-latency operation. The L2 cache has expanded and is now shared between cores. Intel isn't using Hyper-Threading this time around, opting instead to go for a straight 1:1 relationship between threads and core count. Many of the most significant changes to Atom's core for Silvermont are focused on how the chip handles floating-point code. x87 FPU code running on Atom was pretty slow, partly by design, and partly thanks to a CPU bug that inserted a cycle of latency in between any two consecutive x87 operations. CPU analyst Agner Fog describes the problem as follows:
"Whenever there are two consecutive x87 instructions, the two instructions fail to pair and instead cause an extra delay of one clock cycle. This gives a throughput of only one instruction every two clock cycles, while a similar code using XMM registers would have a maximum throughput of two instructions per clock cycle."
While latencies and throughput varied depending on the type of operation, Atom's x87 latencies and instruction throughputs were disappointing. We can't give you an exact breakdown of how Silvermont improves on this situation yet, but we've seen that data. The improvements are substantial. Silvermont's latencies for various x87 operations are often half of Saltwell's, with certain instructions issuing more than twice as often.
Even with these improvements, Atom will never be a heavy-lifting core, but it should handle a great deal of legacy code more gracefully than it currently does.
Bay Trail / Avoton cores are deployed in groups of two, called "modules", but this system has more in common with ARM processors than with AMD's Bulldozer modules. Each of the two cores on a Bay Trail module is a complete core, with a shared L2 cache. As you'd expect, multiple modules can be linked together to build out the system.
Next-Gen Performance, Power Consumption
The goal of Silvermont (in all its configurations) is to substantially boost single-thread and multi-threaded performance while simultaneously slashing power consumption thanks to the 22nm FinFET process being used to build the chip, along with architectural tweaks. The following graph contains a great deal of information about Silvermont's estimated performance:
"Iso" is a Greek prefix meaning "equal." 1C1T = 1 core, 1 thread. 2C4T is a reference to Saltwell's Hyper-Threading. What these slides show, in aggregate, is a phenomenal increase in performance, performance-per watt, and a dramatic reduction in power consumption. The benefits extend through every scenario in single and multi-threaded configurations and, according to Intel, are a key component of why Bay Trail / Avoton will decimate the competition when the new chips finally launch late this year or the beginning of next.
According to Intel, the new chips are substantially faster than anything its competitors are fielding. The company is promising dual-core Silvermont chips that are 1.4x - 2.1x faster than equivalent quad-core products from its competitors, while drawing 1.6x - 3.1x less power (The use of x-less power nomenclature is somewhat confusing).
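One way to untangle the "x less power" phrasing is to read "N x less" as drawing 1/N of the competitor's power. Here's a minimal sketch of that conversion, assuming that reading is what Intel intends; the function name is ours, purely for illustration.

```python
def times_less_to_percent_reduction(factor):
    """Convert 'factor x less power' into a percentage reduction.

    Assumption: 'N x less' means the chip draws 1/N of the baseline power.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1.0 - 1.0 / factor) * 100.0


# Intel's claimed range: 1.6x to 3.1x less power than the competition.
for factor in (1.6, 3.1):
    pct = times_less_to_percent_reduction(factor)
    print(f"{factor}x less power = {pct:.1f}% lower draw")
```

Under that reading, the claimed range works out to roughly 37.5% to 67.7% lower power draw than the competing quad-core parts.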
How fair are these figures? That's a question worth asking. Intel's slides note that the "software and workloads used in performance tests have been optimized only on Intel microprocessors." On the other hand, when we first unveiled Medfield, the phone's performance and battery life tests in real world usage matched up with what Intel claimed the phone could do against the competition. In the past, Intel has played things straight when it came to comparing its tablet and mobile phone platforms against ARM competitors, and we don't see a reason to conclude the company has deviated from that this time around.
Better performance is of limited value without better power consumption, and Intel aims for Silvermont to deliver on both counts. The "Race to Zero" has become the new 1GHz push, and Intel has further tweaked Silvermont's design to allow the chip to enter and leave wait-states more efficiently. As we've previously covered, the ability to race to minimum power consumption is extremely important to a CPU's overall power consumption.
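A toy calculation shows why racing to idle matters. Every number below is a made-up illustrative value, not an Intel or ARM measurement, and the model is deliberately simple: energy over a fixed window equals active power times active time plus idle power times idle time.

```python
def window_energy_joules(active_watts, active_seconds, idle_watts, window_seconds):
    """Energy used over a fixed window: run the task, then sit idle."""
    if active_seconds > window_seconds:
        raise ValueError("task does not fit in the window")
    idle_seconds = window_seconds - active_seconds
    return active_watts * active_seconds + idle_watts * idle_seconds


WINDOW = 10.0  # seconds, hypothetical

# Hypothetical chip A: faster, higher peak power, deep idle state.
fast = window_energy_joules(active_watts=2.0, active_seconds=2.0,
                            idle_watts=0.05, window_seconds=WINDOW)

# Hypothetical chip B: slower, lower peak power, shallower idle state.
slow = window_energy_joules(active_watts=1.0, active_seconds=5.0,
                            idle_watts=0.20, window_seconds=WINDOW)

print(f"fast-then-idle: {fast:.2f} J, slow-and-steady: {slow:.2f} J")
```

With these invented figures, the chip that burns twice the peak power still uses less total energy, because it finishes sooner and spends more of the window in a deeper idle state. Whether real silicon behaves this way depends entirely on its actual active and idle power.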
The graphs Intel is showing at this point claim major advantages for Silvermont in this regard. Obviously, that's what we'd expect -- Intel isn't going to hand out substandard comparisons -- but take them with a grain of salt. It's not clear if these comparisons use Cortex-A15 hardware, older Cortex-A9 products built on 40nm, or both.
Silvermont will also be capable of more aggressive power budget sharing, with CPU and GPU cores adjusted on the fly to optimize performance. This is an option that's existed for multiple product generations, but Silvermont improves the cross-adjustment capability.
The big question, of course, is whether AMD's upcoming Temash / Kabini will be able to compete with this new Atom. Obviously we aren't going to come down definitively on one side or the other until shipping hardware is in hand, but here's what we expect.
AMD's Kabini is supposedly shipping for revenue this quarter, whereas these new Atoms aren't expected until the end of the year. Graphics performance will likely favor Kabini thanks to the chip's GCN heritage. Intel is claiming 2x single-threaded CPU performance, while AMD is predicting a 6-9% increase for Kabini, clock-for-clock. It's important to remember, however, that historically Atom's single-thread performance without Hyper-Threading support was very low. 2x single-thread performance will put the chip in Kabini's ballpark. AMD is hinting at a 2.13x comparison over dual-core Brazos, while Intel says 2.8x higher than dual-core Atom. Since Brazos was faster than Atom to begin with, the implication is that Kabini and Bay Trail may hit roughly the same performance targets for the CPU cores.
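The reasoning behind that last point can be sketched with a little arithmetic. The two vendor multipliers come from the claims above; the Brazos-over-Atom lead is an assumed input, and the sketch assumes (generously) that both vendors' figures describe comparable workloads, which they may not.

```python
KABINI_VS_BRAZOS = 2.13   # AMD's hinted figure, vs. dual-core Brazos
BAY_TRAIL_VS_ATOM = 2.8   # Intel's claimed figure, vs. dual-core Atom


def kabini_vs_bay_trail(brazos_vs_atom):
    """Ratio of Kabini to Bay Trail performance.

    brazos_vs_atom is an assumed input: how many times faster dual-core
    Brazos was than dual-core Atom on the same workload.
    """
    return (KABINI_VS_BRAZOS * brazos_vs_atom) / BAY_TRAIL_VS_ATOM


# Brazos lead at which the two new chips would land exactly even.
break_even = BAY_TRAIL_VS_ATOM / KABINI_VS_BRAZOS
print(f"break-even Brazos lead: {break_even:.2f}x")

for lead in (1.2, 1.3, 1.5):  # hypothetical Brazos-over-Atom leads
    ratio = kabini_vs_bay_trail(lead)
    print(f"Brazos {lead}x Atom -> Kabini/Bay Trail = {ratio:.2f}")
```

If Brazos was about 1.3x faster than the old dual-core Atom, the two claims cancel out almost exactly; a smaller lead favors Bay Trail and a larger one favors Kabini.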
At this point, we expect Kabini / Temash to compete well against Atom in higher power devices, while Atom will be capable of fitting into smaller devices, like smartphones, where Kabini / Temash draw too much power to compete.
The real showdown in 2014 will be against the Cortex-A15 in tablets and high-end smartphones. By the time these new cores ship, Qualcomm, Samsung, and Nvidia will have had time to refine their initial 28nm Cortex-A15 products. As we've previously stated, the Cortex-A15, for all its performance, isn't going to smash power efficiency records in smartphones. The same design tweaks that improved performance so markedly over the Cortex-A9 are going to work against the core in mobile.
You don't need to take Intel's graphs as gospel truth to recognize that this is going to be an ugly fight for the ARM vendors. The Cortex-A15s of 2014 will still be based on 28nm technology, going up against Intel at 22nm FinFET. TSMC and GlobalFoundries are both planning to bring next-generation FinFET technology online more quickly, but TSMC won't ramp 20nm in volume until 2015. The company's 16nm FinFET technology isn't expected until 2016 or 2017. Company CEO Morris Chang has stated in conference calls that he expects the total volume of 16nm FinFET technology to be "very small" in 2015.
This is the architectural overhaul that Atom has needed. It's a no-compromise approach that should deliver vastly improved performance and better battery life. And, by all accounts, it's exactly what Intel needs to take the fight to the Cortex-A15. By the time the Cortex-A53 and A57 are shipping, Intel's own 14nm part (codenamed Airmont) will be on the market as well. This is the chip that's going to widen Intel's ability to compete in tablets and smartphones, and the long-term impact of that on the company's business can't be overstated. We should also point out that Intel will be refreshing its low-power architecture twice per process node moving forward, similar to the tick-tock cadence of desktop processors. And we've all seen how well they've executed that strategy thus far with the Core series.
It's still going to be 6-8 months before we see Bay Trail / Merrifield in shipping products, but Intel's data suggests the company is going to put up a much harder fight in the smartphone and tablet market. ARM manufacturers like Qualcomm and Nvidia will absolutely have competitive products on the market, but Intel's Merrifield and Bay Trail could easily be fighting for top SKUs rather than appearing in only a few budget products.