Let's put that in perspective. In the desktop world, we were talking about Windows Vista, Core 2 Duo, and AMD's original Phenom. The Phenom II "Deneb" refresh was still nine months away, and Hector Ruiz was still CEO of AMD. In smartphones, the Cortex-A8 -- the first modern smartphone processor and the chip at the heart of Apple's iPhone 3GS -- hadn't shipped yet. Nokia and RIM were major powerhouses in the cell phone industry and Windows Mobile's market share was still north of 35%. I was thinner, and had more hair.
For the past five years -- and it'll have been nearly six years by the time these new Atom CPUs come to market -- Intel has focused on improving power consumption, improving power consumption, and improving power consumption. Dual-core variants appeared on the desktop in fairly short order, and clock speed nudges have only bumped performance slightly higher. On the one hand, this has paid off tremendously. As someone who spent several weeks with an Intel-powered Gingerbread phone, I can honestly say that yes, you can put an Intel smartphone in your pocket, it works just fine, and battery life is decent.
But there's no hiding the fact that Atom is getting long in the tooth. AMD's Brazos outperformed it decisively in 2011, and with Kabini (second-gen Brazos) and the Cortex-A15 both coming to market, Intel has finally given Atom the re-architecting it deserves.
Here are the various parts:
- Avoton -- Low power SoC, aimed at servers. Follow-up to Centerton, will debut later this year.
- Rangely -- For comms and infrastructure products.
- Bay Trail -- Tablets. First quad-core Atom, targeting the 2013 holiday season. Also in some ultramobiles.
- Merrifield -- Follow-up to Medfield. Shipping by end of year to meet Q1 2014 launches.
This new Atom is an out-of-order processor, but many of the basic blocks are identical to what we saw in Saltwell. Intel stuck with a dual-issue design with relatively limited integer and floating point pipelines. There are, however, a significant number of improvements.
Execution units have been redesigned for more efficient, lower-latency operation. The L2 cache has expanded and is now shared between cores. Intel isn't using Hyper-Threading this time around, opting instead to go for a straight 1:1 relationship between threads and core count. Many of the most significant changes to Atom's core for Silvermont are focused on how the chip handles floating-point code. x87 FPU code running on Atom was pretty slow, partly by design, and partly thanks to a CPU bug that inserted a cycle of latency in between any two consecutive x87 operations. CPU analyst Agner Fog describes the problem as follows:
"Whenever there are two consecutive x87 instructions, the two instructions fail to pair and instead cause an extra delay of one clock cycle. This gives a throughput of only one instruction every two clock cycles, while a similar code using XMM registers would have a maximum throughput of two instructions per clock cycle."
While latencies and throughput varied depending on the type of operation, Atom's x87 latencies and instruction throughputs were disappointing. We can't give you an exact breakdown of how Silvermont improves on this situation yet, but we've seen that data. The improvements are substantial and non-trivial. Silvermont's latencies for various x87 operations are often half of Saltwell's, with throughput on certain instructions more than doubling.
Even with these improvements, Atom will never be a heavy-lifting core, but it should handle a great deal of legacy code more gracefully than it currently does.
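For a rough sense of the gap Fog describes on the older Atom core, here is a minimal back-of-the-envelope sketch in Python. It uses only the two peak-throughput figures from his quote (one x87 instruction every two cycles versus two XMM instructions per cycle). The instruction count `n` and the helper `cycles_needed` are hypothetical, chosen purely for illustration, and the model ignores latencies, dependencies, and instruction mix, so treat it as a best-case comparison rather than a benchmark.

```python
from fractions import Fraction

# Peak throughput figures taken from the Agner Fog quote.
X87_PER_CYCLE = Fraction(1, 2)  # one x87 instruction every two clock cycles
XMM_PER_CYCLE = Fraction(2, 1)  # two XMM instructions per clock cycle


def cycles_needed(instructions, per_cycle):
    """Best-case cycles to issue `instructions` at a given peak throughput.

    Hypothetical helper: ignores latency, dependencies, and instruction mix.
    """
    return Fraction(instructions) / per_cycle


n = 1000  # assumed instruction count, for illustration only
x87_cycles = cycles_needed(n, X87_PER_CYCLE)
xmm_cycles = cycles_needed(n, XMM_PER_CYCLE)

print(f"x87: {x87_cycles} cycles, XMM: {xmm_cycles} cycles, "
      f"ratio: {x87_cycles / xmm_cycles}x")
```

Under those assumptions, back-to-back x87 code needs four times as many cycles as equivalent XMM code at peak, which is why legacy floating-point software stood to gain the most from a fix.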
Bay Trail / Avoton cores are deployed in groups of two, called "modules", but this system has more in common with ARM processors than with AMD's Bulldozer modules. Each of the two cores on a Bay Trail module is a complete core, with a shared L2 cache. As you'd expect, multiple modules can be linked together to build out the system.