ARM's Race: An Attack Plan For Servers and Mobile

by Joel Hruska — Tuesday, May 06, 2014, 08:00 AM EDT

Introduction And Server Development

It has been nearly a year since we visited ARM in Cambridge, UK, and the company recently held another tech day -- this time in Austin, Texas. During the three-day session, ARM covered a wide range of topics, with a primary focus on server ecosystems and next-generation mobile hardware.

The company started off with an in-depth exploration of its CCN-508 server interconnect. Neither AMD nor Intel really has an analogous part -- think of the CCN-508 as the hub to which the CPUs, GPUs, network interfaces, caches, and other components all connect.

ARM has revealed details on the CCN-508 before, but the company emphasized its server chops in Austin, talking up the extensive capabilities of the new design. The CCN-508 is designed for deployment on 16nm FinFET or 14nm manufacturing process nodes at TSMC / GlobalFoundries and offers a 128-bit bus that provides a total of 230GB/s of sustained bandwidth, with up to 360GB/s of burst bandwidth available. It supports an L3 cache of up to 32MB, and its attached memory controllers support ECC and RAS features as well as the DDR3, DDR3L, and DDR4 standards.

One of the factors that sets the new interconnect apart from its predecessors is the degree of clock gating ARM is supporting in the new silicon. If you've followed the evolution of mobile hardware, you know that the ability to adjust clock frequencies and to power down sections of a chip that aren't in use is a vital component of all modern hardware. The CCN-508 allows for more advanced management -- the L3 RAM can be powered down partially or entirely to reduce total SoC power. Alternatively, CPUs can store data in L3 before powering themselves down (ARM calls this active retention, and claims that the chip can wake up again in just 5ns).

One major question was whether the CCN-508 would support AMD's HSA, given that the chip isn't expected for several years and ARM is a member of the HSA Foundation. The answer to that question is "No" -- while the CCN-508 does support fully coherent accelerators (including GPUs) and can be used for OpenCL and GPU offloading, it does not implement the HSA specification. In fact, one point ARM made to us at the event is that the HSA spec hasn't even been fully defined yet, so implementing it in hardware is basically impossible at this point.

As for a vendor actually bringing an ARM server to market, Applied Micro was on hand to talk about its own work on X-Gene. X-Gene has been floating around in the ARM server world for years; the company originally debuted its design in 2011. The original ramp target of 2012 was, in retrospect, rather optimistic -- the company is now talking about shipping a 64-bit ARM server on 40nm hardware today, with 28nm chips sampling later this quarter. Whether those processors will hit the 3GHz target that X-Gene originally forecast isn't clear; the manufacturer was reluctant to give firm details at the event.

ARM was also joined by representatives from Red Hat and Canonical, who discussed their respective efforts to bring up the software stacks required to make ARM an equal player with x86 in the server world. Standard components like OpenJDK are ready for deployment on ARM, along with management applications like Canonical's Juju platform. The message from both companies is that while the ARM ecosystem isn't as established as its x86 counterpart, the various components needed for robust server deployments are rapidly dropping into place.