NVIDIA Shows Beefy Arm CPUs Battling x86 Servers For A100 GPU-Powered Cloud AI Dominance

NVIDIA Arm Server
NVIDIA is thumping its chest over a round of impressive benchmark runs that highlight the potency of mixing its A100 accelerators with either Arm or x86 hardware. Regardless of the CPU platform, NVIDIA contends that its accelerators enable the "best results in AI inference," and it has another batch of benchmarks to support its claim.

The GPU maker is especially interested in highlighting results obtained with A100 and Arm hardware. Traditionally, data centers have turned to x86 silicon for a variety of applications, as it is well-suited to the kinds of demanding workloads they're constantly hammered with. But Arm has been gaining ground.

"The Arm architecture is making headway into data centers around the world, in part thanks to its energy efficiency, performance increases and expanding software ecosystem," NVIDIA says.

"The latest benchmarks show that as a GPU-accelerated platform, Arm-based servers using Ampere Altra CPUs deliver near-equal performance to similarly configured x86-based servers for AI inference jobs. In fact, in one of the tests, the Arm-based server out-performed a similar x86 system," NVIDIA adds.

What NVIDIA is referring to is the latest batch of MLPerf benchmarks by MLCommons, an industry benchmarking group that was formed several years ago. And for the third consecutive time, NVIDIA has set records in both performance and energy efficiency on inference tests, which tap AI software to recognize an object or make a prediction. These tasks leverage deep learning models powered by capable hardware.
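For readers unfamiliar with the term, "inference" is the deployment phase of AI: a trained model takes fresh input and produces a prediction, such as a class label for an image. The toy sketch below (plain NumPy with made-up weights, not the actual MLPerf harness or an NVIDIA model) shows the basic shape of a single inference step that benchmarks like these time at massive scale:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax: turn raw scores into probabilities
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def infer(weights, bias, features):
    # One linear layer followed by softmax -- the final step of many
    # real classifiers, shrunk down purely for illustration
    logits = weights @ features + bias
    probs = softmax(logits)
    return int(np.argmax(probs)), probs

# Hypothetical "trained" model: 3 output classes, 4 input features
rng = np.random.default_rng(0)
weights = rng.standard_normal((3, 4))
bias = np.zeros(3)
features = rng.standard_normal(4)  # stand-in for a preprocessed input

label, probs = infer(weights, bias, features)
print(f"predicted class {label}, probabilities {probs}")
```

Real inference workloads run models with billions of parameters over huge batches of inputs, which is why accelerator throughput and energy efficiency dominate these benchmark results.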

NVIDIA Arm Benchmarks
Click to Enlarge (Source: NVIDIA)

The chart above highlights NVIDIA's point, which is that an Arm-based server is nearly as proficient as an x86-based server in these benchmarks, when paired with its A100 GPUs. There is not a lot of separation between the two platforms. Additionally, one of the biggest gaps in performance actually favors the Arm-based system.

"Arm, as a founding member of MLCommons, is committed to the process of creating standards and benchmarks to better address challenges and inspire innovation in the accelerated computing industry," said David Lecomber, a senior director of HPC and tools at Arm.

"The latest inference results demonstrate the readiness of Arm-based systems powered by Arm-based CPUs and NVIDIA GPUs for tackling a broad array of AI workloads in the data center," Lecomber added.

Naturally, the folks at Arm are giddy at the results, as well as NVIDIA's willingness to embrace and highlight the architecture's capabilities in comparison to x86, which is the domain of AMD and Intel. Likewise, it's worth pointing out that NVIDIA is in the process of acquiring Arm, pending regulatory approvals, so there are plenty of kudos that both sides are willing to share with one another.

That's not to say any of it is unjustified. The benchmarks speak for themselves, but in case more commentary is needed, NVIDIA is eager to point out that its A100 is up to 104 times faster than a CPU. It is also power efficient. Have a look...
NVIDIA A100 Benchmarks
Click to Enlarge (Source: NVIDIA)

NVIDIA A100 Efficiency
Click to Enlarge (Source: NVIDIA)

These are impressive results. And from NVIDIA's vantage point, they're important ones as well. Not just from a self-serving standpoint, but as datasets balloon in size and AI scenarios extend from the data center to the edge, it benefits users to have more choices rather than essentially be locked into a single type of architecture.

It's not just about the hardware, though. NVIDIA points out that things like its TAO Toolkit and TensorRT software provide the optimizations necessary to see the kinds of gains on display—up to 20 percent in performance and 15 percent in energy efficiency, compared to the previous round of MLPerf inference benchmarks that were run four months ago.

"All the software we used in the latest tests is available from the MLPerf repository, so anyone can reproduce our benchmark results. We continually add this code into our deep learning frameworks and containers available on NGC, our software hub for GPU applications," NVIDIA says.

This is all good news for NVIDIA, especially if it can complete its acquisition of Arm. It also keeps x86 players on their toes, knowing that Arm is reaching into their territory.