Intel Unveils Powerful New Nervana NNP AI Chips For Inferencing And Training With Up To 32GB HBM2

Intel Nervana
Like just about every other major technology firm, Intel sees a lot of promise in artificial intelligence, and is investing in the category accordingly. That is on display today at the Hot Chips conference, where Intel unveiled the first two processors in its Nervana Neural Network Processor (NNP) line: one for training (NNP-T) and one for inference (NNP-I).

Dubbed "Spring Crest," Intel built its Nervana NNP-T from the ground up to train deep learning models at scale. It is designed to prioritize training networks as quickly as possible, and do it within a given power budget, Intel says. The chip is also designed with flexibility in mind, to offer a balance between compute, communication, and memory, and is optimized for batched workloads.

Intel Spring Crest Slide

At a glance, NNP-T is an impressive slice of silicon. It packs 27 billion transistors and 24 processor cores operating at up to 1.1GHz, and it features 32GB of HBM2 memory. Being built to scale means it can spread workloads across multiple accelerator cards, and across multiple systems as well. As a client's needs grow, so too can the deep learning hardware.
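To make that scale-out idea concrete, here is a minimal sketch of data-parallel training, the usual way a training job is spread across multiple accelerator cards: each card computes gradients on its own slice of the batch, and the results are averaged. This is an illustration of the general technique only; the `local_gradient` stand-in and the device count are hypothetical, not Intel's NNP-T software stack.

```python
# Minimal data-parallel sketch: split each batch across N accelerator
# cards, compute gradients locally, then average them (an all-reduce).
# Purely illustrative; this is not Intel's NNP-T software stack.

def local_gradient(model_params, shard):
    # Stand-in for a real forward/backward pass on one accelerator.
    return [sum(shard) * p for p in model_params]  # dummy math

def train_step(model_params, batch, num_devices=4, lr=0.01):
    shard_size = len(batch) // num_devices
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_devices)]

    # Each "card" works on its own shard (done serially here).
    grads = [local_gradient(model_params, s) for s in shards]

    # Average gradients across cards, as if over the chip-to-chip
    # links that let an accelerator scale past one card or system.
    avg = [sum(g[i] for g in grads) / num_devices
           for i in range(len(model_params))]

    return [p - lr * g for p, g in zip(model_params, avg)]

params = train_step([0.5, -0.2], list(range(16)))
print(params)
```

The appeal of this pattern is that adding cards shrinks each card's share of the batch, which is exactly the "grow the hardware with the client" pitch Intel is making.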

The chip is built on TSMC's 16nm CLN16FF+ process. That may seem like an unusual decision, since Intel typically fabs its own silicon, but in this case production was already underway when Intel acquired Nervana.

Those cores are organized into Tensor processing clusters: 24 Tensor processors, along with 60MB of distributed on-chip memory. According to Intel, this works out to 119 TOPS (tera operations per second) of performance.
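As a back-of-the-envelope check (our arithmetic from the published figures, not an Intel-supplied breakdown), dividing peak throughput by core count and clock gives the implied work per core per cycle:

```python
# Rough check derived from the published NNP-T figures; Intel does not
# break peak throughput down this way, so treat it as an estimate.
peak_tops = 119          # tera operations per second (peak)
cores = 24               # Tensor processors
clock_hz = 1.1e9         # up to 1.1GHz

ops_per_core_per_cycle = (peak_tops * 1e12) / (cores * clock_hz)
print(f"~{ops_per_core_per_cycle:,.0f} ops per core per cycle")
# prints: ~4,508 ops per core per cycle at peak clock
```

Thousands of operations per core per cycle is the signature of wide matrix-multiply hardware, which is what a deep learning training chip spends most of its time doing.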

In short, NNP-T is an SoC custom-tuned to train an AI system, as opposed to offloading that type of task to a Xeon processor.

Intel Spring Hill Slide

Codenamed "Spring Hill," Intel's Nervana NNP-I part is an inference SoC. This one is built on Intel's 10-nanometer manufacturing process and features Ice Lake cores to accelerate deep learning deployment at scale.

"The Intel Nervana NNP-I offers a high degree of programmability without compromising performance or power efficiency. As AI becomes pervasive across every workload, having a dedicated inference accelerator that is easy to program, has short latencies, has fast code porting and includes support for all major deep learning frameworks allows companies to harness the full potential of their data as actionable insights," Intel explains.

Intel is claiming best-in-class performance and power efficiency for major data center inference workloads, at 4.8 TOPS/W. The TDP ranges from 10W to 50W.
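Taken together, those two figures imply a range of peak throughputs. A quick sketch, assuming the 4.8 TOPS/W efficiency holds across the entire TDP range (which Intel does not actually state):

```python
# Implied peak throughput if 4.8 TOPS/W held at every TDP point.
# Intel quotes only the efficiency figure; scaling it linearly with
# TDP is our assumption, not a published spec.
tops_per_watt = 4.8
for tdp_w in (10, 50):
    print(f"{tdp_w}W TDP -> ~{tops_per_watt * tdp_w:.0f} TOPS")
# prints: 10W TDP -> ~48 TOPS, 50W TDP -> ~240 TOPS
```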

What this all boils down to is tapping into the huge market opportunities driven by cognitive and AI workloads; Intel said back in 2017 that the two sectors combined were on a trajectory to reach $46 billion in 2020. The Nervana NNP platform is Intel's aggressive move into those sectors, spanning areas like healthcare, social media, automotive, and weather.

Intel faces stiff competition in AI, namely from Amazon's AWS Inferentia processors, Google's Tensor Processing Unit, and NVIDIA's Deep Learning Accelerator (NVDLA). If former CEO Brian Krzanich's projections are correct, however, there is plenty of money to go around.