NVIDIA Unveils Beastly 2 Petaflop DGX-2 AI Supercomputer With 32GB Tesla V100 And NVSwitch Tech (Updated)

More than 28,000 attendees converged on the San Jose Convention Center this week for NVIDIA's GTC 2018 GPU Technology Conference, to learn about the advancements in AI and Machine Learning the company would bring to the table for developers, researchers and service providers in the field. Today, NVIDIA CEO Jensen Huang took to the stage to unveil a number of GPU-powered innovations for Machine Learning, including a new AI supercomputer and an updated version of the company's powerful Tesla V100 GPU that now sports a hefty 32 gigabytes of on-board HBM2 memory.
NVIDIA Tesla V100 GPU with 32GB of onboard HBM2

A follow-on to last year's DGX-1 AI supercomputer, the new NVIDIA DGX-2 can be equipped with double the number of Tesla V100 32GB processing modules for twice the GPU horsepower and a whopping 4 times the memory space, for processing datasets with dramatically larger batch sizes. Each Tesla V100 now sports 32GB of HBM2, where the previous-generation Tesla V100 was limited to 16GB. The additional memory can deliver multi-fold improvements in throughput, because data stays resident in local memory on the GPU complex rather than being fetched from much higher-latency system memory as the GPU crunches it iteratively. NVIDIA also attacked the problem of scalability for its DGX server products by developing a new switch fabric for the DGX-2 platform.
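To put that local-memory point in concrete terms, here is a minimal PyTorch-style sketch (our illustration, not NVIDIA code; the tensor sizes are arbitrary placeholders) contrasting a working set that stays resident in the GPU's HBM2 with one that must be streamed in from system memory every iteration:

# Minimal sketch of why larger GPU memory helps iterative training: a resident
# working set is copied over PCIe once, then read from fast HBM2 every pass,
# while a host-resident set pays a higher-latency transfer on every iteration.
# Assumes PyTorch and a CUDA-capable GPU.
import torch

device = torch.device("cuda")
weights = torch.randn(4096, 4096, device=device)

# Case 1: dataset resident in GPU HBM2 -- one up-front copy, then local reads only.
data_resident = torch.randn(8192, 4096, device=device)

# Case 2: dataset held in system RAM -- copied to the GPU on every pass.
data_host = torch.randn(8192, 4096, pin_memory=True)

for _ in range(10):
    out_fast = data_resident @ weights                 # no transfer, reads HBM2 directly
    batch = data_host.to(device, non_blocking=True)    # host-to-device transfer each pass
    out_slow = batch @ weights

torch.cuda.synchronize()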

NVIDIA NVSwitch diagram

Dubbed NVSwitch, the new fully connected crossbar GPU interconnect fabric allows the platform to scale up to 16 GPUs and utilize their memory as one contiguous space, where the previous DGX-1 platform was limited to 8 GPU complexes and their associated memory. NVIDIA claims NVSwitch is 5 times faster than the fastest PCI Express switch and allows up to 16 Tesla V100 32GB GPUs to communicate at a blistering 2.4 terabytes per second. NVSwitch builds on NVIDIA's existing NVLink technology but expands significantly on its capability and scalability, supporting all NVLink-compatible GPUs from NVIDIA.
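For a rough sense of what that fabric means to software, the sketch below (our example, not NVIDIA reference code) uses PyTorch to copy a tensor directly between two GPUs; when the driver reports peer access between devices, such copies ride the GPU interconnect (NVLink, and on DGX-2 the NVSwitch fabric) instead of staging through host memory:

# Assumes PyTorch and at least two CUDA devices in the system.
import torch

assert torch.cuda.device_count() >= 2, "needs at least two CUDA devices"

# Check whether device 0 can address device 1 directly (peer-to-peer).
print(torch.cuda.can_device_access_peer(0, 1))

src = torch.randn(1024, 1024, device="cuda:0")

# A device-to-device copy; with peer access available it travels over the
# GPU interconnect rather than bouncing through system RAM.
dst = src.to("cuda:1")

torch.cuda.synchronize()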

NVIDIA NVSwitch Die

NVIDIA claims the new DGX-2 is the world's first 2 petaflop Machine Learning system, and it comprises a serious array of technology in support of its Tesla V100 32GB GPUs. Specs include dual 28-core Intel Xeon processors, up to 16 GPUs linked by 6 NVSwitch complexes, and up to 30 terabytes of NVMe solid state storage, with configurations segmentable down from there.
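That headline figure lines up with simple arithmetic, assuming the commonly cited peak of roughly 125 TFLOPS of Tensor Core throughput per Tesla V100 (our back-of-the-envelope math, not an NVIDIA-published breakdown):

# Rough check of the "2 petaflop" claim.
per_gpu_tensor_tflops = 125     # approximate peak mixed-precision Tensor Core rate per V100
gpu_count = 16                  # fully populated DGX-2

total_tflops = per_gpu_tensor_tflops * gpu_count
print(f"{total_tflops} TFLOPS = {total_tflops / 1000:.0f} PFLOPS")   # 2000 TFLOPS = 2 PFLOPS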

NVIDIA DGX-2 exploded view
NVIDIA DGX-2
All told, NVIDIA is making bold performance claims, noting that the DGX-2 can train FAIRSeq, a cutting-edge neural machine translation model, in about a day and a half, where it took 15 days on the DGX-1 (a 10X improvement). Other gains NVIDIA touts for DGX-2 come on the inference side, where it is claimed to be 190X faster at image recognition and up to 60X faster at speech recognition and voice synthesis.

Pricing is yet to be announced, but NVIDIA notes that Tesla V100 32GB cards will be immediately available from leading OEMs, while its DGX-2 Machine Learning supercomputer will ship in Q3 of this year.

Update, 3/27/2018 - 5:12 PM: We're updating our coverage here with some up-close views of NVIDIA's new DGX-2 Machine Learning Supercomputer. Enjoy!

NVIDIA DGX-2 AI Supercomputer

Tesla V100 32GB Module And NVSwitch Chip

NVIDIA NVSwitch Fabric Chip

Update, 3/27/2018 - 6:18 PM: In the demo video here, NVIDIA CEO Jensen Huang demonstrates deep learning image recognition inference on the Tesla V100 platform versus Intel's Skylake Xeon processors...

We'll have more coverage from NVIDIA GTC 2018 on the way, so stay tuned!