NVIDIA Unveils Beastly 2 Petaflop DGX-2 AI Supercomputer With 32GB Tesla V100 And NVSwitch Tech (Updated)

A follow-on to last year’s DGX-1 AI supercomputer, the new NVIDIA DGX-2 can be equipped with double the number of Tesla V100 32GB processing modules, for double the GPU horsepower and a whopping four times the memory space, enabling it to process dramatically larger batch sizes. Each Tesla V100 now sports 32GB of HBM2, where the previous-generation Tesla V100 was limited to 16GB. The additional capacity can deliver multi-fold throughput improvements, because data stays in local memory on the GPU complex as the GPU crunches it iteratively, rather than being fetched from much higher-latency system memory. In addition, NVIDIA attacked the problem of scalability for its DGX server products by developing a new switch fabric for the DGX-2 platform.
Dubbed NVSwitch, the new fully connected crossbar GPU interconnect fabric allows the platform to scale up to 16 GPUs and utilize their memory space contiguously, where the previous DGX-1 platform was limited to 8 GPU complexes and their associated memory. NVIDIA claims NVSwitch is five times faster than the fastest PCI Express switch, allowing up to 16 Tesla V100 32GB GPUs to communicate at a blistering 2.4 Terabytes per second. NVSwitch is based on NVIDIA’s existing NVLink technology but expands significantly on its capability and scalability, supporting all NVLink-compatible NVIDIA GPUs.
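The memory-scaling claims above work out with simple arithmetic; a quick sanity check, assuming the original DGX-1 shipped with 16GB Tesla V100 modules as the article implies:

```python
# Back-of-the-envelope check of the memory-space figures cited above.
dgx1_gpus, dgx1_mem_per_gpu = 8, 16    # original DGX-1: 8 x Tesla V100 16GB
dgx2_gpus, dgx2_mem_per_gpu = 16, 32   # DGX-2: 16 x Tesla V100 32GB via NVSwitch

dgx1_total = dgx1_gpus * dgx1_mem_per_gpu   # 128 GB aggregate
dgx2_total = dgx2_gpus * dgx2_mem_per_gpu   # 512 GB contiguously addressable

print(dgx2_total)                 # 512
print(dgx2_total // dgx1_total)   # 4 -> the "four times the memory space" claim
```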
NVIDIA claims the new DGX-2 is the world’s first 2 Petaflop machine learning system, and it packs a serious array of technology in support of its Tesla V100 32GB GPUs. Specs include dual 28-core Intel Xeon processors, up to 16 GPUs linked by 6 NVSwitch complexes, and up to 30 Terabytes of NVMe solid state storage, with smaller configurations available from there.
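The 2 Petaflop headline number follows from per-GPU Tensor Core throughput; a minimal sketch, assuming the commonly cited 125 TFLOPS mixed-precision Tensor Core peak per Tesla V100 (a figure from NVIDIA's V100 spec, not stated in this article):

```python
# Deriving the "2 Petaflop" system figure from per-GPU peak throughput.
TENSOR_TFLOPS_PER_V100 = 125   # assumed per NVIDIA's V100 Tensor Core spec
GPUS = 16                      # fully configured DGX-2

system_pflops = GPUS * TENSOR_TFLOPS_PER_V100 / 1000
print(system_pflops)   # 2.0
```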
All told, NVIDIA is making bold performance claims, noting that DGX-2 can train FAIRSeq, a cutting-edge neural machine translation model, in about one and a half days, where it took 15 days on DGX-1 (a 10X improvement). Other gains NVIDIA boasts for DGX-2 are in inference and image recognition, where DGX-2 is claimed to be up to 190X faster, and in speech recognition and voice synthesis, at up to 60X faster. Pricing is yet to be announced, but NVIDIA notes Tesla V100 32GB cards will be immediately available from leading OEMs, while the DGX-2 machine learning supercomputer will ship in Q3 of this year.
Update, 3/27/2018 - 5:12 PM: We're updating our coverage here with some up-close views of NVIDIA's new DGX-2 machine learning supercomputer. Enjoy!
