AMD's Radeon Instinct MI25 GPU Accelerator Crushes Deep Learning Tasks With 24.6 TFLOPS FP16 Compute
Marco has already written a rather informative article giving you an overview of the Radeon Instinct family and AMD’s ROCm software suite, but today we’re here to give you more specifics on the hardware powering these new GPU accelerators.
The Radeon Instinct family consists of three products, all based on different GPU architectures. The low-end Instinct MI6 is based on Polaris, the mid-range Instinct MI8 is based on Fiji, while the high-end Instinct MI25 is based on AMD’s new Vega graphics architecture. Performance varies widely between the three cards, and it should come as no surprise that the Vega-based Radeon Instinct MI25 is by far the most powerful solution that AMD has available:

AMD claims that its cards will offer performance leadership in both half-precision (FP16) and single-precision (FP32) compute, and that its enhanced GPU-to-GPU communication technology will allow for lower overall latencies. AMD is also touting excellent interoperability with its EPYC lineup of server processors.
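To see why half precision gets top billing for deep learning, here's a minimal Python sketch (using NumPy; the 4096x4096 layer size is an illustrative assumption, not an AMD figure) showing how FP16 halves the memory footprint of a weight matrix. On hardware with packed FP16 math like Vega, the peak arithmetic rate roughly doubles as well.

```python
import numpy as np

# Hypothetical neural-network layer: a 4096 x 4096 weight matrix (assumed size).
weights_fp32 = np.ones((4096, 4096), dtype=np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

# FP16 halves the memory footprint, and with it the bandwidth needed
# to stream the weights through the GPU.
print(f"FP32 weights: {weights_fp32.nbytes / 1e6:.1f} MB")  # ~67.1 MB
print(f"FP16 weights: {weights_fp16.nbytes / 1e6:.1f} MB")  # ~33.6 MB

# Packed FP16 (two half-precision ops per FP32 lane) is why the MI25's
# quoted peak doubles from 12.3 TFLOPS FP32 to 24.6 TFLOPS FP16.
```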
Considering that EPYC server processors have up to 128 PCIe lanes available, AMD is claiming that the platform will be able to link up with Radeon Instinct GPUs at full bandwidth without the need to resort to PCI Express switches (which is a big plus). As we reported in March, AMD opines that an EPYC server linked up with four Radeon Instinct MI25 GPU accelerators has roughly the same computing power as the human brain.
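As a back-of-the-envelope check on that switchless-PCIe claim, here's a quick sketch; the assumption that each card gets a full x16 PCIe 3.0 link is ours, not an AMD spec.

```python
# Rough check of the switchless-PCIe claim (x16 Gen3 per GPU is our assumption).
EPYC_PCIE_LANES = 128        # PCIe lanes exposed by a single EPYC socket
LANES_PER_GPU = 16           # full-bandwidth link per Radeon Instinct card (assumed)
GEN3_GB_PER_LANE = 0.985     # ~1 GB/s per lane, each direction, for PCIe 3.0

gpus_at_full_bandwidth = EPYC_PCIE_LANES // LANES_PER_GPU
print(f"GPUs at full x16 bandwidth, no switches: {gpus_at_full_bandwidth}")            # 8
print(f"Per-GPU bandwidth: ~{LANES_PER_GPU * GEN3_GB_PER_LANE:.1f} GB/s each way")     # ~15.8 GB/s
```

Even the four-GPU configuration in the human-brain comparison above uses only half of those lanes, leaving plenty of I/O headroom for NVMe storage and networking.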
Looking ahead, Inventec has plans to offer a rack that incorporates a staggering 125 Radeon Instinct MI25 GPUs, delivering 3 PetaFLOPS of GPU compute performance. If your compute needs aren’t nearly as demanding, there will be a Falconwitch server supporting up to 16 Radeon Instinct MI25 GPUs providing 400 TFLOPS of compute performance.
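Those aggregate figures line up with a simple sanity check: multiplying the MI25's quoted 24.6 TFLOPS of peak FP16 by the GPU counts lands in the same ballpark (a rough sketch of theoretical peaks; sustained throughput in real workloads will be lower).

```python
# Back-of-envelope aggregate peak FP16 throughput from the per-card figure.
MI25_FP16_TFLOPS = 24.6

for label, gpu_count in [("Falconwitch server", 16), ("Inventec rack", 125)]:
    total_tflops = gpu_count * MI25_FP16_TFLOPS
    print(f"{label}: {gpu_count} GPUs -> {total_tflops:.0f} TFLOPS "
          f"(~{total_tflops / 1000:.1f} PFLOPS) peak FP16")

# Falconwitch server: 16 GPUs -> 394 TFLOPS (~0.4 PFLOPS) peak FP16
# Inventec rack: 125 GPUs -> 3075 TFLOPS (~3.1 PFLOPS) peak FP16
```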