NVIDIA's HGX H200 GPU With HBM3e Is An AI Beast For The Data Center
NVIDIA states that “the H200 is the first GPU to offer HBM3e — faster, larger memory to fuel the acceleration of generative AI and large language models, while advancing scientific computing for HPC workloads.” Thanks to the move to HBM3e, the NVIDIA H200 can provide 141GB of memory at 4.8 terabytes per second. That is a massive uplift over the NVIDIA A100: nearly double the capacity and a 2.4x increase in bandwidth.
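As a quick sanity check on those figures, here is a minimal sketch in Python, assuming the A100 80GB's published specs of 80GB capacity and roughly 2.0 TB/s of bandwidth (numbers we are supplying, not quoted in NVIDIA's announcement):

```python
# Back-of-envelope check of NVIDIA's uplift claims.
h200_capacity_gb = 141
h200_bandwidth_tbs = 4.8

# Assumption: comparing against the A100 80GB SXM (~2.0 TB/s).
a100_capacity_gb = 80
a100_bandwidth_tbs = 2.0

print(f"Capacity uplift:  {h200_capacity_gb / a100_capacity_gb:.2f}x")    # ~1.76x
print(f"Bandwidth uplift: {h200_bandwidth_tbs / a100_bandwidth_tbs:.2f}x")  # 2.40x
```

The capacity ratio lands at about 1.76x, which squares with NVIDIA's "almost double" phrasing, and the bandwidth ratio comes out to exactly 2.4x.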
The release of the NVIDIA H200 means that large language models will soon see a big boost in performance. For example, Llama 2, a 70-billion-parameter LLM, will nearly double its current inference speed. NVIDIA expects this performance to improve further with future software updates.
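A rough way to see why memory bandwidth matters so much for LLM inference: at low batch sizes, token generation is largely bandwidth-bound, since each generated token streams the full set of model weights from memory. The sketch below uses assumptions of ours, not figures from the article (FP16 weights, the H100 SXM's 3.35 TB/s, KV-cache traffic and compute overhead ignored), to estimate theoretical ceilings. Bandwidth alone buys roughly 1.4x; the near-2x figure NVIDIA quotes also folds in the software optimizations the company says are still coming.

```python
# Rough upper bound on single-stream decode speed for a bandwidth-bound
# LLM: tokens/s ~= memory bandwidth / bytes of weights read per token.
# Assumptions (ours, not from the article): FP16 weights, full model
# resident in GPU memory, KV-cache and compute overhead ignored.

params_billion = 70   # Llama 2 70B
bytes_per_param = 2   # FP16
weight_bytes = params_billion * 1e9 * bytes_per_param  # ~140 GB per token

for gpu, bw_tbs in [("H100 (HBM3)", 3.35), ("H200 (HBM3e)", 4.8)]:
    tokens_per_s = (bw_tbs * 1e12) / weight_bytes
    print(f"{gpu}: ~{tokens_per_s:.0f} tokens/s (theoretical ceiling)")
```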
The NVIDIA H200 will be compatible with the hardware and software used in current NVIDIA H100 systems. This backwards compatibility will let NVIDIA's server partners, such as Dell Technologies, ASRock Rack, Lenovo, Supermicro, and Hewlett Packard Enterprise, upgrade their existing systems with minimal effort. The H200 will also be available in the NVIDIA GH200 Grace Hopper Superchip with HBM3e, which was announced back in August of this year.
The NVIDIA H200 will make its way to cloud service providers and be available from systems manufacturers globally in the second quarter of 2024.