AMD Built A Beastly 88-Core EPYC CPU With Monster HBM3 Memory Bandwidth For Microsoft
This announcement comes not from AMD, but from Microsoft, which is much more interested in telling you about its Azure HBv5 Virtual Machines: cloud computing you can rent with some pretty unique capabilities. We, of course, are much more interested in the hardware behind those capabilities: namely, what Microsoft describes as a custom 4th-generation EPYC chip with high-bandwidth memory (HBM).
In case you didn't know, HBM isn't simply a categorical descriptor, but rather a specific memory technology. It achieves extremely high memory transfer rates by using an absurdly wide memory interface. Historically, HBM ran at relatively low per-pin transfer rates, but modern HBM3e has pushed the transfer rate up to be competitive with desktop DDR5 memory. Considering that the interface is typically at least eight times wider than a regular desktop CPU's memory interface, you're really talking about some high bandwidth.
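To put rough numbers on the "wide interface times transfer rate" idea, here is a minimal sketch of the theoretical peak calculation. The specific figures are illustrative assumptions, not numbers from Microsoft or AMD: a 128-bit dual-channel desktop DDR5 interface, a single 1024-bit HBM stack, and 6400 MT/s for both.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: float) -> float:
    """Theoretical peak bandwidth in decimal GB/s.

    Each transfer moves bus_width_bits / 8 bytes; multiply by transfers per second.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mt_s * 1e6 / 1e9


# Assumed, illustrative configurations (not taken from the announcement):
desktop_ddr5 = peak_bandwidth_gb_s(bus_width_bits=128, transfer_rate_mt_s=6400)
hbm_stack = peak_bandwidth_gb_s(bus_width_bits=1024, transfer_rate_mt_s=6400)

print(f"Desktop dual-channel DDR5: {desktop_ddr5:.1f} GB/s")
print(f"Single HBM stack:          {hbm_stack:.1f} GB/s")
print(f"Ratio:                     {hbm_stack / desktop_ddr5:.1f}x")
```

At equal transfer rates the ratio is simply the ratio of bus widths, which is where the "eight times wider" figure does its work. Real chips carry several HBM stacks, so the total scales further still.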
Microsoft says that these chips are based on AMD's Zen 4 architecture and run with SMT disabled for maximum single-threaded throughput. They clock as high as 4 GHz, which won't impress desktop gamer types, but isn't bad for an EPYC machine with up to 352 CPU cores per server (four of the 88-core chips). Those cores are served by "400-450 GB" of HBM3 memory, and Microsoft says these chips also sport double the Infinity Fabric bandwidth between CPUs compared to typical EPYC servers.
The peak bandwidth achieved in this configuration is a very nice 6.9 TB/second, as measured in the STREAM Triad benchmark. This kind of memory bandwidth is almost unprecedented, at least for CPUs; people in the know will recall that Intel also shipped server CPUs with HBM onboard as part of its Xeon CPU Max Series.
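For readers unfamiliar with STREAM Triad, the kernel is just `a[i] = b[i] + scalar * c[i]` over large arrays, and the reported bandwidth counts three 8-byte words moved per element (two reads and one write). The sketch below only illustrates how the metric is defined. It is pure Python, so the number it prints reflects interpreter overhead rather than hardware bandwidth, and the function name and defaults are our own, not part of the official benchmark.

```python
import time
from array import array


def stream_triad(n: int = 200_000, scalar: float = 3.0, repeats: int = 3):
    """Run a Triad-style kernel and return (result array, best GB/s).

    Illustrative only: real STREAM is compiled C/Fortran with arrays far
    larger than the caches, usually run across all cores.
    """
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    a = array("d", [0.0]) * n

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for i in range(n):
            a[i] = b[i] + scalar * c[i]  # the Triad operation
        best = min(best, time.perf_counter() - start)

    # STREAM convention: 3 words per element (read b, read c, write a).
    bytes_moved = 3 * a.itemsize * n
    return a, bytes_moved / best / 1e9


if __name__ == "__main__":
    _, gb_s = stream_triad()
    print(f"Triad (pure Python, not representative): {gb_s:.3f} GB/s")
```

The takeaway is the accounting rather than the result: a figure like 6.9 TB/s means the machine sustained that many bytes of reads and writes per second while doing almost no arithmetic per byte.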
Memory bandwidth per core has been a sore spot for high-performance computing for a while. We've reported on this topic a few times before, including when JEDEC proposed its MRDIMM standard for Multiplexed Rank DIMMs, which promise double the memory bandwidth of standard DIMMs. That may help ease things as core counts continue to rise, but bandwidth-per-core is still relatively low in such a configuration. Making use of HBM3 neatly puts a bow on that problem.
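A quick back-of-the-envelope calculation shows why bandwidth per core is the number to watch. The HBv5 figures (6.9 TB/s, 352 cores) come from Microsoft's announcement as quoted above. The comparison point is our own assumption: a 96-core Zen 4 EPYC socket with 12 channels of DDR5-4800 at theoretical peak. Note that this compares a measured STREAM result against a theoretical peak, so if anything it flatters the DDR5 side.

```python
def bandwidth_per_core_gb_s(total_gb_s: float, cores: int) -> float:
    """Average memory bandwidth available to each core, in GB/s."""
    return total_gb_s / cores


# From the announcement: 6.9 TB/s STREAM Triad across 352 cores.
hbv5_per_core = bandwidth_per_core_gb_s(6900.0, 352)

# Assumed comparison: 12 channels x 8 bytes x 4800 MT/s, shared by 96 cores.
ddr5_socket_peak_gb_s = 12 * 8 * 4800 / 1000  # 460.8 GB/s theoretical
ddr5_per_core = bandwidth_per_core_gb_s(ddr5_socket_peak_gb_s, 96)

print(f"HBv5 (measured):           {hbv5_per_core:.1f} GB/s per core")
print(f"DDR5 EPYC (theoretical):   {ddr5_per_core:.1f} GB/s per core")
print(f"Advantage:                 {hbv5_per_core / ddr5_per_core:.1f}x")
```

Under these assumptions each HBv5 core gets roughly four times the memory bandwidth of a core in a fully populated conventional socket.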
Just as the Steam Deck's "Aerith" processor was likely not created with the Steam Deck in mind, these EPYC chips in use by Microsoft were probably not designed with cloud-based HPC in mind, either. There were rumors in the past that AMD would introduce Instinct accelerators that dropped the CDNA GPU dies in favor of Zen CPU CCDs, and that's exactly what this appears to be. Those products never came to general availability, but it seems they did find their way to market one way or another.