AMD Confirms Zen 6 CPUs Will Support AVX512 And These Other Instruction Sets

The idea that AMD's Zen 6 would support AVX-512 in some fashion has never really been in question, to tell the truth. With native 512-bit vector datapaths and a nearly complete AVX-512 implementation, AMD's Zen 5 already has the strongest AVX-512 support on client systems, but Zen 6 looks poised to expand that support considerably. We know this thanks to recent patches submitted for the GNU Assembler (Gas).

Specifically, the patches add a new "Znver6" target to Gas, confirming that the new architecture supports everything Zen 5 does, plus several new instruction set extensions: AVX512_BMM, AVX_NE_CONVERT, AVX_IFMA, AVX_VNNI_INT8, and AVX512_FP16. Of those, only AVX512_BMM is truly novel; the rest are already supported by Intel's Granite Rapids Xeon processors. AVX512_BMM is a new set of instructions designed to accelerate bit masking operations for matrices, which offers big wins for binary neural networks.

[Image: znver6] From the mailing list for GNU Binutils.
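For readers who want to check what their own CPU reports, here is a minimal Python sketch that looks for the extensions named in the patch in a Linux `/proc/cpuinfo` dump. The lowercase flag spellings are an assumption based on the names in the patch; a given kernel may spell them differently or not expose them at all, and the helper names `parse_cpu_flags` and `report_extensions` are purely illustrative.

```python
# Extension names from the Znver6 patch, lowercased. Whether a given kernel
# reports these exact strings in /proc/cpuinfo is an assumption.
ZEN6_EXTENSIONS = (
    "avx512_bmm",
    "avx_ne_convert",
    "avx_ifma",
    "avx_vnni_int8",
    "avx512_fp16",
)


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report_extensions(cpuinfo_text, wanted=ZEN6_EXTENSIONS):
    """Map each wanted extension name to True/False depending on presence."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable
    for name, present in report_extensions(text).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

On any current CPU, most or all of these should come back "no"; the point is only to show how little tooling is needed to inspect CPU capabilities.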

Arguably the most exciting part of this is AVX512_FP16, though. This adds support for the FP16 datatype as a "first class" citizen on x86-64 CPUs. Again, AMD isn't first to market with this, but it is the first to bring this functionality to a client desktop platform. That's a big deal, because a huge swath of AI and ML development is performed using FP16 datatypes, and historically, that has meant either having to clumsily emulate support using FP32 compute, or offloading it to accelerators with native support.
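To make the "emulate using wider compute" idea concrete, here is a small Python sketch of what FP16 emulation looks like in principle: values are stored as 16-bit halves, widened for the arithmetic, and rounded back to FP16 afterward. It uses only the standard library's `struct` half-precision format; Python floats are 64-bit, so they stand in for the FP32 intermediate here, and the helper names `to_fp16` and `emulated_fp16_add` are illustrative, not from any real library.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # Packing with the 'e' format stores the value as a 16-bit half;
    # unpacking widens it back to a Python float without further change.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def emulated_fp16_add(a, b):
    """Add two values as if they were FP16: widen, add, round back."""
    # Assumption: the wide intermediate is a Python float (64-bit),
    # standing in for the FP32 compute a real emulation path would use.
    return to_fp16(to_fp16(a) + to_fp16(b))


if __name__ == "__main__":
    print(to_fp16(0.1))                    # 0.0999755859375
    print(emulated_fp16_add(2048.0, 1.0))  # 2048.0, the 1.0 is rounded away
```

Every operation pays for a widen and a round-trip back to 16 bits, which is the overhead that native FP16 instructions avoid.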

[Embedded tweet: @FelixCLC_ on AMD FP16]

Having native FP16 support on client desktop CPUs means that it will soon become much simpler to experiment with AI in a home lab without needing expensive and power-thirsty graphics hardware. As HPC developer and self-described "perf & ASM nerd" @FelixCLC_ explains above, CPUs are present in every system, and have the best "observability," which means that they offer superior introspection tools (like perf counters, debuggers, and instruction-level tracing) as well as repeatable execution without GPU black boxes.

Notably, many higher-end Arm processors already support FP16 as a first-class citizen, but those chips are less convenient to use as development and test platforms when compared to a mainstream x86-64 CPU. In other words, with the release of AMD's Zen 6 processors some time next year, it will be the first time that someone working on a normal x86-64 desktop will be able to natively write kernels in FP16, benchmark them using standard CPU profiling tools, and get results that map directly to real hardware behavior, with no vendor-specific SDKs required. It's a huge leap for the ecosystem of experimentation, especially considering that Intel appears poised to continue its market segmentation of AVX-512 with its upcoming Nova Lake processors.

Props to Phoronix for catching this story.
Zak Killian