AMD EPYC 7004 Zen 4 CPUs Allegedly Gaining AVX3-512, BFloat16 Instructions To Battle Xeon

No matter what kind of pricing proposition AMD brought to the server sector or how many cores and threads it offered up, Intel could always play the AVX-512 card, touting superior performance in very specific workloads that leverage that instruction set. Perhaps not for long, though. Rumor has it AMD's EPYC 7004 series CPUs will embrace AVX3-512.

We're not talking about AMD's next-generation server chips, but its next-next-generation models. In the more immediate future, AMD is getting ready to unveil its EPYC 7003 series based on Zen 3, codenamed Milan. Then sometime after (perhaps very late this year, or more likely around this time next year), it will roll out its Zen 4-based EPYC "Genoa" CPUs.

That's looking a bit far out in the distance, and obviously there is very little in the way of confirmed information. We know AMD will release a server series based on Zen 4, and those CPUs will help power El Capitan, which is expected to be the fastest supercomputer on the planet when fully deployed.

As far as leaked information goes, however, check out this slide...

AMD EPYC 7004 Slide
Source: Chiphell

A user posted this on the Chiphell forum, and if the slide is real and accurate, AMD's Genoa stack will support some additional instruction set architecture (ISA) extensions, including AVX3-512, BFloat16, and others. Adding AVX3-512 and BFloat16 to the mix would be a major deal as AMD and Intel duke it out in the lucrative server space.

It was only a few months ago that Intel made the bold claim that its 32-core Ice Lake-SP Xeon CPUs could get the better of a 64-core/128-thread EPYC 7742 processor.

"Customers running life sciences and financial services applications can expect to see higher performance on workloads such as NAMD molecular dynamics simulation (up to 1.2 times), Monte Carlo simulations (up to 1.3 times), and LAMMPS molecular modeling simulation (up to 1.2 times) compared to competitive x86 systems featuring twice as many cores as a 32-core Ice Lake processor-based system," Intel said at the time.

Intel's claim was based on specific benchmarks comparing a system with two 32-core Ice Lake-SP Xeon processors (for a total of 64 cores and 128 threads) going up against a pair of EPYC 7742 processors (for a total of 128 cores and 256 threads). The caveat is that Intel cherry-picked workloads that take advantage of AVX-512 instructions, which its processors support and AMD's do not.

AVX-512, or Advanced Vector Extensions 512, can require quite a bit of power and generate considerable heat in some cases, but workloads designed to tap into the instruction set can also see a big performance boost. Meanwhile, BFloat16, or Brain Floating Point, was developed by Google for its Cloud TPUs and can help accelerate deep learning workloads.
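For a rough feel of what these two features mean in practice, here is a minimal Python sketch. It is purely illustrative and has nothing to do with how AMD or Intel implement anything in silicon. The function names are our own, and the conversion simply truncates the low bits, whereas real hardware typically rounds. A bfloat16 value is essentially the upper 16 bits of a 32-bit float: the same sign bit and 8-bit exponent, with the mantissa cut from 23 bits to 7.

```python
import struct


def lanes_per_register(register_bits: int, element_bits: int) -> int:
    """How many elements of a given width fit in one vector register."""
    return register_bits // element_bits


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x.

    Simplification: this truncates (rounds toward zero) by dropping the
    low 16 bits of the float32 encoding. Hardware conversions usually
    round to nearest even instead.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def bfloat16_bits_to_float(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (value,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return value


if __name__ == "__main__":
    # A 512-bit register holds 16 float32 values or 32 bfloat16 values.
    print(lanes_per_register(512, 32), lanes_per_register(512, 16))

    # Round-trip a value to see the precision that bfloat16 gives up.
    b = float32_to_bfloat16_bits(3.14159)
    print(hex(b), bfloat16_bits_to_float(b))
```

The takeaway is that bfloat16 keeps the full range of a 32-bit float while packing twice as many values into each 512-bit register, at the cost of precision that deep learning workloads can usually tolerate.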

What this all boils down to is a much more interesting match-up between AMD's Genoa processors and Intel's Sapphire Rapids Xeon chips, both of which are expected to support DDR5 memory and PCI Express 5.0.