Intel Revamps Falcon Shores GPU Plans, Integrates Powerful Gaudi AI Accelerator

The International Supercomputing Conference, now known simply as ISC, starts today and runs through Thursday in Hamburg, Germany. All the big players are in attendance, but perhaps none are bigger than Intel. The company came out in full force at ISC last year, announcing piles of new products and grand plans for them. It should perhaps be no surprise, then, that Intel isn't actually announcing any new products this year.

That's not to say that Intel doesn't have any announcements to make, though. The company held a press briefing where Jeff McVeigh, the company's CVP of Supercomputing, talked to us about Intel's plans in the high-performance computing (HPC) space. Those plans primarily boil down to "iterate on what is already working," and it's easy to understand why: if it ain't broke, don't fix it.

That includes its upcoming Emerald Rapids-family processors, which will become the "5th-generation Xeon Scalable family." These will succeed Sapphire Rapids and use the same platform, which could lead some to say that it's essentially a Sapphire Rapids refresh. Intel doesn't describe them that way, though, and says that Emerald Rapids is going to offer improved performance, efficiency, and core counts over Sapphire Rapids. Those parts are scheduled to launch at the end of this year.

The company also confirmed that Granite Rapids is still coming next year, and it should be a more significant evolution over the current-generation parts. Those will get a die shrink to Intel 3, and will in fact be the "first P-core Xeons" on that process. Intel promises increased core density as well as "memory & I/O innovations" on Granite Rapids.


McVeigh points out that the need for memory bandwidth is rapidly outpacing what's available due to asymmetrically rising CPU core counts. To help combat this trend, Intel announced that Granite Rapids will implement the company's MCR-DIMM standard.

MCR-DIMMs, known more fully as "Multiplexer Combined Rank" DIMMs, are pretty similar to the JEDEC MRDIMM standard that we've written about before, and actually predate it. In fact, Intel says that it has released its MCR-DIMM tech to the market and that it "understands that other CPU vendors" are following suit. MCR-DIMMs will apparently offer an 83% peak bandwidth increase over standard DDR5, giving transfer rates over 1.5 TB/second on a two-socket Granite Rapids server.
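For a rough sense of where figures like these could come from, here is a back-of-the-envelope sketch in Python. Every input is an assumption on our part rather than something Intel stated in the briefing: a DDR5-4800 baseline, MCR-DIMMs running at 8,800 MT/s, a 64-bit (8-byte) data bus per channel, and 12 memory channels per socket. The function name `peak_bandwidth_gbs` is ours, purely for illustration.

```python
# Back-of-the-envelope peak memory bandwidth estimate.
# All inputs below are ASSUMPTIONS for illustration, not figures from Intel's briefing.

def peak_bandwidth_gbs(transfers_mt_s, channels, sockets=1, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (decimal gigabytes).

    transfers_mt_s: memory transfer rate in megatransfers per second
    channels:       memory channels per socket
    sockets:        number of CPU sockets
    bus_bytes:      bytes moved per transfer per channel (64-bit bus = 8 bytes)
    """
    return transfers_mt_s * bus_bytes * channels * sockets / 1000

DDR5_MT_S = 4800          # assumed baseline: standard DDR5-4800
MCR_MT_S = 8800           # assumed MCR-DIMM transfer rate
CHANNELS_PER_SOCKET = 12  # assumed channel count per socket
SOCKETS = 2               # two-socket server, as in the article

ddr5_total = peak_bandwidth_gbs(DDR5_MT_S, CHANNELS_PER_SOCKET, SOCKETS)
mcr_total = peak_bandwidth_gbs(MCR_MT_S, CHANNELS_PER_SOCKET, SOCKETS)
uplift_pct = (mcr_total / ddr5_total - 1) * 100

print(f"DDR5 baseline: {ddr5_total:.1f} GB/s")
print(f"MCR-DIMM:      {mcr_total:.1f} GB/s")
print(f"Uplift:        {uplift_pct:.1f}%")
```

Under those assumptions, the uplift works out to roughly 83%, and the two-socket total lands a little under 1.7 TB/s, which squares with Intel's "over 1.5 TB/second" claim. Real-world sustained bandwidth will be lower than any theoretical peak.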

McVeigh pointed out that if you look at the HPC market in general, it is overwhelmingly dominated by processing done on CPUs, but if you look specifically at large-scale AI development, it is actually mostly done on GPUs and AI accelerators. AI is by far the largest growth market within HPC, and as a result, Intel has shifted its plans a bit from what it announced last year.

We've already reported that Intel canceled "Rialto Bridge", the successor to Ponte Vecchio. Intel's not giving up on GPUs, though. It's just that the company wants to move ahead to newer and better products instead of iterating on its parts in this space. Next up for Intel's GPUs will be Falcon Shores. "But wait Zak," you say. "Wasn't Falcon Shores going to be some kind of CPU-GPU hybrid thing?"


Indeed it was, dear reader. Intel originally announced Falcon Shores as an "XPU," because the original intention was that it would combine Intel x86 CPU cores with Xe GPU cores using a mixed tile architecture. In the press briefing, Jeff McVeigh admitted that his "prior push and emphasis around integrating CPU and GPU into an XPU was premature."

He explained that integration makes sense when the market is stable and you know what will best serve your customers, but the part of the market that could make use of something like Falcon Shores is in a heavy state of flux right now. Keeping things disintegrated offers Intel greater flexibility to serve the market.

Falcon Shores isn't going anywhere, though. Renamed "Falcon Shores GPU," it will remain as the successor to the company's current Data Center Max GPUs while offering the latest technology—things like HBM3 and "next-generation I/O", a CXL programming model, and whatever Intel's latest GPU architecture is at that time. Falcon Shores will also apparently be able to integrate Habana Gaudi accelerators in some fashion, which we'll talk about in a moment.

If you fancy a set of servers using Intel's Data Center Max GPUs, you'll likely be pleased to know that the company will have an eight-GPU Universal Baseboard (UBB) available for super-high-density GPU compute needs. Those will see "broad" availability starting in July, which is around the time that its Data Center GPUs will become available in general. If you'd like one on a PCIe card, though, you'll probably have to wait until August.

The company spoke briefly about its extant Gaudi 2 accelerators. Intel added these parts to its portfolio with the purchase of AI startup Habana Labs, and they're apparently quite potent processors indeed. Intel compares Gaudi 2 directly against NVIDIA's A100 GPUs, and claims that Gaudi 2 comes out ahead by as much as 2.44x in workloads like Stable Diffusion image generation.

Arguably the most exciting development out of Intel's supercomputing division has been the oneAPI standard. This is an open industry specification for communicating with parallel processors; you can think of it as an analog to NVIDIA's CUDA, except that oneAPI is not only free and open-source, but also multi-architecture and even multi-vendor. It's not locked down to Intel's hardware; the company proudly proclaims that it is about "breaking free from proprietary APIs."

According to Intel, oneAPI is performance-competitive with NVIDIA's and AMD's own vendor-specific APIs for compute workloads, which is impressive. The company boasts that it offers excellent speed across architectures and vendors, even when using "portable single-source code." That's great news for HPC developers, because it means you don't have to worry about being locked into a certain hardware vendor simply because that's what your code runs on.

It may not be flashy, but Intel's presentation was really more about looking back over the past year of software developments and hardware releases. Now that Intel has moved to a two-year release cadence for its GPUs, there are naturally going to be dead years like this one where there are no new products to crow about. Here's looking forward to the future, then. With any luck, Intel's next-generation hardware will be as hot as it hopes.