Intel Falcon Shores Puts An x86 CPU And Xe GPU In A Single Xeon Socket For Supercomputers

Intel Falcon Shores slide with Raja Koduri
Intel is making a flurry of announcements today, including that its first desktop Arc Alchemist graphics cards will arrive in the second quarter and that it plans to ship 4 million discrete GPUs by the end of the year. Its graphics strategy extends much further than the desktop, though, and all the way into supercomputers. In that realm, Intel today tipped its upcoming Falcon Shores architecture.

What exactly is Falcon Shores? We'll know the full details in time. For now, however, the interesting tidbit is that Falcon Shores will merge x86 CPU and Xe GPU hardware into a single Xeon socket.

"We are working on a brand new architecture codenamed Falcon Shores. Falcon Shores will bring x86 and Xe GPU acceleration together into a Xeon socket, taking advantage of next generation packaging, memory, and IO technologies, giving huge performance and efficiency improvements for systems computing large data sets and training gigantic AI models," said Raja Koduri, head of Intel's Accelerated Computing Systems and Graphics Group.

This is not the same thing as integrated graphics. By combining x86 CPU and Xe GPU resources in a single socket and moving to a unified memory architecture, Intel says it can achieve big gains, in part by way of what Koduri says is a vastly simplified GPU programming model. Intel is calling this solution an XPU.

Intel Falcon Shores performance slide

According to Koduri, Falcon Shores will deliver a better than 5x uplift in performance per watt, and the same better-than-5x target applies to compute density and to memory capacity and bandwidth. This won't come just from combining x86 CPU and Xe GPU resources in a single Xeon socket, but also by way of an "impressive array of technologies" that underpin the architecture.

Those technologies include taking advantage of the angstrom era. Intel renamed its nodes last year, and the angstrom era starts with what it refers to as Intel 20A (preceded by Intel 3, Intel 4, Intel 7, and 10nm SuperFin). An angstrom, by the way, is a unit of length equal to one hundred-millionth of a centimeter, or one-tenth of a nanometer. According to what Intel revealed last year, Intel 20A will introduce a new RibbonFET transistor architecture and interconnect technologies.
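To put the unit in perspective, here is a minimal Python sketch of the conversion. It assumes the "20" in Intel 20A is read nominally as 20 angstroms; node names are marketing labels rather than measured feature sizes, and the function names are ours, purely for illustration.

```python
# Unit arithmetic only: 1 angstrom = 0.1 nanometer = 1e-8 centimeter.
ANGSTROMS_PER_NANOMETER = 10
CENTIMETERS_PER_ANGSTROM = 1e-8


def angstroms_to_nanometers(angstroms: float) -> float:
    """Convert a length in angstroms to nanometers."""
    return angstroms / ANGSTROMS_PER_NANOMETER


def angstroms_to_centimeters(angstroms: float) -> float:
    """Convert a length in angstroms to centimeters."""
    return angstroms * CENTIMETERS_PER_ANGSTROM


# Assumption: treat the "20" in "20A" as a nominal 20 angstroms.
print(angstroms_to_nanometers(20))  # 2.0
```

Read that way, 20A works out to a nominal 2 nanometers, which is why the angstrom naming picks up where nanometer-based labels leave off.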

Koduri also says Intel is developing new extreme bandwidth shared memory that will play into this, though he didn't elaborate.

"We are super excited about this architecture as it brings acceleration to a much broader range of workloads than the current discrete solutions," Koduri said.

This is all aimed at delivering revolutionary upgrades en route to enabling zetta-scale processing within the supercomputing segment by 2027. That represents a 1,000x performance increase over exascale in just a few years. We'll have to wait until later for more details, though. In the meantime, Intel has posted a handful of breakout session videos as part of today's Investor Meeting event.
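As for where the 1,000x figure comes from, it follows directly from the SI prefixes. The short Python sketch below assumes the baseline is exascale (10^18 operations per second), which the announcement as reported here does not spell out; the names are illustrative only.

```python
# SI prefixes: exa = 10^18, zetta = 10^21.
EXA = 10**18
ZETTA = 10**21


def scale_factor(target: int, baseline: int) -> int:
    """Return how many times larger target is than baseline."""
    return target // baseline


# Assumption: the jump is measured from exascale to zetta-scale.
print(scale_factor(ZETTA, EXA))  # 1000
```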