Here's NVIDIA’s Vera Rubin AI Superchip — 88 Cores, Two GPUs, Gobs Of Memory And Next-Level Design

[Image: NVIDIA Vera Rubin Superchip]
NVIDIA held its annual GPU Technology Conference (GTC) in Washington, D.C. yesterday, and as a surprise showing in the middle of his keynote, company CEO Jensen Huang pulled out a Vera Rubin Superchip, marking the first time that this product has been shown to the public. The part looks quite different from even the GB300 Blackwell Ultra Superchip, largely thanks to its use of SOCAMM2 memory and on-board connectivity.

[Image: Vera Rubin Superchip]
The slide that NVIDIA showed, featuring a render of the uncapped chips.

If you haven't been following along, the Vera Rubin Superchip takes its name from the combination of an 88-core, Arm-based Vera CPU and two Rubin GPUs, each of which boasts a pair of reticle-sized dice along with 288GB of HBM4 memory. NVIDIA is actually counting the dice separately, so where a rack of Blackwell is called NVL72, a rack with the same number of Rubin packages will be called NVL144.
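For the curious, here's a quick back-of-the-envelope sketch in Python of how that counting convention works out, using only the figures above; the per-rack HBM4 total is our own derived estimate rather than an NVIDIA-quoted figure:

```python
# Rough sketch of NVIDIA's NVL naming math for Rubin, based on the figures above.
# Assumption: a rack carries 72 GPU packages, as with Blackwell NVL72; for Rubin,
# NVIDIA counts the two reticle-sized dice in each package individually.

PACKAGES_PER_RACK = 72        # same package count as a Blackwell NVL72 rack
DICE_PER_RUBIN_PACKAGE = 2    # each Rubin package pairs two reticle-sized dice
HBM4_PER_PACKAGE_GB = 288     # HBM4 capacity quoted per Rubin GPU package

nvl_designation = PACKAGES_PER_RACK * DICE_PER_RUBIN_PACKAGE
rack_hbm4_tb = PACKAGES_PER_RACK * HBM4_PER_PACKAGE_GB / 1000

print(f"Rack designation: NVL{nvl_designation}")            # -> NVL144
print(f"Estimated HBM4 per rack: ~{rack_hbm4_tb:.1f} TB")   # -> ~20.7 TB
```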

The Vera CPU is quite interesting, as it represents a major shift away from the previous-generation Grace CPU. Vera is a chiplet processor that uses expandable (and, more importantly, replaceable) SOCAMM2 memory. NVIDIA says it employs customized Arm cores, although not much is known about their capabilities at this time. We do know that Vera has 88 cores with simultaneous multi-threading (SMT), giving 176 threads, as well as 1.8TB/s of bandwidth to each GPU via NVLink-C2C.

[Image: Jensen Huang with two server trays]
CEO Huang with the two main types of trays that make up an NVL144 rack.

NVLink is, of course, also how the "Superchips" are connected to each other, allowing them to act as a single massive processor. The Vera Rubin NVL144 is apparently entirely cable-less internally, and you can see this borne out in the Superchip Huang showed on stage, which has no ports or plugs anywhere besides the five edge connectors; we believe those comprise two for NVLink and three for other miscellaneous I/O.

A Vera Rubin tray includes eight ConnectX-9 NICs alongside the processors, as well as a next-gen BlueField-4 DPU to help synchronize and coordinate all of this data movement. Huang emphatically stated that the total system bandwidth of an NVL144 rack is equivalent to "the entire data usage of the Internet in one second," which is a heck of a comparison if you take it at face value.

[Image: Vera Rubin CPX compute tray]

Of course, Huang also talked briefly about the Vera Rubin CPX system, which takes the very same Vera Rubin compute trays and adds eight additional Rubin CPX GPUs. We've written about Rubin CPX before; it's a more specialized processor designed to accelerate the compute-heavy context phase that's crucial to rapid AI inference. Each Rubin CPX processor is its own multi-hundred-watt GPU; we shudder to think how much power the full Vera Rubin NVL144 CPX will use.

[Image: NVIDIA roadmap timeline]

The future looks bright for NVIDIA, which just became the world's first $5 trillion company. After Rubin CPX will come Rubin Ultra, putting four Rubin GPUs and a full terabyte of HBM4E memory on a single package. After that, the company's next GPUs will be known as Feynman, although we know virtually nothing about those products beyond the likelihood that they'll be built on TSMC's A16 process in 2028.

Rubin is scheduled to enter mass production "around this time next year, maybe a little earlier," according to Huang, and it will ship to customers in early 2027.