TSMC Candidly Explains Why It Can't Keep Up With NVIDIA's Red Hot AI Chip Demand

Unlike many computing trends, AI has practical purposes, so it has stuck around beyond the initial fad period. People are using AI for anything and everything, even when it isn't particularly well suited to the task. (Note that all of our articles are written by humans!) If you want to run large AI models like GPT-4, you need extremely large and powerful GPUs to get any kind of decent performance, and that basically means buying NVIDIA hardware.

It's this incredible surge in demand for AI processing that has pushed NVIDIA's stock to the moon, giving the company a valuation in the one trillion US dollar range, a number so large it is practically inconceivable. The thing is, it would be even higher, except TSMC can't make NVIDIA's GPUs fast enough.

Strictly speaking, that's not quite true, as it turns out. TSMC can make chips very, very quickly. The problem is actually with the advanced packaging that NVIDIA's Hopper GPUs require. You see, part of the extremely high performance of Hopper in AI tasks is thanks to its use of exotic HBM2e and HBM3 memory. That memory has to be installed on the same package as the GPU chip itself, and this means unusual packaging.

The specific technology in question is TSMC's CoWoS, an acronym that stands for Chip on Wafer on Substrate. It's a high-density packaging solution not unlike Intel's newer EMIB tech. Fundamentally, it is what allows Hopper to work, as without packaging technology like this it would be difficult or impossible to achieve the levels of memory bandwidth that the big datacenter-focused GPUs require.

Above: NVIDIA's A100 GPU. Image: NVIDIA | Top: A lithography reticle mask. Image: TSMC

Unfortunately, CoWoS is also what's holding up the production pipeline, it seems. Talking to Nikkei Asia, TSMC Chairman Mark Liu said that, contrary to popular belief, the shortage isn't in the AI chips themselves. Instead, the bottleneck is CoWoS capacity: the company is cranking out CoWoS packages as fast as it can, and it's simply unable to keep up with demand. According to Liu, demand for CoWoS picked up "suddenly", which apparently means it tripled in a year.

While being unable to meet all of your customers' demand isn't great, it's a lot better than having no demand at all. Liu says that he expects this bottleneck to be alleviated in "one and a half years." That's due to the expansion of TSMC's advanced chip packaging capabilities with a new $2.9 billion USD facility specifically for these technologies.

What Liu didn't say is how much extra capacity the new facility will add, but it's probably at least double. We say that because Liu stated his belief that advanced packaging like CoWoS is part of a "paradigm shift" in the semiconductor industry, where instead of building ever bigger and denser processors, we stack and combine multiple chips together into bigger packages.