Intel's 128MB L4 Cache May Be Coming to Desktops with 14nm Broadwell-K CPUs

When Intel debuted Haswell this year, it launched its first mobile processor with a massive 128MB L4 cache. Dubbed "Crystal Well," this on-package (not on-die) pool of memory isn't just a graphics frame buffer, but a giant pool of RAM that the entire chip can use. The performance impact is significant, though the Haswell processors that include the L4 cache don't appear to account for very much of Intel's total CPU volume.

Right now, the L4 cache pool is only available on mobile parts, but according to CPU-World, Broadwell-K will change that. The 14nm desktop chips aren't due until the tail end of next year; before then, we should see a desktop refresh in the spring with a second-generation Haswell part. Still, it's a sign that Intel intends to integrate the large L4 as standard on a wide range of parts.

Why Crystal Well Matters

There are two reasons to pay attention to Crystal Well. First, it's entirely possible that Intel will integrate the massive cache across all chips at some future date. Using eDRAM instead of SRAM allows the company to dedicate just one transistor per cell instead of the 6T configurations commonly used for L1 or L2 cache. That means the memory isn't quite as fast or as efficient as SRAM would be, but it saves an enormous amount of die space. At 1.6GHz, L4 latencies are 50-60ns: significantly higher than the L3, but roughly half the latency of main memory.
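To put the one-transistor-versus-six trade-off in rough numbers, here is a back-of-the-envelope sketch in Python. It assumes "128MB" means 128 MiB and counts only the storage cells' transistors; it ignores the capacitor in each eDRAM cell, plus tags, ECC, and peripheral circuitry, so treat the results as an illustration rather than Intel's actual transistor budget. The function name is made up for this example.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def cell_transistors(capacity_bytes: int, transistors_per_bit: int) -> int:
    """Transistors needed for the storage cells alone (hypothetical helper)."""
    return capacity_bytes * 8 * transistors_per_bit


# Assumption: the 128MB L4 is 128 MiB of raw storage.
l4_bytes = 128 * MIB

edram = cell_transistors(l4_bytes, 1)  # one transistor per eDRAM cell
sram = cell_transistors(l4_bytes, 6)   # 6T SRAM cell

print(f"eDRAM (1T): {edram / 1e9:.2f} billion transistors")  # 1.07 billion
print(f"SRAM (6T):  {sram / 1e9:.2f} billion transistors")   # 6.44 billion
```

Under those assumptions, a 6T SRAM array of the same capacity would need roughly 5.4 billion more transistors for the cells alone, which is why a cache this large only makes sense as eDRAM.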

By integrating that huge pool of memory into desktop processors, Intel stands to boost performance modestly in both graphics and non-graphics workloads. And that's important, given the company's focus on form factors like NUC, which have no space for an external graphics card built to desktop specifications. The goal, for Intel, is to simultaneously build chips that hit "good enough" graphics and to expand the definition of "good enough" to include an increasingly large number of people.

That's a hump that AMD has struggled with for years. The fact is, casual users don't care much about integrated graphics, provided they can watch video and perform other basic tasks. For integrated graphics to serve as a major selling point, AMD needs either HSA functionality (to offer acceleration in regular workloads), dramatically faster graphics for casual tasks like web surfing, or high enough integrated performance to attract low-end gamers. Strong video and fast accelerated web browsing are features Intel offers already, so AMD's ability to hack out a clear win on these points is limited. Faster gaming performance is something the company is definitely shooting for with Mantle and its upcoming Kaveri APU, as well as broad support for HSA.

If Intel moves to make Crystal Well a standard option on desktop parts for 14nm (including the second-generation 14nm chip, Sky Lake), it's also a step towards integrating the L4 buffer into a wider variety of mobile chips. This generation, on 22nm, only 45W chips have the integrated buffer. It's entirely possible that we'll see 14nm mobile parts with the large L4 pushing into 25-35W TDPs. That's going to put still more pressure on AMD's mobile Kaveri.

The desktop threat, meanwhile, could be even larger. If Intel pushes the next-generation implementation of its graphics architecture to take advantage of a 75-88W TDP, it might be able to challenge Nvidia and AMD discrete cards at the low end,