Intel announced its family of Xeon Scalable Processors in early May, featuring the Skylake-SP microarchitecture. Those processors haven’t officially launched just yet, but today the chip giant is revealing one of the key technologies being leveraged in the Xeon Scalable Processor family. A new mesh interconnect architecture has been designed to increase bandwidth between on-chip elements, while simultaneously decreasing latency, and improving power efficiency and scalability.
In a blog post on the company’s website, Akhilesh Kumar, Skylake-SP CPU Architect, explains, “The task of adding more cores and interconnecting them to create a multi-core data center processor may sound simple, but the interconnects between CPU cores, memory hierarchy, and I/O subsystems provide critical pathways among these subsystems necessitating thoughtful architecture. These interconnects are like a well-designed highway with the right number of lanes and ramps at critical places to allow traffic to flow smoothly…”
Intel Ring Bus Architecture
In previous-generation, many-core Xeon processors, Intel used a ring interconnect architecture to link the CPU cores, cache, memory, and various I/O controllers on the chips. As the number of cores in the processors, along with memory and I/O bandwidth, has increased, however, it has become increasingly difficult to achieve peak efficiency with a ring interconnect, because a ring architecture can require data to travel across long stretches (relatively speaking) of the ring to reach its intended destination. The new mesh architecture addresses this limitation by interconnecting on-chip elements more pervasively, which increases the number of pathways and improves efficiency.
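To put rough numbers on the "long stretches" problem, here is a minimal Python sketch that compares the average number of hops between two stops on a single bidirectional ring and on a square mesh with the same number of stops. The 36-stop ring and 6x6 grid are hypothetical sizes picked for illustration, not Intel's actual die layout, and the function names are made up for this example. It counts hops only and ignores clock speeds, link widths, and routing rules.

```python
from itertools import product


def ring_avg_hops(n):
    """Average shortest-path hops between distinct stops on a bidirectional ring."""
    # From any stop, a target d positions away can be reached in min(d, n - d) hops.
    total = sum(min(d, n - d) for d in range(1, n))
    return total / (n - 1)


def mesh_avg_hops(rows, cols):
    """Average Manhattan distance between distinct nodes of a rows x cols mesh."""
    nodes = list(product(range(rows), range(cols)))
    total = 0
    pairs = 0
    for r1, c1 in nodes:
        for r2, c2 in nodes:
            if (r1, c1) != (r2, c2):
                total += abs(r1 - r2) + abs(c1 - c2)
                pairs += 1
    return total / pairs


if __name__ == "__main__":
    # Hypothetical sizes: 36 stops on one ring vs. a 6x6 grid of the same 36 stops.
    print(f"ring of 36 stops: {ring_avg_hops(36):.2f} hops on average")
    print(f"6x6 mesh:         {mesh_avg_hops(6, 6):.2f} hops on average")
```

With these assumed sizes, the average trip is a little over nine hops on the ring and four on the mesh, and the gap widens as more stops are added.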
Intel Mesh Interconnect Architecture
Above is a visual representation of the new mesh architecture. In the diagram, processor cores, on-chip cache banks, memory controllers, and I/O controllers are organized in rows and columns. Wires and switches connect the various on-chip elements and provide more direct paths than the prior ring interconnect architecture. The nature of a mesh also allows for many more pathways to be implemented, which further minimizes bottlenecks, and also allows Intel to operate the mesh at a lower frequency and voltage, yet still deliver high bandwidth and low latency.
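The "many more pathways" point can be illustrated by counting links and equally short routes in each topology. The Python sketch below again assumes a hypothetical 36-stop ring and 6x6 mesh, with illustrative function names. It counts the routes the topology makes available, not the routes a real router would choose; actual designs often restrict routing (for example, to dimension-ordered paths), and Intel's routing policy is not described in the post.

```python
from math import comb


def ring_links(n):
    """A ring of n stops has n links between neighbors."""
    return n


def mesh_links(rows, cols):
    """A rows x cols mesh has horizontal plus vertical neighbor links."""
    return rows * (cols - 1) + cols * (rows - 1)


def ring_shortest_paths(n, a, b):
    """Number of shortest routes between stops a and b on a bidirectional ring."""
    d = abs(a - b) % n
    # Two equally short routes exist only when the stops are exactly opposite.
    return 2 if d != 0 and 2 * d == n else 1


def mesh_shortest_paths(src, dst):
    """Number of shortest (Manhattan) routes between two mesh nodes."""
    dr = abs(src[0] - dst[0])
    dc = abs(src[1] - dst[1])
    # Choose which of the dr + dc steps are the vertical ones.
    return comb(dr + dc, dr)


if __name__ == "__main__":
    print("links:", ring_links(36), "on the ring vs.", mesh_links(6, 6), "on the mesh")
    print("corner-to-corner shortest routes on the mesh:",
          mesh_shortest_paths((0, 0), (5, 5)))
    print("shortest routes between opposite ring stops:",
          ring_shortest_paths(36, 0, 18))
```

Under these assumptions the mesh has 60 links to the ring's 36, and there are 252 equally short routes between opposite corners of the grid, compared with at most two between any pair of ring stops. More links carrying traffic in parallel is one plausible reason each link can run slower while aggregate bandwidth stays high.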
Kumar also says in the post, “The scalable and low-latency on-chip interconnect framework is also critical for the shared last level cache architecture. This large shared cache is valuable for complex multi-threaded server applications, such as databases, complex physical simulations, high-throughput networking applications, and for hosting multiple virtual machines. Negligible latency differences in accessing different cache banks allows software to treat the distributed cache banks as one large unified last level cache.”
Along with the new mesh, which enhances the connectivity and topology of the on-chip interconnect, Intel is implementing a modular architecture with its Xeon Scalable processors for resources that access on-chip cache, memory, I/O, and remote CPUs. These resources are distributed throughout the chip so “hot-spots” in areas that could be bottlenecked are minimized. Intel claims the higher level of modularity with the new architecture allows available resources to better scale as the number of processor cores increases.