NVIDIA Pascal GPU Architecture Preview: Inside The GP100
Chip on Wafer on Substrate (CoWoS) HBM2 On GP100
NVLink is a serial interconnect that employs differential signaling with embedded clocks, a signaling technique that has been used in high-speed links for years. It also enables unified memory architectures and cache coherency. NVLink is similar to PCI Express in terms of command set and programming model, but it offers significantly more bandwidth with better bandwidth utilization; NVIDIA claims up to 94% bandwidth efficiency with NVLink.
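From a developer's perspective, NVLink is exercised through the same CUDA peer-to-peer calls used over PCI Express. The sketch below is a minimal, hypothetical example (device IDs and buffer size are illustrative): it checks for peer access between two GPUs and performs a direct device-to-device copy, which travels over NVLink when the GPUs are connected by it and falls back to PCIe otherwise.

```cpp
// Minimal sketch: direct GPU-to-GPU copy via CUDA's peer-to-peer API.
// On NVLink-connected GPUs the copy uses NVLink; device IDs are illustrative.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);   // can GPU 0 reach GPU 1 directly?
    if (!canAccess) {
        printf("No peer access between GPU 0 and GPU 1.\n");
        return 0;
    }

    const size_t bytes = 256u << 20;             // 256 MB test buffer
    void *src = nullptr, *dst = nullptr;

    cudaSetDevice(0);
    cudaMalloc(&src, bytes);
    cudaDeviceEnablePeerAccess(1, 0);            // let GPU 0 address GPU 1's memory

    cudaSetDevice(1);
    cudaMalloc(&dst, bytes);

    // Device-to-device copy with no staging through host memory.
    cudaMemcpyPeer(dst, 1, src, 0, bytes);
    cudaDeviceSynchronize();

    cudaFree(dst);
    cudaSetDevice(0);
    cudaFree(src);
    printf("Peer copy complete.\n");
    return 0;
}
```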
NVIDIA’s Pascal also features a unified memory architecture. The GPU has a page migration engine with support for virtual memory demand paging. It uses 49-bit virtual addresses, enough to cover the 48-bit CPU virtual address space in addition to all GPU memory. There is also support for GPU page faulting, and NVIDIA says the GPU can handle “thousands of simultaneous page faults”. Finally, the GPU supports page sizes up to 2MB, which improves TLB (Translation Look-Aside Buffer) coverage of GPU memory.
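In CUDA terms, this is the machinery behind managed memory. Here's a minimal sketch, assuming a Pascal-class GPU: a single cudaMallocManaged allocation is written by the CPU and then touched by a kernel, and the GPU page-faults and migrates pages on demand rather than requiring an up-front copy. The kernel and array size are purely illustrative.

```cpp
// Minimal sketch of Unified Memory with demand paging.
// On Pascal, the GPU's first touch of CPU-resident pages page-faults
// and the driver migrates those pages on demand.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;    // first GPU touch may page-fault and migrate
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;

    // One allocation visible to both CPU and GPU in the same virtual address space.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;    // populate from the CPU

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);
    cudaDeviceSynchronize();                       // wait for the GPU, then read on the CPU

    printf("data[0] = %f\n", data[0]);             // expect 2.0
    cudaFree(data);
    return 0;
}
```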
While Kepler and Maxwell also supported unified memory, they were limited to a memory space no larger than the available GPU memory. With Pascal, that limitation is gone: the GP100 can allocate more memory than physically exists on the GPU, up to the total amount of available system memory.
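A hedged sketch of what that oversubscription looks like in practice is below (sizes and the fill kernel are illustrative): the program queries the GPU's physical memory, deliberately allocates a larger managed buffer, and touches it from the GPU, relying on Pascal's page migration engine to shuttle pages in and out on demand.

```cpp
// Minimal sketch of Unified Memory oversubscription on a Pascal-class GPU:
// allocate a managed buffer larger than physical GPU memory and touch it
// from the GPU in chunks. Pages migrate on demand; pre-Pascal GPUs cannot
// oversubscribe this way. Sizes are illustrative.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void fill(char *buf, size_t n, char value) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = value;
}

int main() {
    size_t freeBytes = 0, totalBytes = 0;
    cudaMemGetInfo(&freeBytes, &totalBytes);

    // Ask for roughly 1.5x the GPU's physical memory.
    size_t bytes = totalBytes + totalBytes / 2;
    char *buf = nullptr;
    if (cudaMallocManaged(&buf, bytes) != cudaSuccess) {
        printf("Allocation failed (requires Pascal-class unified memory).\n");
        return 1;
    }

    // Touch the whole buffer from the GPU in passes; pages migrate as needed.
    const size_t chunk = 256u << 20;   // 256 MB per pass
    for (size_t off = 0; off < bytes; off += chunk) {
        size_t n = (bytes - off < chunk) ? (bytes - off) : chunk;
        fill<<<(unsigned)((n + 255) / 256), 256>>>(buf + off, n, 1);
    }
    cudaDeviceSynchronize();

    printf("Touched %.1f GB of managed memory on a %.1f GB GPU.\n",
           bytes / 1e9, totalBytes / 1e9);
    cudaFree(buf);
    return 0;
}
```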
We should learn more about Pascal and additional GPUs that use the architecture in the near future. For now, though, it appears that NVIDIA has a powerful base GPU architecture on its hands for the HPC market. Whether and when the GP100 trickles down into a consumer product remains to be seen, but if history is any indicator, it will arrive in some form and fill the space currently occupied by the Titan X.
Stay tuned to HotHardware in the months ahead as we learn more about the GeForce side of Pascal.