The mid-range graphics card market has seen a lot of action as of late, with an infusion of new products and the discontinuation of popular previous-gen products. Last week it was AMD’s turn to announce a new card—the $150 Radeon R7 265—and with its arrival also came a price cut on the Radeon R7 260X, which can now be had for as low as $129, though the MSRP has been reduced to $119.
Today is NVIDIA’s turn to introduce a new mid-range graphics card. But unlike AMD’s re-brand and soft-launch, NVIDIA is at the ready with a brand-new GPU architecture and cards should be hitting store shelves immediately. The new GeForce GTX 750 Ti and GeForce GTX 750 are the first graphics cards in NVIDIA’s line-up to feature the company’s next-gen Maxwell GPU architecture, which is an evolution of Kepler that’s designed to deliver higher performance and power efficiency.
Take a look at the main features and specifications for NVIDIA’s new cards below and then we’ll dive in a little deeper, inspect a couple of retail-ready models, and dig into their performance and power characteristics...
|Specification||GeForce GTX 750||GeForce GTX 750 Ti|
|Graphics Processing Clusters||1||1|
|CUDA Cores (single precision)||512||640|
|Base Clock||1020 MHz||1020 MHz|
|Boost Clock||1085 MHz||1085 MHz|
|Memory Clock (Data rate)||5000 MHz||5400 MHz|
|L2 Cache Size||2048 KB||2048 KB|
|Total Video Memory||1024 MB GDDR5||2048 MB GDDR5|
|Total Memory Bandwidth||80 GB/s||86.4 GB/s|
|Texture Filtering Rate (Bilinear)||32.6 GigaTexels/sec||40.8 GigaTexels/sec|
|Fabrication Process||28 nm||28 nm|
|Transistor Count||1.87 Billion||1.87 Billion|
|Display Outputs||2 x Dual-Link DVI||2 x Dual-Link DVI|
|Form Factor||Dual Slot||Dual Slot|
|Recommended Power Supply||300 Watts||300 Watts|
|Thermal Design Power (TDP)||55 Watts||60 Watts|
|Price||$119 MSRP|
We’ll discuss NVIDIA’s reference specifications in more depth on the next page, but we want to point out a couple of things in the table above. First, take a look at those TDP ratings. The GeForce GTX 750 Ti is rated for only 60W, while its little brother comes in at 55W. That means reference cards (and likely some factory-overclocked models) do not require any supplemental power. The cards get all the juice they need from the graphics slot alone, which makes them a viable upgrade for many white-box systems whose power supplies aren't beefy enough to handle more power-hungry discrete GPUs. Also note that the GPU (codenamed GM107) is manufactured using TSMC’s 28nm process node. Previous-gen Kepler-based cards were also produced at 28nm, so while Maxwell is a new architecture, these new cards do not leverage a new manufacturing process. It’s likely that future Maxwell-based GPUs will use a more advanced manufacturing process, however.
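To put the slot-power point in numbers, here is a rough sketch in Python. The 75-watt figure is our assumption for what a PCI Express x16 graphics slot can deliver and does not come from NVIDIA's spec table. The function names are ours as well, and real-world board power can differ from the rated TDP.

```python
# Assumption (not from NVIDIA's spec table): a PCI Express x16 graphics
# slot can deliver up to 75 W to the card.
PCIE_X16_SLOT_LIMIT_W = 75

# TDP ratings taken from the specification table above.
TDP_W = {"GeForce GTX 750": 55, "GeForce GTX 750 Ti": 60}


def slot_headroom_w(tdp_w, slot_limit_w=PCIE_X16_SLOT_LIMIT_W):
    """Watts left in the slot budget if the card draws its full rated TDP."""
    return slot_limit_w - tdp_w


def needs_supplemental_power(tdp_w, slot_limit_w=PCIE_X16_SLOT_LIMIT_W):
    """True if the rated TDP exceeds what the slot alone is assumed to supply."""
    return tdp_w > slot_limit_w


for card, tdp in TDP_W.items():
    print(f"{card}: {tdp} W TDP, {slot_headroom_w(tdp)} W of slot headroom, "
          f"supplemental power needed: {needs_supplemental_power(tdp)}")
```

Under that assumption both cards leave some margin in the slot budget, which is consistent with the idea that mildly overclocked models could also ship without a power connector.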
NVIDIA was able to keep power so low on the GeForce GTX 750 Ti and GTX 750 by focusing on power efficiency throughout the design of Maxwell. The company took what it learned from Kepler and its Tegra SoCs and put much of that knowledge into Maxwell. Maxwell is designed to boost efficiency through better GPU utilization, and ultimately to improve performance per watt and per unit of die area.
Maxwell’s Streaming Multiprocessors, or SMs, are somewhat different from Kepler’s. With Maxwell, NVIDIA has reworked the control logic partitioning for better workload balancing, and has added finer-grained clock-gating and better compiler-based scheduling. Maxwell can also issue more instructions per clock cycle. Together, these changes allow the Maxwell SM (also called an SMM in some NVIDIA docs) to exceed Kepler’s SMX in terms of efficiency. NVIDIA is claiming that Maxwell’s new SM architecture can deliver 35% more performance per CUDA core than Kepler on shader-limited workloads, with up to double the performance per watt, despite using the same 28nm manufacturing process.
Due to the architectural and SM design differences, NVIDIA was able to increase the number of SMs in GM107 to 5, versus 2 in GK107, and also increase the L2 cache size to 2MB, versus 256KB, with only a 25% increase in die area. Each SM in the GM107 includes a Polymorph Engine and Texture Units, while each GPC includes a Raster Engine. The ROPs are aligned with the L2 cache slices and Memory Controller partitions.
The GM107 GPU contains one GPC, five Maxwell Streaming Multiprocessors (SMM), and two 64-bit memory controller partitions (128-bit total). Each SM is now partitioned into four separate processing blocks, each with its own instruction buffer, scheduler and 32 CUDA cores. With Kepler, the control logic had to route and schedule traffic to 192 CUDA cores. This partitioning simplifies the design and scheduling logic, saving area and power, and reduces computation latency. The compute L1 cache function has now also been combined with the texture cache function, and shared memory is a separate unit shared across all four blocks.
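The layout described above lines up with the numbers in the spec table, and a few lines of Python make the arithmetic explicit. This is only a back-of-the-envelope sketch: the function names are ours, and the idea that the GTX 750 runs with one SMM disabled is our inference from its 512-core count, not something NVIDIA states here.

```python
CORES_PER_BLOCK = 32       # CUDA cores per processing block
BLOCKS_PER_SMM = 4         # processing blocks per Maxwell SM
GM107_SMM_COUNT = 5        # SMMs in a full GM107
MEM_CONTROLLERS = 2        # memory controller partitions
BITS_PER_CONTROLLER = 64   # width of each partition, in bits


def cores_per_smm():
    """CUDA cores in one Maxwell SM."""
    return BLOCKS_PER_SMM * CORES_PER_BLOCK


def total_cores(smm_count):
    """CUDA cores across the given number of SMMs."""
    return smm_count * cores_per_smm()


def bus_width_bits():
    """Total memory bus width in bits."""
    return MEM_CONTROLLERS * BITS_PER_CONTROLLER


def bandwidth_gb_s(data_rate_mhz):
    """Peak bandwidth: bytes per transfer times millions of transfers per second."""
    return bus_width_bits() / 8 * data_rate_mhz / 1000


print("Cores per SMM:", cores_per_smm())
print("GTX 750 Ti cores:", total_cores(GM107_SMM_COUNT))
# Inference, not a stated spec: 512 cores would imply 4 active SMMs.
print("SMMs implied by GTX 750's 512 cores:", 512 // cores_per_smm())
print("GTX 750 Ti bandwidth (GB/s):", round(bandwidth_gb_s(5400), 1))
print("GTX 750 bandwidth (GB/s):", round(bandwidth_gb_s(5000), 1))
```

The same 128-bit bus serves both cards, so the difference in their bandwidth figures comes entirely from the memory data rate.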
Overall, with this new design, each Maxwell SM is significantly smaller than a Kepler SM while delivering about 90% of its performance. The smaller area per SM, in turn, allows NVIDIA to fit more SMs on each GPU. Comparing GK107 to GM107, for example, GM107 has five SMs versus two, along with 25% more peak texture performance and roughly 1.7 times as many CUDA cores.
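NVIDIA's figures hang together if you run them against each other. The sketch below uses only numbers quoted in this article (192 cores per Kepler SMX, four blocks of 32 cores per Maxwell SM, and the claimed 35% per-core uplift). The function names are ours, and the aggregate figure is a crude estimate that ignores clock speeds, memory bandwidth, and anything that isn't shader-limited.

```python
KEPLER_CORES_PER_SMX = 192
MAXWELL_CORES_PER_SMM = 4 * 32   # four processing blocks of 32 cores each
PER_CORE_UPLIFT = 1.35           # NVIDIA's claim for shader-limited workloads
GK107_SM_COUNT = 2
GM107_SM_COUNT = 5


def smm_vs_smx_performance():
    """Estimated performance of one Maxwell SM relative to one Kepler SMX."""
    return MAXWELL_CORES_PER_SMM * PER_CORE_UPLIFT / KEPLER_CORES_PER_SMX


def cuda_core_ratio():
    """GM107 CUDA core count relative to GK107."""
    gm107 = GM107_SM_COUNT * MAXWELL_CORES_PER_SMM
    gk107 = GK107_SM_COUNT * KEPLER_CORES_PER_SMX
    return gm107 / gk107


def aggregate_shader_ratio():
    """Rough GM107 vs. GK107 shader throughput at equal clocks."""
    return GM107_SM_COUNT * smm_vs_smx_performance() / GK107_SM_COUNT


print(f"One SMM vs. one SMX: {smm_vs_smx_performance():.2f}")
print(f"CUDA core ratio:     {cuda_core_ratio():.2f}")
print(f"Aggregate estimate:  {aggregate_shader_ratio():.2f}")
```

In other words, 128 cores with a 35% per-core uplift works out to the "about 90%" per-SM figure, and five of those SMs against two Kepler SMXs is where the bulk of GM107's claimed advantage would come from.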
Maxwell features an improved NVENC block that provides faster encode (6-8X real-time for H.264 vs. 4X real-time for Kepler) and 8-10X faster decode as well. Thanks to a new local decoder cache, Maxwell also has higher memory efficiency per stream for video decoding, which results in lower power for video decode operations. The updates to the NVENC block make the GM107 a perfect companion to NVIDIA's ShadowPlay feature. We should also point out that Maxwell and the GM107 work for streaming to Shield, and they support G-SYNC too, provided the card is outfitted with the requisite DisplayPort output. The GeForce GTX 750 Ti and GTX 750, however, like the GeForce GTX 650 Ti, do not support SLI.