Intel is claiming a number of benefits for its Broadwell-Y design at 14nm, most of which relate to reduced power consumption. There are, however, performance gains to be had as well.
In this slide, Intel details the specific power and silicon area savings brought forth by Broadwell-Y at 14nm. From operating voltage to capacitance and leakage current, Broadwell-Y delivers anywhere from 10 to 20 percent lower power consumption, depending on which aspect of the design you're looking at.
Other areas of optimization for Broadwell include Intel's second generation FIVR design, which offers a new dual-LVR (linear voltage regulator), as well as bypass modes and non-linear droop control. In a nutshell, these optimizations allow the FIVR block to consume less power, depending on chip demand. Intel has also developed new 3DL inductors, which have been moved off the CPU package substrate and now sit recessed underneath the die on a tiny module. This allows for a 30-percent Z-height reduction for the entire module and a thinner package overall for tablet designs and the like.
Intel has also improved its clock gating and power control granularity with Broadwell, introducing DCC, or Duty Cycle Control, which builds upon the company's concept of hurry-up and get idle (or HUGI). Essentially, Intel has designed a method to hit a net-effective clock rate by running at a higher frequency for a period of time and then shutting off the clocks (zero leakage), delivering the same performance at lower total power consumption. By way of example, rather than run at 50MHz for 4 seconds (50 * 4 = 200), the chip can run at 100MHz for one second, shut off completely for the next second, and then repeat the process over the same 4 seconds (100 * 2 = 200). The user experiences the same performance, at a net power savings. That's the general concept anyway.
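To make the arithmetic behind that example concrete, here is a minimal Python sketch of the duty-cycle idea, using the 50 * 4 = 200 and 100 * 2 = 200 figures from above. The function names are hypothetical and this is only a counting exercise, not Intel's implementation; it says nothing about voltage or leakage, which is where the actual power savings would come from.

```python
def net_cycles(freq_mhz, on_seconds, off_seconds, total_seconds):
    """Millions of clock cycles delivered over total_seconds when the clock
    runs at freq_mhz for on_seconds, is gated for off_seconds, and repeats."""
    period = on_seconds + off_seconds
    full_periods, remainder = divmod(total_seconds, period)
    # Count only the seconds in which the clock is actually running.
    active_seconds = full_periods * on_seconds + min(remainder, on_seconds)
    return freq_mhz * active_seconds


def effective_mhz(freq_mhz, on_seconds, off_seconds):
    """Net-effective clock rate for a given on/off duty cycle."""
    return freq_mhz * on_seconds / (on_seconds + off_seconds)


# Steady 50MHz for 4 seconds (no gating).
steady = net_cycles(50, on_seconds=4, off_seconds=0, total_seconds=4)

# 100MHz for 1 second on, 1 second off, repeated over the same 4 seconds.
duty_cycled = net_cycles(100, on_seconds=1, off_seconds=1, total_seconds=4)

print(steady, duty_cycled)          # 200 200
print(effective_mhz(100, 1, 1))     # 50.0
```

Both schedules deliver the same number of cycles in the same window; the duty-cycled one simply spends half of that window with the clocks off.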
Beyond power savings and optimization techniques, let's look at the performance upside that Intel is claiming for Broadwell in general.
- >5% IPC over Haswell
- Larger out-of-order scheduler, faster store-to-load forwarding
- Larger L2 TLB (1K to 1.5K entries), new dedicated 1GB Page L2 TLB (16 entries)
- 2nd TLB page miss handler for parallel page walks
- Faster floating point multiplier (5 to 3 cycles), Radix-1,024 divider, faster vector gather
- Improved address prediction for branches and returns
- Targeted cryptography acceleration instruction improvements
- Faster virtualization round-trips
- Power efficiency: performance features designed at ~2:1 performance:power ratio
- Power gating and design optimization increase efficiency at every operating point
The bullet points above tally most of the CPU architecture and performance improvements Intel was able to achieve with what it's calling the "Broadwell Converged Core." The net gain may seem smallish, at just over 5 percent in overall IPC. But consider the power consumption reductions and the fact that, with those reductions, a full dual-core Broadwell-Y chip will be able to reside in a very thin tablet form factor (a product segment where only Intel Atom chips have traditionally played), and it adds up to a very impressive advancement to be sure.
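For a rough sense of what the TLB changes in the list above mean, the short Python sketch below computes "TLB reach," the amount of memory that can be mapped without a TLB miss. It assumes that "1K" and "1.5K" mean 1,024 and 1,536 entries and that the L2 TLB entries are mapping standard 4KB x86 pages; both are assumptions for illustration, not figures from Intel's slide.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_reach(entries, page_size_bytes):
    """Bytes of address space covered when every TLB entry maps one page."""
    return entries * page_size_bytes


# Assumption: 1K = 1,024 entries, 1.5K = 1,536 entries, 4KB pages.
haswell_l2 = tlb_reach(1024, 4 * KIB)
broadwell_l2 = tlb_reach(1536, 4 * KIB)

# The new dedicated 1GB-page L2 TLB has 16 entries.
broadwell_1g = tlb_reach(16, 1 * GIB)

print(haswell_l2 // MIB, "MB")     # 4 MB
print(broadwell_l2 // MIB, "MB")   # 6 MB
print(broadwell_1g // GIB, "GB")   # 16 GB
```

Under those assumptions, the larger L2 TLB covers 50 percent more memory with 4KB pages, and the dedicated 1GB-page structure can map 16GB on its own.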