AMD's Virgo platform consists of a desktop APU based on Trinity and a motherboard built around the A85X FCH (Fusion Controller Hub). As we've noted, AMD has either tweaked or revamped many of the primary functional blocks in the Trinity architecture relative to the previous-generation Llano. From the new Piledriver-based CPU cores to the Northern Islands-based GPU core, Trinity offers performance improvements and additional features in a number of key areas.
Block Diagram: Trinity's VLIW4-based GPU Engine
The GPU engine on board Trinity is based on AMD's previous-generation Northern Islands family of GPU cores. You can loosely think of these as Radeon HD 6400 class GPUs, though the company is re-branding them as integrated Radeon HD 7000 family cores. The GPU has an improved hardware tessellator over the previous-gen Llano APU. In addition, its VLIW4 design offers a balanced stream processor cluster, with each of the four SPUs offering equal capability and simpler scheduling, versus the VLIW5 design used in older Radeon HD series GPUs. The number of active Radeon cores in the on-die GPU will vary from APU to APU, but all desktop Trinity variants sport the same architecture.
As part of the GPU block, AMD has also incorporated an updated version of their HD Media Accelerator with enhanced UVD (Unified Video Decoder) and AVC (Accelerated Video Converter) blocks. The UVD block offers hardware offload for Blu-ray 3D, MPEG-4/DivX, and Picture-in-Picture with dual HD streams. In addition, within the HD Media Accelerator engine, A-series APUs also offer AMD Quick Stream technology, which prioritizes video stream packet data for uninterrupted video streaming. And of course, many ISVs will be offering optimized versions of their applications to take advantage of AMD's A-series video acceleration and conversion technologies.
The other major design advancement is AMD's new Piledriver compute cores. Piledriver is an optimization of AMD's Bulldozer core that shares the same high-level architecture as Bulldozer, but with a number of major enhancements. The same shared fetch, decode, floating point and L2 cache resources per pair of integer units are present in desktop Trinity. However, AMD has improved branch prediction, L2 efficiency and hardware prefetch. Piledriver cores also have a larger L1 TLB, or Translation Look-aside Buffer. All told, AMD is claiming a combined performance increase of ~14% on the desktop versus their Bulldozer architecture, along with a 50% increase in GPU performance, clock-for-clock. Factor in Turbo Core 3.0 speed boosts, however, and AMD is claiming larger aggregate performance gains in both desktop and mobile platforms over previous-gen A-series APUs.
Trinity also incorporates an updated DDR3 memory controller that now supports new low-power 1.25V DDR3 memory, incorporated within a new UNB, or Unified Northbridge, design. It's interesting to note that PCI Express now replaces HyperTransport for serial connectivity to downstream I/O devices in Trinity, which makes sense given the mass adoption of PCIe serial interfaces across platform and chip-level interconnects. The Radeon Memory Bus, or RMB, offers a full-bandwidth path for the GPU engine to system memory that bypasses cache coherency mechanisms for lower-latency access.
Finally, AMD's Turbo Core 3.0 technology offers more aggressive clock gating and overclocking, with up to a 20% increase in top-end GPU clock speed, 19% in single-core CPU clock speed and an 8% boost in multithreaded CPU performance. Specifically, the A10-5800K and A8-5600K that we'll be running through their paces next have base clocks of 3.8GHz and 3.6GHz, respectively, and scale up to 4.2GHz and 3.9GHz in single-threaded applications, with dynamic scaling as needed in multithreaded workloads.
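To put those clock speeds in perspective, here is a quick sketch that works out how much headroom Turbo Core gives each chip over its own base clock. It uses only the base and maximum frequencies quoted above; the names `CLOCKS_GHZ`, `turbo_uplift_pct` and `uplift_table` are ours, made up for illustration. Note that this measures each part's base-to-turbo range, which is not necessarily the same baseline AMD uses for its own percentage claims.

```python
# Base and maximum Turbo Core clocks (GHz) as quoted in the text above.
# The dictionary layout and function names are illustrative assumptions.
CLOCKS_GHZ = {
    "A10-5800K": (3.8, 4.2),
    "A8-5600K": (3.6, 3.9),
}


def turbo_uplift_pct(base_ghz, turbo_ghz):
    """Return the percentage increase from base clock to turbo clock."""
    return (turbo_ghz - base_ghz) / base_ghz * 100.0


def uplift_table(clocks=CLOCKS_GHZ):
    """Map each chip name to its base-to-turbo uplift, rounded to 0.1%."""
    return {
        name: round(turbo_uplift_pct(base, turbo), 1)
        for name, (base, turbo) in clocks.items()
    }


if __name__ == "__main__":
    for name, pct in uplift_table().items():
        print(f"{name}: +{pct}% over base clock")
```

By this math, the A10-5800K has roughly 10.5% of single-threaded clock headroom over its base frequency, and the A8-5600K has about 8.3%.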