ATI Radeon HD 4850 and 4870: RV770 Has Arrived
RV770 Architecture and Features
As we've already explained, the initial line-up of Radeon HD 4800 series cards will consist of the single-slot Radeon HD 4850 and the dual-slot Radeon HD 4870.
AMD is touting the Radeon HD 4850 as the first single-GPU solution to offer 1 TFLOPS of compute power, thanks to its 625MHz RV770 GPU. The card features GDDR3 memory and has a maximum power draw of about 110W. As you'd probably expect, the Radeon HD 4870 is markedly more powerful. Although based on the same GPU, the 4870 is clocked higher at 750MHz, and thus offers 1.2 TFLOPS of compute power. The Radeon HD 4870 also makes use of newer GDDR5 memory technology and has a higher maximum power draw of 160W. More on the cards themselves a little later.
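For readers who want to see where the 1 and 1.2 TFLOPS figures come from, here is a rough sketch of the arithmetic. It uses the RV770's 800 stream processors (covered in the next paragraph) and assumes each one can issue a single multiply-add per clock, counted as two floating-point operations. That per-clock figure and the function name `peak_tflops` are our own assumptions for illustration, and the result is a theoretical peak, not measured performance.

```python
# Theoretical peak throughput for the two launch cards.
STREAM_PROCESSORS = 800       # RV770 stream processor count
FLOPS_PER_SP_PER_CLOCK = 2    # assumption: one multiply-add = 2 FLOPs


def peak_tflops(core_clock_mhz: float) -> float:
    """Return theoretical peak TFLOPS for a core clock given in MHz."""
    flops = STREAM_PROCESSORS * FLOPS_PER_SP_PER_CLOCK * core_clock_mhz * 1e6
    return flops / 1e12


if __name__ == "__main__":
    for name, clock_mhz in (("Radeon HD 4850", 625), ("Radeon HD 4870", 750)):
        print(f"{name}: {peak_tflops(clock_mhz):.2f} TFLOPS")
```

Under those assumptions, the 625MHz and 750MHz clocks work out to 1.0 and 1.2 TFLOPS respectively, which lines up with AMD's claims.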
Low-level specs aren't what make the Radeon HD 4800 series cards stand out; it's the RV770 GPU that's really interesting. It turns out that AMD was able to crank the SP count up from 320 on the older RV670 to a beefy 800 on the RV770. AA and Z/Stencil performance are enhanced as well, and the number of texture units has been increased from 16 to 40. What's somewhat surprising about all of these changes, though, is that AMD was able to make them with "only" a 44% increase in transistors.
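To put those numbers in perspective, the quick sketch below compares how much the unit counts grew against how much the transistor budget grew. It uses only the figures quoted above; the helper name `growth_per_transistor` is hypothetical, and the comparison is deliberately crude, since the transistor budget also covers the ROPs, caches, memory controller, and everything else on the die.

```python
# RV670 -> RV770 figures as quoted in the text.
SP_GROWTH = 800 / 320          # stream processors: 320 -> 800
TEXTURE_UNIT_GROWTH = 40 / 16  # texture units: 16 -> 40
TRANSISTOR_GROWTH = 1.44       # "only" a 44% increase in transistors


def growth_per_transistor(unit_growth: float, transistor_growth: float) -> float:
    """Crude ratio of unit-count growth to transistor-count growth."""
    return unit_growth / transistor_growth


if __name__ == "__main__":
    print(f"Stream processors: {SP_GROWTH:.1f}x")
    print(f"Texture units:     {TEXTURE_UNIT_GROWTH:.1f}x")
    ratio = growth_per_transistor(SP_GROWTH, TRANSISTOR_GROWTH)
    print(f"Unit growth relative to transistor growth: {ratio:.2f}x")
```

Both the SP and texture unit counts grew 2.5 times, so by this rough measure AMD fit about 1.7 times as many units per transistor as it did on the RV670.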
AMD was able to do this by redesigning virtually all of the functional blocks within the GPU. The 800 stream processing units are grouped in a new SIMD core layout, and the texture units, ROPs, and cache have been restructured to minimize transistor count, while also increasing performance. We should also point out that the ring-bus memory controller introduced with the X1K series has been replaced with a new memory controller that can make use of GDDR5 memory.
With the RV770, AMD claims that the SPs in the GPU offer 40% more performance per square millimeter, and that more aggressive clock gating offers improved performance per watt. Likewise, the newly streamlined RV770 texture units reportedly offer 70% more performance per square millimeter, with double the texture cache bandwidth and large increases in 32- and 64-bit filter rates.