During our Tech Briefing
with ATi in New York last month, ATi provided a foil set
comparing the Radeon 9800 Pro to the GeForce FX. We've
snipped a couple of the salient benefits slides for you
here.
This chart shows you a few of
the raw specifications of the 9800 Pro versus the GeForce
FX. As you can see, ATi is claiming full OpenGL 2.0
support, which means the VPU needs to support shader
programs of unlimited instruction length. NVIDIA's claim
to fame with the GFFX has always been "DX9+", with support
for shader programs up to 1024 instructions long. The
Radeon 9800 Pro can go well beyond that with its new
"F-Buffer" technology, which we'll cover shortly. Also, Peak
Memory Bandwidth (setting aside the Marketing games played
with compression ratios) is now up to 21.8GB/sec,
significantly higher than that of its rival's flagship GPU.
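For readers who want to sanity check that 21.8GB/sec figure, here is a quick back-of-the-envelope sketch. Note that the clock speeds and bus widths below (340MHz DDR on a 256-bit bus for the 9800 Pro, 500MHz DDR-II on a 128-bit bus for the GeForce FX 5800 Ultra) are commonly published specifications, not numbers taken from ATi's chart, so treat them as assumptions. The function name is ours as well.

```python
def peak_bandwidth_gb_s(memory_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/sec (1 GB = 10**9 bytes).

    transfers_per_clock defaults to 2 because DDR memory moves data on
    both the rising and falling edge of each clock.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1_000_000 * transfers_per_clock
    return transfers_per_second * bytes_per_transfer / 1_000_000_000


# Assumed specs (not from ATi's slides): clock in MHz, bus width in bits.
radeon_9800_pro = peak_bandwidth_gb_s(340, 256)
geforce_fx_5800_ultra = peak_bandwidth_gb_s(500, 128)

print(f"Radeon 9800 Pro:       {radeon_9800_pro:.1f} GB/sec")
print(f"GeForce FX 5800 Ultra: {geforce_fx_5800_ultra:.1f} GB/sec")
```

Under those assumptions, the wider 256-bit bus more than makes up for the lower memory clock, which is where the gap on the chart comes from.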
Shots from the New York ATi Tech Briefing:
"New enhancements for the R350" and "More than just a speed bump"
Next,
we'll dig into some detail on the major differences between
the R300 (or Radeon 9700 Pro) and the R350 / Radeon 9800
Pro. While the base VPU hasn't changed much, ATi has
taken steps to further optimize their high end core.
ATi's new
R350 VPU incorporates a few new tweaks and optimizations
beyond a simple Engine Clock speed boost. The
R350 picks up where the R300 left off in terms of DX9 and
OpenGL 2.0 support, as well as ATi's Image Quality
enhancement techniques, called "SmoothVision", and its Memory
Bandwidth saving compression algorithms in the Z-cache, now
known as "Hyper Z III+".
Smart Shader 2.1
The "F-Buffer" for Limitless Shader Instructions -
The "F-Buffer" is a fairly simple
concept really, and it's exactly what allows the Radeon 9800
Pro, and other cards based on the R350 VPU, to offer
shader programs of unlimited length to the Game Developer.
"F-Buffer" stands for Fragment Stream FIFO (First In,
First Out) Buffer, and it has been
implemented on chip. Specifically, it
provides temporary storage for pixels that need to be
processed over multiple passes of the shader engines, rather
than writing them out to the frame buffer between passes.
Pixels that require only a single pass are written straight
to the frame buffer. This saves memory bandwidth and
allows the VPU to handle processing workloads more
efficiently.
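To make the idea a bit more concrete, here is a minimal software sketch of the FIFO behavior described above. This is purely illustrative and is not ATi's hardware design: the function name, the (pixel_id, passes_needed) tuples, and the use of a pass count as a stand-in for the final color are all our own assumptions.

```python
from collections import deque


def render_with_f_buffer(fragments):
    """Simulate multi-pass shading with a FIFO of in-flight fragments.

    fragments: list of (pixel_id, passes_needed) in rasterization order.
    Returns (frame_buffer, fifo_writes), where frame_buffer maps each
    pixel_id to the number of shader passes it received.
    """
    frame_buffer = {}
    fifo = deque()      # hypothetical stand-in for the F-Buffer
    fifo_writes = 0

    # First pass: every fragment goes through the shader once.
    for pixel_id, passes_needed in fragments:
        if passes_needed < 1:
            raise ValueError("each fragment needs at least one pass")
        remaining = passes_needed - 1
        if remaining == 0:
            # Single-pass fragment: straight to the frame buffer.
            frame_buffer[pixel_id] = passes_needed
        else:
            # Multi-pass fragment: park it in the FIFO instead.
            fifo.append((pixel_id, passes_needed, remaining))
            fifo_writes += 1

    # Later passes: drain the FIFO in first-in, first-out order.
    while fifo:
        pixel_id, total, remaining = fifo.popleft()
        remaining -= 1
        if remaining == 0:
            frame_buffer[pixel_id] = total
        else:
            fifo.append((pixel_id, total, remaining))
            fifo_writes += 1

    return frame_buffer, fifo_writes


frame, writes = render_with_f_buffer([("a", 1), ("b", 3), ("c", 2)])
print(frame, writes)
```

The point of the sketch is that each pixel lands in the frame buffer exactly once, after its last pass, while only the unfinished pixels ever touch the intermediate buffer.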
SmoothVision 2.1
ATi has also tweaked the Radeon 9800
Pro's memory controller, allowing higher performance and
greater efficiency with Anti-Aliasing loads. At higher
resolutions with 4X and 6X AA settings, the 9800 Pro should
by all rights be somewhat faster than the 9700 Pro, clock
for clock. Regardless, ATi's pristine looking Gamma
Corrected 4X and 6X AA methods are still arguably the best
looking approach to getting rid of the jaggies
available on the market today.
Hyper Z III+
Hyper Z III is ATi's set of
compression and caching techniques, aimed at providing
valuable memory bandwidth savings in the Z-Buffer and
Stencil Buffer. Rather than dissecting the technology
for you here, we'll let the folks at ATi go through
its benefits. Here's what they claim Hyper Z III+
brings to the table.
HYPER Z III+ takes this technology a step further with an
enhanced Z-cache that is more flexible and better optimized
to work with stencil buffer data. The stencil buffer
co-exists with the Z-buffer and behaves similarly, in that
an application can set a pixel's stencil value and compare
it against the value stored in the stencil buffer to
determine if the pixel gets rendered or not. The main
difference is that the Z values in the Z-buffer represent
the "depth" of a pixel, while the values in the stencil
buffer can represent anything the programmer wants them to.
One of the most common uses for the stencil buffer is for
rendering real-time shadow volumes. In this case, the
application calculates which parts of the image fall in the
shadow of other objects, and stores these shadowed areas in
the stencil buffer. The graphics processor can then compare
each pixel it renders with the stencil buffer values to
determine if it falls within the shadow of any objects that
have already been rendered. As long as all objects are
rendered in the correct order, this technique makes it
possible to generate accurate shadows for any moving object
and/or light sources in a scene.
This process requires a lot of extra computation, so it has
been used sparingly (if at all) in most existing games.
Future game engines, however, such as the Doom 3 engine, are
expected to use it heavily to create very realistic
environments. The enhanced Z-cache feature of HYPER Z III+
increases the performance of stencil shadow volumes and will
help to deliver a superior experience when playing the next
generation of 3D games.
Translation? Doom 3 =
faster... Stencil Shadow Volumes and real-time shadow
effects = faster... Piece of cake, right? OK then,
let's move on.
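As a footnote to the stencil discussion quoted above, here is a tiny sketch of the stencil test itself. It assumes a simplified convention where a stencil value of 0 means "lit" and anything else means "inside a shadow volume". The function name and the per-pixel lists are hypothetical, and a real engine would do this in hardware through the graphics API rather than in a loop.

```python
def lighting_pass(ambient_frame, lit_frame, stencil, reference=0):
    """Apply a lighting pass only where the stencil test passes.

    ambient_frame: per-pixel values already in the frame buffer.
    lit_frame:     per-pixel values the lighting pass would write.
    stencil:       per-pixel stencil values (assumed 0 = not in shadow).
    A pixel is overwritten only if its stencil value equals reference.
    """
    result = []
    for ambient, lit, stencil_value in zip(ambient_frame, lit_frame, stencil):
        if stencil_value == reference:
            result.append(lit)       # stencil test passes: pixel is lit
        else:
            result.append(ambient)   # test fails: pixel stays in shadow
    return result


# Middle pixel is marked as shadowed, so it keeps its ambient value.
print(lighting_pass([0.1, 0.1, 0.1], [1.0, 0.8, 0.6], [0, 1, 0]))
```

The comparison is cheap. The expensive part is filling the stencil buffer with shadow volume data in the first place, which is the traffic the enhanced Z-cache is meant to help with.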
A New Growing Family Of Radeons