NVIDIA GeForce 8800 GTX and 8800 GTS: Unified Powerhouses
Date: Nov 08, 2006
Section:Graphics/Sound
Author: Marco Chiappetta and Dave Altavilla
Introduction, Specifications, and Related Info

A couple of weeks ago at an Editor's Day event at their headquarters in Santa Clara, California, NVIDIA proclaimed that they planned to "redefine reality" with their self-branded ultimate gaming platform. The products in the spotlight throughout the event were the new nForce 680i SLI chipset, the upcoming 680a SLI, and a new family of graphics cards built around the company's forthcoming DX10 capable G80 GPU.

The G80 at the heart of what became known as the GeForce 8800 GTX and 8800 GTS represented a complete shift in NVIDIA's GPU architecture. NVIDIA's Tony Tamasi even went so far as to say, "every transistor in the chip is new". Of course NVIDIA would leverage some technology from previous generations of products, but the DX10 compliant, unified architecture of the G80 is a major departure from the G70 GPU and its derivatives that power cards in the GeForce 7 series of products.

After hearing what NVIDIA had to say about the G80 and new nForce chipsets over the course of the event, the idea that the company had designed and built the ultimate gaming platform seemed like a distinct possibility, even for staunch PC enthusiast critics like us. We of course wouldn't pass judgment until we had the products to test for ourselves, however. Fortunately, the nForce 680i SLI and GeForce 8800 GTX and GTS were ready for testing almost immediately, and today we can tell you all about them.

We've talked at great length about the new nForce 600 series of chipsets, and more specifically about the nForce 680i SLI in this article here.  And in this showcase and evaluation we'll be presenting you with information regarding NVIDIA's flagship GeForce 8800 GTX and 8800 GTS. Strap in folks.  It's going to be a wild ride.

NVIDIA GeForce 8800 Series
Features & Specifications
NVIDIA unified architecture:
Fully unified shader core dynamically allocates processing power to geometry, vertex, physics, or pixel shading operations, delivering up to 2x the gaming performance of prior generation GPUs.


GigaThread Technology:
Massively multi-threaded architecture supports thousands of independent, simultaneous threads, providing extreme processing efficiency in advanced, next generation shader programs.

Full Microsoft DirectX 10 Support:
World's first DirectX 10 GPU with full Shader Model 4.0 support delivers unparalleled levels of graphics realism and film-quality effects.

NVIDIA SLI Technology:
Delivers up to 2x the performance of a single graphics card configuration for unequaled gaming experiences by allowing two cards to run in parallel. The must-have feature for performance PCI Express graphics, SLI dramatically scales performance on today's hottest games.

NVIDIA Lumenex Engine:
Delivers stunning image quality and floating point accuracy at ultra-fast frame rates.
16x Anti-aliasing: Lightning fast, high-quality anti-aliasing at up to 16x sample rates obliterates jagged edges.

128-bit floating point High Dynamic-Range (HDR):
Twice the precision of prior generations for incredibly realistic lighting effects - now with support for anti-aliasing.

NVIDIA Quantum Effects Technology:
Advanced shader processors architected for physics computation enable a new level of physics effects to be simulated and rendered on the GPU - all while freeing the CPU to run the game engine and AI.

NVIDIA ForceWare Unified Driver Architecture (UDA):
Delivers a proven record of compatibility, reliability, and stability with the widest range of games and applications. ForceWare provides the best out-of-box experience and delivers continuous performance and feature updates over the life of NVIDIA GeForce GPUs.

OpenGL 2.0 Optimizations and Support:
Ensures top-notch compatibility and performance for OpenGL applications.

NVIDIA nView Multi-Display Technology:
Advanced technology provides the ultimate in viewing flexibility and control for multiple monitors.

PCI Express Support:
Designed to run perfectly with the PCI Express bus architecture, which doubles the bandwidth of AGP 8X to deliver over 4 GB/sec. in both upstream and downstream data transfers.

Dual 400MHz RAMDACs:
Blazing-fast RAMDACs support dual QXGA displays with ultra-high, ergonomic refresh rates - up to 2048x1536@85Hz. 

Dual Dual-link DVI Support:
Able to drive the industry's largest and highest resolution flat-panel displays up to 2560x1600.

Built for Microsoft Windows Vista:
NVIDIA's fourth-generation GPU architecture built for Windows Vista gives users the best possible experience with the Windows Aero 3D graphical user interface.

NVIDIA PureVideo HD Technology:
The combination of high-definition video decode acceleration and post-processing that delivers unprecedented picture clarity, smooth video, accurate color, and precise image scaling for movies and video.

Discrete, Programmable Video Processor:
NVIDIA PureVideo HD is a discrete programmable processing core in NVIDIA GPUs that provides superb picture quality and ultra-smooth movies with low CPU utilization and power.

Hardware Decode Acceleration:
Provides ultra-smooth playback of H.264, VC-1, WMV and MPEG-2 HD and SD movies.

HDCP Capable:
Designed to meet the output protection management (HDCP) and security specifications of the Blu-ray Disc and HD DVD formats, allowing the playback of encrypted movie content on PCs when connected to HDCP-compliant displays.

Spatial-Temporal De-Interlacing:
Sharpens HD and standard definition interlaced content on progressive displays, delivering a crisp, clear picture that rivals high-end home-theater systems.

High-Quality Scaling:
Enlarges lower resolution movies and videos to HDTV resolutions, up to 1080i, while maintaining a clear, clean image. Also provides downscaling of videos, including high-definition, while preserving image detail.

Inverse Telecine (3:2 & 2:2 Pulldown Correction):
Recovers original film images from films-converted-to-video (DVDs, 1080i HD content), providing more accurate movie playback and superior picture quality.

Bad Edit Correction:
When videos are edited after they have been converted from 24 to 25 or 30 frames, the edits can disrupt the normal 3:2 or 2:2 pulldown cadences. PureVideo HD uses advanced processing techniques to detect poor edits, recover the original content, and display perfect picture detail frame after frame for smooth, natural looking video.

Video Color Correction:
NVIDIA's Color Correction Controls, such as Brightness, Contrast and Gamma Correction let you compensate for the different color characteristics of various RGB monitors and TVs ensuring movies are not too dark, overly bright, or washed out regardless of the video format or display type.

Integrated SD and HD TV Output:
Provides world-class TV-out functionality via Composite, S-Video, Component, or DVI connections. Supports resolutions up to 1080p depending on connection type and TV capability.

Noise Reduction:
Improves movie image quality by removing unwanted artifacts.

Edge Enhancement:
Sharpens movie images by providing higher contrast around lines and objects.


NVIDIA G80 Wafer:
She's a Big One

    
The GeForce 8800 GTX GPU

   
The GeForce 8800 GTS GPU


The GeForce 8800 GTX and GTS are based on a totally new unified GPU architecture, so they don't have too much in common with the older GeForce 7 series of products. It would be a good idea to familiarize yourself with NVIDIA's previous product offerings, and their platform as a whole, however. For a comprehensive look at the main features of the GeForce 7 series, and for more details regarding NVIDIA's multi-GPU SLI platform, we recommend taking a look at a few of our recent articles...

We know that's a lot of reading, but the information and performance data in the articles listed above will give you much of the background and architectural details necessary to better understand the new products being announced today. If you're unclear about anything on the following pages, look back to these articles for more related details.

Architectural Overview

The GeForce 8800 series GPU is a massive piece of silicon. It comprises roughly 681 million transistors and is manufactured on TSMC's 90nm process node. It implements a massively parallel, unified shader design, consisting of up to 128 individual stream processors in up to 8 groups of 16, running at frequencies of up to 1.35GHz. The GeForce 8800 GTX takes advantage of all 128 stream processors, but the GTS has two blocks disabled for a total of 96 stream processors.  And the unified nature of the design means each processor is capable of being dynamically allocated to vertex, pixel, geometry, or physics operations, unlike traditional GPU architectures that feature discrete pixel and vertex shaders.

 
GeForce 8800 Series GPU Block Diagram

Each GeForce 8800 GPU stream processor is a fully generalized, fully decoupled, scalar processor that supports IEEE 754 floating point precision. The advantages of being fully scalar are well summed up in this quote provided by NVIDIA:

"Although leading GPUs to date have used vector processing units, because many operations in graphics occur with vector data (such as R-G-B-A components operating in pixel shaders or 4x4 matrices for geometry transforms in vertex shaders), many scalar operations also occur. During the early GeForce 8800 architecture design phases, NVIDIA engineers analyzed hundreds of shader programs which showed an increasing use of scalar computations. They realized that with a mix of vector and scalar instructions, especially evident in longer, more complex shaders, it's hard to efficiently utilize all processing units at any given instant with a vector architecture. Scalar computations are difficult to compile and schedule efficiently on a vector pipeline.

Both NVIDIA and ATI vector-based GPUs have used shader hardware that permits dual instruction issue. Recent ATI GPUs use a "3+1" design, allowing single issue of a four-element vector instruction, or dual-issue of a three element vector instruction and a scalar instruction. NVIDIA GeForce 6x and GeForce 7x GPUs are more efficient with 3+1 AND 2+2 dual-issue design, but still not as efficient as a GeForce 8800 GPU scalar design, which can issue scalar operations to its scalar processors with 100% shader processor efficiency. NVIDIA engineers have estimated as much as 2X performance improvement can be realized from a scalar architecture that uses 128 scalar processors versus one that uses 32 4-component vector processors, based on architectural efficiency of the scalar design. (Note that vector-based shader program code is converted to scalar operations inside a GeForce 8800 GPU to ensure complete efficiency.)"
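
To make the efficiency argument above concrete, here is a toy utilization model. It's our own illustration, not NVIDIA's analysis: it assumes a hypothetical shader instruction mix and ignores the 3+1 / 2+2 dual-issue tricks mentioned in the quote, simply counting how many of a 4-wide vector unit's lanes do useful work versus a pool of scalar processors.

    #include <cstdio>

    int main() {
        // Hypothetical instruction mix: the number of components each
        // instruction actually needs (4 = vec4 op, 1 = scalar op, etc.).
        const int widths[] = {4, 3, 1, 1, 4, 2, 1, 3, 1, 1};
        const int n = sizeof(widths) / sizeof(widths[0]);

        int usefulLaneOps = 0;
        for (int i = 0; i < n; ++i) usefulLaneOps += widths[i];

        // A 4-wide vector unit issues one instruction per clock, occupying
        // all four lanes whether the instruction needs them or not.
        double vectorUtilization = double(usefulLaneOps) / double(n * 4);

        printf("Useful component ops: %d of %d vector lane-slots\n", usefulLaneOps, n * 4);
        printf("4-wide vector ALU utilization: %.1f%%\n", vectorUtilization * 100.0);
        printf("Scalar processor utilization (ideal): 100%%\n");
        return 0;
    }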

All of the stream processors in the GPU are driven by a high-speed clock domain that is separate from the core clock that drives the rest of the chip. For example, the GeForce 8800 GTX core clock is 575MHz and its stream processors run at 1.35GHz. The GeForce 8800 GTS has a core clock of 500MHz, but its stream processors are clocked at 1.2GHz.

The GeForce 8800 series GPU also has six memory partitions that each provide a 64-bit interface to memory, yielding a 384-bit combined interface width on the GTX. One of the memory partitions is disabled in the GTS, which yields a 320-bit memory interface. The memory subsystem implements a high-speed crossbar design, similar to GeForce 7x GPUs, and supports DDR1, DDR2, DDR3, GDDR3, and GDDR4 memory. The GeForce 8800 GTX uses GDDR3 memory clocked at 900MHz (1800MHz DDR) on its 384-bit (48 byte-wide) interface - that equates to 86.4GB/sec of peak bandwidth. Yikes. 
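
For reference, that bandwidth figure falls out of the usual bus-width times data-rate arithmetic. The short snippet below is our own sanity check, not NVIDIA math; it reproduces the 86.4GB/sec number for the GTX and applies the same formula to the GTS using the 320-bit bus and 1584MHz effective memory clock quoted later in this article.

    #include <cstdio>

    // Peak bandwidth (GB/s) = bus width in bytes x effective data rate.
    static double peakBandwidthGBs(double busWidthBits, double effectiveClockHz) {
        return (busWidthBits / 8.0) * effectiveClockHz / 1.0e9;
    }

    int main() {
        // GeForce 8800 GTX: 384-bit bus, 900MHz GDDR3 (1800MHz effective).
        printf("8800 GTX: %.1f GB/s\n", peakBandwidthGBs(384.0, 1800.0e6));  // 86.4

        // GeForce 8800 GTS: 320-bit bus, 1584MHz effective memory clock.
        printf("8800 GTS: %.1f GB/s\n", peakBandwidthGBs(320.0, 1584.0e6));  // 63.4
        return 0;
    }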

Texture filtering units are also fully decoupled from the stream processors. The GeForce 8800 series GPU can deliver up to 64 pixels per clock worth of raw texture filtering horsepower (vs. 24 in the GeForce 7900 GTX), 32 pixels per clock worth of texture addressing, 32 pixels per clock of 2X anisotropic filtering, and 32 bilinear-filtered pixels per clock.

The GeForce 8800 GTX has six Raster Operation (ROP) partitions (the GTS has 5).  Each partition can process 4 pixels with 16 sub-pixel samples, or a total of 24 pixels/clock with color and Z processing. For Z-only processing, an advanced new technique allows up to 192 samples/clock to be processed when a single sample is used per pixel. If 4x multi-sampled anti-aliasing is enabled, then 48 pixels per clock Z-only processing is possible.
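
The per-clock figures above are just the partition counts multiplied out; here is a quick back-of-the-envelope check (our own arithmetic) for the GTX's six partitions.

    #include <cstdio>

    int main() {
        const int ropPartitions      = 6;    // GeForce 8800 GTX (the GTS has 5)
        const int pixelsPerPartition = 4;    // color + Z pixels per clock
        const int zOnlySamplesPerClk = 192;  // peak Z-only sample rate

        printf("Color + Z: %d pixels/clock\n", ropPartitions * pixelsPerPartition); // 24
        printf("Z-only, 1 sample/pixel: %d pixels/clock\n", zOnlySamplesPerClk);    // 192
        printf("Z-only, 4X MSAA: %d pixels/clock\n", zOnlySamplesPerClk / 4);       // 48
        return 0;
    }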


Another new feature inherent to the GeForce 8800 series GPU is dubbed Early Z. Z comparisons for individual pixel data have generally occurred late in the graphics pipeline, in the ROP. The problem with evaluating individual pixels in the ROP is that they have already traversed nearly the entire pipeline. If the pixel ends up being occluded, that's a waste of GPU resources and bandwidth. With complex shader programs that have hundreds or thousands of processing steps, that's a lot of processing that can be wasted on pixels that will never be displayed.

To somewhat alleviate this issue, the GeForce 8800 employs an Early Z technique that tests the Z values of pixels before they enter the shader pipeline. The result is that a GeForce 8800 GTX GPU can cull pixels at four times the speed of a GeForce 7900 GTX.

We'll cover the individual specifications of the new GeForce 8800 GTX and 8800 GTS cards being announced today a little later on, but we thought we'd give you a high-level breakdown before discussing some of the other advanced features offered by NVIDIA's latest flagship GPU. Due to the scalable nature of the GPU design, functional blocks can be disabled, yielding GPUs with different performance characteristics.

Unified Shaders, DX10, and SM 4.0

One of the G80 GPU's main benefits is its full support for DirectX 10, Shader Model 4.0, and the other features inherent to Microsoft's upcoming API. In addition to the increased performance offered by the G80's architecture, DirectX 10 itself is poised to offer major performance benefits over DirectX 9. It will do this by significantly reducing the CPU overhead required for rendering. DirectX 10 addresses DX9's CPU overhead problems in a number of ways. For example, the cost of draw calls and state changes is reduced through a complete redesign of the performance-critical parts of the core API.  Also, new features have been introduced to reduce CPU dependence and to allow more work to be done in one command.

With DirectX 10, Microsoft will also be introducing Shader Model 4.0, which incorporates many key innovations like a new programmable stage called the geometry shader that allows for per-primitive manipulation. DX10 will also provide a new unified shading architecture with a unified instruction set and common resources across vertex, geometry, and pixel shaders. The DX10 specifications are also more strictly defined, so you won't see DirectX 10 class hardware that lacks key features. Some DX9 class GPUs were labeled as DirectX 9 compliant when they did not support all DX9 features, like vertex texture fetch for example.

Geometry Shaders:
Geometry shaders in particular represent a major step forward in the programmable graphics pipeline. In fact, the introduction of geometry shaders marks the first major change to the 3D graphics pipeline in many years. Geometry shaders allow for the generation and destruction of geometry data on the GPU for the first time. Previously, GPUs could only manipulate existing geometry. Coupled with the new stream output function, algorithms that weren't previously possible, or that had to be executed on the host CPU, can now be mapped to the GPU.

Stream output is another useful new DirectX 10 feature supported in GeForce 8800 GPUs that enables data generated by geometry shaders (or vertex shaders if geometry shaders are not used) to be sent to memory buffers and subsequently forwarded back into the top of the GPU pipeline to be processed again. Letting data flow through the GPU this way allows for more complex geometry processing, advanced lighting calculations, and GPU-based physical simulations without heavily taxing the host CPU.

Increased Resources:

Shader Model 4.0 also provides an increase in the resources allotted for shader programs. In previous versions of DirectX, developers had to manage relatively scarce register resources. DirectX 10, however, provides a large increase in register resources.  As you can see in the chart above, temporary registers are up from 32 to 4096, and constant registers are up from 256 to 65,536 (sixteen constant buffers of 4096 registers). Textures, texture sizes, and the number of render targets have increased as well.  The GeForce 8800 architecture can provide all of these DirectX 10 resources.

Unified Shaders:
In prior versions of DirectX, pixel shaders lagged behind vertex shaders in the number of constant registers, available instructions, and instruction limits. Due to these limitations, developers looked at vertex and pixel shaders as separate entities. But with Shader Model 4.0's unified instruction set, with the same number of registers and inputs for both pixel and vertex shaders, all shaders will be able to tap into the full resources of the GPU.


Workload with Discrete Pixel and Vertex Shaders

The GeForce 8800's unified architecture also results in a more efficient use of GPU resources. With the previous generation of GPUs that had discrete pixel and vertex shaders, there would almost always be idle hardware. If a scene was particularly pixel shader heavy, for example, the vertex shaders may have sat idle, and vice versa.


Workload with Unified Shaders

But with a unified shader architecture, because GPU resources can be allocated on the fly and dynamically load-balanced, major portions of the GPU won't sit idle waiting for a shift in the workload. At its most basic level, a unified shader architecture makes rendering more efficient.

New HDR Modes:
With DirectX 10, Microsoft will also be introducing two new HDR formats which offer the same dynamic range as FP16 but require only half the storage. The first format, R11G11B10, is optimized for storing textures in floating-point format. It uses 11 bits for red and green and 10 bits for blue. The second floating point format is designed to be used as a render target. It uses a 5-bit shared exponent across all colors with 9 bits of mantissa for each component. These new formats will allow high dynamic range rendering with less costly storage and bandwidth requirements.  For the highest level of precision, DirectX 10 supports 32 bits of data per component. The GeForce 8800 series fully supports this feature, which can be used for anything from high-precision rendering to scientific computing applications.
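
To give a feel for how three HDR components can share 32 bits, here is a simplified packer for the 9-9-9-5 shared-exponent layout described above. This is purely illustrative; the function name, rounding, and edge-case handling are our own simplifications, not Microsoft's or NVIDIA's reference encoder.

    #include <algorithm>
    #include <cmath>
    #include <cstdint>
    #include <cstdio>

    // Pack three non-negative floats into 9-bit mantissas plus a 5-bit
    // shared exponent (bias 15), as described in the text above.
    uint32_t packSharedExponent(float r, float g, float b) {
        const int kExpBias = 15, kMaxExp = 31, kMantMax = 0x1FF;

        float maxc = std::max(r, std::max(g, b));
        if (maxc <= 0.0f) return 0;

        // Smallest exponent that keeps the largest component within 9 bits.
        int e = std::min(kMaxExp, std::max(0, (int)std::floor(std::log2(maxc)) + 1 + kExpBias));
        float scale = 512.0f / std::exp2((float)(e - kExpBias));

        uint32_t rm = (uint32_t)std::clamp((int)(r * scale + 0.5f), 0, kMantMax);
        uint32_t gm = (uint32_t)std::clamp((int)(g * scale + 0.5f), 0, kMantMax);
        uint32_t bm = (uint32_t)std::clamp((int)(b * scale + 0.5f), 0, kMantMax);
        return ((uint32_t)e << 27) | (bm << 18) | (gm << 9) | rm;
    }

    int main() {
        printf("0x%08X\n", packSharedExponent(1.0f, 0.25f, 12.5f));
        return 0;
    }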

Better Instancing:
DirectX 10 will also offer new instancing capabilities. With DirectX 9, instanced objects were basically copies of the original; they could not use different textures or shaders.  But with DirectX 10, instanced objects no longer need to use the same textures. Due to the addition of texture arrays, each instanced object can now reference a unique texture by indexing into a texture array. Instanced objects can also use different shaders through HLSL 10's support for switch statements. What this means is that a shader can be written that describes multiple different materials, and during rendering, each instanced object could have unique effects applied to it.

The Lumenex Engine

With the GeForce 8800 series architecture, NVIDIA is introducing their new "Lumenex Engine". The Lumenex Engine is the name NVIDIA has come up with to describe a host of features integrated into the G80 GPU. The new key features in the Lumenex Engine include 16X Coverage Sampling Anti-Aliasing (CSAA), 16X nearly angle-independent anisotropic filtering, 16-bit and 32-bit floating point texture filtering, fully orthogonal 128-bit High Dynamic Range (HDR) rendering with all the above features, and a full 10-bit display pipeline.

CSAA and Angle Independent Anisotropic Filtering:

Coverage Sampling Anti-Aliasing is a new anti-aliasing technique that increases image quality without drastically increasing the demands placed on the GPU or the memory subsystem.  In its standard modes, CSAA compresses the redundant color and Z/stencil information into the memory footprint and bandwidth of 4X multi-sample AA. In its higher quality modes (8xQ and 16xQ), CSAA compresses the information into the footprint and bandwidth of 8X multi-sample AA. Previous generations of NVIDIA GPUs could only do 4X MSAA in hardware.

The image above demonstrates the difference between no anti-aliasing, traditional 4X multi-sample AA, and 16X CSAA. In the "NO AA" portion of the image there are sharp, jagged edges. 4X MSAA does a nice job of softening the edge, but the gradient steps are still clearly visible. The 16X CSAA portion of the image takes things even further though, and the gradient steps are much less apparent.

The GeForce 8800 GPUs also feature a new, high-quality anisotropic filtering engine that eliminates the angle-dependent optimizations used in previous GPUs.  We'll talk more about the 8800's new anisotropic filtering and anti-aliasing modes a little later.

10-Bit Display Pipeline:
The Lumenex Engine inside the GeForce 8800 series GPUs is built with a full 10-bit display pipeline. This will allow over a billion unique colors to be displayed with next-gen 10-bit content and displays. Even with today's content, because multiple stages in the pipeline support 10-bit precision, not just the DACs as in some older architectures, data integrity is preserved and the final output should be closer to the original input signal.

HDR:
The High Dynamic Range (HDR) lighting capabilities inherent in all GeForce 8800 series GPUs now support 128-bit precision (32-bit floating point values per component), and unlike previous NVIDIA GPUs, HDR lighting effects can now be used in conjunction with multi-sample anti-aliasing. With the GeForce 8800 series, MSAA is compatible with both FP16 (64-bit color) and FP32 (128-bit color) render targets.

PureVideo Enhancements:
The updated PureVideo Engine in the GeForce 8800 series of products also allows for more sophisticated post-processing with HD content.  In addition to the capabilities already offered by the PureVideo engine in the GeForce 6 and 7 series of products, the GeForce 8800 series adds support for VC-1 and H.264 HD Spatial-Temporal Deinterlacing, VC-1 and H.264 HD Inverse Telecine, HD Noise reduction, and HD Edge enhancement.

CUDA, the Demos, and HDR With AA

With the GeForce 8800 series architecture, NVIDIA is also announcing their "CUDA" initiative. CUDA is an acronym for Compute Unified Device Architecture. All GeForce 8800 GPUs will support NVIDIA's CUDA, which provides a unified hardware and software solution for data-intensive computing.

CUDA's main features include a new "Thread Computing" processing model that takes advantage of the heavily threaded nature of the GeForce 8800 GPU architecture.

CUDA basically encompasses all GPGPU functionality, including NVIDIA's Quantum Effects physics technology. Quantum Effects allows physics effects to be simulated and rendered on the GPU. The GeForce 8800 GPU's stream processors will eventually be used to implement more realistic water, smoke, fire, hair, explosions, and particle effects in games that take advantage of the technology. And because these computations are being run on the GPU, the host CPU is freed to run the game engine and AI.

To take advantage of CUDA, NVIDIA will be releasing a compiler that will allow standard C code to be executed on the GPU. Of course, only certain types of applications will benefit from being run on the GPU, namely those that require massive amounts of floating point performance, like Folding@Home for example. The GPU's architecture will complement traditional general purpose CPUs by providing additional processing capability for inherently parallel applications.  CUDA technology utilizes GPU resources in a different manner than graphics processing, but both CUDA threads and graphics threads can run on the GPU concurrently if desired. Because the architecture is unified, GPU resources can be dynamically allocated for pixel, vertex, or geometry shader duties, in addition to CUDA or Quantum Effects related processing tasks.
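
As a rough illustration of the programming model NVIDIA is describing, the sketch below is written against the CUDA runtime API as it eventually shipped; the kernel, data sizes, and the SAXPY operation are our own example rather than an NVIDIA sample, and error checking is omitted.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Each GPU thread computes one element; thousands of threads run in
    // parallel across the GPU's stream processors.
    __global__ void saxpy(int n, float a, const float* x, float* y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        float *x = nullptr, *y = nullptr;
        cudaMalloc(&x, n * sizeof(float));
        cudaMalloc(&y, n * sizeof(float));
        // (a real program would cudaMemcpy input data from the host here)

        // Launch enough 256-thread blocks to cover all n elements.
        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
        cudaDeviceSynchronize();

        cudaFree(x);
        cudaFree(y);
        return 0;
    }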


      
Adrianne Curry Demo

      
Waterworld Demo

      
Froggy Demo

With every new GPU, NVIDIA inevitably releases a handful of technology demos designed to exploit the new features and capabilities of the product. This time around, NVIDIA did away with the fairy-tale characters and enlisted the help of Playboy model Adrianne Curry to show off the 8800 series. The Waterworld and Froggy demos also show off the Geometry Shader and Stream Out capabilities of the GeForce 8800. Prior to now, the geometry work necessary to create the objects in the Waterworld demo would have been executed on the CPU.  But with the 8800, the growing vines and the particle effects for the water are all tasked to the GPU. The manipulation of Froggy's body is also being computed on the GPU, something that could not have been done before.


      
Oblivion: HDR with 8X Anti-Aliasing - Driven By The GeForce 8800 GTX
High Resolution Images - Large File Sizes

There's no question, the lush cinematic worlds of Oblivion are a showcase of the current capabilities of DX9-based game engines.  However, until now, High Dynamic Range (HDR) lighting effects, combined with full scene anti-aliasing, have only been available to users of ATI graphics cards, since NVIDIA's legacy products do not support the necessary features in hardware.  Valve's proprietary method of 16-bit floating point driven HDR actually works on NVIDIA-based cards, but that was the only game engine to utilize this technique.  With NVIDIA's new GeForce 8800 series architecture, all standard methods of HDR, including FP16 and FP32, are supported with both multi-sample and super-sample AA enabled.

As you can see, the effect is quite impressive, and though these screen shots were taken at high resolution with 8X AA enabled, frame rates were still more than playable for this game title.  As you'll note, our FRAPS frame counter, in the top left corner of the screen, shows a range of 39 - 60 fps.  If you've spent time in the worlds of Oblivion, you'll know this is completely fluid for game play.  Though the game itself was unable to detect the capabilities of our GeForce 8800 GTX and enable AA with HDR in the game engine, we turned on HDR in Oblivion's options menu and then forced 8X AA on in NVIDIA's driver control panel.  On a side note, at the XHD resolution of 2560x1600, 2X AA with HDR enabled was very playable as well.  XHD gaming nirvana indeed.

The GeForce 8800 GTX

After reading through the architectural details and new features found in the G80, we're sure you're all wondering what the GeForce 8800 GTX and 8800 GTS cards actually look like, so without further ado we present to you the GeForce 8800 GTX...

      
NVIDIA GeForce 8800 GTX

NVIDIA's new flagship graphics card is a beast in every sense of the word. The card is built upon a 10.5" long PCB, and the GPU and RAM are adorned with a massive dual-slot cooling apparatus. The cooler is outfitted with a fan that is designed to draw air in from the back and blow it across the heatsink's fins, where it is ultimately expelled from the system through vents in the case bracket. There are, however, some vents cut into the fan shroud towards the front of the card, which also aid in bringing temperatures down.

         

The 8800 GTX reference specifications call for a G80 GPU clocked at 575MHz with 768MB of RAM clocked at 1.8GHz. Due to the GTX's 384-bit memory bus, cards are equipped with twelve 32-bit DRAM chips, which all reside on the front side of the PCB.

         

Another interesting aspect with regard to the PCB is that it has two SLI edge connectors along the top. NVIDIA hasn't disclosed any specific information about what the second SLI connector could be used for, but when we asked about it, we did receive this response:

"The second SLI connector on the GeForce 8800 GTX is hardware support for potential future enhancements in our SLI software functionality. With the current drivers, only one SLI connector is actually used. Users can plug the SLI connector into either the right or left set of SLI fingers."

The GeForce 8800 GTX is also equipped with a pair of 6-Pin PCI Express power receptacles. Overall, NVIDIA has stated the 8800 GTX consumes a maximum of 185W, and the company recommends a 450W PSU that can supply 30A on its 12V rails.

         

With the cooler removed, the large G80 GPU is exposed, along with a second ASIC behind the DVI outputs. For GeForce 8800 GPUs, NVIDIA decided to put the TMDS and other display logic into a custom, discrete ASIC. This was done to simplify package and board routing, as well as for manufacturing efficiencies. While on the subject of display logic, we should also mention that GeForce 8800 GTX cards have dual, dual-link DVI outputs in addition to a TV/HD output.

 


We also received a couple of retail-ready GeForce 8800 GTX cards prior to launch and wanted to showcase them for you here.

         
Leadtek GeForce 8800 GTX

Leadtek's GeForce 8800 GTX conforms to NVIDIA's reference specifications in virtually every way. Underneath the custom fan shroud decal is a card identical to the one pictured above. Leadtek will be bundling their GeForce 8800 GTX card with an assortment of software and accessories that includes a pair of PCI Express power adapters, a DVI to DB15 VGA adapter, an HD component output dongle, a user's manual and installation guide, and a variety of software on CDs. The software complement included copies of PowerDVD, and the games SpellForce 2 and Trackmania Nations.

      
Asus GeForce 8800 GTX

Asus' GeForce 8800 GTX card is also virtually identical to NVIDIA's reference design. The only differentiating physical feature is an "Asus" decal on the center of the fan. We'll be looking at both of these cards in upcoming articles here at HotHardware.

    
Foxconn GeForce 8800 GTX

Finally, Foxconn's GeForce 8800 GTX offering also follows the NVIDIA reference approach, but the company is trying to differentiate with their add-on bundle.  The board will come with a bonus USB gamepad controller that is actually of very high quality and compatible with many current games.


The GeForce 8800 GTS

The GeForce 8800 GTS shares many of the same features as the 8800 GTX, but the two cards differ in a number of ways.

      
NVIDIA GeForce 8800 GTS

For one, the 8800 GTS is built upon a shorter 9" PCB. The card also requires less power; NVIDIA recommends a 400W PSU that can supply 26A on its 12V rails. As such, the GTS has only one 6-Pin PCI Express power receptacle. The GTS also has only a single SLI edge connector, so at some point in the future the GTX is likely to offer a few additional features when running in SLI mode. 

         

         
EVGA e-GeForce 8800 GTS

We actually received a retail-ready EVGA e-GeForce 8800 GTS for the purposes of this article. Underneath the card's cooler, which is identical to the one used on the GTX, lies a G80 GPU clocked at 513MHz and 640MB of GDDR3 memory clocked at 1584MHz. Please note that the GTS has "only" 96 stream processors enabled in the GPU, and its memory has a 320-bit interface, as opposed to 384 bits on the GTX. The 320-bit memory interface means the GTS is outfitted with ten 32-bit DRAMs.  The PCB does have pads for twelve, however, so there is a possibility that future, unannounced GeForce 8800 series cards with 384-bit memory interfaces may use this PCB design.

EVGA bundles their e-GeForce 8800 GTS with a nice assortment of accessories and software. Included in the box along with the card itself were a pair of DVI to DB15 VGA monitor adapters, an HD component output dongle, an S-Video cable, a Molex to 6-Pin PCI Express power adapter, a user's manual, some EVGA decals, and a couple of CDs. One disc contained the obligatory drivers, while the other contained a full version of the brand-new game Dark Messiah. Dark Messiah is a great title to showcase some of the capabilities of this card. Many thanks to EVGA for throwing it in with their GTS.

Anisotropic Filtering Quality and Performance

NVIDIA has claimed that the G80 at the heart of the GeForce 8800 GTS and GTX offers unsurpassed image quality.  And so, prior to benchmarking the new cards, we spent some time analyzing the 8800 GTX's in-game image quality versus a Radeon X1950 XTX and NVIDIA's previous flagship GeForce 7900 GTX. First, we used Half Life 2: Episode 1's "background_01a" map to get a feel for how each card's anisotropic filtering algorithms affected the scene and we also fired up the D3D AF Tester to get a clear visual representation of the angular dependency of each architecture.

Image Quality Analysis: Anisotropic Filtering
7900 vs. X1950 vs. 8800

[Screenshots: GeForce 7900 GTX, Radeon X1950 XTX, and GeForce 8800 GTX captures at No Aniso, 2X, 4X, 8X, and 16X anisotropic filtering]

As you can see in the screen-shots above, as the level of anisotropic filtering is increased, the clarity and sharpness of the ground texture is enhanced. If we compare the quality of the images produced with each card, it's difficult to pick one that is clearly superior, but there is definitely more subtle detail in the captures grabbed with the GeForce 8800 GTX.  If you focus your attention on the cracks in the ground in the distance, you'll be able to pick up some of the differences.

The images captured with the D3D AF Tester also show the GeForce 8800 GTX's strengths. The 8800 GTX has almost no angular dependency and produces smooth transitions in an almost circular pattern.  The Radeon X1950 XTX also does a great job with anisotropic filtering, but if you open the 16X aniso shots taken with the D3D Tester side-by-side you'll see the 8800 produces the superior pattern.

What the above screen-shots don't show is that the texture shimmering issue that's plagued the G70 is completely gone. The G80's new filtering capabilities have eliminated the texture shimmering present with older architectures, which actually makes gaming much easier on the eyes.


To get an idea as to how increasing the level of anisotropic filtering in a game affected performance, we cycled through every available setting using our custom FarCry benchmark with the GeForce 8800 GTX and 8800 GTS. As the results show, anisotropic filtering is almost "free" on the G80. As the level of anisotropic filtering was increased, performance dropped off only slightly with either card.

Image Quality: Anti-Aliasing & CSAA

As we've already mentioned, the G80 GPU at the heart of the GeForce 8800 GTX and GTS cards offers new anti-aliasing modes courtesy of the Lumenex Engine. With the G80, NVIDIA designed an anti-aliasing engine that employs a proprietary algorithm called Coverage Sampling Anti-Aliasing (CSAA). Unlike some older multi-sampling techniques, Coverage Sampling Anti-Aliasing uses intelligent color and Z sample information to perform anti-aliasing while reducing the load placed on the memory system. With CSAA, NVIDIA raised the total number of samples that could be taken per-pixel to 16, as opposed to 4 on the G71.

Image Quality Analysis: Anti-Aliasing
7900 vs. X1950 vs. 8800

[Screenshots: GeForce 7900 GTX at No AA, 2X, 2xQ, 4X, and 8xS AA; GeForce 8800 GTX at 4X, 8X, 8xQ, 16X, and 16xQ AA; Radeon X1950 XTX at No AA, 2X, 4X, and 6X AA]

To see how the GeForce 8800 GTX performed in regard to anti-aliasing, we fired up Half Life 2 and captured a few screen-shots at the various anti-aliasing modes available. We also did the same with a GeForce 7900 GTX and a Radeon X1950 XTX. Please pay special attention to the labels and the file names when clicking through the images above though, as only the 4X anti-aliasing shots will represent an apples-to-apples-to-apples comparison between the three cards.

If you flip through the shots, the first thing you're likely to notice is a slight rendering bug on the 8800 that causes a problem with the lighting on some of the buildings and trees. We're confident this will be fixed in a future driver release, so we won't dwell on it. What's more important to focus on are the gradients on the cables that span the top of the screen, and the fine details in the antennas atop the buildings. As the AA levels are increased, the GeForce 8800 GTX does a great job of reducing the jaggies, and the 8800 also seems to better preserve some fine detail.

CSAA Performance

To quickly assess the performance impact enabling CSAA had on frame rates in a couple of popular games, we ran a handful of tests with F.E.A.R. and Prey using the GeForce 8800 GTX and GTS. We started with 4X anti-aliasing enabled, and cycled through the other modes offered with both games running at 1600x1200 with 16X anisotropic filtering enabled.

CSAA Performance: F.E.A.R. and Prey
Upping the Number of Samples

 

Jumping from 4X to 8X anti-aliasing with either card resulted in an approximate 20% to 30% performance drop, but from there on up, performance remained relatively stable until we hit the maximum 16xQ anti-aliasing mode. You may be asking yourself how this can be possible, as moving from 8X to 8xQ and ultimately 16X AA results in roughly equivalent performance. This is due to the Lumenex Engine's ability to compress the redundant color and depth/stencil information into the memory footprint and bandwidth of 4 or 8 multi-samples. In fact, 8X AA and 16X AA both store only 1 texture sample and 4 color/Z samples. The two modes differ only in the number of coverage samples taken, which doesn't have as much of an impact on performance. 16xQ anti-aliasing, on the other hand, stores double the number of color/Z samples (8), hence the additional performance drop off.
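
The storage math behind that explanation is straightforward. The sketch below is our own tabulation of the stored-sample counts quoted above, assuming 32-bit color plus 32-bit Z/stencil per stored sample and ignoring the few extra bits of coverage data; it shows why 16X CSAA costs roughly what 4X MSAA does while 16xQ does not.

    #include <cstdio>

    int main() {
        struct Mode { const char* name; int storedColorZSamples; };
        const Mode modes[] = {
            {"4X MSAA",   4},  // baseline multi-sampling
            {"8X CSAA",   4},  // 8 coverage samples, 4 stored color/Z samples
            {"16X CSAA",  4},  // 16 coverage samples, still 4 stored samples
            {"8xQ CSAA",  8},  // quality modes store 8 color/Z samples
            {"16xQ CSAA", 8},
        };

        const int bytesPerStoredSample = 8;  // 32-bit color + 32-bit Z/stencil
        for (const Mode& m : modes)
            printf("%-10s %d stored samples -> %d bytes/pixel\n",
                   m.name, m.storedColorZSamples,
                   m.storedColorZSamples * bytesPerStoredSample);
        return 0;
    }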

Our Test System and 3DMark06

HOW WE CONFIGURED THE TEST SYSTEMS: We tested all of the graphics cards used in this article on an EVGA nForce 680i SLI based motherboard powered by a Core 2 Extreme X6800 dual-core processor and 2GB of low-latency Corsair RAM. The first thing we did when configuring the test system was enter the BIOS and set all values to their default settings. Then we manually configured the memory timings and disabled any integrated peripherals that wouldn't be put to use. The hard drive was then formatted, and Windows XP Pro with SP2 and the October DX9 update was installed. When the installation was complete, we then installed the latest chipset drivers available, installed all of the other drivers necessary for the rest of our components, and removed Windows Messenger from the system.  Auto-Updating and System Restore were also disabled, the hard drive was defragmented, and a 1024MB permanent page file was created on the same partition as the Windows installation. Lastly, we set Windows XP's Visual Effects to "best performance," installed all of the benchmarking software, and ran the tests.

The HotHardware Test System
Core 2 Extreme Powered

Hardware Used:
Processor - Core 2 Extreme X6800 (2.93GHz)
Motherboard - EVGA nForce 680i SLI (nForce 680i SLI chipset)
Video Cards - GeForce 8800 GTX, GeForce 8800 GTS, GeForce 7950 GX2, GeForce 7900 GTX, Radeon X1950 XTX (CF Master)
Memory - 2048MB Corsair PC2-6400C3 (2 x 1GB)
Audio - Integrated on board
Hard Drive - Western Digital "Raptor" 74GB - 10,000RPM - SATA

Relevant Software:
Operating System - Windows XP Pro SP2
Chipset Drivers - nForce Drivers v9.53
DirectX - DirectX 9.0c (October Redist.)
Video Drivers - NVIDIA Forceware v96.94, ATI Catalyst v6.10

Benchmarks Used:
Synthetic (DX) - 3DMark06 v1.0.2
DirectX - Battlefield 2142 v1.01*
DirectX - Need For Speed: Carbon v1.2*
DirectX - FarCry v1.4*
DirectX - F.E.A.R. v1.08
DirectX - Half Life 2: Episode 1*
OpenGL - Prey v1.2*
OpenGL - Quake 4 v1.3*

* - Custom Test (HH Exclusive demo)

Performance Comparisons with 3DMark06 v1.0.2
Details: http://www.futuremark.com/products/3dmark06/

3DMark06
3DMark06 is the latest addition to the 3DMark franchise. This version differs from 3Dmark05 in a number of ways, and now includes not only Shader Model 2.0 tests, but Shader Model 3.0 and HDR tests as well. Some of the assets from 3DMark05 have been re-used, but the scenes are now rendered with much more geometric detail and the shader complexity is vastly increased as well. Max shader length in 3DMark05 was 96 instructions, while 3DMark06 ups the number of instructions to 512. 3DMark06 also employs much more lighting, and there is extensive use of soft shadows. With 3DMark06, Futuremark has also updated how the final score is tabulated. In this latest version of the benchmark, SM 2.0 and HDR / SM3.0 tests are weighted and the CPU score is factored into the final tally as well.

In terms of 3DMark06's general scoring metric, we see a shadow of things to come in our real-world game engine tests.  A GeForce 8800 GTS is roughly as fast as a GeForce 7950 GX2 and significantly faster than a Radeon X1950 XTX.  Then of course, the Grand-Daddy here is the new GeForce 8800 GTX, which easily broke the 10K 3DMark threshold, a first for any single GPU configuration that has ever hit our labs.

Perhaps the most interesting data point here is that the GeForce 8800 GTS loses slightly to the GeForce 7950 GX2 in Shader Model 2.0 performance and edges out the GX2 in SM 3.0 performance.  We'd offer that this is perhaps a testament to the new GeForce 8800 series' shader engine capabilities, which should serve it well as leading-edge game titles employ more complex shader instructions and effects. A final observation is that the GeForce 8800 GTX, according to 3DMark06, is roughly 50% more powerful with SM 3.0 workloads than NVIDIA's previous single-GPU flagship card, the 7900 GTX.

Half Life 2: Episode 1

Historically a strong suit for ATI-based graphics cards, Valve's Half Life 2: Episode 1 is where we begin our standard benchmark testing.

Performance Comparisons with Half Life 2: Episode 1
Details: http://www.half-life2.com/

Half Life 2: Episode 1
Thanks to the dedication of hardcore PC gamers and a huge mod-community, the original Half-Life became one of the most successful first person shooters of all time.  So, when Valve announced Half-Life 2 was close to completion, gamers the world over sat in eager anticipation. Upon its release, HL2 was universally lauded, and the sequel won an array of "Game of the Year" awards. Armed with the latest episodic update to HL2, Episode 1, we benchmarked the game with a long, custom-recorded timedemo that takes us through both outdoor and indoor environments. These tests were run at resolutions of 1,280 x 1,024 and 1,600 x 1,200 with 4X anti-aliasing and 16X anisotropic filtering enabled concurrently, and with color correction and HDR rendering enabled in the game engine as well.

 

Both of the new GeForce 8800 series cards prove themselves to be the fastest single graphics cards at any resolution in Half Life 2: EP1. They're even able to beat out the dual-GPU configuration of the GeForce 7950 GX2 here and handily take out ATI's current flagship Radeon X1950 XTX.  In fact, the GTS is some 32% faster than a Radeon X1950 XTX at high resolution, and the GTX comes in nearly 68% faster.  Jaw-dropping performance to be sure, but we need to step up the workload a notch or two as well.

FarCry v1.4 Performance

Though slightly on the dated side, Far Cry still runs on a fairly robust DX9 game engine.  Testing with our custom FC timedemo and a fully patched version of the game is next.

Performance Comparisons with FarCry v1.4
Details: http://www.farcry.ubi.com/

FarCry
If you've been on top of the gaming scene for some time, you probably know that FarCry was one of the most visually impressive games to be released on the PC in the last few years.  Courtesy of its proprietary engine, dubbed "CryEngine" by its developers, FarCry's game-play is enhanced by Polybump mapping, advanced environment physics, destructible terrain, dynamic lighting, motion-captured animation, and surround sound. Before titles such as Half-Life 2 and Doom 3 hit the scene, FarCry gave us a taste of what was to come in next-generation 3D gaming on the PC. We benchmarked the graphics cards in this article with a fully patched version of FarCry using a custom-recorded demo run taken in the "Catacombs" area checkpoint. The tests were run at various resolutions with 4X AA and 16X aniso enabled concurrently.

Far Cry turned out to be about the same level of challenge for the GeForce 8800 series as was Half Life 2: Episode 1 on the previous page, but in this case the Radeon X1950 XTX had a much easier time with our custom demo and actually managed to nearly match the performance of the GeForce 8800 GTS.  Then our dual-GPU infused GeForce 7950 GX2 took second place by a comfortable margin.  And in the pole position, the new GeForce 8800 GTX bested the two top single GPU cards by over 30 frames per second and even managed a 15% performance gain over the GeForce 7950 GX2 at high resolution.

F.E.A.R. v1.08 Performance

F.E.A.R. is a relatively taxing and impressive game with a very realistic particle system and a great physics engine in comparison to many other titles currently on the market.

Performance Comparisons with F.E.A.R
More Info: http://www.whatisfear.com/us/

F.E.A.R
One of the most highly anticipated titles of 2005 was Monolith's paranormal thriller F.E.A.R. Taking a look at the game's minimum system requirements, we see that you will need at least a 1.7GHz Pentium 4 with 512MB of system memory and a 64MB graphics card in the Radeon 9000 or GeForce4 Ti classes or better to adequately run the game. Using the full retail release of the game patched to v1.08, we put the graphics cards in this article through their paces to see how they fared with a popular title. Here, all graphics settings within the game were set to their maximum values, but with soft shadows disabled (soft shadows and anti-aliasing do not currently work together). Benchmark runs were then completed at resolutions of 1,280 x 960 and 1,600 x 1,200, with anti-aliasing and anisotropic filtering enabled.

 

Here our frame rates are much more subdued, and in fact at high resolution with 4X AA enabled, even a powerful card like the GeForce 7900 GTX gets a little pokey. In this test, the GeForce 8800 GTS again edges out the recently released Radeon X1950 XTX, and once again the GeForce 8800 GTX reigns supreme over the entire lot, including the dual-GPU powered GeForce 7950 GX2.  In terms of single GPU performance, the GeForce 8800 GTX is 40 - 45% faster than a Radeon X1950 XTX.

Quake 4 v1.3 Performance

Built on one of the most widely used and re-purposed OpenGL game engines on the market, Quake 4 is next...

Performance Comparisons with Quake 4
Details: http://www.quake4game.com/

Quake 4
id Software, in conjunction with developer Raven, recently released the latest addition to the wildly popular Quake franchise, Quake 4. Quake 4 is based upon an updated and slightly modified version of the Doom 3 engine, and as such performance characteristics between the two titles are very similar.  Like Doom 3, Quake 4 is also an OpenGL game that uses extremely high-detailed textures and a ton of dynamic lighting and shadows, but unlike Doom 3, Quake 4 features some outdoor environments as well. We ran these Quake 4 benchmarks using a custom demo with the game set to its "High-Quality" mode, at resolutions of 1,280 x 1,024 and 1,600 x 1,200 with 4X AA and 8X aniso enabled simultaneously.

Though OpenGL performance and Quake 4 have always been strong points for NVIDIA-based cards, the Radeon X1950 XTX does put up a solid showing here, actually besting NVIDIA's legacy single GPU card, the GeForce 7900 GTX.  However, ATI's fastest is still no match for the new GeForce 8800 series powerhouses and both walk off with decisive victories.  In fact the GeForce 8800 GTS is actually able to keep pace with the GeForce 7950 GX2. Finally, witnessing the GeForce 8800 GTX push out over 135 frames per second at 1600X1200 with 4X AA enabled borders on insanity.  Clearly if you're going to step up for the power and cost of a GeForce 8800 GTX, you better have a high-resolution LCD panel or CRT to go with it or we may have to hunt you down and ridicule you publicly.

Prey v1.2 Performance

Take-Two's Prey is a game based on the Doom 3 engine. Like Quake 4, it places a somewhat more strenuous demand on the graphics subsystem than many other titles.

Performance Comparisons with Prey
Details: http://www.prey.com/

Prey
After many years of development, Take-Two Interactive recently released the highly anticipated game Prey. Prey is based upon an updated and modified version of the Doom 3 engine, and as such performance characteristics between the two titles are very similar.  Like Doom 3, Prey is also an OpenGL game that uses extremely high-detailed textures and a plethora of dynamic lighting and shadows.  But unlike Doom 3, Prey features a fair share of outdoor environments as well.  We ran these Prey benchmarks using a custom-recorded timedemo with the game set to its "High-Quality" graphics mode, at resolutions of 1,280 x 1,024 and 1,600 x 1,200 with 4X AA and 16X anisotropic filtering enabled simultaneously.

 

Once again, GPU-for-GPU, the new GeForce 8800 series from NVIDIA shows itself to be considerably faster in single GPU configurations than anything currently on the market.  The GeForce 8800 GTS clocks in over 15% faster than a Radeon X1950 XTX and a shade under the performance of a GeForce 7950 GX2 in our custom Prey benchmark. The GeForce 8800 GTX is even 20% faster than the dual-GPU powered GeForce 7950 GX2 in this title. And at 1600x1200 with 4X AA enabled, Prey at 100+ fps is a whole barrel of fun.

Need For Speed - Carbon

 

In an effort to mix things up a bit and get as far away from the first person shooter genre as possible, we have EA's Need For Speed: Carbon on tap next.  A jacked up, pimped out racing simulation with plenty of eye candy, NFS: Carbon should push these cards a bit more to the point where even a new GeForce 8800 series GPU breaks a sweat.

Performance Comparisons with Need For Speed: Carbon
Details: http://nfs.ea.com/

Need For Speed:
Carbon
Dating back to the days of floppy disks, EGA, and the Lamborghini Countach, the Need For Speed franchise is undoubtedly one of the most popular in gaming history.  The most recent addition to the franchise is Need For Speed: Carbon, a racing-sim loaded with muscle cars and exotics in addition to a number of lighting and special graphics effects. We ran these NFS: Carbon benchmarks by utilizing FRAPS and tracking framerates on the same track, using the same car with every graphics card. The game was configured with all of its graphics-related options set to their maximum values, with motion blur enabled.  We tested the game at resolutions of 1,280 x 1,024 and 1,600 x 1,200 with 4X AA and 16X anisotropic filtering enabled simultaneously.

 

Talk about a whole new world: whether you consider our Prey, F.E.A.R., or Half Life 2: Episode 1 tests, nothing put the hurt on these new graphics cards like NFS: Carbon.  Of course, you don't need blistering fast frame rates to play this great new racing sim either.  Our first observation is that NVIDIA has some driver work to do to get SLI working with the game, as the GeForce 7950 GX2 was clearly having major issues with its dual-GPU setup if it couldn't even beat out a GeForce 7900 GTX.  Beyond that, the new Radeon X1950 XTX puts out a solid performance but can't quite catch the drift of a GeForce 8800 GTS.  And of course, at the risk of sounding mildly trite, the GeForce 8800 GTX leaves all other competitors in its dust.  This new flagship monster GPU from NVIDIA is over 40% faster than the fastest ATI currently has to offer, in our Need For Speed: Carbon testing.

Battlefield 2142 Performance

Though not exactly a showcase of GPU horsepower and capability, there is little doubt that Battlefield 2142 is going to become a hugely popular multi-player first person shooter. Thus we've run our new GeForce 8800 series cards through their paces with the new EA title as a relevant reference point for battle-hardened readers.

Performance Comparisons with Battlefield 2142
Details: http://battlefield2142.ea.com/

Battlefield 2142
DICE's Battlefield 2142 is a futuristic update to the wildly popular Battlefield 2. The latest addition to the franchise is powered by an updated version of the proven Battlefield 2 engine that now incorporates more shader and atmospheric effects. The textures and artwork are also of a higher resolution than those used in BF2. We ran these benchmarks using FRAPS on the Verdun map, in single player mode. The game was configured with its graphical options set to their maximum values, and we monitored framerates at resolutions of 1,280 x 960 and 1,600 x 1,200 with 4X AA and 16X anisotropic filtering enabled simultaneously.

If you're looking for a clear, decisive performance edge with BF 2142, we'd suggest you focus more on the amount of system memory installed (we recommend 2GB) and perhaps your CPU, rather than the GPU. Regardless, the flat-out fastest card in this mostly CPU-bound test is once again the GeForce 8800 GTX. Both the 8800 GTS and GTX are able to take ATI's Radeon X1950 XTX to task and beat it handily. In fact, the GTX is able to edge out the ever-potent GeForce 7950 GX2 as well.

XHD Resolutions: HL2 Episode 1

For the next round of testing, we've upped the ante and re-tested all of the graphics cards at XHD resolutions with a handful of games.

Performance Comparisons with Half-Life 2: Episode 1 XHD
Details: http://www.half-life2.com/

Half Life 2: Episode 1
Thanks to the dedication of hardcore PC gamers and a huge mod-community, the original Half-Life became one of the most successful first person shooters of all time.  So, when Valve announced Half-Life 2 was close to completion, gamers the world over sat in eager anticipation. Upon its release, HL2 was universally lauded, and the sequel won an array of "Game of the Year" awards. Armed with the latest episodic update to HL2, Episode 1, we benchmarked the game with a long, custom-recorded timedemo that takes us through both outdoor and indoor environments. These tests were run at resolutions of 1920 x 1200 and 2560 x 1600 with 4X anti-aliasing and 16X anisotropic filtering enabled concurrently, and with color correction and HDR rendering enabled in the game engine as well.

 

Once again, NVIDIA's new GeForce 8800 series cards are able to outpace all of the competition in almost every test configuration. The GeForce 8800 GTS was ever so slightly slower than the GeForce 7950 GX2 at the higher resolution, likely due to the GX2's larger frame buffer (1GB vs. 640MB), but in the other three tests the GTS and 8800 GTX were dominant. In fact, the GeForce 8800 GTX was about twice as fast as the older, former single-GPU powered kingpins, the GeForce 7900 GTX and Radeon X1950 XTX.

XHD Resolutions: F.E.A.R.

Performance Comparisons with F.E.A.R XHD
More Info: http://www.whatisfear.com/us/

F.E.A.R
One of the most highly anticipated titles of 2005 was Monolith's paranormal thriller F.E.A.R. Taking a look at the game's minimum system requirements, we see that you will need at least a 1.7GHz Pentium 4 with 512MB of system memory and a 64MB graphics card in the Radeon 9000 or GeForce4 Ti classes or better to adequately run the game. Using the full retail release of the game patched to v1.08, we put the graphics cards in this article through their paces to see how they fared with a popular title. Here, all graphics settings within the game were set to their maximum values, but with soft shadows disabled (soft shadows and anti-aliasing do not currently work together). Benchmark runs were then completed at resolutions of 1920 x 1200 and 2560 x 1600 with anti-aliasing and anisotropic filtering enabled.

 

At a resolution of 2560x1600, the F.E.A.R. benchmark is able to slow almost all of the graphics cards we tested to a virtual crawl, with the exception of the GeForce 8800 GTX that is. At the lower resolution, things are somewhat competitive with the 8800 GTS coming in between the GX2 and Radeon X1950 XTX, and at 2560x1600, the GTS is actually outpaced by the X1950 XTX, likely due to the latter's super-fast 2GHz frame buffer.  The GeForce 8800 GTX is simply in a league of its own, however.  At both resolutions it crushes all of the competition by margins ranging from about 10% to a whopping 110%.

XHD Resolutions: Quake 4

Performance Comparisons with Quake 4 XHD
Details: http://www.quake4game.com/

Quake 4
id Software, in conjunction with developer Raven, recently released the latest addition to the classic Quake franchise, Quake 4. Quake 4 is based upon an updated and slightly modified version of the Doom 3 engine, and as such, performance characteristics between the two titles are very similar. Like Doom 3, Quake 4 is also an OpenGL game that uses extremely high-detailed textures and a ton of dynamic lighting and shadows, but unlike Doom 3, Quake 4 features some outdoor environments as well. We ran these Quake 4 benchmarks using a custom demo with the game set to its "High-Quality" mode, at resolutions of 1920 x 1200 and 2560 x 1600 with 4X AA and 8X aniso enabled simultaneously.

With a Core 2 Extreme X6800 powering the system and the multi-threaded v1.3 patch installed, all of the cards we tested, with the exception of the GeForce 7900 GTX perhaps, put up playable framerates at both of the XHD resolutions we tested. The GeForce 8800 GTS finished just behind the GeForce 7950 GX2 at both resolutions, and missed the mark set by the Radeon X1950 XTX by about 10% at 2560 x 1600, but its performance was clearly superior to the 7900 GTX. The new GeForce 8800 GTX on the other hand performed extraordinarily in comparison to all of the other cards. Its large frame buffer and higher clocks (relative to the 8800 GTS) propelled the 8800 GTX to the top of the charts by large margins.

XHD Resolutions: Prey

Performance Comparisons with Prey XHD
Details: http://www.prey.com/

Prey
After many years of development, Take-Two Interactive recently released the highly anticipated game Prey. Prey is based upon an updated and modified version of the Doom 3 engine, and as such, performance characteristics between the two titles are very similar. Like Doom 3, Prey is also an OpenGL game that uses extremely high-detailed textures and a plethora of dynamic lighting and shadows. But unlike Doom 3, Prey features a fair share of outdoor environments as well. We ran these Prey benchmarks using a custom recorded timedemo with the game set to its "High-Quality" graphics mode, at resolutions of 1920 x 1200 and 2560 x 1600 with 4X AA and 16X anisotropic filtering enabled simultaneously.

 

The results reported by our custom Prey benchmark at XHD resolutions somewhat mirror those reported by Quake 4 on the previous page. The GeForce 8800 GTS came in ever so slightly behind the Radeon X1950 XTX at the higher resolution, but at 1920 x 1200 only the GX2 and the 8800 GTX were faster. Once again though, the new GeForce 8800 GTX put up one heck of a dominant performance, besting all of the competition by large double-digit percentages across the board.

PureVideo Features and Performance

For our next round of tests we took a look at digital video processing performance across the two competing GPU architectures: NVIDIA's "PureVideo" technology versus ATI's "AVIVO".

WMV-HD Decode Acceleration
PureVideo Performance Explored

To characterize CPU utilization when playing back WMV HD content, we used the Performance Monitor built into Windows XP. Using the data provided by Windows Performance Monitor, we created a log file that sampled the percent of CPU utilization every second, while playing back the 1080p version of the "Amazing Caves" video available for download on Microsoft's WMVHD site. The CPU utilization data was then imported into Excel to create the graph below. The graph shows the CPU utilization for a GeForce 7900 GTX, a Radeon X1950 XTX, and the GeForce 8800 GTX using Windows Media Player 11, with XP patched using the DXVA updates posted on Microsoft's web site (Updates Available Here). The desktop resolution was set to 1920 x 1200 for these tests.


Average CPU Utilization (Core 2 Extreme X6800 @ 2.93GHz)

GeForce 7900 GTX: 16.48%
Radeon X1950 XTX: 16.0065407%
GeForce 8800 GTX: 16.0065407%

One of the more interesting things to ever happen in the HH labs took place during our CPU utilization testing for this article. As you can see in the graph above, all three of the cards handled the Amazing Caves video without much of a problem. No card ever went over the 25% CPU utilization mark. With a GeForce 7900 GTX in the test system, roughly 16.5% of CPU resources were used during the playback of this HD video. With both the Radeon X1950 XTX and GeForce 8800 GTX installed, though, exactly 16.0065407% of the CPU's resources were required. We carried the result out to seven decimal places to show just how wild this result was. With 86 samples recorded during the video playback for each GPU, the results averaged out to the exact same value. Any mathematicians in the audience? What are the odds of that happening again?
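For readers curious how those averages were tallied, here is a minimal sketch of the post-processing step, assuming the Performance Monitor log is exported as a CSV with one utilization sample per row. The file name and column label below are placeholders, not the exact ones Perfmon produced for our logs; we imported our data into Excel, but the arithmetic is the same.

import csv

def average_cpu_utilization(csv_path, column="% Processor Time"):
    """Average the per-second CPU utilization samples from a Perfmon CSV export.

    The file name and column label are hypothetical placeholders; adjust them
    to match whatever Performance Monitor actually wrote to your log.
    """
    samples = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            value = row.get(column, "").strip()
            if value:  # skip blank or placeholder cells
                samples.append(float(value))
    return sum(samples) / len(samples) if samples else 0.0

# Example: averaging the 86 one-second samples captured during the clip
print(f"{average_cpu_utilization('wmvhd_cpu_log.csv'):.7f}%")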

DVD Video Quality: HQV Benchmark
http://www.hqv.com/benchmark.cfm

Next up, we have the HQV DVD video benchmark from Silicon Optix. HQV comprises a sampling of SD video clips and test patterns that have been specifically designed to evaluate a variety of interlaced video signal processing tasks, including decoding, de-interlacing, motion correction, noise reduction, film cadence detection, and detail enhancement. As each clip is played, the viewer is required to "score" the image based on a predetermined set of criteria. The numbers listed below are the sum of the scores for each section. We played the HQV DVD using the latest version of Intervideo's WinDVD 7 Platinum Suite, with hardware acceleration and PureVideo extensions enabled.

NVIDIA's latest Forceware drivers give the GeForce 7900 GTX a nice boost in performance in this test, and give the GeForce 8800 series of cards a slight edge over ATI's Radeon X1950 XTX. The only differences between the GeForces and Radeon, however, were in the Noise Reduction tests, where we gave the NVIDIA-powered cards a slight advantage. In all honesty though, if HQV's scoring guidelines allowed it, we'd probably give ATI a 7.5 on the NR tests, and NVIDIA an 8. But HQV doesn't allow this. The output from either architecture is really that close.

Preliminary SLI Testing

SLI was not quite ready for prime-time with the initial driver release NVIDIA provided to analysts (v96.94), but a few days ago NVIDIA came through with an updated driver that was SLI capable (v96.97). Some functions are still disabled, like SLI AA for example, but the driver was stable and worked perfectly throughout a short run with various benchmarks.

We ran a handful of benchmarks using a pair of GeForce 8800 GTS and GeForce 8800 GTX cards and have a quick comparison available below. When looking at the graphs, please note that we're comparing single-card performance versus SLI; the GeForce 8800 GTS and 8800 GTX numbers are presented in separate graphs. All of the tests were run at the same resolution and settings (1920x1200 | 4X AA/16X Aniso), with the exception of the 3DMark06 test, which represents a default benchmark run (1280x1024).

Performance Comparisons: Single-Card vs. SLI
Double Your Pleasure, Double The Fun

 

Performance scaled very well with the GeForce 8800 GTS and GTX cards running in SLI mode, especially in the Prey and F.E.A.R. benchmarks. Those two games in particular showed massive performance improvements with two GPUs sharing the rendering workload. Even at this early stage of driver development (relatively speaking), NVIDIA seems to have SLI working well with the GeForce 8800 series, at least from a performance perspective. And we suspect that things will only get better moving forward, as NVIDIA's driver team gets more familiar with the intricacies of their latest GPU.
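To put "scaled very well" into concrete terms, SLI scaling is simply the percentage gain of the dual-card result over the single card. A quick sketch of that arithmetic follows; the framerates plugged in are purely illustrative placeholders, not figures pulled from our graphs.

def sli_scaling(single_fps, sli_fps):
    """Percentage improvement of an SLI result over a single-card result."""
    return (sli_fps / single_fps - 1.0) * 100.0

# Illustrative placeholder framerates only
single_card, dual_card = 60.0, 105.0
print(f"SLI gain: {sli_scaling(single_card, dual_card):.1f}%")  # 75.0% scaling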

Overclocking the new GeForces

As we neared the end of our testing, we spent a little time overclocking the new GeForce 8800 GTS and GTX cards using the clock frequency slider available within NVIDIA's Forceware drivers, after enabling the "Coolbits" registry tweak.

Overclocking the GeForce 8800s
(Fast 3D Video Cards) + Overclocking = Even Faster Cards


GeForce 8800 GTX: Stock=576MHz GPU/1800MHz Mem | Overclocked=626MHz GPU/1900MHz Mem
GeForce 8800 GTS: Stock=513MHz GPU/1584MHz Mem | Overclocked=563MHz GPU/1684MHz Mem

 



We were pleasantly surprised by the overclockability of both of the new GeForce 8800 series cards, but had some interesting results. We were able to take the GeForce 8800 GTX up from its stock GPU core and memory clock frequencies of 576MHz / 1.8GHz to 626MHz / 1.9GHz. And we were able to take the GeForce 8800 GTS up from its default GPU and memory clocks of 513MHz / 1584MHz, to 563MHz / 1684MHz, core and memory clock frequency increases of 50MHz and 100MHz, respectively, for both cards. We suspect there is actually a little more left in the tank with these cards, and will experiment with retail product and third-party overclocking tools in the near future.
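Expressed as percentages, those 50MHz and 100MHz bumps break down as follows; this quick sketch simply runs the stock and overclocked frequencies listed above through the arithmetic.

# Stock and overclocked frequencies (MHz) from the results above
cards = {
    "GeForce 8800 GTX": {"core": (576, 626), "memory": (1800, 1900)},
    "GeForce 8800 GTS": {"core": (513, 563), "memory": (1584, 1684)},
}

for name, clocks in cards.items():
    for domain, (stock, oc) in clocks.items():
        gain = (oc - stock) / stock * 100.0
        print(f"{name} {domain}: {stock} -> {oc} MHz (+{gain:.1f}%)")

That works out to roughly an 8.7% core and 5.6% memory increase for the GTX, and about 9.7% and 6.3% for the GTS.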

Power Consumption, Temps and Noise

We have a few final data points to cover before bringing this article to a close. Throughout all of our benchmarking, we monitored how much power our test system was consuming using a power meter, and also took some notes regarding its noise output and GPU temperatures. Our goal was to give you all an idea as to how much power each configuration used and to explain how loud the configurations were under load. Please keep in mind that we were testing total system power consumption here, not just the power being drawn by the video cards alone.

Total System Power Consumption & Acoustics
It's All About the Watts and Decibels

The new GeForce 8800 GTX and GeForce 8800 GTS consumed more power than any of the other cards we tested, but the results are somewhat promising. Some of you may be thinking we're a little nuts to say that, considering the GTX requires two 6-pin PCI Express supplemental power leads, but the good news is that the increased performance offered by these new cards is not directly proportional to their power consumption. If we look at the GeForce 8800 GTX in particular, in many cases it offered nearly double the performance of the Radeon X1950 XTX, yet under load, the test system consumed "only" 33 more watts. And the GTS and X1950 XTX were roughly on par with one another under load. When compared to the 7900 GTX, things don't look quite as rosy, but power consumption is not quite as insane as initial rumors regarding the G80 let on. And who knows, with a die-shrink and GDDR4 memory (we're speculating here), the spring-refresh products will likely offer better performance with lower power requirements.
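To illustrate why we call that result promising, consider the ratio of performance gained to additional power drawn. The sketch below is illustrative only: the roughly 2x performance factor and the 33-watt delta come from our testing, but the 300-watt baseline system draw is a hypothetical figure chosen just for the example.

def efficiency_gain(perf_factor, base_watts, extra_watts):
    """Relative performance gain divided by relative power increase."""
    power_factor = (base_watts + extra_watts) / base_watts
    return perf_factor / power_factor

# ~2x the performance of the X1950 XTX for +33W at the wall;
# the 300W baseline is a hypothetical system draw, not a measured value.
print(f"Perf-per-watt improvement: {efficiency_gain(2.0, 300.0, 33.0):.2f}x")

Even with a fairly generous baseline, the performance gain clearly outruns the added power draw, which is what makes the GTX's consumption figures easier to swallow.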


GeForce 8800 GTS Core Temperature Over Time


GeForce 8800 GTX Core Temperature Over Time

As you'd probably expect looking at the power consumption numbers, the GeForce 8800 GTS and GTX put off a substantial amount of heat. We monitored GPU temperatures over a 15-minute span with RTHDRIBL running at 1920x1200, and saw maximum temperatures of about 80°C for the GTS and 84°C for the GTX. If you're in the market for either one of these cards, we definitely recommend good case ventilation.

Lastly, we have some comments regarding the noise generated by the coolers used on the new GeForce 8800 GTS and GTX. Throughout our testing, the fans on both cards spun up after only a few minutes of gaming. The noise output wasn't bad though. We couldn't register a solid result on our aging sound level meter, but we can say that the 8800 GTX and GTS are perhaps a bit louder than a 7900 GTX. We definitely wouldn't categorize the fans as quiet when spun-up, but we don't think the noise output will be an issue for any gamer or enthusiast.

Our Summary and Conclusion

Performance Summary: NVIDIA's new GeForce 8800 GTS and GTX cards are mighty strong performers. Throughout our entire battery of benchmarks, both cards put up framerates at or near the top of the charts. The GeForce 8800 GTS outperformed a GeForce 7900 GTX in every test we ran. It did, however, miss the mark set by a Radeon X1950 XTX in a couple of high-resolution tests, and trailed a GeForce 7950 GX2 on a few occasions, but the features and enhanced image quality offered by the 8800 GTS offset any of these results in our opinion.

The GeForce 8800 GTX's performance was far more dominant. In every benchmark we ran, the GeForce 8800 GTX was clearly the best performer, and in some cases it doubled the performance of the previous generation's single-GPU powered cards. And it did so with superior image quality. There is simply no other consumer-level video card currently available that comes close to matching the performance of the GeForce 8800 GTX.

NVIDIA has taken a monumental step forward with the GeForce 8800 GTS and GTX. These new cards are superior to their predecessors in every meaningful way. The G80 GPU's Unified Architecture, with its 128 (GTX) or 96 (GTS) stream processors, delivered outstanding performance in every application we tested, whether it was based on DirectX or OpenGL. Not to mention that the G80 GPU supports all DirectX 10 features as well.

The new Lumenex Engine also provides many real-world, tangible benefits. The G80's new capabilities improve upon the previous generation of GPUs in terms of anti-aliasing and anisotropic filtering quality, and NVIDIA can now claim full support for HDR with AA, something they couldn't say with the G70. The GPU's full 10-bit display pipeline is also a welcome feature that will pay even more dividends once next-gen 10-bit displays become available. NVIDIA also promises that the raw floating point performance of the G80 will usher in an era of high-performance physics processing on the GPU, and CUDA will eventually bring even more capabilities to the G80 as highly parallel, data-intensive applications are compiled for execution on the GPU.

The GeForce 8800 GTX will be available immediately from multiple e-tail outlets for approximately $599. The GeForce 8800 GTS will be available right away as well, for about $449. Considering their performance, new features, and enhanced PureVideo capabilities, the GeForce 8800 GTX and GTS are sure to have many hardcore enthusiasts chomping at the bit. And we're told you won't have to worry about the recall that has been in the news the last few days. In fact, NVIDIA sent this over in an effort to put potential customers' minds at ease:

"Some recent reports on the web mention a BOM error (wrong resistor value) on initial GeForce 8800 GTX boards. All boards with this problem were purged from the production pipeline. Product on shelves is fully qualified and certified by NVIDIA and its board partners. We and our board partners stand behind these products and back them with our full warranty."

We can say that the samples we evaluated functioned perfectly throughout our testing, and we suspect even if a few faulty boards make it into the hands of consumers, they'll be repaired or replaced immediately.

So there you have it. The first fully unified, DX10 compliant graphics cards have arrived, and it's clear they're vastly superior to their predecessors. The GeForce 8800 GTX and GTS are the real deal, and there's nothing else on the market that even comes close to touching them in terms of features and performance. These babies are HOT and they rock.

Pros:
  • Extreme Performance
  • Unified Architecture
  • Full DX10 Support
  • Enhanced Image Quality
  • New AA Modes
  • Better Anisotropic Filtering
  • SLI Support

Cons:
  • Hefty Power Consumption
  • Runs Hot
  • GTX Requires 2 Power Leads

 
