|Introduction, Specifications, and Related Info|
A couple of weeks ago at an Editor's Day event at their headquarters in Santa Clara, California, NVIDIA proclaimed that they planned to "redefine reality" with their self-branded ultimate gaming platform. The products in the spotlight throughout the event were the new nForce 680i SLI chipset, the upcoming 680a SLI, and a new family of graphics cards built around the company's forthcoming DX10 capable G80 GPU.
The G80 at the heart of what became known as the GeForce 8800 GTX and 8800 GTS represented a complete shift in NVIDIA's GPU architecture. NVIDIA's Tony Tamasi even went so far as to say, "every transistor in the chip is new". Of course NVIDIA would leverage some technology from previous generations of products, but the DX10 compliant, unified architecture of the G80 is a major departure from the G70 GPU and its derivatives that power cards in the GeForce 7 series of products.
After hearing what NVIDIA had to say about the G80 and new nForce chipsets over the course of the event, the idea that the company had designed and built the ultimate gaming platform seemed like a distinct possibility, even for staunch PC enthusiast critics like us. We of course wouldn't pass judgment until we had the products to test for ourselves, however. Fortunately, the nForce 680i SLI and GeForce 8800 GTX and GTS were ready for testing almost immediately, and today we can tell you all about them.
We've talked at great length about the new nForce 600 series of chipsets, and more specifically about the nForce 680i SLI, in this article. And in this showcase and evaluation we'll be presenting you with information regarding NVIDIA's flagship GeForce 8800 GTX and 8800 GTS. Strap in folks. It's going to be a wild ride.
The GeForce 8800 GTX and GTS are based on a totally new unified GPU architecture, so they don't have too much in common with the older GeForce 7 series of products. It would be a good idea to familiarize yourself with NVIDIA's previous product offerings, and their platform as a whole, however. For a comprehensive look at the main features of the GeForce 7 series, and for more details regarding NVIDIA's multi-GPU SLI platform, we recommend taking a look at a few of our recent articles...
We know that's a lot of reading, but the information and performance data in the articles listed above will give you much of the background and architectural details necessary to better understand the new products being announced today. If you're unclear about anything on the following pages, look back to these articles for more related details.
The GeForce 8800 series GPU is a massive piece of silicon. It comprises roughly 681 million transistors and is manufactured on TSMC's 90nm process node. It implements a massively parallel, unified shader design, consisting of up to 128 individual stream processors in up to 8 groups of 16, running at frequencies of up to 1.35GHz. The GeForce 8800 GTX takes advantage of all 128 stream processors, but the GTS has two blocks disabled for a total of 96. And the unified nature of the design means each processor can be dynamically allocated to vertex, pixel, geometry, or physics operations, unlike traditional GPU architectures that feature discrete pixel and vertex shaders.
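As a rough illustration of what that arrangement adds up to, here's a back-of-the-envelope shader throughput estimate. This is only a sketch; it assumes NVIDIA's launch-era claim that each scalar unit can issue a MADD plus a co-issued MUL, i.e. 3 floating point operations per clock:

```python
# Back-of-the-envelope shader throughput for the GeForce 8800 series.
# Assumes each scalar stream processor retires a MADD (2 flops) plus a
# co-issued MUL (1 flop) per clock - an assumption based on NVIDIA's
# launch-era marketing figures, not a measured result.
def shader_gflops(stream_processors, shader_clock_ghz, flops_per_clock=3):
    return stream_processors * shader_clock_ghz * flops_per_clock

gtx = shader_gflops(128, 1.35)   # 8 clusters of 16 SPs, all enabled
gts = shader_gflops(96, 1.2)     # two clusters disabled on the GTS
print(f"8800 GTX: {gtx:.1f} GFLOPS")  # ~518.4
print(f"8800 GTS: {gts:.1f} GFLOPS")  # ~345.6
```

Real-world shader throughput obviously depends on instruction mix, but the peak numbers give a sense of the scale of the design.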
Each GeForce 8800 GPU stream processor is a fully generalized, fully decoupled, scalar processor that supports IEEE 754 floating point precision. The advantages of being fully scalar are well summed up in this quote provided by NVIDIA:
All of the stream processors in the GPU are driven by a high-speed clock domain that is separate from the core clock that drives the rest of the chip. For example, the GeForce 8800 GTX core clock is 575MHz and its stream processors run at 1.35GHz. The GeForce 8800 GTS has a core clock of 500MHz, but its stream processors are clocked at 1.2GHz.
The GeForce 8800 series GPU also has six memory partitions that each provide a 64-bit interface to memory, yielding a 384-bit combined interface width on the GTX. One of the memory partitions is disabled in the GTS, which yields a 320-bit memory interface. The memory subsystem implements a high-speed crossbar design, similar to GeForce 7 series GPUs, and supports DDR1, DDR2, DDR3, GDDR3, and GDDR4 memory. The GeForce 8800 GTX uses GDDR3 memory on its 384-bit (48 byte-wide) interface running at 900MHz (1800MHz DDR) - that equates to 86.4GB/sec of peak bandwidth. Yikes.
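That 86.4GB/sec figure falls out of simple arithmetic - bus width in bytes multiplied by the effective data rate. A quick sketch (the GTS line assumes the reference 800MHz/1600MHz-effective GDDR3 clock):

```python
# Peak memory bandwidth = (bus width in bytes) x (effective data rate).
def bandwidth_gb_s(bus_width_bits, effective_mhz):
    return (bus_width_bits / 8) * effective_mhz / 1000  # GB/s

gtx = bandwidth_gb_s(384, 1800)  # 6 x 64-bit partitions, 900MHz GDDR3 (DDR)
gts = bandwidth_gb_s(320, 1600)  # 5 partitions; assumes 800MHz reference clock
print(gtx)  # 86.4
print(gts)  # 64.0
```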
Texture filtering units are also fully decoupled from the stream processors. The GeForce 8800 series GPU can deliver up to 64 pixels per clock worth of raw texture filtering horsepower (vs. 24 in the GeForce 7900 GTX), 32 pixels per clock worth of texture addressing, 32 pixels per clock of 2X anisotropic filtering, and 32 bilinear-filtered pixels per clock.
The GeForce 8800 GTX has six Raster Operation (ROP) partitions (the GTS has 5). Each partition can process 4 pixels with 16 sub-pixel samples, or a total of 24 pixels/clock with color and Z processing. For Z-only processing, an advanced new technique allows up to 192 samples/clock to be processed when a single sample is used per pixel. If 4x multi-sampled anti-aliasing is enabled, then 48 pixels per clock Z-only processing is possible.
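The per-clock figures above are easy to sanity-check with a little arithmetic (the 192 samples/clock Z-only peak is as NVIDIA describes it for the full six-partition GTX):

```python
# ROP throughput arithmetic for the GeForce 8800 GTX, per the figures above.
def rop_rates(partitions, z_samples_per_clock):
    color_z_pixels = partitions * 4              # 4 pixels per partition
    z_only_no_aa = z_samples_per_clock           # 1 Z sample per pixel
    z_only_4x_msaa = z_samples_per_clock // 4    # 4 Z samples per pixel
    return color_z_pixels, z_only_no_aa, z_only_4x_msaa

print(rop_rates(6, 192))  # (24, 192, 48) pixels/clock
```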
Another new feature inherent to the GeForce 8800 series GPU is dubbed Early Z. Z comparisons for individual pixel data have generally occurred late in the graphics pipeline, in the ROP. The problem with evaluating individual pixels in the ROP is that they have already traversed nearly the entire pipeline. If the pixel ends up being occluded, that's a waste of GPU resources and bandwidth. With complex shader programs that have hundreds or thousands of processing steps, that's a lot of processing that can be wasted on pixels that will never be displayed.
To somewhat alleviate this issue, the GeForce 8800 employs an Early Z technique that tests the Z values of pixels before they enter the shader pipeline. The result is that a GeForce 8800 GTX GPU can cull pixels at four times the speed of a GeForce 7900 GTX.
We'll cover the individual specifications of the new GeForce 8800 GTX and 8800 GTS cards being announced today a little later on, but we thought we'd give you a high level breakdown before discussing some more of the other advanced features offered by NVIDIA's latest flagship GPU. Due to the scalable nature of the GPU design, functional blocks can be disabled, yielding a GPU with different performance characteristics.
|Unified Shaders, DX10, and SM 4.0|
One of the G80 GPU's main benefits is its full support for DirectX 10, Shader Model 4.0, and the other features inherent to Microsoft's upcoming API. In addition to the increased performance offered by the G80's architecture, DirectX 10 itself is poised to offer major performance benefits over DirectX 9. It will do this by significantly reducing the CPU overhead required for rendering. DirectX 10 addresses DX9's CPU overhead problems in a number of ways. For example, the cost of draw calls and state changes is reduced through a complete redesign of the performance-critical parts of the core API. Also, new features have been introduced to reduce CPU dependence and to allow more work to be done in one command.
With DirectX 10, Microsoft will also be introducing Shader Model 4.0, which incorporates many key innovations like a new programmable stage called the geometry shader that allows for per-primitive manipulation. DX10 will also provide a new unified shading architecture with a unified instruction set and common resources across vertex, geometry, and pixel shaders. DX10's specifications are also more rigidly defined, so you won't see DirectX 10 class hardware that lacks key features. Some DX9 class GPUs were labeled as DirectX 9 compliant when they did not support all DX9 features, like vertex texture fetch for example.
Stream output is another useful new DirectX 10 feature supported in GeForce 8800 GPUs that enables data generated from geometry shaders (or vertex shaders if geometry shaders are not used) to be sent to memory buffers and subsequently forwarded back into the top of the GPU pipeline to be processed again. Allowing data to flow through the GPU this way allows for more complex geometry processing, advanced lighting calculations, and GPU-based physical simulations without heavily taxing the host CPU.
Shader Model 4.0 also provides an increase in the resources allotted for shader programs. In previous versions of DirectX, developers had to manage relatively scarce register resources. DirectX 10, however, provides a large increase in register resources. As you can see in the chart above, temporary registers are up from 32 to 4096, and constant registers are up from 256 to 65,536 (sixteen constant buffers of 4096 registers). Textures, texture sizes, and the number of render targets have increased as well. The GeForce 8800 architecture can provide all of these DirectX 10 resources.
The GeForce 8800's unified architecture also results in a more efficient use of GPU resources. With the previous generation of GPUs that had discrete pixel and vertex shaders, there would almost always be idle hardware. If a scene was particularly pixel shader heavy, for example, the vertex shaders may have sat idle, and vice versa in the opposite scenario.
But with a unified shader architecture, because GPU resources can be allocated on the fly and dynamically load-balanced, major portions of the GPU won't sit idle waiting for a shift in the workload. At its most basic level, a unified shader architecture makes rendering more efficient.
|The Lumenex Engine|
With the GeForce 8800 series architecture, NVIDIA is introducing their new "Lumenex Engine". The Lumenex engine is the name NVIDIA has come up with to describe a host of features integrated in the G80 GPU. The new key features in the Lumenex Engine include 16x Coverage Sampling Anti-Aliasing (CSAA), 16x nearly angle independent anisotropic filtering, 16-bit and 32-bit floating point texture filtering, fully orthogonal 128-bit High Dynamic Range (HDR) rendering with all the above features, and a full 10-bit display pipeline.
CSAA and Angle Independent Anisotropic Filtering:
Coverage Sample Anti-Aliasing is a new anti-aliasing technique that increases image quality, without drastically increasing the demands placed on the GPU or the memory subsystem. In its standard modes, CSAA compresses the redundant color and Z/stencil information into the memory footprint and bandwidth of 4X multi-sample AA. In its higher quality modes (8xQ and 16xQ), CSAA compresses the information in the footprint and bandwidth of 8X multi-sample AA. Previous generations of NVIDIA GPUs could only do 4X MSAA in hardware.
The image above demonstrates the difference between no anti-aliasing, traditional 4X multi-sample AA, and 16X CSAA. In the "NO AA" portion of the image there are sharp, jagged edges. 4X MSAA does a nice job of softening the edge, but the gradient steps are still clearly visible. The 16X CSAA portion of the image takes things even further, though, and the gradient steps are much less apparent.
The GeForce 8800 GPUs also feature a new, high-quality anisotropic filtering engine that eliminates the angle dependent optimizations used in previous GPUs. We talk more about the 8800's new anisotropic filtering and anti-aliasing modes a little later.
|CUDA, the Demos, and HDR With AA|
With the GeForce 8800 series architecture, NVIDIA is also announcing their "CUDA" initiative. CUDA is an acronym for Compute Unified Device Architecture. All GeForce 8800 GPUs will support NVIDIA's CUDA, which provides a unified hardware and software solution for data-intensive computing.
CUDA's main features include a new "Thread Computing" processing model that takes advantage of the heavily threaded nature of the GeForce 8800 GPU architecture.
CUDA basically encompasses all GPGPU functionality, including NVIDIA's Quantum Effects physics technology. Quantum Effects allows physics effects to be simulated and rendered on the GPU. The GeForce 8800 GPU's stream processors will eventually be used to implement more realistic water, smoke, fire, hair, explosion, and particle effects in games that take advantage of the technology. And because these computations are being run on the GPU, the host CPU is freed to run the game engine and AI.
To take advantage of CUDA, NVIDIA will be releasing a compiler that will allow standard C code to be executed on the GPU. Of course, only certain types of applications will benefit from being run on the GPU, namely those that require massive amounts of floating point performance, like Folding@Home for example. The GPU's architecture will complement traditional general purpose CPUs by providing additional processing capability for inherently parallel applications. CUDA technology utilizes GPU resources in a different manner than graphics processing, but both CUDA threads and graphics threads can run on the GPU concurrently if desired. Because the architecture is unified, GPU resources can be dynamically allocated for pixel, vertex, or geometry shader duties, in addition to CUDA or Quantum Effects related processing tasks.
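To give a feel for the "thread computing" model, here's an illustrative sketch (in Python, not actual CUDA code) of the class of workload CUDA targets: a small kernel applied independently to every element of a dataset. On the GPU, each element would map to its own lightweight thread; the `launch` helper below is a hypothetical stand-in for a kernel launch:

```python
# Illustrative sketch of the data-parallel "thread computing" model.
# Each "thread" computes one output element independently - the property
# that lets the GPU's stream processors run them all concurrently.
def saxpy_kernel(i, a, x, y):
    # One thread handles element i, with no dependence on other elements.
    return a * x[i] + y[i]

def launch(kernel, n, *args):
    # On a GPU these n invocations would execute in parallel across the
    # stream processors; here they simply run sequentially in a loop.
    return [kernel(i, *args) for i in range(n)]

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
print(launch(saxpy_kernel, 3, 2.0, x, y))  # [12.0, 24.0, 36.0]
```

Workloads with this independent, element-wise structure are exactly the kind that should scale across the 8800's 128 stream processors.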
With every new GPU, NVIDIA inevitably releases a handful of technology demos designed to exploit the new features and capabilities of the product. This time around, NVIDIA did away with the fairy-tale characters and enlisted the help of Playboy model Adrianne Curry to show off the 8800 series. The Waterworld and Froggy demos also show off the Geometry Shader and Stream Out capabilities of the GeForce 8800. Prior to now, the geometry necessary to create the objects in the Waterworld demo would have been executed on the CPU. But with the 8800, growing vines, and the particle effects for the water are all tasked to the GPU. The manipulation of Froggy's body is also being computed on the GPU, something that could not have been done before.
There's no question, the lush cinematic worlds of Oblivion are a showcase of the current capabilities of DX9-based game engines. However, until now, High Dynamic Range (HDR) lighting effects, combined with full scene anti-aliasing, have only been available to users of ATI graphics cards, since NVIDIA's legacy products do not support the necessary features in hardware. Valve's proprietary method of 16-bit floating point driven HDR actually works on NVIDIA-based cards, but the Source engine was the only one to utilize that technique. With NVIDIA's new GeForce 8800 series architecture, all standard methods of HDR, including FP16 and FP32, are supported with both multi-sample and super-sample AA enabled.
As you can see, the effect is quite impressive and though these screen shots were taken at high resolution with 8X AA enabled, frame rates were still more than playable for this game title. As you'll note, our FRAPS frame counter, in the top left corner of the screen, shows a range of 39 - 60 fps. If you've spent time in the worlds of Oblivion, you'll know this is completely fluid for game play. Though the game itself was unable to detect the capabilities of our GeForce 8800 GTX and enable AA with HDR in the game engine, we turned on HDR in Oblivion's options menu and then forced 8X AA on in NVIDIA's driver control panel. On a side note, at XHD resolutions of 2560X1600, 2X AA with HDR enabled was also very playable. XHD gaming nirvana indeed.
|The GeForce 8800 GTX|
After reading through the architectural details and new features found in the G80, we're sure you're all wondering what the GeForce 8800 GTX and 8800 GTS cards actually look like, so without further ado we present to you the GeForce 8800 GTX...
NVIDIA's new flagship graphics card is a beast in every sense of the word. The card is built upon a 10.5" long PCB and the GPU and RAM are adorned with a massive dual-slot cooling apparatus. The cooler is outfitted with a fan that is designed to draw air in from the back and blow it across the heatsink's fins, where it is ultimately expelled from the system through vents in the case bracket. There are, however, some vents cut in the fan shroud towards the front of the card, which also aid in bringing temperatures down.
The 8800 GTX reference specifications call for a G80 GPU clocked at 575MHz with 768MB of RAM clocked at 1.8GHz. Due to the GTX's 384-bit memory bus, cards are equipped with 12, 32-bit DRAM chips, which all reside on the front side of the PCB.
Another interesting aspect with regard to the PCB is that it has two SLI edge connectors along the top. NVIDIA hasn't disclosed any specific information about what the second SLI connector could be used for, but when we asked about it, we did receive this response:
"The second SLI connector on the GeForce 8800 GTX is hardware support for potential future enhancements in our SLI software functionality. With the current drivers, only one SLI connector is actually used. Users can plug the SLI connector into either the right or left set of SLI fingers."
The GeForce 8800 GTX is also equipped with a pair of 6-Pin PCI Express power receptacles. Overall, NVIDIA has stated the 8800 GTX consumes a maximum of 185W, and the company recommends a 450W PSU that can supply 30A on its 12V rails.
With the cooler removed, the large G80 GPU is exposed, along with a second ASIC behind the DVI outputs. For GeForce 8800 GPUs, NVIDIA decided to put the TMDS and other display logic into a custom, discrete ASIC. This was done to simplify package and board routing as well as for manufacturing efficiencies. While on the subject of display logic, we should also mention that GeForce 8800 GTX cards have dual, dual-link DVI outputs in addition to a TV/HD output.
We also received a couple of retail-ready GeForce 8800 GTX cards prior to launch and wanted to showcase them for you here.
Leadtek's GeForce 8800 GTX conforms to NVIDIA's reference specifications in virtually every way. Underneath the custom fan shroud decal is a card identical to the one pictured above. Leadtek will be bundling their GeForce 8800 GTX card with an assortment of software and accessories that includes a pair of PCI Express power adapters, a DVI to DB15 VGA adapter, an HD component output dongle, a user's manual and installation guide, and a variety of software on CDs. The software complement included copies of PowerDVD, and the games SpellForce 2 and Trackmania Nations.
Asus' GeForce 8800 GTX card is also virtually identical to NVIDIA's reference design. The only differentiating physical feature is an "Asus" decal on the center of the fan. We'll be looking at both of these cards in upcoming articles here at HotHardware.
Finally, Foxconn's GeForce 8800 GTX offering also follows NVIDIA's reference design, but the company is trying to differentiate with its add-on bundle. The board will come with a bonus USB gamepad controller that is actually of very high quality and compatible with many current games.
|The GeForce 8800 GTS|
The GeForce 8800 GTS shares many of the same features as the 8800 GTX, but the two cards differ in a number of ways.
For one, the 8800 GTS is built upon a shorter 9" PCB. The card also requires less power; NVIDIA recommends a 400W PSU that can supply 26A on its 12V rails. As such the GTS has only one 6-Pin PCI Express power receptacle. The GTS also has only a single SLI edge connector, so at some point in the future the GTX is likely to offer a few additional features when running in SLI mode.
We actually received a retail-ready EVGA e-GeForce 8800 GTS for the purposes of this article. Underneath the card's cooler, which is identical to the one used on the GTX, lies a G80 GPU clocked at 513MHz and 640MB of GDDR3 memory clocked at 1584MHz. Please note that the GTS has "only" 96 streaming processors enabled in the GPU, and its memory has a 320-bit interface, as opposed to 384-bits on the GTX. The 320-bit memory interface means the GTS is outfitted with 10, 32-bit DRAMs. The PCB does have pads for 12, however. So, there is a possibility that future, unannounced GeForce 8800 series cards with 384-bit memory interfaces may use this PCB design.
EVGA bundles their e-GeForce 8800 GTS with a nice assortment of accessories and software. Included in the box along with the card itself were a pair of DVI to DB15 VGA monitor adapters, an HD component output dongle, an S-Video cable, a Molex to 6-Pin PCI Express power adapter, a user's manual, some EVGA decals, and a couple of CDs. One disc contained the obligatory drivers, while the other contained a full version of the brand-new game Dark Messiah. Dark Messiah is a great title to showcase some of the capabilities of this card. Many thanks to EVGA for throwing it in with their GTS.
|Anisotropic Filtering Quality and Performance|
NVIDIA has claimed that the G80 at the heart of the GeForce 8800 GTS and GTX offers unsurpassed image quality. And so, prior to benchmarking the new cards, we spent some time analyzing the 8800 GTX's in-game image quality versus a Radeon X1950 XTX and NVIDIA's previous flagship GeForce 7900 GTX. First, we used Half Life 2: Episode 1's "background_01a" map to get a feel for how each card's anisotropic filtering algorithms affected the scene and we also fired up the D3D AF Tester to get a clear visual representation of the angular dependency of each architecture.
As you can see in the screen-shots above, as the level of anisotropic filtering is increased, the clarity and sharpness of the ground texture is enhanced. If we compare the quality of the images produced with each card, it's difficult to pick one that is clearly superior, but there is definitely more subtle detail in the captures grabbed with the GeForce 8800 GTX. If you focus your attention on the cracks in the ground in the distance, you'll be able to pick up some of the differences.
The images captured with D3D AF Tester also show the GeForce 8800 GTX's strengths. The 8800 GTX has almost no angular dependency and produces smooth transitions, in an almost circular pattern. The Radeon X1950 XTX also does a great job with anisotropic filtering, but if you open the 16X aniso shots taken with the D3D Tester side-by-side you'll see the 8800 produces the superior pattern.
What the above screen-shots don't show is that the texture shimmering issue that plagued the G70 is completely gone. The G80's new filtering capabilities have eliminated the texture shimmering present with older architectures, which makes gaming much easier on the eyes.
To get an idea as to how increasing the level of anisotropic filtering in a game affected performance, we cycled through every available setting using our custom FarCry benchmark with the GeForce 8800 GTX and 8800 GTS. As the results show, anisotropic filtering is almost "free" on the G80. As the level of anisotropic filtering was increased, performance dropped off only slightly with either card.
|Image Quality: Anti-Aliasing & CSAA|
As we've already mentioned, the G80 GPU at the heart of the GeForce 8800 GTX and GTS cards offers new anti-aliasing modes courtesy of the Lumenex Engine. With the G80, NVIDIA designed an anti-aliasing engine that employs a proprietary algorithm called Coverage Sampling Anti-Aliasing (CSAA). Unlike some older multi-sampling techniques, Coverage Sampling Anti-Aliasing uses intelligent color and Z sample information to perform anti-aliasing while reducing the load placed on the memory system. With CSAA, NVIDIA raised the total number of samples that could be taken per-pixel to 16, as opposed to 4 on the G71.
To see how the GeForce 8800 GTX performed in regard to anti-aliasing, we fired up Half Life 2 and captured a few screen-shots at the various anti-aliasing modes available. We also did the same with a GeForce 7900 GTX and a Radeon X1950 XTX. Please pay special attention to the labels and the file names when clicking through the images above though, as only the 4X anti-aliasing shots will represent an apples-to-apples-to-apples comparison between the three cards.
If you flip through the shots, the first thing you're likely to notice is a slight rendering bug on the 8800 that causes a problem with the lighting on some of the buildings and trees. We're confident this will be fixed in a future driver release so we won't dwell on it. What's more important to focus on are the gradients on the cables that span the top of the screen, and the fine details in the antennas atop the buildings. As the AA levels are increased, the GeForce 8800 GTX does a great job of reducing the jaggies, and the 8800 also seems to better preserve some fine detail.
To quickly assess the performance impact enabling CSAA had on frame-rates in a couple of popular games, we ran a handful of tests with F.E.A.R. and Prey using the GeForce 8800 GTX and GTS. We started with 4X anti-aliasing enabled, and cycled through the other modes offered with both games running at 1600x1200 with 16X anisotropic filtering enabled.
Jumping from 4X to 8X anti-aliasing with either card resulted in an approximate 20% to 30% performance drop, but from there on up, performance remained relatively stable until we hit the maximum 16xQ anti-aliasing mode. You may be asking yourself how this can be possible, as moving from 8X to 8xQ and ultimately 16X AA results in roughly equivalent performance. This is due to the Lumenex Engine's ability to compress the redundant color and depth/stencil information into the memory footprint and bandwidth of 4 or 8 multi-samples. In fact, 8X AA and 16X AA both store only 1 texture sample and 4 color/Z samples. The two modes differ only in the number of coverage samples taken, which doesn't have as much of an impact on performance. 16xQ anti-aliasing on the other hand stores double the number of color/Z samples (8), hence the additional performance drop off.
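A rough per-pixel storage sketch shows why the coverage-only modes come so cheap. The 32-bit color/Z sample sizes and the few bits per coverage sample are our assumptions for illustration, not disclosed NVIDIA figures:

```python
# Rough per-pixel framebuffer footprint under CSAA. Assumes 32-bit color
# plus 32-bit Z per stored sample, and a handful of bits per coverage
# sample - illustrative guesses, not official NVIDIA numbers.
def footprint_bytes(color_z_samples, coverage_samples, bits_per_coverage=4):
    color_z = color_z_samples * (4 + 4)            # full color + Z per sample
    coverage = coverage_samples * bits_per_coverage / 8
    return color_z + coverage

for mode, cz, cov in [("4X MSAA", 4, 0), ("8X CSAA", 4, 8),
                      ("16X CSAA", 4, 16), ("16xQ CSAA", 8, 16)]:
    print(mode, footprint_bytes(cz, cov), "bytes/pixel")
```

Under these assumptions, stepping from 4X MSAA to 16X CSAA adds only a few bytes per pixel, while 16xQ's doubled color/Z storage roughly doubles the footprint - which lines up with the performance behavior we measured.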
|Our Test System and 3DMark06|
HOW WE CONFIGURED THE TEST SYSTEMS: We tested all of the graphics cards used in this article on an EVGA nForce 680i SLI based motherboard powered by a Core 2 Extreme X6800 dual-core processor and 2GB of low-latency Corsair RAM. The first thing we did when configuring the test system was enter the BIOS and set all values to their default settings. Then we manually configured the memory timings and disabled any integrated peripherals that wouldn't be put to use. The hard drive was then formatted, and Windows XP Pro with SP2 and the October DX9 update was installed. When the installation was complete, we then installed the latest chipset drivers available, installed all of the other drivers necessary for the rest of our components, and removed Windows Messenger from the system. Auto-Updating and System Restore were also disabled, the hard drive was defragmented, and a 1024MB permanent page file was created on the same partition as the Windows installation. Lastly, we set Windows XP's Visual Effects to "best performance," installed all of the benchmarking software, and ran the tests.
In terms of 3DMark06's general scoring metric, we see a shadow of things to come in our real-world game engine tests. A GeForce 8800 GTS is roughly as fast as a GeForce 7950 GX2 and significantly faster than a Radeon X1950 XTX. Then of course, the Grand-Daddy here is the new GeForce 8800 GTX, which easily broke the 10K 3DMark threshold, a first for any single GPU configuration that has ever hit our labs.
Perhaps the most interesting data point here is that the GeForce 8800 GTS loses slightly to the GeForce 7950 GX2 in Shader Model 2.0 performance and edges out the GX2 in SM 3.0 performance. We'd offer that perhaps this is a testament to the new GeForce 8800 series, in terms of its shader engine capabilities moving forward with leading-edge game titles employing more complex shader instructions and effects. A final observation is that the GeForce 8800 GTX, according to 3DMark06, is roughly 50% more powerful with SM 3.0 workloads than NVIDIA's previous single-GPU flagship card, the 7900 GTX.
|Half Life 2: Episode 1|
Historically a strong suit for ATI-based graphics cards, we begin our standard benchmark testing with Valve's Half Life 2: Episode 1.
Both of the new GeForce 8800 series cards prove themselves to be the fastest single graphics cards at any resolution in Half Life 2: EP1. They're even able to beat out the dual-GPU configuration of the GeForce 7950 GX2 here and handily take out ATI's current flagship Radeon X1950 XTX. In fact, the GTS is some 32% faster than a Radeon X1950 XTX at high resolutions, and the GTX comes in nearly 68% faster. Jaw dropping performance to be sure, but we need to step up the workload a notch or two as well.
|FarCry v1.4 Performance|
Though slightly on the dated side, Far Cry still runs on a fairly robust DX9 game engine. Testing with our custom FC time-demo with a fully patched version of the game is next.
Far Cry turned out to be about the same level of challenge for the GeForce 8800 series as was Half Life 2: Episode 1 on the previous page, but in this case the Radeon X1950 XTX had a much easier time with our custom demo and actually managed to nearly match the performance of the GeForce 8800 GTS. Then our dual-GPU infused GeForce 7950 GX2 took second place by a comfortable margin. And in the pole position, the new GeForce 8800 GTX bested the two top single GPU cards by over 30 frames per second and even managed a 15% performance gain over the GeForce 7950 GX2 at high resolution.
|F.E.A.R. v1.08 Performance|
F.E.A.R. is definitely a relatively taxing and impressive game engine with a very realistic particle system and a great physics engine in comparison to many other games currently on the market.
Here our frame-rates are much more subdued and in fact at high resolution with 4X AA enabled, even a powerful card like the GeForce 7900 GTX gets a little pokey. In this test, the GeForce 8800 GTS again edges out the recently released Radeon X1950 XTX and once again the GeForce 8800 GTX reigns supreme over the entire lot, including the dual-GPU powered GeForce 7950 GX2. In terms of single GPU performance, the GeForce 8800 GTX is 40 - 45% faster than a Radeon X1950 XTX.
|Quake 4 v1.3 Performance|
One of the most widely used and re-purposed OpenGL game engines on the market; Quake 4 is next...
Though OpenGL performance and Quake 4 have always been strong points for NVIDIA-based cards, the Radeon X1950 XTX does put up a solid showing here, actually besting NVIDIA's legacy single GPU card, the GeForce 7900 GTX. However, ATI's fastest is still no match for the new GeForce 8800 series powerhouses and both walk off with decisive victories. In fact the GeForce 8800 GTS is actually able to keep pace with the GeForce 7950 GX2. Finally, witnessing the GeForce 8800 GTX push out over 135 frames per second at 1600X1200 with 4X AA enabled borders on insanity. Clearly if you're going to step up for the power and cost of a GeForce 8800 GTX, you better have a high-resolution LCD panel or CRT to go with it or we may have to hunt you down and ridicule you publicly.
|Prey v1.2 Performance|
Take-Two's Prey is a game based on the Doom 3 engine. Like Quake 4 it also places a bit more strenuous demand on the graphics subsystem than many other titles.
Once again GPU-for-GPU, the new GeForce 8800 series from NVIDIA shows itself to be considerably faster in single GPU configurations, than anything on the market currently. The GeForce 8800 GTS clocks in over 15% faster than a Radeon X1950 XTX and a shade under the performance of a GeForce 7950 GX2, in our custom Prey benchmark. The GeForce 8800 GTX is even 20% faster, in this game title, than a dual-GPU powered GeForce 7950 GX2. And at 1600X1200 with 4X AA enabled, Prey at 100+ fps is a whole barrel of fun.
|Need For Speed - Carbon|
In an effort to mix things up a bit and get as far away from the first person shooter genre as possible, we have EA's Need For Speed: Carbon on tap next. A jacked up, pimped out racing simulation with plenty of eye candy, NFS: Carbon should push these cards a bit more to the point where even a new GeForce 8800 series GPU breaks a sweat.
Talk about a whole new world. Whether you consider our Prey, F.E.A.R., or Half Life 2: Episode 1 tests, nothing put the hurt on these new graphics cards like NFS Carbon. Of course, you don't need blistering fast frame rates to play this great new racing sim either. Our first observation is that NVIDIA has some driver work to do to get SLI working with the game; the GeForce 7950 GX2 must have been having major issues with its dual-GPU setup, since it couldn't even beat out a GeForce 7900 GTX. Beyond that, the new Radeon X1950 XTX puts out a solid performance but can't quite catch the drift of a GeForce 8800 GTS. And of course, at the risk of sounding mildly trite, the GeForce 8800 GTX leaves all other competitors in its dust. This new flagship monster GPU from NVIDIA is over 40% faster than the fastest ATI currently has to offer in our Need For Speed: Carbon testing.
|Battlefield 2142 Performance|
Though not exactly a showcase of GPU horsepower and capability, there is little doubt that Battlefield 2142 is going to become a hugely popular multi-player first person shooter. Thus we've put our new GeForce 8800 series cards through their paces with the new EA title as a relevant reference point for battle-hardened readers.
If you're looking for a clear, decisive performance edge with BF 2142, we'd suggest you focus more on the amount of system memory installed (we recommend 2GB) and perhaps your CPU, rather than the GPU. Regardless, the flat-out fastest card in this mostly CPU-bound test is once again the GeForce 8800 GTX. Both the 8800 GTS and GTX are able to take ATI's Radeon X1950 XTX to task and beat it handily. In fact, the GTX is able to edge out the ever-potent GeForce 7950 GX2 as well.
|XHD Resolutions: HL2 Episode 1|
For the next round of testing, we've upped the ante and re-tested all of the graphics cards at XHD resolutions with a handful of games.
Once again, NVIDIA's new GeForce 8800 series cards are able to outpace all of the competition in almost every test configuration. The GeForce 8800 GTS was ever so slightly slower than the GeForce 7950 GX2 at the higher resolution, likely due to the GX2's larger frame buffer (1GB vs. 640MB), but in the other three tests the GTS and 8800 GTX were dominant. In fact, the GeForce 8800 GTX was about twice as fast as the former single-GPU powered kingpins, the GeForce 7900 GTX and Radeon X1950 XTX.
|XHD Resolutions: F.E.A.R.|
At a resolution of 2560x1600, the F.E.A.R. benchmark is able to slow almost all of the graphics cards we tested to a virtual crawl, with the exception of the GeForce 8800 GTX that is. At the lower resolution, things are somewhat competitive with the 8800 GTS coming in between the GX2 and Radeon X1950 XTX, and at 2560x1600, the GTS is actually outpaced by the X1950 XTX, likely due to the latter's super-fast 2GHz frame buffer. The GeForce 8800 GTX is simply in a league of its own, however. At both resolutions it crushes all of the competition by margins ranging from about 10% to a whopping 110%.
|XHD Resolutions: Quake 4|
With a Core 2 Extreme X6800 powering the system and the multi-threaded v1.3 patch installed, all of the cards we tested, with the exception of the GeForce 7900 GTX perhaps, put up playable framerates at both of the XHD resolutions we tested. The GeForce 8800 GTS finished just behind the GeForce 7950 GX2 at both resolutions, and missed the mark set by the Radeon X1950 XTX by about 10% at 2560 x 1600, but its performance was clearly superior to the 7900 GTX. The new GeForce 8800 GTX on the other hand performed extraordinarily in comparison to all of the other cards. Its large frame buffer and higher clocks (relative to the 8800 GTS) propelled the 8800 GTX to the top of the charts by large margins.
|XHD Resolutions: Prey|
The results reported by our custom Prey benchmark at XHD resolutions somewhat mirror those reported by Quake 4 on the previous page. The GeForce 8800 GTS came in ever so slightly behind the Radeon X1950 XTX at the higher resolution, and only the GX2 and 8800 GTX were faster than it at 1920 x 1200. Once again though, the new GeForce 8800 GTX put up one heck of a dominant performance, besting all of the competition by large double-digit percentages across the board.
|PureVideo Features and Performance|
For our next round of tests we took a look at digital video processing performance between the two competing core GPU architectures, with "PureVideo" technology at work for NVIDIA and "AVIVO" driving ATI.
To characterize CPU utilization when playing back WMV HD content, we used the Performance Monitor built into Windows XP. Using the data provided by Windows Performance Monitor, we created a log file that sampled the percent of CPU utilization every second, while playing back the 1080p version of the "Amazing Caves" video available for download on Microsoft's WMVHD site. The CPU utilization data was then imported into Excel to create the graph below. The graph shows the CPU utilization for a GeForce 7900 GTX, a Radeon X1950 XTX, and the GeForce 8800 GTX using Windows Media Player 11, with XP patched using the DXVA updates posted on Microsoft's web site (Updates Available Here). The desktop resolution was set to 1920 x 1200 for these tests.
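The logging methodology above can be sketched in a few lines of code. This is a minimal illustration of the same idea in Python, not the tool we used (we used Windows Performance Monitor and averaged the log in Excel): take one CPU-utilization reading per second for the length of the clip, then average the samples. The `sample_fn` parameter is any callable that returns the current CPU usage percentage, e.g. `psutil.cpu_percent` from the third-party psutil package.

```python
import time

def log_cpu_utilization(sample_fn, num_samples, interval_s=1.0):
    """Collect one CPU-utilization reading per interval, like a perfmon log.

    sample_fn: callable returning the current CPU usage as a percentage
               (for example, psutil.cpu_percent from the psutil package).
    """
    samples = []
    for _ in range(num_samples):
        samples.append(sample_fn())
        time.sleep(interval_s)
    return samples

def average_utilization(samples):
    """Average the per-second readings, as we did with the perfmon log in Excel."""
    return sum(samples) / len(samples)
```

With roughly one sample per second of playback, a two-minute clip yields a log of about 120 readings, and the single averaged figure is what appears in charts like the one below.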
One of the more interesting things to ever happen in the HH labs took place during our CPU utilization testing for this article. As you can see in the graph above, all three of the cards handled the Amazing Caves video without much of a problem. No card ever went over the 25% CPU utilization mark. With a GeForce 7900 GTX in the test system, roughly 16.5% of CPU resources were used during the playback of this HD video. With both the Radeon X1950 XTX and GeForce 8800 GTX installed, though, exactly 16.0065407% of the CPU's resources were required. We carried the result out to seven decimal places to show just how wild this result was. With 86 samples recorded during the video playback for each GPU, the results averaged out to the exact same value. Any mathematicians in the audience? What are the odds of that happening again?
Next up, we have the HQV DVD video benchmark from Silicon Optix. HQV is comprised of a sampling of SD video clips and test patterns that have been specifically designed to evaluate a variety of interlaced video signal processing tasks, including decoding, de-interlacing, motion correction, noise reduction, film cadence detection, and detail enhancement. As each clip is played, the viewer is required to "score" the image based on a predetermined set of criteria. The numbers listed below are the sum of the scores for each section. We played the HQV DVD using the latest version of InterVideo's WinDVD 7 Platinum Suite, with hardware acceleration and PureVideo extensions enabled.
NVIDIA's latest Forceware drivers give the GeForce 7900 GTX a nice boost in performance in this test, and give the GeForce 8800 series of cards a slight edge over ATI's Radeon X1950 XTX. The only differences between the GeForces and the Radeon, however, were in the Noise Reduction tests, where we gave the NVIDIA-powered cards a slight advantage. In all honesty though, if HQV's scoring guidelines allowed it, we'd probably give ATI a 7.5 on the NR tests, and NVIDIA an 8. But HQV doesn't allow this. The output from either architecture is really that close.
|Preliminary SLI Testing|
SLI was not quite ready for prime-time with the initial driver release NVIDIA provided to analysts (v96.94), but a few days ago NVIDIA came through with an updated driver that was SLI capable (v96.97). Some functions are still currently disabled, SLI AA for example, but the driver was stable and worked perfectly throughout a short run of various benchmarks.
We ran a handful of benchmarks using a pair of GeForce 8800 GTS and GeForce 8800 GTX cards and have a quick comparison available below. When looking at the graphs, please note that we're comparing single-card performance versus SLI, with the GeForce 8800 GTS and 8800 GTX numbers broken out into two separate graphs. All of the tests were run at the same resolution and settings (1920x1200 | 4X AA / 16X Aniso), with the exception of the 3DMark06 test, which represents a default benchmark run (1280x1024).
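When we talk about how well SLI "scales," we mean the percentage speedup a second GPU delivers over a single card in the same test. A quick sketch of that calculation (the FPS figures in the example are hypothetical, purely for illustration, not our measured results):

```python
def sli_scaling(single_gpu_fps, sli_fps):
    """Speedup from adding a second GPU, as a percentage.

    0.0 means no benefit; 100.0 means perfect 2x scaling.
    """
    return (sli_fps / single_gpu_fps - 1.0) * 100.0

# Hypothetical example: 50 fps single-card, 95 fps in SLI
print(round(sli_scaling(50.0, 95.0), 1))  # -> 90.0 (i.e., 90% scaling)
```

Anything approaching 100% is excellent; driver overhead and CPU limits usually keep real-world numbers below that mark.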
Performance scaled very well with the GeForce 8800 GTS and GTX cards running in SLI mode, especially in the Prey and F.E.A.R. benchmarks. Those two games in particular showed massive performance improvements with two GPUs sharing the rendering workload. Even at this early stage of driver development (relatively speaking), NVIDIA seems to have SLI working well with the GeForce 8800 series, at least from a performance perspective. And we suspect that things will only get better moving forward, as NVIDIA's driver team gets more familiar with the intricacies of their latest GPU.
|Overclocking the new GeForces|
As we neared the end of our testing, we spent a little time overclocking the new GeForce 8800 GTS and GTX cards using the clock frequency slider available within NVIDIA's Forceware drivers, after enabling the "Coolbits" registry tweak.
We were pleasantly surprised by the overclockability of both of the new GeForce 8800 series cards, but had some interesting results. We were able to take the GeForce 8800 GTX up from its stock GPU core and memory clock frequencies of 576MHz / 1.8GHz to 626MHz / 1.9GHz. And we were able to take the GeForce 8800 GTS up from its default GPU and memory clocks of 513MHz / 1584MHz, to 563MHz / 1684MHz, core and memory clock frequency increases of 50MHz and 100MHz, respectively, for both cards. We suspect there is actually a little more left in the tank with these cards, and will experiment with retail product and third-party overclocking tools in the near future.
|Power Consumption, Temps and Noise|
We have a few final data points to cover before bringing this article to a close. Throughout all of our benchmarking, we monitored how much power our test system was consuming using a power meter, and also took some notes regarding its noise output and GPU temperatures. Our goal was to give you all an idea as to how much power each configuration used and to explain how loud the configurations were under load. Please keep in mind that we were testing total system power consumption here, not just the power being drawn by the video cards alone.
The new GeForce 8800 GTX and GeForce 8800 GTS consumed more power than any of the other cards we tested, but the results are somewhat promising. Some of you may be thinking we're a little nuts to say that, considering the GTX requires two 6-pin PCI Express supplemental power leads, but the good news is that the increased performance offered by these new cards is not directly proportional to their power consumption. If we look at the GeForce 8800 GTX in particular, in many cases it offered nearly double the performance of the Radeon X1950 XTX, yet under load, the test system consumed "only" 33 more watts. And the GTS and X1950 XTX were roughly on par with one another under load. When compared to the 7900 GTX things don't look quite as rosy, but power consumption is not quite as insane as initial rumors regarding the G80 let on. And who knows: if the spring-refresh products end up with a die-shrunk GPU and GDDR4 memory - we're speculating here - they will likely offer better performance with lower power requirements.
As you'd probably expect looking at the power consumption numbers, the GeForce 8800 GTS and GTX put off a substantial amount of heat. We monitored GPU temperatures over a 15 minute span with RTHDRIBL running at 1920x1200, and saw maximum temperatures of about 80°C for the GTS and 84°C for the GTX. If you're in the market for either one of these cards, we definitely recommend good case ventilation.
Lastly, we have some comments regarding the noise generated by the coolers used on the new GeForce 8800 GTS and GTX. Throughout our testing, the fans on both cards spun up after only a few minutes of gaming. The noise output wasn't bad though. We couldn't register a solid result on our aging sound level meter, but we can say that the 8800 GTX and GTS are perhaps a bit louder than a 7900 GTX. We definitely wouldn't categorize the fans as quiet when spun-up, but we don't think the noise output will be an issue for any gamer or enthusiast.
|Our Summary and Conclusion|
Performance Summary: NVIDIA's new GeForce 8800 GTS and GTX cards are mighty strong performers. Throughout our entire battery of benchmarks, both cards put up framerates at, or near, the top of the charts. The GeForce 8800 GTS outperformed a GeForce 7900 GTX in every test we ran. It did, however, miss the mark set by a Radeon X1950 XTX in a couple of high-resolution tests, and it trailed a GeForce 7950 GX2 on a few occasions, but the features and enhanced image quality offered by the 8800 GTS offset any of these results in our opinion.
The GeForce 8800 GTX's performance was far more dominant. In every benchmark we ran, the GeForce 8800 GTX was clearly the best performer, and in some cases it doubled the performance of any of the previous generation, single-GPU powered cards. And it did so with superior image quality. There is simply no other consumer-level video card currently available that can come close to matching the performance of the GeForce 8800 GTX.
NVIDIA has taken a monumental step forward with the GeForce 8800 GTS and GTX. These new cards are superior to their predecessors in every meaningful way. The G80 GPU's Unified Architecture, with its 128 (GTX) or 96 (GTS) stream processors, delivered outstanding performance in every application we tested, whether it was based on DirectX or OpenGL. Not to mention the fact that the G80 GPU supports all DirectX 10 features as well.
The new Lumenex Engine also provides many real-world, tangible benefits. The G80's new capabilities improve upon the previous generation of GPUs in terms of anti-aliasing and anisotropic filtering quality, and NVIDIA can now claim full support for HDR with AA, something they couldn't say of the G70. The GPU's full 10-bit display pipeline is also a welcome feature that will pay even more dividends once next-gen 10-bit displays become available. NVIDIA also promises that the raw floating point performance of the G80 will usher in an era of high-performance physics processing on the GPU, and CUDA will eventually bring even more capabilities to the G80 as highly parallel, data-intensive applications are compiled for execution on the GPU.
The GeForce 8800 GTX will be available immediately from multiple e-tail outlets for approximately $599. The GeForce 8800 GTS will be available right away as well, for about $449. Considering their performance, new features and enhanced PureVideo capabilities, the GeForce 8800 GTX and GTS are sure to have many hardcore enthusiasts champing at the bit. And we're told you won't have to worry about the recall that has been in the news the last few days. In fact, NVIDIA sent this over in an effort to put potential customers' minds at ease:
We can say that the samples we evaluated functioned perfectly throughout our testing, and we suspect even if a few faulty boards make it into the hands of consumers, they'll be repaired or replaced immediately.
So there you have it. The first fully unified, DX10 compliant graphics cards have arrived, and it's clear they're vastly superior to their predecessors. The GeForce 8800 GTX and GTS are the real deal, and there's nothing else on the market that even comes close to touching them in terms of features and performance. These babies are HOT and they rock.