Features & Architecture
We outlined the specifications for all of the cards in ATI's new X1000 family line-up on the last couple of pages, but didn't contrast them against NVIDIA's current offerings, or ATI's previous generation of cards. The chart below will give you a general idea of where ATI's new cards stand in terms of bandwidth and fillrate.
As you can see, even though the X1800s have fewer pixel shader pipelines than NVIDIA's GeForce 7800 GT and GTX, their higher clock speeds help keep fillrate competitive, and memory bandwidth for the X1800 XT is well above the rest of the pack. The X1600s and X1300s don't fare quite as well, however. The 12-pipe/4 ROP and 4-pipe/4 ROP configurations and 128-bit memory interfaces prevent them from putting up the same kind of numbers as the higher-end cards listed here.
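For readers who want to see where these kinds of numbers come from, here is a minimal sketch of the usual back-of-the-envelope math: theoretical fillrate is the number of pixels written per clock multiplied by core clock, and theoretical memory bandwidth is bus width in bytes multiplied by the effective memory clock. The function names and the sample clock speeds below are illustrative assumptions, not figures taken from the chart.

```python
def pixel_fillrate_mpixels(units_per_clock: int, core_mhz: float) -> float:
    """Theoretical peak fillrate in megapixels per second.

    units_per_clock is the number of pixels the GPU can output per clock
    (pipelines or ROPs, depending on how the figure is quoted).
    """
    return units_per_clock * core_mhz


def memory_bandwidth_gbs(bus_bits: int, effective_mhz: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * effective_mhz * 1e6 / 1e9


# Hypothetical configurations, chosen only to illustrate the trade-offs.
configs = [
    ("16 units, high clock, 256-bit", 16, 600, 256, 1400),
    ("24 units, lower clock, 256-bit", 24, 400, 256, 1400),
    ("4 units, high clock, 128-bit", 4, 600, 128, 1400),
]

for name, units, core_mhz, bus_bits, mem_mhz in configs:
    fill = pixel_fillrate_mpixels(units, core_mhz)
    bw = memory_bandwidth_gbs(bus_bits, mem_mhz)
    print(f"{name}: {fill:.0f} MPixels/s, {bw:.1f} GB/s")
```

With these made-up inputs, a 16-unit part at a higher clock matches the fillrate of a 24-unit part at a lower clock, while halving the bus width from 256 to 128 bits halves the available bandwidth at the same memory clock.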
ATI focused on efficiency and scalability with their new GPU architecture. Their goals were to reduce idle time and latency, while decoupling processing units from their previous rigidly defined pipelines. ATI also wanted to expand their feature set, and they've done so by finally introducing full Shader Model 3.0 support in the entire X1000 graphics family, from top to bottom.
Decoupling the GPU processing units allows ATI to design an entire line of products based on the same core GPU architecture but with varying levels of performance, which also makes for better overall design efficiency. NVIDIA took a similar approach with the design of the GeForce 6 and 7 series.
Workloads enter the pipeline via the Vertex Engine and are then passed on to the geometry setup engine, which in turn forwards them to the dispatch processor for allocation amongst the pixel shaders. The "Ultra-Threaded Dispatch Processor" is supposedly where a lot of pre-processing efficiencies come into play, with significantly improved levels of flow control and thread management that keep the pixel pipelines fully utilized and avoid stalls. In this architecture there are 16 pixel shader processors organized in four independent quad-shader cores that are managed by the Ultra-Threaded Dispatch Processor.
The Pixel Shader
The Vertex Shader
All told, there are 8 vertex shaders, 16 texture address units, 16 texture units and 16 render back-end units in a top-of-the-line X1800 series GPU. Of course, both pixel and vertex shaders have been upgraded to support the Shader Model 3.0 specification, with dynamic flow control and virtually unlimited instruction length.
Another efficiency ATI is supposedly bringing to the table with the entire X1000 series of cards is the ability to process a larger number of smaller threads for better granularity and parallelism. As the diagram above shows, the GPU processes a given workload in threads of 4x4 pixel blocks (16 pixels total). In scenarios like the shadow mapping example noted above, these smaller threads can provide much better coverage of an area that needs to be processed and rendered, while avoiding areas that do not need to be processed for the operation.
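To put a rough number on the granularity argument, the sketch below counts how many pixels get processed when every block that touches at least one "needed" pixel must be processed whole. It is a toy model built on assumptions: the 64x64 grid, the diagonal line standing in for a shadow edge, and the larger 16x16 and 64x64 block sizes are hypothetical comparisons, not figures from ATI.

```python
def pixels_processed(mask, tile):
    """Count pixels processed when work is issued in tile x tile blocks.

    mask is a 2D list of booleans marking pixels that actually need work.
    Any block containing at least one needed pixel is processed in full.
    Assumes the mask dimensions are multiples of the tile size.
    """
    height = len(mask)
    width = len(mask[0])
    total = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            rows = mask[ty:ty + tile]
            if any(any(row[tx:tx + tile]) for row in rows):
                total += tile * tile
    return total


# Hypothetical workload: a one-pixel-wide diagonal edge on a 64x64 grid.
SIZE = 64
mask = [[x == y for x in range(SIZE)] for y in range(SIZE)]
needed = sum(sum(row) for row in mask)

for tile in (4, 16, 64):
    done = pixels_processed(mask, tile)
    print(f"{tile}x{tile} blocks: {done} pixels processed for {needed} needed")
```

In this toy case the 4x4 blocks touch far fewer pixels than the larger blocks do for the same edge, which is the kind of wasted work that finer-grained threads are meant to avoid.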
The GPUs at the heart of the X1000 family of graphics cards are enhanced with other capabilities and features as well. For example, ATI now has full support for HDR with anti-aliasing, although most games will need to be patched to take advantage of this capability. The X1000 family also features a new adaptive AA algorithm to reduce the appearance of jagged edges in scenes where transparent textures are used, along with a new type of memory controller and a beefed-up video pipeline, dubbed Avivo.