More On The Architecture
It is hard to read about a new game these days without hearing about the quality of the shadows its engine produces. Doom 3 and Quake 4 immediately come to mind as two games that make heavy use of shadows to produce a realistic-looking game world.
A widely used method for producing shadows is shadow mapping. This technique works by first rendering the scene from the point of view of a light source. The results are not displayed, but instead stored in a special shadow map texture, where each value represents the distance from the light source to the nearest object in that direction. The scene is then rendered from the gamer's viewpoint, and each pixel's distance to the light is compared against the value stored at the corresponding position in the shadow map. If the pixel is farther from the light than the stored value, some other object sits between it and the light source, so the pixel is in shadow and is darkened; otherwise it is lit normally.
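The two passes described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not code from any game engine: the light sees the scene as a single row of texels, and the names build_shadow_map, in_shadow, and the bias value are made up for the example.

```python
def build_shadow_map(occluders, width):
    """Pass 1: 'render' from the light and keep the nearest depth per texel."""
    shadow_map = [float("inf")] * width  # inf means nothing was hit
    for texel, depth in occluders:
        if 0 <= texel < width:
            shadow_map[texel] = min(shadow_map[texel], depth)
    return shadow_map


def in_shadow(shadow_map, texel, depth, bias=0.01):
    """Pass 2: a point is shadowed if something is nearer to the light.

    The small bias (an assumed value) keeps a surface from shadowing itself.
    """
    return depth > shadow_map[texel] + bias


# Each occluder is (texel column as seen from the light, distance from the light).
occluders = [(2, 1.0), (2, 3.0), (5, 2.0)]
shadow_map = build_shadow_map(occluders, width=8)

behind_object = in_shadow(shadow_map, 2, 3.0)   # something at depth 1.0 blocks it
nearest_object = in_shadow(shadow_map, 2, 1.0)  # this is the surface the light sees
open_ground = in_shadow(shadow_map, 0, 5.0)     # nothing in the way
```

Here behind_object comes out shadowed, while nearest_object and open_ground are lit.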
Basic shadow maps typically create hard-edged shadows, which aren't very realistic. In the real world, shadows usually have much softer edges. To create soft shadows in games, the shadow map is usually filtered in some way. The filtering can be done by taking several samples from the shadow map around each pixel's position and combining them in a pixel shader. Generally, the more samples used, the better the resulting soft shadows. Doing this requires a large number of texture lookups, however, which can hurt performance.
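One common way to do this kind of filtering is to run the depth comparison on a small neighborhood of shadow map texels and average the results. The sketch below assumes that approach; the function name shadow_factor, the square kernel, and the bias value are illustrative choices rather than anything specific to a particular game or GPU.

```python
def shadow_factor(shadow_map, x, y, depth, radius=1, bias=0.01):
    """Return the fraction of nearby shadow map samples that are lit (0.0 to 1.0)."""
    height, width = len(shadow_map), len(shadow_map[0])
    lit = 0
    total = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            sx = min(max(x + dx, 0), width - 1)   # clamp to the map edge
            sy = min(max(y + dy, 0), height - 1)
            if depth <= shadow_map[sy][sx] + bias:  # one texture lookup per sample
                lit += 1
            total += 1
    return lit / total


# Left half of the map is covered by an occluder at depth 1.0, right half is open.
shadow_map = [[1.0, 1.0, 10.0, 10.0] for _ in range(4)]

deep_in_shadow = shadow_factor(shadow_map, 0, 1, 5.0)  # 0.0
near_the_edge = shadow_factor(shadow_map, 1, 1, 5.0)   # 3 of 9 samples lit
fully_lit = shadow_factor(shadow_map, 3, 1, 5.0)       # 1.0
```

With radius=1 every pixel costs nine lookups, and a radius of 2 costs 25, which is why the sample count weighs so heavily on performance.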
To speed up the texture lookups needed for soft shadows, the Radeon X1900 includes a new texture sampling feature called Fetch4. It exploits the fact that most textures are composed of color values, each with four components (Red, Green, Blue, and Alpha, or transparency). The texture units are designed to sample and filter all four components from one texture address simultaneously. When a texture holds only single-component values, as a shadow map does, three of those four slots would otherwise go unused. Fetch4 instead fills them by sampling four values from adjacent addresses simultaneously. This effectively increases the texture sampling rate for such textures by up to a factor of four. Fetch4 is not automatic, though: the game engine needs specific code to take advantage of it.
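The saving can be illustrated with a toy model that simply counts lookups. This is not ATI's API: the DepthTexture class and its fetch and fetch4 methods are hypothetical stand-ins, and the order in which the four neighbors are returned is an arbitrary assumption.

```python
class DepthTexture:
    """Toy single-component texture that counts how many lookups are issued."""

    def __init__(self, texels):
        self.texels = texels
        self.lookups = 0

    def fetch(self, x, y):
        """Ordinary lookup: one value per lookup."""
        self.lookups += 1
        return self.texels[y][x]

    def fetch4(self, x, y):
        """Fetch4-style lookup: the 2x2 block starting at (x, y) in one lookup."""
        self.lookups += 1
        t = self.texels
        return (t[y][x], t[y][x + 1], t[y + 1][x], t[y + 1][x + 1])


def lit_count_plain(tex, depth):
    """Depth-test a 4x4 region one texel at a time."""
    return sum(depth <= tex.fetch(x, y) for y in range(4) for x in range(4))


def lit_count_fetch4(tex, depth):
    """Depth-test the same 4x4 region, four texels per lookup."""
    return sum(depth <= d
               for y in range(0, 4, 2)
               for x in range(0, 4, 2)
               for d in tex.fetch4(x, y))


texels = [[1.0, 1.0, 10.0, 10.0] for _ in range(4)]
plain = DepthTexture(texels)
fast = DepthTexture(texels)

plain_result = lit_count_plain(plain, 5.0)
fast_result = lit_count_fetch4(fast, 5.0)
```

Both paths count the same number of lit samples, but the plain path issues 16 lookups and the Fetch4-style path issues 4.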
Another enhancement made to the R580 should help with performance at ultra-high resolutions (think 1920x1200 and above). All Radeon GPUs support a Hierarchical Z feature, which detects pixels that will be hidden in the final rendered image and discards them before any further pixel processing takes place. To function, though, this feature requires a high-speed on-chip memory buffer, and this memory is of limited size. Rendering at resolutions higher than this integrated buffer was designed to support can reduce the effectiveness of Hierarchical Z. The Radeon X1900 incorporates 50% more on-chip memory for Hierarchical Z than the Radeon X1800, which means its performance should not drop off as dramatically at very high resolutions.
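The general idea behind this kind of early rejection can be sketched as a coarse buffer that keeps one depth value per block of pixels. The code below is a simplified illustration, not a description of ATI's hardware: the 4x4 tile size and the names build_hiz and tile_is_hidden are assumptions made for the example, and smaller depth values are taken to mean closer to the viewer.

```python
TILE = 4  # hypothetical tile edge in pixels


def build_hiz(depth_buffer):
    """Store the farthest depth found in each TILE x TILE block of the depth buffer."""
    height, width = len(depth_buffer), len(depth_buffer[0])
    return [[max(depth_buffer[y][x]
                 for y in range(ty, min(ty + TILE, height))
                 for x in range(tx, min(tx + TILE, width)))
             for tx in range(0, width, TILE)]
            for ty in range(0, height, TILE)]


def tile_is_hidden(hiz, tile_x, tile_y, nearest_incoming_depth):
    """True if even the nearest incoming pixel lies behind everything in the tile."""
    return nearest_incoming_depth > hiz[tile_y][tile_x]


# A 4x8 depth buffer: a wall at depth 2.0 on the left, open space on the right.
depth_buffer = [[2.0] * 4 + [100.0] * 4 for _ in range(4)]
hiz = build_hiz(depth_buffer)

behind_wall = tile_is_hidden(hiz, 0, 0, 5.0)    # rejected before pixel processing
in_open_space = tile_is_hidden(hiz, 1, 0, 5.0)  # must be processed per pixel
```

Because the coarse buffer needs one entry per tile, the number of entries grows with resolution, which is why a fixed amount of on-chip memory can run short at very high resolutions.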