Fermi, Continued

Other features of Fermi include support for C++ (current-generation CUDA products only support C), and, of course, the already oft-repeated fact that this core is some three billion transistors in size. NVIDIA has publicly tried to downplay this, claiming that analysts have always expressed concerns over the size of the company's chips, but there's no arguing that three billion transistors is a lot. Generally speaking, the more transistors in a product, the greater the chance something will go wrong when fabbing it; NVIDIA is taking something of a risk in building Fermi on a monolithic core instead of aiming for a mid-range, mid-size core and dual-GPU configurations à la AMD.
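To put rough numbers on that risk, here is a minimal sketch using the simple Poisson yield model, in which the fraction of defect-free dies falls off exponentially with die area. The defect density and die areas below are hypothetical values picked for illustration; they are not figures from NVIDIA, AMD, or any foundry, and real yield models are more involved.

```python
import math

def poisson_yield(die_area_cm2, defects_per_cm2):
    """Expected fraction of dies with zero defects under a simple Poisson model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)

# Hypothetical inputs, for illustration only (not vendor or foundry data).
DEFECT_DENSITY = 0.4   # assumed defects per cm^2
LARGE_DIE_AREA = 5.0   # assumed area of a large monolithic die, in cm^2
SMALL_DIE_AREA = 2.5   # assumed area of a die half that size, in cm^2

large_die = poisson_yield(LARGE_DIE_AREA, DEFECT_DENSITY)
small_die = poisson_yield(SMALL_DIE_AREA, DEFECT_DENSITY)

print(f"large die yield: {large_die:.1%}")
print(f"small die yield: {small_die:.1%}")
```

Under this model, halving the die area turns a yield of Y into the square root of Y, which is why a smaller die always fares better at the same defect density. The sketch ignores the ability to sell partially defective dies as cut-down parts, which softens the penalty in practice.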
 

Fermi's block-level diagram. The increased amount of configurable/L1 cache per SM and the 768KB of unified L2 are obvious improvements over GT200, but NVIDIA has made changes to boost core execution efficiency all the way around.

Dig into NVIDIA's whitepapers on Fermi, and you may end up thinking that the company designed a compute engine that happens to be capable of handling graphics rather than the other way around. Many of Fermi's changes should translate across GPU computation and gaming; there's no inherent reason why both sides can't benefit from certain improvements. Certain features, like support for 64-bit addressing, however, are rather obviously aimed at the scientific computing market rather than the needs of the game industry.

For the moment, NVIDIA is talking about Fermi strictly as a scientific computing part; non-Tesla versions will come, of course, but they aren't the company's focus today. As for when those announcements will become reality, that's anyone's guess. Jen-Hsun refused to comment on when we might see Fermi cores ship beyond pointing to a Q4 2009/Q1 2010 timeframe. Fermi's evolution is a demonstration of how divergent AMD's and NVIDIA's roadmaps have become. While AMD is staying focused on the consumer and workstation space, NVIDIA is adamant in its belief that scientific computing and major data set crunching (as well as consumer app acceleration) are the waves of the future. On paper, Fermi appears to be a strong competitor, but if it takes NVIDIA nine more months to push GeForce cards out the door, it could find itself matched against an even newer series of Radeon cards, rather than the 5800 products currently on the market.

When we discussed NVIDIA's Tegra platform, we noted that the company's lack of a CPU design would undoubtedly impact its own Tegra product development. With Fermi, NVIDIA has built an architecture with some features similar to what you might expect to find on a massively parallel processor. To help developers take full advantage of it, NVIDIA has developed its own heterogeneous programming environment.
