Nvidia's Kal-El Demonstration Marred By Benchmark Confusion

When Nvidia announced its next-generation Tegra product at Mobile World Congress, it pulled out all the stops in an effort to impress. The stats themselves were impressive (the chip packs a twelve-core GeForce GPU in with a quad-core ARM CPU), but NV opted to hammer home the point by showing benchmark results.

According to Nvidia, Kal-El turned in a score of 11,352 in the embedded processor benchmark CoreMark, while a T7200 Core 2 Duo (65nm, 2GHz dual-core, 4MB L2, 667MHz FSB) returned a score of just 10,136. If you want to see Nvidia's original video, you can do so here. We weren't completely sold on the results because, as we explained:
The program [CoreMark], published by the Embedded Microprocessor Benchmark Consortium (EEMBC), is only designed to test the core functions of a processor. According to CoreMark.org: "It is encouraging to see the industry, as well as academia, adapting to a new standard so quickly, but let us not forget – CoreMark only targets core operations. EEMBC’s full-featured application benchmarks are much better suited for testing a processor’s capability in a real application. Furthermore, processors are becoming increasingly complex and one core-based benchmark is insufficient for a comprehensive analysis."
Our skepticism was warranted. The credit for this discovery goes to ilsistemista.net, which noticed that the fine print under the Kal-El result differs sharply from the fine print under the T7200 result.

The version of CoreMark that Kal-El ran was heavily optimized and built with a relatively new version of GCC, while the T7200 flavor was compiled using an old version of GCC with minimal optimization. The next step ilsistemista.net took was to test what happens when CoreMark is compiled and run on the T7200 using the same optimizations that were used for Kal-El. As you'll see, there's a bit of a difference.

When the scales are evenly balanced, the T7200 turns in results nearly 50 percent faster than NV's published numbers.
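To put those figures side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the two scores Nvidia quoted; the 1.5x factor is our assumption, treating "nearly 50 percent" as an upper bound on the T7200's gain over its own published score, and the names are purely illustrative. It is not ilsistemista.net's measured result.

```python
# Scores quoted by Nvidia (CoreMark, higher is better).
KAL_EL_SCORE = 11_352
T7200_PUBLISHED_SCORE = 10_136

# Assumption: "nearly 50 percent faster" is treated as an upper bound of 1.5x
# over the T7200's published score. The true measured figure is a bit lower.
T7200_REBUILD_FACTOR = 1.5


def percent_faster(a: float, b: float) -> float:
    """Return how much higher score a is than score b, in percent."""
    return (a / b - 1.0) * 100.0


# Estimated T7200 score once it is compiled with matching optimizations.
t7200_rebuilt_estimate = T7200_PUBLISHED_SCORE * T7200_REBUILD_FACTOR

# Kal-El's lead using Nvidia's published pair of numbers.
published_gap = percent_faster(KAL_EL_SCORE, T7200_PUBLISHED_SCORE)

# T7200's lead over Kal-El under the upper-bound assumption above.
rebuilt_gap = percent_faster(t7200_rebuilt_estimate, KAL_EL_SCORE)

print(f"Published: Kal-El leads by {published_gap:.1f}%")
print(f"Estimated rebuilt T7200 score: {t7200_rebuilt_estimate:.0f}")
print(f"Rebuilt: T7200 leads by up to {rebuilt_gap:.1f}%")
```

On Nvidia's own numbers Kal-El leads by roughly 12 percent. Under the assumed upper bound, the positions flip and the T7200 leads by about a third.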

What Was The Point?

The strangest thing about Nvidia's move is that there's no reason for it. First, Kal-El already looks strong against its own predecessor: compare it to Tegra 2, and you'll note that Kal-El is no less than 94 percent faster. It wouldn't surprise us if Kal-El's graphics engine really is 5x faster than T2's, at least in some metrics. Certainly no one else is seriously talking about shipping a mobile platform that can push 2560x1600 by the end of the year. There's no reason to think Kal-El will be inherently limited to larger devices, either; NV should be able to fit the part into smaller power envelopes by disabling CPU/GPU cores. If a quad-core / 12-pipe configuration works well for a tablet, a dual-core / 6-pipe configuration should work just fine for a smartphone.

Second, there's no single test that's emerged on a consumer level as "the" benchmark for smartphones. Various browser-centric timed tests come closest, but browsers themselves are a significant confounding variable. An awful lot of testing boils down to whether or not Phone A "feels" faster than Phone B. Unfortunately, research has shown that human beings are pretty bad at actually judging such improvements and will often think that a newer / supposedly faster device actually is faster when timed tests prove it isn't.

Third, and perhaps most importantly, this sort of strategy inevitably raises questions about why NV felt it had to skew the numbers to make Kal-El look good. Kal-El, to our way of thinking, still looks pretty darn good on its own. If it delivers as promised, it'll be a huge leap from Tegra 2. As for its theoretical performance against a five-year-old Core 2 Duo that would never fit inside a tablet, we don't see a reason to care.