Just yesterday, we addressed the dubious claim that Intel's Clover Trail+ low-power mobile processors had somehow seized a massive lead over ARM's products, and we noted some of the suspicious discrepancies in the popular AnTuTu benchmark. It turns out that the situation is far shadier than we initially thought. The latest benchmark version isn't just tilted to favor Intel -- it appears to flat-out cheat to accomplish it.
Anandtech forum user Exophase went digging into the benchmark's source code to determine why the latest version showed such one-sided gains in favor of x86 processors. AnTuTu's CPU test is essentially nBench, a mid-1990s benchmark published by the now-defunct Byte magazine. The new 3.3 version of AnTuTu was compiled using Intel's C++ Compiler (ICC), while GCC was used for the ARM variants. The Intel code was auto-vectorized; the ARM code wasn't -- there are no NEON instructions in the ARM version of the application. Granted, GCC isn't currently very good at auto-vectorization, but NEON is now standard on every Cortex-A9 SoC -- and these are the parts people will be benchmarking.
But compiler optimizations are just the beginning. According to Exophase's investigation, the Intel code outright breaks the benchmark. At a certain point, it runs a loop that's meant to execute 32 times just once, then reports to the benchmark that the task completed successfully. The optimization in question is part of ICC, but it was only added recently -- it's not the kind of thing you'd trigger by accident.
Is Intel Up To Its Old Tricks?
There are three great constants in life: death, taxes, and companies that periodically cheat on benchmark tests. Nvidia cheated when it misrepresented Tegra 3's performance during the chip's initial unveiling, ATI cheated back in the old days of Quack 3, and Intel... well, Intel cheated so thoroughly that its alleged actions formed a substantial part of AMD's antitrust lawsuit back in 2005.
Back then, Intel's compilers would silently refuse to use vectorized SSE/SSE2 code paths if the computer was running an AMD microprocessor -- and keep in mind, this was a situation where AMD had lawfully acquired a license from Intel to use SSE2. Major benchmarks of the era were often compiled using Intel's software, which meant they ran far more slowly on AMD hardware than they would have otherwise. Part of Intel's settlement with AMD was an agreement not to engage in this kind of behavior anymore.
Is there proof
that the AnTuTu developers deliberately sabotaged the app to favor x86 chips and penalize ARM performance? No. But consider the chain of events.
- A just-released version of the benchmark is used in "research" reports from prominent analysts claiming that Intel chips now offer huge benefits over ARM chips -- even though Intel, in closed-door discussions with journalists, never made such claims.
- That just-released version is compiled with Intel's ICC for x86, while the ARM variants use GCC. Vectorization is used for Intel, but not for ARM.
- The generated code "happens" to favor Intel in a way that breaks the benchmark's function.
- Benchmarks happen to leak for Bay Trail and show it demolishing the competition. What do they use? Why, AnTuTu!
As someone who watched the unfair compiler-optimization issue play out years ago, I don't believe this is accidental -- it's just designed to look that way. Mobile products are already difficult enough to benchmark accurately without this kind of behavior. It doesn't just make fair comparison more difficult, it undermines the limited set of tools we have for making those comparisons in the first place.
AnTuTu has released a "new" version of the benchmark in which Intel performance drops 20-50%. Systems based on high-end ARM devices again win the benchmark overall, as they did previously.