AnTuTu Mobile Benchmark Cited By Analysts Fundamentally Broken, Heavily Favors Intel Architecture - HotHardware

Just yesterday, we addressed the dubious claim that Intel's Clover Trail+ low power mobile processors had somehow seized a massive lead over ARM's products and we noted some of the suspicious discrepancies in the popular AnTuTu benchmark. It turns out that the situation is far shadier than we initially thought. The latest benchmark version isn't just tilted to favor Intel -- it seems to flat-out cheat to accomplish it.

Anandtech forum user Exophase went digging into the benchmark's code to determine why the latest version showed such one-sided gains in favor of x86 processors. The portion of AnTuTu in question is derived from nBench, a mid-1990s benchmark published by the now-defunct BYTE magazine. The new 3.3 version of AnTuTu was built with Intel's C++ Compiler (ICC), while GCC was used for the ARM variants. The Intel code was auto-vectorized; the ARM code was not -- there are no NEON instructions in the ARM version of the application. Granted, GCC isn't currently very good at auto-vectorization, but NEON is standard on every Cortex-A9 and Cortex-A15 SoC -- and those are the parts people will be benchmarking.

But compiler optimizations are just the beginning. According to Exophase's investigation, the Intel build breaks the benchmark outright: a loop that is supposed to execute 32 times runs only once, yet the benchmark still reports the task as completed successfully. The optimization in question is part of ICC, but it was only added recently. It's not the kind of behavior you trigger by accident.

AnTuTu For Android benchmark

Is Intel Up To Its Old Tricks?

There are three great constants in life: death, taxes, and companies that periodically cheat on benchmark tests. Nvidia cheated when it misrepresented Tegra 3's performance during the chip's initial unveiling, ATI (now part of AMD) cheated back in the old days of the Quake 3 "Quack 3" driver scandal, and Intel... well, Intel cheated so thoroughly that its alleged actions were a substantial part of AMD's antitrust lawsuit back in 2005.

Back then, Intel's compilers would silently refuse to take the SSE/SSE2 code paths if the computer was using an AMD microprocessor. Keep in mind, this was a situation in which AMD had lawfully licensed SSE2 from Intel. Major benchmarks of the era were often compiled with Intel's software, which meant they ran far more slowly on AMD hardware than they otherwise would have. Part of Intel's settlement with AMD was an agreement not to engage in this kind of behavior anymore.


Any time optimizations are this one-sided, it's time to take a closer look. Courtesy of Jim McGregor, EETimes

Is there proof that the AnTuTu developers deliberately sabotaged the app to favor x86 chips and penalize ARM performance? No. But consider the chain of events.  

  • A just-released version of the benchmark is used in "research" reports from prominent analysts claiming that Intel chips now offer huge benefits over ARM chips -- even though Intel, in closed-door discussions with journalists, never made such claims.
  • The just-released version of the benchmark is compiled with ICC for x86, while the ARM variants use GCC. Vectorization is enabled for Intel, but not for ARM.
  • The actual code "happens" to favor Intel in a way that breaks the benchmark's function.
  • Benchmark results happen to leak for Bay Trail and show it demolishing the competition. What do they use? Why, AnTuTu!
As someone who watched the compiler-optimization issue play out years ago, I don't believe this is accidental -- it's just designed to look that way. Mobile products are already difficult enough to benchmark accurately without this kind of behavior. It doesn't just make fair comparisons harder; it undermines the limited set of tools we have for making those comparisons in the first place.

Update:  AnTuTu has released a "new" version of the benchmark in which Intel performance drops 20-50%. Systems based on high-end ARM devices again win the benchmark overall, as they did previously.

oh my gosh wow what a leap from almost nothing to blasting out killer amazing


Good going, Exophase.

And... I'm a little wary of anything based on mid-'90s code. Yes, I know there's backwards compatibility, but to my non-techie eye, it's a little bogus. Chips & the microcode do much more today, not just run faster.

(If I'm wrong about this, flame away. I appreciate the education!)


Christopher,

I'm not sure. I'm not sure anyone is. There's been a lot of uncertainty over how to best capture performance on older cores -- one reason I think you see older tests is because low-power cores often resemble older chips that ran those tests in the first place.

Another reason is because many benchmarks don't have analogs. PCMark 7 doesn't have an equivalent workload to run on a mobile part. Ditto for a Photoshop benchmark, an H.264 encode, or even a lot of games. So we turn back to older, simpler tests that could still give an accurate performance comparison (within their limits).


So basically it's back to the basics: don't trust any company. It's amazing that we now live in an era where benchmarks are starting to matter less and less. It's going to start coming down to real-world usage -- which requires real people, not computers.



CALLED IT!
