Vivante: Challenging the Status Quo In Mobile GPUs
Date: Oct 10, 2013
Author: Joel Hruska

Over the past few years, a handful of mobile graphics companies have emerged as key pillars of the industry. The top dog, by far, has been Imagination Technologies, with Qualcomm, Nvidia (during the Tegra 2 / Tegra 3 era) and ARM all picking up significant businesses of their own as well. But now, there's a new kid on the block -- a company with a tiny, highly customized GPU, a number of recent design wins, and a strong product portfolio.

Enter Vivante. According to research from Jon Peddie, Vivante has surged from a 0.3% market share in 2012 to 9.8% of the market in 2013, thanks to design wins in both Western products and multiple Chinese markets.

Vivante was founded in 2004 and began licensing its GPU designs in 2007. The company's early wins were in Eastern markets, but this past year it has begun to show up in devices intended for the West, including the Samsung Galaxy Tab 3.

A nifty new GPU core isn't worth much if you don't have vendors shipping your technology, but Vivante has made notable strides in the past year. The company's GC1000 GPU powers Google's Chromecast, the Samsung Galaxy Tab 3, and the 2D section of Texas Instruments' OMAP 4470 SoC. The company has also been picking up major market share in China, with new wins in chips from Marvell, Freescale, Actions Semiconductor, Ingenic, Rockchip, and even China's homegrown Godson-2H processor.

Vivante's Growth This Past Year Has Been Impressive

Many of these companies are smaller players that focus on Eastern markets at the moment, while the rest are a motley group of midrange and low-power embedded players in the United States. Vivante's entire pitch, however, is that even the GC1000 core inside the Samsung Galaxy Tab 3 is only a modest example of the company's scalable performance.

The GC1000 contains 8-16 shader units depending on implementation -- the Marvell PXA988 appears to be an eight-core variant of the architecture, but this is uncertain. What we do know is that Vivante has designed the underlying architecture to scale up to 64 shader units in a hypothetical GC6000 implementation. That's rather larger than any product currently shipping -- Freescale's i.MX product family currently uses a GC2000 core with 16 shader units -- but it shows that the underlying design has legs and width to spare.

GPU Architecture

Vivante has taken a different approach to core design from most of the other companies that play in this space. All modern GPUs are explicitly designed to be modular and scalable, from smartphone hardware to workstation implementations. Typically what that means is that a company like Nvidia or AMD defines a single compute unit that can be duplicated throughout the GPU design. For the Radeon GCN architecture, for example, a compute unit is a group of 64 stream processors, a pair of asynchronous command engines, and a set of render outputs (ROPs) and texture mapping units (TMUs).

Vivante's GPUs are modular as well, but with a much finer level of granularity.

Each of the three shaded blocks (3-D Pipeline, Vector Graphics Pipeline, 2-D Pipeline) can be segmented or stacked into various configurations. A GPU core, in other words, could contain more ultra-threaded shaders, or additional vector graphics engines, up to 32 cores in total. Since the number of graphics front ends can vary depending on how many shader cores are hooked to each graphics core, the counts themselves can get rather confusing. The GC1000 graphics processor we'll be discussing today can be built in two configurations -- 2 (VEC-4), or 8 (VEC-1). The first configuration uses two GPU front-ends with 4 shader cores per GPU block, while the second has eight GPU front-ends with a single shader core in each. Different core configurations can be fine-tuned for maximum efficiency depending on workload.
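
The two GC1000 configurations described above differ in how the shader cores are grouped, not in how many there are. A minimal sketch of that arithmetic follows. The class and field names are illustrative assumptions, not Vivante terminology or any real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuConfig:
    """Hypothetical description of one GPU configuration (names are assumed)."""
    name: str
    front_ends: int                  # number of GPU front-ends
    shader_cores_per_front_end: int  # shader cores attached to each front-end

    @property
    def total_shader_cores(self) -> int:
        # Total shader cores is front-ends times cores per front-end.
        return self.front_ends * self.shader_cores_per_front_end


# The two GC1000 layouts named in the article.
GC1000_CONFIGS = [
    GpuConfig("2 (VEC-4)", front_ends=2, shader_cores_per_front_end=4),
    GpuConfig("8 (VEC-1)", front_ends=8, shader_cores_per_front_end=1),
]

for cfg in GC1000_CONFIGS:
    print(f"{cfg.name}: {cfg.total_shader_cores} shader cores")
```

Both layouts come out to eight shader cores. What changes is the ratio of front-ends to shaders, which is the knob the article says can be tuned per workload.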

This kind of fine-grained approach is fundamentally different from what we've seen from other manufacturers, who tend to balance pixel, shader, and other resources in one of two ways. Simpler architectures based on older GPU technology, like Tegra 4 and its predecessors, are partitioned in advance into a fixed number of pixel and vertex shaders, and the designer hopes the balance comes out right. Unified GPUs, like the mobile flavor of Kepler that'll debut next year, can be programmed for multiple tasks and allocate their resources accordingly. Vivante GPUs use a unified shader architecture and they're more granular -- which means manufacturers can eat their cake and have it too when it comes to allocating GPU resources.

Each shader core contains 16 registers that can be ganged together depending on workload. A shader can perform up to five double-precision operations per cycle per shader unit quantum. There are up to 16 shader cores per GPU core, and up to four GPU cores in a single implementation, though no one has built a Vivante core anywhere near that large at this point.

One of the advantages of this tiny, modular architecture is that you can clock the cores like gangbusters. According to Vivante, the 28nm high performance silicon variant of the Vivante architecture can clock up to 1GHz at full speed, but fall back to 1/64th of this in power saving mode, or roughly 16MHz.
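
The "roughly 16MHz" figure is simply the quoted 1GHz peak divided by 64. A quick sketch of that calculation, using only the two numbers Vivante gave (the function and constant names are made up for illustration):

```python
MAX_CLOCK_MHZ = 1000      # 1GHz peak, per Vivante's claim for 28nm high performance silicon
POWER_SAVE_DIVISOR = 64   # power saving mode runs at 1/64th of the peak clock


def power_save_clock_mhz(max_clock_mhz: float, divisor: int = POWER_SAVE_DIVISOR) -> float:
    """Return the power-saving clock in MHz for a given peak clock."""
    return max_clock_mhz / divisor


print(power_save_clock_mhz(MAX_CLOCK_MHZ))  # 15.625
```

The exact result is 15.625MHz, which the article rounds to 16MHz.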

We tested the Vivante GC1000 using a Galaxy Tab 3 8GB device. The Tab 3 is a relatively modest performer, and the GC1000 is a relatively lightweight chip in any case. Coincidentally, it turns out that the only somewhat comparable device we have on hand, an iPhone 4S, is a better fit for a head-to-head than we suspected. While the screens are different sizes and dimensions, the iPhone 4S's resolution of 960x640 works out to 614,400 pixels. The Galaxy Tab 3 7.0 has a 1024x600 display, which also works out to 614,400 pixels. Onscreen performance comparisons, in other words, are still fair game as far as the total number of pixels displayed. We've rounded up three different, graphics-centric benchmarks -- 3DMark, GLBenchmark, and Basemark X.
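
The pixel-count claim is easy to check. The snippet below multiplies out the two resolutions quoted above. The dictionary and function names are just for illustration.

```python
# Display resolutions as quoted in the article (width, height).
DISPLAYS = {
    "iPhone 4S": (960, 640),
    "Galaxy Tab 3 7.0": (1024, 600),
}


def pixel_count(width: int, height: int) -> int:
    """Total pixels on a display of the given dimensions."""
    return width * height


counts = {name: pixel_count(*res) for name, res in DISPLAYS.items()}

for name, total in counts.items():
    print(f"{name}: {total:,} pixels")
```

Both displays work out to 614,400 pixels, so an onscreen test asks each GPU to fill the same number of pixels per frame even though the aspect ratios differ.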

The iPhone 4S is a good comparison for the Galaxy Tab 3 for another reason -- both are dual-core Cortex-A9 designs. Here, the Galaxy Tab 3 even has an advantage: its cores are clocked at 1.2GHz rather than the iPhone 4S' 800MHz.

In Basemark, the iPhone 4S leads in both offscreen and on-screen tests, and by a significant margin. Neither chip is strong enough to return a smooth frame rate though.

GLBenchmark's Onscreen test shows us a smaller gap between the two chips. Again, the iPhone 4S wins out here, but the older Egypt HD test runs decently well on the Galaxy Tab 3.

T-Rex HD's offscreen test, in its 1920x1080 resolution, is extremely hard on both cores. The relative performance gaps, however, don't change much.

The three 3DMark Ice Storm tests are actually much closer than the other benchmarks we ran. The Vivante GC1000 core is almost as fast as the iPhone 4S in the "Unlimited" and standard Ice Storm tests, but does fall behind in the Extreme test.

Overall, our benchmark figures paint the Vivante GC1000 as offering much of the performance of the iPhone 4S' SGX543MP2, while still trailing it, significantly so in some tests. Still, for a company just breaking into the market, these figures aren't bad at all. This chip has real potential, once we start seeing larger versions.

Vivante's Long-Term Play, Conclusion

According to Vivante, one of the core's advantages, even when performance isn't top notch, is die size. The photo below compares the SGX544 (functionally identical to the SGX543 in the iPhone 4S) against the Vivante GC1000. While these two chips are compared at 28nm, the relative proportions of the two solutions shouldn't change at 45nm.

The die size advantage the GC1000 possesses over the SGX543 is considerable. That's not going to convince top-brand manufacturers to automatically opt for Vivante products, but it's part of the company's greater strategy to appeal to a host of smaller manufacturers.

Vivante's goal at this time isn't to compete with the top-end hardware from Nvidia, Imagination Technologies, or Qualcomm. Instead, the company has focused on creating small, low-power cores that can scale up to hit higher performance targets, but focus on economy of scale for now. The manufacturers that are using Vivante hardware -- Marvell, Freescale, Rockchip and the like -- are often smaller themselves. But with the Chinese and Indian markets growing rapidly, there's demand for products that can hit tiny die sizes.

That may be important to some of Vivante's customers. While TSMC and GlobalFoundries continue pushing smaller process nodes, companies like Marvell and Freescale cannot make the jump as quickly. Being able to implement high-efficiency cores in larger geometries could actually be part of an overall strategy to compete with high-end companies and minimize costs. Viewed from a performance-per-die area standpoint, the Vivante GC1000 is actually more efficient than the SGX543MP2 at the heart of the iPhone 4S, and that's a significant achievement for a fairly young company.

If the company continues to pick up design wins at its current rate, it could command significant market share within a few years -- and we'll start seeing more robust implementations that compete with top-tier tablets and smartphones, rather than staking out positions in the midrange.

This is one company to keep an eye on...

Content Property of HotHardware.com