|Introduction and Architectural Details|
It has been quite a while since AMD launched a truly new CPU core architecture. It was way back in September of 2003 that the first "K8" based desktop processors arrived in the form of the single-core Athlon 64 and Athlon 64 FX. And while the company has launched a slew of new desktop, server, and mobile processors since then, there haven’t been any major changes made to the base CPU architecture. AMD did make some modifications to the execution core as they introduced processors for different sockets or with different L2 cache sizes, but overall the architecture remained largely unchanged.
Today’s Athlon 64 X2 processors have quite a lot in common with those initial Athlon 64s. The first major change came when AMD built the first native dual-core X2, which was essentially a pair of Athlon 64 execution cores and a single memory controller on a single die. Then AMD surgically removed the original DDR memory controller and replaced it with a DDR2 controller with the transition to socket AM2 in May of last year. But in the almost four-year span from the launch of the initial Athlon 64 processors until today, the technology employed in the processors has been essentially the same.
September 10, 2007 is a big day for AMD, however. Today is the day AMD is officially taking the wraps off their native quad-core Barcelona-based Opteron processors. While the Barcelona core does still borrow heavily from their last-gen processors, it incorporates a number of enhancements for increased performance and power efficiency.
The new processors being launched today are the Quad-Core AMD Opteron Processor model numbers 8350, 8347, 8347 HE, 8346 HE, 2350, 2347, 2347 HE, 2346 HE, and 2344 HE. The image above illustrates the layout of the actual Barcelona die and highlights the functional blocks within the CPU. Although the processors being launched today have some different capabilities, namely their ability to be used in different multiprocessor configurations, they differ only in their complement of active HyperTransport links (visible around the edge of the die shot).
The chart above is a breakdown of the technical specifications common to all Barcelona-based Opteron processors. Each of the four cores is outfitted with 64K of L1 instruction and 64K of L1 data cache, for a total of 512K of L1 cache per CPU. The L2 cache complement of each core is 512K, for a total of 2MB. New to the Barcelona core is 2MB of dynamically shared L3 cache. Unlike the L1 and L2 caches, which are exclusive to each execution core (data in Core 1’s L2 cache cannot be accessed by Core 3, for example), the L3 cache is shared among all the cores. Also new to Barcelona-based Opterons is a 128-bit wide memory controller that can be configured as dual independent 64-bit channels to allow for simultaneous read and write memory operations.
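The cache totals quoted above can be checked with a few lines of arithmetic. The following is only a quick sketch in Python using the per-core figures from this article; the constant and function names are our own and do not come from AMD.

```python
# Cache sizes in KB, as quoted in the article for a Barcelona-based Opteron.
CORES = 4
L1_INSTRUCTION_KB = 64   # per core
L1_DATA_KB = 64          # per core
L2_KB = 512              # per core
SHARED_L3_KB = 2048      # 2MB, shared by all cores (not multiplied per core)


def cache_totals(cores=CORES):
    """Return the total L1, L2, L3 and combined cache per CPU, in KB."""
    l1_total = cores * (L1_INSTRUCTION_KB + L1_DATA_KB)
    l2_total = cores * L2_KB
    return {
        "l1_kb": l1_total,
        "l2_kb": l2_total,
        "l3_kb": SHARED_L3_KB,
        "total_kb": l1_total + l2_total + SHARED_L3_KB,
    }


if __name__ == "__main__":
    for name, size_kb in cache_totals().items():
        print(f"{name}: {size_kb} KB")
```

Note that only the L1 and L2 figures scale with the core count, since the L3 is a single pool shared by all four cores.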
At present, the new AMD Opterons will be built in AMD’s Dresden, Germany facility using the company’s 65nm SOI (silicon on insulator) manufacturing process. Each Barcelona die comprises approximately 463M transistors (about 119M fewer than Intel’s quad-core Kentsfield) and is about 285mm² in size.
Above is a list of the nine different quad-core Opterons being launched today along with their respective core and Northbridge frequencies, power consumption characteristics, and default voltages. As you can see, the highest clocked model will arrive at 2.0GHz, but with a TDP of only 95W.
|Architectural Details (Cont.)|
Since the Barcelona core, AMD's first native quad-core processor, is targeted at the data center and enterprise server markets, the positioning of this processor is squarely pitched on the almighty "performance-per-watt" metric that we've heard so much of from both camps as of late. However, although energy efficiency is a recurring theme, AMD is still planning to offer "SE" high performance models, scaling up to 2.3GHz in Q4 of this year.
Looking deeper into Barcelona's quad-core, single die architecture, we've learned that the cores themselves have been revamped considerably. Here are a few of the key points of Barcelona's new core micro-engines.
In addition, the Barcelona architecture will now support dynamic clock gating on a per-core basis. Though core voltages won't be managed independently, the clock speed of each core can throttle back when idle, providing significant power savings. And AMD's "CoolCore" technology allows for functional blocks of each core to be shut off when not in use, further improving power efficiency. You may have heard of AMD's "Dual Dynamic Power Management" technology referred to as "split power planes" in the past. We should note that to fully take advantage of AMD's "Dual Dynamic Power Management" technology, a next-gen platform must be used. Users who drop a Barcelona-based Opteron into an existing socket 1207 platform will not have support for split power planes, because current motherboards lack the necessary support.
New Acronyms - AMD ACP:
Finally, in a move reminiscent of their campaign from long ago to debunk the "megahertz myth" with processor performance rating-type model numbers, AMD is announcing a new power consumption metric called "ACP" or "Average CPU Power". AMD claims that it has historically taken a much more conservative approach to traditional TDP (Thermal Design Power) ratings, opting to list worst-case, maximum numbers versus Intel's average or "typical" rating guidelines. As a result, AMD's processor architecture, at least on paper, appeared significantly more power hungry than it was in practice.
In essence, an Opteron ACP rating of 75 watts, for example, indicates that the processor will consume 75 watts of power under typical conditions and workloads. In addition, AMD is suggesting that customers compare the ACP rating of an Opteron to the TDP rating of a competing Intel CPU. With respect to TDP, AMD notes that those "worst case" numbers are still going to be available but are not a level, fair comparison to an Intel processor TDP rating, since Intel doesn't report worst-case characterization data in their TDP listing. Though we haven't tested these claims in the lab ourselves just yet, we'll note this for future reference and report real-world metrics as they become available.
|More Details and Conclusion|
AMD is also introducing a few enhancements with the Barcelona core designed to improve virtualization performance.
The new AMD-V extensions in Barcelona add hardware support for the guest address translation that virtualization software otherwise has to handle with shadow page tables, which allows guest operating systems to have their own memory management. AMD calls this feature “nested paging” and it should dramatically decrease the amount of time virtualization software needs to spend managing shadow pages. In fact, AMD claims up to a 79% increase in normalized transactions per minute with Barcelona versus the fastest current dual-core Opterons.
What you see above is a virtual decoder ring that explains the model numbering scheme used with the new Opterons. The first digit represents the maximum number of processors supported in a single system, the second represents the generation of the processors, and the third and fourth digits are used to designate relative performance.
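As a rough illustration of that numbering scheme, here is a small Python sketch that splits a model string such as "8347 HE" into the fields described above. The class, field, and function names are hypothetical and chosen only for this example, and the trailing suffix is kept as-is without interpreting it.

```python
from dataclasses import dataclass


@dataclass
class OpteronModel:
    max_processors: int   # first digit: maximum processors in a single system
    generation: int       # second digit: processor generation
    performance: int      # third and fourth digits: relative performance
    suffix: str           # optional trailing suffix such as "HE", left uninterpreted


def decode_model(name: str) -> OpteronModel:
    """Split a four-digit Opteron model string into its fields."""
    parts = name.split()
    if not parts or len(parts[0]) != 4 or not parts[0].isdigit():
        raise ValueError(f"expected a four-digit model number, got {name!r}")
    number = parts[0]
    suffix = parts[1] if len(parts) > 1 else ""
    return OpteronModel(
        max_processors=int(number[0]),
        generation=int(number[1]),
        performance=int(number[2:]),
        suffix=suffix,
    )


if __name__ == "__main__":
    for model in ("8350", "8347 HE", "2344 HE"):
        print(model, "->", decode_model(model))
```

Decoding "8347 HE" this way gives a part for systems with up to eight processors, generation 3, with a relative performance value of 47.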
We also have a breakdown of the pricing structure of the new Barcelona-based Opterons being launched today. The least expensive model will be the 2344 HE at $209 with the flagship 2.0GHz 8350 weighing in at $1019.
Looking ahead, we see of course that AMD's desktop equivalent Phenom processor, based on the same multi-core architecture as Barcelona, is due to arrive sometime in December this year, along with the new 7XX series of chipsets. We've also heard that a Radeon refresh is due out this year as well, driven by a migration to a new manufacturing process for reduced power consumption.
Finally, in terms of future processor platforms, AMD is targeting the release of their first 45nm part in 1H08. Code-named Shanghai, this quad-core processor, we've been told, will offer a 15% per-core performance boost, clock-for-clock, and a significantly larger 6MB L3 cache supporting individual per-core 512KB caches. Beyond Shanghai we see Sandtiger, AMD's first octal-core effort that we're told is slated to arrive in a single socket MCM (Multi-Chip Module) solution utilizing HT3 high speed serial links for inter-processor communication.
And that wraps up our preliminary coverage of AMD's new native quad-core architecture called Barcelona. We hope we also gave you some insight as to what AMD has in store in the months ahead. AMD representatives told us that these new Barcelona-based Opterons will be available in the channel in volume immediately. Unfortunately we weren't able to secure a Barcelona-based system for testing and deeper analysis for you here today, but we hope AMD can hit stride with these new quad-core processors and, in addition, turn up clock speeds to compete more vigorously with Intel. At 2GHz, it's going to be very hard to keep pace with a 3GHz Clovertown-based quad-core Xeon, not to mention the 45nm Penryn-based derivatives waiting in the wings.