AMD Takes The Fight To Intel With Renewed Server Offensive
More Cores, Different Socket
Magny-Cours is a twelve-core processor built by placing two Istanbul dies side by side in a single package. Unlike previous AMD processors, Magny-Cours is an MCM (multi-chip module); the eight-core version of the chip uses the same two dies but disables four of the cores. AMD has packed a great deal of advanced power management technology into Magny-Cours that wasn't present on earlier processors, including support for C1E. The chip offers two major new performance features. First, it supports four channels of DDR3-1066 memory; using two discrete Istanbul dies allowed Sunnyvale to double available memory bandwidth. The other new feature is a fourth HyperTransport link, which enables a connection pattern AMD calls Direct Connect 2.0. Up until now, Opterons have had three HyperTransport links and connected as below:
The problem with Direct Connect 1.0 is that it isn't very efficient past two sockets. At four sockets, each processor has to pass data requests through a second chip whenever it needs data stored on the CPU that sits diagonally across from it. When it was introduced in 2003, DC 1.0 was an immense leap over Intel's FSB, but this style of connection has become increasingly antiquated. Communicating through a second processor adds latency and clutters up the HT links that connect all the CPUs together. With Direct Connect 2.0, AMD has rectified the problem. Direct Connect 2.0 doubles the amount of available memory bandwidth, allows for twelve DIMMs per socket, and puts an HT link directly between the two diagonal Opterons. AMD is a little ahead of Intel on this one, but it won't be for long; Intel's Nehalem-EX connects in a similar manner.
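To make the hop-count argument concrete, here is a minimal sketch that treats each socket as a single node and counts the worst-case number of HT hops in a four-socket system. The socket numbering and the two link lists are our own simplification of the layouts described above (a square without diagonals versus a square with them), not AMD's actual link assignments.

```python
from collections import deque

def max_hops(links, sockets):
    """Return the worst-case number of hops between any two sockets."""
    adjacency = {s: set() for s in sockets}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    worst = 0
    for start in sockets:
        # Breadth-first search gives the shortest hop count to every socket.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst

# Hypothetical socket numbering: 0-1-2-3 around a square.
SOCKETS = [0, 1, 2, 3]
SQUARE = [(0, 1), (1, 2), (2, 3), (3, 0)]   # assumed DC 1.0-style layout
FULL = SQUARE + [(0, 2), (1, 3)]            # assumed DC 2.0-style layout with diagonals

dc1_worst = max_hops(SQUARE, SOCKETS)
dc2_worst = max_hops(FULL, SOCKETS)
print(f"Worst case without diagonals: {dc1_worst} hops")
print(f"Worst case with diagonals:    {dc2_worst} hop")
```

Under these assumptions the diagonal links cut the worst case from two hops to one, which is the latency saving described above.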

Maranello uses AMD's new G34 socket, while the upcoming platform for smaller servers, with four to six cores, will use the C32 socket. Both sockets will support future Bulldozer processors: 12-16 core Bulldozer Opterons (Interlagos) will drop into G34, and 6-8 core Bulldozers (Valencia) will use C32. We're only just starting to hear whispers about Bulldozer. The whispers are encouraging, but not enough to draw conclusions from.

Two in a box, stacked in the same socket - count 'em, that's twelve
Two Cores For the Price of One: AMD's Not-So-Secret Weapon
The glue that holds this entire initiative together is AMD's new pricing model. It's no secret that Shanghai/Phenom II processors can't keep up with Intel's Core i7 architecture. We predicted AMD would counter Nehalem's superior performance and Hyper-Threading* support by banking on a higher core count, but we didn't expect the comparison to be nearly this aggressive.
Before we talk numbers, let's talk positioning. Currently, Intel offers Nehalem processors for single and dual-socket motherboards, for a maximum of twelve 32nm cores at 3.33GHz. The 1K volume price on those chips, according to Intel's spec sheet, is $1663 each, or $3326 for two. There's also a 45nm quad-core Xeon series, but those chips are only slightly cheaper, at $1600 apiece or $3200 for eight 3.33GHz cores. Intel also offers a single-socket Nehalem Xeon, at $589 for a 2.93GHz quad-core.
If you want a four-socket system, you can't buy Core i7 technology quite yet and have to make do with an older, six-core Xeon based on Core 2 technology. Those chips top out at 2.66GHz and $2729 apiece. These aren't AMD's prime competition (Dunnington is handicapped by its antiquated FSB), but we've included them for comparison. The chart below shows AMD's previous price structure and its new model. We want to note that the Intel graph isn't completely accurate; Intel's most expensive modern Xeon tops out at $2729, not $3600+. AMD may have been taking a stab at future prices for Intel's Nehalem-EX; we know that chip will carry a substantial premium, but we don't know exact pricing yet.

As the chart shows, AMD's repositioning is a game-changer. The new Opteron 6176 SE is a 12-core part at 2.3GHz. That's twice as many cores as Intel's current Westmere, for roughly 17 percent less than Intel's price (put another way, the Westmere costs about 20 percent more). If you're curious about cost-per-core, each 6176 SE core is $115.50, compared to $277.17 for each Westmere core. The Opteron processors, meanwhile, support >2P motherboards, offer more DIMM slots, and are forward-compatible. As an aside, AMD's new Magny-Cours chips annihilate Intel's Dunnington; the latter breaks down to $454.83 per core.
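For anyone who wants to check the per-core math, here is a quick sketch using the prices quoted in this article. The $1386 figure for the Opteron 6176 SE is simply the $115.50 per-core number multiplied by twelve; the dictionary labels are ours.

```python
# (price in USD at 1K volume, core count), as quoted in the article.
chips = {
    "Opteron 6176 SE": (1386, 12),    # derived: $115.50 per core x 12 cores
    "Xeon (Westmere)": (1663, 6),
    "Xeon (Dunnington)": (2729, 6),
}

def cost_per_core(price, cores):
    """Return the price of a chip divided evenly across its cores."""
    return price / cores

per_core = {
    name: round(cost_per_core(price, cores), 2)
    for name, (price, cores) in chips.items()
}
for name, value in per_core.items():
    print(f"{name}: ${value:.2f} per core")

# How much cheaper the whole Opteron chip is than the whole Westmere chip.
opteron_price = chips["Opteron 6176 SE"][0]
westmere_price = chips["Xeon (Westmere)"][0]
discount = 1 - opteron_price / westmere_price
print(f"Opteron discount vs. Westmere: {discount:.1%}")
```

The chip-level discount comes out just under 17 percent; flip the ratio around and the Westmere costs about 20 percent more than the Opteron.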
Putting It All Together: AMD's New Strengths (and Weaknesses)
AMD has created two central strengths for itself. First, it has built a server platform that can potentially outperform Intel's Westmere in several areas. Applications and workloads that are extremely parallel, require huge amounts of bandwidth, or stress maximum available RAM are all scenarios where AMD has a good chance of besting or at least tying Intel's top-end Xeons. Applications that don't scale well are the chink in AMD's armor. Westmere is clocked 45 percent higher than AMD's top-end 12-core chips and outperforms Shanghai significantly clock-for-clock. That's a one-two punch that Magny-Cours simply can't counter in certain workloads; Nehalem-based Xeons will almost certainly be the processors of choice for apps that scale poorly above six cores. Knowing that it can't match Intel's raw performance in these areas, AMD has chosen to compete on price, RAM loadout, and memory bandwidth. Combine those three factors, and the smaller CPU manufacturer can still make a case for its products, even in situations where Xeons are clearly the faster chips.
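The scaling trade-off can be roughed out with Amdahl's law, a textbook model we are borrowing here rather than anything AMD or Intel has published. The sketch below assumes performance is simply clock speed times the Amdahl speedup, and it ignores per-clock differences, Hyper-Threading, and memory bandwidth, so treat the output as illustrative only. The function names are ours.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

def modeled_throughput(clock_ghz, cores, parallel_fraction):
    """Clock-only model: assumes identical work per clock on both chips."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)

def opteron_vs_westmere(parallel_fraction):
    """Ratio > 1 means the 12-core, 2.3GHz part wins under this model."""
    opteron = modeled_throughput(2.3, 12, parallel_fraction)
    westmere = modeled_throughput(3.33, 6, parallel_fraction)
    return opteron / westmere

for p in (0.50, 0.80, 0.90, 0.95, 0.99):
    ratio = opteron_vs_westmere(p)
    print(f"parallel fraction {p:.2f}: Opteron/Westmere = {ratio:.2f}")
```

Even with this generous clock-only assumption, the twelve-core chip only pulls ahead once a little over 90 percent of the work runs in parallel. Factor in Westmere's clock-for-clock advantage and the bar moves higher still.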
Intel's upcoming Nehalem-EX won't do much to change this picture. Nehalem-EX will be even faster than Westmere, but its eight cores, quad-channel DDR3-1333, and 16 DIMMs per CPU will carry a significant price premium. If Beckton debuts anywhere near the $3600 price point AMD listed above, Sunnyvale will still be able to make a potent price/performance argument.
During the company's Q4 2009 conference call, AMD CEO Dirk Meyer predicted Sunnyvale would gain significant server market share in 2010. We were honestly dubious. With Core 2-based CPUs phasing out and Westmere coming in, it wasn't clear how AMD would boost its server share in the face of Intel's faster processors. After what we've seen today, we're optimistic. AMD's DIMM count, core density, and raw bandwidth give Opteron teeth; its pricing could make it formidable. AMD's future beyond 2010 is still misty; the company has bet the farm on Bulldozer. For this year, however, things are looking pretty good. Intel will unquestionably win in a number of performance comparisons, but AMD is in the game. We didn't have a server to test, but the benchmarks published around the Internet confirm the general analysis here. Win, lose, or draw, AMD is firmly on the field.
*It's important to keep two things in mind when evaluating the potential performance boost from Hyper-Threading. First, Hyper-Threading isn't always useful. The better a program is at making efficient use of the CPU, the smaller the benefit from Hyper-Threading. Second, Intel predicts HT will typically improve performance 15-20 percent (Atom is an exception to this rule). HT helps, but even in ideal circumstances, it's not the same as a full, physical CPU core.


