|Introduction and Specifications|
Thirty million transistors on the head of a pin. Think about that for a minute. Where on earth can you fit 30 million of anything in that amount of space? It used to be that 30 million transistors was a good-sized chip. These days, in a 45nm Hafnium-based High-K process, it almost seems like we (OK, OK, Intel...) can defy the laws of physics. We're talking rocket science here people. Actually, it's probably a bit more complex than rocket science. Titanium (Ti), Zirconium (Zr), Gallium (Ga), heck we've even heard of Rubidium (Rb), but Hafnium? Is someone at Intel just making this stuff up?
Whether you fancy yourself a scientist who can appreciate naturally occurring elements utilized in leading-edge manufacturing processes, or maybe you're a gear-head who knows four cores running at 3.2GHz is just "freakin' fast" - there is no denying that Intel is currently unstoppable when it comes to semiconductor process and manufacturing R&D. No other semiconductor company in the world is shipping anything in high volume at 45nm. That's 45 nanometers or 0.045 micron if you prefer. Sure, 45nm has been "demonstrated" by the likes of IBM, TSMC, and Chartered Semiconductor, but getting to volume is a completely different ball of wax altogether. Few companies have the resources and capital that Intel has to bring the technology to market first. And when it comes to processors composed of 800 million plus transistors, every nanometer counts.
If you're familiar with the basic chip-level architecture of Intel's 45nm Yorkfield quad-core processor that we showed you a few weeks back in our Core 2 Extreme QX9650 launch piece, then you've probably noticed that the new QX9770 model is simply a "speed bump" of sorts. It is a speed bump on a couple of levels, though, not just in core processor speed. Intel is taking the processor to 3.2GHz not by increasing the bus multiplier of the chip, but by raising the front side bus speed to 1600MHz, which in turn also provides additional system bus and memory bandwidth.
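To make the bus-versus-multiplier point concrete, here is a minimal Python sketch of the arithmetic. It assumes the FSB rating is "quad-pumped" (four transfers per base clock), and the multipliers of 8 for the QX9770 and 9 for the QX9650 are our assumptions for illustration; the function name is hypothetical.

```python
# Sketch: core clock = (FSB rating / 4) x CPU multiplier.
# Assumption: the quoted FSB speed is quad-pumped, i.e. four transfers per base clock.
FSB_TRANSFERS_PER_CLOCK = 4


def core_clock_mhz(fsb_rating_mhz: float, multiplier: float) -> float:
    """Return the core clock in MHz for a given FSB rating and multiplier."""
    base_clock_mhz = fsb_rating_mhz / FSB_TRANSFERS_PER_CLOCK
    return base_clock_mhz * multiplier


# Assumed multipliers (not stated in this article): 8 for the QX9770, 9 for the QX9650.
qx9770_mhz = core_clock_mhz(1600, 8)      # 400MHz base clock x 8
qx9650_mhz = core_clock_mhz(1333.33, 9)   # ~333MHz base clock x 9

print(f"QX9770: {qx9770_mhz:.0f}MHz")
print(f"QX9650: {qx9650_mhz:.0f}MHz")
```

Under those assumptions the faster chip actually runs a lower multiplier on a faster bus, which is why the extra clock speed comes bundled with extra bus bandwidth.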
We've published a number of articles relating to Intel's Core microarchitecture, Core 2 Duo and Extreme family of processors, Penryn, and Intel's 45nm manufacturing process in the past here at HotHardware. For more detail or a refresher on the technologies employed in these products, we suggest taking a look at the following related articles.
The Intel 45nm Fab Process and Penryn previews above are probably the most valuable material if you want to get familiar with the new technologies employed in the Yorkfield core that is at the heart of the new QX9770. These articles will give you a solid background for understanding the underlying technologies of the new Intel processor we'll show you in the pages ahead.
|X48 Chipset Sneak Peek|
As we noted previously, Intel has increased the Core 2 Extreme quad-core's clock speed by incrementally increasing front side bus speed. Along with this FSB boost, they've decided to revise the existing X38 chipset in an effort to better support the higher front side bus speed of the chip. Here's where the new X48 Express chipset comes in. That's not to say that existing X38 boards won't also support the higher 1600MHz FSB of the new Core 2 Extreme QX9770. The product will be supported on a case-by-case basis, depending on the motherboard model, manufacturer and BIOS revision. In fact, all of the benchmarks you'll see here today were taken from an X38 chipset-based motherboard from Asus (P5E3 Deluxe) and things were completely stable at stock 1600MHz FSB speeds and higher.
The Intel X48 Express chipset is based on the original X38 design, but is tweaked slightly to better support overclocking and high memory and FSB speeds. As such, it will command a premium in the channel with all manufacturers, but for those who want to push their systems and maintain better stability, the X48 will be a better choice.
As we see here, the X48 chipset is partitioned with its various high speed serial links and local and external IO exactly as the X38 chipset is. Intel also pairs the MCH Northbridge with the same ICH9R Southbridge. Dual X16 PCI Express 2.0 slots are available for graphics and those slots are also backwards compatible, auto-negotiating down to PCIe 1.1 speeds as required. The MCH still supports dual-channel DDR3 memory across up to four DIMMs; the only difference between this chipset and the X38 is that the MCH now officially supports 1600MHz memory and a synchronous 1600MHz FSB to the processor.
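For a rough sense of what those 1600MHz figures mean in bandwidth terms, the Python sketch below computes theoretical peak numbers. It assumes a 64-bit data path for both the FSB and each DDR3 channel, and a dual-channel memory configuration; the helper name is hypothetical and real-world throughput will be lower than these peaks.

```python
# Sketch: theoretical peak bandwidth = transfers/s x bytes per transfer x channels.
# Assumptions: 64-bit FSB, 64-bit DDR3 channels, dual-channel memory.


def peak_bandwidth_gb_s(transfers_mt_s: float,
                        bus_width_bits: int = 64,
                        channels: int = 1) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 1000 MB)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_s * bytes_per_transfer * channels / 1000


fsb_1333 = peak_bandwidth_gb_s(1333)                   # previous FSB speed
fsb_1600 = peak_bandwidth_gb_s(1600)                   # 12.8 GB/s
ddr3_1600_dual = peak_bandwidth_gb_s(1600, channels=2) # 25.6 GB/s

print(f"1333MHz FSB:            {fsb_1333:.1f} GB/s")
print(f"1600MHz FSB:            {fsb_1600:.1f} GB/s")
print(f"Dual-channel DDR3-1600: {ddr3_1600_dual:.1f} GB/s")
```

On paper the memory subsystem can supply more data than the front side bus can carry to the processor, which is one reason FSB speed has such a visible effect on measured memory bandwidth.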
If you compare the two boards above, you'll note that they are virtually identical in terms of layout and features. The only things that distinguish the new X48-based Asus P5E3 Premium from its X38-based sibling are the model name ("Premium" versus "Deluxe") and the newer board's blue chipset heatsink covers. However, under the hood, the new X48 Express chipset offers a bit more FSB and memory headroom. Incidentally, clock-for-clock, the X38 will perform nearly identically to the X48 chipset, with the X48 holding only a minor advantage in performance at very high overclock speeds (think the 1800MHz+ range), due to slightly better signal integrity in extreme high speed corner cases. Again, the primary benefit of the X48 chipset is stability at high bus speeds.
|Our Test Systems and SANDRA|
How we configured our test systems: When configuring our test systems for the upcoming series of benchmarks, we first entered each respective system's BIOS and set the motherboard to its "Optimized" or "High performance Defaults". We then saved the settings, re-entered the BIOS and set memory timings for either DDR2-1066 (AMD systems) with 5,5,5,15 timings or DDR3-1066 - 1600 with 7,7,7,20 timings (Intel systems). The hard drives were then formatted, and Windows Vista Ultimate was installed. When the Windows installation was complete, we updated the OS, and installed the drivers necessary for our components. Auto-Updating and Windows Defender were then disabled and we installed all of our benchmarking software, defragged the hard drives, and ran all of the tests.
We began our testing with SiSoftware's SANDRA XII, the System ANalyzer, Diagnostic and Reporting Assistant. We ran six of the built-in subsystem tests that partially comprise the SANDRA XII suite with the Core 2 Extreme QX9770 (CPU Arithmetic, Multimedia, Multi-Core Efficiency, Memory, Cache, and Memory Latency). All of the scores reported below were taken with the processor running at its default clock speed of 3.2GHz with a 1600MHz FSB and 1600MHz DDR3 memory at CL7 timings.
In terms of overall processing throughput (ALU, FPU and Multimedia SSE), the new Core 2 Extreme QX9770 is the fastest single-chip processor in SANDRA's reference table, only bested by the octal-core dual Xeon 5345 system. Other than that, the QX9770 leaves all others in its wake. Looking at the efficiency side of the equation, inter-core bandwidth and latency are also off the charts for the QX9770. The only test that sort of put a wrinkle in our brow was the Memory Bandwidth test. SANDRA's reference database shows the QX9770 and X38 chipset at 1600MHz FSB and memory speed, with slightly less bandwidth versus the P35 with 1GHz DDR3. Don't let this mislead you however, as it did us. It would seem, in order to hit that sort of memory speed, that the bus speed and multiplier of the CPU had to be altered, thus raising overall bandwidth. This test can be dramatically influenced with a higher FSB. Remember, the QX9770 is running at stock speeds and not overclocked in any of these tests. Let's move on to more "real-world" testing.
For our next round of benchmarks, we ran a few of the modules built into Futuremark's PCMark Vantage test suite. Vantage is a new benchmarking tool that we've incorporated into our arsenal of tests here at HotHardware. Here's how Futuremark positions their new benchmarking tool:
The PCMark Vantage "Memories" suite includes the following tests:
Memories 1 - Two simultaneous threads, CPU image manipulation and HDD picture import
Memories 2 - Two simultaneous threads, GPU image manipulation and HDD video editing
Memories 3 - Video Transcoding: DV to portable device
Memories 4 - Video Transcoding: media server archive to portable device
When it comes to image manipulation and video transcoding, at least according to PCMark Vantage, it's all Intel currently. Specifically, the QX9770 completely obliterates anything from AMD, save perhaps for the dual-socket Athlon 64 FX-74 3GHz system, and even there it still overtakes the best AMD has to offer in this test by about 20%, with a lot less power consumption as you'll see later. The QX9770 is only about 3% faster than the QX9650, though it has roughly a 6.7% clock speed advantage.
The Vantage HDD suite includes the following tests:
HDD 1 - HDD: Windows Defender
Since our hard disk subsystem was so similar between each of our test systems, the above results for Vantage's HDD test suite come as no surprise. Our group test systems all clocked in on top of each other in this test, largely because we used the identical hard drives across all test beds.
Vantage Communications suite includes the following tests:
Communications 2 - Three simultaneous threads. Web page rendering: open various news pages from IE 7 Favorites in separate tabs, close them one by one, Data decryption: CNG AES CBC, HDD: Windows Defender
Communications 3 - Windows Mail: Search
Communications 4 - Two simultaneous threads, Data encryption: CNG AES CBC, Audio transcoding: WMA -> WMA - to simulate VOIP
When we looked at things like data encryption and decryption throughput, as the PCMark Vantage Communications suite shows us, our results varied tremendously. When you're processing 128-bit or 256-bit AES keys, you tend to separate the men from the boys so to speak. The higher the clock speed and the greater the number of cores, the better your results. It's that simple. Both the QX9650 and QX9770 burn past all other processors in this test, but the FX-74 dual socket quad-core system puts up a solid fight, though it is still about 8% behind Intel's fastest architecture, clock for clock. The new Phenom quad-cores, with their relatively low clock speeds, simply couldn't keep up, not even with a Core 2 Duo at the same clock speed; the difference most likely being cache again, with Core 2 Duo's 4MB of L2 versus the Phenom's 2MB of L2 and 2MB of higher latency L3.
Vantage Productivity suite includes the following tests:
Productivity 2 - Two simultaneous threads, Windows Contacts: search, HDD: Windows Defender
Productivity 3 - HDD: Windows Vista start-up
Productivity 4 - Three simultaneous threads, Windows Contacts: search, Windows Mail: Run Message Rules, Web page rendering: simultaneously open various pages from IE7 Favorites in separate tabs, close them one by one
For general Windows Vista performance, the Core 2 Extreme QX9770 once again posted the fastest score of the group, edging out the QX9650 by about 5%. Again AMD's 3GHz FX-74 rig keeps pace, but the more power and cost-efficient Phenom core systems performed only at the level of their quad-core Intel counterparts at the same clock speed.
|PCMark Vantage (Continued)|
We continue our test coverage with a few more modules from the comprehensive PCMark Vantage suite of benchmarks.
Vantage TV and Movies suite includes the following tests:
TV and Movies 1 - Two simultaneous threads, Video transcoding: HD DVD to media server archive, Video playback: HD DVD w/ additional lower bitrate HD content from HDD, as downloaded from the net
TV and Movies 2 - Two simultaneous threads, Video transcoding: HD DVD to media server archive, Video playback, HD MPEG-2: 19.39 Mbps terrestrial HDTV playback
TV and Movies 3 - HDD Media Center
TV and Movies 4 - Video transcoding: media server archive to portable device, Video playback, HD MPEG-2: 48 Mbps Blu-ray playback
In the TV and Movies test suite, multi-threaded processing with a larger number of cores at work leveled the playing field a bit more. In fact, the Phenom 9700 and 9600 put up a solid performance and kept pace with a similarly clocked Core 2 Quad chip. Regardless, the new 3.2GHz Core 2 Extreme QX9770 offers the best available performance in this test, besting AMD's fastest chip by a significant 15% margin.
Vantage Music suite includes the following tests:
Music 1 - Three simultaneous threads, Web page rendering – w/ music shop content, Audio transcoding: WAV -> WMA lossless, HDD: Adding music to Windows Media Player
Music 2 - Audio transcoding: WAV -> WMA lossless
Music 3 - Audio transcoding: MP3 -> WMA
Music 4 - Two simultaneous threads, Audio transcoding: WMA -> WMA, HDD: Adding music to Windows Media Player
Processing and transcoding audio content produced results similar to the TV and Movies test, but favored pure clock speed a bit more. As you can see, a 3GHz Core 2 Duo offers better performance than a 2.4GHz Core 2 Quad, not to mention the 2.4GHz quad-core Phenom 9700. The new Core 2 Extreme QX9770 is some 38% faster than AMD's fastest single chip solution here and over 10% faster than the dual socket quad core Athlon 64 FX-74 system.
Courtesy, Futuremark: "Gaming is one of the most popular forms of entertainment for all ages. Today’s games demand high performance graphics cards and CPUs to avoid delays and sluggish performance while playing. Loading screens in games are yesterday’s news. Streaming data from an HDD in games – such as Alan Wake™ – allows for massive worlds and riveting non-stop action. CPUs with many cores give a performance advantage to gamers in real-time strategy and massively multiplayer games. Gaming Suite includes the following tests: "
Gaming 1 - GPU game test
Gaming 2 - HDD: game HDD
Gaming 3 - Two simultaneous threads, CPU game test, Data decompression: level loading
Gaming 4 - Three simultaneous threads, GPU game test, CPU game test, HDD: game HDD
The PCMark Vantage Gaming test needs little explanation; the numbers speak for themselves. This test is basically a re-run of Futuremark's 3DMark 06 engine, so the chips fall as expected. At 6% more performance versus the QX9650 and 30% faster than the next fastest chip, the AMD Phenom 9700, the new QX9770 is easily king of the hill here.
The overall PCMark Vantage score is a weighted average of all of the modules in the Vantage suite calculated in total "PCMarks". Here are the results:
The numbers do the talking here just fine. You really don't even need us to provide commentary but of course we will anyway. You want the fastest single-chip (or even multi-chip) desktop CPU on the planet right now? According to PCMark Vantage, you need to look no further than the Intel Core 2 Extreme QX9770.
|LAME MT and Sony Vegas|
In our custom LAME MT MP3 encoding test, we convert a large WAV file to the MP3 format, which is a popular scenario that many end users work with on a day-to-day basis to provide portability and storage of their digital audio content. LAME is an open-source mid to high bit-rate and VBR (variable bit rate) MP3 audio encoder that is used widely around the world in a multitude of third party applications.
LAME MT is multi-threaded but only supports up to two threads and as a result, the scores are reflective of clock speed and overall IPC throughput between the CPUs. However, on-chip cache also has a bearing on this test and as a result, the Yorkfield-based QX9650 and QX9770, with 12MB of L2, show their muscle.
Sony's Vegas DV editing software is heavily multi-threaded as it processes and mixes both audio and video streams. This is a new breed of digital video editing software that takes full advantage of current dual and multi-core processor architectures.
Finally we see AMD's Phenom gain some sort of competitive equalization against the other quad-core Intel CPUs we tested. However, the Core 2 Extreme QX9770 still thrashes the Phenom 9700, the fastest Phenom AMD can muster currently, with a 50% performance advantage.
|POV-Ray and Kribibench|
POV-Ray, or the Persistence of Vision Ray-Tracer, is a top-notch open source tool for creating realistically lit 3D graphics artwork. We tested with POV-Ray's standard included benchmarking model on all of our test machines and recorded the scores reported for each. We should also note that we used the latest 64-bit beta build of the program. Results are measured in pixels-per-second throughput.
Our POV-Ray test methodology has proven itself to be hugely lopsided toward Intel processor architecture. Again the AMD Athlon FX-74 system actually puts up a reasonable fight, going toe-to-toe with a single Core 2 Quad Q6600 at 2.4GHz, but in turn gets left in the dust by the new QX9770. The Core 2 Extreme QX9770 is 74% faster than the AMD Phenom 9700 in this test.
For this next batch of tests, we ran Kribibench v1.1, a 3D rendering benchmark produced by the folks at Adept Development. Kribibench is an SSE aware software renderer where a 3D model is rendered and animated by the host CPU and the average frame rate is reported.
If you're a CAD professional who spins wireframe models for a living, the chart above for our Kribibench test speaks volumes. The recurring trend here is that the QX9770 is simply the fastest desktop or workstation processor on the planet right now, no matter what you throw at it.
|Cinebench R10 and 3DMark06|
Cinebench 10 is an OpenGL 3D rendering performance test based on Cinema 4D. Cinema 4D from Maxon is a 3D rendering and animation tool suite used by 3D animation houses and producers like Sony Animation and many others. It's very demanding of system processor resources and is an excellent gauge of raw computational throughput.
Cinebench is perhaps our favorite "quick and dirty" test for gauging how fast a new CPU core is. If you're looking for a general quick-take view of system performance and CPU power, Cinebench consistently gives results that we rely on here in our labs. In the multi-threaded version of this test, the QX9770 is 63% faster than the Phenom 9700. And with only a 33% clock speed advantage over the new Phenom, the new Intel core is obviously significantly more efficient clock-for-clock, with a higher IPC (instructions per clock cycle) throughput.
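The clock-for-clock claim can be checked with a little arithmetic. The Python sketch below divides the measured speedup by the clock speed ratio, using the 63% figure and the 3.2GHz and 2.4GHz clocks quoted in this article; the function name is hypothetical, and the result is only a rough proxy for IPC since it assumes performance scales linearly with clock speed.

```python
# Sketch: normalize a measured speedup by the clock speed ratio.
# Assumption: performance scales linearly with clock speed, so the leftover
# ratio approximates the per-clock (IPC) advantage.


def per_clock_advantage(speedup_pct: float,
                        clock_ghz: float,
                        rival_clock_ghz: float) -> float:
    """Return the per-clock performance ratio of one chip over a rival."""
    performance_ratio = 1 + speedup_pct / 100
    clock_ratio = clock_ghz / rival_clock_ghz
    return performance_ratio / clock_ratio


# QX9770 (3.2GHz) versus Phenom 9700 (2.4GHz), 63% faster in Cinebench.
ratio = per_clock_advantage(63, 3.2, 2.4)
print(f"Clock speed advantage: {(3.2 / 2.4 - 1) * 100:.1f}%")
print(f"Per-clock ratio: {ratio:.4f}")
```

By this estimate the QX9770 does roughly 22% more work per clock cycle than the Phenom 9700 in this particular test.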
3DMark06's built-in CPU test is a multi-threaded DirectX gaming metric that's useful for comparing relative performance between similarly equipped systems. This test consists of two different 3D scenes that are processed with a software renderer that is dependent on the host CPU's performance. Calculations that are normally reserved for your 3D accelerator are instead sent to the CPU for processing and rendering. The frame-rate generated in each test is used to determine the final score.
Got game? Intel does, that's for sure. Though 3DMark 06 is a "synthetic" gaming test, its results, especially with its CPU performance module, definitely scale proportionately with real world performance. The QX9770 is 6% faster than the 3GHz QX9650, and why even bother to compare it to the Phenom 9700? Intel's new high-end CPUs are in a league of their own. Gamers get ready. We have real in-game performance data for you next.
|Gaming: Crysis and F.E.A.R.|
For our last set of benchmark tests, we moved on to some in-game benchmarking with F.E.A.R. and the full game version of the wildly popular title, Crysis from Crytek. For testing processors with Crysis or F.E.A.R., we dropped the screen resolution to 800x600, and reduced all of the in-game image quality options to their minimum values to isolate CPU and memory performance as much as possible. However, the in-game effects, which control the level of detail for such things as the game's physics engine and particle system, are left at their maximum values, since these actually do place some load on the CPU rather than the GPU.
The fastest single processor for gaming from the AMD side of the house, generally speaking according to these two tests, is the Athlon 64 X2 6400+. Again, that's according to the game engines at work in Crysis and F.E.A.R. The fastest processor of Intel's offering is obviously the QX9770, which looks to be 6 - 8% faster than its 3GHz counterpart, the QX9650. In general though, the AMD systems are easily outperformed by the Intel-based setups, in some cases by a large margin.
We have one final data point we'd like to cover before bringing this article to a close. From the perspective of performance-per-watt, we felt it was important to give you an idea of how much power each of the system configurations we tested consumed while at idle and running under load.
Please keep in mind that we were testing total system power consumption here at the outlet, not just the power being drawn by the processors alone. In this test, we're showing you a ramp-up of power from idle on the desktop to full CPU load. We tested with a combination of Cinebench R10 and SANDRA XII running on the CPU.
The best performance-per-watt metric also belongs to the new Yorkfield-based Core 2 Extreme QX9XX series of processors. Our 2.4GHz Phenom system actually consumed more power than either our QX9650 or QX9770-based systems, whether at idle or under load. We spoke of how "process kills" in the opening section of this article. A 65nm-built processor, even with roughly half the number of transistors (~450M in Phenom versus 820M in the Yorkfield core), can't compete with one built on a 45nm process. Again, think about this. AMD's Phenom consumes more power at a clock speed that is 800MHz slower, with roughly half the number of transistors on chip. Game over.
|Our Summary and Conclusion|
Intel is once again poised to bring to the world the fastest X86 processor on the planet. Though the chip is not actually available for sale just yet, we are told that a Q1'08 release is on the horizon. AMD's Phenom 9700 is slated for release during this time frame as well, as the fastest quad-core CPU the company has to offer. However, the price-point variance between the two is staggering. AMD has noted that the Phenom 9700 will hit around the $300 or less range. On the other hand, Intel's QX9650 is currently listed with an MSRP of $999, so you can expect the QX9770, when it is released, to retail for at least that much, maybe even more like the $1200 range. Putting it mildly, only those with virtually limitless budgets need apply for this level of performance.