Intel V8 Media Creation Platform - Dual Sockets - Dual Xeons
Date: Jun 13, 2007
Section:Processors
Author: Marco Chiappetta
Introduction, CPU and Motherboard Specs


It was at this year's Consumer Electronics Show that we were first exposed to an Intel demo machine dubbed the 'V8'.  Intel wouldn't label the machine as a direct response to the QuadFX platform by AMD, but they did want to stress the point that enthusiasts could have a dual-socket, eight-core Intel powered machine today if they went with a workstation platform.  At the heart of the 'V8' machine were two quad-core Xeon processors, a dual-socket S5000XVN motherboard powered by the Intel 5000X-series chipset, and 4GB of RAM.  No doubt a powerful system.  But there was only so much Intel could convey in a 'quick and dirty' demo.  To assess the true power and capabilities of the 'V8', we'd have to gain unfettered access to the machine and put it through its paces on our own terms and with our own suite of benchmarks.  And that's exactly what we've done.

Armed with a pair of quad-core, 3.0GHz Xeon X5365 processors, 4GB worth of Samsung FB-DIMMs, and the very same S5000XVN motherboard, we assembled a full system and compared its performance to a dual Athlon 64 FX-74 powered QuadFX rig and a Core 2 Extreme QX6800 powered system.  Before we show you how our version of the 'V8' performed, however, let's take some time and look at the individual pieces that make up the system's foundation.  Below are the features and specifications of the motherboard and processors used in this system, and on the following pages we'll get more intimate with the actual hardware.

Intel Xeon X5365 Quad-Core Processor
Specifications

cpu_specs.png

The Xeon X5365's features and specifications look much like the QX6800's, save for a couple of important details. At the core of the CPU is the same base architecture used in Kentsfield and Conroe.  But with the Xeon X5365, Intel has upped the FSB to 1333MHz and cranked the CPU frequency to an even 3.0GHz.  That makes these processors the fastest Core-microarchitecture based CPUs to come from Intel yet.  The higher clock speeds, however, also increase the chip's TDP to 150 watts. The Xeons use a different, 771-pin socket as well, and for now they're not widely available.  If you want to score a pair of X5365s today, about the only way to do so is to purchase a fancy new Mac Pro. Intel has plans to make these processors more widely available by the end of the year.

For a more detailed look at the technologies employed in the Xeon X5365 processor, we suggest taking a few minutes to check out our Conroe and Kentsfield launch coverage.  Between those two articles, all of the main features of the Core microarchitecture are explained in detail.

Intel Workstation Motherboard S5000XVN
Specifications

mobo_specs.png

Although the dual-socket S5000XVN motherboard is built around a chipset that isn't featured on any of the enthusiast-class motherboards we typically show you here at HotHardware, its main features should be familiar to many of you. The motherboard is powered by the Intel Chipset 5000X for Xeon processors. It has RAID support, PCI Express support, dual-Gigabit Ethernet, and High Definition audio.  Where it differs from today's desktop motherboards is in its support for fully buffered DIMMs (FB-DIMM) and in its expansion slot configuration. There are a number of other differences as they relate to the motherboard's BIOS as well, but we'll get into more detail on that a little later.

A Look At The V8 Platform

We can't very well talk about a total platform without giving you some details regarding the actual hardware.  To that end, we have some pictures of the individual components that were used to build up our Intel V8 Media Creation Platform.  In addition to the processors and motherboard, we want to give some attention to the memory and coolers used in this system.

         

The PIB heatsinks included with the Xeon X5365 processors are made completely of copper and get mounted using a quartet of screws distributed around their edges.  The fans used on the coolers throttle based on CPU activity, from a barely-audible low speed to an ear-piercing high speed that's akin to a powerful hair dryer. With these stock coolers, there is no way anyone could use the system daily and not be distracted by the noise level. There are aftermarket coolers available, however, that are significantly quieter and more tolerable.

         

The FB-DIMMs used in the system come by way of Samsung.  Four 1GB sticks of M395T2953CZ4 - CE60 PC2-5300 memory populate the motherboard.  The memory is rated for operation at 667MHz with 5-5-5 timings and it's equipped with basic, flat heat spreaders.  Despite the relatively low frequency of these FB-DIMMs in comparison to most desktop memory, we found them to run quite hot in our open-air test bench.  This is due in part to the AMB chip on board the modules, which essentially is a serial interface device for the memory.  AMB chips typically consume 3 - 5 watts of power themselves.  Multiply that by four and we're looking at up to 20 watts total required per 4GB installation (using 1GB sticks) above and beyond the memory chips themselves.
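The AMB power arithmetic above can be sketched in a few lines of Python. This is only an illustration of the multiplication described in the text; the function name and its default arguments are hypothetical, and the 3 - 5 watt per-AMB range is simply the typical figure quoted above, not something we measured.

```python
def amb_power_range(dimm_count, watts_low=3.0, watts_high=5.0):
    """Return (low, high) total AMB power in watts for dimm_count FB-DIMMs.

    The per-AMB range defaults to the typical 3 - 5 W figure quoted in
    the article; it is an assumption, not a measured value.
    """
    return dimm_count * watts_low, dimm_count * watts_high


# Four 1GB FB-DIMMs, as installed in the test system.
low, high = amb_power_range(4)
print(f"AMB overhead for 4 FB-DIMMs: {low:.0f} to {high:.0f} watts")
```

Filling all eight sockets on a board like this one would double that range, which is worth keeping in mind when sizing a power supply and planning airflow.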

The V8's need for FB-DIMMs is both a pro and a con depending on your point of view. FB-DIMM eschews the parallel bus architecture of traditional DRAMs in favor of a serial interface between the memory controller and an AMB (Advanced Memory Buffer) on each DIMM.  Using a serial interface that connects to only the AMB allows for an increase in the bus width of the memory without overcomplicating the connections between it and the memory controller.  With FB-DIMMs the memory controller does not write to the memory directly, but rather to the AMB.  And the AMB offers a number of benefits, like compensating for signal deterioration and error correction without any CPU overhead.  In addition, since each DIMM has a dedicated serial interface to work with, bus loading is not an issue and installations of up to 8 FB-DIMM sockets can be accommodated, which is something that is simply not possible with standard registered DIMMs.  The AMB does, however, introduce additional latency to the memory pipeline.  That latency can be somewhat negated with higher frequencies, but as of today FB-DIMMs aren't available at any speeds higher than 667MHz.

       

We've already given you some details regarding the processors and motherboard on the previous page, but there's still more to talk about.  As we've mentioned, the Xeon X5365 processors have four execution cores, clocked at 3.0GHz.  They use Intel's LGA771 packaging and are equipped with 8MB of L2 cache.  Using two of these processors in a single platform results in eight available cores and a total of 16MB of L2.  To use two processors in one system though, you'll need a motherboard that supports that configuration, like the S5000XVN.  This particular model of the board is the S5000XVNSAS, which as its name implies, features a Serial Attached SCSI drive controller.  The board has only one PCI Express x16 slot, so multi-GPU configurations are out.  It's also got a pair of x8 slots (with x4 electrical connections) and a pair of PCI-X slots, which do work with most standard PCI expansion cards as well.

The board is passively cooled by a trio of basic, aluminum heatsinks affixed to the chipset and VRM. Given this platform's workstation roots, the S5000XVN's large 13" x 12" PCB and layout are reminiscent of motherboards used in rack-mount servers, and it will not fit in many mid-tower (or smaller) enclosures.  And it requires both an 8-pin and 4-pin power connector (in addition to the 24-pin connector), so some power supplies are out.  The board has HD audio, but can only output 2 channels.  And all of its expansion headers and connectors are huddled together in two groups at opposite corners of the board.  Perhaps one of the more interesting features of this motherboard is that it has dual front side buses, so each CPU has its own 1333MHz FSB.
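To put the dual front side bus arrangement into rough numbers, here is a minimal Python sketch of the theoretical peak bandwidth per bus. It rests on two assumptions that are not stated in this article: that the "1333MHz" figure is the effective transfer rate (1333 million transfers per second), and that each FSB has a 64-bit data path. The function name is hypothetical, and real-world throughput will be lower than these peaks.

```python
def peak_bus_bandwidth_gb_s(mega_transfers_per_sec, bus_width_bits=64):
    """Theoretical peak bandwidth of a parallel bus in GB/s (10^9 bytes).

    Assumes one full-width transfer per effective clock; the 64-bit
    default width is an assumption for illustration.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9


per_fsb = peak_bus_bandwidth_gb_s(1333)
print(f"Peak per FSB:        {per_fsb:.2f} GB/s")
print(f"Peak, both FSBs:     {2 * per_fsb:.2f} GB/s")
```

Under those assumptions each processor gets roughly 10.7 GB/s of peak bus bandwidth to itself, rather than sharing a single bus with the other socket.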

Digging Into The System BIOS

 

The S5000XVN's system BIOS will likely consist of a mix of familiar and unfamiliar items depending on your computing background. If you've got any experience with Intel-built motherboards or servers, the main menu and general layout of the BIOS and its menus should be old hat. For desktop enthusiasts, however, the BIOS options shown here are definitely going to be out of the ordinary.

Intel Server Board S5000XVN
Exploring The BIOS

         

 

         

 

         

Most of the BIOS menus consist of standard items for configuring integrated peripherals or the many features native to Intel's processors, like Virtualization or EIST for example.  From within the S5000XVN's BIOS, users can enable or disable any of the on-board peripherals, tweak memory timings, set passwords, configure the boot sequence, etc.  There aren't any controls for tweaking voltages or front side bus frequencies though, so overclocking is out.  There is no traditional hardware monitoring functionality directly in the system BIOS either, and acoustic controls are limited to selecting how far above sea level the system is located.  Definitely not an enthusiast-class motherboard by any means.

Test Systems and SANDRA XI SP3

How we configured our test systems: When configuring our test systems for this article, we first entered their respective system BIOSes and set each board to its "Optimized" or "High performance Defaults".  The hard drives were then formatted, and Windows Vista Ultimate 64-bit Edition was installed. When the Windows Vista installation was complete, we installed the drivers necessary for our components, performed a disc clean-up, disabled UAC, and set up a 2048MB permanent page file on the same partition as the Windows installation. Lastly, we enabled Vista's AERO interface, installed all of our benchmarking software, defragged the hard drives, and ran all of the tests.

HotHardware's Test Systems
AMD & Intel Inside!

System 1:
Intel Xeon X5365 x 2
(3.0GHz - Quad Core)

Intel S5000XVN
(Intel Chipset 5000X)

4 x 1GB Samsung FB-DIMM
DDR2-667

GeForce 8800 GTX
On-Board Ethernet
On-board Audio

WD150 "Raptor" HD
10,000 RPM SATA

Windows Vista Ultimate x64
Intel Inf Drivers v8.3.0.1013
NVIDIA Forceware v158.24
DirectX 9.0c June 2007

System 2:
Intel C2E QX6800
(2.93GHz - Quad Core)

Intel D975XBX2
(Intel 975X Express)

4 x 1GB Corsair PC2-6400
DDR2-800

GeForce 8800 GTX
On-Board Ethernet
On-board Audio

WD150 "Raptor" HD
10,000 RPM SATA

Windows Vista Ultimate x64
Intel Inf Drivers v8.3.0.1013
NVIDIA Forceware v158.24
DirectX 9.0c June 2007

System 3:
AMD Athlon 64 FX-74 x 2

(3.0GHz - Dual Core)

Asus L1N64-SLI WS
(NVIDIA nForce 680a SLI)

4 x 1GB Corsair PC2-6400
DDR2-800

GeForce 8800 GTX
On-Board Ethernet
On-board Audio

WD150 "Raptor" HD
10,000 RPM SATA

Windows Vista Ultimate x64
NVIDIA nForce Drivers v9.53
NVIDIA Forceware v158.24
DirectX 9.0c June 2007

 

Preliminary Testing with SiSoft SANDRA XI SP3
Synthetic Benchmarks

We began our testing with SiSoftware's SANDRA XI, the System ANalyzer, Diagnostic and Reporting Assistant. We ran six of the built-in subsystem tests that partially comprise the SANDRA XI suite on the dual Xeon X5365 system (CPU, Multimedia, Multi-Core Efficiency, Memory, Cache, and Memory Latency). All of the scores reported below were taken with the processors running at their default clock speeds of 3.0GHz.

  

 
Xeon X5365 x 2 @ 3.0GHz: CPU Arithmetic

Xeon X5365 x 2 @ 3.0GHz: MultiMedia

Xeon X5365 x 2 @ 3.0GHz: Multi-Core Efficiency

Xeon X5365 x 2 @ 3.0GHz: Memory Bandwidth

Xeon X5365 x 2 @ 3.0GHz: Cache and Memory

Xeon X5365 x 2 @ 3.0GHz: Memory Latency

The SiSoft SANDRA results presented above are a mix of good and bad. The processor arithmetic tests that can reside completely in L2 cache allow all eight cores to run flat out, which results in some extremely high performance in those tests - nothing in SANDRA's database can touch the V8 here.  Inter-Core and Cache bandwidth are also very high with this platform, but it does falter in regard to raw memory bandwidth and latency. The V8 rig barely broke the 6GB/s mark in the bandwidth test and its access latency was the highest of the bunch.

PCMark05: CPU and Memory

For our next round of synthetic benchmarks, we ran the CPU and memory performance modules built into Futuremark's PCMark05 suite.

Futuremark PCMark05
More Synthetic CPU and Memory Benchmarks

"The CPU test suite is a collection of tests that are run to isolate the performance of the CPU. The CPU Test Suite also includes multithreading: two of the test scenarios are run multithreaded; the other including two simultaneous tests and the other running four tests simultaneously. The remaining six tests are run single threaded. Operations include, File Compression/Decompression, Encryption/Decryption, Image Decompression, and Audio Compression" - Courtesy FutureMark Corp.

 

Despite having double the number of cores at its disposal - a 100% increase - the V8 rig didn't muster a score all that much higher than the QX6800.  Technically, the V8 rig was the fastest of the bunch here, but its 210+ point margin of victory equates to only a 2.2% advantage.
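For readers who want to check percentages like these themselves, the following Python sketch shows the arithmetic. The function names are hypothetical, and since the raw scores live in the chart rather than the text, the baseline computed here is only what a margin of roughly 210 points at 2.2% would imply, not a reported result.

```python
def percent_advantage(value, baseline):
    """Percentage by which value exceeds baseline."""
    return (value - baseline) / baseline * 100.0


def implied_baseline(margin_points, advantage_pct):
    """Baseline score implied by an absolute margin and a percent advantage."""
    return margin_points / (advantage_pct / 100.0)


# Eight cores versus four is the 100% increase mentioned above.
print(f"Core count increase: {percent_advantage(8, 4):.0f}%")

# A ~210 point margin at 2.2% implies a baseline in this neighborhood.
base = implied_baseline(210, 2.2)
print(f"Implied QX6800 score: roughly {base:.0f}")
```

The contrast between the two printed figures is the point: twice the cores bought only a couple of percent in this particular test.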


"The Memory test suite is a collection of tests that isolate the performance of the memory subsystem. The memory subsystem consists of various devices on the PC. This includes the main memory, the CPU internal cache (known as the L1 cache) and the external cache (known as the L2 cache). As it is difficult to find applications that only stress the memory, we explicitly developed a set of tests geared for this purpose. The tests are written in C++ and assembly. They include: Reading data blocks from memory, Writing data blocks to memory performing copy operations on data blocks, random access to data items and latency testing."  - Courtesy FutureMark Corp.

 

Remember that higher latency we talked about a few pages back?  Well it reared its ugly head in PCMark05's memory performance module, where the dual-Xeon V8 rig couldn't keep up with the QX6800 powered system.

POV-Ray and Cinebench R9.5

POV-Ray, or the Persistence of Vision Ray-Tracer, is a top-notch open source tool for creating realistically lit 3D graphics artwork. We tested with POV-Ray's standard included benchmarking model on all of our test machines and recorded the scores reported for each.  We should also note that we used the latest 64-bit beta build of the program.  Results are measured in pixels-per-second throughput.

POV Ray Performance
Details: www.povray.org

POV-Ray is a true multi-threaded application that benefits greatly from having more CPU cores at its disposal.  In this benchmark, the V8 rig outpaced the next fastest system, the QX6800, by over 94% and it was more than twice as fast as the QuadFX system.  Wasn't there some hubbub recently over a Barcelona system hitting just over 4000 in POV-Ray?  A POV-Ray score of 4000 is so last year.

Cinebench 9.5 Performance Tests
3D Modeling & Rendering Tests

The Cinebench R9.5 benchmark is an OpenGL 3D rendering performance test, based on the commercially available Cinema 4D application. Cinema 4D from Maxon is a 3D Rendering and Animation tools suite used by many 3D Animation houses and producers like Sony Animation and many others.  And of course it's very demanding of system processor resources.

This is a multi-threaded, multi-processor aware benchmark that renders a single 3D scene and tracks the length of the entire process. The time it took each test system to render the entire scene is represented in the graph below (listed in seconds).

Interestingly enough, all three of our test platforms rendered the Cinebench R9.5 scene in 44 seconds under Windows Vista Ultimate 64-bit, using the single-threaded test.  The multi-threaded test, however, shows the true power of V8.  In the MT test, the V8 rig rendered the entire scene in only 8 seconds - that's over 42% faster than the QX6800 and 38% faster than the dual FX-74 powered QuadFX system.
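The two V8 render times quoted above (44 seconds single-threaded, 8 seconds multi-threaded) are enough to estimate how well Cinebench scaled across the eight cores. The Python sketch below uses the usual definitions of speedup and parallel efficiency; the function names are hypothetical, and because the times are reported in whole seconds the results are approximate.

```python
def speedup(single_thread_seconds, multi_thread_seconds):
    """How many times faster the multi-threaded run finished."""
    return single_thread_seconds / multi_thread_seconds


def parallel_efficiency(speedup_factor, core_count):
    """Fraction of ideal linear scaling achieved across core_count cores."""
    return speedup_factor / core_count


s = speedup(44, 8)
e = parallel_efficiency(s, 8)
print(f"Speedup on 8 cores:  {s:.1f}x")
print(f"Parallel efficiency: {e:.0%}")
```

A 5.5x speedup on eight cores works out to roughly 69% of perfect linear scaling.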

3DMark06: CPU and Kribibench

3DMark06's built-in CPU test is a multi-threaded "gaming related" DirectX metric that's useful for comparing relative performance between similarly equipped systems.  This test consists of two different 3D scenes that are generated with a software renderer that is dependent on the host CPU's performance.  This means that the calculations normally reserved for your 3D accelerator are instead sent to the central processor.  The number of frames generated per second in each test are used to determine the final score.

Futuremark 3DMark06 - CPU Test
Simulated DirectX Gaming Performance

 

Although this score is lower than the one Intel boasted about at CES (likely due to our choice of OS - Vista vs. XP), the V8 rig was still clearly superior to anything else running 3DMark06's built-in CPU benchmark. The V8 rig was almost 40% faster than the QX6800 and almost 54% faster than the QuadFX rig.

Kribibench v1.1

Details: www.adeptdevelopment.com

For this next batch of tests, we ran Kribibench v1.1, a 3D rendering benchmark produced by the folks at Adept Development.  Kribibench is an SSE aware software renderer where a 3D model is rendered and animated by the host CPU and the average frame rate is reported.

We used two of the included models with this benchmark: a "Sponge Explode" model consisting of over 19.2 million polygons and the test suite's "Ultra" model that is comprised of over 16 billion polys.

 

Kribibench, like POV-Ray, is fully multi-threaded and benefits greatly from having more CPU cores at its disposal. In both of the models that we rendered, the V8 rig was significantly faster than either of the other test systems. In fact, with the heavily taxing Ultra model, the V8 system was over twice as fast as QuadFX and 77% faster than the QX6800.

Office XP and Photoshop CS2

PC World Magazine's Worldbench 6.0 (beta 2) is a Business and Professional application benchmark.  The tests consist of a number of performance modules that each utilize one, or a group of popular applications to gauge performance.

WorldBench 6.0 (beta 2): Office XP SP1 and Photoshop CS2 Modules
Real-World Application Performance

Below we have the results from WB 6.0's Office XP SP1 and Photoshop CS2 performance modules, recorded in seconds.  Lower times indicate better performance here, so the shorter the bar the better.

 

Neither of these two tests benefits much from having more CPU cores, although some of the filters used in the Photoshop benchmark are multi-threaded.  Despite this fact, the V8 rig did put up the fastest score in the Office test.  It fell a few seconds behind the QX6800 in the Photoshop test, however.

Firefox & WME MT and WinZip 10

For our next test, we moved on to another module built into WorldBench v6.0 that's based on Windows Media Encoder. WorldBench 6's Windows Media Encoder and Firefox multi-tasking test reports encoding times in seconds, and like the tests on the previous page, lower times indicate better performance.

WorldBench 6.0: Windows Media Encoder 9 & Firefox Multi-Tasking
Digital Video Encoding

In this test, a video is encoded using Windows Media Encoder 9, while an instance of the Firefox browser is running and navigating through various cached pages in the background. Because the system is multi-tasking with two different applications, this test is more taxing than just running one instance of WME.

 

Here's one that we weren't expecting. In WorldBench's multi-tasking WME9 / Firefox test, the V8 rig was the worst performer of the bunch.  It seems memory bandwidth and latency play a larger role in this test - two areas where the V8 rig is lacking in comparison to its competition.

WorldBench 6.0: WinZip 10
File Compression

In WorldBench v6.0's WinZip 10 benchmark module, a large group of files is compressed into a single .ZIP file using the popular WinZip application.  Once again, lower times indicate better performance here.

 

The V8 rig jumped back into the pole position in WorldBench 6.0's WinZip v10 benchmark, albeit by a very small margin. In this test, it completed the workload 3 seconds faster than the QX6800 and 57 seconds faster than QuadFX.

LAME MT MP3 Encoding

In our custom LAME MT MP3 encoding test, we convert a large WAV file to the MP3 format, which is a very popular scenario that many end users work with on a day-to-day basis to provide portability and storage of their digital audio content.

LAME MT MP3 Encoding Test
Converting a Large WAV To MP3

In this test, we created our own 223MB WAV file (a never-ending Grateful Dead jam) and converted it to the MP3 format using the multi-thread capable LAME MT application in single and multi-thread modes. Processing times are recorded below. Once again, shorter times equate to better performance.

 

 

LAME MT uses only two threads in the multi-threaded test, hence the similar performance between the V8 rig and the QX6800 here.  Only one second separated the QX6800 and V8 rigs in our custom single-threaded LAME MT test, in the V8 system's favor.  In the multi-threaded test we had a similar 1 second performance delta separating the QX6800 and V8, but this time around the desktop CPU put up the better score.  In both tests, the QuadFX rig brought up the rear.

EP1 and F.E.A.R.: Low Resolutions

For our last set of game tests, we moved on to some in-game benchmarking with Half Life 2: Episode 1 and F.E.A.R. When testing processors with EP1 or F.E.A.R., we drop the resolution to 800x600, and reduce all of the in-game graphical options to their minimum values to isolate CPU and memory performance as much as possible.  However, the in-game effects, which control the level of detail for the games' physics engines and particle systems, are left at their maximum values, since these actually do place some load on the CPU rather than the GPU.

Benchmarks with HL2: Episode 1 and F.E.A.R. v1.08
DirectX 9 Gaming Performance

 

Neither F.E.A.R. nor Episode 1 benefits from multi-core processors, but they did show an interesting performance trend.  In our custom EP1 benchmark, the V8 rig trailed the competition by a large margin, probably due to its comparatively lackluster memory bandwidth and latency.  In the F.E.A.R. benchmark though, the V8 rig and QX6800 ran neck and neck, and well ahead of the QuadFX rig.

Power Consumption

We have one final data point we'd like to cover before bringing this article to a close. Our goal was to give you all an idea as to how much power each of the system configurations we tested used while idling and running under load.

Power Characteristics
Processors and Platforms

Please keep in mind that we were testing total system power consumption here at the outlet, not just the power being drawn by the processors alone.  In this test, we're showing you a ramp-up of power from idle on the desktop to full CPU load.  We tested with a combination of Cinebench R9.5 and SANDRA XI running on the CPU.

 

Our total system power consumption tests revealed an interesting trend.  For these tests, all voltages were set to 'auto' in each system's respective BIOS and Cool'n'Quiet and EIST were enabled where applicable. Our testing showed the V8 system consuming much more power than anything else while idling at the Windows desktop; almost 50W more than QuadFX and over 100W more than the QX6800.  With the processors operating under full load, however, the tables turned somewhat. Under load, the QuadFX rig - even though it had half the number of processor cores at work - consumed 24 more watts than V8.

Our Summary and Conclusion

Performance Summary: The V8 Media Creation Platform's performance varied depending on the type of application being run.  In our synthetic tests and in the rendering benchmarks, nothing came close to the V8 rig. Its 8 cores, higher core clock speeds, and 1333MHz front side bus paid huge dividends in the fully multi-threaded applications that exploit these features. In more common desktop applications and games, however, the power of the V8 platform is not fully utilized and it occasionally fell behind its lower clocked, quad-core desktop counterpart.

After spending a number of weeks experimenting with Intel's V8 Media Creation Platform, it is abundantly clear that it is an immensely powerful proposition.  Yes, it is overkill for the vast majority of people reading this.  It's very expensive, and it consumes large amounts of power (relatively speaking), but even now in an era where true multi-threaded applications aren't very widespread, there are instances where having 8 cores increases performance dramatically. And although no one in their right mind would purchase a pair of Xeon X5365 processors for high single-threaded performance, the processor's 3GHz clock speed and 1333MHz FSB make it very fast on that front as well.

Ultimately, if you ask us, Intel's V8 isn't about promoting a platform as much as it is a show of strength and a glimpse of things to come.  What V8 and QuadFX show is that both Intel and AMD are on a path to offering true, enthusiast-class, dual-socket platforms. And that's a good thing.  Perhaps AMD is a little further down the path thanks to a more tweaker-friendly motherboard in the QuadFX-compatible Asus L1N64-SLI WS, but until consumers have more motherboards to choose from and perhaps quad-core processors from AMD, we can't very well declare that the time for QuadFX has arrived. One motherboard does not a platform make.

V8 in its current incarnation does have deficiencies and the idea of spending upwards of $3800 on two processors, a motherboard, and memory isn't exactly enticing, but this doesn't detract from the Intel V8 Media Creation Platform's immense power.  It would be virtually impossible to build something similar to V8 that can match its multi-threaded performance.

  • Very Powerful
  • Great MT Performance
  • Good Performance Scaling
  • Stable Platform
  • Expensive
  • Loud
  • High Power Consumption
  • FB-DIMMs


Content Property of HotHardware.com