AMD'S HECTOR RUIZ
WITH AN ATHLON 64
It has been about two
years since AMD first divulged information about their "K8"
architecture, also known as the "Hammer", at the
Microprocessor Forum in 2001. At the time, AMD was
having much success with their "K7" line of processors.
Enthusiasts and industry analysts were eager to see just what
AMD could do with their next generation processor
architecture. AMD was no longer following in Intel's
footsteps. They were introducing new technology in an
effort to become an industry leader and innovator, rather
than just a "me too" player. AMD's break-away
technology initiative resulted in the Athlon, which, as you
probably know, was AMD's most successful line of
microprocessors to date. In the early days of the
Athlon, who would have thought AMD could make such a
significant dent in Intel's market share? Home and
Enterprise level consumers rejoiced. Finally, there
was real competition for Intel's Pentium. This rivalry
could only result in better technology, at faster design
cycles, with lower prices. The future was bright for
Personal and Enterprise computing and it's still getting
brighter, here in late 2003.
The "K8" architecture, which
has evolved into the Opteron and now the Athlon 64 line of
CPUs, is a significantly more radical departure from
traditional x86 architectures. Opterons, Athlon 64s
and Athlon 64 FXs would be AMD's first microprocessors built
using 0.13 micron SOI (Silicon-on-Insulator) technology,
which ideally would allow for higher clock speeds with lower
thermal characteristics. AMD also planned on pulling
the memory controller out of the Northbridge block and
incorporating it into the processor's die to reduce latency,
which in turn would increase performance even further.
AMD then decided to execute the boldest move the
industry has seen to date in x86 computing. As the
Athlon 64's branding suggests, AMD's new Athlon would be
designed from the ground up as a native 64-bit machine with
the capability to also run in 32-bit mode. Around the
time AMD introduced the Opteron, Intel scoffed at the
idea, stating that 64-bit computing would not be required for
at least a year down the road. However, AMD decided
to make 64-bit computing a reality for the desktop PC today,
with the introduction of the Athlon 64 and Athlon 64
FX-51.
A host of other enhancements
were implemented as well, culminating in the product we'll
be looking at today, AMD's new flagship desktop CPU, the
Athlon 64 FX-51. The Athlon 64 FX-51 is a 2.2GHz
processor, targeted squarely at gamers and enthusiasts, who
need the absolute fastest machine available, at almost any
cost. The mainstream Athlon 64 3200+ also debuts today
at 2.0GHz, with a price tag that will put it within reach of
a much larger audience.
THE AMD ATHLON 64 FX-51: UP
CLOSE & PERSONAL
Features & Specifications of the AMD Athlon 64 FX
and Athlon 64
Source: AMD
AMD64:
When utilizing the
AMD64 Instruction Set Architecture, 64-bit mode is
designed to offer:
- Support for 64-bit
operating systems to provide full, transparent, and
simultaneous 32-bit and 64-bit platform application
multitasking.
- A physical address
space that can support systems with up to one
terabyte of installed RAM, shattering the 4 gigabyte
RAM barrier present on all current x86
implementations.
- Sixteen 64-bit
general-purpose integer registers that quadruple the
general purpose register space available to
applications and device drivers.
- Sixteen 128-bit XMM
registers for enhanced multimedia performance to
double the register space of any current SSE/SSE2
implementation.
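The address-space and register claims in the AMD64 list above can be checked with some quick arithmetic. The sketch below assumes a 40-bit physical address width (which is what one terabyte implies) and takes the pre-AMD64 x86 baseline to be eight 32-bit general-purpose registers and eight 128-bit XMM registers. Those baseline counts are not stated in AMD's list, so treat them as assumptions.

```python
# Quick arithmetic behind the AMD64 feature list.
# Assumptions (not stated in the list itself): 40 physical address bits on
# AMD64, and a legacy x86 baseline of eight 32-bit general-purpose registers
# plus eight 128-bit XMM registers.

GIB = 2**30  # bytes in one gibibyte
TIB = 2**40  # bytes in one tebibyte

legacy_addressable = 2**32  # bytes reachable with 32 address bits
amd64_addressable = 2**40   # bytes reachable with 40 address bits

legacy_gpr_bits = 8 * 32    # total general-purpose register space, legacy x86
amd64_gpr_bits = 16 * 64    # total general-purpose register space, AMD64

legacy_xmm_bits = 8 * 128   # total XMM register space, legacy SSE/SSE2
amd64_xmm_bits = 16 * 128   # total XMM register space, AMD64

print("Legacy physical limit:", legacy_addressable // GIB, "GiB")
print("AMD64 physical limit: ", amd64_addressable // TIB, "TiB")
print("GPR space ratio:", amd64_gpr_bits // legacy_gpr_bits)
print("XMM space ratio:", amd64_xmm_bits // legacy_xmm_bits)
```

Under these assumptions the ratios come out to four and two, matching the "quadruple" and "double" wording in the list.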
Integrated DDR memory controller:
- Allows for a
reduction in memory latency, thereby increasing
overall system performance.
An
advanced HyperTransport link:
- This feature
dramatically improves the I/O bandwidth, enabling
much faster access to peripherals such as hard
drives, USB 2.0, and Gigabit Ethernet cards.
- HyperTransport
technology enables higher performance due to a
reduced I/O interface throttle.
Large level 1 (L1) and level 2 (L2) on-die cache:
- With 128 Kbytes of
L1 cache and 1 Mbyte of L2 cache, the AMD Athlon 64
processor is able to excel at performing matrix
calculations on arrays.
- Programs that use
intensive large matrix calculations will benefit
from fitting the entire matrix in the L2 cache.
64-bit
processing:
- A 64-bit address
and data set enables the processor to process in the
terabyte space.
- Many applications
improve performance due to the removal of the 32-bit
limitations.
Processor core clock-for-clock improvements:
- Including larger
TLB (Translation Look-Aside Buffers) with reduced
latencies and improved branch prediction through
four times the number of bimodal counters in the
global history counter, as compared to
seventh-generation processors.
- These features
drive improvements to the IPC, by delivering a more
efficient pipeline for CPU-intensive applications.
- CPU-intensive games
benefit from these core improvements.
- Introduction of the
SSE2 instruction set, which along with support of
3DNow! Professional, (SSE and 3DNow! Enhanced)
completes support for all industry standards.
- 32-bit instruction
set extensions.
Fab location: AMD's Fab 30 wafer
fabrication facility in Dresden, Germany
Process Technology:
0.13 micron SOI (silicon-on-insulator) technology
Die Size: 193mm²
Transistor count: Approximately 105.9
million
Nominal Voltage: 1.50V
ATHLON 64 FX-51
ATHLON 64
Today,
AMD is taking the wraps off two new desktop processors, the
flagship Athlon 64 FX-51 and their new performance /
mainstream CPU, the Athlon 64 3200+. The FX-51 debuts
at 2.2GHz, while the Athlon 64 3200+ arrives clocked at
2GHz. The differences don't stop there, however.
As the chart above indicates, the Athlon 64 FX-51 uses a
940-pin package, similar to AMD's Opteron, while the Athlon
64 3200+ uses a 754-pin package. The Athlon 64 FX-51
also has a memory controller that is twice as "wide" as the
3200+'s: 128 bits vs. 64 bits, respectively. The Athlon
64 FX-51 also requires registered memory to function,
whereas the Athlon 64 3200+ can use standard unbuffered DDR
memory. Registered memory uses an additional "buffer"
that isolates memory chip load from the memory controller,
which allows for the use of more DIMMs. ECC memory has
extra bits of storage that help identify and
repair errors, hence "ECC" (Error Checking and
Correction). Please don't confuse registered memory
with ECC though. ECC and registered memory types are
totally different animals. It's possible to buy memory
that is registered, but not ECC, or vice versa.
Something the chart does not show is the packaging material
used for each CPU. In its current form, the FX-51 is
housed in ceramic packaging material, à la the Thunderbird.
The Athlon 64 3200+ is using organic packaging like the
current generation of Athlon XPs. These processors do
share many features and enhancements, which is why you're
here reading about their release today...
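To put the 128-bit versus 64-bit memory controller widths mentioned above into perspective, here is a rough peak-bandwidth calculation. It assumes both chips are paired with DDR400 memory (a 200MHz clock with two transfers per clock); the memory speed is our assumption for illustration, and the function name is made up for this sketch.

```python
# Theoretical peak memory bandwidth for a given bus width.
# Assumption: DDR400 memory (200MHz clock, two transfers per clock).

def peak_bandwidth_gb_s(bus_width_bits: int, transfers_per_second: float) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second / 1e9

DDR400_TRANSFERS = 400e6  # assumed: 400 million transfers per second

fx51_bandwidth = peak_bandwidth_gb_s(128, DDR400_TRANSFERS)    # Athlon 64 FX-51
a64_3200_bandwidth = peak_bandwidth_gb_s(64, DDR400_TRANSFERS)  # Athlon 64 3200+

print(f"128-bit controller: {fx51_bandwidth:.1f} GB/s")
print(f"64-bit controller:  {a64_3200_bandwidth:.1f} GB/s")
```

These are theoretical ceilings only; real-world throughput will be lower, and the doubling holds only if both sides of the wider controller are populated.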
AMD PROCESSOR
COMPARISON CHART
AMD64 - 64-bit Processing:
The Athlon 64s, like the Opteron, have the ability to run
64-bit operating systems through the use of a new set of
extensions to the x86 ISA (Instruction Set Architecture).
With the 64-bit Itanium, Intel introduced the IA-64 ISA,
which has its advantages, but one major caveat with
introducing a new ISA, and microprocessors that use the new
instruction set, is that they are not natively compatible
with x86 code. AMD took a much different approach to
64-bit computing. They simply extended the x86 ISA to
support 64-bit memory addressability. This makes the Athlon
64 natively compatible with current x86 code, while giving
it support for 64-bit applications going forward. Due
to the fact that the Athlon 64 can run two different types
of code, x86 and AMD64, the CPU operates in two different
modes dubbed "legacy mode" and "long mode". In legacy
mode, the Athlon 64 natively runs all 16-bit or 32-bit x86
applications. In long mode, which requires a 64-bit
AMD64 compliant operating system, the Athlon 64 will enjoy
all of the benefits of 64-bit computing. Long mode
also has a compatibility sub-mode that allows the running of
32-bit applications with a 64-bit operating system.
The Athlon 64's ability to run all these different types of
code makes it a very versatile processor.
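The relationship between operating system, application, and CPU mode described above can be summarized in a few lines of code. This is only a sketch of the rules as stated in this article: the `CpuMode` and `operating_mode` names are made up for illustration, and "64-bit sub-mode" is our label for long mode running native 64-bit code.

```python
from enum import Enum


class CpuMode(Enum):
    LEGACY = "legacy mode"
    LONG_64BIT = "long mode, 64-bit sub-mode"
    LONG_COMPATIBILITY = "long mode, compatibility sub-mode"


def operating_mode(os_bits: int, app_bits: int) -> CpuMode:
    """Pick the CPU mode for an OS/application pairing, per the article's rules."""
    if os_bits not in (16, 32, 64) or app_bits not in (16, 32, 64):
        raise ValueError("bit widths must be 16, 32, or 64")
    if app_bits > os_bits:
        raise ValueError("an application cannot be wider than its operating system")
    if os_bits < 64:
        # 16-bit and 32-bit x86 operating systems run in legacy mode.
        return CpuMode.LEGACY
    if app_bits == 64:
        return CpuMode.LONG_64BIT
    if app_bits == 32:
        # 32-bit application under a 64-bit AMD64 operating system.
        return CpuMode.LONG_COMPATIBILITY
    # The article does not cover 16-bit applications under a 64-bit OS.
    raise ValueError("case not covered by this sketch")


print(operating_mode(32, 32).value)
print(operating_mode(64, 64).value)
print(operating_mode(64, 32).value)
```

The point to take away is that the operating system decides between legacy and long mode, while the application only selects the sub-mode within long mode.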
Integrated DDR Memory Controller:
One of the Athlon 64's major new
performance-enhancing features is its integrated
memory controller. With most current processors, the
Northbridge houses the memory controller, which communicates
with the CPU via the Front Side Bus (FSB). With the
Athlon 64, the memory controller is now on the processor's
die, which means memory traffic no longer has to travel out
of the CPU to the chipset and back. Because the memory
controller is now integrated into the CPU, it will run at
the same speed as the host processor. This type of
configuration drastically reduces latency, which should
yield significant performance gains. One drawback of
having the memory controller integrated into the processor's
die is that supporting emerging memory technologies, like
DDR2, requires a redesigned controller and therefore
a new processor.
An
Advanced HyperTransport Link:
AMD has also
replaced aging chip-to-chip
interconnects with their HyperTransport technology. Today's
fastest desktop processors interface with the motherboard's
chipset, and subsequently the memory, AGP bus, etc.,
through the FSB at 200MHz (400MHz effective with the Athlon
XP, 800MHz effective with the Pentium 4). The Athlon
64s, however, are equipped with a HyperTransport link that
operates at up to 800MHz DDR (1600MHz effective). When
operating at top-speed, a single HyperTransport link offers
a maximum of 6.4GB/s of bandwidth.
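The 6.4GB/s figure quoted above follows from the link's transfer rate if we assume a link that is 16 bits wide in each direction and count both directions together. The link width is not given in the text above, so treat it as an assumption of this sketch; the function name is also made up for illustration.

```python
# Peak HyperTransport bandwidth from clock speed and link width.
# Assumptions: a 16-bit link in each direction, double data rate signaling,
# and an aggregate figure that sums both directions.

def hypertransport_bandwidth_gb_s(clock_mhz: float,
                                  link_width_bits: int = 16,
                                  directions: int = 2) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    transfers_per_second = clock_mhz * 1e6 * 2  # two transfers per clock (DDR)
    bytes_per_transfer = link_width_bits / 8
    return transfers_per_second * bytes_per_transfer * directions / 1e9

per_direction = hypertransport_bandwidth_gb_s(800, directions=1)
aggregate = hypertransport_bandwidth_gb_s(800)

print(f"Per direction: {per_direction:.1f} GB/s")
print(f"Aggregate:     {aggregate:.1f} GB/s")
```

Under these assumptions, traffic flowing in one direction only tops out at half the aggregate figure.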
Large
L1 & L2 On-Die Cache:
In February of this year, AMD released Athlon XPs based on
the "Barton" core, with double the amount of on-die L2 cache
as the older "Thoroughbred" core. The Bartons have
512KB of full-speed L2 cache versus the Thoroughbred's 256KB.
The Athlon 64s take things a step further with a full 1MB
(1024KB) of on-die L2 cache. This added cache should
provide a boost in performance, especially in applications
where large amounts of data are being moved between the processor
and main system memory. With twice the L2 cache of the
Barton based Athlon XPs, the new Athlon 64 core can run a
larger chunk of code out of its on-chip cache resources,
versus having to fetch it from system memory. A side
effect of having this much L2 cache is that the Athlon 64
now has a die size of 193mm², almost twice the
size of the Athlon XP. With a die this large, the
Athlon 64 is going to be expensive to produce. AMD
claims that when they move to a 90nm (0.09-micron)
manufacturing process next year, the corresponding die
shrink will bring the die size on a comparable chip down to
a much more palatable 120mm².
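For a sense of how AMD's 120mm² estimate compares with a textbook die shrink, the sketch below scales the 193mm² die by the square of the feature-size ratio. Perfect area scaling is an idealization that we are assuming purely for comparison; real shrinks typically fall short of it because not every structure on the die scales with the process.

```python
# Ideal die-area scaling for a process shrink.
# Assumption: area scales with the square of the feature size, which is an
# idealization and not something AMD has claimed.

def ideal_shrink_area(area_mm2: float, old_node_nm: float, new_node_nm: float) -> float:
    """Return the die area after a perfect linear shrink between two nodes."""
    linear_scale = new_node_nm / old_node_nm
    return area_mm2 * linear_scale ** 2

CURRENT_AREA_MM2 = 193.0  # Athlon 64 die at 130nm (0.13 micron)
AMD_ESTIMATE_MM2 = 120.0  # AMD's stated figure for a comparable 90nm chip

ideal_area = ideal_shrink_area(CURRENT_AREA_MM2, 130, 90)

print(f"Ideal 90nm area:  {ideal_area:.1f} mm^2")
print(f"AMD's estimate:   {AMD_ESTIMATE_MM2:.1f} mm^2")
print(f"Estimate vs 130nm die: {AMD_ESTIMATE_MM2 / CURRENT_AREA_MM2:.0%}")
```

The ideal figure lands noticeably below 120mm², which suggests AMD's number is a conservative one rather than a best-case projection.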
Larger TLBs, Better Branch Prediction, More Counters:
The Pentium 4 has taken a lot of flak because its deep
20-stage pipeline was less efficient than the Athlon XP's
10-stage pipeline. The deep pipeline is part of what
allowed the Pentium 4 to reach such high clock speeds, but
it is also why an Athlon, clocked much lower
than a P4, can perform at similar levels.
Clock-for-clock, the Athlon XP can handle more
instructions. With the Athlon 64, AMD has deepened the
processor's pipeline to 12 stages, which you'd think would
lower the core's IPC (Instructions Per Clock).
However, thanks to some core architectural improvements, it
hasn't. The Athlon 64 has larger Translation
Look-Aside Buffers (TLB), with improved latency and improved
branch prediction. The Athlon 64 has quadruple the
number of bimodal counters in its global history counter,
when compared to the Athlon XP. All this technical
jargon means that at similar clock speeds, even though it
has a deeper pipeline, an Athlon 64 should outperform an
Athlon XP in most circumstances. Later on, you'll see
we tested an Athlon XP 3200+, alongside the Athlon 64 FX-51,
and with both processors clocked at 2.2GHz, the FX-51 was
clearly a much faster chip. AMD's efforts to increase
the Athlon 64's IPC seem to have paid off nicely.
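For readers wondering what a "bimodal counter" is and why having more of them helps, here is a toy model. A bimodal counter is a 2-bit saturating counter that remembers whether a branch was recently taken. The sketch below is a generic textbook predictor indexed by branch address, not a model of AMD's actual design, and the class name, table sizes, and branch addresses are all made up for illustration.

```python
# Toy bimodal branch predictor: a table of 2-bit saturating counters.
# This is a generic textbook scheme, NOT AMD's actual predictor design.

class BimodalPredictor:
    def __init__(self, num_counters: int) -> None:
        self.num_counters = num_counters
        self.counters = [1] * num_counters  # 0-1 predict not taken, 2-3 predict taken

    def _index(self, branch_address: int) -> int:
        return branch_address % self.num_counters

    def predict(self, branch_address: int) -> bool:
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address: int, taken: bool) -> None:
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredictions(predictor: BimodalPredictor, trace) -> int:
    """Run (address, taken) pairs through the predictor and count misses."""
    misses = 0
    for address, taken in trace:
        if predictor.predict(address) != taken:
            misses += 1
        predictor.update(address, taken)
    return misses


# Hypothetical trace: one branch that is always taken, one that never is.
trace = [(0x10, True), (0x14, False)] * 100

small_table_misses = count_mispredictions(BimodalPredictor(4), trace)
large_table_misses = count_mispredictions(BimodalPredictor(16), trace)

print("Mispredictions with 4 counters: ", small_table_misses)
print("Mispredictions with 16 counters:", large_table_misses)
```

With the small table, both branches land on the same counter and keep overwriting each other's history. With four times as many counters they get separate entries, and the mispredictions all but disappear, which is the general idea behind growing the table.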
Supporting Hardware & Chipsets