AMD x86-64 Hammer

by Ian Tay

This preview of the AMD Hammer is based on information released by AMD and is still subject to change. I will mainly discuss the hardware side of this microprocessor and the ISA used.

The AMD K8 x86-64 Hammer is, as the name suggests, designed and manufactured by AMD. It is known as the K8 because it is AMD's 8th-generation microprocessor. The name x86-64 means that it uses the x86 Instruction Set Architecture (ISA) with 64-bit extensions.

Let's start off with the basics of the x86-64 ISA. x86-64 is essentially the x86 ISA with improvements made. The main problems with the x86 ISA are that it has too few General Purpose Registers (GPRs), 8 to be exact, and that it has a stack-based floating point unit. x86-64 adds 8 more GPRs and extends all GPRs to 64 bits. There is also a new 16-register IEEE-standard floating point unit. x86-64 also has 8 new registers for Streaming Single Instruction Multiple Data Extensions (SSE), twice the number Intel offers, and a 64-bit instruction pointer. A very attractive feature is that it can run legacy 16-bit or 32-bit applications and operating systems without recompilation and without compromising performance. This makes the transition from 32 to 64 bits smooth, as companies will not have to worry about slowdowns in existing software. The downside is that when running 32-bit code, the processor will not be able to use the features found in 64-bit mode.

Intel has chosen a different approach to 64-bit processors. Intel's Itanium uses EPIC (Explicitly Parallel Instruction Computing) and thus needs software to be recompiled. The Itanium can run 32-bit or 16-bit applications, but much more slowly. Tests have shown that Itanium runs existing x86 code as slowly as a 486 processor.

Why do people need 64-bit processing, you might ask. With a 64-bit processor, you can address more than the current limit of 4GB of physical memory (up to 4.5 petabytes). [Ed: I will assume 1 petabyte is 1000 terabytes which is 1000 gigabytes which is 1000 megabytes? Care to explain, Ian?] This is needed in large-scale processing, database management and CAD. All of these will benefit from the additional GPRs and 64-bit addresses.
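The 4.5 petabyte figure can be checked with a little arithmetic. The sketch below assumes it corresponds to a 52-bit physical address space; that width is an assumption made here to reproduce the number and is not stated in the text. The function name is illustrative only.

```python
# Address-space sizes for a given number of address bits.
# ASSUMPTION: the 4.5 petabyte figure corresponds to 52 physical address bits.

def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with address_bits bits."""
    return 2 ** address_bits

limit_32 = addressable_bytes(32)  # the familiar 4GB limit of 32-bit addressing
limit_52 = addressable_bytes(52)  # assumed physical address width

print("32-bit:", limit_32 / 2**30, "GB (1 GB = 1024 MB)")
print("52-bit:", limit_52 / 1000**5, "PB (1 PB = 1000 TB)")
print("52-bit:", limit_52 / 2**50, "PB (1 PB = 1024 TB)")
```

Counted in powers of 1000 the 52-bit space comes to roughly 4.5 petabytes, while counted in powers of 1024 it is exactly 4, so the figure depends on which definition of a petabyte is used.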

Since AMD is focusing on compatibility with existing applications, speed is their top priority. As such, the Hammer has to have powerful hardware features. The Hammer will compete with the Itanium in the server market and with the Pentium 4 in the desktop and mobile segments. Intel plans on releasing the Pentium 4 for laptops.

AMD has chosen to build on their reliable K7 Athlon architecture. The K8 adds 2 stages to the execution pipeline, which brings it to 12 stages. This is to leave room for higher clock frequencies. The Pentium 4 has 20 stages, which allows it to scale to high frequencies but at the cost of IPC (Instructions Per Clock cycle). The Pentium 4 will therefore process far fewer instructions per clock cycle than the Hammer. Even the K7 and P6 achieve higher IPC than the Pentium 4.
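The trade-off between pipeline depth and clock speed comes down to the fact that throughput is roughly IPC multiplied by clock frequency. The sketch below uses made-up IPC values purely for illustration; they are not measurements of any of the processors mentioned, and the function names are hypothetical.

```python
# Rough throughput model: instructions per second = IPC * clock frequency.
# ASSUMPTION: all IPC and clock values below are hypothetical illustrations.

def instructions_per_second(ipc: float, clock_hz: float) -> float:
    """Approximate sustained throughput of a core."""
    return ipc * clock_hz

def break_even_clock(ipc_a: float, clock_a_hz: float, ipc_b: float) -> float:
    """Clock that design B needs in order to match design A's throughput."""
    return ipc_a * clock_a_hz / ipc_b

# A shorter-pipeline design with higher IPC versus a deeper-pipeline design.
short_pipeline = instructions_per_second(ipc=1.5, clock_hz=2.0e9)
long_pipeline = instructions_per_second(ipc=1.0, clock_hz=2.0e9)
needed = break_even_clock(ipc_a=1.5, clock_a_hz=2.0e9, ipc_b=1.0)

print("short pipeline:", short_pipeline, "instructions/s")
print("long pipeline: ", long_pipeline, "instructions/s")
print("clock needed by the long pipeline to catch up:", needed / 1e9, "GHz")
```

In this toy model the deeper pipeline only pays off if it can be clocked proportionally higher than the shorter one.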

AMD has decided to stick with three FPUs, three AGUs and three ALUs, like the Athlon. AMD has also decided to integrate the memory controller onto the CPU for lower DRAM latency. Doing so will ensure that the execution units are kept filled more often and make the processor more efficient, thereby increasing IPC. Hammer will use DDR SDRAM such as PC1600 (200MHz), PC2100 (266MHz) and PC2700 (333MHz). Though PC2700 RAM does not have as much bandwidth as dual-channel PC800 (400MHz) Rambus memory, it has lower latency. The Hammer will feature a 128-bit memory bus, which will give the RAM twice the bandwidth of a 64-bit bus. The Pentium 4, in comparison, has a 64-bit memory bus. By integrating the memory controller, AMD has removed the need for a North Bridge. Hammer will feature AGP 8X on an external controller.
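The effect of the bus width on peak bandwidth is simple arithmetic: transfers per second multiplied by bytes per transfer. A minimal sketch, using the effective data rates quoted above and a hypothetical helper function:

```python
# Peak memory bandwidth = effective transfer rate * bus width in bytes.
# The DDR data rates are the nominal figures quoted in the article.

def peak_bandwidth_mb_s(transfers_mhz: int, bus_bits: int) -> int:
    """Theoretical peak bandwidth in MB/s for a given data rate and bus width."""
    return transfers_mhz * bus_bits // 8

modules = [("PC1600", 200), ("PC2100", 266), ("PC2700", 333)]

for name, rate in modules:
    narrow = peak_bandwidth_mb_s(rate, 64)    # conventional 64-bit bus
    wide = peak_bandwidth_mb_s(rate, 128)     # 128-bit bus as planned for Hammer
    print(f"{name}: {narrow} MB/s on 64 bits, {wide} MB/s on 128 bits")
```

The module names are rounded versions of the 64-bit figures, and the 128-bit bus doubles each of them. These are theoretical peaks; sustained bandwidth in practice is lower.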

The Hammer will have an 800MHz FSB, and this will increase with the clock speed because the memory controller is integrated. The Pentium 4 Willamette has a 400MHz FSB and the Pentium 4 Northwood will have a 533MHz FSB. The faster bus will also increase IPC, as the execution units will be filled faster.

Interestingly, the Hammer will have dual cores on a single CPU. This will make chips cheaper to produce, but it may only be available for the server segment.

The Hammer will have SSE2 optimisations which will allow it to perform as well as the Pentium 4 on heavily optimized SSE2 software.

Hammer will feature HyperTransport technology, which will allow up to 19.2GB/s of I/O bandwidth. AMD may also implement 3GIO technology, on which it is partnering with Intel.

The Hammer will be manufactured using a 0.13-micron process with SOI (Silicon On Insulator), and AMD has already ordered the SOI wafers. SOI helps prevent electrical current from leaking from the transistors. The 0.13-micron manufacturing process will make the die smaller and cheaper to produce.

Hammer will feature a Translation Lookaside Buffer (TLB) which is supposed to be more efficient than Intel's Trace Cache. The TLB can handle "large workloads" more efficiently than the Trace Cache, especially when running many programs at once. A TLB is essentially a cache of recent virtual-to-physical address translations rather than a branch predictor.

Hammer will feature 1MB of L2 cache, which will be 16-way set associative.
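To see what 16-way associativity means for a 1MB cache, here is a rough sketch of how such a cache would be divided into sets. The 64-byte line size is an assumption made for illustration and is not stated in AMD's information; the function names are hypothetical.

```python
# Geometry of a set-associative cache.
# ASSUMPTION: 64-byte cache lines (not stated in the article).

L2_SIZE_BYTES = 1024 * 1024  # 1MB of L2 cache
WAYS = 16                    # 16-way associative
LINE_BYTES = 64              # assumed line size

def cache_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets = total size / (lines per set * bytes per line)."""
    return size_bytes // (ways * line_bytes)

SETS = cache_sets(L2_SIZE_BYTES, WAYS, LINE_BYTES)

def set_index(address: int) -> int:
    """Set that a physical address maps to; it may sit in any of the WAYS lines there."""
    return (address // LINE_BYTES) % SETS

print("sets:", SETS)
print("address 0x12345 maps to set", set_index(0x12345))
```

Under this assumption, each address can be cached in any of 16 places within its set, which reduces conflicts between addresses that happen to map to the same set.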

AMD will call the server variant SledgeHammer and the desktop and mobile variant ClawHammer. SledgeHammer is slated to be released for testing in mid-2002. Northwood should come out just before that.

AMD hinted at a SPECint score of about 1400. The 2GHz Pentium 4 has a score of about 550 and the Athlon about 650. The Hammer will debut at about 2.5-3GHz.

Links

AMD Website
