The HyperTransport Bus Used By AMD Processors
By Gabriel Torres on August 30, 2007


Introduction

Processors based on the AMD64 architecture – such as Athlon 64, Athlon 64 X2, Athlon 64 FX, Opteron, Sempron and Phenom – have two external busses. One handles communication between the CPU and the memory and is simply called the “memory bus”. The other, called HyperTransport, is an I/O (Input/Output) bus that handles communication between the CPU and all other PC components through the motherboard chipset. In this tutorial we will explain how the HyperTransport bus works and clear up common misconceptions about it.

On all other processors – including AMD processors not based on the AMD64 architecture, like the original Athlon, Athlon XP and socket 462 Sempron processors – the CPU has only one external bus, known as the front side bus (FSB). In this approach the external bus carries both memory and I/O traffic.

In theory the architecture used on AMD64 processors is better, since the CPU can communicate with the memory and with other PC components (like the video card) at the same time. This is impossible on other processors, as there is only one datapath out of the processor.

In Figure 1 you can see how an AMD64 processor communicates with the outside world. The “bridge” chip is the motherboard chipset, which may consist of one or two chips. On two-chip solutions all peripherals (such as hard disk drives, add-on cards, sound cards, etc.) are connected to the second chip (called the south bridge and not shown in Figure 1), while on single-chip solutions everything is connected to the single chip.

Figure 1: Location of the HyperTransport bus on AMD64 processors.

AMD CPUs targeted at servers – i.e. Opteron processors – can have one, two or three HyperTransport busses, depending on the model. The extra busses interconnect several CPUs so that they can talk to each other, i.e. they are used on servers with more than one CPU on the motherboard. Since desktop and notebook CPUs do not support this kind of configuration, they have only one HyperTransport bus.

For a more in-depth explanation of AMD64 architecture, please read our Inside AMD64 Architecture tutorial.

Besides providing AMD64 processors with separate datapaths for memory and I/O, HyperTransport brings another advantage: it provides separate links for CPU input and output operations, allowing the CPU to transmit (“write”) and receive (“read”) I/O data at the same time (i.e. in parallel). On the traditional architecture, a single external bus is used for both input and output operations, so reads and writes cannot be done at the same time.

Figure 2: The HyperTransport bus provides separate input and output datapaths.

HyperTransport 1.x

The HyperTransport bus can operate under several different clock and width (i.e. number of bits transmitted at a time) configurations. This is probably the source of many of the misconceptions and mistakes that are said and written about HyperTransport.

HyperTransport is a bus created by a consortium made up of several companies, including AMD, nVidia and Apple. This bus can be used in several applications and is not limited to AMD processors.

This means that the actual configuration of the HyperTransport bus will depend on the hardware developer.

Also, some developers advertise an exaggerated transfer rate for the HyperTransport bus they are using.

Current AMD64 processors use HyperTransport 1 (HT1) or HyperTransport 2 (HT2), with forthcoming AMD processors using HyperTransport 3 (HT3). In all these cases, AMD processors use 16-bit links, even though HyperTransport allows the use of 32-bit links.

HyperTransport 1 is used on all socket 754 processors and socket AM2 Sempron processors (other AM2-based processors use HyperTransport 2.0).

Here is a breakdown of all possible clock and transfer rates on HyperTransport 1.x (i.e. available on socket 754 processors):

HyperTransport transfers two data items per clock cycle, a technique also known as DDR (double data rate).

The formula to find the maximum theoretical transfer rate (in MB/s when the clock is given in MHz) is:

Transfer rate = width (number of bits) x clock x number of data per clock cycle / 8
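As a quick check of the formula above, here is a minimal Python sketch. The function name `ht_transfer_rate_mb_s` and its parameter names are our own, chosen only for illustration; the clock is assumed to be given in MHz, so the result comes out in MB/s for one direction of the link.

```python
def ht_transfer_rate_mb_s(width_bits, clock_mhz, transfers_per_clock=2):
    """Maximum theoretical transfer rate of one link direction, in MB/s.

    width_bits          -- link width in bits (AMD processors use 16)
    clock_mhz           -- link clock in MHz
    transfers_per_clock -- 2, because HyperTransport is a DDR bus
    """
    # Dividing by 8 converts bits into bytes.
    return width_bits * clock_mhz * transfers_per_clock / 8


# A 16-bit link clocked at 800 MHz:
print(ht_transfer_rate_mb_s(16, 800))  # 3200.0
```

Changing `width_bits` to 8 halves the result, which is why a link negotiated at 8 bits is so much slower than a 16-bit one at the same clock.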

Thus with socket 754 processors, the HyperTransport bus can run at up to 800 MHz, or 3,200 MB/s. Some people advertise this clock and transfer rate using other numbers, generating a lot of confusion on the market:

Another misunderstanding is saying that the external bus or FSB (Front Side Bus) of the Athlon 64 (or any other AMD64-based CPU) runs at 1,600 MHz. This is partially right: it can be said of I/O operations but not of memory, since processors based on the AMD64 architecture have two separate external busses, as we saw. It is therefore better to say HyperTransport rather than “external bus” or “FSB”, to avoid confusion.

It is important to note that AMD processors can work at several other clock rates below the advertised 1,600 MT/s (800 MHz). In fact they can work with any of the speeds on the list published above.

The chipset can negotiate a lower clock rate with the CPU, and even an 8-bit width instead of the 16-bit one. In fact, when the first Athlon 64 chipsets came out, VIA claimed that its Athlon 64 chipset, the K8T800, was superior to the competition because it ran the HyperTransport bus at 1,600 MT/s. VIA accused the competition (without mentioning names) of not working at the maximum transfer rate HyperTransport permits, but rather at one of the lower rates, or even of using 8-bit transfers instead of 16-bit ones.

At http://www.hypertransport.org, HyperTransport’s official website, you will see that they announce a maximum transfer rate of 12.8 GB/s for HyperTransport 1.x. This figure assumes 32-bit links, whereas, as we explained, AMD processors use 16-bit links. But even with 32-bit links, if you do the math you will find 6,400 MB/s (32 bits x 800 MHz x 2 / 8). The consortium doubled the maximum transfer rate just because there are two datapaths available (one for transmitting and another for receiving). As we said before, we do not agree with this methodology of calculating transfer rates.
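The arithmetic in the paragraph above can be reproduced with a short Python sketch. The helper names `per_direction_mb_s` and `both_directions_mb_s` are hypothetical and exist only to separate the two ways of counting.

```python
def per_direction_mb_s(width_bits, clock_mhz, transfers_per_clock=2):
    """Transfer rate of a single datapath (transmit OR receive), in MB/s."""
    return width_bits * clock_mhz * transfers_per_clock / 8


def both_directions_mb_s(width_bits, clock_mhz):
    """Transmit and receive datapaths added together, in MB/s."""
    return 2 * per_direction_mb_s(width_bits, clock_mhz)


one_way_32 = per_direction_mb_s(32, 800)     # 6400.0  -> 6,400 MB/s
headline_32 = both_directions_mb_s(32, 800)  # 12800.0 -> the 12.8 GB/s figure
one_way_16 = per_direction_mb_s(16, 800)     # 3200.0  -> a 16-bit link, as on AMD CPUs

print(one_way_32, headline_32, one_way_16)
```

Only the last value describes what a 16-bit link can move in one direction at 800 MHz.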

HyperTransport 2.0

HyperTransport 2.0 adds new clock rates – and thus new transfer rates – and a new feature, PCI Express mapping, which helps with interfacing between HyperTransport and PCI Express – in other words, it makes it easier for the CPU to “talk” to PCI Express devices.

The new clock and transfer rates introduced by HyperTransport 2.0 are the following, assuming 16-bit links (which is the configuration used by AMD processors):

HyperTransport 2.0 devices can also work with HyperTransport 1.x transfer rates.

AMD uses HyperTransport 2.0 on all AMD64 CPUs based on sockets 939 and AM2 (except Sempron CPUs, which continue to use HyperTransport 1.0), but supports only the lowest HT2 speed – in fact AMD was more interested in the PCI Express mapping feature than in a higher transfer speed. So, even though these processors are based on HT2, the maximum transfer rate of their HT links is 4,000 MB/s.

To make things a little more confusing, AMD often uses the name “HT1” to describe the HyperTransport bus of CPUs whose HyperTransport links work at 1,000 MHz. This is probably to keep people from assuming that, since these are HT2 parts, they can work at up to 1,400 MHz (5,600 MB/s), which is not the case, as we have explained.

Also some people refer to this 1,000 MHz/4,000 MB/s HyperTransport link used by socket 939 and AM2 processors as:

Another misunderstanding is saying that the external bus or FSB (Front Side Bus) of the Athlon 64 (or any other AMD64-based CPU) runs at 2,000 MHz. This is partially right: it can be said of I/O operations but not of memory, since processors based on the AMD64 architecture have two separate external busses, as we saw. It is therefore better to say HyperTransport rather than “external bus” or “FSB”, to avoid confusion.

Just as with HyperTransport 1.x, it is important to keep in mind that socket 939 and AM2 processors can work at any of the clock rates below 1,000 MHz.

Once again the official values for HyperTransport 2.0 are inflated, as the HyperTransport consortium calculates them using 32-bit links and multiplies them by two because there are two links available (one for transmitting and another for receiving data). As we mentioned before, we do not agree with this methodology. Because of it, the HT2 maximum theoretical transfer rate is advertised as 22.4 GB/s (1,400 MHz x 32 x 2 / 8 x 2 links).

HyperTransport 3.0

Besides adding new clock rates – and thus new transfer rates – HyperTransport 3.0 brings several new features over HyperTransport 2.0, like AC operating mode, Link Splitting (a.k.a. Un-Ganging), Hot Plugging and Dynamic Link Clock/Width Adjustment. Forthcoming AMD processors, like Phenom, will use this new version of the HyperTransport bus.

HyperTransport 3.0 will be used on CPUs based on sockets AM2+ and 1207+.

HyperTransport 3.0 adds the following new clock rates, keeping compatibility with HT1 and HT2 rates (transfer rates assuming 16-bit links, which is the configuration used by AMD processors):

AMD is saying that their forthcoming CPUs will support the maximum HT3 transfer rate – 10,400 MB/s, which AMD calls 5.2 GT/s, i.e. 5.2 billion transfers per second. Keep in mind, however, that these CPUs will still be compatible with lower rates. This means two things. First, new HT3-based CPUs can be installed on HT2-based motherboards (e.g. a socket AM2+ processor on a socket AM2 motherboard), even though they won’t achieve their maximum I/O performance. Second, at launch time some chipsets may not be able to run at the 10,400 MB/s transfer rate, even if they are HT3, similar to what happened when the Athlon 64 was first launched.

Just as happens with lower clock rates, there will probably be people describing HT3’s maximum clock rate as 5.2 GHz or its maximum transfer rate as 20.8 GB/s.
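To keep the units apart, the following Python sketch derives each of the HT3 numbers quoted above from the 2,600 MHz clock and the 16-bit width. The function names are our own and purely illustrative.

```python
def mega_transfers_per_second(clock_mhz, transfers_per_clock=2):
    """Transfers per second, in MT/s, for a link clock given in MHz."""
    return clock_mhz * transfers_per_clock


def transfer_rate_mb_s(width_bits, mt_per_s):
    """Data moved per second in one direction, in MB/s."""
    return width_bits * mt_per_s / 8


clock_mhz = 2600                             # the real clock: 2,600 MHz
mt_s = mega_transfers_per_second(clock_mhz)  # 5200 MT/s, i.e. 5.2 GT/s (not GHz)
rate = transfer_rate_mb_s(16, mt_s)          # 10400.0 MB/s in one direction
doubled = 2 * rate                           # 20800.0 MB/s if both directions are added

print(f"clock:         {clock_mhz} MHz")
print(f"transfers:     {mt_s} MT/s")
print(f"one direction: {rate:.0f} MB/s")
print(f"doubled:       {doubled:.0f} MB/s")
```

The “5.2 GHz” label comes from quoting the transfers-per-second figure as if it were the clock, and the “20.8 GB/s” figure from adding the transmit and receive datapaths together.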

Once again the transfer rates announced by the HyperTransport consortium are highly exaggerated. They announce HyperTransport 3.0 as having a maximum transfer rate of 41.6 GB/s. To reach this number they considered 32-bit links (and not 16-bit links) and doubled the result because there are two links available. The math used was 2,600 MHz x 32 x 2 / 8 x 2 links. As we have already explained, AMD processors use 16-bit links, not 32-bit ones, and we don’t agree with the methodology of doubling the transfer rate only because there is one link for transmitting and another for receiving data. We would only agree with this if the links were in the same direction.
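Putting the three versions side by side shows that the advertised number is always four times what a 16-bit link moves in one direction at the same clock: a factor of two for the 32-bit width and another factor of two for adding both directions. The sketch below uses only the maximum clocks quoted in this tutorial; the names `MAX_CLOCK_MHZ`, `rate_mb_s` and `compare` are hypothetical.

```python
# Maximum clock of each HyperTransport version, in MHz, as quoted in this tutorial.
MAX_CLOCK_MHZ = {"HT 1.x": 800, "HT 2.0": 1400, "HT 3.0": 2600}


def rate_mb_s(width_bits, clock_mhz, transfers_per_clock=2, directions=1):
    """Transfer rate in MB/s; directions=2 adds transmit and receive together."""
    return width_bits * clock_mhz * transfers_per_clock / 8 * directions


def compare(clock_mhz):
    """Return (16-bit one-direction rate, consortium-style advertised rate)."""
    sixteen_bit = rate_mb_s(16, clock_mhz)
    advertised = rate_mb_s(32, clock_mhz, directions=2)
    return sixteen_bit, advertised


for name, clock in MAX_CLOCK_MHZ.items():
    sixteen_bit, advertised = compare(clock)
    print(f"{name}: {sixteen_bit / 1000:.1f} GB/s on a 16-bit link, "
          f"{advertised / 1000:.1f} GB/s advertised "
          f"({advertised / sixteen_bit:.0f}x)")
```

Note that the 16-bit figure for HT 2.0 is the maximum the specification allows at 1,400 MHz, not what socket 939 and AM2 processors actually run at.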

Let’s now talk about the extra features brought by HyperTransport 3.0.

The new AC operating mode (translation: using a signaling system similar to the one used on networks) allows the HyperTransport bus to reach longer distances. The goal is to allow HyperTransport to be used directly to interconnect cases, boards and backplanes. Processors won’t use this feature.

Link splitting, also called un-ganging, allows the 16-bit link to be accessed as two independent 8-bit links. This can be used for increasing the number of links available, allowing more CPUs to be interconnected without using any extra fancy hardware.
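A rough way to picture link splitting is the Python sketch below. It is only a model: the `Link` class and the `unganged` function are hypothetical names, and we assume that each 8-bit half keeps running at the clock of the original link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    width_bits: int
    clock_mhz: int

    def rate_mb_s(self, transfers_per_clock=2):
        """Transfer rate of this link in one direction, in MB/s."""
        return self.width_bits * self.clock_mhz * transfers_per_clock / 8


def unganged(link):
    """Split one link into two independent links of half the width.

    Assumption: both halves keep the clock of the original link.
    """
    half = link.width_bits // 2
    return [Link(half, link.clock_mhz), Link(half, link.clock_mhz)]


full = Link(16, 2600)
halves = unganged(full)

print(full.rate_mb_s())                       # 10400.0
print([link.rate_mb_s() for link in halves])  # [5200.0, 5200.0]
```

Under this assumption the total bandwidth does not change; what changes is that the two halves can be connected to two different devices, for example two other CPUs.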

Hot Plugging allows HyperTransport devices to be installed and removed with the bus running. It won’t allow you to replace your CPU with the system turned on, because the CPU has several other pins besides the HyperTransport ones, but this feature may be used on storage servers based on HT3.

Finally, there is Dynamic Link Clock/Width Adjustment, which will be used by HT3-based AMD CPUs – as long as they are installed on a motherboard using an HT3 chipset, of course. This feature allows the CPU to change the clock and the number of bits that are transmitted per clock cycle dynamically, i.e. “on the fly”. The idea here is to reduce power consumption. For example, if the CPU senses that running its HyperTransport bus at 2,600 MHz (10,400 MB/s) is too much for what it is doing right now, it can reduce the bus to 1,000 MHz (4,000 MB/s) – or whatever rate it thinks will be more suitable. The same goes for the number of bits transferred per clock cycle – it can be reduced from 16 to a narrower width, based on the current system usage.
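The idea can be illustrated with a toy Python model. This is not how the hardware actually makes its decision: the selection policy (pick the slowest configuration that still covers the current demand), the function name `pick_link_config` and the short lists of clocks and widths are all assumptions made for the sake of the example.

```python
from itertools import product

# Illustrative options only: the two clocks from the example above and two widths.
CLOCKS_MHZ = (1000, 2600)
WIDTHS_BITS = (8, 16)


def rate_mb_s(width_bits, clock_mhz, transfers_per_clock=2):
    """Transfer rate of one link direction, in MB/s."""
    return width_bits * clock_mhz * transfers_per_clock / 8


def pick_link_config(required_mb_s):
    """Return the (clock_mhz, width_bits) pair with the lowest transfer rate
    that still meets the demand, or the fastest pair if none does."""
    configs = sorted(product(CLOCKS_MHZ, WIDTHS_BITS),
                     key=lambda cfg: rate_mb_s(cfg[1], cfg[0]))
    for clock_mhz, width_bits in configs:
        if rate_mb_s(width_bits, clock_mhz) >= required_mb_s:
            return clock_mhz, width_bits
    return configs[-1]


print(pick_link_config(1500))  # (1000, 8)
print(pick_link_config(3000))  # (1000, 16)
print(pick_link_config(9000))  # (2600, 16)
```

With a light I/O load the model drops to the slower clock and the narrower width, which is the kind of reduction that saves power.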

Originally at http://www.hardwaresecrets.com/article/19


© 2004-9, Hardware Secrets, LLC. All Rights Reserved.
