The Execution Units
After instructions are decoded, they are sent to the appropriate scheduler, integer or floating-point. Each Bulldozer module has only one floating-point unit, which is shared between the module's two “cores.” On the other hand, it has two completely independent integer units, and these are what AMD calls “cores.”
Each integer engine has four execution units, labeled as:
- EX, MUL: Can execute any kind of integer instruction, including multiplication, but not division
- EX, DIV: Can execute any kind of integer instruction, including division, but not multiplication
- AGen (two units): Address generation, a.k.a. AGU or Address Generation Unit, used to calculate the memory address the CPU will load data from or store data to
It also has a Load/Store unit (“Ld/ST”), which is in charge of loading from memory, or storing to memory, the data requested by an instruction. Usually this unit is drawn side-by-side with the units listed above, but for some reason AMD decided to draw it separately in this presentation.
The Bulldozer architecture uses an out-of-order execution engine, like AMD64 CPUs and Intel CPUs since the Pentium Pro (P6 architecture). Because not every execution unit can process every kind of instruction, some of the units would sit idle at times if there were no out-of-order engine. Let’s say the next instruction to be executed is an integer division, but the only unit able to process this kind of instruction is busy with another instruction. Instead of waiting for this unit to become free, the scheduler will look for a later instruction that can be executed right away in one of the other units, provided one of them is free, of course. So the role of the scheduler is to keep all execution units as busy as possible.
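The division scenario described above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions, not AMD's actual scheduling logic: the unit names, the "alu" instruction kind, and the function issue_one_cycle are hypothetical, and the sketch ignores data dependencies between instructions, which a real scheduler must also check.

```python
# Toy model: which kinds of integer operation each unit accepts.
# "alu" is a made-up label for simple integer instructions.
UNITS = {
    "EX_MUL": {"alu", "mul"},
    "EX_DIV": {"alu", "div"},
}

def issue_one_cycle(queue, busy):
    """For each free unit, issue the oldest waiting instruction it can run.

    queue: list of (name, kind) tuples in program order, oldest first.
    busy:  set of unit names that are occupied in this cycle.
    Returns a dict mapping unit -> issued instruction name.
    Issued instructions are removed from queue.
    """
    issued = {}
    for unit, kinds in UNITS.items():
        if unit in busy:
            continue
        for i, (name, kind) in enumerate(queue):
            if kind in kinds:
                issued[unit] = name
                del queue[i]
                break
    return issued

# The oldest instruction is a division, but the divider is busy,
# so the scheduler skips it and issues the next suitable instruction.
queue = [("i0", "div"), ("i1", "alu"), ("i2", "mul")]
issued = issue_one_cycle(queue, busy={"EX_DIV"})
print(issued)  # {'EX_MUL': 'i1'}
print(queue)   # [('i0', 'div'), ('i2', 'mul')]
```

The division stays in the queue until the EX, DIV unit is free, while the other unit keeps working on younger instructions.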
After integer instructions are executed, they are sent to the Retire unit, where the CPU will put them back in the correct order.
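To make the reordering step concrete, here is a minimal Python sketch of in-order retirement. It is a simplified illustration, not a description of Bulldozer's real Retire unit: the function retire_in_order and the instruction names are hypothetical. The idea is that instructions may finish executing in any order, but they can only retire in program order, so retirement stops at the oldest instruction that has not finished yet.

```python
def retire_in_order(program_order, completed):
    """Return the instructions that may retire now.

    program_order: instruction names in original program order, oldest first.
    completed:     set of names whose execution has finished (in any order).
    Retirement stops at the oldest instruction that is still unfinished.
    """
    retired = []
    for name in program_order:
        if name not in completed:
            break
        retired.append(name)
    return retired

program = ["i0", "i1", "i2", "i3"]

# i1 and i2 finished early, but i0 has not, so nothing can retire yet.
print(retire_in_order(program, completed={"i1", "i2"}))        # []

# i0 and i1 are done; i3 is done too, but it must wait behind i2.
print(retire_in_order(program, completed={"i0", "i1", "i3"}))  # ['i0', 'i1']
```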
The floating-point unit also has four execution units, two of each of the following types:
- MMX: Can execute all basic floating-point instructions (x87 instructions), including MMX
- 128-bit FMAC: Can execute all floating-point instructions