Hardware Secrets


Inside AMD64 Architecture
Author: Gabriel Torres
Type: Tutorials Last Updated: May 16, 2006
Page: 8 of 9

Dispatch and Schedule

As mentioned, the Instruction Control Unit is the reorder buffer of AMD64 processors. Here the macro-ops can be picked and sent to the schedulers out of order, i.e., not in the order in which the instructions appear in the program being executed. For example, suppose the program has a sequence like this:

Integer
Integer
Integer
Integer
Integer
FP
Integer
FP

The AMD64 architecture has three integer execution engines and three floating-point execution engines. Without an out-of-order execution engine, its floating-point engines would sit idle at the start of this program, since the fourth instruction is also an integer instruction and can’t be executed at the same time as the first three: all three integer execution engines are already in use. Because the CPU implements out-of-order execution, the sixth instruction, which is the first FP instruction, can be sent to execution together with the first one, increasing CPU performance. In fact, since it has three FPUs, both FP instructions in this program could be dispatched at the same time. The goal of the scheduler is to keep all CPU execution engines busy all the time.
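The difference between the two issue policies can be sketched with a toy model. This is only an illustration under strong assumptions that are not from the article: every instruction takes one cycle, no instruction depends on another, and there is no limit on how many instructions can be examined per cycle. The function names and the `UNITS` table are hypothetical.

```python
# Toy model: the eight-instruction sequence from the text, numbered 1..8.
PROGRAM = ["INT", "INT", "INT", "INT", "INT", "FP", "INT", "FP"]

# Three integer and three floating-point execution engines, as in the text.
UNITS = {"INT": 3, "FP": 3}


def in_order_issue(program, units):
    """Issue strictly in program order; a cycle ends at the first
    instruction that finds no free engine of its kind."""
    cycles = []
    i = 0
    while i < len(program):
        free = dict(units)  # all engines are free at the start of a cycle
        issued = []
        while i < len(program) and free[program[i]] > 0:
            free[program[i]] -= 1
            issued.append(i + 1)
            i += 1
        cycles.append(issued)
    return cycles


def out_of_order_issue(program, units):
    """Each cycle, issue every pending instruction that finds a free
    engine, skipping over instructions that have to wait."""
    pending = list(range(len(program)))
    cycles = []
    while pending:
        free = dict(units)
        issued, waiting = [], []
        for idx in pending:
            kind = program[idx]
            if free[kind] > 0:
                free[kind] -= 1
                issued.append(idx + 1)
            else:
                waiting.append(idx)
        cycles.append(issued)
        pending = waiting
    return cycles


print(in_order_issue(PROGRAM, UNITS))      # [[1, 2, 3], [4, 5, 6, 7, 8]]
print(out_of_order_issue(PROGRAM, UNITS))  # [[1, 2, 3, 6, 8], [4, 5, 7]]
```

In this simplified model the in-order policy leaves all three FP engines idle in the first cycle, while the out-of-order policy starts both FP instructions alongside the first three integer instructions. A real CPU also has to track data dependencies and instruction latencies, which this sketch ignores.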

The reorder buffer of the AMD64 architecture has 72 entries. What is quite interesting is that each integer execution engine has its own scheduler with its own buffer (eight entries each), while the FP execution units share a single 36-entry scheduler. So AMD64 has a total of four schedulers, the same number as the Pentium 4.

The reorder buffer is also in charge of register renaming. The CISC x86 architecture has only eight 32-bit registers (EAX, EBX, ECX, EDX, EBP, ESI, EDI and ESP). This number is simply too low, especially because modern CPUs can execute code out of order: an instruction that runs early could overwrite (“kill”) the contents of a register that another instruction still needs, crashing the program.

So, at this stage, the processor maps each register used by the program to one of the 96 internal registers available. This allows an instruction to run at the same time as another instruction that uses the exact same standard register, or even out of order, i.e., the second instruction can run before the first one even if both use the same register.
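A minimal sketch of the renaming idea follows. It is not AMD’s actual mechanism: the `rename` function, the `p<N>` naming of internal registers and the three-instruction program are all hypothetical, and the sketch never frees an internal register (a real CPU recycles them as instructions retire). Each instruction is written as a destination register plus a tuple of source registers.

```python
ARCH_REGS = ["EAX", "EBX", "ECX", "EDX", "EBP", "ESI", "EDI", "ESP"]


def rename(instructions, arch_regs=ARCH_REGS, num_internal=96):
    """Rewrite (dest, sources) pairs so that every write gets a fresh
    internal register and every read uses the latest mapping."""
    # Assumption: each standard register starts out mapped to one
    # internal register (p0..p7); the rest are free.
    mapping = {reg: i for i, reg in enumerate(arch_regs)}
    free = list(range(len(arch_regs), num_internal))
    renamed = []
    for dest, sources in instructions:
        # Sources are looked up BEFORE the destination is remapped.
        srcs = tuple(f"p{mapping[s]}" for s in sources)
        if not free:
            raise RuntimeError("no free internal register")
        new_reg = free.pop(0)
        mapping[dest] = new_reg
        renamed.append((f"p{new_reg}", srcs))
    return renamed


program = [
    ("EAX", ("EBX", "ECX")),  # 1: EAX = EBX op ECX
    ("EDX", ("EAX",)),        # 2: EDX = op EAX   (needs the result of 1)
    ("EAX", ("ESI", "EDI")),  # 3: EAX = ESI op EDI (reuses EAX)
]

for line in rename(program):
    print(line)
# ('p8', ('p1', 'p2'))
# ('p9', ('p8',))
# ('p10', ('p5', 'p6'))
```

After renaming, instruction 3 writes `p10` instead of the register that instruction 2 reads (`p8`), so it can run before or alongside the first two without destroying the value they depend on.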

The AMD64 architecture has 96 internal registers, while the Pentium 4 has 128. Intel’s 6th generation processors (like the Pentium II and Pentium III) had only 40 internal registers. It is interesting to note the trick AMD used on the AMD64 architecture to achieve those 96 registers. They simply created a result field on each one of the 72 reorder buffer entries for storing the result of each instruction (this isn’t available on the Pentium 4, which needs to allocate an internal register for storing the result each time an instruction is executed). In addition, its register file (or IFFRF, Integer Future File and Register File, as AMD calls it) has 40 entries, but 16 of them store the “correct” value for each x86 register and so cannot be used for renaming. So while the correct answer for “how many internal registers does the AMD64 architecture have?” is 40, the effective number is 96 due to this architectural difference.
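The figure of 96 follows from the numbers quoted above, as this short calculation shows. The variable names are only illustrative; the values are the ones given in the text.

```python
ROB_ENTRIES = 72             # each reorder buffer entry has its own result field
REGISTER_FILE_ENTRIES = 40   # entries in the register file (IFFRF)
RESERVED_ENTRIES = 16        # hold the "correct" value of each x86 register

# Register file entries that remain usable once the reserved ones are excluded.
usable_file_entries = REGISTER_FILE_ENTRIES - RESERVED_ENTRIES

effective_registers = ROB_ENTRIES + usable_file_entries
print(usable_file_entries, effective_registers)  # 24 96
```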

Figure 14: AMD64 reorder buffer and schedulers.


