Hardware Secrets


Inside Pentium M Architecture
Author: Gabriel Torres
Type: Tutorials Last Updated: January 4, 2006
Page: 6 of 7

Reservation Station and Execution Units

As we mentioned before, Pentium M uses fused micro-ops (i.e., carries two micro-ops together) from the Decode Unit up to the dispatch ports located on the Reservation Station. The Reservation Station dispatches each micro-op individually (defused).
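As a rough illustration of the "defusing" step, the sketch below models a fused micro-op as a pair that is split back into its two parts at dispatch time. The class names, the defuse function and the choice of a store as the fused pair are assumptions made for this example only; they are not Intel terminology or a description of the actual hardware logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    name: str
    unit: str  # execution unit expected to run this micro-op


@dataclass(frozen=True)
class FusedMicroOp:
    # Hypothetical container: two micro-ops that travel together
    # from the Decode Unit up to the dispatch ports.
    first: MicroOp
    second: MicroOp


def defuse(fused):
    """Split a fused pair into the micro-ops that are dispatched individually."""
    return [fused.first, fused.second]


# Illustrative assumption: a store carried as one fused entry.
store = FusedMicroOp(
    MicroOp("store-address", "Store Address"),
    MicroOp("store-data", "Store Data"),
)
dispatched = defuse(store)
print([op.unit for op in dispatched])
```

The point of the model is only that one entry travels through the front end while two separate micro-ops reach the execution units.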

Pentium M has five dispatch ports numbered 0 through 4 located on its Reservation Station. Each port is connected to one or more execution units, as you can see in Figure 5.

Figure 5: Reservation Station and execution units.

Here is a brief explanation of each execution unit found on this CPU:

  • IEU: Integer Execution Unit is where regular instructions are executed. Also known as ALU (Arithmetic and Logic Unit). “Regular” instructions are also known as “integer” instructions.
  • FPU: Floating Point Unit is where complex math instructions are executed. In the past this unit was also known as the “math co-processor”.
  • SIMD: This is where SIMD instructions are executed, i.e., MMX, SSE and SSE2.
  • WIRE: Miscellaneous functions.
  • JEU: Jump Execution Unit processes branches and is also known as Branch Unit.
  • Shuffle: This unit executes a kind of SSE instruction called “shuffle”.
  • PFADD: Executes an SSE instruction called PFADD (Packed FP Add) and also COMPARE, SUBTRACT, MIN/MAX and CONVERT instructions. This unit is pipelined, so it can start executing a new micro-op at each clock cycle even if it hasn’t completed the execution of the previous micro-op. This unit has a latency of three clock cycles, i.e., it takes three clock cycles to deliver each processed instruction.
  • Reciprocal Estimates: Executes two SSE instructions, one called RCP (Reciprocal Estimate) and another called RSQRT (Reciprocal Square Root Estimate).
  • Load: Unit to process instructions that request data to be read from RAM.
  • Store Address: Unit to process instructions that request data to be written to RAM. This unit is also known as AGU, Address Generation Unit. This kind of instruction uses both Store Address and Store Data units at the same time.
  • Store Data: Unit to process instructions that request data to be written to RAM. This kind of instruction uses both Store Address and Store Data units at the same time.
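To make the difference between latency and pipelining on the PFADD unit concrete, here is a minimal sketch of the arithmetic. It assumes a stream of independent micro-ops arriving back to back, one per clock cycle; the function names are made up for this example, and the non-pipelined variant is purely hypothetical, included only for comparison.

```python
def pipelined_cycles(n_ops, latency=3):
    """Cycles to finish n_ops independent micro-ops on a pipelined unit
    that can start one new micro-op per clock cycle."""
    if n_ops == 0:
        return 0
    # The first result appears after `latency` cycles,
    # then one more result is delivered every cycle.
    return latency + (n_ops - 1)


def unpipelined_cycles(n_ops, latency=3):
    """Same workload on a hypothetical unit that must finish each
    micro-op before starting the next one."""
    return latency * n_ops


print(pipelined_cycles(10))    # 12
print(unpipelined_cycles(10))  # 30
```

With a three-cycle latency, ten micro-ops finish in 12 cycles on the pipelined unit, against 30 cycles if each one had to wait for the previous one to complete.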

Keep in mind that complex instructions may take several clock cycles to be processed. Take port 0, where the floating point unit (FPU) is located, as an example. While this unit is processing a very complex instruction that takes several clock cycles to be executed, port 0 won’t stall: it will keep sending simple instructions to the IEU while the FPU is busy.

So, even though the maximum dispatch rate is five micro-ops per clock cycle, the CPU can actually have up to twelve micro-ops being processed at the same time.

As we mentioned, on instructions that ask the CPU to write data to a given RAM address, the Store Address Unit and the Store Data Unit are used at the same time, one for calculating the address and the other for writing the data.
 
This is why ports 0 and 1 have more than one execution unit attached. Notice that Intel placed on the same port one fast unit together with at least one complex (and slow) unit. So, while the complex unit is busy processing data, the other unit can keep receiving micro-ops from its corresponding dispatch port. As we mentioned before, the idea is to keep all execution units busy all the time.
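The benefit of pairing a fast unit with a slow one on the same port can be sketched with a toy timing model. Everything here is a simplifying assumption for illustration: the function name, the latencies (five cycles for the FPU operation, one for the IEU), strict in-order dispatch, and the stall_while_busy switch, which models a hypothetical port that waits for all of its units to go idle before dispatching again.

```python
def run_port(micro_ops, latency, stall_while_busy=False):
    """Return the cycle in which the last micro-op finishes.

    micro_ops: unit names in dispatch order for one port.
    latency:   cycles each unit needs per micro-op (assumed values).
    The port dispatches at most one micro-op per clock cycle.
    """
    free_at = {unit: 0 for unit in latency}  # first cycle each unit is free
    cycle = 0
    finish = 0
    for unit in micro_ops:
        start = max(cycle, free_at[unit])
        if stall_while_busy:
            # Hypothetical port that waits until every unit is idle.
            start = max(start, max(free_at.values()))
        done = start + latency[unit]
        free_at[unit] = done
        finish = max(finish, done)
        cycle = start + 1
    return finish


latency = {"FPU": 5, "IEU": 1}
ops = ["FPU", "IEU", "IEU", "IEU", "IEU"]
print(run_port(ops, latency))                         # 5
print(run_port(ops, latency, stall_while_busy=True))  # 9
```

In this model the four simple micro-ops complete on the IEU while the FPU is still working, so the whole sequence finishes in 5 cycles instead of 9.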

As we explained, after each micro-op is executed, it returns to the Reorder Buffer, where its flag is set to “executed”. Then, at the Retirement Stage, the micro-ops that have their “executed” flag on are removed from the Reorder Buffer in their original order (i.e., the order in which they were decoded) and the x86 registers are updated (the inverse of the register renaming stage). Up to three micro-ops can be removed from the Reorder Buffer per clock cycle. At this point the instruction has been fully executed.
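The retirement rule described above (in original order, only micro-ops flagged as executed, at most three per clock cycle) can be sketched as follows. The data layout and the names RETIRE_WIDTH and retire are assumptions chosen for this example, not a description of the real hardware structures.

```python
from collections import deque

RETIRE_WIDTH = 3  # up to three micro-ops retired per clock cycle


def retire(rob):
    """Retire micro-ops from the head of the reorder buffer for one cycle.

    Stops at the first micro-op that has not been executed yet, so
    program order is preserved even if younger micro-ops are done.
    """
    retired = []
    while rob and len(retired) < RETIRE_WIDTH and rob[0]["executed"]:
        retired.append(rob.popleft()["id"])
    return retired


# Micro-op 2 is still executing; 3 and 4 finished out of order.
rob = deque(
    {"id": i, "executed": done}
    for i, done in enumerate([True, True, False, True, True])
)
first_cycle = retire(rob)    # only 0 and 1 can leave
rob[0]["executed"] = True    # micro-op 2 completes
second_cycle = retire(rob)   # 2, 3 and 4 leave together
print(first_cycle, second_cycle)
```

Note how micro-ops 3 and 4 must wait behind micro-op 2 even though they were already executed, which is what keeps retirement in the original decode order.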
