Hardware Secrets

Inside Pentium 4 Architecture
Author: Gabriel Torres
Type: Tutorials Last Updated: October 18, 2005
Page: 2 of 7
Pentium 4 Pipeline

A pipeline is the sequence of stages an instruction must go through in order to be fully executed. Intel’s 6th generation processors, like the Pentium III, have an 11-stage pipeline. The Pentium 4 has 20 stages! So, on a Pentium 4 processor, a given instruction passes through many more stages, and therefore takes more clock cycles from start to finish, than on a Pentium III. If you take the new 90 nm Pentium 4 processors, codenamed “Prescott”, the case is even worse because they use a 31-stage pipeline! Holy cow!

This was done in order to increase the processor clock rate. With more stages, each individual stage does less work and can be built using fewer transistors, and with fewer transistors per stage it is easier to achieve higher clock rates. In fact, the Pentium 4 is only faster than the Pentium III because it works at a higher clock rate. At the same clock rate, a Pentium III CPU would be faster than a Pentium 4 because of its shorter pipeline.
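To make this trade-off concrete, here is a minimal sketch of a toy timing model. It is not how Intel measures performance: the function name `total_time_ns`, the one-instruction-per-clock assumption and the rule that a mispredicted branch costs one full pipeline refill are all simplifying assumptions of ours, and the workload numbers you feed it are hypothetical.

```python
# Toy model only: the function name and the cost rules are assumptions made
# for illustration, not figures or formulas taken from Intel documentation.

def total_time_ns(instructions, mispredicted_branches, stages, clock_ghz):
    """Estimate the run time, in nanoseconds, of an idealized pipeline.

    Assumptions:
      - one instruction completes per clock once the pipeline is full;
      - the first instruction needs `stages` clocks to come out;
      - each mispredicted branch flushes the pipeline and costs
        `stages` extra clocks to refill it.
    """
    cycles = stages + (instructions - 1) + mispredicted_branches * stages
    return cycles / clock_ghz  # clocks divided by clocks-per-nanosecond


# Hypothetical workload: 1,000 instructions, 50 mispredicted branches.
short_pipe = total_time_ns(1000, 50, stages=11, clock_ghz=1.0)
long_pipe_same_clock = total_time_ns(1000, 50, stages=20, clock_ghz=1.0)
long_pipe_faster_clock = total_time_ns(1000, 50, stages=20, clock_ghz=1.5)
```

In this model the 20-stage pipeline loses to the 11-stage one at the same clock rate, and only pulls ahead once its clock rate is raised, which is the behavior described above.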

Because of that, Intel has already announced that its 8th generation processors will be based on the Pentium M architecture, which derives from Intel’s 6th generation architecture (Pentium III architecture), and not on the Netburst (Pentium 4) architecture. This new architecture, called Core, is covered in our Inside Core Microarchitecture tutorial.

In Figure 2, you can see the Pentium 4 20-stage pipeline. So far Intel has not disclosed the details of Prescott’s 31-stage pipeline, so we can’t talk about it.

Figure 2: Pentium 4 pipeline.

Here is a basic explanation of each stage, which explains how a given instruction is processed by Pentium 4 processors. If you think this is too complex for you, don’t worry. This is just a summary of what we will be explaining in the next pages.

  • TC Nxt IP: Trace cache next instruction pointer. This stage looks in the branch target buffer (BTB) for the next microinstruction to be executed. This step takes two stages.
  • TC Fetch: Trace cache fetch. Loads this microinstruction from the trace cache. This step takes two stages.
  • Drive: Sends the microinstruction to be processed to the resource allocator and register renaming circuit.
  • Alloc: Allocate. Checks which CPU resources will be needed by the microinstruction – for example, the memory load and store buffers.
  • Rename: If the program uses one of the eight standard x86 registers, it is renamed to one of the 128 internal registers present on the Pentium 4. This step takes two stages.
  • Que: Queue. The microinstructions are put in queues according to their types (for example, integer or floating point). They are held in the queue until there is an open slot of the same type in the scheduler.
  • Sch: Schedule. Microinstructions are scheduled for execution according to their type (integer, floating point, etc.). Before arriving at this stage, all instructions are in order, i.e., in the same order in which they appear in the program. At this stage, the scheduler re-orders the instructions in order to keep all execution units busy. For example, if a floating point unit is about to become available, the scheduler will look for a floating point instruction to send to this unit, even if the next instruction in the program is an integer one. The scheduler is the heart of the out-of-order engine of Intel 7th generation processors. This step takes three stages.
  • Disp: Dispatch. Sends the microinstructions to their corresponding execution engines. This step takes two stages.
  • RF: Register file. The internal registers, stored in the instruction pool, are read. This step takes two stages.
  • Ex: Execute. Microinstructions are executed.
  • Flgs: Flags. The microprocessor flags are updated.
  • Br Ck: Branch check. Checks whether the branch taken by the program is the same one predicted by the branch prediction circuit.
  • Drive: Sends the result of this check to the branch target buffer (BTB) located at the processor’s entrance.
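
The list above can be written down as a small table to check that the steps really add up to 20 stages. This is only a sketch: we assume that every step without an explicit count takes a single stage, and the names `PIPELINE`, `total_stages` and `pick_for_unit` are ours. The second function is a toy illustration of the Sch step, picking the oldest waiting microinstruction whose type matches a free execution unit; the real scheduler is far more elaborate.

```python
# (abbreviation, number of pipeline stages), in the order listed above.
# Steps with no explicit count in the text are assumed to take one stage.
PIPELINE = [
    ("TC Nxt IP", 2),
    ("TC Fetch", 2),
    ("Drive", 1),
    ("Alloc", 1),
    ("Rename", 2),
    ("Que", 1),
    ("Sch", 3),
    ("Disp", 2),
    ("RF", 2),
    ("Ex", 1),
    ("Flgs", 1),
    ("Br Ck", 1),
    ("Drive", 1),
]


def total_stages(pipeline=PIPELINE):
    """Add up the stage counts of every step in the pipeline."""
    return sum(count for _, count in pipeline)


def pick_for_unit(queue, unit_type):
    """Remove and return the oldest queued microinstruction of unit_type.

    `queue` is a list of (name, type) tuples in program order.
    Returns None, leaving the queue untouched, if nothing matches.
    """
    for index, (name, kind) in enumerate(queue):
        if kind == unit_type:
            del queue[index]
            return name
    return None


# Hypothetical microinstructions, in program order.
waiting = [("add", "int"), ("fmul", "fp"), ("sub", "int")]
chosen = pick_for_unit(waiting, "fp")  # a floating point unit is free
```

Here `chosen` is the floating point microinstruction, even though an integer one comes first in program order, and the two integer microinstructions stay in the queue in their original order.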