Hardware Secrets
Inside Intel Nehalem Microarchitecture
Author: Gabriel Torres
Type: Tutorials Last Updated: August 26, 2008
Page: 4 of 7

Enhancements to the CPU Pipeline

As mentioned, Nehalem (Core i7) is based on the microarchitecture used by Core 2 Duo, with some enhancements to the way instructions flow inside the CPU. On this page we describe these enhancements.

Core 2 Duo is, by the way, based on Pentium M, which in turn is based on Pentium III. All these CPUs are 6th generation Intel CPUs (if you run the CPUID instruction, all of them will return “6” in the Family field). Pentium 4 was a 7th generation Intel CPU and used a completely different microarchitecture: Core 2 and Core i7 CPUs have absolutely nothing to do with Pentium 4. You may find it strange for a manufacturer to go back to an “old” architecture, but this is what happened (the “old” microarchitecture proved to be more efficient than the “new” one).
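The Family field mentioned above is packed, together with the model and stepping, into the value that CPUID returns in the EAX register for leaf 1. The sketch below decodes such a value in Python. The function name and the sample signatures are ours, chosen for illustration; the field layout follows Intel's documented format. Note that although Pentium 4 is usually called a 7th generation CPU, its Family field reads 15, not 7.

```python
def decode_cpuid_signature(eax: int) -> dict:
    """Split the 32-bit CPUID leaf 1 EAX value into family, model and stepping."""
    stepping = eax & 0xF              # bits 0-3
    base_model = (eax >> 4) & 0xF     # bits 4-7
    base_family = (eax >> 8) & 0xF    # bits 8-11
    ext_model = (eax >> 16) & 0xF     # bits 16-19
    ext_family = (eax >> 20) & 0xFF   # bits 20-27

    # The extended family is added only when the base family field is 0xF.
    if base_family == 0xF:
        family = base_family + ext_family
    else:
        family = base_family

    # The extended model is used only for families 6 and 15.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return {"family": family, "model": model, "stepping": stepping}


if __name__ == "__main__":
    # Sample signatures used for illustration only.
    samples = {
        "Core i7 (Nehalem)": 0x000106A5,
        "Core 2 Duo": 0x000006FB,
        "Pentium 4": 0x00000F29,
    }
    for name, signature in samples.items():
        print(name, decode_cpuid_signature(signature))
```

Running it shows family 6 for the Core 2 and Core i7 samples and family 15 for the Pentium 4 sample.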

Refer to Figure 5 to understand the genealogy of the new Nehalem microarchitecture. The figure also lists the main improvements brought by each new CPU; each CPU has everything brought by the previous CPU plus the improvements listed. Of course, each CPU has other minor improvements as well; we listed only the most important ones.

Figure 5: Nehalem microarchitecture genealogy tree.

In order to understand the improvements brought by this new microarchitecture, you need to remember that programs are written using x86 instructions (also called “macro-ops” or simply “instructions”), which the CPU execution units cannot understand directly. They must first be decoded into microinstructions (also called “micro-ops” or “µops”). This CISC/RISC hybrid architecture was introduced by the Pentium Pro: the CPU receives x86 (CISC) instructions but executes proprietary microinstructions (RISC).

The Core microarchitecture, used on Core 2 CPUs, introduced macro-fusion, which is the ability to translate two x86 instructions into just one microinstruction to be executed inside the CPU. This improves performance and lowers the CPU power consumption, since the CPU executes only one microinstruction instead of two. This scheme, however, only works for a compare instruction followed by a conditional branch (i.e., CMP or TEST followed by a Jcc instruction).

The Nehalem microarchitecture improves macro-fusion in two ways. First, it adds support for several branching instructions that couldn’t be fused on Core 2 CPUs. Second, on Nehalem-based CPUs macro-fusion is used in both 32- and 64-bit modes, while on Core 2 CPUs macro-fusion only works when the CPU is running in 32-bit mode.
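To make the effect of macro-fusion concrete, here is a deliberately simplified Python model. It is a sketch under loud assumptions, not a description of the real decoder: every instruction is assumed to decode into exactly one micro-op, any mnemonic starting with "J" other than JMP is treated as a Jcc, and the function names are hypothetical. It only counts how many micro-ops would be issued when CMP/TEST + Jcc pairs are fused versus when they are not.

```python
FUSIBLE_FIRST = {"CMP", "TEST"}  # first instruction of a fusible pair


def is_conditional_jump(mnemonic: str) -> bool:
    """Assumption: any J* mnemonic except the unconditional JMP is a Jcc."""
    m = mnemonic.upper()
    return m.startswith("J") and m != "JMP"


def count_issued_ops(instructions, fusion_enabled=True):
    """Count micro-ops, assuming one micro-op per unfused instruction."""
    count = 0
    i = 0
    while i < len(instructions):
        current = instructions[i].upper()
        has_next = i + 1 < len(instructions)
        if (
            fusion_enabled
            and current in FUSIBLE_FIRST
            and has_next
            and is_conditional_jump(instructions[i + 1])
        ):
            count += 1  # the pair becomes a single micro-op
            i += 2
        else:
            count += 1
            i += 1
    return count


if __name__ == "__main__":
    loop_body = ["MOV", "ADD", "CMP", "JNE"]
    print("with fusion:   ", count_issued_ops(loop_body, fusion_enabled=True))
    print("without fusion:", count_issued_ops(loop_body, fusion_enabled=False))
```

In this toy model, `fusion_enabled=False` stands in for a Core 2 CPU running 64-bit code, where the text above says macro-fusion is not available.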

The Core microarchitecture also added a Loop Stream Detector, basically a small 18-instruction cache between the fetch and decode units of the CPU. When the CPU is running a loop (a part of a program that is repeated several times), it doesn’t need to fetch the required instructions again from the L1 instruction cache: they are already close to the decode unit. In addition, the CPU actually turns off the fetch and branch prediction units while running a detected loop, which saves some power.

On Nehalem-based CPUs this small cache has been moved to after the decode unit. So instead of holding x86 instructions as on Core 2 CPUs, it holds micro-ops (up to 28). This improves performance because, when the CPU is running a loop, it no longer needs to decode the instructions in the loop: they are already decoded inside this small cache. Also, the CPU can now turn off the decode unit, in addition to the fetch and branch prediction units, when running a detected loop, saving even more power.
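The difference between the two placements of the Loop Stream Detector can be illustrated with a rough Python model. This is only a sketch under stated assumptions: the loop is assumed to be detected after exactly one full pass, the function name and the "core"/"nehalem" labels are ours, and real detection logic is more involved. The capacities (18 instructions, 28 micro-ops) are the ones given in the text.

```python
def front_end_work(uops_per_instruction, iterations, design):
    """Count instruction fetches and decodes for a loop.

    uops_per_instruction: micro-op count of each instruction in the loop body.
    design: "core" (buffer holds x86 instructions, before the decoder) or
            "nehalem" (buffer holds micro-ops, after the decoder).
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")

    n_instr = len(uops_per_instruction)
    n_uops = sum(uops_per_instruction)

    if design == "core":
        fits = n_instr <= 18   # capacity in x86 instructions
    elif design == "nehalem":
        fits = n_uops <= 28    # capacity in micro-ops
    else:
        raise ValueError("design must be 'core' or 'nehalem'")

    # The first pass always goes through fetch and decode.
    fetches = n_instr
    decodes = n_instr
    remaining = iterations - 1

    if not fits:
        # Loop too large for the buffer: every pass is fetched and decoded.
        fetches += n_instr * remaining
        decodes += n_instr * remaining
    elif design == "core":
        # Buffer sits before the decoder: no refetch, but decode every pass.
        decodes += n_instr * remaining
    # design == "nehalem" and fits: neither fetch nor decode is repeated.

    return {"fetches": fetches, "decodes": decodes}


if __name__ == "__main__":
    body = [1] * 10  # ten instructions, one micro-op each (an assumption)
    print("core:   ", front_end_work(body, 100, "core"))
    print("nehalem:", front_end_work(body, 100, "nehalem"))
```

For a ten-instruction loop run 100 times, the model reports the same number of fetches for both designs but far fewer decodes for the Nehalem-style placement, which is the saving the text describes.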

Figure 6: Location of the Loop Stream Detector on Core and Nehalem CPUs.

The Nehalem microarchitecture adds one extra dispatch port and now has 12 execution units (see Figure 7). As a result, CPUs based on this microarchitecture can have more microinstructions being executed at the same time than previous CPUs.

Figure 7: Dispatch ports and execution units.

The Nehalem microarchitecture also adds two extra buffers: a second 512-entry Translation Look-aside Buffer (TLB) and a second Branch Target Buffer (BTB). These extra buffers increase CPU performance, as explained below.

The TLB is a table used by the virtual memory circuit to convert virtual addresses into physical addresses; it keeps recently used translations close at hand so they don’t have to be looked up again. Virtual memory is a technique that simulates more RAM by using a file on the hard drive (called the swap file), allowing the computer to continue operating even when there is not enough RAM available: part of what is in RAM is stored inside this swap file, freeing that memory for use.
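The role of the TLB as a small cache of address translations can be sketched in a few lines of Python. This is a toy model, not how the hardware is built: the class name, the 4 KB page size, the least-recently-used replacement policy, and the dictionary standing in for the page table are all assumptions made for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class TinyTLB:
    """Toy TLB: caches virtual-page to physical-frame translations (LRU)."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table  # dict: virtual page number -> frame number
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        vpn, offset = divmod(virtual_address, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
            frame = self.entries[vpn]
        else:
            self.misses += 1
            # Slow path: consult the page table. A missing key here would
            # correspond to a page fault, which this sketch does not handle.
            frame = self.page_table[vpn]
            self.entries[vpn] = frame
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return frame * PAGE_SIZE + offset


if __name__ == "__main__":
    tlb = TinyTLB(capacity=2, page_table={0: 7, 1: 3, 2: 9})
    for address in (0x0010, 0x0020, 4096 + 5, 2 * 4096, 0x0000):
        print(hex(address), "->", hex(tlb.translate(address)))
    print("hits:", tlb.hits, "misses:", tlb.misses)
```

With a larger `capacity`, more translations stay cached and fewer lookups fall through to the slow path, which is the intuition behind adding a second, larger TLB.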

Branch prediction is a circuit that tries to guess the next steps of a program in advance, loading into the CPU the instructions it thinks the CPU will need next. If it guesses right, the CPU won’t waste time loading these instructions from memory, as they will already be inside the CPU. The BTB is where this circuit stores the destination addresses of branches it has already seen. Increasing the size of the BTB (or adding a second one, in the case of Nehalem-based CPUs) allows this circuit to keep track of more branches and load even more instructions in advance, improving the CPU performance.
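Why more BTB capacity helps can be shown with a small Python sketch. It models a direct-mapped table indexed by the low bits of the branch address. This organization, the class and function names, and the addresses in the example are assumptions for illustration only and say nothing about how Nehalem's two BTBs are actually arranged.

```python
class TinyBTB:
    """Toy direct-mapped branch target buffer."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.table = [None] * num_entries  # each slot: (branch_address, target)

    def predict(self, branch_address):
        """Return the remembered target, or None if this branch is unknown."""
        entry = self.table[branch_address % self.num_entries]
        if entry is not None and entry[0] == branch_address:
            return entry[1]
        return None

    def update(self, branch_address, target):
        """Record the actual target after the branch has executed."""
        self.table[branch_address % self.num_entries] = (branch_address, target)


def count_target_hits(btb, trace):
    """Count how often the BTB already knew the correct target."""
    hits = 0
    for branch_address, target in trace:
        if btb.predict(branch_address) == target:
            hits += 1
        btb.update(branch_address, target)
    return hits


if __name__ == "__main__":
    # Two taken branches that alternate; they share a slot in a 4-entry table.
    trace = [(0x100, 0x200), (0x104, 0x300)] * 5
    print("4 entries:", count_target_hits(TinyBTB(4), trace))
    print("8 entries:", count_target_hits(TinyBTB(8), trace))
```

In the small table the two branches keep evicting each other, so the target is never known in time; in the larger table each branch gets its own slot and only the first encounter of each branch misses.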

    © 2004-8, Hardware Secrets, LLC. All rights reserved.