Inside the Intel Ivy Bridge Microarchitecture

[nextpage title=”Introduction”]

The Ivy Bridge microarchitecture will be used in the third-generation Core i3, i5, and i7 CPUs, to be released in 2012. It brings minor modifications to the Sandy Bridge microarchitecture used in the second-generation Core i processors, but with a completely new graphics engine, a PCI Express 3.0 controller, and a new manufacturing process. Let’s see what is new.

For a better understanding of this tutorial, we recommend that you read our “Inside the Intel Sandy Bridge Microarchitecture” tutorial before continuing.

The good news is that CPUs based on the Ivy Bridge microarchitecture will use the same socket as Sandy Bridge processors (socket LGA1155). This means that you will be able to upgrade your second-generation Core i processor to a third-generation Core i model without having to change the motherboard. The Ivy Bridge microarchitecture incorporates a PCI Express 3.0 controller, which doubles the bandwidth of the PCI Express slots connected to the CPU (i.e., the video card slots) from 500 MB/s per lane to 1 GB/s per lane. However, if your current motherboard doesn’t have PCI Express 3.0 channel chips, access to these slots will be limited to 2.0 speeds (500 MB/s per lane).
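The per-lane figures above follow from each generation's signalling rate and line encoding. Here is a minimal sketch of that arithmetic, assuming the commonly cited values of 5 GT/s with 8b/10b encoding for PCI Express 2.0 and 8 GT/s with 128b/130b encoding for PCI Express 3.0. These values and the x16 slot used as an example are our assumptions, not figures from this tutorial.

```python
# Rough per-lane PCI Express bandwidth, in one direction.
# Transfer rates and encodings are assumed from commonly cited
# PCI Express figures; they are not taken from this tutorial.

def lane_bandwidth_mb_s(gigatransfers_per_s, payload_bits, total_bits):
    """Usable MB/s per lane after line-encoding overhead."""
    bits_per_s = gigatransfers_per_s * 1e9 * payload_bits / total_bits
    return bits_per_s / 8 / 1e6  # bits -> bytes -> MB

pcie2 = lane_bandwidth_mb_s(5, 8, 10)      # 8b/10b encoding
pcie3 = lane_bandwidth_mb_s(8, 128, 130)   # 128b/130b encoding

for name, per_lane in (("PCIe 2.0", pcie2), ("PCIe 3.0", pcie3)):
    print(f"{name}: {per_lane:.0f} MB/s per lane, "
          f"{per_lane * 16 / 1000:.1f} GB/s for a x16 slot")
```

Under these assumptions the 3.0 result comes out slightly below 1,000 MB/s per lane, which is normally rounded to the 1 GB/s quoted above.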

The Ivy Bridge microarchitecture expands the Sandy Bridge microarchitecture by adding the following new features:

Socket LGA1155 (the same socket used by Sandy Bridge processors)
PCI Express 3.0 controller, which increases the bandwidth of the PCI Express lanes connected to the CPU from 500 MB/s to 1 GB/s; motherboards must use PCI Express 3.0 channel chips, otherwise, the video card slots will be limited to 2.0 speeds
Two new security features: a digital random number generator (DRNG) and Supervisory Mode Execution Protection (SMEP)
Float16 format conversion instructions, which convert between a 16-bit compressed floating point memory format and a 32-bit single precision format
Improved performance for instructions that handle strings (REP MOVSB and REP STOSB)
Four new instructions that allow applications to read and write the base addresses of the FS and GS segment registers of the CPU
Support for DDR3L (i.e., low-power DDR3) memories in mobile CPUs
DirectX 11 graphics engine
New 2D graphics engine
Support for three video monitors
Higher memory overclocking limit, increased from 2,133 MHz to 2,800 MHz, with the memory clock now configurable in 200 MHz increments
Dynamic overclocking, allowing you to change the clock ratio on unlocked CPUs without needing to reboot the PC, and higher clock ratios (up to 63) available for unlocked CPUs
Improvements in power management
22-nm manufacturing process
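To give an idea of what the float16 format in the list above trades away, the sketch below converts a few values to 16 bits and back in software. It uses the `'e'` (IEEE 754 half precision) format code of Python's standard `struct` module rather than the new CPU instructions, and the helper names are ours. Note also that Python floats are 64-bit, whereas the CPU instructions convert between 16-bit and 32-bit values.

```python
import struct

def to_float16_bits(value):
    """Pack a Python float into 16-bit half precision; return the raw bits."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def from_float16_bits(bits):
    """Expand 16-bit half-precision bits back to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

# Show the 16-bit pattern and the value that survives the round trip.
for x in (1.0, 0.1, 1000.3):
    bits = to_float16_bits(x)
    print(f"{x!r:>8} -> 0x{bits:04X} -> {from_float16_bits(bits)!r}")
```

Values such as 1.0 survive unchanged, while values such as 0.1 come back slightly different. This is the precision given up in exchange for halving the memory used.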

Let’s now talk a little more about some of these new features.

[nextpage title=”The 22-nm Manufacturing Process”]

While CPUs based on the Sandy Bridge microarchitecture are manufactured using a 32-nm process, processors based on the Ivy Bridge microarchitecture will use a new 22-nm process. (Read our “How Chips are Manufactured” tutorial for a better understanding of the subject.) However, instead of simply shrinking the parts inside the chip (namely, the transistors), the new Intel 22-nm manufacturing process takes a completely new approach based on three-dimensional transistors.

In Figure 1, you can see how a traditional field-effect transistor (FET) works. This is the kind of transistor traditionally used inside CPUs. It is composed of a channel, where current flows from one end (called the “source”) to the other (called the “drain”). A third terminal, called the “gate,” controls how much current flows through the channel according to the voltage applied to it. In CPUs, transistors work as switches: depending on the voltage at the gate, the channel either lets current flow or blocks it.

Figure 1: How traditional transistors work

In the three-dimensional approach, the channel is turned on its side so that it rises vertically from the surface of the chip, as you can see in Figure 2. The channel is now thinner but taller, and this shape is called a “fin.” Several fins can be placed side by side in the same transistor to increase the amount of current it can carry, as shown in Figure 3.

Figure 2: The 3D transistor

Figure 3: Multiple fins for higher current

This design also allows the transistor to consume less power. According to Intel, compared with its 32-nm transistors, the new approach allows a 37% performance increase (i.e., faster switching speed) at low voltage (0.7 V), or a 50% power reduction at the same performance.

[nextpage title=”Security Improvements”]

As previously mentioned, there are two new security features in the Ivy Bridge microarchitecture. The first one is the implementation of a digital random number generator (DRNG), which can be used through a new instruction called RDRAND.

Traditionally, when a program needs a random number, it gets one from an algorithm that starts from a seed value, typically taken from the real-time clock of the system. In other words, the exact time at which the number was generated is used to create it. Therefore, a hacker who knows the exact time the number was generated can run the same algorithm and work out what that number is. This means that numbers generated this way aren’t truly random; the correct name for them is pseudo-random. A hardware random number generator (RNG) solves this issue by creating random numbers that aren’t based on the time at which they were generated.
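To see why a time-based generator is predictable, consider the sketch below. It uses Python's `random.Random` as a stand-in pseudo-random algorithm and a made-up timestamp as the starting value. Both are assumptions for illustration only and do not describe what any particular CPU or operating system does.

```python
import random

def time_seeded_token(timestamp):
    """Return a 'random' 32-bit number from a generator seeded with a timestamp."""
    generator = random.Random(timestamp)
    return generator.getrandbits(32)

# The victim generates a token at some moment (hypothetical timestamp).
secret_time = 1_330_000_000
victim_token = time_seeded_token(secret_time)

# An attacker who knows the time to within a minute simply tries every second.
recovered = None
for guess in range(secret_time - 30, secret_time + 31):
    if time_seeded_token(guess) == victim_token:
        recovered = guess
        break

print("attacker reproduced the token:",
      recovered is not None and time_seeded_token(recovered) == victim_token)
```

At most 61 guesses are enough here, because the only secret is the time. A hardware generator avoids this weakness because its output is not derived from a value the attacker can guess.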

In Figure 4, you can see a summary of the new digital random number generator.

Figure 4: Digital random number generator (DRNG)

The second new security feature added to the Ivy Bridge microarchitecture is called Supervisory Mode Execution Protection (SMEP), which helps prevent a kind of security attack called Escalation of Privilege (EoP). This kind of attack works by gaining control of more privileged software (e.g., the operating system) and making it run a piece of malicious software that is stored in a memory area used only by applications.

This new security mechanism works by blocking any attempt to execute code stored in the user memory space while the CPU is running at a privileged level (i.e., while it is running operating system instructions). So, even if a hacker is able to hijack the operating system, the malicious code won’t be able to run from the user memory space.

Figure 5: Supervisory Mode Execution Protection (SMEP)
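The rule described above can be modelled in a few lines. This is only a toy model: the `Page` class, the `can_fetch_instruction` function, and the `smep_enabled` flag are illustrative names of our own, not a real CPU or operating system interface.

```python
from dataclasses import dataclass

@dataclass
class Page:
    user_accessible: bool   # True for application (user-space) memory
    executable: bool = True

def can_fetch_instruction(page, supervisor_mode, smep_enabled):
    """Toy check: may the CPU execute code located in this page?"""
    if not page.executable:
        return False
    if not supervisor_mode and not page.user_accessible:
        return False  # applications cannot run code from kernel pages
    if supervisor_mode and page.user_accessible and smep_enabled:
        return False  # SMEP: privileged code may not run from user pages
    return True

user_page = Page(user_accessible=True)
kernel_page = Page(user_accessible=False)

# Privileged code jumping into a user page: allowed without SMEP, blocked with it.
print(can_fetch_instruction(user_page, supervisor_mode=True, smep_enabled=False))
print(can_fetch_instruction(user_page, supervisor_mode=True, smep_enabled=True))
# Normal cases are unaffected by SMEP.
print(can_fetch_instruction(user_page, supervisor_mode=False, smep_enabled=True))
print(can_fetch_instruction(kernel_page, supervisor_mode=True, smep_enabled=True))
```

Only the second case changes when the protection is enabled. Applications still run from their own memory, and the operating system still runs from its own.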

[nextpage title=”The Graphics Engine”]

The most important change in the Ivy Bridge microarchitecture is the use of a completely new graphics engine. While the Sandy Bridge microarchitecture uses a DirectX 10.1 engine supporting two video monitors, Ivy Bridge uses a DirectX 11 part supporting three video monitors and the new 4K HD video resolution (4096 x 2304). In fact, the HD video decoder embedded in the Ivy Bridge can decode videos up to 4096 x 4096.
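To put these resolutions in perspective, the short sketch below compares their pixel counts. The 1920 x 1080 (Full HD) baseline is our own choice of reference and is not a figure from this tutorial.

```python
def pixel_count(width, height):
    """Number of pixels in one frame."""
    return width * height

# Full HD is an assumed baseline for comparison; the other two
# resolutions are the ones quoted in the text.
resolutions = {
    "Full HD (baseline)": (1920, 1080),
    "4K": (4096, 2304),
    "decoder maximum": (4096, 4096),
}

baseline = pixel_count(*resolutions["Full HD (baseline)"])
for name, (width, height) in resolutions.items():
    pixels = pixel_count(width, height)
    print(f"{name}: {width} x {height} = {pixels:,} pixels "
          f"({pixels / baseline:.2f}x the baseline)")
```

A 4K frame has roughly four and a half times as many pixels as a Full HD frame, which gives an idea of the extra work the decoder has to do for each frame.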

Instead of having two separate engines, one handling 2D processing and another handling 3D processing, several blocks were combined and are shared by the two engines, as you can see in Figure 6. (The blocks in green are the ones shared. In the slides presented on this page, “$” means “cache.”) In Figure 7, you can see the full block diagram of the Ivy Bridge graphics engine.

Figure 6: Block diagram of the combined 2D/3D engine

Figure 7: Full block diagram of the Ivy Bridge graphics engine

Several enhancements were made to the 2D portion of the graphics engine. Let’s talk about them separately.

[nextpage title=”The 2D Graphics Engine”]

The “2D” or “media” part of the graphics engine is in charge not only of generating the image you see while running regular programs such as word processors and web browsers, but also of encoding and decoding video. This way, instead of the CPU running instructions for video encoding and decoding, these tasks are handled by the graphics engine, which improves performance and image quality.

The decoding engine used in Ivy Bridge is called MFX, or Multi-Format Codec Engine, and supports the AVC, VC1, and MPEG2 formats. (The first two are used by Blu-ray discs, while the third is used by DVDs.) As mentioned, this engine is capable of decoding not only 4K videos (4096 x 2304) but also videos up to 4096 x 4096.

Figure 8: The MFX

For video encoding, Ivy Bridge uses a two-stage encoder. The first stage, called “ENC,” runs in software on the processing cores of the graphics engine (called “EUs,” or Execution Units), while the second stage, called “PAK,” runs in hardware in the MFX pipeline. Figure 9 shows which parts of the video encoding process run at each stage.

Figure 9: Video encoding

In Figure 10, you can see the video enhancement features supported by the Ivy Bridge video engine.

Figure 10: Video enhancement features
