NVIDIA is launching two chips in the GeForce GTX 200 family today: GTX 280 and GTX 260. In Figure 4, you can see a block diagram of the new GeForce GTX 280, and in Figure 5 a photo of the GeForce GTX 280 die showing the location of the main blocks.
Figure 4: GeForce GTX 280 block diagram.
Figure 5: Location of the main blocks on the GeForce GTX 280 die.
In the middle of the block diagram in Figure 4 you can see 10 blocks. Each of these blocks is called a Thread Processing Cluster, or simply TPC, and Figure 6 gives a more detailed view of one of them.
Figure 6: Thread Processing Cluster (TPC).
Each TPC has one L1 memory cache and three sets of processing units. NVIDIA calls each of these sets a Streaming Multiprocessor (SM). Each SM has eight processing units (labeled as “core” by NVIDIA; they are also known as Streaming Processors or SPs) sharing a small piece of RAM (labeled as “local memory” by NVIDIA). The addition of these small pieces of RAM is one of the main differences between the architecture used on the GeForce GTX 200 series and the one used by the GeForce 8 and GeForce 9 series. You can learn more about the architecture of these two series in our article GeForce 8 Series Architecture (despite its name, GeForce 9 is based on the GeForce 8 architecture).
The main idea behind DirectX 10 (i.e., the Shader 4.0 programming model) is that each processing unit is a “generic” unit, able to perform any kind of processing (this concept helped GPGPU a lot). Previously, the GPU had specific units for each kind of processing, most notably separate processing units for pixel shaders and for vertex shaders.
Since each SM inside the TPC has eight processing units, each TPC has 24 processing units (8 x 3), for a total of 240 processing units (24 x 10 TPCs) on GeForce GTX 280. GeForce GTX 260 has fewer units, 192, because it has eight TPCs instead of 10.
Inside each TPC you can also find eight texture filtering units (labeled as “TF” in Figure 6), for a total of 80 texture units on GeForce GTX 280 and 64 on GeForce GTX 260.
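The unit counts quoted above follow directly from the per-TPC figures. The short Python sketch below simply redoes that arithmetic; the constant and function names are made up for illustration and are not NVIDIA terminology.

```python
# Per-TPC figures as described in the text (names are illustrative only).
SMS_PER_TPC = 3              # Streaming Multiprocessors per TPC
CORES_PER_SM = 8             # processing units ("cores") per SM
TEXTURE_UNITS_PER_TPC = 8    # texture filtering units per TPC

# Number of TPCs on each model, as stated in the text.
TPC_COUNT = {"GeForce GTX 280": 10, "GeForce GTX 260": 8}


def unit_counts(tpcs):
    """Return (processing units, texture filtering units) for a chip with `tpcs` TPCs."""
    cores = tpcs * SMS_PER_TPC * CORES_PER_SM
    texture_units = tpcs * TEXTURE_UNITS_PER_TPC
    return cores, texture_units


for model, tpcs in TPC_COUNT.items():
    cores, texture_units = unit_counts(tpcs)
    print(f"{model}: {cores} processing units, {texture_units} texture filtering units")
```

Running it gives 240 and 80 units for GeForce GTX 280 and 192 and 64 units for GeForce GTX 260, matching the numbers in the text.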
As you can see in Figure 4, GeForce GTX 280 has eight memory interface units, each one 64 bits wide. This means that GeForce GTX 280 has a 512-bit (64-bit x 8) memory interface. It was about time: GeForce 8800 GTX uses a 384-bit memory interface and GeForce 9800 GTX uses a 256-bit interface. This model supports 1 GB of video memory, with two 64 MB (512 Mbit) chips attached to each memory interface unit (64 MB per chip x 2 x 8). GeForce GTX 260 has seven memory interface units, meaning that this version uses a 448-bit memory interface (64-bit x 7) and comes with 896 MB of video memory (64 MB per chip x 2 x 7).
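The same kind of arithmetic applies to the memory subsystem. The sketch below, again with illustrative names of our own choosing, derives the bus width and memory capacity from the number of memory interface units, using only the figures given in the text.

```python
# Figures taken from the text (names are illustrative only).
BITS_PER_INTERFACE = 64    # width of each memory interface unit
CHIPS_PER_INTERFACE = 2    # memory chips attached to each interface unit
MB_PER_CHIP = 64           # 512 Mbit chips = 64 MB each


def memory_config(interfaces):
    """Return (bus width in bits, capacity in MB) for `interfaces` memory interface units."""
    bus_width_bits = interfaces * BITS_PER_INTERFACE
    capacity_mb = interfaces * CHIPS_PER_INTERFACE * MB_PER_CHIP
    return bus_width_bits, capacity_mb


for model, interfaces in {"GeForce GTX 280": 8, "GeForce GTX 260": 7}.items():
    width, capacity = memory_config(interfaces)
    print(f"{model}: {width}-bit interface, {capacity} MB")
```

Eight interface units give a 512-bit bus and 1,024 MB (1 GB); seven give a 448-bit bus and 896 MB.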
The GeForce GTX 200 series finally supports double-precision floating point (i.e., 64-bit floating-point registers).
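To get a feel for what the extra precision means, the sketch below rounds values to 32-bit single precision and back using Python's standard `struct` module. It runs on the CPU and says nothing about GPU performance; it only illustrates the difference between the two formats. The helper name `to_single` is our own, and the sketch assumes Python's `float` is a 64-bit IEEE 754 double, which is the case on typical platforms.

```python
import struct


def to_single(x):
    """Round a 64-bit Python float to the nearest 32-bit float and convert it back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 0.1 is stored with less accuracy in 32 bits than in 64 bits,
# so the round trip does not return the same value.
print(to_single(0.1) == 0.1)

# 2**24 + 1 is exactly representable in 64 bits but not in 32 bits,
# where it is rounded to 2**24.
print(to_single(16777217.0) == 16777217.0)
```

Both comparisons print `False`: single precision carries about 24 bits of mantissa, while double precision carries 53.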
Chips from the new GeForce GTX 200 series bring the updated 2D video processing engine, called VP2 or “2nd generation PureVideo HD,” which has a fully hardware-based H.264 decoder (used to decode high-definition movies such as Blu-ray and HD-DVD), freeing the system CPU from this task. The same decoder is found on all video cards from the GeForce 8 and 9 series except those using "G80" chips (GeForce 8800 GTS, GTX and Ultra), which are based on the previous PureVideo HD engine, VP1, and still partially use the system CPU for decoding.
The GeForce GTX 200 series also has more power-saving modes. Four modes are available:
- Idle/2D power mode: used when you are on the Windows desktop working with regular programs, like word processing and internet browsing. The video card consumes around 25 W in this mode.
- Video playback mode: used when you play back movies with the hardware-based decoder built into the graphics chip instead of using the system CPU for decoding. The video card consumes around 35 W in this mode.
- Full 3D performance mode: used when you are playing games and the video card activates its 3D engine. Power consumption is at its highest in this mode (up to 236 W on GeForce GTX 280 and 182 W on GeForce GTX 260).
- HybridPower: This is a technology where 2D video is produced by the motherboard (i.e., on-board video) and the video card is automatically turned off when you are not playing games. Thus power consumption from the video card is zero when you are not playing games. You need a HybridPower compliant motherboard in order to use this feature.