NVIDIA GeForce 7800 GTX
by Damien Triolet and Marc Prieur
Published on June 22, 2005

Branching
One of the main innovations introduced with the GeForce 6800 was dynamic branching in pixel shaders. It makes some shaders easier to write and makes others more efficient by skipping part of the shader for pixels that don't need it. For example, why apply a costly filter to smooth shadow edges if the pixel is in the middle of a shadow? Dynamic branching makes it possible to detect whether a pixel needs this processing or not. Splinter Cell Chaos Theory uses this technique, whereas Chronicles of Riddick calculates everything for all pixels. Performance drops by 10 to 15% in the first and by 50% in the second. Of course the algorithms aren't identical, but it does give an idea of what dynamic branching can do.

The current implementation isn't ideal, however, and it is only efficient in very specific cases. In a GPU, groups of hundreds or thousands of pixels are processed together. The instruction flow isn't handled per pixel but per group of pixels, which means that the instruction sequence has to be the same for all pixels of a given group. In other words, when there is a branch, all pixels of the group have to take the same side. Otherwise, both sides have to be calculated for all pixels, with masks used to write only the result of the required branch. As branching instructions aren't free, it is easy to reduce performance instead of improving it. Let's now see how the situation has changed from the 6800 to the 7800.
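The behaviour described above can be sketched with a small cost model. This is only an illustration under simplifying assumptions, not a description of the real hardware: the function names, the cycle counts and the group size of 4 used in the example are all hypothetical, and a real GPU works on much larger groups.

```python
def group_cost(takes_branch, cost_if, cost_else, branch_overhead):
    """Cycles per pixel for one group (simplified model, hypothetical numbers)."""
    if all(takes_branch):
        # Coherent group: only the "if" side is executed.
        return branch_overhead + cost_if
    if not any(takes_branch):
        # Coherent group: only the "else" side is executed.
        return branch_overhead + cost_else
    # Divergent group: both sides are executed and a mask selects the result.
    return branch_overhead + cost_if + cost_else


def frame_cost(mask, group_size, cost_if, cost_else, branch_overhead):
    """Total cycles over all pixels, processed in fixed-size groups."""
    total = 0
    for start in range(0, len(mask), group_size):
        group = mask[start:start + group_size]
        total += len(group) * group_cost(group, cost_if, cost_else, branch_overhead)
    return total


# Eight pixels, groups of four, an expensive side (20 cycles), a cheap side
# (2 cycles) and a 3-cycle branching overhead. All values are made up.
coherent = [True] * 4 + [False] * 4
divergent = [True, False] * 4
no_branching = 20 * 8  # every pixel runs the expensive path, no overhead

print(frame_cost(coherent, 4, 20, 2, 3))   # cheaper than no_branching
print(frame_cost(divergent, 4, 20, 2, 3))  # more expensive than no_branching
print(no_branching)
```

In this model the same shader is faster than the version without branching when the groups are coherent, and slower when every group contains pixels from both sides.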

Branching instruction costs:

Case 1
6800: 3 cycles
7800: 2 cycles
6800, one year ago: 9 cycles

Case 2
6800: 5 cycles
7800: 5 cycles
6800, one year ago: 9 cycles

The difference between the first and the second case is the type of element used as output: a constant color in the first and an interpolated color in the second. You may have noticed the great progress made by the drivers in one year. Without going into the details, they can now finish the pixel shader before having to process all branching instructions.
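To put these cycle counts in perspective, here is a rough break-even calculation. The costs are the ones from the table above; the model itself is an assumption of ours (every group of pixels is taken to be coherent, and the 20 cycles of skipped work is a hypothetical figure), so treat the result as an order of magnitude only.

```python
# Branching instruction costs in cycles, copied from the table above.
BRANCH_COST = {
    "case 1": {"6800": 3, "7800": 2, "6800, one year ago": 9},
    "case 2": {"6800": 5, "7800": 5, "6800, one year ago": 9},
}


def net_saving(branch_cost, skipped_cycles, skip_fraction):
    """Average cycles saved per pixel; a negative value means branching is a loss."""
    return skip_fraction * skipped_cycles - branch_cost


def break_even_fraction(branch_cost, skipped_cycles):
    """Smallest fraction of pixels that must skip the work for branching to pay off."""
    return branch_cost / skipped_cycles


SKIPPED_CYCLES = 20  # hypothetical cost of the code that the branch can skip

for case, gpus in BRANCH_COST.items():
    for gpu, cost in gpus.items():
        fraction = break_even_fraction(cost, SKIPPED_CYCLES)
        print(f"{case}, {gpu}: branching pays off above {fraction:.0%} of pixels skipped")
```

With these assumptions, a 9-cycle branch only pays off when nearly half of the pixels skip the work, whereas a 2-cycle branch pays off as soon as one pixel in ten does.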


Now we use a shader that lets us specify the branching granularity (the number of consecutive pixels that take the same branch). If the block size matches the one handled by the GPU (or is bigger), branching will improve performance. The results show that with the 7800, branching improves performance for smaller blocks of pixels, meaning that it is more efficient. When asked about this improvement, David Kirk, NVIDIA's Chief Scientist, answered that the gain comes from a better distribution of blocks of pixels between the different quad engines. It means that the quad engines are now more independent and can process blocks taking different branches. This wasn't so for the GeForce 6 series, at least in this test, as one big block was shared by all quad engines. Because of that, a GeForce 6600 GT could be more efficient than a 6800 GT (while still providing lower performance in most cases).
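The link between the branching granularity and the block size handled by the GPU can be illustrated with a few lines of code. This is a minimal sketch and not the shader used for the test: it assumes pixels laid out on a single line (real GPUs work on 2D tiles), runs of pixels that simply alternate between the two branches, and a hypothetical function name.

```python
def divergent_batch_fraction(granularity, batch_size, total_pixels):
    """Fraction of hardware batches that contain pixels from both branches.

    Runs of `granularity` consecutive pixels alternate between the two
    branches, so pixel i takes branch (i // granularity) % 2.
    """
    batches = 0
    divergent = 0
    for start in range(0, total_pixels, batch_size):
        end = min(start + batch_size, total_pixels)
        first_run = start // granularity
        last_run = (end - 1) // granularity
        batches += 1
        # A batch that spans more than one run necessarily mixes both branches.
        if first_run != last_run:
            divergent += 1
    return divergent / batches


# Hardware batches of 1024 pixels (an illustrative figure), 16384 pixels in total.
for granularity in (256, 1024, 4096):
    share = divergent_batch_fraction(granularity, 1024, 16384)
    print(f"granularity {granularity}: {share:.0%} of batches execute both branches")
```

As long as the granularity is smaller than the batch, every batch has to execute both branches; once it matches the batch size or exceeds it, in this aligned case, none of them do.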

Without this optimisation, the GeForce 7800 GTX would have been even less efficient than the 6800 GT because of its higher number of quad engines. It would have had to process this type of branching in groups of 6000 pixels instead of groups of 1000. This optimisation is of course welcome, but it is important to remember that 1000 pixels is still quite large, and that this will have to be greatly reduced in the future for branching to be really efficient in as many cases as possible.
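The 6000 and 1000 pixel figures follow from the number of quad engines. The short calculation below assumes roughly 1000 pixels per quad engine, which is our own inference from those two figures and not a number given by NVIDIA; the function name is hypothetical, and the pipeline counts are the usual ones for these cards, with four pixel pipelines per quad engine.

```python
# Pixel pipelines per GPU; each quad engine handles four of them.
PIXEL_PIPELINES = {
    "GeForce 6600 GT": 8,
    "GeForce 6800 GT": 16,
    "GeForce 7800 GTX": 24,
}
PIXELS_PER_QUAD_ENGINE = 1000  # assumption inferred from the 6000 vs 1000 figures


def branching_group(gpu, shared_across_engines):
    """Approximate number of pixels that must take the same branch."""
    quad_engines = PIXEL_PIPELINES[gpu] // 4
    if shared_across_engines:
        # One big block spread over all quad engines (GeForce 6 behaviour).
        return quad_engines * PIXELS_PER_QUAD_ENGINE
    # Independent quad engines (GeForce 7800 behaviour).
    return PIXELS_PER_QUAD_ENGINE


for gpu in PIXEL_PIPELINES:
    print(gpu, branching_group(gpu, shared_across_engines=True))
print("GeForce 7800 GTX, independent engines:",
      branching_group("GeForce 7800 GTX", shared_across_engines=False))
```

Under this assumption, a shared block grows with the number of quad engines, which is consistent with the 6600 GT handling fine-grained branching better than the 6800 GT.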

Vertex Shader
As with the pixel shaders, NVIDIA announced improvements to the vertex shaders. This time it isn't a question of something added to the calculation units but simply of improved efficiency. We don't have any additional information, because NVIDIA didn't go into detail. We tested T&L, VS 1.1, VS 2.0 and VS 2.X/3.0 performance in RightMark:


In Diffuse with one light source, performance in the four modes increased slightly more than the move from 6 to 8 vertex shading pipelines would suggest. In Diffuse & Specular with three light sources, performance increased a little more. There is indeed an improvement in vertex shader efficiency (+7%).





Copyright © 1997- Hardware.fr SARL. All rights reserved.