Triangle throughput

Given the architectural differences between the various GPUs in terms of geometry processing, we wanted to take a closer look at the subject. First of all, we looked at triangle throughput in two different situations: when all the triangles are drawn, and when all the triangles are removed by back-face culling (because they aren’t facing the camera):
Although the Radeon HD 7900s and 6900s can indeed process 2 triangles per cycle, the GeForce GTX 580 retains the advantage with 4 triangles per cycle. When the triangles actually have to be rendered, however, its throughput drops, as NVIDIA has limited the GeForces in this respect to differentiate them from the Quadros.
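To put the per-cycle figures in perspective, a theoretical peak rate can be estimated as triangles per cycle multiplied by the GPU clock. The sketch below is only an illustration: the reference clocks (925 MHz, 880 MHz and 772 MHz) are assumptions that are not given in this section, the function name is hypothetical, and the results are upper bounds, not the measured throughput.

```python
def peak_triangle_rate(triangles_per_cycle: float, clock_mhz: float) -> float:
    """Return a theoretical peak rate in millions of triangles per second."""
    # One MHz is a million cycles per second, so the product is already in Mtris/s.
    return triangles_per_cycle * clock_mhz


# (triangles per cycle, assumed reference clock in MHz)
# The per-cycle figures come from the text; the clocks are assumptions.
gpus = {
    "Radeon HD 7970": (2, 925),
    "Radeon HD 6970": (2, 880),
    "GeForce GTX 580": (4, 772),
}

for name, (per_cycle, clock_mhz) in gpus.items():
    rate = peak_triangle_rate(per_cycle, clock_mhz)
    print(f"{name}: {rate:.0f} Mtris/s theoretical peak")
```

Under these assumptions the GeForce GTX 580 keeps a clear theoretical lead despite its lower clock, and this estimate only applies to the culled case, since its rate for rendered triangles is limited.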
Next we carried out a similar test but using tessellation:
While the Radeon HD 7970 doesn’t really differentiate itself from the Radeon HD 6970 without tessellation, its performance is significantly better with tessellation enabled, though the GeForce GTX 580 still retains an advantage.
The architecture of the Radeons means that they can be overloaded by the quantity of data that tessellation generates, which then drastically reduces their speed. Doubling the size of the buffer dedicated to the tessellation unit in the Radeon HD 6800s gave them significantly higher performance than the Radeon HD 5800s. Parallelisation of geometry processing then allowed the Radeon HD 6900s to close the gap a little on the GeForces, and the Radeon HD 7970 reduces it further.
The GeForce GTX 580 benefits from an architecture that handles geometry in a distributed way, at the level of the processing units, which avoids centralising geometric amplification and the overload that can result.