Texturing performance
We measured texture access performance with bilinear filtering across several formats: standard 32-bit (4x INT8), 64-bit “HDR” (4x FP16), 128-bit (4x FP32) and 32-bit RGB9E5, an HDR format introduced with DirectX 10 that stores HDR textures in 32 bits with a few tradeoffs.
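To make the RGB9E5 tradeoffs concrete, here is a minimal Python sketch of how a packed texel can be decoded. It assumes the bit layout of DXGI_FORMAT_R9G9B9E5_SHAREDEXP (red in the low 9 bits, then green, then blue, with a 5-bit shared exponent in the top bits and an exponent bias of 15). The function name `decode_rgb9e5` is ours, for illustration only. The three channels share a single exponent and there are no sign bits and no alpha channel, which is where the tradeoffs come from.

```python
from typing import Tuple

MANTISSA_BITS = 9   # bits per colour channel
EXPONENT_BIAS = 15  # bias of the 5-bit shared exponent


def decode_rgb9e5(packed: int) -> Tuple[float, float, float]:
    """Decode one 32-bit RGB9E5 texel into three floats (illustrative helper)."""
    r = packed & 0x1FF          # bits 0-8
    g = (packed >> 9) & 0x1FF   # bits 9-17
    b = (packed >> 18) & 0x1FF  # bits 18-26
    e = (packed >> 27) & 0x1F   # bits 27-31, shared by all three channels
    # The mantissas have no implicit leading 1, so each channel is simply
    # mantissa * 2^(exponent - bias - mantissa_bits).
    scale = 2.0 ** (e - EXPONENT_BIAS - MANTISSA_BITS)
    return (r * scale, g * scale, b * scale)
```

Because all three channels use the same exponent, a texel with one very bright channel loses precision in its darker channels, which a 4x FP16 texture would not.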
The GeForce GTX 500s can filter FP16, FP11, FP10 and RGB9E5 textures at full speed, but the Radeon HD 6900s and the Radeon HD 7970 have so much filtering power that, even though they filter FP16 textures at half speed, they aren’t far behind the GeForces.
Note that we had to raise the power consumption limit of the Radeon HD 6900s to its maximum here; otherwise the clocks are reduced during this test. By default, these Radeons therefore seem unable to benefit fully from their texturing power! The good news is that this is no longer the case with the Radeon HD 7970.
We measured the fillrate without and then with blending, using different data formats:
In terms of fillrate, the Radeons have an advantage over the GeForce GTX 580, above all with FP10, a format they process at full speed while the GeForces process it at half speed. Given the GeForces’ limited datapaths between the SMs and the ROPs, it’s a shame that NVIDIA hasn’t given its GPU the ability to benefit from the FP10 and FP11 formats.
Like the GeForces, the Radeons can process single-channel FP32 at full speed without blending, but they also retain this speed with blending. Thanks to its 384-bit memory bus and the extra bandwidth it provides relative to the number of ROPs, the Radeon HD 7970 does significantly better with blending. The wider memory bus therefore does serve a purpose!
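The link between memory bandwidth and fillrate with blending can be illustrated with some back-of-the-envelope arithmetic. The Python sketch below is a simplified model, not a description of how either GPU actually works. It assumes that each blended pixel costs one read and one write of the render target, and it ignores caches and compression. The ROP counts, clocks and memory data rates in the usage note are illustrative assumptions that do not come from the measurements above. All function names are ours.

```python
def required_bandwidth_gbs(rops: int, clock_mhz: float,
                           bytes_per_pixel: int, blending: bool) -> float:
    """Bandwidth (GB/s) needed to keep every ROP busy each clock (simplified)."""
    pixels_per_second = rops * clock_mhz * 1e6
    # Assumption: blending reads the destination pixel and writes it back,
    # so it doubles the traffic compared with a plain write.
    accesses_per_pixel = 2 if blending else 1
    return pixels_per_second * bytes_per_pixel * accesses_per_pixel / 1e9


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth (GB/s) from bus width and per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


def limited_by_memory(rops: int, clock_mhz: float, bytes_per_pixel: int,
                      blending: bool, bus_width_bits: int,
                      data_rate_gbps: float) -> bool:
    """True if the ROPs could consume more bandwidth than the memory offers."""
    needed = required_bandwidth_gbs(rops, clock_mhz, bytes_per_pixel, blending)
    available = memory_bandwidth_gbs(bus_width_bits, data_rate_gbps)
    return needed > available
```

With assumed figures of 32 ROPs at 925 MHz on a 384-bit bus at 5.5 Gbps, 32-bit pixels with blending need roughly 237 GB/s against 264 GB/s available, so `limited_by_memory(32, 925, 4, True, 384, 5.5)` returns `False`. With the same ROP count at 880 MHz on a 256-bit bus, about 225 GB/s would be needed against 176 GB/s available, and the function returns `True`. Under this model, the wider bus is what lets the ROPs sustain their rate once blending doubles the traffic.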