ROPs: background

ROPs are the units that write pixels to memory, and they are part of the GPU memory subsystem. Being able to deactivate half of them without sacrificing the memory bus is therefore surprising, and we don’t know exactly how AMD has done it. We carried out a few theoretical tests to try to find an answer and compared the results with those we obtained previously:
During simple writes to memory, the Radeon HD 5830 attains its theoretical fillrate at 32 bits. At 64 and 128 bits, however, performance is very weak, a good deal lower than that of the Radeon HD 5770. Strangely, it corresponds to exactly half of what the Radeon HD 4890 achieves, even though that card has a similar configuration in terms of ROPs and memory bandwidth.
During blending, pressure on memory bandwidth increases and the fillrate falls. In spite of its 256-bit bus, the Radeon HD 5830 struggles to differentiate itself from the Radeon HD 5770 and is some way behind the Radeon HD 4890.
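To illustrate why blending and wider pixel formats push the fillrate down, here is a minimal sketch of a simple upper-bound model: the fillrate is capped either by the ROPs or by memory bandwidth, whichever is lower. The function name, the one-pixel-per-ROP-per-clock rate, and the clock and data-rate figures in the usage lines are illustrative assumptions, not measurements from these tests, and the model ignores compression and caching.

```python
def fillrate_gpixels(rops, core_mhz, bus_bits, data_rate_gbps,
                     bytes_per_pixel, blending=False):
    """Upper bound on fillrate in Gpixels/s under a simple two-limit model."""
    # ROP limit: assumes one pixel per ROP per clock (an assumption; real
    # hardware may process wide formats at a reduced rate).
    rop_limit = rops * core_mhz / 1000.0
    # Memory bandwidth in GB/s: bus width in bytes times per-pin data rate.
    bandwidth_gbs = bus_bits / 8 * data_rate_gbps
    # Blending reads the destination pixel before writing it back,
    # so each pixel costs twice the memory traffic.
    traffic_per_pixel = bytes_per_pixel * (2 if blending else 1)
    mem_limit = bandwidth_gbs / traffic_per_pixel
    return min(rop_limit, mem_limit)


# Hypothetical card: 16 ROPs at 800 MHz, 256-bit bus at 4.0 Gbps per pin.
for bpp in (4, 8, 16):  # 32-, 64- and 128-bit pixel formats
    print(bpp * 8, "bits:",
          fillrate_gpixels(16, 800, 256, 4.0, bpp),
          "write /",
          fillrate_gpixels(16, 800, 256, 4.0, bpp, blending=True),
          "blend")
```

In this model, narrow formats are limited by the ROPs, while wide formats and blending quickly become limited by bandwidth, which is why the bus width that the ROPs can actually reach matters so much.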
Something isn’t quite right with the 16 ROP / 256-bit pairing on this Radeon HD 5830. Of course we questioned AMD on this, but the manufacturer acted as though it didn’t understand what we were getting at and simply repeated that 16 ROPs have been deactivated and that there is indeed a 256-bit memory bus. As AMD has made different simplifications in each of its illustrations of the memory architecture, these contradict each other somewhat and it’s difficult to draw much from them.
If the ROPs truly are uncoupled from the memory controllers and these memory controllers all remain active, as claimed by AMD, we can suppose that each group of ROPs only has a bus of limited width for communication with the memory controllers. This bus would then have been sized when the chip was designed so that all the groups of ROPs together can feed the 256-bit memory bus combined with GDDR5 memory.
Bandwidth limitations for the ROPs would then be internal to the GPU, which would prevent a Radeon HD 5830-type Cypress configuration from exploiting all the available bandwidth. Other units (display engine, CrossFire etc.) could certainly use the additional memory bandwidth, but the biggest consumers, the ones that really need it, wouldn’t be able to and would effectively be limited to a 128-bit memory bus, as with the Juniper GPU that powers the Radeon HD 5770. This would explain our results as well as the advantage of the Radeon HD 4890, as its 16 ROPs have been designed to link up with a 256-bit bus.
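The hypothesis can be put into numbers with a short sketch. It assumes, purely for illustration, that each ROP has an equal fixed-width share of an internal link sized so that the full set of ROPs exactly matches the external bus. The function name and this even split are assumptions, not something AMD has confirmed.

```python
def rop_visible_bus_bits(active_rops, designed_rops, bus_bits):
    """Bus width (in bits) the ROPs could use if each ROP has a fixed
    internal link and all designed ROPs together match the memory bus."""
    # Share of the external bus that one ROP's internal link can feed
    # (hypothetical even split across the ROPs the chip was designed with).
    bits_per_rop = bus_bits / designed_rops
    # Active ROPs cannot use more than their links allow, nor more than
    # the external bus provides.
    return min(bus_bits, active_rops * bits_per_rop)


# Cypress was designed with 32 ROPs for a 256-bit bus; 16 remain active.
print(rop_visible_bus_bits(16, 32, 256))
# A chip designed with 16 ROPs for a 256-bit bus keeps the full width.
print(rop_visible_bus_bits(16, 16, 256))
```

Under these assumptions, the first case comes out at the equivalent of a 128-bit bus for the ROPs and the second at the full 256 bits, which matches the pattern described above.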
Note that we don’t yet know whether any of the L2 caches have been deactivated, or whether the same goes for their links to the memory controllers.