The architecture in practice
How does the GeForce 6600 GT perform in practice? We started with a couple of theoretical tests. The first is a fillrate test that measures the chip’s performance on simple operations. The objective was to measure the 6600 GT’s efficiency compared to the 6800 GT and 6800.

Figures are expressed in pixels per clock. The first test simply fills the screen with colored pixels. This type of test isn’t limited by bandwidth or texturing units and shows the maximum number of pixels the chip can compute. Only the GeForce 6600 reached its theoretical maximum. The 6800 GT and 6800 come close to theirs (16 and 12 respectively).
The Z fill test measures the GPU’s speed with Z data. The theoretical maximum is twice that of the Color Fill test, and results come to within about a tenth of it.
The third test combines Z fill and Color Fill. It puts more load on memory and is closer to real-world use, though still without textures. With no bandwidth limitation the score should be identical to the Color Fill score, but in practice the results differ. While the GeForce 6600 reached 94% of its Color Fill performance in this test, the 6800 GT reached 76.6% and the 6800 81.7%.
What does this test mean? Simply that the 6600 GT has memory bandwidth in proportion to its needs, while the 6800 and especially the 6800 GT do not have enough. Because of this memory limit, in practice the 6800 GT behaves as if it had only about 12 pipelines instead of 16, and the 6800 about 10 instead of 12.
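The effective pipeline counts quoted above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming the effective count is just the nominal pipeline count multiplied by the share of Color Fill performance kept in the combined test; the function name `effective_pipelines` is ours and purely illustrative.

```python
# Nominal pixel pipelines and the combined Z + Color Fill score expressed
# as a fraction of the Color Fill score (both taken from the text).
chips = {
    "GeForce 6800 GT": (16, 0.766),
    "GeForce 6800": (12, 0.817),
}


def effective_pipelines(nominal, efficiency):
    """Pipelines the chip behaves as if it had once the memory limit applies.

    Assumption: the loss scales linearly with the measured efficiency.
    """
    return nominal * efficiency


for name, (nominal, efficiency) in chips.items():
    value = effective_pipelines(nominal, efficiency)
    print(f"{name}: {value:.1f} effective pipelines out of {nominal}")
# GeForce 6800 GT: 12.3 effective pipelines out of 16
# GeForce 6800: 9.8 effective pipelines out of 12
```

Rounded to whole pipelines, this gives 12 and 10.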
We continue this series of theoretical tests with textures.

It’s not surprising that the test with a single bilinear-filtered texture gives the same results as the Color Fill test. The next test uses alpha blending (mixing with transparency), which demands more memory resources. Once more, the GeForce 6600 GT is the chip least restricted by memory: it keeps 80% of its performance without alpha blending, against 51% for the 6800 GT and 63% for the 6800. These two chips would perform the same if they used only 8 ROPs instead of 16.
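The remark about 8 ROPs can be checked roughly from the percentages. This is only a sketch: it assumes each 6800 chip starts from its theoretical Color Fill maximum (16 and 12 pixels per clock, from the text) rather than from its measured score, and the helper name `blended_rate` is hypothetical.

```python
# Theoretical Color Fill maximum (pixels per clock) and the share of
# performance kept once alpha blending is enabled (both from the text).
blend = {
    "GeForce 6800 GT": (16, 0.51),
    "GeForce 6800": (12, 0.63),
}


def blended_rate(max_pixels_per_clock, kept):
    """Approximate pixels per clock written with alpha blending enabled."""
    return max_pixels_per_clock * kept


rates = {name: blended_rate(*values) for name, values in blend.items()}

for name, rate in rates.items():
    print(f"{name}: about {rate:.1f} pixels per clock with alpha blending")
```

Under these assumptions both chips land close to 8 pixels per clock, which is what 8 ROPs would deliver.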
The double texturing test highlights what distinguishes the 6600’s architecture from the 6800’s. The 6600 GT stays at 4 pixels per clock, which is consistent with its architecture, while the other chips’ performance is cut in half. The triple and quadruple texturing tests only confirm these first results, as does the test with a floating-point texture.
The pixel and vertex shading tests that follow are a little more telling. This time we chose not to express results in pixels per clock cycle and kept the graphics cards’ raw performance. We have also included a Radeon 9800 for comparison with the GeForce 6600 GT.
Here are the pixel shading results from an in-house test using pixel shader 2.0. The pixel shaders are extracted from current applications and run in another of our applications (results in Mpixels/s):

The GeForce 6600 GT has excellent results! It is slightly faster than the GeForce 6800 and much faster than our previous-generation reference chip, the Radeon 9800 Pro. The 6600 GT’s performance is also consistent with the 6800’s. Given their pipeline counts (8 and 12) and frequencies (500 and 325 MHz), the 6600 GT should in theory be 2.6% faster. In practice, it is 4.1%, 0.7% and 3.6% faster respectively.
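The 2.6% theoretical figure follows from multiplying pipelines by clock frequency for each chip. Here is a minimal sketch of that calculation, assuming peak pixel shading throughput scales with pipelines × frequency and nothing else; the function name is illustrative, and averaging the three measured gains is our own addition, not part of the original test.

```python
def theoretical_throughput(pipelines, clock_mhz):
    """Peak throughput proxy: one pixel per pipeline per clock, in Mpixels/s."""
    return pipelines * clock_mhz


gf_6600_gt = theoretical_throughput(8, 500)   # 4000
gf_6800 = theoretical_throughput(12, 325)     # 3900

# Expected advantage of the 6600 GT over the 6800, in percent.
advantage = (gf_6600_gt / gf_6800 - 1) * 100
print(f"Theoretical advantage: {advantage:.1f}%")  # Theoretical advantage: 2.6%

# Measured advantages quoted in the text, in percent.
measured = [4.1, 0.7, 3.6]
mean_measured = sum(measured) / len(measured)
print(f"Mean measured advantage: {mean_measured:.1f}%")  # Mean measured advantage: 2.8%
```

The measured gains scatter around the theoretical value, which is in line with the consistency noted above.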
The next test covers vertex shaders. We used the RightMark 3D geometry test with fixed-function T&L and in VS1.1, VS2.0, VS2.x and VS3.0. The same thing is calculated each time.

With the 65.76 drivers, VS1.1 performance in this test falls noticeably compared to the 61.77 drivers, and is illogically lower than in VS2.0 mode. In every situation the GeForce 6600 GT’s performance is consistent with its number of units and the chip’s frequencies.