NVIDIA GeForce 6600 GT - BeHardware
>> Graphic cards
Written by Marc Prieur
Published on September 7, 2004
URL: http://www.behardware.com/art/lire/514/
Page 1
Introduction, GeForce 6600 GT
Five months ago NVIDIA released the GeForce 6800 in a marketing blitz. Based on a new architecture, the GeForce 6800 comes in three versions, the cheapest costing around $300. Of course, not all of us are prepared to invest that much in a graphic card. NVIDIA knows this perfectly well and has decided to launch a new graphic card based on the GeForce 6 architecture. This mid-range card is the GeForce 6600.
The GeForce 6600 GT
The main elements of the GeForce 6800 are found in the GeForce 6600. All the GeForce 6800 units are present and almost identical, but fewer in number. The GeForce 6600 GT is made of 144 million transistors on a 110 nm fabrication process, instead of 222 million transistors on a 130 nm process for the GeForce 6800. To give you a better idea, a GeForce FX 5900 has 130 million transistors and a GeForce FX 5700 has 82 million.
Here is a reminder of the GeForce 6800 3D architecture:
 There are 6 vertex shaders followed by 16 pixel pipelines. Each pipeline has a large unit which handles both shaders and standard texturing, and a smaller unit dedicated to shaders. These pipelines are connected to a 16-ROP engine (ROP for Raster Operation), which manages anti-aliasing, blending, Z-buffer compression and double Z (the ability to process twice as much Z data per clock as standard data). The standard GeForce 6800 has 5 vertex shaders and 12 pixel pipelines. All this data is then sent to memory over a 256-bit bus.
What about the GeForce 6600’s architecture?
 As you can see, the number of units has been reduced this time. The number of vertex shaders goes from 6 to 3, the pixel pipelines from 16 to 8 and the ROPs from 16 to 4. The memory bus drops to 128 bits. With this architecture, the GeForce 6600 is restricted in performance for simple pixels, for example with a one-instruction shader or a single texture with bilinear filtering. Although the pixel pipelines are able to process 8 simple pixels per clock cycle, in practice the ROPs restrict the chip to 4 pixels per clock cycle.
Of course, with more complex pixels, the problem is different. With trilinear filtering, double texturing or a more complex shader, the pixel pipelines only process 4 pixels per clock cycle anyway. It seems obvious that this choice was made with the available bandwidth in mind. With a 128-bit bus, 8 ROPs would have been significantly restricted by the bandwidth and the performance gains wouldn’t have been visible in practice (only the more complex pixels are affected). It is better to save some money on transistors, now that the pixels displayed in games are more and more complex and can’t be calculated in a single clock cycle.
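The ROP limit described above can be written as a one-line model. This is only a rough sketch: the function name and the idea of a fixed "cycles per pixel" cost are simplifying assumptions, and only the unit counts (8 pipelines, 4 ROPs) come from the article.

```python
def pixels_per_clock(pipelines: int, rops: int, cycles_per_pixel: int = 1) -> float:
    """Pixels written per clock cycle in a simplified model.

    The pipelines produce pipelines / cycles_per_pixel pixels per clock,
    and the ROPs cap how many of those can be written out.
    """
    produced = pipelines / cycles_per_pixel
    return float(min(produced, rops))


# GeForce 6600 GT unit counts from the article: 8 pixel pipelines, 4 ROPs.
simple_pixels = pixels_per_clock(pipelines=8, rops=4, cycles_per_pixel=1)
complex_pixels = pixels_per_clock(pipelines=8, rops=4, cycles_per_pixel=2)

# Hypothetical variant with 8 ROPs: no gain once a pixel costs 2 cycles.
complex_pixels_8_rops = pixels_per_clock(pipelines=8, rops=8, cycles_per_pixel=2)

print(simple_pixels, complex_pixels, complex_pixels_8_rops)
```

In this model the extra ROPs only help when a pixel costs a single cycle, which is the case the article argues is becoming rare in games.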
Here is a table with the main specifications of the 6800 and the 6600 GT.
 The figures speak for themselves. In terms of sheer power the 6600 GT is close to the standard 6800. Only the bandwidth is smaller and will probably provide a lesser performance in practice for games in high resolution and with anti-aliasing / anisotropic filtering. We will check this assumption in the following pages. A standard GeForce 6600 is also expected out. The differences will be mainly on the frequency level with 300 MHz for the GPU and at least 275 MHz for the memory. Several manufacturers intend to go beyond these specifications.
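To put the bandwidth remark into numbers, peak memory bandwidth can be estimated from the bus width and the memory clock. The sketch below assumes double-data-rate memory (two transfers per clock), which this article does not state explicitly; the function name is illustrative. The 128-bit bus and the 500 MHz and 275 MHz memory clocks are the figures quoted in this review.

```python
def memory_bandwidth_gb_s(bus_bits: int, clock_mhz: float,
                          transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock=2 is an assumption (double data rate memory).
    """
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


gt_bandwidth = memory_bandwidth_gb_s(bus_bits=128, clock_mhz=500)        # 6600 GT
std_min_bandwidth = memory_bandwidth_gb_s(bus_bits=128, clock_mhz=275)   # 6600 at minimum memory clock

# For comparison only: the same 500 MHz clock on a 256-bit bus.
wide_bus_bandwidth = memory_bandwidth_gb_s(bus_bits=256, clock_mhz=500)

print(gt_bandwidth, std_min_bandwidth, wide_bus_bandwidth)
```

Under these assumptions, halving the bus width halves the bandwidth at a given clock, which is why the 6600 GT relies on fast memory.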
The GeForce 6600GT’s features are identical to the GeForce 6800’s. The GeForce 6600GT supports Shader Model 3.0, which enables longer shaders and also introduces dynamic branching, although branches are costly in performance. We are waiting to see how developers use this function to know if it’s really a plus for the current graphic card generation.
The GeForce 6 is also the first chip to provide full FP16 format support. It has a floating point buffer and is also able to perform operations such as blending or filtering in this format. This makes it possible to handle HDR with more efficiency and flexibility.
The 6800 video engine is also in the GeForce 6600. It can decode WMV9 and encode MPEG-1/2/4 in hardware. It is, however, not yet used by the drivers. The manufacturer said they would talk to us soon about this issue. Is this good news?
Page 2
The architecture in practice
What is the GeForce 6600 GT’s performance level in practice? We started with a couple of theoretical tests. The first is a fillrate test measuring the chip’s performance for simple operations. The objective was to measure the 6600’s efficiency compared to the 6800 GT and 6800.
 Figures are expressed in pixels per clock. The first test simply consists of filling the screen with colored pixels. This type of test isn’t restricted by bandwidth or texturing units and shows the maximum number of pixels the chip can output. Only the GeForce 6600 GT reached its theoretical maximum. The 6800GT and 6800 are not far from theirs (16 and 12).
The Z fill test measures GPU speed with Z data. The theoretical maximum is twice as high as in the Color Fill test, and the results come to within about a tenth of it.
The third test combines Z fill and Color Fill simultaneously. This test requires more memory bandwidth and is closer to reality, though still without textures. With no bandwidth limitation the score should be identical to the Color Fill score, but in practice the results differ. The GeForce 6600 GT reached 94% of its Color Fill performance in this test, while the 6800GT reached 76.6% and the 6800 81.7%.
What does this test mean? Simply that the 6600 GT has the right amount of memory bandwidth, while the 6800 and especially the 6800GT don’t have enough. Because of memory bandwidth, in practice the 6800GT behaves as if it had only 12 pipelines instead of 16, and the 6800 as if it had 10.
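The "12 instead of 16" and "10 instead of 12" figures follow from the percentages above. A small sketch of the arithmetic, assuming each card’s Color Fill score equals its pipeline count (the article says this is only approximately true for the two 6800 cards, and the 6600 GT is capped at 4 pixels per clock by its ROPs, so it is left out); the function name is made up for illustration:

```python
def effective_pipelines(pipelines: int, efficiency_pct: float) -> float:
    """Pipeline count scaled by the share of Color Fill performance retained."""
    return pipelines * efficiency_pct / 100


# Pipeline counts and Z + Color Fill percentages as quoted in the article.
effective = {
    "6800 GT": effective_pipelines(16, 76.6),
    "6800": effective_pipelines(12, 81.7),
}

for card, value in effective.items():
    print(f"{card}: about {value:.1f} effective pipelines")
```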
We continue this series of theoretical tests with textures.
 It’s not surprising that the test with a single texture in bilinear filtering gives the same results as the Color Fill test. The next test uses alpha blending (mixing with transparency), which requires more memory bandwidth. Once more, the GeForce 6600 GT is the chip least restricted by memory. It retains 80% of its performance without alpha blending, compared to 51% for the 6800GT and 63% for the 6800. These two chips would have the same performance if they used only 8 ROPs instead of 16.
The double texturing test shows the specificity of the 6600’s architecture compared to the 6800’s. The 6600 GT is still at 4 pixels per clock, which is consistent with the architecture, while the other chips’ performance is cut in half. The tests with triple and quadruple texturing only confirm these first results, and the same goes for the test with a floating point texture.
The next pixel and vertex shading tests are a little more explicit. This time we chose not to represent the results in pixel / clock cycle and left the graphic cards’ gross performance. We have also integrated a Radeon 9800 for comparison with the GeForce 6600 GT.
Here are the pixel shading results of an in-house test using pixel shader 2.0. The pixel shaders are extracted from current applications and run in one of our own applications (results in Mpixels/s):
 The GeForce 6600 GT has excellent results! It is slightly faster than the GeForce 6800 and much faster than our previous-generation reference chip, the Radeon 9800 Pro. It has to be said that the 6600’s performance is consistent with the 6800’s. Given the number of pipelines (8 and 12) and their frequencies (500 and 325 MHz), the 6600 GT should in theory be 2.6% faster. In practice, it is respectively 4.1%, 0.7% and 3.6% faster.
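The 2.6% figure can be checked from the pipeline counts and clocks given above. A minimal sketch, assuming shader throughput scales simply with pipelines multiplied by clock frequency (the function name is illustrative):

```python
def shader_throughput(pipelines: int, clock_mhz: float) -> float:
    """Relative pixel shading throughput: pipelines x clock (arbitrary units)."""
    return pipelines * clock_mhz


gt_6600 = shader_throughput(pipelines=8, clock_mhz=500)    # 4000
std_6800 = shader_throughput(pipelines=12, clock_mhz=325)  # 3900

theoretical_gain_pct = (gt_6600 / std_6800 - 1) * 100
print(f"6600 GT theoretical advantage: {theoretical_gain_pct:.1f}%")
```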
The next test covers vertex shaders. We used the RightMark 3D geometry test in fixed function (T&L), VS1.1, VS2.0, VS2.x and VS3.0 modes. The same thing is calculated each time.
 With the 65.76 drivers, performance in this test falls noticeably in VS1.1 compared to the 61.77. Performance is illogically inferior to the VS2.0 mode. In every situation the GeForce 6600 GT’s performance is consistent with the number of units and the chip’s frequencies.
Page 3
100% PCI Express ?, SLI , Detonator 65.76
The GeForce 6600 manages only one interface, PCI Express, so the first GeForce 6600 graphic cards, including the one used for this test, will be exclusively in this format.
Just a reminder: PCI Express is a serial bus that doubles the bandwidth compared to AGP, both from the system to the graphic card and from the graphic card to the system. The bandwidth goes from 2 GB/s to 4 GB/s. In addition, unlike AGP, this bandwidth is available in both directions at the same time. Finally, PCI Express is able to provide 75 W to the graphic card instead of 40 W for AGP, and is more easily integrated into the motherboard.
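The 2 GB/s and 4 GB/s figures can be derived from the bus parameters. The values used below (2.5 GT/s per lane with 8b/10b encoding for first-generation PCI Express, and a 32-bit bus at roughly 66.67 MHz with 8 transfers per clock for AGP 8x) come from the public bus specifications rather than from this article, and the function names are illustrative.

```python
def pcie_gen1_mb_s(lanes: int) -> int:
    """Peak bandwidth per direction in MB/s for first-generation PCI Express."""
    raw_mbit_per_lane = 2500                      # 2.5 GT/s per lane
    payload_mbit = raw_mbit_per_lane * 8 // 10    # 8b/10b encoding overhead
    return lanes * payload_mbit // 8              # bits -> bytes


def agp_mb_s(multiplier: int) -> float:
    """Peak AGP bandwidth in MB/s: 32-bit bus, ~66.67 MHz base clock."""
    base_clock_mhz = 200 / 3
    bytes_per_transfer = 4
    return base_clock_mhz * bytes_per_transfer * multiplier


pcie_x16 = pcie_gen1_mb_s(16)   # per direction
agp_8x = agp_mb_s(8)            # shared between both directions

print(pcie_x16, round(agp_8x))
```

Because PCI Express offers this rate in each direction simultaneously, the combined figure is twice the per-direction value, whereas the AGP figure is a single shared total.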
 In practice, however, the performance gain is limited. We have already discussed this issue in a previous article. So you don’t have to change to PCI Express unless you want to buy a GeForce 6600.
Is this choice definitive? No! NVIDIA has covered almost every angle with the HSI. This chip, which equips the GeForce PCX (GeForce FX PCI Express), is also able to bridge from AGP to PCI Express. These bidirectional chips will allow NVIDIA to release the GeForce 6600 in AGP format at the end of October.
SLI
 Unlike the PCI Express versions, the AGP version won’t have an SLI port. With this technology it is possible to connect two PCI Express NVIDIA graphic cards to combine their performance. Of course, to use this capability you need a motherboard with two PCI Express x16 ports. This type of configuration will become more common with the release of the upcoming nForce 4. This is very interesting for the mid-range line of products, because you will be able to buy a GeForce 6600 GT now and another one later to use the two graphic cards side by side. Of course, the cost of a dual PCI Express x16 motherboard also needs to be included.
Detonator 65.76
For this GeForce 6600 GT test we used the Detonator 65.76, on which the next official drivers should be based. Compared to the 61.77, performance gains varied depending on the game. Re-running our tests with the GeForce 6800 GT, we noticed no improvement with Fifa 2004, Warcraft III or Colin Mc Rae Rally 04. Performance gains were 2.7% and 5.6% with UT2004 and Far Cry in 1600 x 1200 AA 4X/Aniso, 3% with Splinter Cell and 6 to 8% with Doom 3. The most impressive gain was with IL-2: 20% in Excellent 1600 x 1200 4X/8X, 13% in Maximal 1600 x 1200, 34% in 1024 4X/8X and 28% in 1600 4X/8X! We will update our top-of-the-line graphic card test as soon as NVIDIA releases official versions and ATI the final Catalyst 4.9 version. This shouldn’t be long.
 Performance gains observed in games are generally not due to better use of a chip’s units. In our last test we mentioned filtering optimizations, or “compromises”, integrated into ATI’s and NVIDIA’s drivers. These optimizations come in two forms. The first is a new option called anisotropic sample optimization, accessible via the control panel. It “optimizes” the sample position for anisotropic filtering on all but level 1 textures. However, it is not the only optimization, as you can see in this test with a fixed image of UT2004 in 1600*1200 Aniso 8x, which is less dependent on the CPU than our test scene:
6600GT, 61.77, Quality, Optis off, 62.1 fps
6600GT, 61.77, Quality, Optis on, 73.9 fps
6600GT, 65.76, Quality, Optis off, 65.2 fps
6600GT, 65.76, Quality, Optis on, 72.3 fps
6800GT, 61.77, Quality, Optis off, 79.5 fps
6800GT, 61.77, Quality, Optis on, 91.9 fps
6800GT, 65.76, Quality, Optis off, 85 fps
6800GT, 65.76, Quality, Optis on, 95 fps
Unlike ATI, NVIDIA allows these compromises to be deactivated completely, except for the angle dependence of anisotropic filtering. To do so, activate the driver’s High Quality mode.
6600GT, 61.77, High Quality, 59.1 fps
6600GT, 65.76, High Quality, 59.1 fps
6800GT, 61.77, High Quality, 74.6 fps
6800GT, 65.76, High Quality, 74.6 fps
The identical results obtained in High Quality mode show that the gain probably comes from a filtering compromise. There are two types of optimizations: the first is documented, the second is not. The latter improves 65.76 performance over the 61.77 by 6.9% with the 6800 GT and 5% with the 6600 GT in Quality mode with all optimizations deactivated in the driver control panel.
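The 6.9% and 5% figures can be reproduced from the frame rates listed above. A short sketch using only the article’s own numbers (the helper name is illustrative):

```python
def gain_pct(old_fps: float, new_fps: float) -> float:
    """Percentage gain of new_fps over old_fps."""
    return (new_fps / old_fps - 1) * 100


# Quality mode, optimizations off: 61.77 vs 65.76 drivers.
gain_6600gt = gain_pct(62.1, 65.2)
gain_6800gt = gain_pct(79.5, 85.0)

# High Quality mode: identical results with both drivers.
gain_6600gt_hq = gain_pct(59.1, 59.1)
gain_6800gt_hq = gain_pct(74.6, 74.6)

print(f"6600GT: {gain_6600gt:.1f}%  6800GT: {gain_6800gt:.1f}%")
print(f"High Quality: {gain_6600gt_hq:.1f}% and {gain_6800gt_hq:.1f}%")
```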
A closer study of screenshots allowed us to conclude that it’s probably due to a slight modification of trilinear filtering that is not activated when using a tool that colors the mipmaps. In practice the user won’t notice a difference, but it would be helpful if NVIDIA told us more about this optimization and gave us the possibility of deactivating it.
With more and more “compromises” comes the need for an option to deactivate them. Luckily, with NVIDIA it is already possible to deactivate these optimizations via the drivers, although a little more information on them would be nice. ATI first needs to give us the option to deactivate them, and then the information.
While the 61.x Anisotropic Optimization setting had no effect in OpenGL (it forces bilinear filtering on textures other than level 1 in D3D), with the 65.x the equivalent "Anisotropic mip filter optimization" setting forces bilinear filtering in OpenGL for all textures. Needless to say, it shouldn’t be activated if you want good graphic quality...
Page 4
Anti aliasing and anisotropic filtering
We verified that the algorithms used by NVIDIA for the GeForce 6600 GT and the GeForce 6800 are identical for both anti-aliasing and anisotropic filtering. We couldn’t find any fault with these two functions, and anti aliasing is clearly better than with the GeForce FXs because of the Rotated Grid in AA 4x.
To get an idea of the difference in practice, here are a couple of images from UT 2004, Colin McRae 04 and Far Cry in 1600*1200 with 4x anti aliasing and 8x anisotropic filtering forced via the driver (the original non compressed versions except for Far Cry are available here). Of course these are still images and the overall quality can only be evaluated in motion. The images are in the following order: GeForce 6800GT (65.76), GeForce 6600GT (65.76) and Radeon 9800 Pro (4.8):
  
  
   The following table shows the impact of 4X anti-aliasing and anisotropic filtering on performance with Colin Mc Rae. Anisotropic filtering is forced via the driver and optimizations were left activated. Results compared to those without AA / Aniso are expressed in percentage:
 Logically, the GeForce 6600 GT registers the biggest drop in performance due to 4X anti aliasing activation, however not as much as we thought. The most surprising result was regarding anisotropic filtering. The GeForce 6600 GT was the card with the smallest drop in performance in this mode!
It’s difficult to understand this result as the 6600 and 6800’s algorithms are similar. This might be due to better cache and the presence of 4 ROPs for 8 Pipelines. Anisotropic filtering requires more work from the 8 pipelines than a simple “optimised” trilinear filtering.
In conclusion, activating 4X anti aliasing and the 8X anisotropic filtering requires more effort from the GeForce 6600GT, but thanks to excellent performance with anisotropic filtering the difference is reduced. This first result is a great start and foretells excellent performance in game tests.
Page 5
The test
We used the same test protocol as in the previous article. (If you would like more information on this procedure, take a look at the NVIDIA GeForce 6800 vs ATI Radeon X800, Round 2 article.) NVIDIA’s drivers changed from 61.77 to 65.76, as the 61.77 is not fully compatible with the GeForce 6600GT.
With the release of PCI Express cards, we had to use two different platforms:
- ASUSTeK P4C800-E Deluxe (i875P – AGP) / ASUSTeK P5GD1 (i915P – PCI Express)
- Intel Pentium 4 3.4E GHz
- 2x512 MB DDR PC3200 at 2-3-3-8
- Windows XP SP1 French
- DirectX 9.0c
 We included the following graphic cards in our test:
- Radeon X600 XT
- Radeon 9800 Pro
- Radeon X800 Pro
- GeForce FX 5700U
- GeForce PCX 5900
- GeForce FX 5900 XT
- GeForce 6600 GT
- GeForce 6800
- GeForce 6800 GT
These cards have 128 MB of memory except for the X800 Pro and 6800 GT, which have 256 MB. It should be mentioned that the GeForce 6600GT uses relatively expensive 500 MHz memory. If a 256 MB version is released it should also be quite expensive. Furthermore, while the memory upgrade was justified for the previous top-of-the-line graphic cards, it is not necessarily so for the current mid-range products. Graphic card performance starts to be restricted by the amount of memory in 1280 or 1600 with 4X anti aliasing, and most of the time mid-range graphic cards provide poor results at these settings anyway. Even without a memory restriction, the resolution would have to be 1024*768 or 1280*1024 in 4X, or 1600*1200 in 2X.
 The Radeon X600 XT is a version of the Radeon 9600XT in PCI Express format. The VPU frequency remains at 500 MHz, but the memory frequency was raised from 300 to 370 MHz. The GeForce PCX 5900 is the PCI Express version of the 5900, using the HSI. Its frequencies are 350/275 instead of 390/350 for the AGP 5900XT.
The other cards are AGP versions. However, X800/6800/6800GT frequencies are similar for the AGP and PCI Express versions. Results will be almost identical to those of the PCI Express cards, because the interface doesn’t bring any performance gain and the platforms are equivalent.
 For this test we lowered the GeForce 6600 GT’s frequency to 300/300 instead of 500/500 to simulate a standard GeForce 6600 performance. NVIDIA shares information only on the 300 MHz GPU frequency insisting on the fact that the integrators are free to choose the type of memory.
However, according to information we just received, it seems that 275 MHz is the minimum standard frequency required for a 6600. Such a graphic card will have inferior performances to the 300/300 6600 used for this article, but most manufacturers almost certainly won’t restrict their memory to this frequency.
Overclocking fans will be glad to know that our reference card reached 570/580 instead of 500/500. Of course we will have to verify if retail graphic cards will be able to reach the same performance level.
Page 6
UT2004, Far Cry
Unreal Tournament 2004

We start this series of tests with results from Unreal Tournament 2004. The graph’s colors are indicative of an NVIDIA or ATI graphic card and not results.
The GeForce 6600GT made a great start with a performance level almost equivalent to our card of reference, the GeForce 6800. It’s slightly better than a Radeon 9800 Pro and clearly above the GeForce FX 5900XT. A comparison with the previous “top of the mid-range” card, the GeForce FX 5700U, shows the improvement made one generation later. The standard GeForce 6600 has performances equivalent to this chip.
There were two noteworthy points during this test. First, PCI Express has trouble when video memory is exceeded and the game starts using AGP texturing: the performance of the PCX5900 and X600 XT (128 MB each) drops abnormally in 1600*1200 AA 4X/Aniso 8X compared to the other cards. As for the top-of-the-line X800 Pro / 6800 GT, it’s interesting to note that they make it possible to play in 1600*1200 AA 4X/Aniso 8X with performance equivalent to the 6600GT in 1024*768 under the same settings. Does this improvement in resolution deserve an additional investment? It’s your choice.
Far Cry
For Far Cry we used the SM2.0B or SM3.0 path when possible. This path uses longer shaders to accelerate dynamic lighting management, and also Geometry Instancing for rendering the grass outdoors.
 GeForce 6600GT performance is excellent in Far Cry, even if it is not as good as the 6800’s (except in 1600 4X/8X; it is, however, possible to play at both resolutions). Criticizing the 6600 GT’s performance compared to the 6800’s really isn’t justified, though, as they target different parts of the market. Results are clearly above the GeForce FX 5900XT or 9800 Pro and way above the PCX 5900 and FX 5700U. The standard 6600 performs better than the 5900XT, and is close to the 9800 Pro.
Page 7
Doom 3, Splinter Cell : PT
Doom 3
Doom 3, a game that makes extensive use of volumetric shadows, uses the latest OpenGL graphics engine developed by id Software and John Carmack. This type of operation is easier for NVIDIA’s GPUs, which are able to process twice as much Z data per cycle as ATI’s VPUs without FSAA. Doom 3 was tested in Very High mode with 8X anisotropic filtering activated.
 The Radeon was not very comfortable with Doom 3. The 6600 GT reached a performance comparable to the X800 Pro in 1024 x 768 4X/8X. ATI’s Catalyst drivers only improve the X800 XT performance with Doom 3. We hope that the X800 Pro will benefit from the final 4.9 version.
As with UT2004, the 6800 GT performance in 1600 4x/8x was close to that of the 6600 GT in 1024 4x/8x. The standard GeForce 6600 provided good results superior to the GeForce FX 5900XT even at a resolution of 1600 x 1200.
Splinter Cell : Pandora Tomorrow
 The performance of the GeForce 6600 GT was once more close to the standard GeForce 6800 with Splinter Cell. The Radeon 9800 Pro had almost equivalent results in 1600 and in 1024 8x, and slightly better results in 1600 x 1200 8X, but with a framerate too low for comfortable play. The GeForce FX 5900 XT and the GeForce PCX 5900 were slightly behind. The standard GeForce 6600 performed on a par with the GeForce FX 5700 Ultra and the X600 XT.
Page 8
IL-2 FB AEP, Warcraft III
IL-2 Sturmovik Forgotten Battles Aces Expansion Pack
For IL-2, here are (as promised) two benchmark series with the new add-on, a new map and a new demonstration saved by Michel Vebert, whom we thank once more.
 With the landscape in “excellent” mode, the standard IL-2 setting for the highest graphic quality in the simplified options, the GeForce 6600 GT already has excellent results, better than the FX 5900 XT and the Radeon 9800 Pro. The GeForce 6600 is faster than the X600 XT and even faster than the FX 5900 XT, except in 1600 x 1200 without AA/Aniso.
  With advanced settings and landscape details in “perfect” mode, IL-2 activates, among other features, Shader 2 effects for the water. The result is very impressive but requires a lot of resources.
Even if a flight simulator requires a smaller framerate than a Quake-like game, all results were relatively low and you will probably want to reduce your resolution compared to the test resolution. Performance gains due to the Detonator 65 with NVIDIA are significant and the gap with supposedly equivalent graphic cards widens.
This time the 6600 GT isn’t far from the 6800, except in 1600*1200 4X/8X. Its performance was slightly inferior to the X800 Pro in 1600 x 1200 and superior in 1024*768 4X/8X. This is a good result!
Warcraft III
 Even if performance is restricted by the CPU with most strategy games like Warcraft III, performance gaps become visible when increasing the resolution. The GeForce 6600 GT is clearly above the GeForce FX 5900 XT. The standard version is slightly below the 5700 Ultra and the X600 XT. The 6600 GT’s performance was comparable to the 9800 Pro.
Page 9
Colin Mc Rae Rally 04, Fifa 2004
Colin Mc Rae Rally 04
 We now come to a car racing simulation, the 2004 version of Colin Mc Rae Rally. The GeForce FX was never very comfortable with this type of game and the situation hasn’t changed. The standard 6600 is faster than the GeForce FX cards, and the GT version provides good performance, superior to the Radeon 9800 Pro.
Fifa 2004
 Last is Fifa 2004. Once again, the 6600 GT’s performance is comparable to the Radeon 9800 Pro, with scores equivalent in 1600 4x/8x, superior in 1024 4X/8X and inferior in 1600. The GeForce 6600 has results comparable to the GeForce FX 5900 XT.
Page 10
Overall results
In order to have an overall view of performance, we normalized the results of each test against the best score, which was given 100%. We then added up all the scores to get a rating relative to the GeForce FX 5900 XT.
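One way to read this procedure is sketched below. The exact aggregation is not spelled out in the article, so summing the per-test percentages and rescaling so that the reference card equals 100 is an assumption, and the card names and frame rates in the sample data are purely hypothetical placeholders, not measured results.

```python
def overall_ratings(results: dict, reference: str) -> dict:
    """Normalize each test to its best score (100%), sum per card,
    then express every total relative to the reference card (= 100)."""
    cards = next(iter(results.values())).keys()
    totals = {card: 0.0 for card in cards}
    for fps_by_card in results.values():
        best = max(fps_by_card.values())
        for card, fps in fps_by_card.items():
            totals[card] += fps / best * 100
    ref_total = totals[reference]
    return {card: total / ref_total * 100 for card, total in totals.items()}


# Hypothetical frame rates, for illustration only.
sample = {
    "test A": {"card X": 100.0, "card Y": 50.0},
    "test B": {"card X": 60.0, "card Y": 30.0},
}

ratings = overall_ratings(sample, reference="card Y")
print(ratings)
```

Normalizing per test before summing keeps a single high-framerate game from dominating the overall rating.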
 Figures speak for themselves, and the GeForce 6600 GT performance was quite impressive. The standard version is faster than a GeForce FX 5900 XT and GeForce FX PCX 5900. For the price of a PCX 5900 or X600 XT (sometimes even cheaper) you can get a 6600 GT, which gives better results. The 6600 GT is even more efficient than the 9800 Pro, the graphic card with the current best price /performance ratio.
Page 11
Conclusion
When we first saw these chips we were quite surprised by their characteristics, especially the replacement of a wide 256-bit bus by high frequency memory. The presence of 4 ROPs instead of 8 led us to believe that performance wouldn’t be as good. The results proved the opposite.
Indeed, during tests the GeForce 6600 GT showed a great performance for a “mid level” graphic card. After tests it proved that putting the GeForce 6 name on this line was correct. This architecture was conceived to provide the best performance with the available bandwidth.
The 6600 GT’s numerous capabilities are similar to top-of-the-line graphic cards with the Shader Model 3.0 and the Floating Point Buffer. The included video engine should also start to provide great performance. Finally, the SLI is an appreciable bonus even if we still have to see how it affects performance and the price of motherboards.
The graphic card formerly known as NV43 should be a success on the PCI Express market. The AGP version should be available at the end of October (so most likely November), and should be a success too if released on schedule. The competition will be the 9800 Pro and, to a lesser extent, the GeForce FX 5900 XT.
In NVIDIA’s shadow ATI is currently planning its next move. Indeed, according to the latest rumours the Canadian manufacturer should release a Radeon X700, a serious competitor for the GeForce 6600 GT. The ball is in ATI’s court.
Copyright © 1997-2008 BeHardware. All rights reserved.