Report: AMD Radeon HD 6970 & 6950, single and in CrossFire X - BeHardware
Written by Damien Triolet
Published on December 15, 2010
After several weeks of delays, the Radeon HD 6900s are finally here and usher in Cayman, a development of the GPU used for the Radeon HD 5800s and designed to push the bar even higher in terms of performance. Will it give enough to challenge the GeForce GTX 500s?
Not really… AMD made the situation loud and clear when presenting the Radeon HD 6900s: the GeForce GTX 580 is out of reach.
This is bound to disappoint a good section of the public as many users were hoping to see AMD overtake NVIDIA with a higher performance GPU. AMD no doubt believed it was on course too but has had to revise its ambitions. The reason for this may well be that it has been more difficult than previously thought to get the most out of its new architecture, but also, of course, the release of the excellent GeForce GTX 500s must have upset the initial plans.
Thankfully, AMD has been very pragmatic in terms of its price positioning and has decided to attack NVIDIA again on price/performance ratio: the Radeon HD 6970 has been priced at €330 against €350 for the GeForce GTX 570. The Radeon HD 6950, at €270, has no direct competition… except for the Radeon HD 5870, that is.
Cayman
With Cayman, AMD took a risk by deciding to rework an aspect of the architecture of its processing units which hadn’t changed since the Radeon HD 2900 XT. Broadly speaking, the AMD processing units can be described as 5D vector units, or vec5, which means they can execute up to 5 instructions in parallel. However, with such an architecture, if the code to be processed doesn’t allow that many instructions to be parallelised, the units aren’t fully exploited, in contrast to the scalar NVIDIA architecture, which can maintain high utilisation in most situations. Both approaches are equally valid.
Note however that a processing unit isn’t the same as a “core”, which is a marketing notion used by NVIDIA to give a comparison with CPUs and adopted by AMD to count 5 cores per vec5 processing unit. Overall, you can look at things in two ways: you get more out of an AMD vec5 unit than an NVIDIA scalar unit or you get less out of an AMD core than an NVIDIA core.
Close up on a Cypress vec5 processing unit.
Close up on a Cayman vec4 processing unit.
With the GF104 used on the GeForce GTX 460 and its derivatives, NVIDIA moved towards vector-like functioning to increase yield and AMD has tried to do the same with Cayman, but in the other direction, dropping down from vec5 to vec4. The Cayman processing units are therefore less powerful than those on previous AMD GPUs. They are statistically more efficient, though they don’t give higher performance. This is an important nuance. As they are simpler however, they take up less space and draw less energy, which means you can have more of them, all other things being equal.
To look a little more closely, the previous Radeon GPUs were based on 4+1 type processing units, with one execution lane able to handle complex instructions. It’s this “+1” that AMD has decided to get rid of. This means that these complex instructions must be processed on the other lanes via a succession of simpler operations. These instructions will now take up 3 of the 4 execution lanes, which makes them much more demanding in terms of resources: only one simple instruction can be processed at the same time, compared to 4 before.
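As a rough illustration of the lane arithmetic above, the following Python sketch counts how many simple instructions can issue alongside one complex instruction in a single cycle. It is a toy model and not a description of AMD’s actual scheduler; the function and parameter names are hypothetical.

```python
# Toy model: simple instructions that can issue in the same cycle as one
# complex instruction. Lane counts follow the text; names are hypothetical.

def simple_slots_with_complex(simple_lanes: int,
                              has_dedicated_complex_lane: bool,
                              lanes_per_complex: int = 3) -> int:
    if has_dedicated_complex_lane:
        # vec5 (4+1): the complex instruction runs on the "+1" lane,
        # so every simple lane stays available.
        return simple_lanes
    # vec4: the complex instruction is spread over several ordinary lanes.
    return simple_lanes - lanes_per_complex


vec5_slots = simple_slots_with_complex(simple_lanes=4, has_dedicated_complex_lane=True)
vec4_slots = simple_slots_with_complex(simple_lanes=4, has_dedicated_complex_lane=False)

print(f"vec5 (4+1): {vec5_slots} simple instructions alongside a complex one")
print(f"vec4:       {vec4_slots} simple instruction alongside a complex one")
```

The drop from 4 to 1 only matters in cycles that actually contain a complex instruction, which is why the overall impact depends on the instruction mix of the shader.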
Without this slightly unusual “+1”, which was difficult to keep fed under certain circumstances, the compiler’s task is greatly simplified. This even means these vec4 units will show higher performance than the vec5s in some cases, but overall AMD now needs more vec4 processing units to maintain the same level of performance.
While Cypress, the Radeon HD 5800s' GPU, had 20 blocks of 16 vec5 processing units, Cayman has 24 blocks of 16 vec4 processing units. This means that instead of 320 vec5 units we now have 384 vec4s, which gives a lower total of “cores” (only 1536 for Cayman against 1600 for Cypress). We mustn’t however forget the texturing units, of which there are four per block. This means that Cayman gives 20% more power here at equal clocks. Note that AMD says that it has increased double precision calculations performance but in fact this is simply a twisted way of interpreting the fact that the “+1” didn’t handle double precision. A Cayman unit is identical to a Cypress unit here.
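The unit counts quoted above follow from simple multiplication. Here is a short Python sketch of that arithmetic, using only the figures given in the text; the helper name `count_units` is ours.

```python
# Unit counts for Cypress and Cayman, derived from the block layout in the text.
# count_units is a hypothetical helper, not an AMD tool.

def count_units(blocks: int, units_per_block: int, lanes_per_unit: int,
                texture_units_per_block: int = 4) -> dict:
    units = blocks * units_per_block
    return {
        "units": units,
        "cores": units * lanes_per_unit,  # marketing "cores" = lanes
        "texture_units": blocks * texture_units_per_block,
    }


cypress = count_units(blocks=20, units_per_block=16, lanes_per_unit=5)
cayman = count_units(blocks=24, units_per_block=16, lanes_per_unit=4)

texture_gain = cayman["texture_units"] / cypress["texture_units"] - 1

print("Cypress:", cypress)
print("Cayman: ", cayman)
print(f"Texturing gain at equal clocks: {texture_gain:.0%}")
```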
AMD didn’t stop there and has introduced some other small improvements to its architecture. The first concerns geometry processing, which is parallelised so that it’s no longer limited to one triangle per cycle. Note however, it’s parallelised but not distributed between blocks of processing units as is the case with the GeForces. Here’s a simplified way of looking at things:
Cypress: 1 complex geometry processing unit -> 1 triangle of 32 pixels per cycle,
Cayman: 2 complex geometry processing units -> 2 triangles of 16 pixels per cycle,
GF100/GF110: 16 simple geometry processing units -> 4 triangles of 8 pixels per cycle.
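To see why the split matters for small triangles, the simplified figures in the list can be plugged into a very rough model. The sketch below assumes that each triangle slot can cover at most its quoted number of pixels per cycle and that any unused width is wasted; this is our simplification, not a documented hardware behaviour, and the names are hypothetical.

```python
# (triangles per cycle, max pixels per triangle per cycle), from the list above.
RASTER_SETUPS = {
    "Cypress": (1, 32),
    "Cayman": (2, 16),
    "GF100/GF110": (4, 8),
}


def pixels_per_cycle(gpu: str, triangle_pixels: int) -> int:
    triangles, max_pixels = RASTER_SETUPS[gpu]
    # Assumption: a triangle smaller than the slot width wastes the remainder.
    return triangles * min(triangle_pixels, max_pixels)


for size in (32, 16, 8):
    row = ", ".join(f"{gpu}: {pixels_per_cycle(gpu, size)}" for gpu in RASTER_SETUPS)
    print(f"{size}-pixel triangles -> {row}")
```

Under this model all three designs peak at 32 pixels per cycle with large triangles, while with 8-pixel triangles only the GF100/GF110 layout stays at that peak.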
NVIDIA retains an advantage with small triangles and, above all, with more simple geometry processing units, the GPU doesn’t get stalled when there’s a lot of data generated by tessellation. To combat this problem, AMD enlarged the dedicated buffer in Barts, the GPU used for the Radeon HD 6800s, and Cayman goes further allowing it to transfer all this data into the video memory temporarily so as to avoid stalling the GPU. This feature isn’t however directly exposed and we don’t know if it kicks in automatically at certain loads or if AMD has recourse to it manually on a case by case basis.
AMD has also improved its ROPs to increase the speed of 16-bit integer and 32-bit floating point formats. Antialiasing efficiency has also been improved, as has writing to memory in “compute” mode. Here, AMD has taken a close look at what NVIDIA is offering and enabled simultaneous processing of several different kernels, whereas previously the GPU had to allocate successive processing periods to them. The same goes for communication with the CPU, which can now run in both directions at the same time thanks to two DMA engines, as on the GF100/110. The memory controllers have been revised to support high-speed GDDR5 more easily.
Cayman: 2.64 billion transistors
Finally, AMD has taken an important step in including, for the first time in a GPU, a unit to handle energy consumption monitoring. Using hundreds of sensors distributed across all the GPU blocks, Cayman can monitor its own energy consumption and limit clocks to remain within the maximum energy consumption that has been defined. AMD is talking about fine tuning down to the precision of one frame. NVIDIA has nothing to compare to this sort of precision and responsiveness on its GeForce GTX 500s. Here Cayman is giving the sort of thing you get on recent CPUs, though without the Turbo feature that would allow it to increase clocks when energy consumption is lower than the limit. The technology used on Cayman, known as PowerTune, should be rolled out across all forthcoming AMD GPUs and will really come into its own on the mobile market!
All these changes have increased the number of transistors and Cayman now has 2.64 billion as opposed to 2.15 billion for Cypress. This represents an increase of 23%, partly mirrored in the size of the die, up from 334 to 389mm². The increase in die size is just 16%, showing that the density of the transistors has gone up.
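The percentages above can be checked with a few lines of Python. The sketch uses only the transistor counts and die sizes quoted in the text; the density figures it prints are derived values, not numbers published by AMD.

```python
# Transistor count (billions) and die area (mm²) as quoted in the text.
cypress = {"transistors_bn": 2.15, "die_mm2": 334}
cayman = {"transistors_bn": 2.64, "die_mm2": 389}


def pct_increase(old: float, new: float) -> float:
    return (new / old - 1) * 100


def density_m_per_mm2(chip: dict) -> float:
    # Millions of transistors per square millimetre.
    return chip["transistors_bn"] * 1000 / chip["die_mm2"]


transistor_gain = pct_increase(cypress["transistors_bn"], cayman["transistors_bn"])
die_gain = pct_increase(cypress["die_mm2"], cayman["die_mm2"])

print(f"Transistors: +{transistor_gain:.0f}%")
print(f"Die size:    +{die_gain:.0f}%")
print(f"Density: {density_m_per_mm2(cypress):.2f} -> "
      f"{density_m_per_mm2(cayman):.2f} million transistors/mm²")
```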
Specifications, the cards
With a slightly higher GPU clock, the Radeon HD 6970 has almost exactly the same processing power as the Radeon HD 5870. However, it gives higher triangle throughput and has a higher filtering rate. The good news is that the Radeon HD 6950 is not as significantly cut down in comparison to the Radeon HD 6970 as the Radeon HD 5850 was in comparison to the Radeon HD 5870.
The stock Radeon HD 6900s
For this test, AMD supplied us with a Radeon HD 6970 as well as a stock Radeon HD 6950. The two cards are identical as you can see:
These two new Radeons share the same cooling system, the same PCB and the same power stage. There are some small differences here but they’re purely revision and manufacture details – the two PCBs seem to have been made in different factories. It is however possible that the Radeon HD 6950s sold in stores, especially once the first lots have been sold, will be slightly modified as they obviously don’t require such a sturdy power stage. Moreover, they only have two 6-pin power connectors against 8+6 for the Radeon HD 6970.
In addition to using very fast GDDR5, made by Hynix and certified at 6 Gbps or 1.5 GHz for commands and 3 GHz for data, AMD has gone for no less than 2 GB by default on the Radeon HD 6900s.
For the rest, we’re looking at a standard AMD design with a blower fan and a vapour chamber. Like the Radeon HD 5870 but in contrast to the Radeon HD 5850, an aluminium plate covers the back of the PCB.
In terms of connectivity, AMD has gone for the same set-up as it introduced with the Radeon HD 6800s, namely two DVI Dual-Link outs, an HDMI out and two mini-DisplayPort outs. Tri and quad CrossFire X is supported via two dedicated connectors. Besides these, AMD has integrated a small switch to launch a backup bios. The user can flash the first bios but the second is protected, which means you can fall back on it if there are any problems following an update.
The maximum authorised level of energy consumption is fixed at 250W for the Radeon HD 6970 and 200W for the Radeon HD 6950. If the GPU monitoring system records energy consumption over this limit, it gradually reduces the clock via the PowerTune technology. AMD says however that it has fixed the limits of these cards so that this protection doesn’t kick in during the most demanding games and under the most difficult circumstances, namely with a GPU that has high leakage that is in a very hot environment.
You can modify the energy consumption limit in the Catalyst Overdrive panel. AMD allows you to adjust it between –20% and +20%, or between 200 and 300W for a Radeon HD 6970. Reducing it allows you to limit energy consumption and noise levels in gaming in the most demanding situations, while increasing it means you’ll have a bigger margin for overclocking, without which increasing the clock would actually change nothing if the GPU were still to be limited in terms of the amount of power it could draw.
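The adjustable range is a straightforward percentage of the default limit. The short Python sketch below reproduces the 200 to 300W range quoted for the Radeon HD 6970 and applies the same ±20% to the Radeon HD 6950’s 200W limit; the latter range is derived by us, and the function name is hypothetical.

```python
# PowerTune limit range for a given default limit and adjustment percentage.
# powertune_range is an illustrative helper, not part of any AMD software.

def powertune_range(default_limit_w: float, adjust_pct: float = 20.0) -> tuple:
    delta = default_limit_w * adjust_pct / 100
    return (default_limit_w - delta, default_limit_w + delta)


DEFAULT_LIMITS_W = {"Radeon HD 6970": 250, "Radeon HD 6950": 200}

for card, limit in DEFAULT_LIMITS_W.items():
    low, high = powertune_range(limit)
    print(f"{card}: {low:.0f}W to {high:.0f}W (default {limit}W)")
```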
Energy consumption, noise
Energy consumption
We did of course use our new test protocol that allows us to measure the energy consumption of the graphics card alone. We took these readings at idle, in 3DMark06 and Furmark. Note that we use a version of Furmark that isn’t detected by the stress test energy consumption limitation mechanism put into place by NVIDIA in the GeForce GTX 580 drivers.
At idle, the new Radeons draw the same amount of power as the previous generation. In load the Radeon HD 6970 is similar to the Radeon HD 5870 2GB while the Radeon HD 6950 consumes 10 to 20W more than the Radeon HD 5850.
Note however that in both 3DMark06 and Furmark, the Radeon HD 6900s reduced their clocks (varying, for example, between 570 and 700 MHz instead of 880 MHz for the Radeon HD 6970). We don’t understand this, as the limits are supposed to be 250W and 200W, not the 210W and 160W that seem to apply in these applications. Are we to think that AMD has added an additional limitation for certain applications? That it has done so for those used by testers to measure energy consumption? Is it possible that PowerTune is misestimating the true power consumption? We haven’t had any answers from AMD yet…
Noise levels
We placed the cards in an Antec Sonata 3 casing and measured noise levels at idle and in load, with the sound level meter positioned 60cm from the casing.
Although the Radeon HD 6900s are quiet at idle, in load they’re noisier than the previous generation cards. Their fans are also set up to increase in speed in steps, which is a more annoying system. Here are the variations we noted in our tests:
HD 6950: 44.6 dB -> 47.1 dB
HD 6970: 47.6 dB -> 48.9 dB
HD 6950 CFX: 52.7 dB -> 54.1 dB
HD 6970 CFX: 53.7 dB -> 56.4 dB
Of course we retained the highest value across the spread. At the end of the day, then, NVIDIA has managed to turn the situation to its advantage, especially in SLI where two GeForce GTX 570s make a lot less noise than two Radeon HD 6870s.
Temperatures
Still in the same casing, we took temperature readings for the GPU with an internal sensor:
The Radeon HD 6900s heat up a little more than the Radeon HD 5800s, which leaves less of a margin when trying to calibrate the cooling system so as to reduce noise levels.
Here’s what the infrared thermography shows:
Radeon HD 6950 at idle
Radeon HD 6970 at idle
Radeon HD 6950 CrossFire X at idle
Radeon HD 6970 CrossFire X at idle
Radeon HD 6950 in load
Radeon HD 6970 in load
Radeon HD 6950 CrossFire X in load
Radeon HD 6970 CrossFire X in load
Theoretical tests: pixels
Texturing performance
We measured performance during access to textures of different formats in bilinear filtering. Here are the results for standard 32-bit (4x INT8), 64-bit “HDR” (4x FP16) and 128-bit (4x FP32). We have also added performance for 32-bit RGB9E5, a new HDR format introduced in DirectX 10 which allows HDR textures to be stored in 32 bits, give or take a few compromises.
As announced by NVIDIA, the GeForce GTX 500s can filter FP16/11/10 and RGB9E5 textures at full speed and give 2.35x and 2.58x better performance in comparison with the GeForce GTX 480 and 470.
The filtering power of the Radeon HD 6900s is so far superior however that even though they have to filter FP16 textures at half speed, they aren’t too far behind the GeForces.
Note that we had to raise the energy consumption limit of the Radeon HD 6900s to the max, as otherwise clocks were reduced in this test. These new Radeons seem, then, incapable of fully using all their texturing power by default!
Fillrate
We measured the fillrate without and then with blending, with different data formats:
In terms of fillrate, the Radeons have a big advantage over the GeForce GTX 400s/500s, above all with FP10s, a format processed at full speed while with the GeForces this format is processed at half-speed. Given the limitation of the GeForces in terms of datapaths between the SMs and ROPs, it’s a shame that NVIDIA hasn’t given its GPU the possibility of benefitting from FP10 and FP11 formats.
Like the GeForces, the Radeon HD 6900s can process FP32 single channel at full speed without blending, but retain this speed with blending.
Theoretical tests: geometry
Triangle throughput
Given the architectural differences between the various GPUs in terms of geometry processing, we obviously wanted to take a closer look at the subject. First of all we looked at triangle throughput in two different situations: when all triangles are drawn and when all the triangles are removed with back face culling (because they aren’t facing the camera):
AMD has doubled triangle throughput with its Radeon HD 6900s, which easily outdo the GeForces when the triangles are actually displayed – NVIDIA has limited the GeForces to give an advantage to the Quadros. When it comes to rejecting triangles from the rendering via culling, no Radeon comes close to the high-end GeForces however.
Next we carried out a similar test but using tessellation:
The advantage of the GeForces over the Radeons is there for all to see. AMD and NVIDIA have very different approaches and the throughput on the GeForces varies according to the number of processing units on the card.
The architecture of the Radeons means that they are rapidly overloaded with the quantity of data generated, which then drastically reduces their throughput in this case. Doubling the size of the GPU tessellation unit buffer on the Radeon HD 6800s means they give significantly higher performance than the Radeon HD 5800s and the parallelisation of geometry processing allows the Radeon HD 6900s to catch the GeForces up slightly, though without getting on a par with them.
Strangely the GeForce GTX 580 only posts a very slight gain on the GeForce GTX 480 and is even down on it in this test when triangles are ejected from the rendering via culling. This may be because of a different software configuration which favours certain cases (the most important probably) but can penalise others. This is something that we’ve observed with certain profiles with the Quadros. In any case, this isn’t something the GeForce GTX 570 is subject to. It behaves here like the GeForce GTX 480 with higher clocks. Note that we did of course retest the GeForce GTX 580 with the same driver as the GeForce GTX 570.
Displacement mapping
We tested tessellation with an AMD demo that is part of Microsoft’s DirectX SDK. This demo allows us to compare bump mapping, parallax occlusion mapping (the most advanced bump mapping technique used in gaming) and displacement mapping that uses tessellation.
Basic bump mapping.
Parallax occlusion mapping.
Displacement mapping with adaptive tessellation.
By creating true additional geometry, displacement mapping displays clearly superior quality. Here we activated the adaptive algorithm, which avoids generating useless geometry and too many small triangles. Such triangles don’t fill a quad, the minimum rendering area on all GPUs, and therefore waste a lot of resources.
We also measured the performance obtained with the different techniques:
It is interesting to note that tessellation doesn’t only improve rendering quality but also performance! Parallax occlusion mapping is in fact very resource heavy as it uses a complex algorithm that attempts to simulate geometry realistically. Unfortunately it generates a lot of aliasing and this is noticeable on the edges of objects or surfaces that use it.
Note however that in the present case the displacement mapping algorithm is aided by the fact that it is dealing with a flat surface. If it has to smooth geometry contours and apply displacement mapping at the same time the demands are of course much higher.
The GeForces do a lot better with the high tessellation load than the Radeons and strangely, the Radeon HD 6900s only give a moderate gain, similar to what you get with the Radeon HD 6800s.
The use of an adaptive algorithm which regulates the level of tessellation according to how much detail each area needs (depending on distance, screen resolution and so on) gives significant gains across the board and is more representative of what developers will put into place. The gap between the GeForces and the Radeons is then reduced and the Radeons even move in front in certain cases.
Is AMD sacrificing quality?
With the arrival of the Radeon HD 6800s, AMD decided to review the functioning of Catalyst A.I., the driver option which handles activation of certain generic optimisations as well as optimisations specific to each game.
Previously it was only possible to enable certain more aggressive optimisations in terms of filtering or to completely deactivate Catalyst A.I. and all the optimisations made by AMD. Henceforth you’ll be able to set texture filtering optimisations independently. A “quality” mode is activated by default, which however integrates slightly more aggressive optimisations than previously and introduces more flickering in some textures which then results in lower quality. When AMD introduced the Radeon HD 6900s, they told us that they didn’t intend to go back to this. However an intermediary compromise could well see the light of day.
Note that the problem isn’t really linked to these software optimisations, but, we imagine, to the fact that the texturing units are under-sampling, or in other words a hardware optimisation. The software optimisations introduced however amplify the shimmering phenomenon.
To quantify the gain given by these optimisations, which we judge to be too aggressive and which reduce quality in an area where NVIDIA already had a slight advantage, we measured performance across our entire test protocol with a Radeon HD 6970 and both texture filtering quality modes:
Depending on the game, the gain in performance varies between 0 and 6% with an average gain of between 1 and 1.4%. Note that the average gain corresponds to the impact of the optimisation on our performance index. It is therefore relatively modest and doesn’t affect our conclusions in terms of these products as a difference of 2% in performance isn’t going to alter whether we recommend one model over another.
Note also that during our tests we accept the maximum filtering setting available in the games themselves and don’t force 16x anisotropic filtering in the driver control panel. Had we done so, the impact of these optimisations on results would no doubt have been more significant.
As things stand in the here and now, there’s no reason to sacrifice default graphics quality in this way and we hope that AMD will correct this with future drivers. In any case, as you can see, we used the high quality mode in this test.
Antialiasing, the test
Antialiasing
With the Radeon HD 6900s, AMD has introduced a new antialiasing mode, which is linked to the hardware here and won’t therefore be available on other GPUs. This new mode is called EQAA and is a clone of NVIDIA’s CSAA which allows the GPU to balance the weight of various samples more precisely in certain cases, allowing you to approach the quality you’d get with the next level of antialiasing up.
AMD 4x = NVIDIA 4x
AMD 4xEQ = NVIDIA 8x
AMD 8x = NVIDIA 8xQ
AMD 8xEQ = NVIDIA 16xQ
Like NVIDIA, AMD has also introduced the ‘Enhance application settings’ option. With the Radeons, this option will add EQAA to the antialiasing enabled in the game.
The test
For this test, we used the same protocol as the one we used in the GeForce GTX 580 test, though we haven’t retained H.A.W.X. 2. After experiencing the game, disappointing both in terms of gameplay and graphics, we didn’t see the point in keeping it. Except for detailed terrains, rendering is poorer and less successful overall than in the first version. No reason, then, to encumber our protocol with this game and its partisan technical choices which at the end of the day tell us no more than theoretical tessellation tests.
The tests were carried out at 1920x1200, without FSAA, with 4x MSAA and 8x MSAA. We also carried out high-end bi-GPU tests at 1920 4x MSAA and 8x MSAA and at 2560 4x MSAA. We therefore separated the high-end and ultra high-end and drew up separate graphs, with the exception of Far Cry 2 and Crysis Warhead, which were tested at 1920 and 2560, without AA, with 4x MSAA and with 8x MSAA.
Note that we made sure to test true 8x MSAA on the GeForces, which isn’t always easy to identify. In the NVIDIA drivers, 8x antialiasing is in fact MSAA 4x with CSAA 8x, which doesn’t give the same quality as MSAA 8x. The latter is called 8xQ antialiasing, and this is therefore what we tested.
Now one year after its release, we’ve decided that we can no longer treat DirectX 11 separately, especially as all the cards tested here are compatible with this API. DirectX 11, 10 and 9 games are therefore mixed together and we opted for very high settings even in the most demanding games. The most recent updates were installed and all the cards were tested with the most recently available drivers.
We decided to stop showing decimals in game performance results so as to make the graph more readable. We nevertheless note these values and use them when calculating the index. If you’re observant you’ll notice that the size of the bars also reflects this.
The Radeons were tested with texture filtering at the high quality setting. All the Radeons were tested with the beta 18.104.22.168RC2 driver and the Catalyst 10.12 CAP1.
Test configuration
Intel Core i7 975 (HT and Turbo deactivated)
Asus Rampage III Extreme
6 GB DDR3 1333 Corsair
Windows 7 64-bit
Catalyst beta 22.214.171.124RC2 + CAP1 (Catalyst 10.12)
Need for Speed Shift
To test the most recent in the Need for Speed series, we pushed all options to a max and carried out a well-defined movement. Patch 1.1 was installed.
Note that AMD has implemented an optimisation that replaces certain 64-bit HDR render targets with 32-bit ones. NVIDIA has naturally seized on this, but the optimisation doesn’t spoil quality. In reality AMD is taking advantage of an architectural particularity that lets it process 32-bit HDR formats at full speed. The GeForces can in fact do so too, and NVIDIA supplied the press with a tool which enables the same substitution and gives comparable performance.
In this first test, the Radeon HD 6900s bring a significant gain on the Radeon HD 5800s.
The high-end multi-GPU systems are limited here by the CPU.
To test ArmA 2, we carry out a well-defined movement on a saved game. We used the high graphics setting in the game (visibility at 2400m) and pushed all the advanced options to very high.
ArmA 2 allows you to set the display interface differently to 3D rendering which is then aligned with the display via a filter. We used identical rendering for both.
The antialiasing settings offered for this game aren’t clear and are different between the AMD and NVIDIA cards. 3 modes are offered by AMD: low, normal and high. These correspond to MSAA 2x, 4x and 8x. With NVIDIA things get more complicated:
- low and normal = MSAA 2x
- high and very high = MSAA 4x
- 5 = MSAA 4x + CSAA 8x (called 8x in the NVIDIA drivers)
- 6 = MSAA 8x (called 8xQ in the NVIDIA drivers)
- 7 = MSAA 4x + CSAA 16x (called 16x in the NVIDIA drivers)
- 8 = MSAA 8x + CSAA 16x (called 16xQ in the NVIDIA drivers)
Patch 1.5 was installed.
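The correspondence between in-game settings and actual antialiasing modes listed above can be expressed as a small lookup table, which makes it easy to find the settings that give an equivalent mode on each vendor’s cards. This Python sketch simply encodes the list; the `AA_MODES` table and `settings_for` helper are our own illustrative names, not part of the game or the drivers.

```python
# In-game ArmA 2 antialiasing setting -> (MSAA samples, CSAA samples or None),
# transcribed from the list above.
AA_MODES = {
    "AMD": {"low": (2, None), "normal": (4, None), "high": (8, None)},
    "NVIDIA": {
        "low": (2, None), "normal": (2, None),
        "high": (4, None), "very high": (4, None),
        "5": (4, 8), "6": (8, None), "7": (4, 16), "8": (8, 16),
    },
}


def settings_for(vendor: str, msaa: int, csaa=None) -> list:
    """Return the in-game settings that give exactly this MSAA/CSAA mode."""
    return [name for name, mode in AA_MODES[vendor].items()
            if mode == (msaa, csaa)]


# Which settings give plain MSAA 8x on each vendor's cards?
for vendor in AA_MODES:
    print(vendor, "->", settings_for(vendor, msaa=8))
```

For plain MSAA 8x this gives "high" on the Radeons and "6" on the GeForces.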
Strangely, performance actually drops slightly across the Radeon range with the latest drivers. The Radeon HD 6900s don’t do so well here and finish behind the Radeon HD 5870 and 6870.
The GeForces have a small advantage, as they are slightly less CPU-limited.
To test Starcraft 2, we launched a replay and measured performance following one player’s view.
All graphics settings were pushed to a maximum. The game doesn’t support antialiasing which is therefore activated in the control panels of the AMD and NVIDIA drivers. Patch 1.0.3 was installed.
Note the better efficiency of the Radeons at 8xAA.
The high-end multi-GPU systems are limited by the CPU here.
The Mafia II engine passes physics handling over to the NVIDIA PhysX libraries and takes advantage of this to offer high physics settings which can be partially accelerated by the GeForces.
To measure performance we used the built-in benchmarks and all graphics options were pushed to a maximum, first without activating PhysX effects accelerated by the GPU:
The Radeon HD 6970 is around 10% up on the Radeon HD 5870 here.
Next, we set all PhysX options to high:
With PhysX effects pushed to a maximum, performance levels dive. Note that they are in part limited by the CPU, as not all additional PhysX effects are accelerated. Of course the Radeons remain a long way behind.
Crysis Warhead replaces Crysis and has the same resource-heavy graphics engine. We tested it in version 1.1 and 64-bit mode as this is the main innovation. Crytek has renamed the different graphics quality modes, probably so as not to dismay gamers who may be disappointed at not being able to activate the highest quality mode because of excessive demands on system resources. The high quality mode has been renamed as “Gamer” and the very high is called “Enthusiast”. This is the one we tested.
The Radeon HD 6900s do pretty well here in comparison to the GeForces and as quality goes up the Radeon HD 6970 gets closer to the GeForce GTX 580.
Far Cry 2
This Far Cry sequel isn’t really one in the usual sense, since it was Crytek that made the first episode. As the owner of the licence, Ubisoft handled development of this one, with Crytek working on Crysis. It is no easy thing to inherit the graphics revolution that accompanied Far Cry, but the Ubisoft teams have done pretty well, even if the graphics don’t go as far as those in Crysis. The game is also less resource heavy, which is no bad thing. It has DirectX 10.1 support to improve the performance levels of compatible cards. We installed patch 1.02 and used the “ultra high” quality graphics mode.
The GeForces do particularly well with Far Cry 2!
H.A.W.X. is a flying action game. It uses a graphics engine that supports DirectX 10.1 to optimise results. Among the graphics effects it supports, note the presence of ambient occlusion, which we pushed to a max along with all other options. We used the built-in benchmark and patch 1.2 was installed.
Here the Radeon HD 6900s bring a very slight gain.
As the first game with DirectX 11, or more precisely Direct3D 11, support, BattleForge had to be included. An update added in September 2009 gave support for Microsoft’s new API.
Compute Shaders 5.0 are used by the developers to accelerate SSAO processing (ambient occlusion). Compared to the standard implementation, via the Pixel Shaders, this technique allows more efficient use of the available processing power by saturating the texturing units less. BattleForge offers two SSAO levels: High and Very High. Only the second, called HDAO (High Definition AO), uses Compute Shaders 5.0.
We used the game’s built-in benchmark and installed the latest update available (1.2 build 298942).
With antialiasing, the Radeon HD 6970 is very slightly overtaken by the Radeon HD 5870.
The GeForces dominate things quite clearly here.
Pretty successful visually, Civilization V uses DirectX 11 to improve quality and optimise performance in the rendering of terrains thanks to tessellation, and to implement a special compression of textures thanks to the compute shaders. This compression allows it to keep the scenes for all the leaders in memory. This second usage of DirectX 11 doesn’t concern us here however, as we used the built-in benchmark on a game map. We zoom in slightly so as to reduce the CPU limitation, which has a strong impact in this game.
All settings were pushed to a max and we measured performance with shadows and reflections. Patch 1.2 was installed.
The Radeon HD 6900s bring a significant gain here but are still behind the GeForce GTX 500s.
Strangely the Radeon HD 5870 2 GB seems to have a hard time getting the most out of a multi-GPU set-up in this game with the most recent drivers.
S.T.A.L.K.E.R. Call of Pripyat
This new S.T.A.L.K.E.R. sequel is based on a further development of the graphics engine, which moves up to version 01.06.02 and supports Direct3D 11. This is used both to improve performance and quality, with the option to have more detailed lighting and shadows as well as tessellation support.
Maximum quality mode was used and we activated tessellation. The game doesn’t support 8x antialiasing. Our test scene is 50% outside and 50% inside and inside it is surrounded with several characters.
The Radeon HD 6900s show a slight gain with antialiasing in this game.
CrossFire X brings more of a gain than SLI here, giving a slight advantage to the Radeon HD 6900s over the GeForce GTX 500s.
The latest Codemasters title, F1 2010, uses the same engine as DiRT 2 and supports DirectX 11 via patch 1.1, which we installed of course. As this patch was developed in collaboration with AMD, NVIDIA told us that it had only received it late in the day and hasn’t yet had the opportunity to optimise its drivers for the DirectX 11 version of the game.
We pushed all the graphics options to a max and we used the game’s own test tool on the Spa-Francorchamps circuit with a single car.
In F1 2010, the Radeons are particularly at ease and the GeForce GTX 580 is slightly behind the Radeon HD 6970.
In multi-GPU systems, the Radeons are a good deal ahead.
Probably the most demanding title right now, Metro 2033 brings all recent graphics cards to their knees. It supports GPU PhysX but only for the generation of particles during impacts, a rather discreet effect that we therefore didn’t activate during the tests. In DirectX 11 mode, performance is identical to DirectX 10 mode but with 2 additional options: tessellation for characters and a very advanced, very demanding depth of field feature.
We tested it in DirectX 11 mode, at a very high quality level and with tessellation activated, both with and without 4x MSAA. Next we measured performance with MSAA 4x and Depth of Field.
The Radeon HD 6900s are significantly up on the Radeon HD 5800s here, putting the Radeon HD 6970 on a par with the GeForce GTX 570.
The Radeons suffer a little less than the GeForces at 2560x1600.
Performance recap
Although individual game results are worth looking at, especially as high-end solutions are susceptible to being levelled by CPU limitation in some games, we have calculated a performance index based on all tests with the same weight for each game. Mafia II is included with the scores obtained without GPU PhysX effects.
We attributed an index of 100 to the Radeon HD 5870 2 GB at 1920x1200 with 4xAA:
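As a rough illustration of how an equally weighted index of this kind can be built, here is a minimal Python sketch. It assumes that each game's result is first normalised against the reference card and that the ratios are then averaged arithmetically; the article does not say which type of mean was used. The function name, card names, game names and frame rates are all hypothetical placeholders, not measured results.

```python
from statistics import mean

def performance_index(fps_by_card, reference, scale=100.0):
    """Return an index per card, with the reference card at `scale`.

    Assumption: every game carries the same weight, so each game's
    frame rate is divided by the reference card's frame rate in that
    game and the resulting ratios are averaged arithmetically.
    """
    ref = fps_by_card[reference]
    index = {}
    for card, fps in fps_by_card.items():
        # One ratio per game, so a high-fps game cannot dominate the index.
        ratios = [fps[game] / ref[game] for game in ref]
        index[card] = scale * mean(ratios)
    return index

# Hypothetical placeholder data, NOT figures from this test.
sample = {
    "reference card": {"game A": 50.0, "game B": 40.0},
    "card X": {"game A": 55.0, "game B": 50.0},
}

index = performance_index(sample, "reference card")
for card, value in index.items():
    print(f"{card}: {value:.1f}")
```

Normalising per game before averaging is what gives every title the same weight: averaging raw frame rates instead would let the fastest-running games count for more.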
Overall the Radeon HD 6970 finishes slightly behind the GeForce GTX 570, except at 8xAA where they are on an equal footing. The Radeon HD 6950, for its part, is on a par with the Radeon HD 5870. Note that at 1920x1200, the 2 GB of memory is only of limited use in current games.
At the end of the day, Cayman only produces a gain of between 7 and 14% over Cypress. In comparison to Barts, the mid-range GPU, the difference varies between 17 and 26%, which is lower than usual between two GPUs. The Radeon HD 6950 therefore only has an advantage of between 7% and 14% over the Radeon HD 6870.
Note also, there’s a lower difference between the Radeon HD 6970 and 6950 than between the Radeon HD 5870 and 5850.
In a bi-GPU set-up, the Radeon HD 6970 and 6950 do much better in some of the games limited by CPU performance. Combined with higher gains in some other titles they’re able to position themselves better than in the single card classification. In CrossFire X, the Radeon HD 6970 gets pretty close to the GeForce GTX 580 SLI and the Radeon HD 6950 in CrossFire X closes the gap on the GeForce GTX 570 in SLI.
Note also that there are bigger gains with the multi-GPU set-up on the Radeon HD 6950s than on the Radeon HD 5870s though mono-GPU performance is similar. AMD says that it has improved CrossFireX support but hasn’t given any detail on how. We did however also note lower gains in some games with the Radeon HD 5800s and 6800s than in a previous test which used different drivers.
Conclusion
It’s difficult not to have a mixed opinion of the new GPU that equips the Radeon HD 6900s, so much does performance vary from one game to another. At best we recorded a gain of 35% over the Radeon HD 5870, and at worst a deficit of 6%.
Of course, there have been enough architectural changes to suppose that AMD has room for progress, especially by improving how the compiler works. We are nevertheless sceptical with respect to the new processing unit set-up, especially as it actually reduces the processing to texturing power ratio, in contrast to the trend we’ve seen from AMD for a few years now. We are even more sceptical as to the ability of the Radeon HD 6900s to fully benefit from all these texturing units as this takes the cards over the energy consumption limit fixed by AMD.
Paradoxically, this is one of the major advances made by AMD: a unit able to control a GPU’s energy consumption and prevent it from going over a given limit by reducing clocks, just as CPUs have been doing for some time now. Energy consumption can no longer explode and get out of control. Although we still need to look at the impact of this technology more closely, in the long run it can only be beneficial by opening the way for a Turbo mode for GPUs and providing graphics solutions for the mobile market with tailor made thermal envelopes.
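To make the principle concrete, here is a deliberately simplified Python sketch of clock capping under a power limit. It is not AMD's algorithm: it assumes that power scales linearly with clock at a fixed voltage, and the function name, parameters and figures are hypothetical.

```python
def capped_clock(base_clock_mhz, power_at_base_w, limit_w, min_clock_mhz=0.0):
    """Return the clock to run at so that estimated power stays within the limit.

    Assumption: at a fixed voltage, power is proportional to clock,
    so scaling the clock by limit/power brings power back to the limit.
    """
    if power_at_base_w <= limit_w:
        # Under the limit: no reduction is needed.
        return base_clock_mhz
    scaled = base_clock_mhz * limit_w / power_at_base_w
    # Never drop below the minimum clock, even if the limit is still exceeded.
    return max(scaled, min_clock_mhz)

# Hypothetical figures, chosen only to illustrate the behaviour.
light_load = capped_clock(800.0, 200.0, 250.0)  # estimated power under the limit
heavy_load = capped_clock(800.0, 250.0, 200.0)  # estimated power over the limit
print(light_load, heavy_load)
```

With this model a light load keeps its full clock, while a load whose estimated power exceeds the limit is slowed down just enough to fall back to it.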
When it comes down to making your decision, once again AMD has decided to let pricing do the talking. At €330, the Radeon HD 6970 is thus slightly cheaper than the GeForce GTX 570 and gives a similar level of performance. Both solutions have their advantages: Eyefinity and 2 GB on the Radeon side, against PhysX and a more developed stereo 3D ecosystem for the GeForce. As far as we’re concerned, although there’s no doubt the Radeon HD 6970 is a solid alternative, we retain a small preference for the GeForce GTX 570 with its lower noise levels.
With identical performance across the board, 2 GB of memory ensuring longevity and aggressive pricing (€270), the Radeon HD 6950 is replacing the Radeon HD 5870 with something better. Although there’s a total absence of direct competition from NVIDIA (before the release of the GeForce GTX 560 Ti), we can only think that a 1 GB version, which would have been that little bit cheaper, would have been an even better buy, taking the Radeon HD 6870 out of the picture almost entirely.
Moreover, these new Radeons do very well in CrossFire X and with a default of 2 GB of memory, they’re probably very good candidates for a multi-GPU upgrade, when the right time arrives. As things stand, such a system will allow you to set yourself up with surround gaming, which is very demanding in terms of memory resources. Note however, that such a high-end multi-GPU solution will suffer from very high noise levels.
Lastly, we maintain our red card for AMD for the reduction in default graphics quality (with too aggressive optimisations of texture filtering) on the arrival of the Radeon HD 6800s and across the whole range. In 2010, this is no longer acceptable!
Copyright © 1997-2014 BeHardware. All rights reserved.