Crysis 2 DX11, a closer look at performance and tessellation - BeHardware
Written by Damien Triolet
Published on July 7, 2011
Crytek and Electronic Arts recently published a significant update to Crysis 2 which includes patch 1.9, a DirectX 11 pack and a pack of high resolution textures. As a complement to our previous report on the performance of forty-five graphics cards in this game, we of course wanted to look at performance in the new “Ultra” quality mode. We’ve also taken the opportunity to look in detail at what has been done with tessellation and to present the new graphics effects that have been introduced, using some clear visuals.
A DX11 pack developed in partnership with NVIDIA
As Crytek mainly focused on the multi-platform side of things for Crysis 2, and therefore on console support, the PC version was somewhat sidelined in comparison to what we’re used to seeing from Crytek with Far Cry and the first Crysis. It has to be said that consoles are now something like five years behind the technological developments of modern graphics cards. Of course they can be used more efficiently without the burden of PC APIs and make do with lower resolutions (720p), but this doesn’t make up for the gulf that has appeared between consoles and graphics cards, especially as they don’t support the latest features such as programmable tessellation (its support is limited on the Xbox 360).
This isn't good for business for AMD and NVIDIA, who struggle at times to convince customers of the utility of updating their graphics cards or going for more powerful models when games don’t require them and haven’t been designed for them. NVIDIA has been relatively close to Crytek for several years now and is well aware of the influence a major title can have on card sales. The company therefore decided to collaborate with the developer so as to help or persuade them to put a DirectX 11 update out there.
Working with Crytek in this way helps NVIDIA to justify the power of its latest GPUs and to influence the way tessellation is used – tessellation is a rendering technique that the GeForces process more efficiently than the AMD Radeons. For Crytek, this DirectX 11 update is part of an attempt to stimulate the very disappointing PC sales for Crysis 2. By the developer’s own admission, neglecting the PC version somewhat was a strategic mistake...
While it may be supposed that some sort of financial agreement has been negotiated between Crytek, EA and NVIDIA, none of the parties is expected to make any comment on this. Whether a big cheque, sports bags full of greenbacks, software engineering support, marketing support or a mix of all that, it seems clear that NVIDIA has put its hand in its pocket in some way or another, probably with very precise requirements in terms of the use of tessellation, which apparently didn’t feature particularly prominently in Crytek’s initial plans. Crytek says moreover that this is what took longest in terms of implementation.
New DirectX 9 features
While patch 1.9 corrects numerous small bugs, we have confined ourselves to commenting on the graphics here. This aspect has moved forward with the appearance of an “Ultra” quality mode that comes in above the high, very high and extreme modes (Crytek has a history of trying to attract gamers equipped with modest systems by introducing an optimistic naming system).
Patch 1.9 introduces a new tone-mapping algorithm. To recap, this algorithm reduces the HDR image to a standard format that can be displayed on all screens, by trying to retain only the most useful data. With the new version of the algorithm, Crytek is trying to get closer to a film type feel, with more details retained in the darker tones and a slightly more contrasted image. This change is really quite beneficial in practice.
[ Old tone-mapping ] [ New tone-mapping ]
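To make the idea concrete, here is a minimal Python sketch of two tone-mapping operators. Crytek doesn't publish the curve used in patch 1.9, so neither function should be read as its implementation: `reinhard` is the classic textbook operator, and `filmic` uses the constants John Hable published for his film-like curve, picked here only to illustrate what a "film type" response can look like. The function names are our own, and the default white point of 11.2 comes from the same published curve.

```python
def reinhard(x: float) -> float:
    """Classic Reinhard operator: compresses [0, inf) into [0, 1)."""
    return x / (1.0 + x)


def filmic(x: float, white: float = 11.2) -> float:
    """Film-like curve (assumption: Hable's published constants, not Crytek's)."""
    a, b, c, d, e, f = 0.15, 0.50, 0.10, 0.20, 0.02, 0.30

    def curve(v: float) -> float:
        return (v * (a * v + c * b) + d * e) / (v * (a * v + b) + d * f) - e / f

    # Normalise so that the chosen white point maps to 1.0 on screen.
    return curve(x) / curve(white)


if __name__ == "__main__":
    for luminance in (0.05, 0.5, 2.0, 8.0):
        print(luminance, round(reinhard(luminance), 3), round(filmic(luminance), 3))
```

Both functions take a scene luminance value and return a displayable value; applying them per colour channel to an HDR pixel is the usual next step.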
The Ultra setting also introduces two new features. We are still in DirectX 9 mode here. The first is called ‘Contact Shadows’ and is a development of SSAO type ambient occlusion. Known as Screen Space Directional Occlusion (SSDO), this technique is a little bit closer to reality and prevents artifacts in the contact area between objects and the shadows that are generated.
[ Standard rendering ] [ SSAO ] [ SSDO ]
The second new feature is called Realtime Local Reflections. It enables you to obtain, even in a deferred rendering engine such as that used in Crysis 2, an approximation of ray-tracing type reflections of close-up environments, including self-reflections.
[ Without RLR ] [ With RLR ]
Patch 1.9 also adds support for two new modules. The first is a pack of high resolution textures for which there wasn't enough room in the first version of the game, limited by the capacity of DVDs. Lots of textures therefore gain in definition. The main innovation however is of course the DirectX 11 pack support, which introduces numerous other graphics effects.
New DirectX 11 features
The DirectX 11 module adds a few small general optimizations and improvements in quality, but above all, additional effects with the Ultra Upgrade, enabling finer rendering in this setting.
First of all Motion Blur HDR is now in full resolution, which means you get a sharper image during rapid movements and the removal of blur bleeding on objects that shouldn't be affected by it, such as the player's weapon.
[ Motion blur low resolution ] [ Motion blur full resolution ]
Particles can now receive shadows and motion blur, on a case by case basis decided by artists, which makes for increased accuracy.
Shadows are applied on smoke particles.
In the Ultra DirectX 11 version, shadows change with variable penumbra, a technique which is also called Contact Hardening Shadows in the AMD version and integrated for example in S.T.A.L.K.E.R. Call of Pripyat and DiRT 3. This gives closer to reality visuals with shadows that soften more and more as they get further away from the object that generates them, giving a more accurate result than fixed filtering across the whole shadow.
[ Standard shadows ] [ Shadows with variable penumbra ]
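The way the penumbra widens with distance can be sketched with the similar-triangles estimate commonly used in percentage-closer soft shadows. This only illustrates the general idea; we don't know the exact formula used in CryENGINE 3, and the function and parameter names are hypothetical.

```python
def penumbra_width(receiver_depth: float, blocker_depth: float, light_size: float) -> float:
    """Estimate the penumbra width; depths are measured from the light source."""
    if blocker_depth <= 0.0:
        raise ValueError("blocker_depth must be positive")
    # A receiver touching its blocker gets a hard edge; the larger the gap,
    # the wider the blurred region becomes.
    gap = max(receiver_depth - blocker_depth, 0.0)
    return gap * light_size / blocker_depth


if __name__ == "__main__":
    for receiver in (10.0, 11.0, 15.0):
        print(receiver, penumbra_width(receiver, blocker_depth=10.0, light_size=2.0))
```

The resulting width would then drive the size of the shadow filter for each pixel, instead of one fixed filter for the whole shadow.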
A new type of Depth of Field is introduced with the DirectX 11 Ultra Upgrade. Known as “Sprite-based Bokeh Depth of Field”, it’s similar to the implementation used in 3DMark 11. Instead of a simple post-processing filter, the technique consists of generating a sprite for each pixel using geometry shaders. The size of this sprite depends on the circle of confusion radius and can thus represent objects that are either in focus or not. The technique also allows artists to use a mask to imitate the shape produced by different types of lens diaphragm (Bokeh). As this effect is very demanding in terms of processing resources, it is implemented at half resolution with optimisations to prevent it bleeding onto objects in focus. Note that it isn’t really used outside of cinematic sequences.
[ Standard DoF ] [ DX11U DoF circular Bokeh shape ] [ DX11U DoF pentagonal Bokeh shape ]
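For readers unfamiliar with the circle of confusion, the sketch below computes it with the standard thin-lens formula. It is a generic optics calculation, not Crytek's shader code; the function name and the example lens values are assumptions.

```python
def circle_of_confusion(focal_length: float, f_number: float,
                        focus_distance: float, object_distance: float) -> float:
    """Blur circle diameter on the sensor (thin-lens model, all lengths in one unit)."""
    if focus_distance <= focal_length:
        raise ValueError("focus_distance must exceed focal_length")
    if object_distance <= 0.0:
        raise ValueError("object_distance must be positive")
    aperture = focal_length / f_number  # aperture diameter
    return (aperture * abs(object_distance - focus_distance) / object_distance
            * focal_length / (focus_distance - focal_length))


if __name__ == "__main__":
    # Hypothetical 50 mm lens at f/2 focused at 2 m, distances in millimetres.
    for distance in (1000.0, 2000.0, 4000.0):
        print(distance, round(circle_of_confusion(50.0, 2.0, 2000.0, distance), 4))
```

An object at the focus distance gives a diameter of zero, so its sprite stays a single sharp point, while objects nearer or further away produce progressively larger sprites.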
After improving all these effects, Crytek set about adding details to objects and environments. This is where the use of tessellation comes in, one of the new features of DirectX 11. Crytek has included across the board tessellation support in its CryENGINE 3 engine, but in practice how it is used depends on artists as both textures and models have to be revisited. Moreover Crytek says that current GPUs are not yet sufficiently powerful to apply tessellation on the whole level and that it must therefore be used sparingly. In addition, Crytek says that some GPUs (the Radeons) present performance issues when the level of tessellation is too high, which is why pre-tessellated geometries have been introduced in instances where artists consider them to be the best compromise for performance. Tessellation is used on some walls, broken-up floors, stones and other objects:
[ Without tessellation ] [ With tessellation ]
[ Without tessellation ] [ With tessellation ]
Given that, for performance reasons, tessellation cannot be over-used, Crytek has reintroduced Parallax Occlusion Mapping in DirectX 11 Ultra mode. To recap, this effect enables pretty realistic simulation of reliefs, though not on silhouettes. It is mainly used for the ground, to very good effect, except when other objects or characters appear on top. These objects/characters then seem to be floating as it isn’t possible to sink them into the relief created for the ground.
[ Without POM ] [ With POM ]
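POM can't alter silhouettes because it only shifts texture coordinates while the geometry stays flat. A minimal one-dimensional sketch of the usual layered ray march follows. It is a generic textbook formulation rather than Crytek's shader, and `depth_at`, `scale` and `layers` are names we've made up for the example.

```python
from typing import Callable


def parallax_occlusion_uv(depth_at: Callable[[float], float], u: float,
                          view_x: float, view_z: float,
                          scale: float = 0.1, layers: int = 32) -> float:
    """Return the texture coordinate where the view ray first meets the relief.

    depth_at(u) gives the relief depth in [0, 1] (0 = top of the surface).
    (view_x, view_z) is the tangent-space direction towards the eye, view_z > 0.
    """
    # Texture-space shift per layer; grazing angles (small view_z) shift further.
    step = (view_x / view_z) * scale / layers
    for i in range(layers):
        ray_depth = i / layers
        if ray_depth >= depth_at(u):
            break  # the ray has gone below the relief: stop here
        u -= step
    return u


if __name__ == "__main__":
    groove = lambda coord: 0.45  # hypothetical relief with constant depth
    print(parallax_occlusion_uv(groove, 0.5, view_x=1.0, view_z=1.0, layers=10))
```

Only the sampled texture coordinate changes, which is also why an object placed on such a surface can't sink into the simulated relief.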
Lastly, tessellation adds more details in water scenes. Here Crytek says that tessellation adapts to the distance of the camera from the scene to prevent the generation of too much aliasing in the background. Thanks to the more complex geometry used, the movement of the sea is more impressive and more detailed. The smallest areas of water that appear in the game also benefit from tessellation, giving in particular a more realistic rendering when these areas of water interact with an object or character. How the water itself is rendered has also changed, with improved foam and sub-surface scattering support to give more realistic lighting.
[ Standard sea ] [ DX11U sea with tessellation ]
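Crytek doesn't detail how the tessellation level is tied to distance. As a purely hypothetical sketch, a linear falloff between a near and a far distance could look like this; the distances and the function name are assumptions, while the range of 1 to 64 is the one allowed by Direct3D 11.

```python
def adaptive_tess_factor(distance: float, near: float = 5.0, far: float = 100.0,
                         max_factor: float = 64.0) -> float:
    """Hypothetical distance-based tessellation factor (linear falloff)."""
    if far <= near:
        raise ValueError("far must be greater than near")
    # 0.0 at or before the near distance, 1.0 at or beyond the far distance.
    t = min(max((distance - near) / (far - near), 0.0), 1.0)
    # Full detail up close, factor 1 (no subdivision) in the distance.
    return max_factor + t * (1.0 - max_factor)


if __name__ == "__main__":
    for d in (0.0, 52.5, 100.0, 500.0):
        print(d, adaptive_tess_factor(d))
```

With a scheme of this kind, distant water would receive few extra triangles, limiting both the geometry load and the aliasing that very small triangles cause.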
DirectX 11 performance
Before entering into more detail on the impact of these new graphics effects on performance, we measured the performance of different graphics cards at the DirectX 11 Extreme and Ultra settings, so as to get an overall vision of the situation and calculate the impact of the Ultra Upgrade on performance.
Note that we didn't use any special DirectX 11 level, nor did we attempt to measure performance in a specific scene in terms of these effects. We enabled the high resolution textures and used the same scene as used in all our previous tests in Crysis 2. We installed the latest official drivers and tested on our usual platform:
Intel Core i7 980X (HT deactivated)
Asus Rampage III Extreme
6 GB Corsair DDR3 1333
Windows 7 64 bits
Catalyst 11.6 + CAP1
The Ultra setting is a good deal more resource hungry than Extreme, which is to be expected given all the graphics improvements introduced in this mode. While the GeForce GTX 570 manages over 30 fps, the only Radeon that exceeds 30 fps is the bi-GPU Radeon HD 6990.
While the equivalent GeForces and Radeons are pretty much on a par at the Extreme setting, the GeForces are less affected by the Ultra Upgrade and therefore have the advantage here. There is just a 33% drop in performance with the GeForce GTX 570 and 580, but this rises to 48% for the Radeon HD 6900s! It’s surprising to see that these higher end parts are more impacted than the Radeon HD 6800s, while the opposite is true for the GeForces.
Note that high resolution textures can be handled by a 1 GB card without any problem, as the results for the Radeon HD 6950 2 GB and 1 GB cards show.
While at the Extreme setting, the Radeon HD 6950 gives an identical level of performance to the GeForce GTX 560 Ti, the GeForce has an advantage of over 20% with the Ultra Upgrade. The difference between the GeForce GTX 570 and the Radeon HD 6970 increases to 33% at this setting but is only 11% between the GTX 460 and the HD 6850 or between the GTX 550 Ti and the HD 5770.
In more detail...
We have of course tried to uncover what lies behind these performance differences, supposing that they probably come from the use of tessellation, with which the GeForces are quite efficient, especially as NVIDIA had a hand in the development of the upgrade.
With patch 1.9, Crytek has at last introduced a menu of advanced graphics settings enabling the user to control whether various effects are activated or not:
Game effects: ??
Objects: Parallax Occlusion Mapping and Tessellation
Particles: Motion Blur and Shadows
Post-processing: full res Motion Blur and Sprite-based Bokeh DoF
Shading: Contact Shadows (SSDO) and Realtime Local Reflections
Shadows: Shadows with variable penumbra
Water: Tessellation and improved quality
Motion Blur Amount: amount of blur
While it isn’t possible to measure the impact on performance of each effect individually as some are grouped within a single setting, you can nevertheless obtain some interesting data by playing with the options. For this test we opted to use a GeForce GTX 560 Ti and a Radeon HD 6950, which give identical performance in Extreme mode, a setting to which we added each of the Ultra Upgrade options in turn:
In our test scene, three settings had a significant impact: post-processing (mainly motion blur in our case), shading (mainly SSDO) and objects (POM and tessellation). It’s this last setting which had most impact and which separates the GeForces from the Radeons most of all. Although the level of performance on the GTX 560 Ti drops by just 20%, it falls by 38% on the Radeon HD 6950!
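Because the two cards start from the same frame rate at the Extreme setting, the two drops translate directly into a relative gap. The short calculation below works this out; it is simple arithmetic on the percentages quoted above, not an additional measurement, and the function name is our own.

```python
def relative_lead(drop_a: float, drop_b: float) -> float:
    """Lead of card A over card B after both drop from the same baseline."""
    if not (0.0 <= drop_a < 1.0 and 0.0 <= drop_b < 1.0):
        raise ValueError("drops must be fractions in [0, 1)")
    return (1.0 - drop_a) / (1.0 - drop_b) - 1.0


if __name__ == "__main__":
    # 20% drop on the GeForce GTX 560 Ti, 38% on the Radeon HD 6950.
    lead = relative_lead(0.20, 0.38)
    print(f"GeForce lead with Ultra objects alone: {lead:.1%}")
```

In other words, with only the Ultra objects setting added, the GeForce keeps 80% of its frame rate against 62% for the Radeon, a gap of close to 30% in its favour.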
This difference may be linked to the Parallax Occlusion Mapping algorithm as well as tessellation though we definitely reckon that tessellation is what has most impact. To find out for sure, we fixed the maximum tessellation factor at zero in the AMD drivers. This doesn’t entirely suppress the impact of tessellation, but almost, as there will be no geometry amplification.
In view of the jump in performance obtained when this parameter is forced in the AMD drivers, it seems clear that the majority of the load on the Radeons in DirectX 11 Ultra does result from tessellation. Note also that this is the element that is at the source of the performance difference between the various Radeon cards with the DirectX 11 Ultra Upgrade as, without geometry amplification, all the Radeons show a similar performance drop.
A closer look at tessellation
After ascertaining that tessellation was indeed responsible for the significant drop in performance observed with the Radeons with DirectX 11 Ultra, we wanted to take a closer look at where it is in fact used. Here are a few examples which benefit from tessellation, in order of what we estimate to be the most useful:
[ Without tessellation ] [ With tessellation ]
[ Without tessellation ] [ With tessellation ]
[ Without tessellation ] [ With tessellation ]
While the rendering of the wall is obviously greatly improved by tessellation, giving it optimal quality from every point of view, not all walls get the same treatment. Given that the effect must be introduced manually by the artists and that this was done after the original design process, only some walls benefit from tessellation, which is a shame in terms of overall coherence. Another problem, linked to the fact that tessellation wasn’t planned at the start, is that some objects, mainly stones, can sometimes take on a strange appearance and almost transform themselves into stalagmites. Of course, there are then more polygons, but the rendering isn't necessarily more accurate.
To recap, tessellation isn’t used on most pathways but rather Parallax Occlusion Mapping is, including the paving stones or bricks that paths are made with:
[ Without POM ] [ With POM ]
That said, in our test scene, beyond the first second during which we can make out tessellation on some bricks, there aren’t any walls, stones, water or any other objects that would seem likely to be chosen for tessellation. Nevertheless, we did note a significant impact on performance linked to this rendering technique. To find out more about what’s going on here, we got hold of GPU Perf Studio, the AMD analysis tool that allows you to get a view of how graphics rendering is built.
We were able to confirm that tessellation had been used on the bricks at the beginning of our scene and that the level of tessellation was rather high:
Next we noted that we had missed one highly tessellated object: the border of the path that we were on:
[ DirectX 11 Extreme ] [ DirectX 11 Ultra ] [ Tessellation wireframe ]
Above and beyond the pathway itself, on which POM is applied, switching to the Ultra Upgrade and enabling tessellation doesn’t make any visual difference on the border. As you can see in our view of the geometry on the left border however, an extreme (or ultra :p) level of tessellation has been applied. It looks like basic planar subdivision without any added detail or edge smoothing. This is of course difficult to justify…
This border represents one of the most resource hungry draw calls in the scene, or should we say two of the most resource hungry draw calls because in contrast to some other deferred rendering engines (such as the 3DMark 11 engine), CryENGINE 3 runs through the geometry twice (this is without factoring in the shadow maps). The fact that the rendering is structured in this way is probably linked to a limitation of DirectX 9 or to performance imperatives when tessellation isn’t used. As tessellation has been added on top of the rest, the base of the rendering hasn’t been adapted to take account of the new addition, which results in making the use of tessellation twice as costly in terms of resources.
Next we noted another tessellated area that is processed at the end of the scene. Rather astonishingly, Crysis 2 carries out a rendering of the sea across almost the entire test scene though the sea is never visible or even in the field of view but masked. Not a single pixel is written and yet this enormous surface represents a significant geometry load:
[ The sea … ] [ … in this scene? ]
This explains why, in our per-setting test earlier, we noted a 3% drop in performance in comparison to Extreme mode when the Ultra water was activated on the Radeon HD 6950, even though there was no water in our scene. Is this a bug? Or poor optimisation in terms of the use of tessellation? In any case, the Radeons suffer.
Note that we also wanted to find out more, again using GPU Perf Studio, about why the Radeon HD 6900s give a lower level of performance than expected. Unfortunately, when this tool observes the scene and measures the rendering time of each draw call, it adds an additional load or reduces parallelism, which has an impact on performance and prevents us from giving an accurate answer to this question. According to GPU Perf Studio 2.5.1, all draw calls using tessellation represent:
15.4% of the rendering time on the Radeon HD 6950
14.5% of the rendering time on the Radeon HD 6850
15.7% of the rendering time on the Radeon HD 5870
Given that these draw calls are not only limited by tessellation performance and that the impact we observed on performance was higher, these figures are obviously overestimated, probably because the simplest calls are relatively more impacted by the additional cost of instrumentation.
For the real enthusiasts among you and although this doesn't give us an answer to our questions, note that GPU Perf Studio shows (or rather we calculated ourselves, as the AMD developers have an issue with percentages…) that during the processing of all of these calls, the tessellator is occupied during:
89.7% of the time on the Radeon HD 6950
95.3% of the time on the Radeon HD 6850
95.8% of the time on the Radeon HD 5870
What’s more, the LDS memory, used during geometry expansion, limits processing units during:
7.0% of the time on the Radeon HD 6950
7.8% of the time on the Radeon HD 6850
8.7% of the time on the Radeon HD 5870
This LDS memory is limited in part by conflicts in access to its different banks:
38.9% of the time on the Radeon HD 6950
35.9% of the time on the Radeon HD 6850
38.4% of the time on the Radeon HD 5870
Impact of tessellation and the level of tessellation
Impact of tessellation
To summarise our analysis of the impact of tessellation in Crysis 2 DirectX 11 Ultra, we have represented the total cost of tessellation in a graph comparing performance in Extreme mode as default and adding Ultra objects and water. Of course this includes the resource demands of Parallax Occlusion Mapping, but the impact of this is much smaller, which gives us an acceptable approximation:
The results are without mercy for the Radeons, which are a long way behind the GeForces when it comes to processing tessellation. We can see that the GeForce GTX 580 and 570 benefit from their additional dedicated units in comparison to the mid-range GeForce cards to reduce the impact of tessellation.
On the other hand, while the Radeon HD 6900s are supposed to double throughput when tessellation is used (we confirmed this in theoretical tests), in practice this isn’t the case in Crysis 2. Is this due to a GPU limitation in certain cases? Are the drivers limited in terms of support for the double tessellator? Or is it a bug? As AMD had no answer during our tests, other than that the drivers advised for use with Crysis 2 DirectX 11 were still the Catalyst 11.6s, we don't know.
More logically, the Radeon HD 6800s benefit from having a larger tessellation unit buffer to gain in efficiency over the Radeon HD 5800s. The Radeon HD 5770 actually has an identical tessellation unit to the Radeon HD 5870, but benefits here from having fewer processing units to reduce the bottleneck in terms of access to the memory shared by these units.
Impact of the level of tessellation
The max tessellation factor, which is limited to 64 by default in DirectX 11, can be controlled in the AMD drivers. This factor controls geometry amplification and represents not the number of triangles generated but the maximum number of subdivisions for each edge of the triangle. The number of new triangles is therefore much higher than the tessellation factor and depends on the algorithm used.
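To give an idea of the scale involved, the sketch below counts triangles for a plain uniform subdivision, in which every edge of a triangle is cut into n segments. This is an assumption made for simplicity: the pattern generated by the Direct3D 11 tessellator differs in detail, but its triangle count also grows with the square of the factor. The function name is hypothetical.

```python
def triangles_uniform(factor: int) -> int:
    """Triangles produced from one input triangle when each edge is split
    into `factor` segments on a uniform grid (assumed pattern)."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return factor * factor


if __name__ == "__main__":
    for factor in (64, 32, 16, 1):
        print(f"factor {factor:2d}: {triangles_uniform(factor)} triangles per input triangle")
```

Under this assumption, halving the maximum factor divides the worst-case triangle count per input triangle by four.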
We observed both the cost of tessellation and the performance in DirectX 11 Ultra mode on the Radeon HD 6950 and the various MaxTessFactor limits:
As expected, the impact of tessellation drops off, although the figure still includes the cost related to Parallax Occlusion Mapping. In terms of performance, reducing the MaxTessFactor from 64 to 32 gives a 10% gain while dropping to 16 puts the Radeon HD 6950 on a par with the GeForce GTX 560 Ti.
There is a slight difference in quality when you set the MaxTessFactor at 32. It’s invisible on most elements of the image but noticeable on the walls. At 16, quality drops more significantly on the walls but remains pretty much equivalent on other objects. Below 16, while some elements are still displayed with a decent level of quality, this isn't the case for the walls at all – they’re probably the objects that benefit most from tessellation.
We therefore advise Radeon users to go for a MaxTessFactor of 32 or even 16 if you're prepared to make a few compromises on quality. Note that AMD isn’t yet using the MaxTessFactor to limit the level of tessellation by default and we of course hope that this doesn’t change.
Conclusion
When we use games to measure graphics performance, it’s important for us to have an idea of how the graphics engine works, especially when we’re dealing with new technologies such as tessellation. This allows us to differentiate between results that are significant for the game in question and those which may represent a future trend.
While Crysis 2 DirectX 11 Ultra confirms NVIDIA’s domination when it comes to tessellation processing, it is nevertheless not truly representative of what developers will do with this technology in forthcoming games, with engines and creative tools designed to manage it natively and more efficiently.
Crytek has worked hard to control the level of detail for each effect as precisely as it can so as to avoid excessive processing demands, but tessellation is an exception to this with an unjustified level of the technology applied to some objects and a rather poor level of adaptiveness, which contrasts with the zeal with which the level of detail on other elements is reduced with distance. While this can partly be explained by the fact that tessellation was added at the end of development, it also seems clear that the fact that Crytek worked with NVIDIA on the Ultra Update had an influence on this, given that a high level of tessellation, whether serving the rendering or not, enables NVIDIA cards to mark themselves out from the competition.
In a perfect world we would obviously prefer these issues simply to be technical or artistic ones… We would thus have liked to see more tessellated walls, which would really add to the visual impact, and fewer borders tessellated without reason, except NVIDIA's commercial benefit. The release of the DirectX 11 pack is nevertheless of benefit to PC gamers as all the visual improvements it introduces transform Crysis 2 into one of those rare titles that can draw on the full capacities of modern GPUs, even if it has to be said that what we're now getting with this Upgrade is what we expected from Crysis 2 on its initial release. While the upgrade doesn't revolutionise an already fine rendering, it does represent a significant development and should renew interest in the title… especially if you own a high-end GeForce.
What about high-end Radeon users? Faced with the poorly optimised implementation of tessellation in Crysis 2, we advise them to play with the Catalyst Control Center setting so as to fix the MaxTessFactor in such a way as to contain its impact on performance.
Copyright © 1997-2015 BeHardware. All rights reserved.