Since the release of the Radeon X1800 a year ago, ATI has been talking about uses for its GPUs other than 3D rendering. The most recent occasion was in August, with the announcement of DPVM, a low-level interface to the GPU. This was an important step towards using the GPU as a general calculation unit.
Today, ATI is pushing the concept further. Whereas previous announcements and documents were mainly intended to fill marketing brochures for new GPU launches and to inform developers of new possibilities, there are now practical applications. A few days ago in San Francisco, ATI presented, in partnership with Stanford University, PeakStream, Microsoft, Adobe and others, an initiative called Stream Computing, with the aim of being recognised as a provider of computing power. PeakStream is a very young player in the market that sells programming solutions for exploiting processors such as GPUs. It is a very important partner for ATI in entering this market.
The marketing department has chosen another name for DPVM (Data Parallel Virtual Machine): it is now called CTM (Close To Metal). ATI has developed several demonstrations for the professional market that use the GPU as an accelerator. Examples include facial recognition systems, which require a high level of performance (Intel uses the same example for future CPUs with dozens of cores), financial simulations and database search engines. Everything is functional, and ATI is now working on convincing the market to trust this solution.
Stanford University has just taken a symbolic step with the release of a GPU-accelerated Folding@Home client. It is currently certified only for the Radeon X1900 and X1950, as other GPUs do not provide high enough performance and are restricted by their architecture (this is the case for NVIDIA's chips). The required drivers are Catalyst 6.5 or 6.10 (apparently there is a bug with the current beta version of Catalyst 6.10). To make certification easier (it is of course out of the question to let these systems produce erroneous results), the development team is currently focusing on specific hardware and drivers. Even though everything goes through Direct3D 9, the program checks that the certified hardware and drivers are actually in use and closes if they are not. With a Core 2 Duo E6600 processor, we measured approximately 18 minutes per step (each client uses a single core), compared with 7 minutes for a Radeon X1950 XTX (one CPU core remains fully used). Note that the Radeon runs at its 2D clocks, as Folding@Home runs in windowed mode. As a consequence, frequencies are reduced: 500/600 MHz instead of 650/1000 MHz.
ATI told us that a version using CTM instead of Direct3D is under development and will be released soon if the performance gain is high enough (performance could double). This would also make it possible to support the Radeon X1800.
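To put these figures in perspective, here is a minimal sketch of the arithmetic. The two step times are our measurements; the CTM figure is only a projection that assumes ATI's "two times higher" estimate holds, and the variable and function names are ours, chosen for illustration.

```python
# Step times measured in this article, in minutes per Folding@Home step.
cpu_minutes_per_step = 18.0   # Core 2 Duo E6600, one core
gpu_minutes_per_step = 7.0    # Radeon X1950 XTX, Direct3D 9 client

# Assumption: ATI's estimate that the CTM client could be twice as fast.
assumed_ctm_gain = 2.0


def speedup(baseline_minutes: float, accelerated_minutes: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    return baseline_minutes / accelerated_minutes


d3d_speedup = speedup(cpu_minutes_per_step, gpu_minutes_per_step)

# Projected, not measured: halve the GPU step time and compare again.
projected_ctm_minutes = gpu_minutes_per_step / assumed_ctm_gain
projected_ctm_speedup = speedup(cpu_minutes_per_step, projected_ctm_minutes)

print(f"Direct3D client: {d3d_speedup:.2f}x faster than one CPU core")
print(f"CTM client (projected): {projected_ctm_speedup:.2f}x faster")
```

The measured ratio works out to roughly 2.6x for the Direct3D client, and a little over 5x if the CTM version really does double performance. Keep in mind that the comparison is against a single CPU core and that the GPU client still keeps one core busy.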
CTM sometimes allows huge performance gains, since it is no longer necessary to work around some of the limitations imposed by the API. According to ATI, the gains are sometimes impressive and could be higher still, because one side has so far received far more optimisation work than the other. The algorithms used on CPUs have generally been known for a long time and are highly optimised. The GPU version, on the other hand, has only just been released, and the first objective is simply that it works. Given the complexity of working around Direct3D's primary purpose, it is logical to expect that more effort will go into optimising the code once CTM is used. An ATI engineer also told us of being surprised by the ease with which new employees who used to write code for CPUs were able to fully exploit GPUs via CTM.
We asked ATI whether CTM could be used in Havok FX, and the manufacturer answered that Havok is currently evaluating this option. If it is chosen, ATI will have a considerable performance advantage over NVIDIA. Havok would, however, have to support two versions of the software. CTM has another advantage: it makes it easier to use the GPU for physics effects that influence gameplay (GPUs are mainly seen as capable of handling physics that only affects visuals).
The last important points to mention are the reliability and accuracy of calculations. ATI is keeping open the possibility of launching a range of GPUs intended as general calculators, with several certifications to guarantee reliability. The number of small errors is currently higher on consumer GPUs than on CPUs: according to the Stanford team, they produce one error every two days. Note, though, that these small errors are inconsequential most of the time. Also, if ATI has chosen not to hide the truth, it means that the manufacturer has studied this point in detail and knows where it is heading. If ATI releases a new range of GPUs, we can assume that they will go through the usual advanced tests, just like the FireGL.
Current GPUs use simplified FP32 calculation units that do not always give the same result as a unit that fully complies with the IEEE 754 standard. DirectX 10 GPUs will go further in this direction, but ATI said that they will not match the specification exactly, despite being closer. The aim is of course to limit additional costs. We can however imagine that in the future AMD will integrate fully compliant units in order to have a product for the Torrenza platform (and why not in a high-end GPU, where die size matters less?).
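To illustrate how a simplified FP32 unit can diverge from a compliant one, here is a small sketch. ATI did not detail which shortcuts its units take, so the "simplified" unit below is purely hypothetical: we assume it truncates results toward zero instead of rounding to nearest as IEEE 754 does by default. The function names are ours, and the emulation goes through Python's double-precision floats, so it is an approximation of such hardware rather than a model of any real GPU.

```python
import struct


def to_f32_nearest(x: float) -> float:
    """Round a double to the nearest FP32 value (IEEE 754 default rounding)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f32_truncate(x: float) -> float:
    """Round a double toward zero to FP32 (hypothetical simplified unit)."""
    nearest = to_f32_nearest(x)
    if abs(nearest) <= abs(x):
        return nearest
    # Round-to-nearest overshot: step back one FP32 value toward zero
    # by decrementing the magnitude part of the bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", nearest))[0]
    return struct.unpack("<f", struct.pack("<I", bits - 1))[0]


def accumulate(value: float, count: int, round_fn) -> float:
    """Add `value` to a running FP32 total `count` times with the given rounding."""
    step = to_f32_nearest(value)  # the input constant is the same for both units
    total = 0.0
    for _ in range(count):
        total = round_fn(total + step)
    return total


ieee_total = accumulate(0.1, 1000, to_f32_nearest)
simplified_total = accumulate(0.1, 1000, to_f32_truncate)

print("IEEE-style rounding:  ", ieee_total)
print("Truncating (assumed): ", simplified_total)
print("Difference:           ", ieee_total - simplified_total)
```

Each individual difference is at most one unit in the last place, which is why such errors usually go unnoticed in rendering. Over a long chain of dependent operations, as in a scientific simulation, they can add up, which is the concern raised here.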
ATI is a pioneer in this new market. NVIDIA is not far behind, however: it has a suitable infrastructure in Quadro Plex, but the ageing architecture of its current GPUs strongly limits their possibilities as general calculation units. Once the G80 is included in Quadro Plex, NVIDIA will have a product ready for release. One last detail: the founder and current Chief Technology Officer of PeakStream, Matthew Papakipos, is a former NVIDIA employee who was in charge of the design of the GeForce 3, GeForce 4, GeForce FX and GeForce 6 architectures (in addition to having worked on other GPUs since the TNT). We can imagine that a tight bond still exists between NVIDIA and PeakStream, even if for now the company has chosen ATI, which currently has the most interesting and advanced solution.