Before getting to grips with our subject, and to make clear exactly what we’re comparing, let’s go back over a few points about video.
Container, codecs
What is wrongly called a video file is above all else a container. The concept is simple: a container interleaves the various contents of the file, namely the video track, the audio track and, potentially, the subtitle tracks.
Left: conceptual view. Right: interleaved tracks.
Interleaving the content makes it easier to play the different tracks simultaneously without requiring too much seeking within the file. Related data (the audio that corresponds to an image, for example) thus remains close together. This is particularly important for video discs (DVDs, Blu-rays) or when transmitting a digital stream (via digital broadcast, for example).
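As a rough illustration of the idea, here is a minimal Python sketch that merges separate tracks into a single stream ordered by timestamp. The `Packet` type, its fields and the sample timestamps are all hypothetical and chosen for readability; real containers each define their own packet layout and interleaving rules.

```python
import heapq
from typing import NamedTuple


class Packet(NamedTuple):
    timestamp_ms: int  # presentation time of the packet (hypothetical field)
    track: str         # "video", "audio" or "subtitle"
    payload: str       # stand-in for the compressed data


def interleave(*tracks):
    """Merge per-track packet lists (each already sorted by time) into one stream."""
    return list(heapq.merge(*tracks, key=lambda p: p.timestamp_ms))


# Made-up sample tracks, for illustration only.
video = [Packet(0, "video", "frame0"), Packet(40, "video", "frame1"), Packet(80, "video", "frame2")]
audio = [Packet(0, "audio", "chunk0"), Packet(32, "audio", "chunk1"), Packet(64, "audio", "chunk2")]
subs = [Packet(40, "subtitle", "Hello")]

stream = interleave(video, audio, subs)
for p in stream:
    print(p.timestamp_ms, p.track, p.payload)
```

Reading `stream` from start to finish gives a player the audio and the picture for a given moment side by side, which is the property that matters for discs and broadcast streams.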
There is no fixed relationship between a container and the formats of the tracks it contains. An AVI file can contain video tracks encoded in many formats (MPEG-1, Xvid and so on), with the same going for audio (MP3, AAC and so on). A utility like MediaInfo will tell you exactly which tracks make up a file, and which format they are encoded in. There are, however, some technical limitations, which can, for example, prevent a video format such as H.264 from being stored (straightforwardly) in an AVI file.
Therefore, when we talk about transcoding solutions, we’re actually talking about changing one or several of the formats in the original file. The compression formats can be changed to give a smaller file, converting, say, a DVD (VOB container, MPEG-2 video, Dolby Digital audio) to a video file (AVI, Xvid, MP3). Sometimes you might want to retain the same video format but reduce the size of the files, for example by converting a Blu-ray (.m2ts, H.264, DTS-HD) into a file that can be played on a tablet (.mp4, H.264, AAC) or games console. Even if you don’t change the video format (H.264 on both sides), you might want to recompress the video, whether to reduce the file size (Blu-rays take up a lot of space), to lower the resolution (1280 x 720 instead of 1920 x 1080) or to meet the constraints of the destination device (iPads, for example, can't decode all H.264 compression profiles).
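The decision described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Track` and `Target` types, the `plan` function and the 720-line limit are hypothetical, and a real tool would also have to check profiles, bitrates and container constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    kind: str        # "video" or "audio"
    codec: str       # e.g. "H.264", "AAC"
    height: int = 0  # vertical resolution, video tracks only


@dataclass(frozen=True)
class Target:
    container: str
    video_codec: str
    audio_codec: str
    max_height: int  # hypothetical device limit


def plan(tracks, target):
    """Return 'copy' for tracks the target accepts as-is, 're-encode' otherwise."""
    actions = {}
    for t in tracks:
        if t.kind == "video":
            ok = t.codec == target.video_codec and t.height <= target.max_height
        else:
            ok = t.codec == target.audio_codec
        actions[t.kind] = "copy" if ok else "re-encode"
    return actions


bluray = [Track("video", "H.264", 1080), Track("audio", "DTS-HD")]
tablet = Target(".mp4", "H.264", "AAC", 720)

print(plan(bluray, tablet))  # {'video': 're-encode', 'audio': 're-encode'}
```

The video track is H.264 on both sides, yet it still has to be re-encoded because the assumed target only accepts 720 lines.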
There are, then, a large number of stages to manage when it comes to transcoding a file, and not all these stages can be accelerated by the GPU. In the diagram below, from left to right, you can see the stages necessary for the transcoding of a video file:
Of all these stages, only two can currently be accelerated by a GPU: decoding the original video track to a raw video format (using dedicated GPU circuits, the same ones used during accelerated playback of a DVD or a Blu-ray) and encoding (using either the GPU processing units [CUDA – NVIDIA / Stream – AMD] or a part of the GPU dedicated to this task [Intel method for Sandy Bridge/HD 3000]).
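To picture how these two stages fit together, here is a deliberately simplified Python sketch in which a stand-in decoder hands raw frames to a stand-in encoder one at a time. The `decode` and `encode` functions are hypothetical placeholders that only manipulate strings; they do not represent any real codec or GPU API.

```python
def decode(packets):
    """Stand-in decoder: turn each compressed packet into a 'raw' frame."""
    for packet in packets:
        yield f"raw({packet})"


def encode(frames):
    """Stand-in encoder: compress each raw frame into the target format."""
    for frame in frames:
        yield f"enc({frame})"


source = ["pkt0", "pkt1", "pkt2"]  # made-up compressed packets

# Generators are lazy, so each frame is encoded as soon as it has been decoded,
# without first decoding the whole track.
output = list(encode(decode(source)))
print(output)
```

In this sketch the two functions simply alternate in a single thread; a real transcoder would run the decoder and the encoder concurrently, on the CPU or on the GPU.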
Only these two stages can be accelerated, then, but they are by far the most resource-hungry. Here is, for example, a breakdown of an encode that we carried out for our tests (a file extracted from a Blu-ray and encoded to a 720p MKV file):
This is a very high quality encode that we ran to create a source file (we’ll come back to this), which is why the video encoding time is so significant. To conclude on the subject of transcoding, there are two points to bear in mind for the rest of this article:
- Video decoding and encoding are carried out in parallel, frame by frame. In theory encoding takes longer than decoding, but with GPU acceleration this isn’t always the case.
- Although some tools allow you to break down the time taken for each stage, not all the software that we tested does so. When we talk about transcoding times further on, these will therefore cover the time taken for all stages (demux, video encoding, audio, remux) and not only the video encoding time!
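The second point can be made concrete with a small Python sketch. The stage timings below are invented purely for illustration and are not measurements from our tests; video decoding and encoding are counted as a single stage because they run together.

```python
# Hypothetical stage timings in seconds; illustrative values only, not measurements.
stages = [
    # (name, seconds, can_use_gpu)
    ("demux", 10.0, False),
    ("video (decode + encode)", 200.0, True),
    ("audio", 15.0, False),
    ("remux", 5.0, False),
]

# Total transcoding time covers every stage, not only the video encode.
total = sum(seconds for _, seconds, _ in stages)

# Time spent in the stages that a GPU could accelerate.
video_time = sum(seconds for _, seconds, gpu in stages if gpu)
video_share = video_time / total

print(f"total transcoding time: {total:.0f} s")
print(f"video stage: {video_time:.0f} s ({video_share:.0%} of the total)")
```

With these made-up figures the video stage dominates, but the other stages still add to the total, which is why an overall transcoding time and a pure video encoding time should not be compared directly.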
Let's now move on to the specificities of H.264.