Mobile CPUs: AMD A8 and A10 vs Core i5 and i7 (Llano, Trinity, Sandy and Ivy Bridge) - BeHardware
Written by Guillaume Louel
Published on June 11, 2012
Once niche, expensive and aimed at the enterprise market, laptops have become accessible to all over the last ten years. Numerous factors have contributed to this – the netbook effect having had a clear impact on the average sales price of machines – to the point where laptops have become popular for reasons other than portability. While some users do still need mobility, most laptops sold for the general consumer satisfy other criteria: large screen, small battery and as high a level of performance as possible for a low price. These machines are often used as transportables – from one room to another – rather than portables.
This variability in needs has created strong segmentation, energy consumption being the most important factor. While most mass market machines use processors with a TDP of between 35 and 45 Watts, those designed with an eye on portability (Intel calls them Ultrabooks) use 17 Watt processors. The introduction of the GPU into the processor, both in Intel and AMD products has added a new segmentation possibility and manufacturers often highlight either the CPU or GPU according to chip performance.
We therefore wanted to take the opportunity provided by the launch of the new AMD and Intel mobile platforms, Trinity and Ivy Bridge, to assess the performance of the current offering. We’re going to do our best to find out what they’re capable of in terms of CPU, GPU, OpenCL and energy consumption performance.
A very opaque world
Before starting it’s important to note that the world of laptops is very different to that of desktop platforms. In the first place because OEMs rule the roost here! The laptops on sale in stores are mostly sold by OEMs and while you can create laptops à la carte, you’re often limited by the availability of components. In practice, this option is above all made available to online stores and assemblers.
The joy of pre-installed software. A German version for our Llano test machine.
Laptop processors are rarely, if at all, sold at retail (sometimes you can find a few entry-level references). It's generally impossible to find information on the real price of chips – Intel’s price list doesn’t include all models. For example, on Intel’s site you can find four Ivy Bridge quad cores at $378, the Core i7 3610QM, 3612QM, 3615QM and 3720QM (500 MHz difference between the 3612 and the 3720 with a TDP difference of 10W, 300 MHz between the 3610 and the 3720 at the same TDP)! These ‘consumer’ prices are far higher in practice than the prices at which these chips are sold to OEMs. Things aren’t necessarily any easier to understand on the AMD side as AMD doesn’t supply a price list for its mobile processors.
Comparing processors that are priced the same therefore quickly becomes impossible and while you can go by the price of laptops to make your choice, at least partially, the variability of so-called ‘equal’ specs from one brand to another means that it is extremely difficult to get a clear picture of the situation. We have tried to choose as well as we can among the available models, basing our choice on the price of laptops, the TDP and the number of cores to give you an overview of the relative Intel and AMD offers.
AMD Llano and Trinity; AMD A8-3520M and A10-4600M
A year ago, AMD launched its first mobile APUs for the PC laptop market. Known as Llanos, these chips represented the concretisation of the AMD strategy of buying out ATI as they included both an x86 processor part and a GPU. This ‘vision’ was delayed several times and in fact Intel beat AMD to it, releasing Sandy Bridge at the beginning of 2011.
Technically speaking, Llano is based on the integration of x86 cores, built using the K10.5 architecture, namely the architecture AMD used for the Athlon IIs. Engraved at 32nm (against 45 for the Athlon IIs), these chips however have a 1 MB L2 cache per core rather than 512 KB. The Llanos don’t have a level 3 cache.
On the GPU side, Llano was based on Redwood, the GPU which the Radeon HD 5570s ran on. To recap, it used the AMD VLIW5 architecture. Note finally that while the CPU did have a Turbo mode, the GPU didn’t.
Trinity, blank page?
In mid-May AMD officially launched Trinity, its second version of its APU for laptop PCs, and it’s difficult to think that the designers didn’t start with a blank page, so different is Trinity from Llano.
Nevertheless, let’s start with what they have in common. Llano and Trinity are both APUs engraved at 32nm SOI by GlobalFoundries. The last point in common is that neither chip has an L3 cache.
On the CPU side, it’s out with the K10.5 x86 cores and in with ‘Piledriver’ modules. To recap, Piledriver is version 2 of the Bulldozer architecture, launched on the desktop side with the AMD FXs last year. The basic concept of Bulldozer and Trinity is the fusion of two cores in a single module so as to share a certain number of resources. Thus the module has two processing units for integers and a shared unit that is used for floating point calculations. And while each core has a level 1 cache, they share a level 2 cache as well as other resources (instruction decoders, prefetchers and so on).
AMD announced lots of changes to Bulldozer with its Piledriver architecture, or more exactly lots of little corrections pretty much everywhere. The details released by AMD are however relatively high level. First of all with respect to the instruction set, Piledriver supports FMA3 (Fused Multiply/Add on 3 operands, a = a * b + c) in addition to FMA4 (a = b * c + d), which was already supported previously (Intel will use FMA3 as of next year), as well as the 16/32 bit floating point conversion instructions introduced by Intel in Ivy Bridge.
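To illustrate what "fused" means in the FMA instructions mentioned above, here is a minimal Python sketch. It does not call the hardware instruction; it emulates the single rounding step with exact rational arithmetic, and the function names and input values are chosen purely for illustration. FMA3 and FMA4 differ only in whether the result overwrites one of the source operands, so the arithmetic shown applies to both.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: compute a * b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # the only rounding step


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Ordinary multiply followed by add: the product is rounded before the add."""
    return a * b + c


# Illustrative inputs: the exact product is 1 - 2**-60, which cannot be
# represented as a double and rounds to 1.0 when computed on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

fused = fused_multiply_add(a, b, c)
separate = separate_multiply_add(a, b, c)
print("fused:   ", fused)
print("separate:", separate)
```

With these inputs the separate version loses the small term entirely, while the fused version keeps it. This is why FMA matters for numerical code beyond the throughput gain of issuing one instruction instead of two.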
For the rest, the changes are mostly small touches at all levels. With the branch prediction mechanisms and schedulers announced as more efficient, gains have been announced for divisions. Overall, it’s quite difficult to judge the impact that these changes will have on performance and the absence of a level 3 cache stops us from carrying out a comparison at equal clocks with the AMD FX ‘Bulldozer’ (we’ll have to wait for the AMD FX ‘Piledriver' Desktops, the Visheras, which are planned for the third quarter). To recap AMD announced a gain of less than 10% between Bulldozer and Piledriver. It's on the energy consumption and energy efficiency side that AMD is claiming the biggest gains, with a 10% to 20% energy economy improvement over the Bulldozer architecture.
On the GPU side it’s also all new as AMD has gone from a VLIW5 architecture to a VLIW4 one, as used on the Cayman GPUs (Radeon HD 6900) last year. The number of processing units depends on the model. Thus the A10 that we have tested here has 384 shader units, while the A8, A6 and A4 have 256, 192 and 128 units respectively.
The final noteworthy change is the Turbo. AMD now has both a CPU and GPU Turbo, both of which can be used according to need.
In practice all these changes have allowed AMD to reduce the TDP significantly in comparison to Llano. Out then with the high-end 45 Watt models. AMD is now targeting TDPs of 17, 25 and 35 Watts. The best of the Trinity APUs, the one we’re testing today, is the A10-4600M. Its closest equivalent, the best of the 35 Watt Llanos launched last December, is the A8-3520M. We also tested this solution.
AMD has added around 125 million transistors between the two chips, but both are quad-core chips. However Trinity represents a significant increase in clock of 700 MHz, whether this be the base clock or Turbo mode. The clock of the IGP hasn’t increased by much, going from 444 to 468 MHz but its Turbo mode will allow it to increase to 686 MHz. Is this enough for Trinity to give the extra 20% CPU performance and extra 30% GPU over Llano, as announced by AMD? This is something we’re going to check now!
SNB & IVB; Core i5-2410M, 3210M and i7-2630QM, 3610QM, 3612QM
Of course Intel also has its mobile processors. Firstly it introduced Sandy Bridge last year and this is still the platform used on many machines currently on the market. No surprises on the architecture side as the mobile solution gives the same as on desktop as described here. Among the particularities note that, given huge production volumes, Intel has produced distinct dies for the dual and quad core versions of its mobile processors.
Remember, among the main Sandy Bridge features was the inclusion of the IGP on the die, an integration that was also made at resource level with the level 3 cache (called Last Level Cache) shared between the x86 cores and the IGP. Intel also introduced the AVX instruction set as well as a Turbo Boost mode for both the CPU and GPU sides.
Overall, if we put the obvious TDP, clock and memory size variations to one side, the mobile Sandy Bridges are identical to the Sandy Bridge desktops. There is however a small difference in terms of the Turbo mode.
To recap, Intel has a dedicated energy consumption management unit (Power Control Unit) which monitors the temperature and energy consumption of the chip so as to modulate the Turbo accordingly. One particularity of the mobile version is that Intel allows its chips to function beyond the TDP for which they have been designed. Intel plays on the fact that the processor heats up in a relatively progressive way (you can check this yourself by launching a benchmark and monitoring the temperature via an application such as Hardware Monitor or HWiNFO). Intel takes advantage of this phase to overconsume, the goal being to exceed the TDP from the point of view of consumption (the power supply systems must be designed with this in view) but not from the ‘thermal’ side. Intel thus allows mobile Sandy Bridge processors to exceed their TDP by 25% for 28 seconds, which in practice can be relatively stressful for the cooling systems on laptops with a quad-core processor. OEM designs are however manufactured to take account of this particularity, which tends to make the notion of TDP a little more opaque.
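As a rough worked example of the figures above, the following Python sketch computes the short-term power ceiling and the extra energy involved for the 35 and 45 Watt TDP classes. It is a simplification: it assumes the chip sits exactly at the 25% ceiling for the full 28 seconds, whereas the real Power Control Unit modulates the Turbo continuously. The function names are illustrative only.

```python
def turbo_power_limit(tdp_watts: float, headroom: float = 0.25) -> float:
    """Short-term power ceiling when the chip may exceed its TDP by `headroom`."""
    return tdp_watts * (1.0 + headroom)


def extra_energy_joules(tdp_watts: float, headroom: float = 0.25,
                        window_seconds: float = 28.0) -> float:
    """Energy drawn above the TDP if the ceiling is held for the whole window.

    Assumption: constant draw at the ceiling, which is a worst case.
    """
    return tdp_watts * headroom * window_seconds


for tdp in (35.0, 45.0):
    print(f"TDP {tdp:.0f} W: ceiling {turbo_power_limit(tdp):.2f} W, "
          f"up to {extra_energy_joules(tdp):.0f} J above TDP over 28 s")
```

Under these assumptions a 45 Watt quad core may draw a little over 56 Watts for the start of a benchmark, which helps explain why short tests can look better than sustained ones.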
Of course, this is not without an impact on benchmarks and, depending on processing time, this improved turbo will push performance up. This is something you’ll have to keep in mind when reading our benchmarks.
On the graphics side, Intel offers both models equipped with the HD 2000 and the HD 3000 (once again, see our original Sandy Bridge test). In practice, the HD 3000 is more common in mobile products.
Mobile Ivy Bridge: dual and quad cores
In addition to the Sandy Bridge processors we were also able to test the mobile Ivy Bridge platforms. Once again, the same architecture has been used, the architecture we covered at the end of April for the Desktop chips.
Ivy Bridge dual and quad core
Firstly from a practical point of view, note that Intel, as it did on the desktop side, has retained the same socket as for Sandy Bridge, namely the PGA 988. In addition to the quad-core models which were launched at the same time as the Ivy Bridge Desktop processors, we also tested a dual-core model announced just a few days ago. All the models announced use HD 4000.
Beyond the architectural changes, the main advantage of Ivy Bridge comes in the fact that it is one of the first processors engraved at 22nm (the AMD offer, like Sandy Bridge, is engraved at 32nm). We’ll see further on if this gives an advantage in terms of energy consumption.
Recap of configurations
For our tests we used three mobile platforms, one for each of the sockets represented. In order to put all the configurations on an equal footing with respect to energy consumption, we chose machines without additional graphics cards.
For FS1, the socket used by the AMD Llanos, we used a Lenovo Thinkpad Edge E525 laptop. For the AMD Trinity solutions (socket FS2), we used a test machine supplied by AMD. For the Intel platform, used for Sandy Bridge and Ivy Bridge (PGA988), we used a Clevo W270EU equipped with the HM76 chipset.
Our three platforms were equipped with three different but relatively similar chipsets:
On the AMD side the only difference between the A60M, which equips the Llano platform, and the A70M, which equips the Trinity platform, is the addition of an NEC block that supports USB 3.0. The chip TDP hasn’t changed.
The Intel HM76 also supports USB 3.0, like its desktop equivalents. Still in line with the Intel desktop chipsets, only two SATA 6 Gb/s ports are supported here, which isn’t too much of an issue on the mobile side. RAID is however absent and is only used on very few laptops (Intel has some higher end versions of its chipset with RAID). The Intel chipset supports up to 8 PCI Express 2.0 lanes to interconnect additional chips. The AMD chipsets settle for four lanes. Once again, this is plenty for this type of configuration.
We have thus tested seven different processors, the specs of which are shown in this table.
In addition to Trinity, we have added the best Llano processor available at an equivalent TDP. We got our hands on two Sandy Bridge processors: an entry-level Core i5, the 2410M (often now replaced with the 2430M, which is 100 MHz faster), and a quad core model at 45 Watts. Intel doesn’t have a 35 Watt quad core for Sandy Bridge. We have chosen the manufacturer’s most affordable model, the 2630QM (once again, generally replaced by the 2670QM).
We also got hold of the most affordable of the dual core Ivy Bridge 35 Watt models, the 3210M. We also added two other quad cores. Intel has both 35 and 45 Watt models of its chips with some relatively obscure references. Thus the 3610 is a 45 Watt model and faster than the 3612, which is just a 35 Watt model. Once again we restricted ourselves to the least expensive models, though of course we don't know the real prices of these chips.
Generally speaking we're talking about Llano and Intel dual core machines with a starting price of around €600 (entirely variable depending on the configuration and the OEM!) and around €100 more for the quad core models. The exact positioning of the Trinity platforms, while still not on sale in stores, should be between the two, with AMD positioning the A10 opposite the Core i7 range.
The reference AMD Trinity platform
So as to put the machines on an equal footing, they were all tested with the same SSD, a Samsung 830 128 GB that we tested in this article.
Regarding the memory, we tested the machines both in single channel and dual channel. Unfortunately, many OEMs are still delivering machines with a single memory module, a hardly significant economy that can wreck performance as we’ll see further on. For information, the Llano machine that we used here was delivered with a single memory module. All machines were configured with 4 GB of memory (2x2 GB or 1x4 GB) for the tests. In theory, Llano, Trinity and Ivy Bridge support DDR3-1600 memory while Sandy Bridge is limited to DDR3-1333. In practice our Lenovo Llano platform wouldn’t accept DDR3-1600 and the memory ran at DDR3-1333 on this platform. With laptops, SPD detection of modules can’t be relied on, and on the machines we had it was impossible to force a given memory setting. In passing, in the ‘Advanced’ menu of one of our machines we could only find… an option to turn Bluetooth on or off. Very advanced!
The Wi-Fi card, the optical drive and the battery were removed for the energy consumption readings, which were taken at the wall socket with a universal power adaptor so as not to be affected by the varying efficiency of the power supplies bundled by manufacturers. All our tests were also carried out using an external 1366x768 screen, with the internal screen deactivated so as to enable us to provide as level a playing field as possible. Let’s move on to the tests!
Memory bandwidth, energy consumption
First we measured the memory performance of our respective platforms.
We started with the memory bandwidth accessible via a single thread. The readings were taken with the Aida64 memory test.
The first thing to note for these tests is that we couldn't carry out the memory tests on the Trinity platform with DDR3-1600. Launching a memory bench (Aida64 or RMMT) systematically and immediately brought about a crash that wasn't reproduced with DDR3-1333, or with DDR3-1600 in any other test. This problem may have been linked to our test platform, which is of course a preproduction system.
The Intel processors clearly benefit from having a second memory channel here, something that isn’t really the case with the AMD solutions. Performance levels with Trinity aren't greatly improved over Llano and in fact are even slightly lower for writes.
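For context when reading the measured figures, the theoretical peak bandwidth of the memory configurations tested can be worked out from the transfer rate, the 64-bit (8 byte) width of a DDR3 channel and the number of channels. The short Python sketch below does this arithmetic; the function name is illustrative, and measured bandwidth will always land below these ceilings.

```python
BYTES_PER_TRANSFER = 8  # one DDR3 channel is 64 bits wide


def peak_bandwidth_gb_s(mega_transfers_per_s: float, channels: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels / 1000.0


for speed in (1333, 1600):
    for channels in (1, 2):
        print(f"DDR3-{speed}, {channels} channel(s): "
              f"{peak_bandwidth_gb_s(speed, channels):.1f} GB/s")
```

A single DDR3-1333 module tops out at roughly 10.7 GB/s, against 25.6 GB/s for two DDR3-1600 modules. An IGP has to share that budget with the CPU cores, which is why a single-module configuration is particularly costly on these platforms.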
Let’s move on to the multithreaded benchmark included in RightMark.
Although the AMD processors benefit more from the second channel in this test, write performance is particularly low with Trinity here. We noted a particularly high latency with Trinity, measured at 82ns by Aida64 while Llano is closer to 65ns with the same memory.
We measured the energy consumption at the wall socket of our platforms in different scenarios:
- Machine at idle
- H.264 720p video file in DXVA mode via MPC-HC
- Processor in load (Cinebench)
- GPU in load (Furmark)
- Processor + GPU in load (Cinebench + Furmark)
- Load in game (F1 2011)
On to the results!
As we said earlier, Intel allows its processors to consume above and beyond the TDP for 28 seconds. It’s this maximum energy consumption that we have noted in this graph and this is why the energy consumption of the dual core processors is particularly high in ‘Cinebench+Furmark’. This is the only case where these processors exceed their TDP.
There are overall a few noteworthy points. Firstly, the energy consumption at idle of the Ivy Bridges tested was slightly higher than that of the Sandy Bridges, which are of course older. There’s significant variability from one sample to another. The energy consumption with Trinity is very similar to Llano, with consumption at idle lower but in load, including for a simple DXVA video, on a level with the Intel Sandy Bridge platforms. Note that Furmark stresses the energy consumption of the APU a lot more. This is linked to the fact that the Trinity GPU has a Turbo mode, something that Llano doesn’t have. In F1 2011, energy consumption is more measured, even excellent, considering that the Trinity platform obtains the best performance in this test.
CPU performance: Cinebench, x264, Visual Studio
We began our performance tests by looking at processor-side performance.
We used Cinebench version R11.5 to measure 3D rendering performance. To recap, the software uses the Cinema 4D rendering engine.
[ Single thread ] [ Multithread ]
Whether on one or four cores, our Trinity A10 is more or less on a par with the Llano A8 that we tested, in spite of the difference in clock significantly in favour of the A10 and its superior memory. Even the Sandy Bridge dual core does better in this test.
Staxrip - x264 b2197
Switching frontend, we used Staxrip to transcode a scene from Avatar with x264 build 2197. We carried out a two-pass encode with the medium preset on a 720p source, re-encoded at a bitrate of 6 Mbit/s. To recap, the second pass is the one that benefits most from multithreading.
[ Pass 1 ] [ Pass 2 ]
There’s a more marked advantage for the Piledriver architecture in this bench with gains of 8% and 13% compared to Llano on the first and second passes respectively. Once again, the Intel Sandy Bridge dual core has the lead.
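For readers who want to reproduce a comparable workload without the Staxrip frontend, a two-pass x264 encode can be sketched as two command lines. The Python snippet below only builds the argument lists; it is an assumption about an equivalent command-line invocation, not the exact commands Staxrip generates, and the file names are hypothetical.

```python
import os


def x264_two_pass_commands(source: str, output: str,
                           bitrate_kbps: int = 6000,
                           preset: str = "medium",
                           stats_file: str = "x264_2pass.log") -> list[list[str]]:
    """Build argument lists for a two-pass x264 encode at a target bitrate."""
    common = ["x264", "--preset", preset,
              "--bitrate", str(bitrate_kbps),  # x264 expects kbit/s
              "--stats", stats_file]
    # Pass 1 only gathers statistics, so its video output is discarded.
    first = common + ["--pass", "1", "--output", os.devnull, source]
    # Pass 2 reads the statistics file and writes the final stream.
    second = common + ["--pass", "2", "--output", output, source]
    return [first, second]


# Hypothetical file names, for illustration only.
commands = x264_two_pass_commands("scene_720p.y4m", "scene_6mbps.264")
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.run` on a machine where the `x264` binary is on the path; timing the two calls separately gives per-pass figures comparable in spirit to those reported here.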
Visual Studio 2011 beta
We opted for the 2011 beta version of Visual Studio. We compiled the latest version (1.7.4) of the source code of the Ogre 3D engine (examples included). Parallel compilation was activated for each project in VS.
Here, within the same thermal envelope, Trinity gives a 4% gain over Llano. The Core i5 2410M is still a good way in the lead however.
CPU performance: 7-Zip, Bibble
We used version 9.20 of 7-Zip to compress a large volume of files using the LZMA2 algorithm.
Trinity outdoes Llano by around 8% in this test. There’s a marked difference even with the Intel dual cores.
Let’s bring our purely processor tests to an end with the photo processing software, Bibble. We processed a batch of 48 RAW photos, exported as JPEGs.
In this last test there was an 11% gain from Llano to Trinity. There is however a big gap to the Intel platforms.
Let's move on to the OpenCL tests.
OpenCL performance: DxO Optics Pro, WinZIP, Luxmark
To make up for lower processor performance, AMD highlights its support for OpenCL. Intel also supplies an OpenCL compatible driver with Ivy Bridge for the HD 4000s and 2500s. It isn’t directly compatible with Sandy Bridge. For most of the tests, where possible, we compared the pure CPU rendering mode with the accelerated OpenCL rendering mode. There are still relatively few applications that use OpenCL, though the number is increasing. Transparent support of multiple platforms isn’t however yet entirely there.
DxO Optics Pro 7.2.3
We used version 7.2.3 of this photo processing application to carry out RAW to JPEG exports on a series of 48 files. Note that version 7.5 was recently released.
[ CPU ] [ OpenCL ]
DxO Optics Pro only allows activation of OpenCL by default on a platform where the CPU is slower than the OpenCL acceleration. A benchmark is run at launch of the application to measure the respective performance levels. This benchmark can’t be run on the quad-core Ivy Bridge solutions. We were however able to run it on the dual core Core i5 3210M, but activation of OpenCL systematically results in a hang during conversion of the first photo. The Intel OpenCL driver is particularly capricious as we’ll see further on.
The gain given by OpenCL is particularly marked on Llano and isn’t insignificant on Trinity and allows them to perform a bit better against the Intel dual core offer. Rather interestingly, single channel mode seems to have more of an impact on performance when GPU processing kicks in.
As highlighted by AMD at the launch of its Radeon HD 7000s, version 16.5 of WinZIP includes OpenCL support. We compressed the same set of files as that used in our 7-Zip test.
[ CPU ] [ OpenCL ]
Where 7-Zip is very much affected by processor performance, the gaps between the different Intel platforms are much smaller here as WinZIP doesn’t benefit from the additional cores as much as 7-Zip. This time it wasn’t possible to turn OpenCL on on the Ivy Bridges as the option didn’t appear. Perhaps the developer hasn’t yet validated the Intel driver. Once again the gains on the AMD side were more marked and our A10-4600M almost managed to catch up to the Core i7 2630QM.
The last of our OpenCL tests was LuxMark, a benchmark that allows you to compare the OpenCL CPU (the rendering of the OpenCL kernels carried out on the processor), GPU, and mixed CPU + GPU modes.
[ OpenCL CPU ] [ OpenCL GPU ] [ OpenCL CPU+GPU ]
Launching Luxmark on the Ivy Bridge platforms made the graphics driver crash in Windows. On restarting Windows, the application remained frozen. On the platforms where the test did run, combining CPU and GPU performance gives a gain in performance even though you can see that here the processor contributes more to the results.
Overall, OpenCL support limits the damage in some applications for the AMD processors. Intel still needs to work on its driver! Let’s now move on to the tests in games!
Performance in games: F1 2011, Civilization V, Battlefield 3
We measured performance in six modern games on our mobile platforms. To recap, we carried out these tests at 1366 x 768.
We started with the Codemasters game, F1 2011. We tested the game both in DirectX 9 and DirectX 11 modes. We used the Medium/Intermediary setting.
[ DirectX 9 ] [ DirectX 11 ]
The first point to note here is that only having a single memory channel on a machine on which you’re using the IGP has a negative impact on performance. Next, and this isn’t too much of a surprise, AMD does very well here with Trinity easily giving the best level of performance. The HD 4000 however progresses quite markedly in this title, though you have to go for the quad core Ivy Bridges even to equal the performance levels of Llano! Note that while there’s a small DirectX 11 gain on Llano, this isn’t the case on Ivy Bridge or Trinity.
In Civilization V, we measured the graphics performance on a scene loaded at the end of a game. All the options were at minimum.
Trinity’s superiority is significant here, especially in single channel mode. Llano and Trinity are limited by their processors in this title. There’s a marked difference with the HD 4000s.
In Battlefield 3, we used the lowest setting.
Even on Trinity, which is a good deal better than its closest rival, the playability of this title is very limited on our mobile platforms, though we do appreciate the efforts made by the developer to make it functional.
Performance in games: Batman Arkham City, Crysis 2, Diablo 3
Batman Arkham City
We used the built-in benchmark with the options set at minimum.
Batman is relatively playable on mobile platforms with Trinity some way ahead of the rest. Note however that on this title, Ivy Bridge and its HD 4000s are on a par with the Llano IGP. The impact of using a machine equipped with just one memory module is very marked here.
In Crysis 2, we measured performance in a scene in ‘High’ mode, which is the lowest graphics mode available on this title! Marketing speak!
Trinity does best here and is the only machine on which the game is ‘playable’. Under 30 fps for Crysis 2 is beyond sluggish.
We finished our games tests with Diablo 3. The graphics engine is relatively modest and has been designed for mobile platforms. Here we used the minimum graphics mode (the Low FX option was also ticked) and we measured performance during a movement on a fixed part of the map. Because of the random nature of the fight scenes, it isn't possible to measure the heaviest load scenes precisely. In practice, we noted 20 to 30% dips in performance on the scores given.
While the AMD platforms were a good deal better than the Intel ones, Ivy Bridge and its HD 4000 didn't do badly. While Diablo 3 is playable on the HD 3000, complex fight scenes were notably slower.
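To put the 20 to 30% dips noted in Diablo 3 into practical terms, the small Python sketch below converts a measured score into the range you might expect during heavy fight scenes. The 40 fps input is a hypothetical value chosen for illustration, not one of our results, and the function name is likewise illustrative.

```python
def fps_range_under_dip(measured_fps: float,
                        dip_low: float = 0.20,
                        dip_high: float = 0.30) -> tuple[float, float]:
    """Return (worst, best) frame rates after applying the observed dip range."""
    worst = measured_fps * (1.0 - dip_high)
    best = measured_fps * (1.0 - dip_low)
    return worst, best


# Hypothetical score, for illustration only.
worst, best = fps_range_under_dip(40.0)
print(f"A 40 fps score suggests roughly {worst:.0f} to {best:.0f} fps in heavy fights")
```

A score that looks comfortable in the chart can therefore fall close to, or under, the 30 fps mark once the action picks up.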
This first overview of mobile platforms allowed us to survey the respective platform developments from the two manufacturers.
Inside Trinity, AMD has introduced many architectural changes. With the same manufacturing process and an increased number of transistors, AMD has succeeded in keeping the energy consumption of its chip down. The arrival of Piledriver has given a level of performance that is at least as high as previously, though at the cost of a higher clock in load. In spite of everything, the Piledriver architecture holds up relatively well during this type of exercise, which is to AMD’s credit. The impact resulting from the change in GPU architecture is variable depending on the games, but is on average quite marked. Forgetting about the competition for a second, with Trinity AMD has introduced some daring architectural changes, which have paid off.
Intel continues to dominate in terms of purely processor performance. Ivy Bridge has given a gain partly down to its slightly higher clocks but a simple dual core Sandy Bridge continues to dominate the AMD quad core offers in our processor tests. With Sandy Bridge, Intel’s quad cores were already in another world in terms of performance and energy consumption but the arrival of the 35 Watt quad core processors in the Ivy Bridge range (such as the Core i7 3612QM) has stretched this gap even more. The slight increase in idle energy consumption on the Ivy Bridge samples we tested is the only criticism we have of these chips. This is however relative when you think that the 35 and 45 Watt processors haven’t really been designed for machines supposed to be as mobile as possible (Intel and AMD have chips with a TDP of 17 and 25 Watts for this market).
Intel could be criticized for the delay in OpenCL and graphics drivers, an area where they are likely to be making more of an effort given that AMD is counting on highlighting the use of this API so as to try to make up the ground lost on the processor side. This requires the participation of developers who, while less so than before, remain reticent.
Real usage of these machines remains moreover an important question. With the A10-4600M, AMD is in practice offering the best platform for games of all those in this report. While Intel’s HD 4000 is indeed an advance on the Sandy Bridge HD 3000, it’s still quite some way down on the AMD graphics cores. In addition, the Trinity platform draws less power. A platform equipped with an additional graphics card would however give better results for those who really want to do some gaming, obviously at the price of reduced mobility and higher power consumption. This is what makes the AMD A10-4600M a viable solution within a certain fairly small niche. It remains to be seen if it will convince the OEMs and end users.
Copyright © 1997-2013 BeHardware. All rights reserved.