SSD 2009, act 2: OCZ Vertex and Indilinx Barefoot - BeHardware
Written by Marc Prieur
Published on April 22, 2009
Indilinx & OCZ Vertex
Set up in October 2006 by three former Samsung Electronics executives, Indilinx is a relatively new player. Its first project is the Barefoot, an SSD controller that is being used in more and more products, including the OCZ Vertex range.
Under development since 2006 and finalised in the first half of 2008, the Barefoot uses an ARM7 processor, like some other high-end controllers. It can use an external chip as cache memory and address both SLC and MLC NAND flash chips, giving read speeds of 210 MB/s and writes of 170 MB/s.
OCZ was the first to announce products based on the Indilinx Barefoot, back in December 2008. The first products didn’t come onto the market until February however and have been evolving as a result of firmware modifications.
At first the Vertex shipped with firmware 0112; then the new 1199 version quickly improved performance in mid-March. The only problem was that this firmware had a bug that could cause data loss under intensive use. Firmware 1275, available a week later, corrected this. At the beginning of April, the latest firmware, 1.1, added TRIM support. Updates should now come at a less hectic rate, which is no bad thing, as every update of a Barefoot SSD wipes its data: Indilinx needs to work on this!
The OCZ Vertex is available in four capacities: 30, 60, 120 and 250 GB. The announced speeds vary according to capacity: reads are 230, 230, 250 and 250 MB/s and writes 135, 135, 180 and 180 MB/s respectively. To get a performance overview we tested a 30 GB and a 120 GB version.
Opening a Vertex up reveals the Indilinx controller alongside a 64 MB Elpida SDRAM chip with a 32-bit bus operating at 166 MHz (theoretical bandwidth of 666 MB/s). For NAND flash, there are sixteen 8 GB MLC chips from Samsung (8 on either side of the PCB).
Performance deterioration & TRIM
Deterioration in performance
We wrote about this in our previous article on SSDs: a new SSD generally gives better performance than a used one. Let's begin with the wear-related slowdown that affects all SSDs when you write small blocks of data.
The memory cells in NAND flash chips are grouped into pages, and these pages make up blocks. Generally a page is 4 KB and a block 512 KB. NAND flash can be read or written page by page, but can only be erased a whole block at a time.
Another factor to take into account is that when the system deletes a file, the SSD is not in the loop: the operation takes place at the file system level. The SSD can therefore only learn that the data on a page is invalid when that page is written over!
The resulting phenomenon is quite simple: when you write a new 4 KB file to a page that has already been used, the SSD has to read the whole block, erase it, and then rewrite all the pages in the block. To write just 4 KB, then, it may have to read as much as 512 KB, then erase and rewrite 512 KB, which obviously has a negative impact on performance, particularly for random writes of small files.
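The read-erase-rewrite cost described above can be sketched in a few lines of Python. This is a deliberately simplified model using the 4 KB page / 512 KB block figures from the text (so 128 pages per block); the function and variable names are ours, not anything from a real controller:

```python
# Sketch of NAND write amplification: updating one 4 KB page inside an
# already-used 512 KB block forces a read-erase-rewrite of the whole block.
PAGE_SIZE = 4 * 1024                      # 4 KB per page (from the text)
PAGES_PER_BLOCK = 128                     # 512 KB / 4 KB
BLOCK_SIZE = PAGE_SIZE * PAGES_PER_BLOCK  # 512 KB per block

def bytes_moved_for_one_page_update(block_already_used: bool) -> dict:
    """Return how much data the controller must shuffle for a 4 KB write."""
    if not block_already_used:
        # Fresh block: the page can be programmed directly.
        return {"read": 0, "erased": 0, "written": PAGE_SIZE}
    # Used block: read it all, erase it, rewrite every page.
    return {"read": BLOCK_SIZE, "erased": BLOCK_SIZE, "written": BLOCK_SIZE}

fresh = bytes_moved_for_one_page_update(block_already_used=False)
used = bytes_moved_for_one_page_update(block_already_used=True)
print(fresh)   # only 4 KB written
print(used)    # 512 KB read, erased and rewritten for 4 KB of user data
print("write amplification: %dx" % (used["written"] // PAGE_SIZE))
```

The 128x figure in the worst case is exactly the 512 KB moved for 4 KB of useful data that the paragraph above describes.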
Another issue is how pages are addressed on flash memory, as the controller has to translate between the logical addresses (LBAs) used by the file system and the physical addresses of the corresponding flash pages. There is no fixed correspondence: as with hard drives, the flash controller allocates the pages as best it can so as to optimise performance (write combining) and extend the lifetime of the cells (wear levelling).
Above all it's aggressive write combining that can pose a problem. Write combining is a fairly simple technique that allows several writes to be combined into a single sequential burst: writing 32 pages at once to the same block is much faster than writing 1 page to each of 32 different blocks, and wears the SSD much less at the same time. The allocation table then has to be updated so that LBA 0 = Page 0, LBA 133 = Page 1, LBA 289 = Page 2 and so on.
The only problem is that sequential access against such an allocation table is severely slowed, as the data can no longer be spread across as many flash chips in parallel as needed. The SSD controller does try to reorganise the allocation table for this kind of access, but without much success once every page has been used at least once. Write combining itself is also undermined when there are no free pages left: without raw space to combine into, it can no longer function.
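A toy model of the allocation table that write combining produces, using the LBA numbers from the example above. The ToyFTL class is purely illustrative and not any real controller's design:

```python
# Sketch of write combining in a flash translation layer (FTL):
# scattered logical writes (LBAs 0, 133, 289, ...) are placed on
# consecutive physical pages, and a mapping table records LBA -> page.
class ToyFTL:
    def __init__(self):
        self.lba_to_page = {}     # logical address -> physical page
        self.next_free_page = 0   # next sequential page in the open block

    def write(self, lbas):
        """Combine several pending logical writes into one sequential burst."""
        for lba in lbas:
            self.lba_to_page[lba] = self.next_free_page
            self.next_free_page += 1

ftl = ToyFTL()
ftl.write([0, 133, 289])   # the LBA 0 / 133 / 289 example from the text
print(ftl.lba_to_page)     # {0: 0, 133: 1, 289: 2}
```

The burst itself is fast, but note the side effect the text describes: logically distant LBAs now sit on adjacent pages, so a later sequential read or rewrite of LBAs 0, 1, 2... hits pages scattered across the flash.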
Long live TRIM
This is where the TRIM function comes in. First mooted in 2007 by T13, the technical committee in charge of defining the ATA standard specs, it is a command that lets the system tell the storage device that an LBA contains data that is now invalid.
This may seem trivial, but it considerably simplifies things for SSD controllers when it comes to keeping performance levels up. There is no fixed implementation on the controller side, but logically speaking the entry in the LBA-to-flash-page allocation table is deleted.
Physically erasing each page at this point, which would mean erasing and rewriting the whole block it belongs to, would be completely counterproductive in terms of memory-chip wear; sometimes, though, a whole block containing only pages that are no longer in the allocation table can be erased.
In any case, knowing in advance which pages are free allows the controller to optimise writes, both sequential and random, and so avoid the problem faced once all cells have been used at least once.
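What TRIM changes can be sketched with the same kind of toy mapping table. This is illustrative only; as the text notes, there is no fixed implementation and real controllers handle it however they like:

```python
# Sketch of TRIM at the controller level: drop the LBA -> page mapping so
# those pages are known-free for future writes, with no read-modify-write.
mapping = {0: 0, 133: 1, 289: 2}   # LBA -> physical page (toy state)
free_pages = set()

def trim(lbas):
    """Host tells the drive these LBAs no longer hold valid data."""
    for lba in lbas:
        page = mapping.pop(lba, None)
        if page is not None:
            free_pages.add(page)   # reusable directly by write combining

trim([133, 289])                   # e.g. the file system freed these LBAs
print(mapping)                     # {0: 0}
print(sorted(free_pages))          # [1, 2]
```

Nothing is physically erased here; the controller simply regains the knowledge of which pages are free, which is exactly what lets it keep optimising writes.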
There are two conditions for using this function. First, the SSD has to support it, either at release or via a firmware update as with the OCZ Vertex; second, the operating system has to be able to issue the command. Linux can already do so; on the Windows side only Windows 7 will, and its beta versions don't yet.
Alternatively, OCZ offers a beta utility for 32-bit Windows that makes use of the SSD's TRIM function. When the utility is launched, it reads the file system's allocation table and uses the TRIM command to tell the SSD which LBA addresses are not in use. And hey presto!
Vertex wear in practice
Here is a practical example of the deterioration in performance of an SSD, using the 30 GB Vertex when the TRIM function is not used.
Case 1: Sequential performance – Sequential access only
First of all, here are the sequential read and write speeds (in KB/s) when we carry out only this type of access (one read, one write, one read, one write, one read and a final write, each time across the whole of the SSD, i.e. 30 GB):
There’s a slight fall in performance between a new and a used drive, read speeds dropping from 206 to 196 MB/s and writes from 159 to 145 MB/s. Not a huge difference all the same.
Case 2: Sequential performance – Random then sequential access
This time we started with a new drive (after HDDErase or a firmware flash) that is then "used" by writing 4 KB blocks randomly for 30 minutes.
Sequential read performance deteriorated from the first run. The second read, which took place after the first sequential write, gives even lower performance, which starts to climb back up from the third run. Average read speeds are 148, 93 and 116 MB/s, compared with a top score of 206 MB/s for a new SSD.
Write speeds are even worse, and this from the very first sequential writes attempted with an allocation table previously used for random writes: 25.6 MB/s at first, then 40.8 and 50 MB/s, the allocation table slowly and painfully reorganising itself.
Case 3: Random performance
Now here is an observation of the change in performance (in I/Os per second) during random 4 KB writes. We ran the test on a new SSD, 6 x 5 minutes, then a second time on an SSD whose cells had previously been filled sequentially.
The problem here is which value to quote: some say a Vertex handles 3270 I/Os per second, but even a new Vertex quickly falls well below this. On an SSD that has previously been filled, performance falls further still, but the curve then stabilises.
The TRIM utility
Let's see how the TRIM utility developed by Indilinx counters this phenomenon, which is pretty troubling, as it was with the Intel X25-M SSD. The software launches easily in Windows XP or 32-bit Vista, but isn't yet completely stable in 64-bit. Once launched it reads the file system allocation table and uses the TRIM command to tell the SSD which LBA addresses are not in use.
Hey presto! Sequential read and write and random write performances are back to their original levels. This is very good news, and things will only improve once the TRIM function is implemented directly in the operating system and works on the fly. In the meantime, Vertex owners who notice a significant drop in performance need only run the utility to get back to the original levels on the remaining free space.
The test
For this test, we compared these new SSDs to various other models:
- OCZ Core V2
- OCZ Apex
- Samsung PM410
- Samsung PS410
- Samsung PB22-J 64 GB
- Samsung PB22-J 256 GB
- Mtron MOBI 3500
The OCZ Core V2 represents the "best" of the JMicron JMF602/MLC combination, with the Apex a RAID version (two SSDs in one). The Samsung SLC and MLC SSDs are in a way the reference points of the previous generation, and the Mtron MOBI 3500 is an interesting alternative. The Samsung PB22-Js are the new generation of Samsung SSDs, here in two capacities as their specs differ. For information we also include the performance of a VelociRaptor, a 3.5" Samsung SpinPoint F1 640 GB drive and a 2.5" Samsung SpinPoint M5 160 GB.
Various measurements were taken in the course of this comparison. First we looked at the "synthetic" performance of these drives: average access time, sequential speeds, and I/Os in random and sequential access. Next came more practical tests, first an applicative performance index based on PC Mark Vantage and then an evaluation of reading and writing various groups of files. These groups were composed of:
- A collection of large files: 6 files (on average 2.2 GB) totalling 13.2 GB
- Medium-sized: 7.96 GB of 10,480 files (each averaging 796 KB)
- Small-sized: 2.86 GB of 68,184 files (each averaging 44 KB)
The source or target for reads from or writes to the drive was a RAID of three 150 GB VelociRaptors, replacing the RAID of two 150 GB Raptors used in previous tests. This type of measurement is worthwhile because, while sequential speed gives an idea of performance when copying large files, things can be different with smaller ones.
The test machine was based on an X38 chipset mounted on an ASUSTeK P5E motherboard while Serial ATA ports were configured in the bios in AHCI (Advanced Host Controller Interface) so that NCQ could be used, all operating with Vista SP1.
We also added new practical tests so as to provide more useful data for those wondering whether choosing an SSD over a standard hard drive is worthwhile. For this we timed various operations on another machine based on a P5QC, QX9770, GTX 280 and 2 x 2 GB of DDR2-1066.
Measured with h2benchw, SSD access times are truly astonishing. The Vertex range sets a new record for an MLC SSD at 0.12 ms, compared with 0.10 ms for the SLC-based MOBI 3500.
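As a rough sanity check, an average access time can be converted into a ceiling on single-queued random I/Os per second (1 / access time). This ignores transfer time and queuing, and the 7 ms hard-drive figure is a typical value we assume for comparison, not a measurement from this test:

```python
# Access time -> theoretical single-queued random IOPS ceiling.
# 0.12 ms and 0.10 ms come from the h2benchw results in the text;
# 7 ms is an assumed typical figure for a 7200 rpm hard drive.
drives = [("Vertex", 0.12), ("MOBI 3500", 0.10), ("typical HDD", 7.0)]
for name, access_ms in drives:
    iops_ceiling = 1000 / access_ms   # accesses per second at queue depth 1
    print("%s: %.2f ms -> about %d IOPS ceiling" % (name, access_ms, round(iops_ceiling)))
```

A 0.12 ms access time thus bounds the Vertex at roughly 8300 random accesses per second before transfer time is even counted, against around 140 for a hard drive, which is why access time dominates the random I/O results that follow.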
We continue with the speed, still with the help of h2benchw. In contrast to other benchmarking software such as HDTune or HDTach, h2benchw carries out a true sequential test because it reads or writes the entire drive, whereas the others jump between zones to reduce test time.
With read speeds of over 200 MB/s, the Vertex 30 and 120 GB versions move straight into top position. The results for write speeds are also very good, especially for the 120 GB Vertex. The 30 GB version is slightly slower, as its spec indicates, but 159 MB/s is not bad either.
Now on to IOMeter measurements of the number of accesses these storage systems can sustain. We use small 4 KB blocks and count how many I/Os the drive can handle each second, first with sequential and then with random access.
The SSDs show their strength in read, which of course results from very reduced access times. Whether access is sequential or random, performances are very good, while on standard hard drives, the movement of the read head slows performances down a lot. Note that the Vertex SSDs do very well, even though the MOBI 3500 outdoes them in random read.
In write, hard drives do better, an advantage largely due to their cache. For SSDs things are harder, especially as the test was carried out on a device whose pages had already been filled by sequential writes. SSDs based on JMicron chips (OCZ Core V2, OCZ Apex) are quite simply awful, with performance levels that make for far-from-fluid use of the machine (freezes and other lags). Another disappointment is the Samsung PB22-J, tested here in the 64 GB version, which also gives extremely limited performance: it should be sufficient to avoid the sort of lags seen on the JMicron SSDs, but it is nevertheless 2.3 times down on the previous-generation Samsung PM410!
With the Vertex SSDs performance is good, at around 450 I/Os per second on both versions tested and well beyond the levels of a standard hard drive: an excellent score in this problematic area for SSDs. Until now only Intel managed to sustain a high level of performance here, but with significant deterioration over time, something you can easily counter on a Vertex with the TRIM utility. What's more, as we saw in the section on Vertex performance over time, it is reasonable to expect over 3000 I/Os per second in random 4 KB writes once TRIM is integrated into the operating system.
PC Mark Vantage
We now move on to less synthetic tests, starting with the hard drive performance index in PC Mark Vantage. FutureMark reproduces a set of read/write tasks on the drive, namely a Vista start-up, application loading (Word, Photoshop, IE, Outlook), the manipulation of multimedia files (photo, video, music), games (loading Alan Wake) and a disk scan with Windows Defender.
Two things should be noted here: the first is that, strangely, the 64 GB version of the PB22-J gives slightly better performance than the 256 GB version. The second is that the Vertex SSDs are better still, and by some distance, with the advantage going to the 120 GB version.
Management of files
File copying
This brings us to file copying. We measure reading and writing of various groups of files. These groups were composed of:
- A collection of large files: 6 files (on average 2.2 GB) totalling 13.2 GB
- Medium-sized: 7.96 GB of 10,480 files (each averaging 796 KB)
- Small-sized: 2.86 GB of 68,184 files (each averaging 44 KB)
The source or target of reading or writing on the drive was a RAID of three VelociRaptor 150 GB drives.
In read, the latest-generation SSDs, Vertex included, are very fast with large files. With small or medium-sized files, however, performance is relatively close to the previous generation and to standard hard drives, with the exception of the 5400 rpm models.
In write the Vertex SSDs do well, especially on medium-sized files, where they are well ahead of the competition, with the exception of the Samsung 7200 rpm hard drives, which do astonishingly well. Note however that although the results are good for large files, they are nowhere near the synthetic speeds obtained in h2benchw (200 MB/s on the Vertex 120 GB!).
Practical tests
We have at last decided to add other practical tests to our SSD reviews, in particular so as to provide more useful data for those wondering whether choosing an SSD over a standard hard drive is worthwhile. For this we timed various operations on another machine based on a P5QC, QX9770, GTX 280 and 2 x 2 GB of DDR2-1066.
- Windows Vista start-up
We measure the time needed to start up a freshly installed Windows Vista with drivers. The measurement runs from the end of the BIOS (disappearance of the P5QC logo) to the full Windows desktop being displayed with the cursor no longer showing the hourglass.
- Installation of Service Pack 1
Here, the time needed to install Service Pack 1, with the installation file itself located on the SSD.
- Start-up of Windows Vista SP1 + Kaspersky + Word + Excel + Outlook + Photoshop
After installing Kaspersky Antivirus 2009, the Office suite and Photoshop CS4, we put shortcuts to Word, Excel, Outlook and Photoshop in the Startup directory of the Start menu. The time measured runs from the end of the BIOS to the end of the Photoshop CS4 launch.
- Loading of the "Train" level in Crysis Warhead
After installing and patching Crysis Warhead, we launch it with the -DEVMODE option and load the train level from the console with the "map train" command.
We weren’t able to carry out the tests on the 256 GB version of the Samsung PB22-J, as we no longer had this SSD.
There wasn’t a big difference between the SSDs in terms of starting up Vista. There truly is a big gap between standard hard drives, even a VelociRaptor, and the SSDs but not much difference between the SSDs themselves.
Installation of Vista Service Pack 1 does however show up some big differences. Although it isn’t very surprising to see the JMicron SSDs in last place, it is more so to see the MOBI 3500 so badly positioned. Next is the PB22-J behind the old generation Samsung PM410, while the SLC Samsung PS410 does well. The Vertex SSDs give the best installation time. Note that compared to standard hard drives, SSDs can be much slower here!
While simply starting up Vista didn't reveal much difference between SSDs, things are quite different when launching several heavy applications. In the lead is the Samsung PS410, followed by the two Vertexes and the MOBI 3500. The MLC Samsungs are a little behind, the new generation faster than the old, and at the back of the field are the OCZ Apex and Core V2. These last two are nevertheless slightly ahead of the VelociRaptor.
As you can see, going from an HDD to an SSD doesn't necessarily change your life in terms of loading a game level. During loading there are of course read operations on the storage device, but a lot of the time goes on the processor building the environment from that data, and this time can't be cut down. The gain is therefore slight.
Power consumption
All the SSDs tested here are completely silent. We therefore concentrated on measuring consumption at rest and while reading/writing with IOMeter. Here are the results:
As well as giving great performance the Vertex SSDs are very economical in load! What more could you ask?
Conclusion
OCZ has taken a big step forward with the Vertex range. Sure, it's Indilinx rather than OCZ that is the source of this high performance, but the monitoring OCZ carries out via its technical forums and the feedback it passes on to Indilinx have allowed the Vertex range to develop rapidly since release. These developments should of course eventually be available for all SSDs built the same way, such as the SuperTalent UltraDrive ME or G.Skill Falcon.
Relatively affordable because based on MLC memory, offering top-drawer performance across the board, consuming little energy and the first to demonstrate what the TRIM function will be able to deliver, the Vertex drives are close to perfect SSDs. The only drawbacks are that availability has been irregular since launch and that updating the firmware requires backing everything up, because the process deletes all data on the SSD.
Those who are wary of moving to an MLC SSD and prefer SLC, which lasts around 10 times as long in terms of write cycles, will be pleased to hear that some manufacturers offer Indilinx SLC SSDs (SuperTalent UltraDrive LE, OCZ Vertex EX): they come in at at least twice the price of the MLC versions, however, for a gain of around 10% in sequential access and 70-80% in random access.
Until now our preference went to the first-generation Samsung MLC 64 GB SSD, the PM410, then to the larger PB22-J, previously tested here in the 256 GB version. The 64 GB version was disappointing here, the PB22-J 64 GB performing worse in some respects than the PM410 64 GB! The Vertex now easily claims top spot.
Note that Intel has also updated the firmware for its X25-M, which is supposed to counter the significant performance dips that can occur in some circumstances. This is good news that we still have to verify, and it could allow the X25-M to fight back against the Vertex range. The Intels remain expensive, the cheapest version being the X25-M 80 GB, available at the same price as a Vertex 120 GB!
What next? We already know that Indilinx is working on a new controller, the Jet Stream. The first prototypes are planned for the second half of the year, with the manufacturer claiming speeds of 500 MB/s over a SATA 6 Gbps interface. Wowee!
Copyright © 1997-2013 BeHardware. All rights reserved.