As with reviewing hard disks, every product reviewer has their own way of reviewing Solid State Drives and their own preferred testing methods. Some reviewers swear by simulated real world tests such as PCMark Vantage, some believe that synthetic tests are enough, while the more patient reviewers go to the trouble of test driving the SSD in the real world, such as by timing the installation of popular software products, measuring game launch times and so on.
Personally, like test driving a car, I believe that nothing beats testing SSDs in the real world with a real OS installation and timing everyday hard disk intensive tasks such as copying files, batch processing a set of images and so on, as in the end, the average user who purchases an SSD is not going to endlessly benchmark it. Instead, most SSD buyers want an SSD that performs well in real life and remains this way until they decide to upgrade in the future.
Just as with a car, synthetic tests do have a purpose, especially for enterprise users who intend to use an SSD in a database server or a terminal server running a virtualised desktop environment. In this case, synthetic and simulated real world tests give a good indication as to how the SSD will perform, especially when compared to the same tests carried out on other drives with the same testing hardware.
This article is broken up into the following sections:
Synthetic testing – An explanation of the various tests conducted by synthetic test software, such as read/write throughput, IOPS, queue depths and data compression.
Synthetic test software – A brief description of what tests are conducted by each popular synthetic test utility.
Simulated real world tests – The advantages and drawbacks of using testing software that runs IO traces recorded from a set of real world tests.
Real world tests – How useful real world tests are when conducted manually by stopwatch or by script. This page also goes into a bit of detail on TRIM testing and power consumption.
On this page we explain the different tests commonly carried out by benchmark software and give examples of what activities they may relate to in the real world.
Sustained read performance – This is the maximum read performance the drive delivers in the test, where all the data is read sequentially, similar to reading a story in a book from start to finish. In the real world, this is the equivalent of reading a single large file that is not in a fragmented state.
Sustained write performance – As with sustained read performance, but in this case it is the maximum write performance the drive delivers in the test, where all the data is stored sequentially, similar to writing a story in a book from start to finish. In the real world, this is the equivalent of writing a single large file.
Random read performance – In this type of test, blocks of data are read from random areas of the drive. On a hard disk with spinning platters, this is a very tedious process, as the drive head must seek each location before it can read the block. On an SSD, there are no moving parts, so in theory this can be as quick as sequential read performance.
In practice, the OS sends more commands to the SSD to read random blocks than a large sequential block of data, so with this extra overhead and time taken to process each read operation, random read performance is generally slower than sequential reading, but still a lot faster than random read performance of any hard disk. In the real world, this is the equivalent to reading lines of text from random pages of a book.
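The difference between the two access patterns can be sketched in a few lines of Python. This is only an illustration, not how any of the benchmark tools discussed later are implemented: the function names, the 8MB scratch file and the 4KB block size are assumptions made for the example. Because the operating system will most likely serve such a small, freshly written file from its RAM cache, the figures it produces say little about the drive itself; real benchmarks use much larger test files or bypass the cache.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # 4KB blocks, an assumption made for this sketch


def read_throughput(path, offsets, block=BLOCK):
    """Read one block at each offset and return the throughput in MB/s."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb", buffering=0) as f:
        for offset in offsets:
            f.seek(offset)
            total += len(f.read(block))
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)


def compare_patterns(size_mb=8, seed=0):
    """Create a scratch file, then read it sequentially and in random order."""
    size = size_mb * 1024 * 1024
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size))
        path = tmp.name
    try:
        sequential = list(range(0, size, BLOCK))
        shuffled = sequential[:]
        random.Random(seed).shuffle(shuffled)  # same blocks, random order
        return read_throughput(path, sequential), read_throughput(path, shuffled)
    finally:
        os.remove(path)


if __name__ == "__main__":
    seq, rnd = compare_patterns()
    print(f"sequential: {seq:.1f} MB/s, random: {rnd:.1f} MB/s")
```

Both runs read exactly the same blocks; only the order differs, which is the same distinction a benchmark draws between its sequential and random read tests.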
Random write performance – Unlike hard disks, SSDs behave very differently when writing data, especially when writing small blocks to random areas of the drive. When a traditional hard disk receives a write operation, it simply seeks the location and writes the data. With an SSD, however, each NAND cell can only be rewritten a limited number of times before it wears out, and this figure is as low as 3,000 write cycles with modern 25nm NAND.
To limit the number of rewrite operations per cell, an SSD implements a wear levelling algorithm such that when the same logical sector is rewritten, the data is stored in a physically different NAND cell than before. As each NAND block can hold as much as 256KB of data, when a 4KB block of data is written to a partially filled NAND block, the existing NAND block must be erased before the new data can be written. The existing data is therefore read first, the block is erased, and the existing data is written back to the NAND block along with the new 4KB block.
Each controller has its own method of dealing with writing, so random write performance varies heavily from SSD controller to controller. In the real world, this is the equivalent to replacing random lines of text in a book with new lines of text, where the old lines must be erased with correction fluid and allowed to dry first.
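To put a number on the read, erase and rewrite cycle described above, here is a small worked example using the 256KB block and 4KB write sizes mentioned earlier. The function name is made up for this sketch, and it models only the naive case where the whole block is rewritten for every small write. As noted above, each controller has its own method of dealing with writing, so real drives will not necessarily behave this way.

```python
def rewrite_cost(block_kb=256, write_kb=4):
    """Return (KB physically written, amplification factor) for one small
    write into a NAND block that must be erased and rewritten in full.

    Naive model for illustration only; real controllers behave differently.
    """
    if write_kb <= 0 or write_kb > block_kb:
        raise ValueError("write_kb must be between 1 and block_kb")
    physically_written = block_kb  # the whole block goes back to the NAND
    return physically_written, physically_written / write_kb


if __name__ == "__main__":
    written, factor = rewrite_cost()
    print(f"{written}KB written to store 4KB of new data ({factor:.0f}x)")
```

In this worst case the drive writes 256KB to store 4KB of new data, 64 times more than the amount requested, which helps explain why small random writes are so much harder on an SSD than large sequential ones.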
IOPS – An abbreviation for Input/Output Operations Per Second. Most drive specifications that show IOPS ratings give this figure as the number of 4KB blocks the SSD is capable of reading or writing per second. This figure is one of the most important synthetic test results for giving an idea of how an SSD will perform in the real world, since operating systems such as Windows 7 typically read and write data in 4KB blocks, especially with this being the default allocation unit size of the NTFS file system. In the real world, the read figure gives an idea of how quickly applications will launch, while the write figure gives a rough idea of how quickly software installations, Windows updates and batch file processing will perform.
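Since an IOPS rating is tied to a block size, it can be converted to and from a throughput figure with simple arithmetic. The helper names below are made up for this sketch, the 20,000 IOPS figure is a hypothetical value rather than a measurement, and 1MB is taken as 1,024KB; some tools and specification sheets use 1,000KB instead.

```python
def iops_to_mbps(iops, block_kb=4):
    """Convert an IOPS figure at a given block size to MB/s (1MB = 1,024KB)."""
    return iops * block_kb / 1024


def mbps_to_iops(mbps, block_kb=4):
    """Convert a throughput in MB/s back to IOPS at a given block size."""
    return mbps * 1024 / block_kb


if __name__ == "__main__":
    # Hypothetical drive rated at 20,000 random 4KB IOPS.
    print(iops_to_mbps(20000))  # 78.125
```

This is why a drive can post an impressive IOPS figure while its 4KB throughput in MB/s looks modest next to its sequential transfer rate.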
Queue Depth (# of Threads) – Drives that support Native Command Queuing are able to process multiple read and write operations simultaneously, where one IO operation does not need to complete before the next one can take place. With a hard disk, this allows the controller to organise a set of seek operations for the quickest read or write performance by minimising drive head movement. SSDs use this to make better use of the bus, since an SSD controller can transfer data with multiple NAND cells simultaneously, whereas a hard disk’s head can only be positioned at one place at any point in time. In the real world, an SSD with better results at high queue depths will perform better with multitasking and multithreaded batch processing.
Compression – With most SSDs and pretty much all hard disks, the type of data being written has a negligible effect on read/write performance, whether it is a series of the same byte (e.g. “00000000”) or random values (e.g. “p2&;nY.[”). The SandForce controller behaves differently to previous controllers in that it compresses data before storing it on the NAND cells. This means that data that can be easily compressed can potentially be written and read quicker than incompressible or difficult to compress data.
Some benchmarks give figures for compressible data, incompressible data and varying levels of compressibility. Tests with compressible data (data that has not yet been compressed) give an idea of the best case scenario, while tests with incompressible data (data that is already compressed) give an idea of the worst case scenario. We’ll mention which tests use which type of data. In the real world, compressible data is the equivalent of storing text documents and spreadsheets on the drive, while incompressible data is the equivalent of storing JPEG images. Most software and other types of data typically have a compression ratio of about 50%.
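The difference between the two kinds of test data is easy to demonstrate. The sketch below uses Python's zlib module purely as a stand-in, since the compression method used inside the SandForce controller is not described here. It compares a buffer filled with zeros against a buffer of random bytes, which are the two extremes benchmark tools typically offer.

```python
import os
import zlib


def compression_ratio(data):
    """Compressed size as a fraction of the original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    size = 1024 * 1024  # 1MB of test data
    zero_fill = bytes(size)        # a series of the same byte
    random_fill = os.urandom(size)  # random, incompressible values
    print(f"zero fill:   {compression_ratio(zero_fill):.4f}")
    print(f"random fill: {compression_ratio(random_fill):.4f}")
```

The zero-filled buffer shrinks to a tiny fraction of its size, while the random buffer does not shrink at all, so a controller that compresses data has far less to write in the first case than in the second.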
There is a wide range of synthetic test suites available, so I’m just going to give a quick guide to the most widely used tools:
CrystalDiskInfo – This is actually just a drive information tool, useful for showing the drive’s capabilities such as whether it supports Native Command Queuing (NCQ), TRIM and so on.
HD Tune (Pro) – A benchmark tool which bypasses the file system level. It started off as a freeware tool to show the sustained read performance across a hard disk’s platter, access performance, burst rate and CPU usage. For an SSD, the read performance graph usually remains fairly flat and gives an idea of what the SSD is capable of sustaining. The newer HD Tune Pro software, which must be purchased, can also carry out a wide range of transfer size tests, such as to show IOPS performance, average access time and throughput. As far as I’m aware, the data supplied by HD Tune Pro in the write tests is compressible, showing the best case scenario for SSD controllers that compress data.
HD Tach – Like HD Tune, this also bypasses the file system level and works at a low disk access level. This tool is available by request from Simpli Software and is currently free. It also shows sustained read and write performance, random access performance and how the drive compares with other drives. As far as I can tell, its sustained writing involves compressible data, again showing a best case scenario.
ATTO – This is a very popular benchmark that is run on a wide range of storage media, including external flash drives and hard disks. It works at the file system level and shows the drive’s read and write performance using different transfer sizes. The tool operates by default at a queue depth of 4, which is the equivalent of four threads or applications simultaneously accessing the drive. ATTO’s data is highly compressible, so it is commonly used to show the maximum read and write rates an SSD is capable of delivering in the best case scenario. Note that SSDs with large caches tend to show higher transfer rates than they are capable of sustaining.
CrystalDiskMark – This is another popular benchmark which also works at the file system level. It carries out sequential, random 512KB and random 4KB read and write operations. The 4KB test is also repeated at a queue depth of 32. Each test is run 5 times by default, with the best result shown when complete. By default, CrystalDiskMark uses random incompressible data, showing a worst case scenario for SSDs affected by data compressibility, such as where the SSD is used for storing JPEG images or other types of data that cannot be easily compressed. There is an option to run the test with 0 or 1 fill to show the best case scenario, which is useful if the SSD is used for storing compressible data, such as in a file server storing a large quantity of text documents and spreadsheets.
AS SSD – Unlike most other tests, this benchmark is specifically designed to benchmark SSDs. The benchmark consists of several tests, again covering sequential, random 4KB IOPS and threaded random 4KB IOPS. Each test is conducted over a longer period of time to overcome the drive’s cache, which can potentially produce better results in shorter tests. The benchmark uses random incompressible data in its tests, showing the worst case scenario.
Unlike most other benchmarks, AS SSD also delivers a score, which makes it easier to compare SSDs. In general, a higher score relates to a faster SSD in the real world. The tool can also run file read/write simulation tests and measure sequential transfer rates under various levels of data compressibility.
IOMeter – As its name suggests, this test is specifically designed to measure IOPS performance. Unlike other benchmarks, each test is individually run and can be heavily customised, such as with different block sizes, queue depths, multiple worker threads and a user set length of time.
Samples can be taken throughout the test, such as with a screen capture application, to show how the drive performs as the test runs over several minutes, especially in write testing. IOMeter is very good at discovering weaknesses, such as where an SSD shows a very good result in CrystalDiskMark but struggles in the real world, for example when faced with studio recording software that records a large number of simultaneous tracks over a period of 5 or more minutes.
Many reviewers run customised IOMeter tests to give an idea of how the drive would perform in a database server, web server or file server.
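When samples are taken throughout a long write test, a simple way to summarise them is to compare the end of the run with the start. The sketch below is one possible way of doing this and is not a feature of IOMeter itself; the function name, the window size and the sample figures are all made up for illustration.

```python
def sustained_ratio(samples, window=3):
    """Average of the last `window` samples divided by the average of the
    first `window` samples. A value well below 1.0 means the drive slowed
    down as the test went on."""
    if window <= 0 or len(samples) < 2 * window:
        raise ValueError("need at least two full windows of samples")
    head = sum(samples[:window]) / window
    tail = sum(samples[-window:]) / window
    return tail / head


if __name__ == "__main__":
    # Hypothetical write IOPS readings, one taken every 30 seconds.
    readings = [20000, 20000, 20000, 12000, 6000, 5000, 5000, 5000]
    print(sustained_ratio(readings))  # 0.25
```

A drive that produces readings like these would look excellent in a benchmark lasting under a minute, yet sustain only a quarter of that rate over a longer run.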
Most simulated real world testing software has a set of drive IO traces, where the software carries out the same IO operations that were recorded when the tests were originally conducted in the real world. Most reviewers prefer these tests to manually conducting real world tests for several reasons:
Hardware independence – Since each benchmark uses the exact same IO trace, this means the test results can potentially be compared between different PCs, unlike real world tests where the CPU, motherboard, amount of RAM, OS, drivers and so on can all impact the test results. Generally the only thing that matters with the simulation is that the SATA controller remains the same from test to test.
Scoring – This allows people to compare their results with other people running the same simulation software, such as the scores produced by PCMark Vantage.
Repeatability – Since the exact same IO trace is used in each test, there is no risk of the IO data changing from test to test. For example, if a reviewer runs a Windows start-up test and then has to repeat it, the second boot up time may be different if Windows has carried out some background disk optimisation (e.g. prefetch data) since the last boot.
Simulations do have a few drawbacks. For example, manufacturers can tweak SSDs to perform better with the traces carried out by popular testing software, where that performance may not be reflected in actual real world usage. Another problem is that the traces can be played with no gaps (e.g. CPU waiting time), and SSDs can behave quite differently when faced with a steady stream of IO operations than with intermittent IO operations.
For example, an SSD that has intermittent IO operations may perform better by using the idle time to carry out background housekeeping, but when faced with the IO operations by an IO trace with the idle gaps removed, it may struggle and appear to perform worse than competing SSDs.
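The effect of removing idle gaps can be illustrated with a toy trace player. This is a simplified sketch and not how any particular benchmark suite replays its traces: the trace format (a list of timestamp and operation pairs), the function name and its parameters are all assumptions, and a more faithful player would also subtract the time each IO operation took from the following gap.

```python
import time


def replay(trace, do_io, keep_gaps=True, sleep=time.sleep):
    """Replay a trace of (timestamp_in_seconds, operation) pairs.

    With keep_gaps=True the recorded idle time between operations is
    reproduced; with keep_gaps=False the operations are issued back to back.
    Returns the total idle time inserted, in seconds.
    """
    idle = 0.0
    previous = None
    for stamp, operation in trace:
        if keep_gaps and previous is not None:
            gap = stamp - previous
            if gap > 0:
                sleep(gap)  # idle time the SSD could use for housekeeping
                idle += gap
        do_io(operation)
        previous = stamp
    return idle


if __name__ == "__main__":
    # Hypothetical three-operation trace; print() stands in for real IO.
    trace = [(0.0, "write A"), (0.5, "read B"), (2.0, "write C")]
    print("idle seconds:", replay(trace, print, keep_gaps=True))
    print("idle seconds:", replay(trace, print, keep_gaps=False))
```

With the gaps kept, the drive in this example gets two seconds of idle time across three operations; with the gaps removed it gets none, which is the situation where an SSD that relies on background housekeeping may appear to perform worse.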
Stopwatch based real world testing
Real world tests are the best way to compare how an SSD will perform in the real world, as they give a clear guide as to what a given SSD is weak at or performs well at. For example, an SSD that boots the OS in seconds may struggle against a hard disk when faced with a software installation or update. Some SSDs may struggle with multitasking and so on.
Some reviewers include timings of Windows starting up, applications being launched and file copying. For the best tests, have a look for reviews that show timings of software installations as well as comparisons against a hard disk with spinning platters such as the WD Raptor, as these tasks can be more demanding than any synthetic test and often reveal the weaknesses of an SSD. At the time of writing, the only review websites I’m aware of that conduct software installation tests are MyCE, Legit Reviews and Hardware Heaven, mainly due to the time involved in preparing for each test, not to mention conducting it, such as timing the installation of Windows 7 Service Pack 1.
Of course there are drawbacks as well. For example, the timings are largely affected by the operating system, CPU speed, amount of RAM, drivers and so on, so results can only be compared between drives tested by the same reviewer on the same test PC. Most reviewers include a traditional hard disk in their tests, which can be used as a reference to give an idea of how much better the SSD performs.
Script based real world testing
Some websites have their real world tests set up by a script, where all the application launches, file copy timings, etc. are conducted and timed automatically by the script. The advantage here is that more simultaneous tasks can be carried out, such as to simulate a terminal server with a large number of users logged on. The timings are also more accurate, which reduces or eliminates the need to repeat tests to get an average or because the reviewer forgot to start or stop the stopwatch. The script can also carry out more awkward tests such as loading photos in editing software, carrying out tasks such as sharpening, cropping, etc. and saving them.
A drawback with script testing is that some important tests such as software installation and especially Windows updates and service pack installations cannot be carried out by script easily, thus some SSD reviewers carry out script tests in addition to stopwatch based testing.
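As a rough idea of what the timing part of such a script might look like, here is a minimal sketch. It is not taken from any review site's test suite: the function names are invented for the example and the file paths in the usage section are placeholders that would need to point at real files on the drive being tested.

```python
import shutil
import statistics
import time


def time_task(task, repeats=3):
    """Run `task` several times and return (mean, fastest, slowest) in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), min(timings), max(timings)


def make_copy_task(source, destination):
    """Return a zero-argument task that copies one file."""
    return lambda: shutil.copyfile(source, destination)


if __name__ == "__main__":
    # Placeholder paths; replace with a real large file on the test drive.
    copy = make_copy_task("large_test_file.bin", "large_test_file_copy.bin")
    mean, fastest, slowest = time_task(copy)
    print(f"mean {mean:.2f}s, fastest {fastest:.2f}s, slowest {slowest:.2f}s")
```

Repeated runs of a file copy are likely to be served partly from the operating system's cache, so a real test script would need to account for this, for example by rebooting or using fresh files between runs.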
Note that some websites which claim to run real world tests on a drive are simply playing back recorded IO traces from real world tests they conducted earlier. In this case, these tests are not real world tests, but a real world simulation as discussed on the previous page, where the SSD may behave differently depending on how the IO trace is played back.
TRIM testing – This test involves hammering the SSD with a lengthy period of random write operations such that its performance becomes crippled due to its free space all being used up. The reviewer then measures its performance, such as with AS SSD, CrystalDiskMark or IOMeter, to show how it compares with its clean, new state.
Next, the reviewer leaves the SSD idle over several hours (usually overnight) and finally repeats the tests to show how much the performance has recovered with the SSD’s background garbage collection. Basically, this gives an idea of the worst case scenario for the SSD if the user is carrying out an extensive amount of writing to the SSD, such as if the SSD is used in the file server of a busy photography or music studio.
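The three measurements described above (clean, degraded and after idle time) can be boiled down to a single recovery figure. The formula below is just one reasonable way of expressing it and is not a standard used by the benchmark tools; the function name and the example figures are hypothetical.

```python
def recovery_percent(clean, degraded, recovered):
    """Percentage of the lost performance regained after the idle period.

    All three arguments use the same unit, e.g. MB/s or IOPS.
    """
    lost = clean - degraded
    if lost <= 0:
        return 100.0  # nothing was lost, so there is nothing to recover
    return 100.0 * (recovered - degraded) / lost


if __name__ == "__main__":
    # Hypothetical write speeds in MB/s: new, after hammering, after idling.
    print(recovery_percent(clean=200, degraded=50, recovered=170))  # 80.0
```

In this example the drive lost 150MB/s of its write speed and regained 120MB/s of that overnight, a recovery of 80%.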
If the SSD is to be used in a laptop or netbook, it is useful to find out its power consumption to give an idea of how it will affect battery life. Most modern SSDs consume a small fraction of the energy a hard disk typically consumes, but this figure is still useful for laptop and netbook users looking to get every additional minute of battery time.
As SSDs can be quite unpredictable when faced with different types of data, such as threaded, random small blocks, compressible data and so on, try to have a look at as many reviews as possible before deciding on a specific SSD. This also helps find any weaknesses that a previous review may not have tested for. Real world testing, especially software installation timings, is probably the most important, especially for laptop and netbook users, as Windows and software (e.g. Java) updates can turn up at awkward times, and a fast SSD that does well in real world software installation tests will significantly reduce the amount of time lost waiting for these updates to take place.
For synthetic tests, look out for the following:
Sustained random write IOPS or MB/s – In general, an SSD that has a higher random write IOPS will perform better in the real world, especially when it comes to disk intensive write operations such as installing software updates.
AS SSD full benchmark – AS SSD conducts the random write test over a much longer period of time than most other benchmarks such as CrystalDiskMark, so while its results tend to be lower, they are more realistic as to how the SSD will perform with random writes over a longer period, such as when installing an office package or several Windows updates.
IOMeter with a 3+ minute write IOPS test – Some SSDs that do well with short benchmarks are easily bogged down when faced with a lengthy period of random write operations. An example of such a situation would be using the SSD in a music studio for recording a large number of audio tracks, where it is critical that the SSD can keep up.
For real world tests, look out for the following:
Game launching – Many modern games read a large amount of data when launching and when going from level to level. For a gamer, these timings are the main ones to look out for in the real world tests.
Office package installation – This gives an idea of the time it will take to install or upgrade a large software package, such as Microsoft Office, in comparison to installing it on a hard disk. Many SSDs struggle with large software installations and upgrades, and it is not surprising to see these timings come out worse than HDD timings. Many Windows updates can also be as disk intensive as a full office suite, such as a .NET Framework update, so the quicker these timings are, the quicker the SSD will likely perform with other tasks, from Windows, Java and browser updates to installing a service pack.
File copy – File copying and archive extraction are probably the two most tedious everyday tasks people carry out. This real world test will also give an idea as to how quickly the SSD will handle batch processing of files, such as a batch resize of photographs.