If you use a tool such as IOMeter or sqlio.exe (both freely downloadable) to benchmark the I/O performance of a disk, you would typically configure the tool to generate I/Os against a test file placed on that disk.
For instance, with sqlio.exe the test file is specified on the command line as follows:
sqlio -kR -t1 -s30 -dE -o32 -frandom -b8 -BH -LS testfile.dat

In this case, you are benchmarking 8K random reads against the testfile.dat file on the E drive. The sqlio.exe utility will issue the 8K read requests from a single thread and will maintain an I/O queue depth of 32.
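One detail the command line above glosses over is how testfile.dat comes to exist at a particular size in the first place. A common approach, sketched below using sqlio's -F parameter-file option, is to describe the file in a small parameter file and let a short sequential write pass create it before the read benchmark runs. The drive letter, file name, and 8GB size here are illustrative choices, not values from this article, and with -F the thread count is taken from the parameter file rather than from -t.

rem Parameter file line: file path, number of threads, CPU affinity mask, file size in MB.
echo E:\testfile.dat 1 0x0 8192 > param.txt

rem A short sequential write pass is commonly used to create and fill the test file at the requested size.
sqlio -kW -s10 -fsequential -o8 -b64 -BH -LS -Fparam.txt

rem The same 8K random read benchmark as above, now driven by the parameter file.
sqlio -kR -s30 -o32 -frandom -b8 -BH -LS -Fparam.txt

The last column of param.txt is where the test file size gets chosen.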
The question is: Does the size of the test file matter? If so, what size should you pick?
Benchmark results may lie
I was interested in comparing the 8K random read performance between the E drive and the F drive on one of my machines. So, I first ran the above command line, and that gave me the following results:

E drive:
IOs/sec: 2,681
MBs/sec: 21
Min_Latency (ms): 0
Avg_Latency (ms): 11
Max_Latency (ms): 310
I then changed the -d parameter value from E to F, re-ran the command line, and obtained the following results:

F drive:
IOs/sec: 19,871
MBs/sec: 155
Min_Latency (ms): 0
Avg_Latency (ms): 1
Max_Latency (ms): 16
From these results, it seemed more than obvious that the F drive was far more performant than the E drive, at least for 8K random reads. Unfortunately, that conclusion would be terribly far from the truth!
"Wait a minute and stop right there! Are you telling me that a drive that can do ~20,000 reads per second is not significantly better than a drive that does only 2600 reads a second?"
Well, that's not what I'm telling you. What I failed to disclose in the little comparison tests described above was that the test file on the F drive was 256MB in size, whereas the test file on the E drive was 128GB in size. It turns out that the size of the test file has such a dramatic impact on the resulting I/O metrics that I was literally comparing apples and oranges.
Running the same sqlio.exe command line against a 256MB test file on the E drive produced the following results:
IOs/sec: 16,635
MBs/sec: 130
Min_Latency (ms): 0
Avg_Latency (ms): 1
Max_Latency (ms): 29
The F drive still performed better than the E drive, but instead of being roughly eight times better, it was only about 20% better in I/Os per second.

The impact of test file size
To drive home the fact that the size of the test file can have an overwhelming impact on disk I/O benchmark results, I ran the following identical command line repeatedly with testfile.dat set to different sizes:

sqlio -kR -t1 -s30 -dE -o32 -frandom -b8 -BH -LS testfile.dat

The sizes of testfile.dat I tested included the following (a scripted sketch of this sweep follows the list):
- 256MB
- 512MB
- 2GB
- 8GB
- 32GB
- 64GB
- 128GB
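A scripted version of this sweep might look roughly like the batch file below. This is a sketch only: the drive letter, file names, results file, and the write pass used to re-create the test file at each size are my assumptions, not part of the original article, and pre-creating the larger files can itself take a long time.

@echo off
rem Sweep the same 8K random read benchmark across test file sizes, values in MB, 256MB to 128GB.
for %%S in (256 512 2048 8192 32768 65536 131072) do (
    rem Describe the test file to sqlio: path, thread count, CPU affinity mask, size in MB.
    > param.txt echo E:\testfile.dat 1 0x0 %%S
    rem Remove the previous file, then let a sequential write pass re-create it at the new size.
    del E:\testfile.dat 2>nul
    sqlio -kW -s10 -fsequential -o8 -b64 -BH -LS -Fparam.txt > nul
    rem The measurement itself: 8K random reads, queue depth 32, 30 seconds.
    echo ----- %%S MB ----- >> results.txt
    sqlio -kR -s30 -o32 -frandom -b8 -BH -LS -Fparam.txt >> results.txt
)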
For different disk drives, the shape of the curve (I/Os per second plotted against test file size) and its inflection point will differ. But make no mistake: it is meaningless to state a disk I/O metric without also disclosing the size of the test file.
Why size matters
Among the many factors that may render disk benchmark results sensitive to test file size, the following two are worth noting:

- Disk seek area
- Cache size
Disk seek area. A larger file generally occupies a larger area on the disk platter than a smaller file does. For random access, this means the disk arm has to cover more tracks, leading to a higher average seek time. Since seek time is often the most time-consuming component of processing an I/O request, average I/O performance is expected to drop as you increase the test file size.
Cache size. If an I/O request is satisfied with data already in the cache of the disk subsystem, seek time is no longer a factor, and the more I/O requests that are satisfied from the cache alone, the higher the average I/O performance. The smaller the test file is relative to the cache, the more likely the requested data will be found in the cache. It therefore follows that the smaller the test file, the higher the average I/O performance.
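To put rough, purely illustrative numbers on these two factors (none of the figures below are measurements from the tests above): if a random read has to move the disk arm, the mechanical work might cost on the order of 5 ms of seek time plus 3 ms of rotational latency, so a single spindle tops out at roughly 1 / 0.008 s ≈ 125 such reads per second. A read answered from the subsystem cache skips the mechanical work entirely and can complete in a fraction of a millisecond, which is how the same hardware can serve many thousands of reads per second as long as the requested data, in this case the test file, fits in that cache.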
What size to use?
This is a frequently asked question, but unfortunately there is no one-size-fits-all answer. If you read the literature on this topic, you will no doubt find that the most common recommendation is to use a test file that is larger than the cache of the disk subsystem being benchmarked.

This is generally a good recommendation, but not an absolute truth. It is generally good because, as a SQL Server user, you benchmark a disk to make better use of SQL Server for your application databases, and your application databases rarely fit entirely into the cache of your disk subsystem.
Then, why would you ever want to use a small test file? The most obvious reason is when you are dealing with databases that are small enough to be cached, and you want to get disk benchmark numbers that are relevant to these databases. There are also specific benchmarking scenarios in which a small test file is a must. Consider the case where you want to measure the read latency between the host OS and the I/O subsystem cache. You should configure your tests to eliminate the influence of the disk media access time, and a test file that is small enough to fit into the cache will do the trick.
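To make that last scenario concrete, here is a rough sketch; the file name, the 64MB size, and the warm-up step are illustrative assumptions rather than details from the article. The idea is to create a test file small enough to live entirely in the subsystem cache, read through it once to warm the cache, and then run a low-queue-depth random read test so the reported average latency approximates the host-to-cache round trip rather than media access time.

rem A 64MB test file (illustrative) that fits comfortably inside the subsystem cache.
echo E:\cachetest.dat 1 0x0 64 > param.txt
sqlio -kW -s10 -fsequential -o8 -b64 -BH -LS -Fparam.txt

rem Warm-up pass to pull the whole file into the cache.
sqlio -kR -s30 -o1 -frandom -b8 -BH -LS -Fparam.txt

rem With the file cached, Avg_Latency from this run reflects mostly the host-to-cache path.
sqlio -kR -s30 -o1 -frandom -b8 -BH -LS -Fparam.txt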
Courtesy: http://www.sqlteam.com/article/benchmarking-disk-io-performance-size-matters