Welcome back to this blog series on how to assess your disk performance and your needs. In the second part of the series we will discuss how to read and analyze the collected data using some sample data.
Analyzing the collection
Alright, so we’ve created our collectors, automated them to run during that key moment of the day (or night) when usage is heaviest, and we have the log. How do you analyze this data?
If you’ve read up on assessing performance, one metric that always comes up is latency. Latency is a nice indicator in that it quickly helps you determine whether you are having issues. However, it is only a symptom and provides little information about what your needs are.
These are some guidelines that you typically see around latency when talking about SQL Server disk performance for Dynamics AX 2012:
- Less than 8 milliseconds (ms): the disk subsystem is performing very well. For disks that hold transaction logs, consider less than 4 ms the optimal value, due to the write-ahead logging algorithm enforced by SQL Server.
- Between 9 and 12 ms: the disk subsystem is keeping up with demand but does not have much capacity left.
- Between 13 and 20 ms: the disk subsystem is starting to show poor performance.
- Greater than 20 ms: the disk subsystem is experiencing a serious I/O bottleneck.
This may allow you to quickly drill into a problematic time within a trace file.
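To make these bands easier to apply to a trace, here is a minimal Python sketch. The function names are mine, and since the guidelines leave small gaps (between 8 and 9 ms, and between 12 and 13 ms), I am assuming a value that falls in a gap belongs to the worse band. Keep in mind that perfmon reports the latency counters (Avg. Disk sec/Read and Avg. Disk sec/Write) in seconds, so they need converting first.

```python
def perfmon_seconds_to_ms(seconds: float) -> float:
    """Perfmon latency counters are expressed in seconds; convert to ms."""
    return seconds * 1000.0


def classify_latency(latency_ms: float, is_log_disk: bool = False) -> str:
    """Map an average latency in ms to the guideline bands listed above.

    Assumption: values in the gaps between the published bands
    (8-9 ms, 12-13 ms) are assigned to the worse band.
    """
    if latency_ms < 8:
        # Transaction log disks have a stricter optimum (under 4 ms).
        if is_log_disk and latency_ms >= 4:
            return "good, but above the 4 ms optimum for log disks"
        return "very good"
    if latency_ms <= 12:
        return "keeping up, little capacity left"
    if latency_ms <= 20:
        return "poor"
    return "serious I/O bottleneck"


# A perfmon reading of 0.15 is 150 ms.
print(classify_latency(perfmon_seconds_to_ms(0.15)))
```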
In the following trace example, we are analyzing what is happening between 12 and 12:30 AM on that server.
Some batch process is expected to kick in around 12:00 AM.
We can see that starting at 12:20 AM, latency starts to rise, mostly on the writes to the data partition of the SQL Server.
On the right-hand side, we can see several peaks between 0.1 and 0.2. Given the scale that means 100 to 200 ms latency. Clearly, we have an issue here.
How much of an issue? Well, that’s hard to tell. Let’s explain what latency means. Latency is how long it takes to read or write on the disk from the OS standpoint, from when the request is made until the action is actually performed. Latency can increase if some external factor is slowing down access to the disk. Maybe you are using a SAN and requests go through a controller that is under heavy stress and is taking time to process the demand. The likelihood is low, but it could happen.
A more common scenario is that you’ve simply reached the max capacity of your disks. The disks are going as fast as they can but cannot keep up with the demand. Requests get queued before they can be processed, so the time spent on a read or write is not just the physical time to perform the action but the time spent waiting plus the time to perform the action.
In this scenario you’ll typically see an increase in the “Current Disk Queue Length”, shown by the red line in the previous example.
This is a great way to find an interesting “spot” to analyze in the trace, but it is of little help if there are no such spikes, and it gives you no information about what your needs are or whether this really is an “issue”. If your latency increases, sure, it means that performance is not as good as it could be because you’ve reached the max your disks can handle. But at the end of the day this is only an issue if the process generating this load is taking too long or is slowing down other processes running at the same time. If this happens at a time when there is no user activity and the only thing running is a batch that has a 2-hour window but finishes in 1 hour, well… then everything is fine. But it’s definitely a time that requires additional investigation.
You could think that as a result there is not much use for this counter. However, it is important to look at it: if your peak IOPS or throughput occurs when there is latency and your system is having performance issues (i.e. that batch with a 2-hour window is running in 3 hours, or users are complaining about slowdowns at times that match the latency), it means that your disk is currently a bottleneck and your needs exceed what your disks can offer. In that case you cannot answer the question of how much IOPS or throughput you need any better than this:
I need more than I currently have. How much more? You will not be able to say.
However, if you don’t have any latency issues, you will be able to answer this question very precisely by looking at what happens when the peaks are reached.
Finally, if you have latency but performance is acceptable, well congratulations, you’ve hit that sweet spot of enough but not too much.
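The three cases above boil down to a small decision table. Here it is as a Python sketch; the function name and the wording of the conclusions are mine, and they only restate the reasoning above.

```python
def sizing_conclusion(latency_at_peak: bool, performance_acceptable: bool) -> str:
    """Say what the measured peak tells you about your disk requirements."""
    if not latency_at_peak:
        # The disks were never the limit, so the peak is the real demand.
        return "the measured peak is your requirement"
    if performance_acceptable:
        # The disks are capping out, but nothing is suffering from it.
        return "the measured peak is enough, with no headroom"
    # The disks are capping out and the workload is too slow.
    return "you need more than the measured peak, amount unknown"


print(sizing_conclusion(latency_at_peak=True, performance_acceptable=False))
```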
Assessing latency is also important because it can influence the block size SQL Server uses for those IO operations, which may skew the results when trying to determine your throughput requirements.
IOPS can very easily be found in the perfmon as there is actually a counter giving this information:
The perfmon also provides us with a breakdown between read and write operations.
Let’s drill down on our previous trace and zoom in on a specific smaller time frame:
The blue line here is the Disk Transfers/sec. We can see that it reached a maximum of 4219 IOPS.
Obviously this is exactly the sum of Disk Reads/sec + Disk Writes/sec.
Of course, in this specific case, I know that the limits of the disks are being reached because IOPS max out at the same times the latency and disk queue spikes occurred.
However, if there was no latency during those periods, I could look at the distribution of reads vs. writes and run DiskSpd with the same read/write distribution to see how much more IOPS the disks would be able to sustain. With no latency, and if this was the peak of the day, I could also say: found it, 4300 IOPS is the performance my disks need!
If the performance was “acceptable” I could say, 4300 IOPS will suffice. However, in this case, there were performance issues as this batch was taking longer than its allotted window, so all I can say is: 4300 IOPS is not enough! But I can’t tell you how much will be enough. The only way to answer this question would be to have disks with better performance and see how much we peak at.
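If you export the Disk Reads/sec and Disk Writes/sec samples, finding the peak and the read/write split at that peak is a few lines of Python. This is only a sketch: the function name is mine and the sample values are hypothetical, with only the 4219 peak taken from the trace above. The write percentage is the number you would pass to DiskSpd’s -w parameter.

```python
def peak_and_write_share(samples):
    """Return (peak IOPS, write percentage at that peak).

    samples: list of (reads_per_sec, writes_per_sec) pairs, one per
    perfmon sample (Disk Reads/sec and Disk Writes/sec).
    """
    reads, writes = max(samples, key=lambda s: s[0] + s[1])
    peak_iops = reads + writes  # same value as Disk Transfers/sec
    if peak_iops == 0:
        return 0, 0
    return peak_iops, round(100 * writes / peak_iops)


# Hypothetical samples; only the 4219 peak comes from the trace.
samples = [(300, 1200), (844, 3375), (500, 2000)]
peak_iops, write_pct = peak_and_write_share(samples)
print(peak_iops, write_pct)  # 4219 80
```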
This is great, we now have an answer, or at least a partial answer to that famous question how much IOPS do I need, though we’ve seen that in some cases we can’t really say for sure (which is still an answer).
Now consider this, if the provider of the disks told me, hey you can expect 4300 IOPS, then great, I just showed that I indeed reach this value and we’ve hit the cap…But what if my provider told me, your disks should have 5000 IOPS. Does that mean that something is wrong with the disks that were given to me as I only capped at 4300? Why am I not reaching this limit?
Here is something that you may already know: the IOPS limit is not the ONLY limit your disks can hit. Another very important limit is the throughput of the disks. Maybe you are not reaching your theoretical IOPS limit because THAT limit is hit first. This limit is often overlooked but is very important, as the max performance you can get out of your disks is determined by whichever of those 2 limits is reached first.
While IOPS tell you how many operations you can do per second, throughput tells you how much data you can physically read or write per second. You can view this like bandwidth … but for disks.
This is very important because the OS and application may use different block sizes to read and write data. Of course, whether 1 operation reads 1 KB or 64 KB of data makes the throughput very different.
Disks typically have a max speed they can reach. If using a SAN, your controller might have a max overall throughput it can handle. This means it’s possible to reach the max performance of your disks before you reach the IOPS cap, if your IO operations are big enough to reach the max throughput first.
The data collected with perfmon tell us the measured throughput:
Again, we have a breakdown per read (Disk Read Bytes/sec), write (Disk Write Bytes/sec), or total (Disk Bytes/sec), which is the sum of the first two.
Here is what we have in our example:
Given the number of "0" in that scale it’s a little hard to read, but fortunately we have the Maximum provided in the recap: 187 215 820.
This value is in bytes; converted into MB, it equals 178.5 MB/s.
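For reference, the conversion divides by 1024 twice (bytes to KB, then KB to MB). A quick Python check, with a helper name of my own choosing:

```python
def bytes_per_sec_to_mb_per_sec(bytes_per_sec: float) -> float:
    """Convert a Disk Bytes/sec reading to MB/s (1 MB = 1024 * 1024 bytes)."""
    return bytes_per_sec / (1024 * 1024)


peak_mb_s = bytes_per_sec_to_mb_per_sec(187_215_820)
print(f"{peak_mb_s:.1f} MB/s")  # 178.5 MB/s
```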
In theory this counter is nice to consider when you are looking at times when no latency was occurring. However, when there is a spike in latency, this value can be a bit misleading, because throughput tends to drop when latency occurs. Operations also happen very fast: remember we’re talking about over 4000 operations in 1 second, but we only collect this data once per second, so the sample we collected could be a peak, a low, or an average value.
Another way to calculate the throughput, which I recommend if you are looking at a time period that is experiencing latency, is to take the IOPS and the block size and simply multiply them together.
If we look at our collection, we have the following graph:
We can see that the average bytes/transfer drops significantly when IOPS increase and latency pops up.
However, before those events occur, we’re typically somewhere around 0.5. At this scale this translates to 50 KB transfers. At a rate of roughly 4200 IOPS, that equals 4200 * 50 KB / 1024 (to convert to MB) = 205 MB/sec. So, we are probably trying to achieve a throughput over 200 MB/sec (higher than the 178 MB/sec reported).
Again, if there were no latency involved, this calculation should fall close to what is reported in the throughput counter and block size should be a lot more consistent.
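The calculation above is simple enough to script if you want to repeat it over several time windows. A minimal Python sketch, using the numbers from this example (the function name is mine):

```python
def estimated_throughput_mb_s(iops: float, avg_transfer_kb: float) -> float:
    """Throughput the workload is trying to reach: IOPS x average block size."""
    return iops * avg_transfer_kb / 1024  # KB/s -> MB/s


estimated = estimated_throughput_mb_s(4200, 50)
reported = 178.5  # peak from the Disk Bytes/sec counter
print(f"estimated {estimated:.0f} MB/s, reported {reported} MB/s")
```

The gap between the two numbers is a hint that the counter sampled during a stretch where the disks were already throttling.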
So, this leaves us with the following requirements:
- 4300 IOPS
- 200 MB/sec throughput
If we were not hitting any caps (no queue and latency) or if performance was “acceptable” well then, we’ve answered the question of what are the disk requirements for this system.
As it stands, performance was not “acceptable” and all I can say is that the requirements for this system exceed these values, but I cannot say how much “more” is required.
The only way to answer this question would be to get better disks so we don’t hit the cap and measure again where we peak.
We can also see here that our block size is either 48 or 56 KB, and that we are doing a lot of writing during these peaks and a little bit of reading.
Measuring disk max performance
How can we verify if we’ve hit our disk max capacity? Am I really hitting caps on the disks or is there some other issue at play?
Remember DiskSPD? Well let’s use that here.
We could run it with the following parameters to match the activity we are seeing:
DiskSpd.exe -c50G -d300 -r -w80 -t8 -o8 -b56K -h -L D:\Diskspd-v2.0.17\testfile.dat > D:\Diskspd-v2.0.17\DT-RW-64k-Result.txt
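If you plan to run several variations of this test, it can help to build the command from the values observed in perfmon. Below is a small Python sketch; the helper name and its defaults are mine, and it only uses the flags already shown in the command above: -c (test file size), -d (duration in seconds), -r (random IO), -w (write percentage), -t (threads), -o (outstanding IOs per thread), -b (block size), -h (disable caching) and -L (collect latency statistics).

```python
def diskspd_args(target, write_pct, block_kb, file_size_gb=50,
                 duration_s=300, threads=8, outstanding=8):
    """Build a DiskSpd argument list from the workload seen in perfmon."""
    return [
        "DiskSpd.exe",
        f"-c{file_size_gb}G",   # size of the test file to create
        f"-d{duration_s}",      # test duration in seconds
        "-r",                   # random IO
        f"-w{write_pct}",       # percentage of writes
        f"-t{threads}",         # threads per target
        f"-o{outstanding}",     # outstanding IOs per thread
        f"-b{block_kb}K",       # block size
        "-h",                   # disable caching
        "-L",                   # collect latency statistics
        target,
    ]


# 80% writes at 56 KB, as observed during the peak.
print(" ".join(diskspd_args(r"D:\Diskspd-v2.0.17\testfile.dat", 80, 56)))
```

The list form can be passed to a process launcher as-is; redirecting the output to a result file is left to the caller.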
Unfortunately, I don’t have the results of this test, but I will use a different result I have available from a test machine and show you how to read the result file.
A similar test was run on a system using 64 KB blocks (because that’s the value that was relevant for them based on observation of the Avg. bytes per transfer) and using 60% read and 40% write (again, this was what was relevant based on looking at Disk Reads/sec and Disk Writes/sec).
The output file looks something like this:
There is a lot of information in the file before this point (not really necessary for the analysis) and a breakdown following this section differentiating read and write stats (obviously total is the sum of both).
In this test we can see that the disk can sustain 2500 IOPS at a throughput of roughly 160 MB/s. (There are 8 lines because the test ran with 8 threads; the stress on the disk is the sum of all of them.)
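Summing the per-thread lines is also a good opportunity for a sanity check: dividing the total throughput by the total IOPS should give you back the block size you asked for. A Python sketch of both steps follows. The function names are mine, and the per-thread values are hypothetical (8 identical threads, chosen only so the totals land on 2500 IOPS), not copied from the result file.

```python
def total_load(per_thread):
    """Sum per-thread (MB/s, IOPS) pairs into the overall load on the disk."""
    total_mb_s = sum(mb_s for mb_s, _ in per_thread)
    total_iops = sum(iops for _, iops in per_thread)
    return total_mb_s, total_iops


def implied_block_kb(total_mb_s, total_iops):
    """Average transfer size in KB implied by the totals."""
    return total_mb_s * 1024 / total_iops


# Hypothetical per-thread values, one pair per thread line.
per_thread = [(19.53125, 312.5)] * 8
total_mb_s, total_iops = total_load(per_thread)
print(total_mb_s, total_iops)                    # 156.25 2500.0
print(implied_block_kb(total_mb_s, total_iops))  # 64.0
```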
I won’t get into the detail of everything in that file, but this extract can easily be used to evaluate and answer some questions.
If you’re not experiencing any issues and your current usage is not capping out, this can help you say how much higher you could go.
If you are capping out, this can help you verify that you are indeed reaching the specs provided by your hardware vendor. If you are not hitting those marks, maybe your hardware has not been optimized the way it should be. A typical thing that I’ve seen in the past is that the SAN may be able to sustain, say, 5000 IOPS… but for 8 KB blocks. If you run the test with 8 KB blocks, you’ll reach those IOPS. But maybe it only has a throughput of 100 MB/sec. Run the test again with 64 KB blocks and suddenly you don’t go over 1600 IOPS. Why? Because 1600 IOPS * 64 KB / 1024 (let’s get MB) = tada, 100 MB/s.
Maybe your SAN needs to be re-optimized for 64 KB transfers instead of 8 KB.
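The “whichever limit is reached first” rule can be written down in a couple of lines. Here is a Python sketch using the SAN numbers from this example; the function name is mine, and real hardware will of course not behave as cleanly as a min() of two numbers.

```python
def max_iops_at_block_size(iops_limit, throughput_limit_mb_s, block_kb):
    """IOPS you can expect at a given block size when both caps apply."""
    # How many operations of this size fit in the throughput cap.
    throughput_bound_iops = throughput_limit_mb_s * 1024 / block_kb
    return min(iops_limit, throughput_bound_iops)


# SAN rated at 5000 IOPS with a 100 MB/sec throughput cap.
print(max_iops_at_block_size(5000, 100, 8))   # IOPS cap is hit first
print(max_iops_at_block_size(5000, 100, 64))  # throughput cap is hit first
```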
Keep in mind that the results of this test don’t tell you the max IOPS or max throughput your disk can reach, but the max you can obtain for operations of a certain type (in the above examples, 64 KB operations). One of the two variables could probably go higher but is being restricted by the other one capping out.
I would recommend running additional tests using much smaller and much bigger sizes (i.e. 8 KB and 256 KB). A smaller operation size gives you a higher chance of reaching the max IOPS of the disk (as throughput should not be an issue), while a larger one gives you a higher chance of reaching the max throughput (as reaching it should not take many IOPS).
Combining these tests should allow you to measure with certainty what the max IOPS and the max throughput of your disks are.
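Combining the runs comes down to taking the best IOPS and the best throughput seen across all block sizes. A Python sketch of that, with a function name of my own and hypothetical results that continue the SAN example (5000 IOPS at 8 KB, 1600 at 64 KB, 400 at 256 KB):

```python
def disk_limits(runs):
    """Derive (max IOPS, max MB/s) from several DiskSpd runs.

    runs: dict mapping block size in KB to the IOPS measured at that size.
    """
    max_iops = max(runs.values())
    max_mb_s = max(iops * block_kb / 1024 for block_kb, iops in runs.items())
    return max_iops, max_mb_s


# Hypothetical measurements, one per block size tested.
runs = {8: 5000, 64: 1600, 256: 400}
print(disk_limits(runs))  # (5000, 100.0)
```

If the two largest block sizes give the same MB/s, as they do here, that is a good sign you have found the real throughput cap.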
This concludes the second part of this series.
In this part, we discussed how to read the perfmon collection.
- How to analyze latency
- How to measure IOPS
- How to measure throughput (via counter or via calculation using IOPS and average block size)
We’ve also discussed how to use this data to run a relevant DiskSpd test and read the results of the test to determine the max IOPS and throughput your disks can reach.
In the next part of the series we will take the results of these investigations to assess which disks should be used in the context of Azure IaaS.
Premier Field Engineer