Tuesday, April 12, 2011

Using LoadAvg for Performance Optimization

Linux and Unixes have excellent metric of system load called “loadavg”. In fact load average is is 3 numbers which correspond to “load average” calculated for one five and 15 minutes. It is computed as exponential moving average so most recent load have more weight in the value than old one.

What does Load Average corresponds to ? At least on Linux it is number of processes which are in “running” state or in “uninterruptable sleep” state which typically corresponds to disk IO. You can also map LoadAvg to VMSTAT output – it is something like moving average of sum of “r” and “b” columns from VMSTAT.

Obviously minimum value for LoadAvg is zero which corresponds to completely idle system, and there is no maximum :)

First thing to understand about LoadAvg it does not really tell you if it is CPU bound load or IO bound load. For example if you have LoadAvg of 10 it may mean there are 10 processes/threads actively consuming CPU or it could be same 10 processes waiting on disk IO and you can see CPU utilization being close to zero.

Second thing is to understand LoadAvg values are relative to your system size. If you have single CPU and 1 disk loadavg of 2 can be considered significant, while if you have 16 CPUs and 2 disks Load of 4 can be light if it is CPU bound – because the system can execute much more CPU bound tasks in parallel or High if it is Disk Bound LoadAvg.

Low Load Average does not mean there are no performance problems, for example if you run single batch job on the server with MySQL, Load Average is likely to be close to 1 even if there are a lot of CPUs and Disks – system may be quite idle and performance still poor because application is not parallel enough. Similar situations can happen if there is a lot of network IO involved or if there are a lot of locks (table/row level locks) or other limiting factors such as innodb_thread_concurrency.

The most interesting question I think is how LoadAvg represent box load in terms of how much load it can handle before it becomes to slow down or being completely unable to handle the load, and it is tricky question. Both for CPUs and for Disk there are two stages request can be. It can be ether currently executing or queued for further execution. The time which is needed to complete request is sum of time it was really executed and the time it was spent in queue. As the system is loaded response time starts to increase mainly because of time requests spend waiting in various queues and waiting on locks, the time of true execution may well remain constant. This is a bit of simplifications as there are number of other effects coming in play but good enough for sake of explanation.

What does it mean from LoadAvg standpoint ? You need to understand where parallel execution continues and where waiting in the queue starts. If you have fully CPU bound workload which is rather parallel (ie many queries will run at once) and you have 4CPUs until your LoadAvg is below 4 you have low time spend waiting for CPUs to be free to do the work. There is some wait but not much. So if you have LoadAvg of 1 and your workload scales linearly with number of connections and CPUs (ie there are no row waits involved) you can assume box can handle up to 3-4 times more load before response time starts to suffer.

If however the LoadAvg is 4 already it may take rather insignificant increase to take it up to 8 and you will see some delays due to queuing. If there are 4CPUs (Cores) and loadavg is 16 for CPU bound workload it often means requests should take 4 times more to complete than they would on idle box due to waiting in the queue.

Same true for pure Disk IO bound workload with small difference of disk not being replaceable (if you’re waiting on one drive you can’t use another drive instead), and the fact disks can optimize multiple outstanding requests a bit better compared to requests coming one after another.

For mixed workload, which is what we usually see in practice you have to do some assumptions guesses or further analyzes if you want good estimates. Ie you may want to check mpstat, vmstat and iostat to see where load comes from. But the general rule remains the same – until you’re able to explore parallel abilities of the box it will perform well as soon as you need to do a lot of queuing performance starts to suffer.

Let us clarify last point – how much more load the box can handle before it overloads, loadavg skyrockets and it becomes as good as down. First for many applications request inflow is not constant – ie web site gets poor response time and users do not spend so much time on it any more so load drops. This is however temporary relive only as there are stubborn users which would not go away even with slow responding site until their browsers timeout, which is as good as site is down. There are too many variables to come with exact numbers but generally as soon as you have long queuing started it may take just 10-20% extra load to overload system, so it is better to keep loadavg low – below number of CPUs and/or disks you have.

I must note – LoadAvg is not perfect tool for the task. It is just almost always available unlike other metrics. It is best to have profiling information so you can see as response time for your requests starts to grow. As soon as it becomes to grow with no good reason I would start to worry whatever LoadAvg shows.

Courtesy : http://www.mysqlperformanceblog.com/2006/12/04/using-loadavg-for-performance-optimization/