Saturday, June 4, 2011

Throttle I/O Rate: Limit disk I/O for rsync

Question: I run a backup script, /home/admon/backup.sh, which runs the rsync command. However, rsync generates a lot of disk I/O and network I/O, and I'd like to reduce both. I have a 10 Mbps server connection and a 160 GiB SATA hard disk. How do I reduce disk I/O so that the system doesn't die or become unresponsive?
Answer: This is a well-known issue. There are two methods to control or throttle the disk and network I/O rate under UNIX / Linux.

Method 1: Limit I/O bandwidth

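The --bwlimit option limits the I/O bandwidth rsync uses. The value is given in KBytes per second. Keep in mind that a 10 Mbps link carries at most about 1220 KB/s, so the value has to be below that to actually throttle the network. For example, to limit I/O bandwidth to 10000 KB/s (about 9.8 MB/s), enter: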
# rsync -a --delete --numeric-ids --relative --delete-excluded --bwlimit=10000 /path/to/source /path/to/dest/
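
A limit is easier to pick when it is derived from the link speed. The following is only a sketch: the function name bwlimit_for_link and its two parameters (link speed in Mbps, percentage of the link to give to the backup) are made up for illustration, and it assumes rsync reads the --bwlimit value as units of 1024 bytes per second.

```shell
#!/bin/sh
# Hypothetical helper: convert a link speed in Mbps into an rsync
# --bwlimit value (units of 1024 bytes per second), keeping only a
# percentage of the link for the backup.
bwlimit_for_link() {
    link_mbps=$1   # link speed in megabits per second
    percent=$2     # share of the link the backup may use (1-100)
    # Mbps -> bits/s -> bytes/s -> KB/s, then scale by the percentage.
    # Shell arithmetic is integer-only, so results are rounded down.
    echo $(( link_mbps * 1000000 / 8 / 1024 * percent / 100 ))
}

# Half of a 10 Mbps link:
bwlimit_for_link 10 50

# Possible use with the placeholder paths from above:
# rsync -a --bwlimit="$(bwlimit_for_link 10 50)" /path/to/source /path/to/dest/
```

For the 10 Mbps connection in the question, half of the link works out to 610 KB/s.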

Method 2: Take control of I/O bandwidth with ionice

The ionice command provides more control than the nice command. It sets the I/O scheduling class and priority for a program or script, which lets you decide how much disk time a job gets relative to everything else. It supports the following three scheduling classes (quoting from the man page):
* Idle : A program running with idle io priority will only get disk time when no other program has asked for disk io for a defined grace period. The impact of idle io processes on normal system activity should be zero. This scheduling class does not take a priority argument.
* Best effort : This is the default scheduling class for any process that hasn't asked for a specific io priority. Programs inherit the CPU nice setting for io priorities. This class takes a priority argument from 0-7, with lower number being higher priority. Programs running at the same best effort priority are served in a round-robin fashion. This class is usually recommended for most applications.
* Real time : The RT scheduling class is given first access to the disk, regardless of what else is going on in the system. Thus the RT class needs to be used with some care, as it can starve other processes. As with the best effort class, 8 priority levels are defined denoting how big a time slice a given process will receive on each scheduling window. This class should be avoided on any heavily loaded system.
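
On the command line these classes are selected by number with the -c option: 1 for real time, 2 for best effort and 3 for idle. As a small sketch of that mapping, the function below (ionice_class_number is a made-up name, not part of ionice) turns a class name into the number to pass to -c:

```shell
#!/bin/sh
# Hypothetical helper: map a scheduling class name to the number
# that ionice expects after -c.
ionice_class_number() {
    case "$1" in
        realtime)    echo 1 ;;
        best-effort) echo 2 ;;
        idle)        echo 3 ;;
        *)           echo "unknown class: $1" >&2; return 1 ;;
    esac
}

# Possible use (PID 1004 is just an example):
# ionice -c"$(ionice_class_number idle)" -p 1004
```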

How do I use the ionice command?

To display the class and priority of the running process, enter:
# ionice -p {PID}
# ionice -p 1

Sample output: none: prio 0
Run a full web server disk / MySQL backup using the best-effort scheduling class (2) at priority 7:
# /usr/bin/ionice -c2 -n7 /root/scripts/nas.backup.full
Open another terminal and watch disk I/O and network stats using atop, top, or your favorite monitoring tool:
# atop
Sample cronjob:
@weekly /usr/bin/ionice -c2 -n7 /root/scripts/nas.backup.full >/dev/null 2>&1
To set the process with PID 1004 to the idle I/O class, enter:
# ionice -c3 -p 1004
To run the rsync.sh script as a best-effort program with the highest priority, enter:
# ionice -c2 -n0 /path/to/rsync.sh
Finally, you can combine nice and ionice to lower both CPU and I/O priority:
# nice -n 19 ionice -c2 -n7 /path/to/shell.script
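
If the same script has to run on machines where ionice may not be installed, the combination can be wrapped in a small function. This is a minimal sketch, not a finished backup script: run_throttled is a hypothetical name, and the fallback to plain nice is an assumption about what you want when ionice is missing.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command at the lowest CPU priority and,
# when ionice is available, at the lowest best-effort I/O priority.
run_throttled() {
    if command -v ionice >/dev/null 2>&1; then
        nice -n 19 ionice -c2 -n7 "$@"
    else
        # ionice is not installed: fall back to CPU niceness only
        nice -n 19 "$@"
    fi
}

# Possible use, combining both methods (placeholder paths and limit):
# run_throttled rsync -a --bwlimit=610 /path/to/source /path/to/dest/
```

The wrapper passes its arguments through unchanged and returns the exit status of the wrapped command, so it can be dropped in front of an existing rsync line.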

http://planet.admon.org/throttle-io-rate-limit-disk-io-for-rsync/