The information currently here is mostly Unix-specific, and fairly biased towards Linux - if you have further profiling hints, particularly for other platforms, please add them below.
The first thing to do is find out if the testcase is I/O or CPU bound. To do this, run it under time which will reveal this, and also whether a lot of time is spent in the kernel, for example:
$ time ./test-script real 0m27.863s user 0m4.514s sys 0m0.236s
The above example is I/O bound (the user+sys times are much less than the real time, so assuming the machine is otherwise idle, the difference is due to waiting for I/O).
If we're CPU bound, most profiling tools are suitable. If we're I/O bound, then using a profiling tool which doesn't just look at CPU time will produce much more useful data.
In I/O bound cases, if you're running on Linux, we recommend oprofile. This looks at the whole system, so we see time spent in the kernel, and doing I/O (and slow operations are often partly due to I/O). Also, oprofile has a low overhead so it doesn't change the situation we're trying to measure too much.
The most useful output (from oprofile or other tools) is usually a callgraph which shows how time splits down between functions and the functions they in turn call.
How to profile Xapian using perf
Recent Linux kernels (2.6.31 onwards) have a new profiling framework with a userland tool called "perf":http://perf.wiki.kernel.org/index.php/Main_Page
It seems to be very versatile, and likely to supplant oprofile, but we haven't yet tried it with Xapian. If you do, we encourage you to fill in any useful tips here.
How to profile Xapian using oprofile
What follows is a run through of how you might install and use oprofile to generate a callgraph.Prerequisites:
- A Linux box (ideally running a 2.6 series kernel for best results)
- An architecture where oprofile supports callgraphs (x86 and x86_64 seem to)
- root access
- If you're running virtualised under Xen, you apparently need to do special things to get oprofile working. We're not sure exactly what, but for some promising-sounding resources see: http://www.google.com/search?q=oprofile+xen
sudo apt-get install oprofileYou may also want to ensure you have debug symbols for libraries where time is spent so we can see which functions time is spent in. On Debian and derived distros, debug symbols are often packaged separately in a package named after the library package with -dbg appended. E.g. for libc6 debug symbols install libc6-dbg:
sudo apt-get install libc6-dbgDebian should be the same as Ubuntu for installing oprofile and debug symbols, but this part may vary for other distros.
From now on, the commands should work for any distro, provided you use sudo (except that in edgy, feisty, and Ubuntu gutsy's oprofile package has a bug and you'll need to use bash opcontrol in all the commands below to work around this).
Once you have it installed:
sudo opcontrol --init sudo opcontrol --no-vmlinux --callgraph=12 sudo opcontrol --start-daemon sudo opcontrol --start ; [run testcase] ; sudo opcontrol --stop opreport --callgraph=12 > oprofile-callgraph.txt sudo opcontrol --deinitThe output of opreport can be enormous, so please don't post reports to the mailing list as attachments. The best thing to do is open a bug report and attach the report there so that it doesn't get lost.
How to profile Xapian using sysprof
Sysprof is another Linux system-wide profiling tool. It is very easy to use (once installed), but provides less detail than oprofile (in particular, it doesn't provide access to the hardware counters).First, note that you will need the sysprof kernel module to be installed and loaded before sysprof will work. On Ubuntu, the sysprof module can be installed by running:
sudo apt-get install sysprof-module-source sudo module-assistant auto-install sysprof-moduleIt will then need to be loaded by running:
sudo modprobe sysprof-moduleInstall the sysprof application using:
sudo apt-get install sysprofWhen you want to profile, start the sysprof GUI application (under Applications>Programming, or by typing sysprof in a terminal window). To start profiling, click the "Start" button, then start the activity you wish to profile, and once this has run for a sufficient period of time, click "Profile" to stop the collection of information.
A list of functions, and the time spent in them and in their descendants is displayed in the sysprof window. You can click on a function to get a list of callers and callees, and click on a caller to change the view to that function. You can also sort the list by time-in-self or by accumulated time.
Profiling just CPU usage
The following profilers aren't so useful when there's a significant amount of waiting on I/O, but in mostly CPU-bound cases, they can be handy.gprof
gprof isn't as useful as some of the other profiling options, but it works on more platforms than some, and is easy to enable - just configure with:./configure CXXFLAGS=-pg LDFLAGS=-pgIf your testcase is a separate program, then you also should also pass -pg when you compile and link it.
Then run the testcase as normal, which will create a file gmon.out in the current directory with the raw profiling data in. To analyses this data, run:
gprof testcase
Which produces a report of stdout (so you probably want to redirect output to a file or a pager). valgrind's callgrind and cachegrind
These naturally only work on platforms which valgrind supports - see http://valgrind.org/ for an up-to-date list of these.No recompiling is required - just prefix your test command like so:
valgrind --tool=cachegrind ./testcase
(or --tool=callgrind for callgrind).