Tuesday, March 29, 2011

How to use perf-probe

Dynamic tracepoints are an observability feature that allows functions
and even arbitrary lines of source code to be instrumented without recompiling
the program. Armed with a copy of the program's source code, tracepoints can
be placed anywhere at runtime and the value of variables can be dumped each
time execution passes the tracepoint. This is an extremely powerful technique
for instrumenting complex systems or code written by someone else.

I recently investigated issues with CD-ROM media change
inside a KVM guest. Using QEMU tracing I was able to
instrument the ATAPI CD-ROM emulation inside QEMU and observe the commands
being sent to the CD-ROM. I complemented this with perf-probe(1) to
instrument the sr and cdrom drivers in the Linux host and
guest kernels. The power of perf-probe(1) was key to understanding the
state of the CD-ROM drivers without recompiling a custom debugging kernel.

Overview of perf-probe(1)

The probe subcommand of perf(1) allows dynamic tracepoints
to be added or removed inside the Linux kernel
. Both core kernel and modules
can be instrumented.

Besides instrumenting locations in the code, a tracepoint can also fetch
values from local variables, globals, registers, the stack, or memory. You can
combine this with perf-record -g callgraph recording to get stack
traces when execution passes probes. This makes perf-probe(1)
extremely powerful.

Requirements

A word of warning: use a recent kernel and perf(1) tool. I was unable
to use function return probes with Fedora 14's 2.6.35-based kernel. It works
fine on with a 2.6.38 perf(1) tool. In general, I have found the
perf(1) tool to be somewhat unstable but this has improved a lot with
recent versions.

Plain function tracing can be done without installing kernel debuginfo
packages but arbitrary source line probes and variable dumping requires the
kernel debuginfo packages. On Debian-based distros the kernel debuginfo
packages are called linux-image-*-dbg and on Red Hat-based distros
they are called kernel-debuginfo.

Tracing function entry

The following command-line syntax adds a new function entry probe:
sudo perf probe <function-name> [<local-variable> ...]

Tracing function return

The following command-line syntax adds a new function return probe and dumps the return value:
sudo perf probe <funtion-name>%return '$retval'
Notice the arg1 return value in this example output for scsi_test_unit_ready():
$ sudo perf record -e probe:scsi_test_unit_ready -aR sleep 10
[...]
$ sudo perf script
     kworker/0:0-3383  [000]  3546.003235: scsi_test_unit_ready: (ffffffffa0208c6d <- ffffffffa0069306) arg1=8000002
 hald-addon-stor-3025  [001]  3546.004431: scsi_test_unit_ready: (ffffffffa0208c6d <- ffffffffa006a143) arg1=8000002
     kworker/0:0-3383  [000]  3548.004531: scsi_test_unit_ready: (ffffffffa0208c6d <- ffffffffa0069306) arg1=8000002
 hald-addon-stor-3025  [001]  3548.005531: scsi_test_unit_ready: (ffffffffa0208c6d <- ffffffffa006a143) arg1=8000002

Listing probes

Existing probes can be listed:
sudo perf probe -l

Removing probes

A probe can be deleted with:
sudo perf probe -d <probe-name<
All probes can be deleted with:
sudo perf probe -d '*'

Further information

See the perf-probe(1) man page for details on command-line syntax and additional features.

The underlying mechanism of perf-probe(1) is kprobe-based event tracing, which is documented here.

I hope that future Linux kernels will add perf-probe(1) for
userspace processes. Nowadays GDB might already include this feature in its
tracing commands but I haven't had a chance to try them out yet.