Saturday, March 12, 2011

How to write trace analysis scripts for QEMU

This post shows how to write a Python script that finds overlapping disk writes in a QEMU simple trace file.

Several trace backends, including SystemTap and LTTng Userspace Tracer, are supported by QEMU. The built in "simple" trace backend often gives the best bang for the buck because it does not require installing additional software and is easy to use for developers. See my earlier post for an overview of QEMU tracing.

The simple trace backend has recently been enhanced with a Python module for analyzing trace files. This makes it easy to write scripts that post-process trace files and extract useful information. Take this example from the commit that introduced the simpletrace module:

#!/usr/bin/env python
# Print virtqueue elements that were never returned to the guest.

import simpletrace

class VirtqueueRequestTracker(simpletrace.Analyzer):
    def __init__(self):
        self.elems = set()

    def virtqueue_pop(self, vq, elem, in_num, out_num):
        self.elems.add(elem)

    def virtqueue_fill(self, vq, elem, length, idx):
        self.elems.remove(elem)

    def end(self):
        for elem in self.elems:
            print hex(elem)

simpletrace.run(VirtqueueRequestTracker())

This script tracks virtqueue_pop and virtqueue_fill operations and prints out the elements that were popped but never filled back, which indicates elements have leaked.

The model of an analysis script is similar to awk. Trace records are processed from the input file by invoking methods on the user's simpletrace.Analyzer object. The analyzer object does not have to supply methods for all possible trace events, it can just implement those that it wants to know about. Trace events that have no dedicated method cause the catchall() method to be invoked, if provided.

Tracing disk write operations


Let's write a slightly fancier script that finds disk writes that overlap a given range. I've needed to perform disk write overlap queries in the past when debugging image formats, so this is a useful script to have. QEMU's block layer write function looks like this:

BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
                                  QEMUIOVector *iov, int nb_sectors,
                                  BlockDriverCompletionFunc *cb, void *opaque);

The block device is called bs and sectors are 512 bytes. Conveniently there is already a trace event for bdrv_aio_write so we just need to enable it in the trace-events file:

disable bdrv_aio_writev(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"

Open the trace-events file and remove the disable keyword from the bdrv_aio_write trace event. Then rebuild QEMU like this:

$ ./configure --enable-trace-backend=simple # ... plus your usual options
$ make

Next time you run QEMU a trace file named trace-$PID will be created in the current working directory. The file is in binary and can be parsed using the simpletrace Python module or pretty-printed like this:

$ scripts/simpletrace.py trace-events trace-12345 # replace "trace-12345" with the actual filename

Finding overlapping disk writes


Here is the usage information for a script to find disk writes that overlap a given range:

usage: find_overlapping_writes.py <trace-events> <trace-file> <bs> <sector-num> <nb-sectors>

The script only considers writes to a specific block device, bs. That means all disk I/O to other block devices is ignored.

For example, let's find writes to a 1 MB region at offset 2 MB of the BlockDriverState 0x29c8180:
$ ./find_overlapping_writes.py trace-events trace-19129 0x29c8180 4096 2048
0xe40+664 opaque=0x2eb4190
0x10d8+3032 opaque=0x2eb41d0

Here is the code:
#!/usr/bin/env python
import sys
import simpletrace

def intersects(a_start, a_len, b_start, b_len):
    return not (a_start + a_len <= b_start or \
                b_start + b_len <= a_start)

class OverlappingWritesAnalyzer(simpletrace.Analyzer):
    def __init__(self, bs, sector_num, nb_sectors):
        self.bs = bs
        self.sector_num = sector_num
        self.nb_sectors = nb_sectors

    def bdrv_aio_writev(self, bs, sector_num, nb_sectors, opaque):
        if bs != self.bs:
            return
        if intersects(self.sector_num, self.nb_sectors, sector_num, nb_sectors):
            print '%#x+%d opaque=%#x' % (sector_num, nb_sectors, opaque)

if len(sys.argv) != 6:
    sys.stderr.write('usage: %s <trace-events> <trace-file> <bs> <sector-num> <nb-sectors>\n' %
                     sys.argv[0])
    sys.exit(1)

trace_events, trace_file, bs, sector_num, nb_sectors = sys.argv[1:]
bs = int(bs, 0)
sector_num = int(sector_num, 0)
nb_sectors = int(nb_sectors, 0)

analyzer = OverlappingWritesAnalyzer(bs, sector_num, nb_sectors)
simpletrace.process(trace_events, trace_file, analyzer)

The core of the script is the OverlappingWritesAnalyzer that checks bdrv_aio_writev events for intersection with the given range.

This script is longer than the virtqueue leak detector example above because it parses command-line arguments. The simpletrace.run() function used by the leak detector handles the default trace events and trace file arguments for you. So scripts that take no special command-line arguments can use simpletrace.run(), which also prints usage information automatically. For overlapping writes we really need our own command-line arguments so the slightly lower-level simpletrace.process() function is used.

Where to find out more


There is more information about the simpletrace module in the doc comments, so the simplest way to get started is:
$ cd qemu
$ PYTHONPATH=scripts python
>>> import simpletrace
>>> help(simpletrace)

Another example of how to use the simpletrace module is the trace file pretty-printer which is included as part of the scripts/simpletrace.py source code itself!

Feel free to leave questions or comments about the simple trace backend and the simpletrace Python module.