Monday, 20 October 2014

KVM Forum 2014 videos are being posted

Videos of the talks at KVM Forum 2014 in Düsseldorf are being posted on the KVM Forum G+ page.

They are coming online one by one as the volunteers finish editing the videos.

In the meantime, here are the slides from my talks at KVM Forum 2014 and Tracing Summit 2014:

Monday, 24 February 2014

QEMU has been accepted for Google Summer of Code 2014!

Great news, the organizations for Google Summer of Code 2014 have been announced and QEMU is participating again this year!

If you are a student who is interested in a 12-week full-time project working on QEMU, KVM, or libvirt this summer, head over to our project ideas page.

Student applications are open from March 10th to March 21st. See the Google Summer of Code timeline for details. Also be sure to read the FAQs on the Summer of Code website to find out about eligibility, time requirements, and how the process works.

Sunday, 9 February 2014

Slides posted for "VIRTIO 1.0: Paravirtualized I/O for KVM and beyond"

I attended 2014 and gave a talk on VIRTIO 1.0, the standard for paravirtualized I/O devices supported by KVM and other virtualization software.

If you are looking for an overview of virtio, curious how paravirtualized I/O works, or thinking about designing custom devices, check out the slides:

The VIRTIO 1.0 standard refines the hardware specification so that implementors have a clear reference. Older versions of the specification were informal and did not undergo as much scrutiny. The presentation explains the key concepts: virtqueues, configuration space, the device status field, and feature negotiation. It should give you enough background to dive into the code or standard.
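
To make two of those concepts a little more concrete, here is a minimal sketch of feature negotiation and the device status field. The status bit values and VIRTIO_F_VERSION_1 (feature bit 32) are taken from the VIRTIO 1.0 specification. Everything else is an assumption for illustration: struct toy_device and toy_driver_init() are hypothetical names, the "device" is just a struct in memory rather than real transport registers, and the sketch skips the step where a real driver re-reads the status field to confirm that the device accepted FEATURES_OK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by VIRTIO 1.0 */
#define VIRTIO_STATUS_ACKNOWLEDGE  1u
#define VIRTIO_STATUS_DRIVER       2u
#define VIRTIO_STATUS_DRIVER_OK    4u
#define VIRTIO_STATUS_FEATURES_OK  8u
#define VIRTIO_STATUS_FAILED       128u

/* Feature bit 32 indicates compliance with VIRTIO 1.0 */
#define VIRTIO_F_VERSION_1 (1ULL << 32)

/* Hypothetical in-memory stand-in for a device's registers */
struct toy_device {
    uint64_t device_features; /* features offered by the device */
    uint64_t driver_features; /* features accepted by the driver */
    uint8_t status;           /* device status field */
};

/*
 * Hypothetical driver initialization: acknowledge the device, negotiate
 * features, then signal readiness. This toy driver insists on
 * VIRTIO_F_VERSION_1 and gives up on devices that do not offer it.
 */
static bool toy_driver_init(struct toy_device *dev, uint64_t supported)
{
    assert(dev);

    dev->status = 0; /* reset */
    dev->status |= VIRTIO_STATUS_ACKNOWLEDGE;
    dev->status |= VIRTIO_STATUS_DRIVER;

    /* Only features that both sides understand are enabled */
    dev->driver_features = dev->device_features & supported;

    if (!(dev->driver_features & VIRTIO_F_VERSION_1)) {
        dev->status |= VIRTIO_STATUS_FAILED;
        return false;
    }

    dev->status |= VIRTIO_STATUS_FEATURES_OK;
    /* virtqueue setup would happen here */
    dev->status |= VIRTIO_STATUS_DRIVER_OK;
    return true;
}
```

The key point is that the negotiated feature set is the intersection of what the device offers and what the driver understands, so old and new implementations can still work together.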

Monday, 6 January 2014

Coroutines in QEMU: The basics

Many developers struggle with coroutines the first time they encounter them in QEMU. In this blog post I explain what coroutines are and how to use them.

Callback hell in event-driven programs

QEMU is an event-driven program with a main loop that invokes callback functions when file descriptors or timers become ready. Callbacks become hard to manage when multiple steps are needed as part of a single high-level operation:

/* 3-step process written using callbacks */
void start(void)
{
    send("Hi, what's your name? ", step1);
}

void step1(void)
{
    read_line(step2);
}

void step2(const char *name)
{
    send("Hello, %s\n", name, step3);
}

void step3(void)
{
    /* done! */
}
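
The pseudocode glosses over how data travels from one step to the next. Here is a rough, compilable sketch of the same three-step flow that makes the marshalling explicit. All names are hypothetical and nothing here is QEMU code: schedule() and run_event_loop() form a toy one-slot event loop, and HelloState holds what would simply be local variables in sequential code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void Callback(void *opaque);

/* Hypothetical one-slot "event loop": the next callback to invoke */
static Callback *pending_cb;
static void *pending_opaque;

static void schedule(Callback *cb, void *opaque)
{
    assert(cb);
    pending_cb = cb;
    pending_opaque = opaque;
}

/* Invoke callbacks until none are pending; returns the iteration count */
static int run_event_loop(void)
{
    int iterations = 0;

    while (pending_cb) {
        Callback *cb = pending_cb;

        pending_cb = NULL;
        cb(pending_opaque);
        iterations++;
    }
    return iterations;
}

/* State that must be carried by hand from one callback to the next */
typedef struct {
    char name[32];
    char greeting[64];
} HelloState;

static void hello_step3(void *opaque)
{
    HelloState *s = opaque;

    snprintf(s->greeting, sizeof(s->greeting), "Hello, %s", s->name);
}

static void hello_step2(void *opaque)
{
    HelloState *s = opaque;

    /* Pretend the user typed a name */
    snprintf(s->name, sizeof(s->name), "%s", "world");
    schedule(hello_step3, s);
}

static void hello_step1(void *opaque)
{
    /* Pretend the prompt was sent; continue when the reply arrives */
    schedule(hello_step2, opaque);
}
```

Scheduling hello_step1 with a zeroed HelloState and calling run_event_loop() takes three iterations and leaves "Hello, world" in the greeting field. Every step has to cast the opaque pointer back and pass it on, which is exactly the bookkeeping that grows painful as operations get longer.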

"Callback hell" is the name for a confusing nest of callback functions which sometimes emerges in such programs. In QEMU we faced this challenge and looked for a solution to replace callbacks.

Instead of splitting logic across callbacks and manually marshalling data between them, we wanted to write sequential code even where event loop iterations are required:

/* 3-step process using coroutines */
void coroutine_fn say_hello(void)
{
    const char *name;

    co_send("Hi, what's your name? ");
    name = co_read_line();
    co_send("Hello, %s\n", name);
    /* done! */
}

The coroutine version is much easier to understand because the code is sequential. Under the hood, the coroutine version returns to the event loop at each step, just like the callback version. The code therefore still uses the event loop, but it can be written like a sequential program.

Coroutines as cooperative threads

Coroutines are cooperatively scheduled threads of control. There is no preemption timer that switches between coroutines periodically; instead, switching between coroutines is always explicit. A coroutine runs until it terminates or explicitly yields.

Cooperative scheduling makes writing coroutine code simpler than writing multi-threaded code. Only one coroutine executes at a time and it cannot be interrupted. In many cases this eliminates the need for locks since other coroutines cannot interfere while the current coroutine is running.

In other words, coroutines allow multiple tasks to be executed concurrently in a disciplined fashion.

The QEMU coroutine API

The coroutine API is documented in include/block/coroutine.h. The main functions are:

typedef void coroutine_fn CoroutineEntry(void *opaque);
Coroutine *qemu_coroutine_create(CoroutineEntry *entry);

When a new coroutine is started, it will begin executing the entry function. The caller can pass an opaque pointer to data needed by the coroutine. If you are familiar with multi-threaded programming, this interface is similar to pthread_create(3).

The new coroutine is executed by calling qemu_coroutine_enter():

void qemu_coroutine_enter(Coroutine *coroutine, void *opaque);

If the coroutine needs to wait for an event such as I/O completion or user input, it calls qemu_coroutine_yield():

void coroutine_fn qemu_coroutine_yield(void);

The yield function transfers control back to the qemu_coroutine_enter() caller. The coroutine can be re-entered at a later point in time by calling qemu_coroutine_enter(), for example, when an I/O request has completed.
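
To show how enter and yield hand control back and forth, here is a toy sketch built on POSIX ucontext (it assumes a platform that still provides makecontext() and swapcontext(), such as glibc on Linux). It is not QEMU's implementation: ToyCoroutine, toy_create(), toy_enter(), and toy_yield() are hypothetical stand-ins that only mirror the shape of the API above, and count_to_three() and toy_demo() are made up for the demonstration.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define TOY_STACK_SIZE (64 * 1024)

typedef void ToyEntry(void *opaque);

typedef struct ToyCoroutine {
    ucontext_t ctx;     /* saved context of the coroutine */
    ucontext_t caller;  /* saved context of the toy_enter() caller */
    ToyEntry *entry;
    void *opaque;
    char *stack;
    int done;           /* set once the entry function has returned */
} ToyCoroutine;

static ToyCoroutine *toy_current; /* coroutine that is running, if any */

static void toy_trampoline(void)
{
    ToyCoroutine *co = toy_current;

    co->entry(co->opaque);
    co->done = 1;
    /* Returning resumes co->caller because uc_link points at it */
}

static ToyCoroutine *toy_create(ToyEntry *entry)
{
    ToyCoroutine *co = calloc(1, sizeof(*co));

    assert(co);
    co->stack = malloc(TOY_STACK_SIZE);
    assert(co->stack);
    co->entry = entry;

    getcontext(&co->ctx);
    co->ctx.uc_stack.ss_sp = co->stack;
    co->ctx.uc_stack.ss_size = TOY_STACK_SIZE;
    co->ctx.uc_link = &co->caller;
    makecontext(&co->ctx, toy_trampoline, 0);
    return co;
}

/* Run the coroutine until it yields or terminates */
static void toy_enter(ToyCoroutine *co, void *opaque)
{
    ToyCoroutine *prev = toy_current;

    co->opaque = opaque;
    toy_current = co;
    swapcontext(&co->caller, &co->ctx);
    toy_current = prev;
}

/* Transfer control back to the toy_enter() caller */
static void toy_yield(void)
{
    ToyCoroutine *co = toy_current;

    swapcontext(&co->ctx, &co->caller);
}

/* Sequential-looking code that yields after every step */
static void count_to_three(void *opaque)
{
    int *progress = opaque;

    for (int i = 1; i <= 3; i++) {
        *progress = i;
        toy_yield(); /* wait for the next "event" */
    }
}

/* Keep re-entering until the coroutine terminates; returns entry count */
static int toy_demo(int *progress)
{
    ToyCoroutine *co = toy_create(count_to_three);
    int entries = 0;

    while (!co->done) {
        toy_enter(co, progress);
        entries++;
    }
    free(co->stack);
    free(co);
    return entries;
}
```

Calling toy_demo() enters the coroutine four times: three entries that each end in a yield, plus a final entry that lets the loop finish and the entry function return. The loop variable i survives across yields because it lives on the coroutine's own stack, which is what makes the sequential style possible.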


Coroutines make it possible to write sequential code that is actually executed across multiple iterations of the event loop. This is useful for code that needs to perform blocking I/O and would quickly become messy if split into a chain of callback functions. Transfer of control is always explicit using enter/yield, and there is no scheduler that automatically switches between coroutines.

QEMU provides additional primitives on top of the coroutine API for wait queues, mutexes, and timers. In a future blog post I will explain how to use these primitives.

Thursday, 19 December 2013

Distribute and provision disk images with virt-builder

I recently learnt about the virt-builder tool that was added in libguestfs 1.24. This is a really significant addition that makes publishing and using template disk images safe, quick, and efficient.

The best way to understand virt-builder is by looking at typical use cases.

Quick disk image creation from template images

For casual users there is a public repository of CentOS, Debian, and Ubuntu releases. Now you can create a Debian disk image with a single command. By the way, you don't need to be root:

$ virt-builder debian-7

Customization and configuration management

Whammo, you have a Debian 7 disk image. But folks who wish to customize the default image can use command-line options to add and delete files, create directories, install packages, and set up firstboot scripts. This makes virt-builder a great tool for bootstrapping your Puppet/Chef configuration management.

Publishing template images

Heavy duty users will publish their own disk images, maybe a library of images available to hosting customers or a private cloud environment. Not to mention virt-builder is a handy way for development teams to share template images. All this is possible using the cryptographically signed "index file" that catalogues the template images. Users can list and inspect images like this:

$ virt-builder --list
centos-6                 CentOS 6.5
debian-6                 Debian 6 (Squeeze)
debian-7                 Debian 7 (Wheezy)
fedora-18                Fedora® 18
fedora-19                Fedora® 19
fedora-20                Fedora® 20
scientificlinux-6        Scientific Linux 6.4
ubuntu-10.04             Ubuntu 10.04 (Lucid)
ubuntu-12.04             Ubuntu 12.04 (Precise)
ubuntu-13.10             Ubuntu 13.10 (Saucy)
$ virt-builder --notes fedora-20
Fedora 20.

This Fedora image contains only unmodified @Core group packages.

It is thus very minimal.  The kickstart and install script can be
found in the libguestfs source tree:


Fedora and the Infinity design logo are trademarks of Red Hat, Inc.
Source and further information is available from


virt-builder is a much-needed tool for consuming and publishing template VM images for KVM. It automates a lot of low-level commands normally used to deploy template images. Vagrant and Docker don't need to worry just yet but I think virt-builder is enough to satisfy anyone who is already working with virt-manager, virsh, and friends.

By the way, virt-builder is included in the libguestfs-tools Fedora package.

Saturday, 12 October 2013

Google Summer of Code 2013 has finished!

Google funded 9 students to contribute to QEMU, KVM, and libvirt during the summer of 2013. We had a successful Google Summer of Code that has now come to a close.

Osier Yang (mentor), Michael Roth (mentor), and I wrote a blog post that highlights two projects from this summer:

Gabriel Kerneis (mentor), Charlie Shepherd (student), and I also collaborated on a paper that describes the QEMU/CPC project that we had this summer. The paper is titled "QEMU/CPC: static analysis and CPS conversion for safe, portable, and efficient coroutines" and is available online. There is also a mailing list discussion of it.

Monday, 24 June 2013

virtio standardization has begun

The virtio paravirtualized I/O interfaces have been widely used in Linux and QEMU. Until now, Rusty Russell maintained an informal specification that the community worked from. He has now kicked off formal standardization through the OASIS standards body.

Follow virtio specification activity and participate on the VIRTIO Technical Committee page.

Today virtio devices include block (disk), SCSI, net (NIC), rng (random number generator), serial, and 9P (host<->guest filesystem). These devices can operate over PCI (used by x86 KVM), MMIO (used for ARM), and other transports.

I'm participating in the VIRTIO TC and hope this new level of virtio activity leads to even wider adoption of open source virtualized I/O devices.