Sunday, 1 February 2015

Slides posted for "Observability in KVM: Troubleshooting virtual machines"

In my FOSDEM 2015 talk on Observability in KVM, I covered the basic tools and troubleshooting techniques for CPU, networking, and disk I/O problems in virtual machines.

My slides are now available here (PDF).

If you would like to learn the basics or get new ideas for troubleshooting with KVM, check them out.

Enjoy!

Wednesday, 24 December 2014

QEMU Advent Calendar 2014 retrospective

This year I ran http://qemu-advent-calendar.org/, an online advent calendar that features a QEMU disk image for download each day from December 1st to 24th.

Pitching the idea

The idea for a QEMU advent calendar is something I had in 2012 or 2013 but there is only one chance to do it per year and I missed the boat previously.  This year the stars aligned: at KVM Forum/LinuxCon Europe I was able to pitch the idea to people who I thought might be game.

When I saw the reactions from people in the QEMU community on hearing the idea, I thought it had a chance.  Most people were amused and found it slightly weird, but they were positive and had ideas for disk images.

So I had a sense that I could collect disk image contributions from enough people to make the advent calendar work...

How it worked

Each advent calendar entry consists of a tarball with a disk image and "run" shell script, a brief description of the disk image, a screenshot, and a sources tarball (for GPL compliance).

Going into this I didn't demand a specific format of these artifacts from contributors.  Some people sent me a bare disk image and QEMU command-line to launch the thing.  Then I had to come up with the remaining artifacts and create the tarballs.

Digging up the GPL sources for various Linux distributions was time-consuming, but I worked hard on this after someone asked for the actual sources (not just a link or the name/version of the distribution).

This process could have been much easier if I had asked each contributor to follow a checklist and provide artifacts in a specific format.  Instead, I scrambled to put polish on contributions in various states of completeness.

Just-in-time calendar making

I launched the advent calendar with promises for around 10 disk images from potential contributors.  We needed 24 disk images so there was still quite a bit of ground to cover.

The risk was worth it because once the website went live, new contributions started to pour in.  The idea spread successfully on Google+, Hacker News, Reddit, and other communities so that additional people became inspired to recommend or build full disk images from scratch.

There were one or two days where a late cancelation or schedule slip meant someone who had promised an image couldn't deliver.  In those cases I had a list of half-baked ideas that I chose from, and I would scramble to put together an image in about 2 hours.

Companies contributed too

As the word spread about QEMU Advent Calendar 2014, I got emails from companies that wanted to contribute disk images.  These were the Ubuntu Core and Pebble smartwatch disk images.

These images fit the scope of the calendar nicely and were "exclusive" in some form.  Both the Ubuntu Core and Pebble smartwatch images were brand new releases that had never seen the light of day before.  It was cool to feature not just nostalgic emulated software on the calendar but also cutting-edge products that are being developed right now with QEMU.

Canonical and Pebble were very proactive here but also tasteful.  They didn't try to push crass advertising; instead they had something appropriate to contribute.  It was easy to accept their contributions since they were in the spirit of the project.  (The whole calendar was ad-free and neither I nor the contributors made money from it.)

The impact

I wanted to do QEMU Advent Calendar 2014 for two reasons:
1. To spread the word about QEMU and cool open source software
2. To celebrate the QEMU community with a fun activity

Here we are, 480 GB of web traffic later.  41,000 unique visitors and over 1,000,000 hits!

(These numbers don't include all of Day 24 because I collected the statistics and wrote up this post before the day was over.)

Top disk image by downloads: Day 1 - Slacker's time travel by Gerd Hoffmann.  Congratulations Gerd!

I'm very happy with the way things went.  The goals have been achieved!

Thank you for all the fun!

Thanks to everyone who contributed disk images.  There were a few disk images which we couldn't fit on the calendar for various reasons (file size too large, demo not quite working, etc).  All of them were appreciated though!

Special thanks to Alex Bennee for providing web traffic allowance way beyond my server's monthly quota.  We didn't know if this thing would take off but he monitored the situation and allowed it to stay online.

Happy holidays and New Year 2014/2015!

Monday, 20 October 2014

KVM Forum 2014 videos are being posted

Videos of the talks at KVM Forum 2014 in Düsseldorf are being posted on the KVM Forum G+ page.

They are coming online one-by-one as the volunteers finish editing the videos.

In the meantime, here are the slides from my talks at KVM Forum 2014 and Tracing Summit 2014:

Monday, 24 February 2014

QEMU has been accepted for Google Summer of Code 2014!

Great news: the organizations for Google Summer of Code 2014 have been announced and QEMU is participating again this year!

If you are a student who is interested in a 12-week full-time project working on QEMU, KVM, or libvirt this summer, head over to our project ideas page.

Student applications are open from March 10th to March 21st. See the Google Summer of Code timeline for details. Also be sure to read the FAQs on the Summer of Code website to find out about eligibility, time requirements, and how the process works.

Sunday, 9 February 2014

Slides posted for "VIRTIO 1.0: Paravirtualized I/O for KVM and beyond"

I attended devconf.cz 2014 and gave a talk on VIRTIO 1.0, the standard for paravirtualized I/O devices supported by KVM and other virtualization software.

If you are looking for an overview of virtio, curious how paravirtualized I/O works, or thinking about designing custom devices, check out the slides:

http://vmsplice.net/~stefan/virtio-devconf-2014.pdf

The VIRTIO 1.0 standard refines the hardware specification so that implementors have a clear reference. Older versions of the specification were informal and did not undergo as much scrutiny. The presentation explains the key concepts: virtqueues, configuration space, the device status field, and feature negotiation. It should give you enough background to dive into the code or standard.

Monday, 6 January 2014

Coroutines in QEMU: The basics

Many developers struggle with coroutines the first time they encounter them in QEMU. In this blog post I explain what coroutines are and how to use them.

Callback hell in event-driven programs

QEMU is an event-driven program with a main loop that invokes callback functions when file descriptors or timers become ready. Callbacks become hard to manage when multiple steps are needed as part of a single high-level operation:

/* 3-step process written using callbacks */
void start(void)
{
    send("Hi, what's your name? ", step1);
}

void step1(void)
{
    read_line(step2);
}

void step2(const char *name)
{
    send("Hello, %s\n", name, step3);
}

void step3(void)
{
    /* done! */
}

"Callback hell" is the name for a confusing nest of callback functions which sometimes emerges in such programs. In QEMU we faced this challenge and looked for a solution to replace callbacks.

Instead of splitting logic across callbacks and manually marshalling data between them, we wanted to write sequential code even where event loop iterations are required:

/* 3-step process using coroutines */
void coroutine_fn say_hello(void)
{
    const char *name;

    co_send("Hi, what's your name? ");
    name = co_read_line();
    co_send("Hello, %s\n", name);
    /* done! */
}

The coroutine version is much easier to understand because the code is sequential. Under the hood the coroutine version returns to the event loop just like the callback version. Therefore the code still uses the event loop but it can be written like a sequential program.

Coroutines as cooperative threads

Coroutines are cooperatively scheduled threads of control. There is no preemption timer that switches between coroutines periodically; instead, switching between coroutines is always explicit. Coroutines run until termination or an explicit yield.

Cooperative scheduling makes writing coroutine code simpler than writing multi-threaded code. Only one coroutine executes at a time and it cannot be interrupted. In many cases this eliminates the need for locks since other coroutines cannot interfere while the current coroutine is running.

In other words, coroutines allow multiple tasks to be executed concurrently in a disciplined fashion.

The QEMU coroutine API

The coroutine API is documented in include/block/coroutine.h. The main functions are:

typedef void coroutine_fn CoroutineEntry(void *opaque);
Coroutine *qemu_coroutine_create(CoroutineEntry *entry);

When a new coroutine is started, it will begin executing the entry function. The caller can pass an opaque pointer to data needed by the coroutine when entering it. If you are familiar with multi-threaded programming, this interface is similar to pthread_create(3).

The new coroutine is executed by calling qemu_coroutine_enter():

void qemu_coroutine_enter(Coroutine *coroutine, void *opaque);

If the coroutine needs to wait for an event such as I/O completion or user input, it calls qemu_coroutine_yield():

void coroutine_fn qemu_coroutine_yield(void);

The yield function transfers control back to the qemu_coroutine_enter() caller. The coroutine can be re-entered at a later point in time by calling qemu_coroutine_enter(), for example, when an I/O request has completed.

Conclusion

Coroutines make it possible to write sequential code that is actually executed across multiple iterations of the event loop. This is useful for code that needs to perform blocking I/O and would quickly become messy if split into a chain of callback functions. Transfer of control is always explicit using enter/yield, and there is no scheduler that automatically switches between coroutines.

QEMU provides additional primitives on top of the coroutine API for wait queues, mutexes, and timers. In a future blog post I will explain how to use these primitives.

Thursday, 19 December 2013

Distribute and provision disk images with virt-builder

I recently learnt about the virt-builder tool that was added in libguestfs 1.24. This is a really significant addition that makes publishing and using template disk images safe, quick, and efficient.

The best way to understand virt-builder is by looking at typical use cases.

Quick disk image creation from template images

For casual users there is a public repository of CentOS, Debian, Fedora, Scientific Linux, and Ubuntu releases. Now you can create a Debian disk image with a single command. By the way, you don't need to be root:

$ virt-builder debian-7

Customization and configuration management

Whammo, you have a Debian 7 disk image. But folks who wish to customize the default image can use command-line options to add and delete files, create directories, install packages, and set up firstboot scripts. This makes virt-builder a great tool for bootstrapping your Puppet/Chef configuration management.

Publishing template images

Heavy duty users will publish their own disk images, maybe a library of images available to hosting customers or a private cloud environment. Not to mention virt-builder is a handy way for development teams to share template images. All this is possible using the cryptographically signed "index file" that catalogues the template images. Users can list and inspect images like this:

$ virt-builder --list
centos-6                 CentOS 6.5
debian-6                 Debian 6 (Squeeze)
debian-7                 Debian 7 (Wheezy)
fedora-18                Fedora® 18
fedora-19                Fedora® 19
fedora-20                Fedora® 20
scientificlinux-6        Scientific Linux 6.4
ubuntu-10.04             Ubuntu 10.04 (Lucid)
ubuntu-12.04             Ubuntu 12.04 (Precise)
ubuntu-13.10             Ubuntu 13.10 (Saucy)
$ virt-builder --notes fedora-20
Fedora 20.

This Fedora image contains only unmodified @Core group packages.

It is thus very minimal.  The kickstart and install script can be
found in the libguestfs source tree:

builder/website/fedora.sh

Fedora and the Infinity design logo are trademarks of Red Hat, Inc.
Source and further information is available from http://fedoraproject.org/

Conclusion

virt-builder is a much-needed tool for consuming and publishing template VM images for KVM. It automates a lot of low-level commands normally used to deploy template images. Vagrant and Docker don't need to worry just yet but I think virt-builder is enough to satisfy anyone who is already working with virt-manager, virsh, and friends.

By the way, virt-builder is included in the libguestfs-tools Fedora package.