Monday, April 20, 2020

virtio-fs has landed in QEMU 5.0!

The virtio-fs shared host<->guest file system has landed in QEMU 5.0! It consists of two parts: the QEMU -device vhost-user-fs-pci and the actual file server called virtiofsd. Guests need to have a virtio-fs driver in order to access shared file systems. In Linux the driver is called virtiofs.ko and has been upstream since Linux v5.4.

Using virtio-fs

Thanks to libvirt virtio-fs support, it's possible to share directories trees from the host with the guest like this:

<filesystem type='mount' accessmode='passthrough'>
    <driver type='virtiofs'/>
    <binary xattr='on'>
       <lock posix='on' flock='on'/>
    </binary>
    <source dir='/path/on/host'/>
    <target dir='mount_tag'/>
</filesystem>

The host /path/on/host directory tree can be mounted inside the guest like this:

# mount -t virtiofs mount_tag /mnt

Applications inside the guest can then access the files as if they were local files. For more information about virtio-fs, see the project website.

How it works

For the most part, -device vhost-user-fs-pci just facilitates the connection to virtiofsd where the real work happens. When guests submit file system requests they are handled directly by the virtiofsd process on the host and don't need to go via the QEMU process.

virtiofsd is a FUSE file system daemon with virtio-fs extensions. virtio-fs is built on top of the FUSE protocol and therefore supports the POSIX file system semantics that applications expect from a native Linux file system. The Linux guest driver shares a lot of code with the traditional fuse.ko kernel module.

Resources on virtio-fs

I have given a few presentations on virtio-fs:

Future features

A key feature of virtio-fs is the ability to directly access the host page cache, eliminating the need to copy file contents into guest RAM. This so-called DAX support is not upstream yet.

Live migration is not yet implemented. It is a little challenging to transfer all file system state to the destination host and seamlessly continue file system operation without remounting, but it should be doable.

There is a Rust implementation of virtiofsd that is close to reaching maturity and will replace the C implementation. The advantage is that Rust has better memory and thread safety than C so entire classes of bugs can be eliminated. Also, the codebase is written from scratch whereas the C implementation was a combination of several existing pieces of software that were not designed together.