[libvirt] RFC: An "embedded" mode for QEMU/LXC drivers

Wednesday, 2 January 2013

This is something I was thinking about a little over the christmas
break. I've no intention of implementing this in the immediate
future, but wanted to post it while it was fresh in my mind.

Historically we have had 2 ways of using the stateful drivers like
QEMU/LXC/UML/etc.

 - "system mode"  - privileged libvirtd, one per host, started at boot
 - "session mode" - unprivileged libvirtd, one per non-root user, autostarted

Within context of each daemon, VM name uniqueness is enforced. Operating
via the daemon means that all applications connected to libvirtd get the
same world view. This single world view is exactly what you want when
dealing with server / cloud / desktop virtualization, because it means
tools like 'virt-top', 'virt-viewer', 'virsh' can see the same VMs
as
virt-manager / oVirt / OpenStack / Boxes / etc.

Recently we've seen increasing importance of a new use case which I will
refer to as "embedded" virtualization. The best example of this use case
is libguestfs which has long run a dedicated QEMU instance, but just now
switched to using libvirtd. The other use case is virt-sandbox which is
doing application confinement using LXC/KVM.

In both these cases, operating via libvirtd is sub-optimal. Users of so
called "embedded" virtualization, explicitly don't want to have interaction
with other libvirt applications. They likely don't even want to expose the
concept of virtualization to their users. For them virtualization is intended
to be just a hidden impl detail of their application.

Some issues which arise when using embedded virtualization

 - Need to invent sensible unique names for each VM launched. This
   leads to pollution of logfiles for QEMU instances run.

 - User sees libguestfs / virt-sandbox VMs in virt-manager / oVirt
   which they may then try to "manage", breaking libguestfs / etc

 - Disassociated process context, so if 'virt-sandbox' is placed in
   a cgroup, the VMs it launches are in a different cgroup. Likewise
   if custom env variables are set, work is needed to propagate those
   to VMs.

This leads me to wonder whether it is worth exploring the idea of a new
type of libvirt connection.

 - "embed mode" - no libvirtd, driver runs in application context

The idea here is to take libvirtd out of the equation and directly use
the QEMU driver code in the libvirt.so client / application. Since
libvirtd (mostly) uses the same APIs as the public libvirt.so clients,
there isn't much required to make this work.

 - A way for the application to invoke virStateInit for the driver
 - Application must provide an event loop impl
 - A way to specify alternative dirs for logs/state/config/etc

An application would access this mode using a different path for the
driver, and specifying the path to use for logs/state/config etc.
eg libguestfs might use

   qemu:///embed?statedir=/tmp/libguestfsXXXXX/

to get an instance of the QEMU driver that is completely private
to itself. One question is whether there should be a single embed
instance per process, or whether an application should be allowed
to open multiple completely isolated embed instances. The latter
might require is to eliminate more static variables in our code.

This kind of embedded mode is not without its downsides though

 - How to access virtual network  / storage / node device APIs ?

 - Extra SELinux policy work to allow each app to have the same
   kind of privileges that libvirtd has to let it start VMs

 - How to troubleshoot - can't use things like

    'virsh qemu-monitor-command'

   since the embedded instance is private to the application
   in question.

One answer to the latter question, might be to actually allow the
application to expose the same RPC service as libvirtd does. So
virsh could connect to libguestfs using

    qemu:///embed?socketdir=/tmp/libguestfsXXXXX/libvirt-sock

For the question of network/storage/node device access, the long
term answer is probably to split up the system libvirtd instance
into separate pieces. eg a virtnodedeviced, virtnetworkd,
virstoraged, virtqemud, virtlxcd, etc. Now a client app would
connect to their embedded QEMU instances, but to the shared
nodedevice/network/storage daemons.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005