On Wed, Jan 02, 2013 at 03:36:54PM +0000, Daniel P. Berrange wrote:
> This is something I was thinking about a little over the Christmas
> break. I've no intention of implementing this in the immediate
> future, but wanted to post it while it was fresh in my mind.
>
> Historically we have had 2 ways of using the stateful drivers like
> QEMU/LXC/UML/etc:
>
>  - "system mode"  - privileged libvirtd, one per host, started at boot
>  - "session mode" - unprivileged libvirtd, one per non-root user, autostarted
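Just to make the distinction concrete: from an application's point of view the
only difference between the two is the URI passed to virConnectOpen(). A
minimal example (compile with -lvirt):

  #include <stdio.h>
  #include <stdlib.h>
  #include <libvirt/libvirt.h>

  int main(void)
  {
      /* "qemu:///system" talks to the privileged host-wide libvirtd,
       * "qemu:///session" to the per-user one; the API is identical. */
      virConnectPtr conn = virConnectOpen("qemu:///session");
      char *uri;

      if (!conn) {
          fprintf(stderr, "failed to connect\n");
          return EXIT_FAILURE;
      }

      uri = virConnectGetURI(conn);
      printf("connected to %s\n", uri);
      free(uri);
      virConnectClose(conn);
      return EXIT_SUCCESS;
  }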
>
> Within the context of each daemon, VM name uniqueness is enforced. Operating
> via the daemon means that all applications connected to libvirtd get the
> same world view. This single world view is exactly what you want when
> dealing with server / cloud / desktop virtualization, because it means
> tools like 'virt-top', 'virt-viewer' and 'virsh' can see the same VMs as
> virt-manager / oVirt / OpenStack / Boxes / etc.
>
> Recently we've seen the increasing importance of a new use case which I
> will refer to as "embedded" virtualization. The best example of this use
> case is libguestfs, which has long run a dedicated QEMU instance, but has
> just now switched to using libvirtd. The other use case is virt-sandbox,
> which is doing application confinement using LXC/KVM.
>
> In both these cases, operating via libvirtd is sub-optimal. Users of
> so-called "embedded" virtualization explicitly don't want any interaction
> with other libvirt applications. They likely don't even want to expose the
> concept of virtualization to their users. For them, virtualization is
> intended to be just a hidden impl detail of their application.
>
> Some issues which arise when using embedded virtualization:
>
>  - Need to invent sensible unique names for each VM launched. This
>    leads to pollution of the logfiles for the QEMU instances that are run.
>
>  - User sees libguestfs / virt-sandbox VMs in virt-manager / oVirt,
>    which they may then try to "manage", breaking libguestfs / etc.

I didn't realize this before, but yes this is bad.

>  - Disassociated process context, so if 'virt-sandbox' is placed in
>    a cgroup, the VMs it launches are in a different cgroup. Likewise
>    if custom env variables are set, work is needed to propagate those
>    to the VMs.
>
> This leads me to wonder whether it is worth exploring the idea of a new
> type of libvirt connection:
>
>  - "embed mode" - no libvirtd, driver runs in application context

Seems like an excellent idea.
>
> The idea here is to take libvirtd out of the equation and directly use
> the QEMU driver code in the libvirt.so client / application. Since
> libvirtd (mostly) uses the same APIs as the public libvirt.so clients,
> there isn't much required to make this work:
>
>  - A way for the application to invoke virStateInit for the driver
>  - Application must provide an event loop impl
>  - A way to specify alternative dirs for logs/state/config/etc
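On the second requirement, the application could presumably just reuse
libvirt's default event loop implementation rather than writing its own; a
rough sketch of that piece using only the existing public API (how exactly
virStateInit would be invoked is the open part):

  #include <pthread.h>
  #include <libvirt/libvirt.h>

  /* Thread servicing the default event loop on behalf of the driver. */
  static void *event_loop(void *opaque)
  {
      (void)opaque;
      for (;;) {
          if (virEventRunDefaultImpl() < 0)
              break;
      }
      return NULL;
  }

  /* To be called before opening the embedded connection. */
  static int start_event_loop(void)
  {
      pthread_t thr;

      if (virEventRegisterDefaultImpl() < 0)
          return -1;
      return pthread_create(&thr, NULL, event_loop, NULL) ? -1 : 0;
  }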
>
> An application would access this mode using a different path for the
> driver, and specifying the path to use for logs/state/config etc.
> e.g. libguestfs might use
>
>   qemu:///embed?statedir=/tmp/libguestfsXXXXX/
>
> to get an instance of the QEMU driver that is completely private
> to itself. One question is whether there should be a single embed
> instance per process, or whether an application should be allowed
> to open multiple completely isolated embed instances. The latter
> might require us to eliminate more static variables in our code.
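If it ended up looking like that, the application side could be as simple as
the sketch below; note that qemu:///embed and the statedir parameter are only
the syntax proposed here, nothing that exists today:

  #include <stdio.h>
  #include <libvirt/libvirt.h>

  int main(void)
  {
      /* Proposed (non-existent) URI: a QEMU driver instance private to
       * this process, keeping all state under the given directory. */
      virConnectPtr conn =
          virConnectOpen("qemu:///embed?statedir=/tmp/libguestfsXXXXX/");

      if (!conn) {
          fprintf(stderr, "embedded QEMU driver not available\n");
          return 1;
      }

      /* From here the usual APIs (virDomainCreateXML etc.) would apply,
       * but the domains would be invisible to any libvirtd on the host. */
      virConnectClose(conn);
      return 0;
  }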
>
> This kind of embedded mode is not without its downsides though:
>
>  - How to access virtual network / storage / node device APIs?

libguestfs only uses (optionally) user networking. We also don't
access any storage or node APIs, and don't intend to AFAIK.
>
>  - Extra SELinux policy work to allow each app to have the same
>    kind of privileges that libvirtd has to let it start VMs

Right, this is important, and probably tricky. How about still
running libvirtd, but per connection/process? (I think you mentioned
before that this is in fact already possible). It costs 1 extra fork,
but in the libguestfs scheme of things this won't make much difference.
>
>  - How to troubleshoot - can't use things like 'virsh qemu-monitor-command'
>    since the embedded instance is private to the application in question.

For libguestfs this last one isn't important.
>
> One answer to the latter question might be to actually allow the
> application to expose the same RPC service as libvirtd does. So
> virsh could connect to libguestfs using
>
>   qemu:///embed?socketdir=/tmp/libguestfsXXXXX/libvirt-sock
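Another option, assuming the embedded instance really does serve the standard
libvirt RPC protocol on a known socket, would be for an external client to
reach it through the remote driver's existing explicit socket parameter rather
than a new socketdir syntax, e.g.:

  #include <stdio.h>
  #include <libvirt/libvirt.h>

  int main(void)
  {
      /* Assumes the application exposes the normal libvirt RPC service
       * on this socket, per the suggestion above. */
      virConnectPtr conn = virConnectOpen(
          "qemu+unix:///session?socket=/tmp/libguestfsXXXXX/libvirt-sock");

      if (!conn) {
          fprintf(stderr, "could not reach the embedded instance\n");
          return 1;
      }

      virConnectClose(conn);
      return 0;
  }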
>
> For the question of network/storage/node device access, the long
> term answer is probably to split up the system libvirtd instance
> into separate pieces, e.g. a virtnodedeviced, virtnetworkd,
> virtstoraged, virtqemud, virtlxcd, etc. Now a client app would
> connect to its embedded QEMU instance, but to the shared
> nodedevice/network/storage daemons.
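In that world a client wanting both would presumably just hold two
connections, something like the sketch below (both URIs are hypothetical,
following the split described above):

  #include <libvirt/libvirt.h>

  void example(void)
  {
      /* Hypothetical URIs: a private embedded QEMU driver for the VMs,
       * plus the shared host-wide network daemon for everything else. */
      virConnectPtr vms =
          virConnectOpen("qemu:///embed?statedir=/tmp/libguestfsXXXXX/");
      virConnectPtr net = virConnectOpen("network:///system");

      /* ... define domains on 'vms', look up networks on 'net' ... */

      if (net)
          virConnectClose(net);
      if (vms)
          virConnectClose(vms);
  }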
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top