I was at the KVM Forum / LinuxCon last week and there were many
interesting things discussed which are relevant to ongoing libvirt
development. Here is the list of topics that caught my attention. If I
have missed any, please fill in the gaps.
- Sandbox/container KVM. The Solaris port of KVM puts QEMU inside
a zone so that an exploit of QEMU can't escape into the full OS.
Containers are Linux's parallel of Zones, and while not nearly as
secure yet, it would still be worth making more use of container
support to confine QEMU.
- Events for object changes. We already have async events for virDomainPtr.
We need the same for virInterfacePtr, virStoragePoolPtr, virStorageVolPtr
and virNodeDevPtr, so that at the very least applications can be notified
when objects are created or removed. For virNodeDevPtr we also want to
be notified when properties change (i.e. CDROM media change).
- CGroups passthrough. There is a lot of experimentation with cgroups. We
don't want to expose cgroups as a direct concept in the libvirt API,
but we should consider putting a generic cgroups get/set in the
libvirt-qemu.so library, or create a libvirt-linux.so library.
We would also likely add a <linux:cgroups> XML element to store
arbitrary tunables in the XML. Same (low) level of support as with
qemu:XXX, of course.
- CPUSet for changing CPU + Memory NUMA pinning. The CPUset cgroups
controller is able to actually move a guest's memory between NUMA
nodes. We can already change VCPU pinning, but we need a new API
to do node pinning of the whole VM, so we can ensure the I/O threads
are also moved. We also need an API to move the memory pinning to
new nodes.
- Guest NUMA topology. If we have guests with RAM size > node size,
we need to expose a NUMA topology into the guest. The CPU/memory
pinning APIs will also need to be able to pin individual guest
NUMA nodes to individual host NUMA nodes.
- AHCI controller. IDE is going the way of the dodo. We need to add
support for QEMU's new AHCI controller. This is quite simple, since we
already have a 'sata' disk type we can wire up to QEMU.
- VFIO PCI passthru. The current PCI assignment code may well be
changed to use something called 'VFIO'. This will need some
work in libvirt to support new CLI arg syntax, and probably
some SELinux work.
- QCow3. There will soon be a QCow3 format. We need to add code to
detect it and extract backing stores, etc. Trivial since the primary
header format will still be the same as QCow2.
- QMP completion. Given Anthony's plan for a complete replacement of
the current CLI + monitor syntax in QEMU 2.0 (long way out), he has
dropped objections to adding new commands to QMP in the near future.
All existing HMP commands will therefore be made available in QMP
immediately, with no attempt to re-design them now, and the need for
the HMP passthrough command will soon go away.
- Migration + VEPA/VNLink failures. As raised previously on this list,
Cisco really wants libvirt to have the ability to do migration, and
optionally *not* fail, even if the VEPA/VNLink setup fails. This will
require an event notification to the app if a failure of a device
backend occurs, and an API to let the admin app fix the device backend
(virDomainUpdateDevice) and some way to tell migration what bits are
allowed to fail.
- Virtio SCSI. We need to support this new stuff in QEMU when it is
eventually implemented. It will mean we avoid the PCI slot usage
problems inherent in virtio-blk, and get other things like multipath
and decent SCSI passthrough support.
- USB 2.0. We need to support this in libvirt asap. It is very important
for desktop experience and to support better integration with SPICE.
This also gets us proper USB port addressing. Fun footnote: QEMU USB
has *never* supported migration. The USB tablet only works by sheer
luck, as OSes see the device disappear on migration & come back with
a different device ID/port addr, and so re-initialize it!
- Native KVM tool. The problem statement was that the QEMU code is too
big/complex and the command line args are too complex, so let's rewrite
from scratch to make the code small & CLI simple. They achieve this,
but of course primarily because they lack so many features compared
to QEMU. They had libvirt support as a bullet point on their preso,
but I'm not expecting it to replace the current QEMU KVM support in
the foreseeable future, given its current level of features and the
size of its dev team compared to QEMU/KVM. They did have some fun
demos of booting using the host OS filesystem though. We can
actually do the same with regular KVM/libvirt but there's no nice
demo tool to show it off. I'm hoping to create one....
- Shared memory devices. Some people doing high performance work are
using the QEMU shared memory device. We don't support this (ivshmem
device) in libvirt yet. Fairly niche use cases, but it might be nice
to have this.
- SDK / Docs. Request for a more SDK like approach to KVM development
tools and documentation. Also want to simplify libvirt operations.
The exposure of the virt-install internal API as official GObjects
would have significantly helped the project Ricardo (from IBM)
described in his presentation. Of course no one can dispute that we
need more documentation in every area.
- USB managed mode. As we do with PCI passthrough, we should be able
to detach a USB device from the host OS, and perform a reset before
attaching to the guest, and most importantly track which USB devices
have been given to which guest, so we don't assign a device twice. We
have all the necessary APIs, just need to wire them up.
- PCI passthrough. We need to support setting of MAC addr, VLAN and
VEPA/VNLink properties against VFs from SRIOV NICs that are assigned
to a guest.
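
On the QCow3 item above, here is a rough sketch of the kind of header
probing involved. It assumes, as stated in the list, that QCow3 keeps the
same leading header fields as QCow2 (magic, version, backing file offset,
backing file name length, all big-endian). The names probe_qcow and
make_header, and the 512-byte offset used for the sample image, are made
up for illustration; this is not libvirt's actual detection code.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"

# Leading QCow2 header fields, big-endian, no padding:
#   4 bytes magic, 4 bytes version,
#   8 bytes backing file offset, 4 bytes backing file name length
HEADER_FMT = ">4sIQI"
HEADER_LEN = struct.calcsize(HEADER_FMT)


def probe_qcow(data):
    """Inspect the leading bytes of an image file.

    Returns (version, backing_file) for a QCow2/QCow3 image, where
    backing_file is None if the image has no backing store.
    Returns None if the data is not recognised as QCow2/QCow3.
    Raises ValueError if 'data' is too short to hold the backing name.
    """
    if len(data) < HEADER_LEN:
        return None
    magic, version, backing_off, backing_len = struct.unpack_from(
        HEADER_FMT, data)
    if magic != QCOW_MAGIC or version not in (2, 3):
        return None
    if backing_off == 0:
        # An offset of zero means there is no backing file
        return version, None
    end = backing_off + backing_len
    if end > len(data):
        raise ValueError("need at least %d bytes to read backing name" % end)
    name = data[backing_off:end].decode("utf-8", errors="replace")
    return version, name


def make_header(version, backing=b""):
    """Build a fake image prefix for trying out probe_qcow (illustration only)."""
    offset = 512 if backing else 0
    hdr = struct.pack(HEADER_FMT, QCOW_MAGIC, version, offset, len(backing))
    # Pad with zeros up to the backing name, then append the name itself
    return hdr.ljust(offset, b"\0") + backing


if __name__ == "__main__":
    print(probe_qcow(make_header(3, b"/var/lib/base.img")))
    print(probe_qcow(make_header(2)))
```

Since only the version number differs in these leading fields, the same
probe covers both formats; anything specific to version 3 further into
the header is not handled by this sketch.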
For those who were not at the KVM Forum, the presentations are already
available online at:
http://www.linux-kvm.org/page/KVM_Forum_2011
All the sessions were also video recorded, so sometime in the next week
or two, there should be OGG videos of the talks being uploaded to the
same site.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|