[libvirt] qemu and <transient/> disks

Partly following up this message: https://www.redhat.com/archives/libvirt-users/2011-October/msg00142.html
And what about qemu's option "snapshot=on" ?
Unreliable. It won't work with SELinux (since qemu tries to create the snapshot on /tmp), and it makes your guest unmigratable. I think that it is easier to have libvirt use qemu-img to create a qcow2 wrapper prior to booting the guest, at which point the solution is easier to control from an sVirt perspective, and is more likely to allow us to figure out a way to make things work with migration (at least, I'm hoping I can figure out how to migrate a guest with a transient disk).
Lack of this feature is pretty annoying, particularly since (a) libguestfs needs it and (b) it's a trivial patch to add snapshot=on to the qemu driver. So to concentrate on the two objections above:
- Why is SELinux concerned about qemu creating and using a file in /tmp? Obviously it should stop qemu opening and reading random files from /tmp but that would be a different rule surely?
- The documentation actually states that using <transient/> may make your guest unable to migrate, and in any case we don't care.
But if you want to use qemu's snapshot=on in the meantime, you can use the <qemu:commandline> namespace XML to add it.
Is there an example how to do this? It seems like it might be possible using a global qemu '-set device.<id>.snapshot=on' parameter, but how can I know what <id> libvirt will give to a '-drive' parameter? (I found from experimentation that it's something like "drive-virtio-disk0", but I can't track down the bit of code that produces this string yet ...)
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#) http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora

On Sat, Jul 21, 2012 at 07:29:00PM +0100, Richard W.M. Jones wrote:
Partly following up this message: https://www.redhat.com/archives/libvirt-users/2011-October/msg00142.html
And what about qemu's option "snapshot=on" ?
Unreliable. It won't work with SELinux (since qemu tries to create the snapshot on /tmp), and it makes your guest unmigratable. I think that it is easier to have libvirt use qemu-img to create a qcow2 wrapper prior to booting the guest, at which point the solution is easier to control from an sVirt perspective, and is more likely to allow us to figure out a way to make things work with migration (at least, I'm hoping I can figure out how to migrate a guest with a transient disk).
Lack of this feature is pretty annoying, particularly since (a) libguestfs needs it and (b) it's a trivial patch to add snapshot=on to the qemu driver. So to concentrate on the two objections above:
- Why is SELinux concerned about qemu creating and using a file in /tmp? Obviously it should stop qemu opening and reading random files from /tmp but that would be a different rule surely?
Default labelling of files that are created is done based on the directory label. Since /tmp is a shared directory, this is suboptimal. We want snapshot files to go in a directory which is private to QEMU. More critically, in Fedora 18 /tmp is tmpfs, so you don't want to be using that for arbitrarily large guest disk files. I agree with the quoted text that libvirt should define a directory to use for the transient disks, under /var/lib/libvirt/images perhaps, and just use 'qemu-img create' to create the wrapper there, and then unlink it once QEMU has started up, so we can auto-cleanup on QEMU shutdown.
- The documentation actually states that using <transient/> may make your guest unable to migrate, and in any case we don't care.
I don't see what's wrong with the docs warning you about this.
But if you want to use qemu's snapshot=on in the meantime, you can use the <qemu:commandline> namespace XML to add it.
Is there an example how to do this? It seems like it might be possible using a global qemu '-set device.<id>.snapshot=on' parameter, but how can I know what <id> libvirt will give to a '-drive' parameter? (I found from experimentation that it's something like "drive-virtio-disk0", but I can't track down the bit of code that produces this string yet ...)
Yes, you can use the -set option as you describe. The ID values are produced by qemuAssignDeviceAliases in qemu_command.c.
Regards, Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
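For reference, a minimal sketch of what the <qemu:commandline> fragment discussed above might look like, generated here with Python. Several things are assumptions rather than anything confirmed in this thread: the helper names drive_alias and snapshot_commandline are hypothetical, the "drive-<bus>-disk<N>" pattern is only the one observed by experiment above, and the group name "drive" follows QEMU's documented '-set group.id.arg=value' form (the thread itself writes 'device.<id>', so the group is left as a parameter).

```python
import xml.etree.ElementTree as ET

# Namespace that libvirt uses for QEMU command-line passthrough.
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"


def drive_alias(bus: str, index: int) -> str:
    """Guess the -drive id for a disk (hypothetical helper).

    Assumption: the id follows the "drive-virtio-disk0" pattern seen by
    experiment in this thread; the real names come from
    qemuAssignDeviceAliases in libvirt.
    """
    return f"drive-{bus}-disk{index}"


def snapshot_commandline(alias: str, group: str = "drive") -> ET.Element:
    """Build a <qemu:commandline> element passing -set <group>.<alias>.snapshot=on."""
    ET.register_namespace("qemu", QEMU_NS)
    cmdline = ET.Element(f"{{{QEMU_NS}}}commandline")
    for value in ("-set", f"{group}.{alias}.snapshot=on"):
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": value})
    return cmdline


if __name__ == "__main__":
    element = snapshot_commandline(drive_alias("virtio", 0))
    print(ET.tostring(element, encoding="unicode"))
```

The resulting element would go inside the <domain> element of the guest XML, which also has to declare the qemu namespace for libvirt to accept it.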

On Mon, Jul 23, 2012 at 10:28:57AM +0100, Daniel P. Berrange wrote:
I agree with the quoted text that libvirt should define a directory to use for the transient disks under /var/lib/libvirt/images perhaps, and just use 'qemu-img create' to create that, and then unlink it once QEMU has started up, so we can auto-cleanup on QEMU shutdown.
One issue here is that 'qemu-img create' is slow: 0.4 seconds to create a qcow2 file backing another file (i.e. per readonly disk, and programs like virt-df may open dozens of readonly disks). I looked at the strace output and it seems the slowness is just overhead from the machinery of qemu's coroutines, and not connected with writing the qcow2 file, which is presumably why the qemu snapshot=on option is not noticeably slow.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
New in Fedora 11: Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 70 libraries supprt'd http://fedoraproject.org/wiki/MinGW http://www.annexia.org/fedora_mingw
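To make the per-disk cost concrete, here is a rough sketch of the "create a qcow2 wrapper with qemu-img, then unlink it once QEMU is running" flow, with timing around the qemu-img call. The function names, the "transient-" file prefix, and the default backing format are all assumptions made for illustration; only the qemu-img options -f, -b and -F are taken from the real tool.

```python
import os
import subprocess
import tempfile
import time


def overlay_create_argv(backing: str, overlay: str, backing_fmt: str = "raw") -> list[str]:
    """Command line for a qcow2 overlay whose backing file is `backing`."""
    return ["qemu-img", "create", "-f", "qcow2",
            "-b", backing, "-F", backing_fmt, overlay]


def make_overlay(backing: str, directory: str, run=subprocess.run):
    """Create a transient overlay in `directory`; return (path, seconds taken).

    `directory` stands in for a QEMU-private location such as a
    subdirectory of /var/lib/libvirt/images (hypothetical choice).
    """
    fd, overlay = tempfile.mkstemp(prefix="transient-", suffix=".qcow2",
                                   dir=directory)
    os.close(fd)  # qemu-img rewrites the file by name
    start = time.perf_counter()
    run(overlay_create_argv(backing, overlay), check=True)
    return overlay, time.perf_counter() - start


def discard_after_start(overlay: str) -> None:
    """Unlink the overlay once QEMU holds it open, so the data is freed
    automatically when QEMU exits."""
    os.unlink(overlay)
```

Calling make_overlay once per readonly disk and summing the elapsed times gives the figure that matters for tools opening dozens of disks.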

On Mon, Jul 23, 2012 at 10:41:55AM +0100, Richard W.M. Jones wrote:
On Mon, Jul 23, 2012 at 10:28:57AM +0100, Daniel P. Berrange wrote:
I agree with the quoted text that libvirt should define a directory to use for the transient disks under /var/lib/libvirt/images perhaps, and just use 'qemu-img create' to create that, and then unlink it once QEMU has started up, so we can auto-cleanup on QEMU shutdown.
One issue here is that 'qemu-img create' is slow: 0.4 seconds to create a qcow2 file backing another file (i.e. per readonly disk, and programs like virt-df may open dozens of readonly disks).
I looked at the strace output and it seems the slowness is just overhead from the machinery of qemu's coroutines, and not connected with writing the qcow2 file, which is presumably why the qemu snapshot=on option is not noticeably slow.
Other options available could be
- Enhance QEMU to allow a snapshot directory to be specified
- Make libvirt actually create the empty QCow2 file. The QEMU people will probably say this is evil, but the knowledge required to create an empty qcow2 file pointing to a backing file is pretty tiny.
Or we could even do both, using the former for new enough QEMU and the latter only for older QEMU
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 07/23/2012 03:48 AM, Daniel P. Berrange wrote:
On Mon, Jul 23, 2012 at 10:41:55AM +0100, Richard W.M. Jones wrote:
On Mon, Jul 23, 2012 at 10:28:57AM +0100, Daniel P. Berrange wrote:
I agree with the quoted text that libvirt should define a directory to use for the transient disks under /var/lib/libvirt/images perhaps, and just use 'qemu-img create' to create that, and then unlink it once QEMU has started up, so we can auto-cleanup on QEMU shutdown.
I've also seen complaints from people that /var/lib/libvirt can be a space-limited directory; we should really consider adding the notion of a per-VM storage pool (defaulting to a libvirt-internal pool at /var/lib/libvirt, but configurable by the user to live on a different mount point) for anything that needs potentially large amounts of storage (transient disk wrappers, managed save images, and so forth).
One issue here is that 'qemu-img create' is slow: 0.4 seconds to create a qcow2 file backing another file (i.e. per readonly disk, and programs like virt-df may open dozens of readonly disks).
I looked at the strace output and it seems the slowness is just overhead from the machinery of qemu's coroutines, and not connected with writing the qcow2 file, which is presumably why the qemu snapshot=on option is not noticeably slow.
Other options available could be
- Enhance QEMU to allow a snapshot directory to be specified
- Make libvirt actually create the empty QCow2 file. The QEMU people will probably say this is evil, but the knowledge required to create an empty qcow2 file pointing to a backing file is pretty tiny.
Especially since libvirt already has the knowledge on how to read a qcow2 file.
Or we could even do both, using the former for new enough QEMU and the latter only for older QEMU
Daniel
-- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
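As a rough illustration of how little is needed to write and read an empty qcow2 file that points at a backing file, here is a sketch based on the qcow2 version 2 layout: header in cluster 0, refcount table in cluster 1, one refcount block in cluster 2, an all-zero L1 table in cluster 3. It is not libvirt code, the function names empty_qcow2 and read_backing are hypothetical, and the output has not been validated against qemu-img check, so treat the layout as an assumption to verify.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# Version 2 header: magic, version, backing_file_offset, backing_file_size,
# cluster_bits, size, crypt_method, l1_size, l1_table_offset,
# refcount_table_offset, refcount_table_clusters, nb_snapshots,
# snapshots_offset. All fields are big-endian; 72 bytes in total.
HEADER = struct.Struct(">4sIQIIQIIQQIIQ")
CLUSTER_BITS = 16
CLUSTER = 1 << CLUSTER_BITS


def empty_qcow2(backing: str, virtual_size: int) -> bytes:
    """Return the bytes of an empty overlay image backed by `backing`."""
    name = backing.encode()
    l2_coverage = (CLUSTER // 8) * CLUSTER        # guest bytes mapped per L2 table
    l1_entries = -(-virtual_size // l2_coverage)  # ceiling division
    if l1_entries * 8 > CLUSTER or len(name) > 1023:
        raise ValueError("sketch only handles a one-cluster L1 table and short names")
    image = bytearray(4 * CLUSTER)
    HEADER.pack_into(image, 0, QCOW2_MAGIC, 2, HEADER.size, len(name),
                     CLUSTER_BITS, virtual_size, 0, l1_entries,
                     3 * CLUSTER, 1 * CLUSTER, 1, 0, 0)
    image[HEADER.size:HEADER.size + len(name)] = name  # backing file name
    struct.pack_into(">Q", image, 1 * CLUSTER, 2 * CLUSTER)  # refcount table entry 0
    struct.pack_into(">4H", image, 2 * CLUSTER, 1, 1, 1, 1)  # clusters 0..3 in use
    return bytes(image)


def read_backing(image: bytes):
    """Return the backing file name stored in a qcow2 header, or None."""
    magic, _version, offset, length = HEADER.unpack_from(image)[:4]
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    return image[offset:offset + length].decode() if offset else None
```

The reader is the part that corresponds to what libvirt is said above to know already; the writer is the new piece, and it involves no coroutine machinery, only a few fixed-size structures.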
participants (3): Daniel P. Berrange, Eric Blake, Richard W.M. Jones