[libvirt] [BUG] Managed save qemu state gets deleted after failed resume

Hello,

I haven't had time to provide a fix, but still want to inform you about a bug: If resuming a saved VM fails with Qemu-0.14, the managed save state file /var/lib/libvirt/qemu/save/$VM.save is still deleted. I think it would be better to only delete the state after a successful resume.

/var/log/libvirt/qemu/ucs2.3-0_basis_amd64.log:
2011-03-30 09:37:56.960: starting up
LC_ALL=C PATH=/sbin:/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -S -M pc-0.14 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -name ucs2.3-0_basis_amd64 -uuid 656957aa-13a0-6922-5d08-3a39561f9775 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/ucs2.3-0_basis_amd64.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -boot cnd -drive file=/var/lib/libvirt/images/ucs_2.3-0-091215-dvd-amd64.iso,if=none,media=cdrom,id=drive-ide0-0-0,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=/var/lib/libvirt/images/ucs230basis.qcow2,if=none,id=drive-ide0-0-1,format=qcow2 -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -netdev tap,fd=24,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:12:7c:03,bus=pci.0,addr=0x3 -usb -device usb-tablet,id=input0 -vnc 0.0.0.0:5 -k de -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -option-rom /usr/share/kvm/pxe-rtl8139.bin
Failed to allocate 2147483648 B: Cannot allocate memory

/var/log/univention/virtual-machine-manager-daemon-errors.log:
libvir: QEMU error : operation failed: migration to 'exec:cat | { dd bs=4096 seek=1 if=/dev/null && dd bs=1048576; } 1<>'/var/lib/libvirt/qemu/save/ucs2.3-0_basis_amd64.save'' failed: migration failed
libvir: QEMU error : operation failed: failed to retrieve chardev info in qemu with 'info chardev'

This issue is tracked in our (German) bug-tracker at <https://forge.univention.org/bugzilla/show_bug.cgi?id=22021>

Sincerely
Philipp Hahn
--
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        Linux for Your Business        fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                    http://www.univention.de/

On 03/30/2011 02:24 AM, Philipp Hahn wrote:
Hello,
I haven't had time to provide a fix, but still want to inform you about a bug: If resuming a saved VM fails with Qemu-0.14, the managed save state file /var/lib/libvirt/qemu/save/$VM.save is still deleted. I think it would be better to only delete the state after a successful resume.
Then we would get multiple failures on every resume attempt. However, I tend to agree with you that data loss in any form is bad, and this is a form of data loss.
-device usb-tablet,id=input0 -vnc 0.0.0.0:5 -k de -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -option-rom /usr/share/kvm/pxe-rtl8139.bin Failed to allocate 2147483648 B: Cannot allocate memory
It is especially bad that a transient ENOMEM condition appears able to cause the data loss. If I'm interpreting your log correctly, you can trigger this by temporarily consuming too much memory on the host and trying to resume a VM; when you then reduce the memory pressure and try to resume again, the state file to resume from is already gone.
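The ordering being asked for can be sketched roughly as follows. This is only an illustration in Python, not libvirt's actual code: `resume_from_managed_save`, `ResumeError`, and the `restore` callable are hypothetical names, and the restore step is simulated rather than talking to qemu.

```python
import os
import tempfile


class ResumeError(Exception):
    """Raised when the (hypothetical) restore step fails."""


def resume_from_managed_save(save_path, restore):
    """Restore a guest from save_path; remove the file only on success.

    `restore` is a hypothetical callable standing in for the hypervisor
    restore step. It is assumed to raise ResumeError on failure.
    """
    restore(save_path)    # may raise; the state file is then left untouched
    os.unlink(save_path)  # reached only after a successful restore


def demo():
    # Create an empty placeholder for the managed save state file.
    fd, path = tempfile.mkstemp(suffix=".save")
    os.close(fd)

    def failing_restore(_path):
        # Simulates a transient host-side failure such as ENOMEM.
        raise ResumeError("Cannot allocate memory")

    def working_restore(_path):
        # Simulates a restore that succeeds once memory pressure is gone.
        return None

    try:
        resume_from_managed_save(path, failing_restore)
    except ResumeError:
        pass
    kept_after_failure = os.path.exists(path)

    resume_from_managed_save(path, working_restore)
    removed_after_success = not os.path.exists(path)
    return kept_after_failure, removed_after_success
```

With this ordering a failed attempt can be retried later, at the cost of repeating the failure on every attempt for as long as the underlying problem persists.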
This issue is tracked in our (German) bug-tracker at <https://forge.univention.org/bugzilla/show_bug.cgi?id=22021>
Thanks for the link.
--
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

On 04/01/2011 10:28 AM, Eric Blake wrote:
On 03/30/2011 02:24 AM, Philipp Hahn wrote:
Hello,
I haven't had time to provide a fix, but still want to inform you about a bug: If resuming a saved VM fails with Qemu-0.14, the managed save state file /var/lib/libvirt/qemu/save/$VM.save is still deleted. I think it would be better to only delete the state after a successful resume.
This issue is tracked in our (German) bug-tracker at <https://forge.univention.org/bugzilla/show_bug.cgi?id=22021>
Thanks for the link.
Should be fixed now (modulo anyone approving my followup patch to tweak the 'man virsh' wording):
https://www.redhat.com/archives/libvir-list/2011-April/msg00405.html
--
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
participants (2)
- Eric Blake
- Philipp Hahn