On Tue, Sep 20, 2011 at 02:06:49PM -0400, Dave Allan wrote:
> On Tue, Sep 20, 2011 at 07:39:15PM +0200, Jiri Denemark wrote:
> > The commit that prevents disk corruption on domain shutdown
> > (96fc4784177ecb70357518fa863442455e45ad0e) causes regression with QEMU
> > 0.14.* and 0.15.* because of a regression bug in QEMU that was fixed
> > only recently in QEMU git. With affected QEMU binaries, domains cannot
> > be shut down properly and stay in a paused state. This patch tries to
> > avoid this by sending SIGKILL to 0.1[45].* QEMU processes. Though we
> > wait a bit more between sending SIGTERM and SIGKILL to reduce the
> > possibility of virtual disk corruption.
> IMO, SIGKILL should only be sent at the explicit direction of the
> user, saying in effect, I'm ok with possible data corruption, I want
> the VM killed unconditionally. I would rather leave VMs paused than
> risk corrupting data. Let's get as much input as we can from the qemu
> folks before we go down this path.
If we want that, then we need to have a different way to deal with
this shutdown problem. Leaving VMs in a paused state is not
acceptable behaviour - our IRC channel has already been flooded
with people complaining about this and we've only released this
2 days ago.
This patch of Jiri's attempts to minimise the likelihood of hitting
a problem. If people want an absolute guarantee then they have the
option to update to a version of QEMU that has the fix, or distros
can backport the fix to QEMU and disable this bit in libvirt.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|