On 08/21/2013 10:51 AM, Paolo Bonzini wrote:
> On 21/08/2013 18:48, Daniel P. Berrange wrote:
> > No, <on_crash> is the right thing to be using for this from
> > libvirt's pov & I don't think we should invent something new.
> > The <on_crash> element has always been intended to represent
> > handling of guest panics, not qemu internal errors.
> Actually for Xen HVM guests, it mostly traps things such as failed
> vmentries. The Xen PV-on-HVM drivers do not register a panic notifier
> that moves the guest to the "crashed" state.
> <on_crash> cannot be salvaged, in my opinion, because all domain XMLs in
> the wild will have a setting that causes libvirt to add "-device
> isa-pvpanic". Thus changing libvirt versions will change guest
> hardware, which is _very_ bad.
Let's expand on that statement:
Libvirt's default for <on_crash> is 'destroy'. But virt-install (and
thus virt-manager) have been setting explicit 'restart' for AGES now.
Arguably, this is YET ANOTHER reason why virt-manager should be using
libosinfo to make sane choices about new guest XML, based on known
capabilities of the guest it will be installing. But that only affects
newly created guests after we fix the virt stack.
In the meantime, you have a point that we have a back-compat mess - we
promise ABI stability (guests shall not see hardware changes when
upgrading versions of libvirtd but leaving the XML unchanged - the only
way to change hardware seen by an existing guest is to explicitly modify
XML).
> In addition, Windows XP and 2003 will show the annoying device wizard
> upon a libvirt upgrade, and fixing this is what surfaced all the mess.
Yes, so we need the back-compat code to leave pvpanic out of
pre-existing guests, if we can find a way to sensibly do that.
So, this boils down to a question of what SHOULD the valid states for
<on_crash> be? Generically, we want <on_crash>destroy</on_crash> to not
invalidate a guest, but also to not instantiate a pvpanic device, since
that covers the libvirt defaults. We also want
<on_crash>restart</on_crash> to not invalidate a guest, but also to not
instantiate a pvpanic device, since so many existing guests have that
setting thanks to virt-install.
Maybe that means we add attributes/sub-elements to <on_crash> that
express whether pvpanic device is permitted; and the absence of that
attribute means the status quo (the <on_crash> tag is effectively
ignored because without pvpanic device, there is no way for libvirt to
learn if a guest panicked). Or does it mean we expose a new sub-element
of <devices>, similar to how we have a <memballoon> subelement that
controls whether the memballoon device is shown to the guest, and just
document that for qemu, <on_crash> is a no-op without the <pvpanic>
subelement?
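
To make the second option concrete, here is a rough sketch (not libvirt code) of
the opt-in rule it implies: the pvpanic device is added only when the XML
explicitly asks for it under <devices>, and <on_crash> alone never changes guest
hardware. The <pvpanic/> element name is only the one floated above, the helper
pvpanic_args() is hypothetical, and the "-device isa-pvpanic" string is copied
from this thread rather than checked against any particular qemu version.

```python
import xml.etree.ElementTree as ET

# Device arguments as spelled earlier in this thread (an assumption, not
# verified against a specific qemu release).
PVPANIC_ARGS = ["-device", "isa-pvpanic"]


def pvpanic_args(domain_xml: str) -> list:
    """Return extra qemu arguments for the (hypothetical) <pvpanic/> opt-in.

    The <on_crash> value is deliberately ignored here: only an explicit
    <pvpanic/> child of <devices> adds hardware to the guest.
    """
    root = ET.fromstring(domain_xml)
    if root.find("./devices/pvpanic") is not None:
        return list(PVPANIC_ARGS)
    return []


# A pre-existing guest as virt-install would have written it: explicit
# 'restart', no pvpanic element, so no new hardware after an upgrade.
LEGACY_XML = """
<domain type='kvm'>
  <on_crash>restart</on_crash>
  <devices/>
</domain>
"""

# A guest that explicitly opts in via the proposed sub-element.
OPT_IN_XML = """
<domain type='kvm'>
  <on_crash>restart</on_crash>
  <devices>
    <pvpanic/>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(pvpanic_args(LEGACY_XML))
    print(pvpanic_args(OPT_IN_XML))
```

With this rule, both <on_crash>destroy</on_crash> and
<on_crash>restart</on_crash> stay valid for existing guests, and the only way to
change what the guest sees is to edit the XML.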
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org