[libvirt] [PATCH] qemu: Don't lose VM runtime state on libvirt downgrade

Run libvirtd from git with latest qemu, start a VM, stop libvirtd. Run an older libvirtd version an you may see an error like: qemuDomainObjPrivateXMLParse:857 : internal error: Unknown qemu capabilities flag device-tray-moved-event Libvirt finds a cached capabilities flag it doesn't understand, and fails to parse the VM runtime state. It now thinks the VM isn't running, when it is. This is potentially serious since it could lead to disk corruption if the VM is re-run. For the common case of unknown qemu capabilities flags, treat an unknown flag as non-fatal and continue on --- src/qemu/qemu_domain.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b0eb3b6..dbf8124 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1411,18 +1411,18 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, goto error; for (i = 0; i < n; i++) { + int flag; char *str = virXMLPropString(nodes[i], "name"); - if (str) { - int flag = virQEMUCapsTypeFromString(str); - if (flag < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unknown qemu capabilities flag %s"), str); - VIR_FREE(str); - goto error; - } - VIR_FREE(str); + if (!str) + continue; + + flag = virQEMUCapsTypeFromString(str); + if (flag < 0) { + VIR_WARN("Unknown qemu capabilities flag %s", str); + } else { virQEMUCapsSet(qemuCaps, flag); } + VIR_FREE(str); } priv->qemuCaps = qemuCaps; -- 2.7.4

On Sun, May 15, 2016 at 18:35:24 -0400, Cole Robinson wrote:
Run libvirtd from git with latest qemu, start a VM, stop libvirtd. Run an older libvirtd version an you may see an error like:
We explicitly don't support downgrades. It will be very hard to handle this correctly ... let me explain: [...]
--- src/qemu/qemu_domain.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b0eb3b6..dbf8124 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1411,18 +1411,18 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, goto error;
for (i = 0; i < n; i++) { + int flag; char *str = virXMLPropString(nodes[i], "name"); - if (str) { - int flag = virQEMUCapsTypeFromString(str); - if (flag < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unknown qemu capabilities flag %s"), str); - VIR_FREE(str); - goto error; - } - VIR_FREE(str); + if (!str) + continue; + + flag = virQEMUCapsTypeFromString(str); + if (flag < 0) { + VIR_WARN("Unknown qemu capabilities flag %s", str);
At this point the VM was created with a set of capabilities known by the newer libvirt version which may change the behavior in a way where only the new code can handle it. One of recent examples would be QEMU_CAPS_NAME_DEBUG_THREADS which broke one of the APIs returning stats about the thread due to change in the naming. The API was fixed along with the addition of the capability. If any previous version would contain this code the API would start to fail after a downgrade.
+ } else { virQEMUCapsSet(qemuCaps, flag); } + VIR_FREE(str); }
NACK to this approach. If you want to fix the disk corruption issue which is legitimate you need to kill the running VM process with missing capabilities. Making silently ignore new caps is asking for trouble and complications in the long run since we'd need to start to be forward compatible. One of the troublesome approaches could be introduced by QEMU_CAPS_OBJECT_SECRET which needs different handling in the hotplug code (not yet implemented). Also this would create a false sense of that we actually do support downgrades which I don't think we ever want to do. Peter

On 05/16/2016 02:59 AM, Peter Krempa wrote:
On Sun, May 15, 2016 at 18:35:24 -0400, Cole Robinson wrote:
Run libvirtd from git with latest qemu, start a VM, stop libvirtd. Run an older libvirtd version an you may see an error like:
We explicitly don't support downgrades. It will be very hard to handle this correctly ... let me explain:
[...]
--- src/qemu/qemu_domain.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b0eb3b6..dbf8124 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1411,18 +1411,18 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, goto error;
for (i = 0; i < n; i++) { + int flag; char *str = virXMLPropString(nodes[i], "name"); - if (str) { - int flag = virQEMUCapsTypeFromString(str); - if (flag < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unknown qemu capabilities flag %s"), str); - VIR_FREE(str); - goto error; - } - VIR_FREE(str); + if (!str) + continue; + + flag = virQEMUCapsTypeFromString(str); + if (flag < 0) { + VIR_WARN("Unknown qemu capabilities flag %s", str);
At this point the VM was created with a set of capabilities known by the newer libvirt version which may change the behavior in a way where only the new code can handle it.
One of recent examples would be QEMU_CAPS_NAME_DEBUG_THREADS which broke one of the APIs returning stats about the thread due to change in the naming. The API was fixed along with the addition of the capability. If any previous version would contain this code the API would start to fail after a downgrade.
+ } else { virQEMUCapsSet(qemuCaps, flag); } + VIR_FREE(str); }
NACK to this approach. If you want to fix the disk corruption issue which is legitimate you need to kill the running VM process with missing capabilities. Making silently ignore new caps is asking for trouble and complications in the long run since we'd need to start to be forward compatible.
One of the troublesome approaches could be introduced by QEMU_CAPS_OBJECT_SECRET which needs different handling in the hotplug code (not yet implemented).
Also this would create a false sense of that we actually do support downgrades which I don't think we ever want to do.
Makes sense, thanks for explaining. I'll drop this patch - Cole

On Mon, May 16, 2016 at 09:00:17AM -0400, Cole Robinson wrote:
On 05/16/2016 02:59 AM, Peter Krempa wrote:
On Sun, May 15, 2016 at 18:35:24 -0400, Cole Robinson wrote:
Run libvirtd from git with latest qemu, start a VM, stop libvirtd. Run an older libvirtd version an you may see an error like:
We explicitly don't support downgrades. It will be very hard to handle this correctly ... let me explain:
[...]
--- src/qemu/qemu_domain.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b0eb3b6..dbf8124 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1411,18 +1411,18 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, goto error;
for (i = 0; i < n; i++) { + int flag; char *str = virXMLPropString(nodes[i], "name"); - if (str) { - int flag = virQEMUCapsTypeFromString(str); - if (flag < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unknown qemu capabilities flag %s"), str); - VIR_FREE(str); - goto error; - } - VIR_FREE(str); + if (!str) + continue; + + flag = virQEMUCapsTypeFromString(str); + if (flag < 0) { + VIR_WARN("Unknown qemu capabilities flag %s", str);
At this point the VM was created with a set of capabilities known by the newer libvirt version which may change the behavior in a way where only the new code can handle it.
One of recent examples would be QEMU_CAPS_NAME_DEBUG_THREADS which broke one of the APIs returning stats about the thread due to change in the naming. The API was fixed along with the addition of the capability. If any previous version would contain this code the API would start to fail after a downgrade.
+ } else { virQEMUCapsSet(qemuCaps, flag); } + VIR_FREE(str); }
NACK to this approach. If you want to fix the disk corruption issue which is legitimate you need to kill the running VM process with missing capabilities. Making silently ignore new caps is asking for trouble and complications in the long run since we'd need to start to be forward compatible.
One of the troublesome approaches could be introduced by QEMU_CAPS_OBJECT_SECRET which needs different handling in the hotplug code (not yet implemented).
Also this would create a false sense of that we actually do support downgrades which I don't think we ever want to do.
Makes sense, thanks for explaining. I'll drop this patch
This comes up time & agin. Do either of you want to submit a patch putting a nice description of the reason for this in a comment in code for future reference. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, May 19, 2016 at 10:31:53 +0100, Daniel Berrange wrote:
On Mon, May 16, 2016 at 09:00:17AM -0400, Cole Robinson wrote:
On 05/16/2016 02:59 AM, Peter Krempa wrote:
On Sun, May 15, 2016 at 18:35:24 -0400, Cole Robinson wrote:
[...]
Makes sense, thanks for explaining. I'll drop this patch
This comes up time & agin. Do either of you want to submit a patch putting a nice description of the reason for this in a comment in code for future reference.
I've posted a series that adds checks that are executed only in certain paths which will be the correct place to put stuff like this. I need to address some review comments though. Peter
participants (3)
-
Cole Robinson
-
Daniel P. Berrange
-
Peter Krempa