On Fri, Nov 25, 2016 at 09:19:18 +0100, Boris Fiuczynski wrote:
(Re-)Starting libvirt on a system with running qemu domains which earlier
had been successfully started by libvirt results in the error
internal error: unable to execute QEMU command 'query-hotpluggable-cpus':
The feature 'query-hotpluggable-cpus' is not enabled
if the qemu binary does not support the qmp command 'query-hotpluggable-cpus'.
As libvirt tries to reconnect to the running qemu domains, it reads in the
capabilities, but qemuProcessReconnect does not run
virQEMUCapsCacheLookupCopy and therefore does not clear the
query-hotpluggable-cpus capability in virQEMUCapsFilterByMachineType,
which was introduced with commit 920bbe5c.
Libvirt therefore issues the qmp command and qemu responds with the error
"The feature 'query-hotpluggable-cpus' is not enabled".
As a consequence libvirt terminates the running qemu process since it
determines that it cannot reconnect to the domain.
Signed-off-by: Boris Fiuczynski <fiuczy(a)linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay(a)linux.vnet.ibm.com>
---
Due to the severity of this issue I recommend backporting this fix
to all maintenance releases up to v2.2.0.
src/qemu/qemu_process.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index f8f379a..675f5b5 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3349,8 +3349,7 @@ qemuProcessReconnect(void *opaque)
/* If upgrading from old libvirtd we won't have found any
* caps in the domain status, so re-query them
At reconnect the capabilities are taken from the status XML file, where
they are saved separately for each instance. This code is supposed to run
*/
- if (!priv->qemuCaps &&
- !(priv->qemuCaps = virQEMUCapsCacheLookupCopy(caps,
+ if (!(priv->qemuCaps = virQEMUCapsCacheLookupCopy(caps,
driver->qemuCapsCache,
obj->def->emulator,
obj->def->os.machine)))
NACK, this is certainly not the right fix. Does the status XML have the
'query-hotpluggable-cpus' capability set? If so, it was probably
misdetected when the VM was started and should be fixed there.
If the capability is not set there, then the reconnect code is broken
somehow.
At any rate we should not re-detect the capabilities if they were
reloaded properly from the XML.
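
To illustrate the guard being defended here, the following is a minimal sketch, not libvirt code. The types Caps and DomainPriv and the functions cache_lookup_copy(), reconnect_caps() and demo() are hypothetical stand-ins for priv->qemuCaps and virQEMUCapsCacheLookupCopy(), and the machine-type filtering is left out. It only shows that the cache is consulted when the status XML supplied no capabilities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not libvirt's types. */
typedef struct {
    bool hotpluggable_cpus;
} Caps;

typedef struct {
    Caps *caps;   /* NULL when the status XML carried no capabilities */
} DomainPriv;

int lookups = 0;                               /* counts cache lookups */
Caps cached = { .hotpluggable_cpus = false };  /* pretend cache entry */

/* Hypothetical replacement for the capability cache lookup. */
Caps *cache_lookup_copy(void)
{
    lookups++;
    return &cached;
}

/* Re-query the capabilities only if none were loaded from the status XML.
 * Returns 0 on success, -1 if the lookup fails. */
int reconnect_caps(DomainPriv *priv)
{
    if (!priv->caps && !(priv->caps = cache_lookup_copy()))
        return -1;
    return 0;
}

/* Small self-check: caps loaded from the XML are kept untouched. */
int demo(void)
{
    Caps from_xml = { .hotpluggable_cpus = true };
    DomainPriv priv = { .caps = &from_xml };
    int before = lookups;

    assert(reconnect_caps(&priv) == 0);
    assert(priv.caps == &from_xml);
    return lookups - before;   /* number of cache lookups performed */
}
```

With this shape, a wrong capability stored in the status XML survives the reconnect, which is why the detection at VM start is the place to look.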
Peter