'qemuDomainObjStopWorker()' which is meant to dispose of the event loop
thread for the monitor unlocks the VM object while disposing the thread
to prevent possible deadlocks with events waiting on the monitor thread.
Unfortunately 'qemuDomainObjStopWorker()' is called *before* the VM is
marked as inactive by clearing 'vm->def->id', but at the same time it's
no longer marked as 'beingDestroyed' when we're inside
'qemuProcessStop()'.
If 'vm' would be kept locked this wouldn't be a problem. Same way it's
not a problem for anything that uses non-ASYNC VM jobs, or when the
monitor is accessed in an async job, as the 'destroy' job interlocks
with those.
It is a problem for code inside an async job which uses
'qemuDomainObjWait()' though. The API contract of qemuDomainObjWait()
ensures the caller that the VM on successful return from it, but in this
specific reason it's not the case, as both 'beingDestroyed' is already
false, and 'vm->def->id' is not yet cleared.
To fix the issue move the 'qemuDomainObjStopWorker()' call *after*
clearing 'vm->def->id' and also add a note stating what the function is
doing.
Fixes: 860a999802d3c82538373bb3f314f92a2e258754
Closes:
https://gitlab.com/libvirt/libvirt/-/issues/640
Reported-by: grass-lu <
https://gitlab.com/grass-lu>
Signed-off-by: Peter Krempa <pkrempa(a)redhat.com>
---
src/qemu/qemu_process.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 7ef7040a85..aef83d2409 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -8527,8 +8527,6 @@ void qemuProcessStop(virQEMUDriver *driver,
g_clear_pointer(&priv->monConfig, virObjectUnref);
}
- qemuDomainObjStopWorker(vm);
-
/* Remove the master key */
qemuDomainMasterKeyRemove(priv);
@@ -8562,6 +8560,11 @@ void qemuProcessStop(virQEMUDriver *driver,
/* Wake up anything waiting on domain condition */
virDomainObjBroadcast(vm);
+ /* IMPORTANT: qemuDomainObjStopWorker() unlocks @vm in order to prevent
+ * deadlocks with the per-VM event loop thread. This MUST be done after
+ * marking the VM as dead */
+ qemuDomainObjStopWorker(vm);
+
virFileDeleteTree(priv->libDir);
virFileDeleteTree(priv->channelTargetDir);
--
2.45.2