
On 7/25/24 14:18, Daniel P. Berrangé wrote:
On Thu, Jul 25, 2024 at 12:57:59PM +0200, Michal Privoznik wrote:
Currently, qemuProcessStop() unlocks given domain object right in the middle of cleanup process. This is dangerous because there might be another thread which is executing virDomainObjListAdd(). And since the domain object is on the list of domain objects AND by the time qemuProcessStop() unlocks it the object is also marked as inactive, the other thread acquires the lock and switches vm->def pointer.
The unlocking of domain object is needed though, to allow even processing thread finish its queue. Well, the processing can be done before any cleanup is attempted.
Therefore, use freshly introduced virEventThreadStop() to join the event thread and drop lock/unlock from the middle of qemuProcessStop().
Fixes: 3865410e7f67ca4ec66e9a905e75f452762a97f0 Resolves: https://issues.redhat.com/browse/RHEL-49607 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 6255ba92e7..0869307a90 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -8615,6 +8615,14 @@ void qemuProcessStop(virQEMUDriver *driver, * reporting so we don't squash a legit error. */ virErrorPreserveLast(&orig_err);
+ /* By unlocking the domain object the events processing thread is + * allowed to finish its job. */ + if (priv->eventThread) { + virObjectUnlock(vm); + virEventThreadStop(priv->eventThread); + virObjectLock(vm); + } +
I wonder if this is a little too early.
We don't call qemuMOnitorClose until a short while later, just after we have done qemuProcessKill.
If we stop the event loop here, then I worry that we're at risk of missing some final monitor events that might be desirable ? eg the final lifecycle events indicating that we're stopping/stopped ?
But can we close the monitor without a job? And IIUC, we shouldn't try to stop the event loop thread with a job held (as those callbacks try to acquire _MODIFY job on their own). Michal