On 7/25/24 14:18, Daniel P. Berrangé wrote:
On Thu, Jul 25, 2024 at 12:57:59PM +0200, Michal Privoznik wrote:
> Currently, qemuProcessStop() unlocks given domain object right in
> the middle of cleanup process. This is dangerous because there
> might be another thread which is executing virDomainObjListAdd().
> And since the domain object is on the list of domain objects AND
> by the time qemuProcessStop() unlocks it the object is also
> marked as inactive, the other thread acquires the lock and
> switches vm->def pointer.
>
> The unlocking of domain object is needed though, to allow even
> processing thread finish its queue. Well, the processing can be
> done before any cleanup is attempted.
>
> Therefore, use freshly introduced virEventThreadStop() to join
> the event thread and drop lock/unlock from the middle of
> qemuProcessStop().
>
> Fixes: 3865410e7f67ca4ec66e9a905e75f452762a97f0
> Resolves:
https://issues.redhat.com/browse/RHEL-49607
> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
> ---
> src/qemu/qemu_process.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 6255ba92e7..0869307a90 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -8615,6 +8615,14 @@ void qemuProcessStop(virQEMUDriver *driver,
> * reporting so we don't squash a legit error. */
> virErrorPreserveLast(&orig_err);
>
> + /* By unlocking the domain object the events processing thread is
> + * allowed to finish its job. */
> + if (priv->eventThread) {
> + virObjectUnlock(vm);
> + virEventThreadStop(priv->eventThread);
> + virObjectLock(vm);
> + }
> +
I wonder if this is a little too early.
We don't call qemuMOnitorClose until a short while later,
just after we have done qemuProcessKill.
If we stop the event loop here, then I worry that we're
at risk of missing some final monitor events that might
be desirable ? eg the final lifecycle events indicating
that we're stopping/stopped ?
But can we close the monitor without a job? And IIUC, we shouldn't try
to stop the event loop thread with a job held (as those callbacks try to
acquire _MODIFY job on their own).
Michal