[PATCH] qemu_driver: Don't handle the EOF event if monitor changed

Currently, libvirt creates a thread pool with only on thread to handle all qemu monitor events for virtual machines, In the cases that if the thread gets stuck while handling a monitor EOF event, such as unable to kill the virtual machine process or release resources, the events of other virtual machine will be also blocked, which will lead to the abnormal behavior of other virtual machines. For instance, when another virtual machine completes a shutdown operation and the monitor EOF event has been queued but remains unprocessed, we immediately destroy and start the virtual machine again, at a later time when EOF event get processed, the processMonitorEOFEvent() will kill the virtual machine that just started. To address this issue, in the processMonitorEOFEvent(), we check whether the current virtual machine's monitor object matches the one at the time the event was generated. If they do not match, we immediately return. Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn> Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn> --- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_driver.c | 11 +++++++++-- src/qemu/qemu_process.c | 2 +- 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index ae19ce884b..bc80afdfac 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11500,7 +11500,6 @@ qemuProcessEventFree(struct qemuProcessEvent *event) case QEMU_PROCESS_EVENT_NETDEV_STREAM_DISCONNECTED: case QEMU_PROCESS_EVENT_NIC_RX_FILTER_CHANGED: case QEMU_PROCESS_EVENT_SERIAL_CHANGED: - case QEMU_PROCESS_EVENT_MONITOR_EOF: case QEMU_PROCESS_EVENT_GUEST_CRASHLOADED: g_free(event->data); break; @@ -11514,6 +11513,7 @@ qemuProcessEventFree(struct qemuProcessEvent *event) case QEMU_PROCESS_EVENT_UNATTENDED_MIGRATION: case QEMU_PROCESS_EVENT_RESET: case QEMU_PROCESS_EVENT_NBDKIT_EXITED: + case QEMU_PROCESS_EVENT_MONITOR_EOF: case QEMU_PROCESS_EVENT_LAST: break; } diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d00d2a27c6..57acafb48b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -3854,7 +3854,8 @@ processJobStatusChangeEvent(virDomainObj *vm, static void processMonitorEOFEvent(virQEMUDriver *driver, - virDomainObj *vm) + virDomainObj *vm, + qemuMonitor *mon) { qemuDomainObjPrivate *priv = vm->privateData; int eventReason = VIR_DOMAIN_EVENT_STOPPED_SHUTDOWN; @@ -3863,6 +3864,12 @@ processMonitorEOFEvent(virQEMUDriver *driver, unsigned int stopFlags = 0; virObjectEvent *event = NULL; + if (priv->mon != mon) { + VIR_DEBUG("Domain %p '%s' has been shutdown, ignoring EOF", + vm, vm->def->name); + return; + } + if (qemuProcessBeginStopJob(vm, VIR_JOB_DESTROY, true) < 0) return; @@ -4082,7 +4089,7 @@ static void qemuProcessEventHandler(void *data, void *opaque) processJobStatusChangeEvent(vm, processEvent->data); break; case QEMU_PROCESS_EVENT_MONITOR_EOF: - processMonitorEOFEvent(driver, vm); + processMonitorEOFEvent(driver, vm, processEvent->data); break; case QEMU_PROCESS_EVENT_PR_DISCONNECT: processPRDisconnectEvent(vm); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index b9267d8699..4de6e5c234 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -316,7 +316,7 @@ qemuProcessHandleMonitorEOF(qemuMonitor *mon, } qemuProcessEventSubmit(vm, QEMU_PROCESS_EVENT_MONITOR_EOF, - 0, 0, NULL); + 0, 0, priv->mon); /* We don't want this EOF handler to be called over and over while the * thread is waiting for a job. -- 2.17.1

On a Saturday in 2023, Guoyi Tu wrote:
Currently, libvirt creates a thread pool with only on thread to handle all qemu monitor events for virtual machines, In the cases that if the thread gets stuck while handling a monitor EOF event, such as unable to kill the virtual machine process or release resources, the events of other virtual machine will be also blocked, which will lead to the abnormal behavior of other virtual machines.
For instance, when another virtual machine completes a shutdown operation and the monitor EOF event has been queued but remains unprocessed, we immediately destroy and start the virtual machine again, at a later time when EOF event get processed, the processMonitorEOFEvent() will kill the virtual machine that just started.
To address this issue, in the processMonitorEOFEvent(), we check whether the current virtual machine's monitor object matches the one at the time the event was generated. If they do not match, we immediately return.
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn> Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn> --- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_driver.c | 11 +++++++++-- src/qemu/qemu_process.c | 2 +- 3 files changed, 11 insertions(+), 4 deletions(-)
The patch looks reasonable to me, but I cannot apply it: warning: Patch sent with format=flowed; space at the end of lines might be lost. Applying: qemu_driver: Don't handle the EOF event if monitor changed error: corrupt patch at line 32 Patch failed at 0001 qemu_driver: Don't handle the EOF event if monitor changed Do you have it applied somewhere on GitLab or can you try sending the output of 'git format-patch' as an attachment? Thank you, Jano

Thank you for your reminder, I'll send another patch On 2023/11/28 23:36, Ján Tomko wrote:
On a Saturday in 2023, Guoyi Tu wrote:
Currently, libvirt creates a thread pool with only on thread to handle all qemu monitor events for virtual machines, In the cases that if the thread gets stuck while handling a monitor EOF event, such as unable to kill the virtual machine process or release resources, the events of other virtual machine will be also blocked, which will lead to the abnormal behavior of other virtual machines.
For instance, when another virtual machine completes a shutdown operation and the monitor EOF event has been queued but remains unprocessed, we immediately destroy and start the virtual machine again, at a later time when EOF event get processed, the processMonitorEOFEvent() will kill the virtual machine that just started.
To address this issue, in the processMonitorEOFEvent(), we check whether the current virtual machine's monitor object matches the one at the time the event was generated. If they do not match, we immediately return.
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn> Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn> --- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_driver.c | 11 +++++++++-- src/qemu/qemu_process.c | 2 +- 3 files changed, 11 insertions(+), 4 deletions(-)
The patch looks reasonable to me, but I cannot apply it:
warning: Patch sent with format=flowed; space at the end of lines might be lost. Applying: qemu_driver: Don't handle the EOF event if monitor changed error: corrupt patch at line 32 Patch failed at 0001 qemu_driver: Don't handle the EOF event if monitor changed
Do you have it applied somewhere on GitLab or can you try sending the output of 'git format-patch' as an attachment?
Thank you, Jano
participants (2)
-
Guoyi Tu
-
Ján Tomko