[PATCH 0/4] qemu_agent: Rework domain object handling in open

The most important patch is the first one because it fixes a race condition. The rest is just a cleanup I've noticed while looking at the code. Michal Prívozník (4): qemu_agent: Rework domain object locking when opening agent qemuAgentOpen: Rework domain object refcounting qemu_agent: Drop destroy callback qemuMonitorOpen: Rework domain object refcounting src/qemu/qemu_agent.c | 17 ++++++++--------- src/qemu/qemu_agent.h | 2 -- src/qemu/qemu_monitor.c | 5 ----- src/qemu/qemu_process.c | 20 -------------------- 4 files changed, 8 insertions(+), 36 deletions(-) -- 2.32.0

Just like qemuMonitorOpen(), hold the domain object locked throughout the whole time of qemuConnectAgent() and unlock it only for a brief time of actual connect() (because this is the only part that has a potential of blocking). The reason is that qemuAgentOpen() does access domain object (well, its privateData) AND also at least one argument (@context) depends on domain object. Accessing these without the lock is potentially dangerous. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1845468#c12 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_agent.c | 3 +++ src/qemu/qemu_process.c | 4 ---- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 5f421be6f6..166cfaf485 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -704,7 +704,10 @@ qemuAgentOpen(virDomainObj *vm, goto cleanup; } + virObjectUnlock(vm); agent->fd = qemuAgentOpenUnix(config->data.nix.path); + virObjectLock(vm); + if (agent->fd == -1) goto cleanup; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d5f8a47ac2..d2ea9b55fe 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -238,16 +238,12 @@ qemuConnectAgent(virQEMUDriver *driver, virDomainObj *vm) * deleted while the agent is active */ virObjectRef(vm); - virObjectUnlock(vm); - agent = qemuAgentOpen(vm, config->source, virEventThreadGetContext(priv->eventThread), &agentCallbacks, virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VSERPORT_CHANGE)); - virObjectLock(vm); - if (agent == NULL) virObjectUnref(vm); -- 2.32.0

Currently, when opening an agent socket the qemuConnectAgent() increments domain object refcounter and calls qemuAgentOpen() where the domain object pointer is simply stored inside _qemuAgent struct. If qemuAgentOpen() fails, then it clears @cb member only to avoid qemuProcessHandleAgentDestroy() being called (which decrements the domain object refcounter) and the domain object refcounter is then decreased explicitly in qemuConnectAgent(). The same result can be achieved with much cleaner code: increment the refcounter inside qemuAgentOpen() and drop the dance around @cb. Also, the comment in qemuConnectAgent() about holding an extra reference is not correct. The thread that called qemuConnectAgent() already holds a reference to the domain object. No matter how many time the object is locked and unlocked the reference counter can't be decreased. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_agent.c | 14 +++++--------- src/qemu/qemu_process.c | 7 ------- 2 files changed, 5 insertions(+), 16 deletions(-) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 166cfaf485..8bbaa127d4 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -171,9 +171,11 @@ qemuAgentEscapeNonPrintable(const char *text) static void qemuAgentDispose(void *obj) { qemuAgent *agent = obj; + VIR_DEBUG("agent=%p", agent); - if (agent->cb && agent->cb->destroy) - (agent->cb->destroy)(agent, agent->vm); + + if (agent->vm) + virObjectUnref(agent->vm); virCondDestroy(&agent->notify); g_free(agent->buffer); g_main_context_unref(agent->context); @@ -693,7 +695,7 @@ qemuAgentOpen(virDomainObj *vm, virObjectUnref(agent); return NULL; } - agent->vm = vm; + agent->vm = virObjectRef(vm); agent->cb = cb; agent->singleSync = singleSync; @@ -729,12 +731,6 @@ qemuAgentOpen(virDomainObj *vm, return agent; cleanup: - /* We don't want the 'destroy' callback invoked during - * cleanup from construction failure, because that can - * give a double-unref on virDomainObj *in the caller, - * so kill the callbacks now. - */ - agent->cb = NULL; qemuAgentClose(agent); return NULL; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d2ea9b55fe..b1fd72d269 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -234,19 +234,12 @@ qemuConnectAgent(virQEMUDriver *driver, virDomainObj *vm) goto cleanup; } - /* Hold an extra reference because we can't allow 'vm' to be - * deleted while the agent is active */ - virObjectRef(vm); - agent = qemuAgentOpen(vm, config->source, virEventThreadGetContext(priv->eventThread), &agentCallbacks, virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VSERPORT_CHANGE)); - if (agent == NULL) - virObjectUnref(vm); - if (!virDomainObjIsActive(vm)) { qemuAgentClose(agent); virReportError(VIR_ERR_INTERNAL_ERROR, "%s", -- 2.32.0

After previous cleanups this callback is unused. Remove it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_agent.h | 2 -- src/qemu/qemu_process.c | 9 --------- 2 files changed, 11 deletions(-) diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h index 81b45b8e5d..b5954e76e7 100644 --- a/src/qemu/qemu_agent.h +++ b/src/qemu/qemu_agent.h @@ -28,8 +28,6 @@ typedef struct _qemuAgent qemuAgent; typedef struct _qemuAgentCallbacks qemuAgentCallbacks; struct _qemuAgentCallbacks { - void (*destroy)(qemuAgent *mon, - virDomainObj *vm); void (*eofNotify)(qemuAgent *mon, virDomainObj *vm); void (*errorNotify)(qemuAgent *mon, diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index b1fd72d269..ba9e9a5aa0 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -193,17 +193,8 @@ qemuProcessHandleAgentError(qemuAgent *agent G_GNUC_UNUSED, virObjectUnlock(vm); } -static void qemuProcessHandleAgentDestroy(qemuAgent *agent, - virDomainObj *vm) -{ - VIR_DEBUG("Received destroy agent=%p vm=%p", agent, vm); - - virObjectUnref(vm); -} - static qemuAgentCallbacks agentCallbacks = { - .destroy = qemuProcessHandleAgentDestroy, .eofNotify = qemuProcessHandleAgentEOF, .errorNotify = qemuProcessHandleAgentError, }; -- 2.32.0

Similarly to one of previous commits, there's no need to increment domain object refcounter before unlocking it. Any number of lock and unlock calls over domain object has no effect on the refcounter. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_monitor.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 908ee0d302..7559020001 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -769,10 +769,6 @@ qemuMonitorOpen(virDomainObj *vm, timeout += QEMU_DEFAULT_MONITOR_WAIT; - /* Hold an extra reference because we can't allow 'vm' to be - * deleted until the monitor gets its own reference. */ - virObjectRef(vm); - if (config->type != VIR_DOMAIN_CHR_TYPE_UNIX) { virReportError(VIR_ERR_INTERNAL_ERROR, _("unable to handle monitor type: %s"), @@ -798,7 +794,6 @@ qemuMonitorOpen(virDomainObj *vm, cleanup: if (!ret) VIR_FORCE_CLOSE(fd); - virObjectUnref(vm); return ret; } -- 2.32.0

On a Friday in 2021, Michal Privoznik wrote:
Similarly to one of previous commits, there's no need to increment domain object refcounter before unlocking it. Any number of lock and unlock calls over domain object has no effect on the refcounter.
There is no need because the caller holds a reference. The locks having no effect on the refcounter is not required, since they would be balanced anyway. Jano
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_monitor.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 908ee0d302..7559020001 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -769,10 +769,6 @@ qemuMonitorOpen(virDomainObj *vm,
timeout += QEMU_DEFAULT_MONITOR_WAIT;
- /* Hold an extra reference because we can't allow 'vm' to be - * deleted until the monitor gets its own reference. */ - virObjectRef(vm); - if (config->type != VIR_DOMAIN_CHR_TYPE_UNIX) { virReportError(VIR_ERR_INTERNAL_ERROR, _("unable to handle monitor type: %s"), @@ -798,7 +794,6 @@ qemuMonitorOpen(virDomainObj *vm, cleanup: if (!ret) VIR_FORCE_CLOSE(fd); - virObjectUnref(vm); return ret; }
-- 2.32.0

On a Friday in 2021, Michal Privoznik wrote:
The most important patch is the first one because it fixes a race condition. The rest is just a cleanup I've noticed while looking at the code.
Michal Prívozník (4): qemu_agent: Rework domain object locking when opening agent qemuAgentOpen: Rework domain object refcounting qemu_agent: Drop destroy callback qemuMonitorOpen: Rework domain object refcounting
src/qemu/qemu_agent.c | 17 ++++++++--------- src/qemu/qemu_agent.h | 2 -- src/qemu/qemu_monitor.c | 5 ----- src/qemu/qemu_process.c | 20 -------------------- 4 files changed, 8 insertions(+), 36 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano
participants (2)
-
Ján Tomko
-
Michal Privoznik