[libvirt] [PATCH 0/3] Some qemu monitor crashers fix

The patches are pretty independent with each other. Michal Privoznik (3): qemu: Avoid double free of VM qemu: Don't access vm->priv on unlocked domain qemuMonitorDispose: Reset lastError src/qemu/qemu_migration.c | 9 +++++---- src/qemu/qemu_monitor.c | 1 + src/qemu/qemu_process.c | 9 --------- 3 files changed, 6 insertions(+), 13 deletions(-) -- 1.8.1.5

One of my previous patches (c7ac2519b7f) did try to fix the issue when domain dies too soon during migration. However, this clumsy approach was missing removal of qemuProcessHandleMonitorDestroy resulting in double unrefing of mon->vm and hence producing the daemon crash: ==11843== Invalid read of size 4 ==11843== at 0x50C28C5: virObjectUnref (virobject.c:255) ==11843== by 0x1148F7DB: qemuMonitorDispose (qemu_monitor.c:258) ==11843== by 0x50C2991: virObjectUnref (virobject.c:262) ==11843== by 0x50C2D13: virObjectFreeCallback (virobject.c:388) ==11843== by 0x509C37B: virEventPollCleanupHandles (vireventpoll.c:583) ==11843== by 0x509C711: virEventPollRunOnce (vireventpoll.c:652) ==11843== by 0x509A620: virEventRunDefaultImpl (virevent.c:274) ==11843== by 0x520D21C: virNetServerRun (virnetserver.c:1112) ==11843== by 0x11F368: main (libvirtd.c:1513) ==11843== Address 0x13b88864 is 4 bytes inside a block of size 136 free'd ==11843== at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==11843== by 0x5079A2F: virFree (viralloc.c:580) ==11843== by 0x50C29E3: virObjectUnref (virobject.c:270) ==11843== by 0x114770E4: qemuProcessHandleMonitorDestroy (qemu_process.c:1103) ==11843== by 0x1148F7CB: qemuMonitorDispose (qemu_monitor.c:257) ==11843== by 0x50C2991: virObjectUnref (virobject.c:262) ==11843== by 0x50C2D13: virObjectFreeCallback (virobject.c:388) ==11843== by 0x509C37B: virEventPollCleanupHandles (vireventpoll.c:583) ==11843== by 0x509C711: virEventPollRunOnce (vireventpoll.c:652) ==11843== by 0x509A620: virEventRunDefaultImpl (virevent.c:274) ==11843== by 0x520D21C: virNetServerRun (virnetserver.c:1112) ==11843== by 0x11F368: main (libvirtd.c:1513) Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 9 --------- 1 file changed, 9 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 7c23eb4..3de4e53 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1095,14 +1095,6 @@ error: return -1; } - -static void qemuProcessHandleMonitorDestroy(qemuMonitorPtr mon ATTRIBUTE_UNUSED, - virDomainObjPtr vm, - void *opaque ATTRIBUTE_UNUSED) -{ - virObjectUnref(vm); -} - static int qemuProcessHandleTrayChange(qemuMonitorPtr mon ATTRIBUTE_UNUSED, virDomainObjPtr vm, @@ -1366,7 +1358,6 @@ cleanup: static qemuMonitorCallbacks monitorCallbacks = { - .destroy = qemuProcessHandleMonitorDestroy, .eofNotify = qemuProcessHandleMonitorEOF, .errorNotify = qemuProcessHandleMonitorError, .diskSecretLookup = qemuProcessFindVolumeQcowPassphrase, -- 1.8.1.5

Since 86d90b3a (yes, my patch; again) we are supporting NBD storage migration. However, on error recovery path we got the steps reversed. The correct order is: return NBD port to the virPortAllocator and then either unlock the vm or remove it from the driver. Not vice versa. ==11192== Invalid write of size 4 ==11192== at 0x11488559: qemuMigrationPrepareAny (qemu_migration.c:2459) ==11192== by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652) ==11192== by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332) ==11192== by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290) ==11192== by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798) ==11192== by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741) ==11192== by 0x5212127: virNetServerProgramDispatchCall (virnetserverprogram.c:435) ==11192== by 0x5211C86: virNetServerProgramDispatch (virnetserverprogram.c:305) ==11192== by 0x520A8FD: virNetServerProcessMsg (virnetserver.c:165) ==11192== by 0x520A9E1: virNetServerHandleJob (virnetserver.c:186) ==11192== by 0x50DA78F: virThreadPoolWorker (virthreadpool.c:144) ==11192== by 0x50DA11C: virThreadHelper (virthreadpthread.c:161) ==11192== Address 0x1368baa0 is 576 bytes inside a block of size 688 free'd ==11192== at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==11192== by 0x5079A2F: virFree (viralloc.c:580) ==11192== by 0x11456C34: qemuDomainObjPrivateFree (qemu_domain.c:267) ==11192== by 0x50F41B4: virDomainObjDispose (domain_conf.c:2034) ==11192== by 0x50C2991: virObjectUnref (virobject.c:262) ==11192== by 0x50F4CFC: virDomainObjListRemove (domain_conf.c:2361) ==11192== by 0x1145C125: qemuDomainRemoveInactive (qemu_domain.c:2087) ==11192== by 0x11488520: qemuMigrationPrepareAny (qemu_migration.c:2456) ==11192== by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652) ==11192== by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332) ==11192== by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290) ==11192== by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798) Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 4f35a7a..13d0808 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2450,13 +2450,14 @@ cleanup: VIR_FORCE_CLOSE(dataFD[0]); VIR_FORCE_CLOSE(dataFD[1]); if (vm) { - if (ret >= 0 || vm->persistent) - virObjectUnlock(vm); - else - qemuDomainRemoveInactive(driver, vm); if (ret < 0) { virPortAllocatorRelease(driver->remotePorts, priv->nbdPort); priv->nbdPort = 0; + } else { + if (vm->persistent) + virObjectUnlock(vm); + else + qemuDomainRemoveInactive(driver, vm); } } if (event) -- 1.8.1.5

On 11/06/2013 02:06 PM, Michal Privoznik wrote:
Since 86d90b3a (yes, my patch; again) we are supporting NBD storage migration. However, on error recovery path we got the steps reversed. The correct order is: return NBD port to the virPortAllocator and then either unlock the vm or remove it from the driver. Not vice versa.
...
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
ACK

On 11/06/2013 02:06 PM, Michal Privoznik wrote:
Since 86d90b3a (yes, my patch; again) we are supporting NBD storage migration. However, on error recovery path we got the steps reversed. The correct order is: return NBD port to the virPortAllocator and then either unlock the vm or remove it from the driver. Not vice versa.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 4f35a7a..13d0808 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2450,13 +2450,14 @@ cleanup: VIR_FORCE_CLOSE(dataFD[0]); VIR_FORCE_CLOSE(dataFD[1]); if (vm) { - if (ret >= 0 || vm->persistent) - virObjectUnlock(vm); - else - qemuDomainRemoveInactive(driver, vm); if (ret < 0) { virPortAllocatorRelease(driver->remotePorts, priv->nbdPort); priv->nbdPort = 0; + } else { + if (vm->persistent) + virObjectUnlock(vm); + else + qemuDomainRemoveInactive(driver, vm); } } if (event)
On second thought, this doesn't unlock the VM if (ret < 0) && (vm->persistent).

Since the 90139a62 commit the error is copied into mon->lastError but it's never freed from there. ==31989== 395 bytes in 1 blocks are definitely lost in loss record 877 of 978 ==31989== at 0x4A06C2B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==31989== by 0x7EAF129: strdup (in /lib64/libc-2.15.so) ==31989== by 0x50D586C: virStrdup (virstring.c:554) ==31989== by 0x50976C1: virCopyError (virerror.c:191) ==31989== by 0x5097A35: virCopyLastError (virerror.c:312) ==31989== by 0x114909A9: qemuMonitorIO (qemu_monitor.c:690) ==31989== by 0x509BEDE: virEventPollDispatchHandles (vireventpoll.c:501) ==31989== by 0x509C701: virEventPollRunOnce (vireventpoll.c:648) ==31989== by 0x509A620: virEventRunDefaultImpl (virevent.c:274) ==31989== by 0x520D21C: virNetServerRun (virnetserver.c:1112) ==31989== by 0x11F368: main (libvirtd.c:1513) Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_monitor.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 2bafe28..0e520a0 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -257,6 +257,7 @@ static void qemuMonitorDispose(void *obj) (mon->cb->destroy)(mon, mon->vm, mon->callbackOpaque); virObjectUnref(mon->vm); + virResetError(&mon->lastError); virCondDestroy(&mon->notify); VIR_FREE(mon->buffer); virJSONValueFree(mon->options); -- 1.8.1.5

On 11/06/2013 02:06 PM, Michal Privoznik wrote:
Since the 90139a62 commit the error is copied into mon->lastError but it's never freed from there.
...
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_monitor.c | 1 + 1 file changed, 1 insertion(+)
ACK
participants (2)
-
Ján Tomko
-
Michal Privoznik