[libvirt] [PATCH 0/2] fix some bugs in libvirtd

Steps to reproduce this bug:
1. service libvirtd start
2. virsh start <domain>
3. kill -STOP $(cat /var/run/libvirt/qemu/<domain>.pid)
4. service libvirtd restart
5. kill -9 $(cat /var/run/libvirt/qemu/<domain>.pid)

Then libvirtd will core dump or deadlock. Make sure that JSON support is built into libvirt and that the QEMU version is newer than 0.13.0.

Wen Congyang (2):
  avoid vm to be deleted if qemuConnectMonitor failed
  avoid closing monitor twice

 src/qemu/qemu_driver.c  | 31 ++++++++++++++++++++++---------
 src/qemu/qemu_monitor.c | 10 ++++++++++
 2 files changed, 32 insertions(+), 9 deletions(-)

The reason libvirtd core dumps is this: we take a reference on vm (vm->refs) when we allocate the memory, and that reference is dropped in qemuHandleMonitorEOF() in another thread. We take another reference in qemuConnectMonitor() and drop it when the vm becomes inactive.

libvirtd blocks in qemuMonitorSetCapabilities() because the vm has been stopped with SIGSTOP. At this point vm->refs is 2. Then we kill the vm with SIGKILL. qemuMonitorSetCapabilities() fails, and we drop vm->refs in qemuMonitorClose(). In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called. If qemuHandleMonitorEOF() drops its reference before qemuConnectMonitor() returns, vm->refs reaches 0 and the memory is freed.

Because qemuConnectMonitor() failed, we then call qemudShutdownVMDaemon(). The memory has already been freed, so qemudShutdownVMDaemon() is dangerous: we dereference a NULL pointer in virDomainConfVMNWFilterTeardown():
=============
void
virDomainConfVMNWFilterTeardown(virDomainObjPtr vm) {
    int i;

    if (nwfilterDriver != NULL) {
        for (i = 0; i < vm->def->nnets; i++)
            virDomainConfNWFilterTeardown(vm->def->nets[i]);
    }
}
============
vm->def->nnets is not 0 but vm->def->nets is NULL (we do not set vm->def->nnets to 0 when we free vm).

We should take an extra reference on vm so that it cannot be deleted if qemuConnectMonitor() fails.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 src/qemu/qemu_driver.c | 31 ++++++++++++++++++++++---------
 1 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 34cc29f..613c2d4 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -930,6 +930,10 @@ qemuReconnectDomain(void *payload, const char *name ATTRIBUTE_UNUSED, void *opaq
     priv = obj->privateData;
 
+    /* Hold an extra reference because we can't allow 'vm' to be
+     * deleted if qemuConnectMonitor() failed */
+    virDomainObjRef(obj);
+
     /* XXX check PID liveliness & EXE path */
     if (qemuConnectMonitor(driver, obj) < 0)
         goto error;
@@ -961,18 +965,27 @@ qemuReconnectDomain(void *payload, const char *name ATTRIBUTE_UNUSED, void *opaq
     if (obj->def->id >= driver->nextvmid)
         driver->nextvmid = obj->def->id + 1;
 
-    virDomainObjUnlock(obj);
+    if (virDomainObjUnref(obj) > 0)
+        virDomainObjUnlock(obj);
     return;
 
 error:
-    /* We can't get the monitor back, so must kill the VM
-     * to remove danger of it ending up running twice if
-     * user tries to start it again later */
-    qemudShutdownVMDaemon(driver, obj, 0);
-    if (!obj->persistent)
-        virDomainRemoveInactive(&driver->domains, obj);
-    else
-        virDomainObjUnlock(obj);
+    if (!virDomainObjIsActive(obj)) {
+        if (virDomainObjUnref(obj) > 0)
+            virDomainObjUnlock(obj);
+        return;
+    }
+
+    if (virDomainObjUnref(obj) > 0) {
+        /* We can't get the monitor back, so must kill the VM
+         * to remove danger of it ending up running twice if
+         * user tries to start it again later */
+        qemudShutdownVMDaemon(driver, obj, 0);
+        if (!obj->persistent)
+            virDomainRemoveInactive(&driver->domains, obj);
+        else
+            virDomainObjUnlock(obj);
+    }
 }
-- 
1.7.1

On 01/24/2011 11:43 PM, Wen Congyang wrote:
We should take an extra reference on vm so that it cannot be deleted if qemuConnectMonitor() fails.
Thanks for the detailed analysis. On crashes like these, the commit message is as important as the patch itself. I would have gone one step further and included the crash recipe from 0/2 in the commit message.
+++ b/src/qemu/qemu_driver.c
@@ -930,6 +930,10 @@ qemuReconnectDomain(void *payload, const char *name ATTRIBUTE_UNUSED, void *opaq
priv = obj->privateData;
+    /* Hold an extra reference because we can't allow 'vm' to be
+     * deleted if qemuConnectMonitor() failed */
+    virDomainObjRef(obj);
ACK - this idiom matches other places in the code where reducing the references to 0 on an error path has disastrous consequences, so maintaining the extra ref makes sense.

-- 
Eric Blake   eblake@redhat.com   +1-801-349-2682
Libvirt virtualization library http://libvirt.org

On Tue, Jan 25, 2011 at 02:43:43PM +0800, Wen Congyang wrote:
The reason libvirtd core dumps is this: we take a reference on vm (vm->refs) when we allocate the memory, and that reference is dropped in qemuHandleMonitorEOF() in another thread.
We take another reference in qemuConnectMonitor() and drop it when the vm becomes inactive.
libvirtd blocks in qemuMonitorSetCapabilities() because the vm has been stopped with SIGSTOP. At this point vm->refs is 2.
Then we kill the vm with SIGKILL. qemuMonitorSetCapabilities() fails, and we drop vm->refs in qemuMonitorClose(). In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called.
If qemuHandleMonitorEOF() drops its reference before qemuConnectMonitor() returns, vm->refs reaches 0 and the memory is freed.
Because qemuConnectMonitor() failed, we then call qemudShutdownVMDaemon(). The memory has already been freed, so qemudShutdownVMDaemon() is dangerous.
We dereference a NULL pointer in virDomainConfVMNWFilterTeardown():
=============
void
virDomainConfVMNWFilterTeardown(virDomainObjPtr vm) {
    int i;

    if (nwfilterDriver != NULL) {
        for (i = 0; i < vm->def->nnets; i++)
            virDomainConfNWFilterTeardown(vm->def->nets[i]);
    }
}
============
vm->def->nnets is not 0 but vm->def->nets is NULL (we do not set vm->def->nnets to 0 when we free vm).
We should take an extra reference on vm so that it cannot be deleted if qemuConnectMonitor() fails.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
I'm not convinced this is required if we apply the change I mentioned in your other patch: removing the call to qemuMonitorClose() from qemuConnectMonitor(). I think that mistaken call to qemuMonitorClose() in qemuConnectMonitor() is what is breaking the ref counting everywhere else.

Daniel

On Tue, Jan 25, 2011 at 02:43:43PM +0800, Wen Congyang wrote:
The reason libvirtd core dumps is this: we take a reference on vm (vm->refs) when we allocate the memory, and that reference is dropped in qemuHandleMonitorEOF() in another thread.
We take another reference in qemuConnectMonitor() and drop it when the vm becomes inactive.
libvirtd blocks in qemuMonitorSetCapabilities() because the vm has been stopped with SIGSTOP. At this point vm->refs is 2.
Then we kill the vm with SIGKILL. qemuMonitorSetCapabilities() fails, and we drop vm->refs in qemuMonitorClose(). In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called.
If qemuHandleMonitorEOF() drops its reference before qemuConnectMonitor() returns, vm->refs reaches 0 and the memory is freed.
Because qemuConnectMonitor() failed, we then call qemudShutdownVMDaemon(). The memory has already been freed, so qemudShutdownVMDaemon() is dangerous.
We dereference a NULL pointer in virDomainConfVMNWFilterTeardown():
=============
void
virDomainConfVMNWFilterTeardown(virDomainObjPtr vm) {
    int i;

    if (nwfilterDriver != NULL) {
        for (i = 0; i < vm->def->nnets; i++)
            virDomainConfNWFilterTeardown(vm->def->nets[i]);
    }
}
============
vm->def->nnets is not 0 but vm->def->nets is NULL (we do not set vm->def->nnets to 0 when we free vm).
We should take an extra reference on vm so that it cannot be deleted if qemuConnectMonitor() fails.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 src/qemu/qemu_driver.c | 31 ++++++++++++++++++++++---------
 1 files changed, 22 insertions(+), 9 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 34cc29f..613c2d4 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -930,6 +930,10 @@ qemuReconnectDomain(void *payload, const char *name ATTRIBUTE_UNUSED, void *opaq
     priv = obj->privateData;
 
+    /* Hold an extra reference because we can't allow 'vm' to be
+     * deleted if qemuConnectMonitor() failed */
+    virDomainObjRef(obj);
+
     /* XXX check PID liveliness & EXE path */
     if (qemuConnectMonitor(driver, obj) < 0)
         goto error;
@@ -961,18 +965,27 @@ qemuReconnectDomain(void *payload, const char *name ATTRIBUTE_UNUSED, void *opaq
     if (obj->def->id >= driver->nextvmid)
         driver->nextvmid = obj->def->id + 1;
 
-    virDomainObjUnlock(obj);
+    if (virDomainObjUnref(obj) > 0)
+        virDomainObjUnlock(obj);
     return;
 
 error:
-    /* We can't get the monitor back, so must kill the VM
-     * to remove danger of it ending up running twice if
-     * user tries to start it again later */
-    qemudShutdownVMDaemon(driver, obj, 0);
-    if (!obj->persistent)
-        virDomainRemoveInactive(&driver->domains, obj);
-    else
-        virDomainObjUnlock(obj);
+    if (!virDomainObjIsActive(obj)) {
+        if (virDomainObjUnref(obj) > 0)
+            virDomainObjUnlock(obj);
+        return;
+    }
+
+    if (virDomainObjUnref(obj) > 0) {
+        /* We can't get the monitor back, so must kill the VM
+         * to remove danger of it ending up running twice if
+         * user tries to start it again later */
+        qemudShutdownVMDaemon(driver, obj, 0);
+        if (!obj->persistent)
+            virDomainRemoveInactive(&driver->domains, obj);
+        else
+            virDomainObjUnlock(obj);
+    }
 }
On closer examination I see why this change is required. Normally we would be doing qemuDomainObjBeginJob before doing anything with the monitor, and that grabs an extra reference.

ACK

Daniel

On 01/26/2011 08:17 AM, Daniel P. Berrange wrote:
On Tue, Jan 25, 2011 at 02:43:43PM +0800, Wen Congyang wrote:
The reason libvirtd core dumps is this: we take a reference on vm (vm->refs) when we allocate the memory, and that reference is dropped in qemuHandleMonitorEOF() in another thread.
We take another reference in qemuConnectMonitor() and drop it when the vm becomes inactive.
libvirtd blocks in qemuMonitorSetCapabilities() because the vm has been stopped with SIGSTOP. At this point vm->refs is 2.
Then we kill the vm with SIGKILL. qemuMonitorSetCapabilities() fails, and we drop vm->refs in qemuMonitorClose(). In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called.
On closer examination I see why this change is required. Normally we would be doing qemuDomainObjBeginJob before doing anything with the monitor and that grabs an extra reference.
ACK
Pushed.

-- 
Eric Blake   eblake@redhat.com   +1-801-349-2682
Libvirt virtualization library http://libvirt.org

When we kill the qemu process, qemuMonitorSetCapabilities() fails and then we close the monitor. In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called, which calls qemudShutdownVMDaemon() to shut down the vm. The monitor is closed again in qemudShutdownVMDaemon(). The monitor is closed twice, and the reference count unexpectedly drops to 0, at which point the memory is freed.

We remove the watch on mon->fd when the monitor is closed. This request is handled by qemuMonitorUnwatch() in the qemu event loop thread. qemuMonitorUnwatch() tries to lock the monitor, but the lock has been destroyed, so we block there. In the main thread, we may add a watch or a timeout, and we block too, because the eventLoop lock is held by the qemuLoop thread.

We should close the monitor only once.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 src/qemu/qemu_monitor.c | 10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 3026733..0e95f74 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -47,6 +47,7 @@ struct _qemuMonitor {
     virMutex lock; /* also used to protect fd */
     virCond notify;
 
+    bool closed;
     int refs;
 
     int fd;
@@ -623,6 +624,7 @@ qemuMonitorOpen(virDomainObjPtr vm,
     mon->vm = vm;
     mon->json = json;
     mon->cb = cb;
+    mon->closed = false;
     qemuMonitorLock(mon);
 
     switch (config->type) {
@@ -695,6 +697,14 @@ void qemuMonitorClose(qemuMonitorPtr mon)
 
     qemuMonitorLock(mon);
 
+    if (mon->closed) {
+        /* We have closed monitor in other thread. */
+        qemuMonitorUnlock(mon);
+        return;
+    }
+
+    mon->closed = true;
+
     if (mon->fd >= 0) {
         if (mon->watch)
             virEventRemoveHandle(mon->watch);
-- 
1.7.1

On 01/24/2011 11:57 PM, Wen Congyang wrote:
When we kill the qemu process, qemuMonitorSetCapabilities() fails and then we close the monitor.
In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called, which calls qemudShutdownVMDaemon() to shut down the vm. The monitor is closed again in qemudShutdownVMDaemon().
The monitor is closed twice, and the reference count unexpectedly drops to 0, at which point the memory is freed.
We remove the watch on mon->fd when the monitor is closed. This request is handled by qemuMonitorUnwatch() in the qemu event loop thread. qemuMonitorUnwatch() tries to lock the monitor, but the lock has been destroyed, so we block there.
In the main thread, we may add a watch or a timeout, and we block too, because the eventLoop lock is held by the qemuLoop thread.
We should close the monitor only once.
I agree with the conclusion that unref'ing the monitor too many times can result in deadlocks. However, I'm not yet convinced that this patch is the right solution.
+    if (mon->closed) {
+        /* We have closed monitor in other thread. */
+        qemuMonitorUnlock(mon);
+        return;
+    }
+
+    mon->closed = true;
+
     if (mon->fd >= 0) {
Why can't mon->fd >= 0 be the flag of whether close has previously been attempted? After all, once the handle gets removed, VIR_FORCE_CLOSE sets mon->fd to -1.
         if (mon->watch)
             virEventRemoveHandle(mon->watch);
Making qemuMonitorClose() a short-circuit no-op when we detect that another thread has already requested a close worries me: we might leak the resource, because we are bypassing the reference counter. Is there instead somewhere that we should be temporarily increasing the reference counter?

This patch may be correct as-is, but I'm not sure of it yet.

-- 
Eric Blake   eblake@redhat.com   +1-801-349-2682
Libvirt virtualization library http://libvirt.org

At 01/26/2011 12:46 AM, Eric Blake wrote:
On 01/24/2011 11:57 PM, Wen Congyang wrote:
+    if (mon->closed) {
+        /* We have closed monitor in other thread. */
+        qemuMonitorUnlock(mon);
+        return;
+    }
+
+    mon->closed = true;
+
     if (mon->fd >= 0) {
Why can't mon->fd >= 0 be the flag of whether close has previously been attempted? After all, once the handle gets removed, VIR_FORCE_CLOSE sets mon->fd to -1.
We may call qemuMonitorClose() from qemuMonitorOpen() to clean up when opening fails, so we still need to close the monitor even if mon->fd is -1.
         if (mon->watch)
             virEventRemoveHandle(mon->watch);
Making qemuMonitorClose() do a short-circuit no-op if we already detect that another thread has requested close worries me that we might leak the resource (because we are bypassing the reference counter). Is there instead somewhere that we should be temporarily increasing the reference counter?
If we open the monitor successfully, we hold 2 refs: one for the open, and one for the watch. So I think that closing the monitor only once does not leak the resource.
This patch may be correct as-is, but I'm not sure of it yet.

On Tue, Jan 25, 2011 at 02:57:34PM +0800, Wen Congyang wrote:
When we kill the qemu process, qemuMonitorSetCapabilities() fails and then we close the monitor.
In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called, which calls qemudShutdownVMDaemon() to shut down the vm. The monitor is closed again in qemudShutdownVMDaemon().
The monitor is closed twice, and the reference count unexpectedly drops to 0, at which point the memory is freed.
We remove the watch on mon->fd when the monitor is closed. This request is handled by qemuMonitorUnwatch() in the qemu event loop thread. qemuMonitorUnwatch() tries to lock the monitor, but the lock has been destroyed, so we block there.
In the main thread, we may add a watch or a timeout, and we block too, because the eventLoop lock is held by the qemuLoop thread.
We should close the monitor only once.
I think the problem actually lies in the qemuConnectMonitor() call. This method calls qemuMonitorSetCapabilities(), and if that fails it calls qemuMonitorClose(). The caller of qemuConnectMonitor() will see the error code and also try to kill the QEMU process by calling qemudShutdownVMDaemon(), which calls qemuMonitorClose() again.

So I think we need to remove the call to qemuMonitorClose() from qemuConnectMonitor() and just let the caller clean up via the normal VM shutdown procedure.

Regards, Daniel

At 01/26/2011 01:02 AM, Daniel P. Berrange wrote:
On Tue, Jan 25, 2011 at 02:57:34PM +0800, Wen Congyang wrote:
When we kill the qemu process, qemuMonitorSetCapabilities() fails and then we close the monitor.
In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called, which calls qemudShutdownVMDaemon() to shut down the vm. The monitor is closed again in qemudShutdownVMDaemon().
The monitor is closed twice, and the reference count unexpectedly drops to 0, at which point the memory is freed.
We remove the watch on mon->fd when the monitor is closed. This request is handled by qemuMonitorUnwatch() in the qemu event loop thread. qemuMonitorUnwatch() tries to lock the monitor, but the lock has been destroyed, so we block there.
In the main thread, we may add a watch or a timeout, and we block too, because the eventLoop lock is held by the qemuLoop thread.
We should close the monitor only once.
I think the problem actually lies in the qemuConnectMonitor() call. This method calls qemuMonitorSetCapabilities() and if that fails it calls qemuMonitorClose().
The caller of qemuConnectMonitor() will see the error code and also try to kill the QEMU process, by calling qemuShutdownVMDaemon(), which calls qemuMonitorClose() again.
So I think we need to remove the call to qemuMonitorClose() from qemuConnectMonitor() and just let the caller clean up via the normal VM shutdown procedure.
The function qemudWaitForMonitor() calls qemuConnectMonitor(), but it does not shut down the VM if qemuConnectMonitor() fails.
Regards, Daniel

On Wed, Jan 26, 2011 at 09:46:52AM +0800, Wen Congyang wrote:
At 01/26/2011 01:02 AM, Daniel P. Berrange wrote:
On Tue, Jan 25, 2011 at 02:57:34PM +0800, Wen Congyang wrote:
When we kill the qemu process, qemuMonitorSetCapabilities() fails and then we close the monitor.
In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called, which calls qemudShutdownVMDaemon() to shut down the vm. The monitor is closed again in qemudShutdownVMDaemon().
The monitor is closed twice, and the reference count unexpectedly drops to 0, at which point the memory is freed.
We remove the watch on mon->fd when the monitor is closed. This request is handled by qemuMonitorUnwatch() in the qemu event loop thread. qemuMonitorUnwatch() tries to lock the monitor, but the lock has been destroyed, so we block there.
In the main thread, we may add a watch or a timeout, and we block too, because the eventLoop lock is held by the qemuLoop thread.
We should close the monitor only once.
I think the problem actually lies in the qemuConnectMonitor() call. This method calls qemuMonitorSetCapabilities() and if that fails it calls qemuMonitorClose().
The caller of qemuConnectMonitor() will see the error code and also try to kill the QEMU process, by calling qemuShutdownVMDaemon(), which calls qemuMonitorClose() again.
So I think we need to remove the call to qemuMonitorClose() from qemuConnectMonitor() and just let the caller clean up via the normal VM shutdown procedure.
The function qemudWaitForMonitor() calls qemuConnectMonitor(), but it does not shut down the VM if qemuConnectMonitor() fails.
You need to look further up the call chain. If the call to qemuMonitorSetCapabilities() fails, qemuConnectMonitor() returns -1. If qemuConnectMonitor() returns -1, then qemuWaitForMonitor() also returns -1. If qemuWaitForMonitor() returns -1, then qemudStartVMDaemon() will call qemudShutdownVMDaemon(), which kills the guest and closes the monitor.

So, by having qemuConnectMonitor() close the monitor when SetCapabilities() fails, AFAICT, we get a double-close.

Regards, Daniel

On Wed, Jan 26, 2011 at 10:49:28AM +0000, Daniel P. Berrange wrote:
On Wed, Jan 26, 2011 at 09:46:52AM +0800, Wen Congyang wrote:
At 01/26/2011 01:02 AM, Daniel P. Berrange wrote:
On Tue, Jan 25, 2011 at 02:57:34PM +0800, Wen Congyang wrote:
When we kill the qemu process, qemuMonitorSetCapabilities() fails and then we close the monitor.
In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called, which calls qemudShutdownVMDaemon() to shut down the vm. The monitor is closed again in qemudShutdownVMDaemon().
The monitor is closed twice, and the reference count unexpectedly drops to 0, at which point the memory is freed.
We remove the watch on mon->fd when the monitor is closed. This request is handled by qemuMonitorUnwatch() in the qemu event loop thread. qemuMonitorUnwatch() tries to lock the monitor, but the lock has been destroyed, so we block there.
In the main thread, we may add a watch or a timeout, and we block too, because the eventLoop lock is held by the qemuLoop thread.
We should close the monitor only once.
I think the problem actually lies in the qemuConnectMonitor() call. This method calls qemuMonitorSetCapabilities() and if that fails it calls qemuMonitorClose().
The caller of qemuConnectMonitor() will see the error code and also try to kill the QEMU process, by calling qemuShutdownVMDaemon(), which calls qemuMonitorClose() again.
So I think we need to remove the call to qemuMonitorClose() from qemuConnectMonitor() and just let the caller clean up via the normal VM shutdown procedure.
The function qemudWaitForMonitor() calls qemuConnectMonitor(), but it does not shut down the VM if qemuConnectMonitor() fails.
You need to look further up the call chain. If the call to qemuMonitorSetCapabilities() fails, qemuConnectMonitor() returns -1. If qemuConnectMonitor() returns -1, then qemuWaitForMonitor() also returns -1. If qemuWaitForMonitor() returns -1, then qemudStartVMDaemon() will call qemudShutdownVMDaemon(), which kills the guest and closes the monitor.
So, by having qemuConnectMonitor() close the monitor when SetCapabilities() fails, AFAICT, we get a double-close.
I propose this patch, as an alternative to your patch 2/2:

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b6a5cd6..5050921 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -900,8 +900,6 @@ qemuConnectMonitor(struct qemud_driver *driver, virDomainObjPtr vm)
     qemuDomainObjExitMonitorWithDriver(driver, vm);
 
 error:
-    if (ret < 0)
-        qemuMonitorClose(priv->mon);
     return ret;
 }

In combination with your patch 1/2, this fixes the double-free in reconnecting to QEMU and passes valgrind's checks.

Daniel

At 01/26/2011 06:49 PM, Daniel P. Berrange wrote:
On Wed, Jan 26, 2011 at 09:46:52AM +0800, Wen Congyang wrote:
At 01/26/2011 01:02 AM, Daniel P. Berrange wrote:
On Tue, Jan 25, 2011 at 02:57:34PM +0800, Wen Congyang wrote:
When we kill the qemu process, qemuMonitorSetCapabilities() fails and then we close the monitor.
In another thread, mon->fd is broken and qemuHandleMonitorEOF() is called, which calls qemudShutdownVMDaemon() to shut down the vm. The monitor is closed again in qemudShutdownVMDaemon().
The monitor is closed twice, and the reference count unexpectedly drops to 0, at which point the memory is freed.
We remove the watch on mon->fd when the monitor is closed. This request is handled by qemuMonitorUnwatch() in the qemu event loop thread. qemuMonitorUnwatch() tries to lock the monitor, but the lock has been destroyed, so we block there.
In the main thread, we may add a watch or a timeout, and we block too, because the eventLoop lock is held by the qemuLoop thread.
We should close the monitor only once.
I think the problem actually lies in the qemuConnectMonitor() call. This method calls qemuMonitorSetCapabilities() and if that fails it calls qemuMonitorClose().
The caller of qemuConnectMonitor() will see the error code and also try to kill the QEMU process, by calling qemuShutdownVMDaemon(), which calls qemuMonitorClose() again.
So I think we need to remove the call to qemuMonitorClose() from qemuConnectMonitor() and just let the caller clean up via the normal VM shutdown procedure.
The function qemudWaitForMonitor() calls qemuConnectMonitor(), but it does not shut down the VM if qemuConnectMonitor() fails.
You need to look further up the call chain. If the call to qemuMonitorSetCapabilities() fails, qemuConnectMonitor() returns -1. If qemuConnectMonitor() returns -1, then qemuWaitForMonitor() also returns -1. If qemuWaitForMonitor() returns -1, then qemudStartVMDaemon() will call qemudShutdownVMDaemon(), which kills the guest and closes the monitor.
Yes, you are right. Another question: I think we should check whether the vm has been shut down before calling qemudShutdownVMDaemon() in the function qemudStartVMDaemon(), as the vm may have been shut down by qemuHandleMonitorEOF() in another thread.
So, by having qemuConnectMonitor() close the monitor when SetCapabilities() fails, AFAICT, we get a double-close.
Regards, Daniel

Steps to reproduce this bug:
1. use gdb to debug libvirtd, and set a breakpoint in the function qemuConnectMonitor()
2. start a vm; libvirtd will stop at the breakpoint in qemuConnectMonitor()
3. kill -STOP $(cat /var/run/libvirt/qemu/<domain>.pid)
4. continue running libvirtd in gdb; libvirtd will block in the function qemuMonitorSetCapabilities()
5. kill -9 $(cat /var/run/libvirt/qemu/<domain>.pid)

Here is the log of the qemu:
=========
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin ...
char device redirected to /dev/pts/3
2011-01-27 09:38:48.101: shutting down
2011-01-27 09:41:26.401: shutting down
=========
The vm is shut down twice. I do not know whether this behavior has side effects, but I think we should shut the vm down only once.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 src/qemu/qemu_driver.c | 6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6140f0f..c527bb7 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2972,7 +2972,11 @@ cleanup:
      * pretend we never started it */
     virCommandFree(cmd);
     VIR_FORCE_CLOSE(logfile);
-    qemudShutdownVMDaemon(driver, vm, 0);
+    /* The vm may be cloesd in other thread, so we should check whether the
+     * vm is active before shutdown.
+     */
+    if (virDomainObjIsActive(vm))
+        qemudShutdownVMDaemon(driver, vm, 0);
 
     return -1;
 }
-- 
1.7.1

On 01/26/2011 08:18 PM, Wen Congyang wrote:
Steps to reproduce this bug:
1. use gdb to debug libvirtd, and set a breakpoint in the function qemuConnectMonitor()
2. start a vm; libvirtd will stop at the breakpoint in qemuConnectMonitor()
3. kill -STOP $(cat /var/run/libvirt/qemu/<domain>.pid)
4. continue running libvirtd in gdb; libvirtd will block in the function qemuMonitorSetCapabilities()
5. kill -9 $(cat /var/run/libvirt/qemu/<domain>.pid)
Here is the log of the qemu:
=========
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin ...
char device redirected to /dev/pts/3
2011-01-27 09:38:48.101: shutting down
2011-01-27 09:41:26.401: shutting down
=========
The vm is shut down twice. I do not know whether this behavior has side effects, but I think we should shut the vm down only once.
Hmm; maybe _this_ is the root cause for the crash here? https://bugzilla.redhat.com/show_bug.cgi?id=670848
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
--- src/qemu/qemu_driver.c | 6 +++++- 1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6140f0f..c527bb7 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2972,7 +2972,11 @@ cleanup:
      * pretend we never started it */
     virCommandFree(cmd);
     VIR_FORCE_CLOSE(logfile);
-    qemudShutdownVMDaemon(driver, vm, 0);
+    /* The vm may be cloesd in other thread, so we should check whether the
s/cloesd/closed/
+     * vm is active before shutdown.
+     */
+    if (virDomainObjIsActive(vm))
+        qemudShutdownVMDaemon(driver, vm, 0);
I'm still playing with this patch, but at first glance, it is making sense to me.

-- 
Eric Blake   eblake@redhat.com   +1-801-349-2682
Libvirt virtualization library http://libvirt.org

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6140f0f..c527bb7 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2972,7 +2972,11 @@ cleanup:
      * pretend we never started it */
     virCommandFree(cmd);
     VIR_FORCE_CLOSE(logfile);
-    qemudShutdownVMDaemon(driver, vm, 0);
+    /* The vm may be cloesd in other thread, so we should check whether the
s/cloesd/closed/
+     * vm is active before shutdown.
+     */
+    if (virDomainObjIsActive(vm))
+        qemudShutdownVMDaemon(driver, vm, 0);
I'm still playing with this patch, but at first glance, it is making sense to me.
The patch makes sense to me, since we may unlock the domain object several times before we get to the cleanup code. Thus the state may have changed by the time we get there.

Eric, what is the result of your playing with the patch? Is it ok to ack and push it?

Jirka

On Wed, Feb 02, 2011 at 02:18:47PM +0100, Jiri Denemark wrote:
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6140f0f..c527bb7 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2972,7 +2972,11 @@ cleanup:
      * pretend we never started it */
     virCommandFree(cmd);
     VIR_FORCE_CLOSE(logfile);
-    qemudShutdownVMDaemon(driver, vm, 0);
+    /* The vm may be cloesd in other thread, so we should check whether the
s/cloesd/closed/
+     * vm is active before shutdown.
+     */
+    if (virDomainObjIsActive(vm))
+        qemudShutdownVMDaemon(driver, vm, 0);
I'm still playing with this patch, but at first glance, it is making sense to me.
The patch makes sense to me, since we may unlock the domain object several times before we get to the cleanup code. Thus the state may have changed by the time we get there.
Eric, what is the result of your playing with the patch? Is it ok to ack and push it?
This patch is only protecting one caller of qemudShutdownVMDaemon(). I think it would be worth moving the check into the qemudShutdownVMDaemon() function itself, so that all callers are protected from mistakes / races.

Daniel

-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

* src/qemu/qemu_driver.c (qemudShutdownVMDaemon): Check that vm is still active.
Reported by Wen Congyang as follows:

Steps to reproduce this bug:
1. use gdb to debug libvirtd, and set breakpoint in the function qemuConnectMonitor()
2. start a vm, and the libvirtd will be stopped in qemuConnectMonitor()
3. kill -STOP $(cat /var/run/libvirt/qemu/<domain>.pid)
4. continue to run libvirtd in gdb, and libvirtd will be blocked in the function qemuMonitorSetCapabilities()
5. kill -9 $(cat /var/run/libvirt/qemu/<domain>.pid)

Here is log of the qemu:
=========
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin ...
char device redirected to /dev/pts/3
2011-01-27 09:38:48.101: shutting down
2011-01-27 09:41:26.401: shutting down
=========
The vm is shut down twice. I do not know whether this behavior has side effect, but I think we should shutdown the vm only once.
---
I was able to reproduce a segfault due to double shutdown, as well as testing that this patch avoids the issue.

 src/qemu/qemu_driver.c | 5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index a3a68b1..d16a4ab 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2991,6 +2991,11 @@ static void qemudShutdownVMDaemon(struct qemud_driver *driver,
     VIR_DEBUG("Shutting down VM '%s' pid=%d migrated=%d",
               vm->def->name, vm->pid, migrated);
 
+    if (!virDomainObjIsActive(vm)) {
+        VIR_DEBUG("VM '%s' not active", vm->def->name);
+        return;
+    }
+
     if ((logfile = qemudLogFD(driver, vm->def->name, true)) < 0) {
         /* To not break the normal domain shutdown process, skip the
          * timestamp log writing if failed on opening log file. */
-- 
1.7.3.5

On Wed, Feb 02, 2011 at 11:37:37 -0700, Eric Blake wrote:
* src/qemu/qemu_driver.c (qemudShutdownVMDaemon): Check that vm is still active. Reported by Wen Congyang as follows:
---
I was able to reproduce a segfault due to double shutdown, as well as testing that this patch avoids the issue.
src/qemu/qemu_driver.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index a3a68b1..d16a4ab 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2991,6 +2991,11 @@ static void qemudShutdownVMDaemon(struct qemud_driver *driver,
     VIR_DEBUG("Shutting down VM '%s' pid=%d migrated=%d",
               vm->def->name, vm->pid, migrated);
+    if (!virDomainObjIsActive(vm)) {
+        VIR_DEBUG("VM '%s' not active", vm->def->name);
+        return;
+    }
+
Looks like the solution Dan was asking for... ACK

Jirka

On 02/03/2011 06:14 AM, Jiri Denemark wrote:
On Wed, Feb 02, 2011 at 11:37:37 -0700, Eric Blake wrote:
* src/qemu/qemu_driver.c (qemudShutdownVMDaemon): Check that vm is still active. Reported by Wen Congyang as follows:
---
+++ b/src/qemu/qemu_driver.c
@@ -2991,6 +2991,11 @@ static void qemudShutdownVMDaemon(struct qemud_driver *driver,
     VIR_DEBUG("Shutting down VM '%s' pid=%d migrated=%d",
               vm->def->name, vm->pid, migrated);
+    if (!virDomainObjIsActive(vm)) {
+        VIR_DEBUG("VM '%s' not active", vm->def->name);
+        return;
+    }
+
Looks like the solution Dan was asking for... ACK
Thanks; pushed.

-- 
Eric Blake   eblake@redhat.com   +1-801-349-2682
Libvirt virtualization library http://libvirt.org

On 02/02/2011 06:18 AM, Jiri Denemark wrote:
+     * vm is active before shutdown.
+     */
+    if (virDomainObjIsActive(vm))
+        qemudShutdownVMDaemon(driver, vm, 0);
I'm still playing with this patch, but at first glance, it is making sense to me.
The patch makes sense to me, since we may unlock the domain object several times before we get to the cleanup code. Thus the state may have changed by the time we get there.
Eric, what is the result of you playing with the patch? Is it ok to ack and push it?
I've confirmed that this patch does indeed make a difference in the scenario Wen posted: without the patch, I did indeed get a double-free segfault in libvirtd due to simultaneous shutdown attempts from two threads, but with the patch there was only one shutdown and no crash.

However, I'm testing an alternate patch based on danpb's suggestion of hoisting the check into qemudShutdownVMDaemon (or qemuProcessStop, if danpb's refactoring into qemu_process goes in first), which I'll post shortly.

-- 
Eric Blake   eblake@redhat.com   +1-801-349-2682
Libvirt virtualization library http://libvirt.org
participants (4)
- Daniel P. Berrange
- Eric Blake
- Jiri Denemark
- Wen Congyang