
On Wed, Jan 26, 2011 at 10:49:28AM +0000, Daniel P. Berrange wrote:
On Wed, Jan 26, 2011 at 09:46:52AM +0800, Wen Congyang wrote:
At 01/26/2011 01:02 AM, Daniel P. Berrange Write:
On Tue, Jan 25, 2011 at 02:57:34PM +0800, Wen Congyang wrote:
When we kill the qemu, the function qemuMonitorSetCapabilities() failed and then we close monitor.
In another thread, mon->fd is broken and the function qemuHandleMonitorEOF() is called. The function qemuHandleMonitorEOF() calls qemudShutdownVMDaemon() to shutdown vm. The monitor will be closed in the function qemudShutdownVMDaemon().
The monitor close twice and the reference is decreased to 0 unexpectedly. The memory will be freed when reference is decreased to 0.
We will remove the watch of mon->fd when the monitor is closed. This request will be done in the function qemuMonitorUnwatch() in the qemuloop thread. In the function qemuMonitorUnwatch(), we will lock monitor, but the lock is destroyed and we will block here,
In the main thread, we may add some watch or timeout, and will be blocked because the lock of eventLoop is hold by qemuLoop thread.
We should close monitor only once.
I think the problem actually lies in the qemuConnectMonitor() call. This method calls qemuMonitorSetCapabilities() and if that fails it calls qemuMonitorClose().
The caller of qemuConnectMonitor() will see the error code and also try to kill the QEMU process, by calling qemuShutdownVMDaemon(), which calls qemuMonitorClose() again.
So I think we need to remove the call to qemuMonitorClose() from qemuConnectMonitor() and just let the calls cleanup up via normal VM shutdown procedure.
The function qemudWaitForMonitor() calls qemuConnectMonitor(), but it does not shutdown VM if qemuConnectMonitor() failed.
You need to look further up the call chain. If the call to qemuMonitorSetCapabilities() fails, qemuConnectMOnitor() returns -1. If qemuConnectMonitor() returns -1, then qemuWaitForMonitor() also returns -1. If qemuWaitForMonitor returns -1, then qemudStartVMDaemon will call qemudShutdownVMDaemon which kills the guest and closes the monitor.
So, by having qemuConnectMonitor() close the monitor when SetCapabilities() fails, AFAICT, we get a double-close.
I propose this patch, as an alternative to your patch 2/2: diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index b6a5cd6..5050921 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -900,8 +900,6 @@ qemuConnectMonitor(struct qemud_driver *driver, virDomainObjPtr vm) qemuDomainObjExitMonitorWithDriver(driver, vm); error: - if (ret < 0) - qemuMonitorClose(priv->mon); return ret; } In combination with your patch 1/2, this fixes the double-free in reconnecting to QEMU and passes valgrind's checks Daniel