On Fri, Jan 18, 2013 at 02:39:11PM +0000, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange(a)redhat.com>
When running virDomainDestroy, we need to make sure that no other
background thread cleans up the domain while we're doing our work.
This can happen if we release the domain object while in the
middle of work, because the monitor might detect EOF in this window.
For this reason we have a 'beingDestroyed' flag to stop the monitor
from doing its normal cleanup. Unfortunately this flag was only
being used to protect qemuDomainBeginJob, and not qemuProcessKill
This left open a race condition where either libvirtd could crash,
or alternatively report bogus error messages about the domain already
having been destroyed to the caller
Signed-off-by: Daniel P. Berrange <berrange(a)redhat.com>
---
src/qemu/qemu_driver.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index c28c223..2a6f381 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2121,24 +2121,29 @@ qemuDomainDestroyFlags(virDomainPtr dom,
qemuDomainSetFakeReboot(driver, vm, false);
+
+ /* We need to prevent monitor EOF callback from doing our work (and sending
+ * misleading events) while the vm is unlocked inside BeginJob/ProcessKill API
+ */
+ priv->beingDestroyed = true;
+
/* Although qemuProcessStop does this already, there may
* be an outstanding job active. We want to make sure we
* can kill the process even if a job is active. Killing
* it now means the job will be released
*/
if (flags & VIR_DOMAIN_DESTROY_GRACEFUL) {
- if (qemuProcessKill(driver, vm, 0) < 0)
+ if (qemuProcessKill(driver, vm, 0) < 0) {
+ priv->beingDestroyed = false;
goto cleanup;
+ }
} else {
- if (qemuProcessKill(driver, vm, VIR_QEMU_PROCESS_KILL_FORCE) < 0)
+ if (qemuProcessKill(driver, vm, VIR_QEMU_PROCESS_KILL_FORCE) < 0) {
+ priv->beingDestroyed = false;
goto cleanup;
+ }
}
- /* We need to prevent monitor EOF callback from doing our work (and sending
- * misleading events) while the vm is unlocked inside BeginJob API
- */
- priv->beingDestroyed = true;
-
if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_DESTROY) < 0)
goto cleanup;
--
1.8.0.2
This is a really difficult bug to reproduce using libguestfs. I've
only reproduced it a handful of times in about 3 months.
I would say, push it, based on:
(1) Rationale and patch looks sane.
(2) It fixes your minimal test case (which is a strong enough reason in
itself).
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/