
On 03/03/2011 12:47 PM, Laine Stump wrote:
This was found while researching the root cause of:
https://bugzilla.redhat.com/show_bug.cgi?id=670848
virDomainUnref should only be called with the lock held for the virDomainObj in question. However, when a transient qemu domain gets EOF on its monitor socket, it queues an event which frees the monitor, which unref's the virDomainObj without first locking it. If another thread has already locked the virDomainObj, the modification of the refcount could potentially be corrupted. In an extreme case, it could also be potentially unlocked by virDomainObjFree, thus left open to modification by anyone else who would have otherwise waited for the lock (not to mention the fact that they would be accessing freed data!).
The solution is to have qemuMonitorFree lock the domain object right before unrefing it. Since the caller to qemuMonitorFree doesn't expect this lock to be held, if the refcount doesn't go all the way to 0, qemuMonitorFree must unlock it after the unref. --- src/qemu/qemu_process.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index c419c75..1d67b3c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -596,7 +596,9 @@ static void qemuProcessHandleMonitorDestroy(qemuMonitorPtr mon, qemuDomainObjPrivatePtr priv = vm->privateData; if (priv->mon == mon) priv->mon = NULL; - virDomainObjUnref(vm); + virDomainObjLock(vm); + if (virDomainObjUnref(vm) > 0) + virDomainObjUnlock(vm);
Phooey, this causes 'virsh save domain file' to deadlock. I'm still looking into why, but it may be that we're going about fixing this wrong. Maybe we should be making sure that qemuProcessStop ensures that the monitor callback function won't trigger if there is no domain for it to trigger on. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org