On Thu, Dec 03, 2009 at 12:47:02AM +0100, Matthias Bolte wrote:
2009/11/26 Daniel P. Berrange <berrange(a)redhat.com>:
> If QEMU shuts down while we're in the middle of processing a
> monitor command, the monitor will be freed, and upon cleaning
> up we attempt to do qemuMonitorUnlock(priv->mon) when priv->mon
> is NULL.
>
> To address this we introduce proper reference counting into
> the qemuMonitorPtr object, and hold an extra reference whenever
> executing a command.
>
> * src/qemu/qemu_driver.c: Hold a reference on the monitor while
> executing commands, and only NULL-ify the priv->mon field when
> the last reference is released
> * src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add reference
> counting to handle safe deletion of monitor objects
The locking pattern below results in destroying a locked mutex. It
this intended?
No, that's a bug, the qemuMonitorUnref() call itself should have
called unlock if the ref count hit 0, before destroying it.
qemuMonitorLock(mon);
[...]
if (qemuMonitorUnref(mon) > 0)
qemuMonitorUnlock(mon);
Well, this patch makes the TCK deadlock for me, seems to be a lock
ordering issue combined with a race condition; it doesn't happen every
run. I don't understand all details of the locking and refcounting
scheme of the QEMU monitor yet, it's quite complex and gets even more
complex.
Yes, that problem shown by valgrind will indeed deadlock. I'll fix
this. The domain object lock must never be acquired if the thread
holds the monitor lock already. We must have strict ordering from
top to bottom (driver -> domain object -> qemu monitor)
Daniel
--
|: Red Hat, Engineering, London -o-
http://people.redhat.com/berrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org -o-
http://ovirt.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|