On Wed, Jul 24, 2013 at 12:15:32PM +0200, Michal Privoznik wrote:
> There's a race in the lxc driver causing a deadlock. If a domain is
> destroyed immediately after being started, the deadlock can occur. When
> a domain is started, the event loop tries to connect to the monitor. If
> the connection succeeds, virLXCProcessMonitorInitNotify() is called with
> @mon->client locked. The first thing that callee does is
> virObjectLock(vm). So the order of locking is: 1) @mon->client, 2) @vm.
>
> However, if there's another thread executing virDomainDestroy on the
> very same domain, the first thing done there is locking the @vm. Then
> the corresponding libvirt_lxc process is killed and the monitor is
> closed by calling virLXCMonitorClose(). That callee tries to lock
> @mon->client too. So the order is the reverse of the first case. This
> situation results in a deadlock and an unresponsive libvirtd (since the
> event loop is involved). The proper solution is to unlock the @vm in
> virLXCMonitorClose prior to entering virNetClientClose(). See the
> backtrace as follows:
Hmm, I think I'd say that the flaw is in the way virLXCProcessMonitorInitNotify
is invoked. In the QEMU driver monitor, we unlock the monitor before invoking
any callbacks. In the LXC driver monitor we're invoking the callbacks with
the monitor lock held. I think we need to make the LXC monitor locking wrt
callbacks do what QEMU does, and unlock the monitor. See QEMU_MONITOR_CALLBACK
in qemu_monitor.c.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|