On Thu, Feb 06, 2014 at 02:11:26PM +0200, Laine Stump wrote:
From: "Daniel P. Berrange" <berrange(a)redhat.com>
(see notes about conflict resolution at the end. I'm not sending the
other 5 patches required, since they were all simple cherry-picks of:
4f2094346d98f4ed6a2de115d204c166cc563496
b77b16ce4166dcc87963ae5d279b77b162ddbb55
ebca369e3fe5ac999c261c2d44e60a1bac3cfe65
999d72fbd59ea712128ae294b69b6a54039d757b
c065984b58000a44c90588198d222a314ac532fd
)
The NWFilter code has as a deadlock race condition between
the virNWFilter{Define,Undefine} APIs and starting of guest
VMs due to mis-matched lock ordering.
In the virNWFilter{Define,Undefine} codepaths the lock ordering
is
1. nwfilter driver lock
2. virt driver lock
3. nwfilter update lock
4. domain object lock
In the VM guest startup paths the lock ordering is
1. virt driver lock
2. domain object lock
3. nwfilter update lock
As can be seen the domain object and nwfilter update locks are
not acquired in a consistent order.
The fix used is to push the nwfilter update lock upto the top
level resulting in a lock ordering for virNWFilter{Define,Undefine}
of
1. nwfilter driver lock
2. nwfilter update lock
3. virt driver lock
4. domain object lock
and VM start using
1. nwfilter update lock
2. virt driver lock
3. domain object lock
This has the effect of serializing VM startup once again, even if
no nwfilters are applied to the guest. There is also the possibility
of deadlock due to a call graph loop via virNWFilterInstantiate
and virNWFilterInstantiateFilterLate.
These two problems mean the lock must be turned into a read/write
lock instead of a plain mutex at the same time. The lock is used to
serialize changes to the "driver->nwfilters" hash, so the write lock
only needs to be held by the define/undefine methods. All other
methods can rely on a read lock which allows good concurrency.
Signed-off-by: Daniel P. Berrange <berrange(a)redhat.com>
(cherry picked from commit 6e5c79a1b5a8b3a23e7df7ffe58fb272aa17fbfb)
Conflicts:
src/conf/nwfilter_conf.c
- virReportOOMError() in context of one hunk.
src/lxc/lxc_driver.c
- functions renamed, and lxc object locking changed, creating
a conflict in the context.
---
src/conf/nwfilter_conf.c | 25 ++++++++++++-------------
src/conf/nwfilter_conf.h | 3 ++-
src/libvirt_private.syms | 3 ++-
src/lxc/lxc_driver.c | 8 +++++++-
src/nwfilter/nwfilter_driver.c | 10 ++++++----
src/nwfilter/nwfilter_gentech_driver.c | 6 +-----
src/qemu/qemu_driver.c | 6 ++++++
src/uml/uml_driver.c | 4 ++++
8 files changed, 40 insertions(+), 25 deletions(-)
ACK, you got the lock ordering correct when resolving the
lxc conflict.
Daniel
--
|:
http://berrange.com -o-
http://www.flickr.com/photos/dberrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|:
http://entangle-photo.org -o-
http://live.gnome.org/gtk-vnc :|