Historically the nwfilter driver didn't keep track of any
instances of filters. When it needed to rebuild filters it
called back to the virt drivers to ask them to re-create
the filters for their VMs. This lead to some complex locking
requirements where the virt driver needed to acquire locks
on the nwfilter driver before creating/deleting filter
instances.
Somewhat recently we introduced the virNWFilterBinding
concept which allowed the nwfilter driver to keep track of
all filter instances aka bindings. Thus it could rebuild
them without talking to the virt drivers. We never updated
the locking model when doing this though, so we were still
reliant on the virt drivers acquiring locks before creating
or deleting virNWFilterBinding objects.
When we started using the modular daemons though, the locks
acquired by the virt drivers no longer protected the nwfilter
driver as they were separate processes. Thus the race condition
fixed back in
commit 6e5c79a1b5a8b3a23e7df7ffe58fb272aa17fbfb
Author: Daniel P. Berrangé <berrange(a)redhat.com>
Date: Wed Jan 22 17:28:29 2014 +0000
Push nwfilter update locking up to top level
has now re-appeared when running modular daemons. In fact
it is possible to trigger it even in libvirtd if you ignore
the virt drivers and directly call the virNWFilterBinding
APIs, though that won't hit users in practice.
The solution is surprisingly simple. Since the nwfilter
driver keeps track of virNWFilterBinding objects, we
can simply put locking calls around the Create/Delete
methods for those objects, and remove the locking from
the virt drivers.
The only slight difference is that previous the locking
serialized the entire VM startup/shutdown sequence, while
the new locking only serializes individual NICs. This
should not matter in practice, since there's no shared
state between filters for NICs on the same VM.
This is a minimal fix suitable for the freeze. There is
scope for some significant improvements in locking which
I'm still working on. In particular we currently have a
terrible writer starvation problem with the RWLock that
protects the filter create/delete/redefine operations
that has existed forever. Repeatedly creating/deleting
virNWFilterBinding objects can prevent a virNWFilterDefineXML
call from ever running.
Daniel P. Berrangé (5):
nwfilter: stop using recursive mutex for IP learning
nwfilter: hold filter update lock when creating/deleting bindings
qemu,lxc: remove use to nwfilter update lock
nwfilter: remove decl of virNWFilterCreateVarHashmap
nwfilter: make some gentech driver methods static
src/lxc/lxc_driver.c | 6 ------
src/nwfilter/nwfilter_driver.c | 5 +++++
src/nwfilter/nwfilter_gentech_driver.c | 4 ++--
src/nwfilter/nwfilter_gentech_driver.h | 8 --------
src/nwfilter/nwfilter_learnipaddr.c | 2 +-
src/qemu/qemu_driver.c | 18 ------------------
src/qemu/qemu_migration.c | 3 ---
src/qemu/qemu_process.c | 4 ----
8 files changed, 8 insertions(+), 42 deletions(-)
--
2.35.1