On 04/27/2018 11:24 AM, Daniel P. Berrangé wrote:
Today the nwfilter driver is entangled with the virt drivers in both
directions. At various times when rebuilding filters nwfilter will call
out to the virt driver to iterate over running guests' NICs. This has
caused very complicated lock ordering rules to be required. If we are to
split the virt drivers out into separate daemons we need to get rid of
this coupling since we don't want the separate daemons calling each
other, as that risks deadlock if all of the RPC workers are busy.
The obvious way to solve this is to have the nwfilter driver remember
all the filters it has active, avoiding the need to iterate over running
guests.
NB, these patches are all ready for review, but the last patch really
should not be merged at this time. I need to do more work to be able to
serialize the filter state to disk, so the nwfilter driver can keep track
of it across daemon restarts. All except the last patch should be ok to
merge though.
As usual, I'm thinking about the dusty corners of this...
First - this is a *great* idea, and we should do something similar in
the network driver (keep track of all the connections to each network so
they can be re-connected when a network is stopped/started).
Second - we need to think about the situation where the nwfilter process
is stopped during a time when the hypervisor driver shuts down a guest.
If nwfilter keeps track of the current set of active filters, serializes
them to disk, and then rereads/reloads that list of filters when it's
restarted, then this sequence is possible:
1) nwfilter daemon is stopped
2) qemu destroys a guest
3) nwfilter daemon is restarted
Since we wouldn't want qemu to just hang forever during guest shutdown
if one of the subordinate drivers was unavailable, we could end up with
the filter rules for the defunct guest (which were saved off to disk by
nwfilter) being reloaded in step 3, and there would be no way to clear
them out short of rebooting the host.
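To make the sequence above concrete, here is a rough sketch of how the stale entry survives. Everything in it is hypothetical: the NWFilterState class, the JSON state file, and the uuid strings are stand-ins I made up for illustration, not anything from the patch series.

```python
import json
import os
import tempfile


class NWFilterState:
    """Hypothetical stand-in for the nwfilter daemon's persisted state."""

    def __init__(self, path):
        self.path = path
        self.active = {}  # guest uuid -> filter name
        # On (re)start, blindly reload whatever was serialized to disk.
        if os.path.exists(path):
            with open(path) as f:
                self.active = json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self.active, f)

    def bind(self, uuid, filter_name):
        self.active[uuid] = filter_name
        self._save()

    def unbind(self, uuid):
        self.active.pop(uuid, None)
        self._save()


def simulate_stale_rules():
    """Return the uuids that still have rules but no running guest."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nwfilter-state.json")
        daemon = NWFilterState(path)
        daemon.bind("uuid-a", "clean-traffic")
        daemon.bind("uuid-b", "clean-traffic")

        daemon = None                 # 1) nwfilter daemon is stopped
        running = {"uuid-a"}          # 2) qemu destroys uuid-b; unbind() is never delivered
        daemon = NWFilterState(path)  # 3) nwfilter daemon is restarted

        return sorted(set(daemon.active) - running)
```

Calling simulate_stale_rules() reports uuid-b as still having rules, because nothing ever told the restarted daemon that the guest went away.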
I can think of two ways to solve this:
1) qemu (or whichever hypervisor) keeps a queue of pending requests to
nwfilter, and re-sends them when the nwfilter daemon once again becomes
available and is reconnected. (But then what happens if qemu is
restarted while there are pending requests to nwfilter? We would need to
keep all the pending requests serialized on disk too? :-/)
2) each time qemu reconnects to the nwfilter driver, it issues a "reset"
command, which through some form of magic hand waving both reloads the
rules for all guests that are still active, and deletes the rules for
those that aren't.
(2) seems like it might be simpler and more robust. Perhaps all that
would be needed is for nwfilter to keep the name/uuid of each guest
that has active rules (which you're already doing in this series),
plus something to identify the hypervisor driver that requested the
rules (if all connections are local, then just a driver name and pid
would probably be sufficient), and to have a virNWFilterReset() (or
some other better name) API that accepts a list of active uuids. Each
time qemu connects to nwfilter, it could send this list, and nwfilter
would delete all rules that are associated with that hypervisor and
whose name/uuid is not in the list. As a bonus, it could also reload
the rules for guests that *are* still active (similar to the way the
network driver reloads the iptables rules for all the networks when it
is restarted).
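A minimal sketch of the reconciliation I have in mind for (2), assuming nwfilter keys its bindings on a (driver, uuid) pair. The FilterRegistry class and its reset() method are hypothetical names for illustration only; no such API exists today, and the real thing would also have to touch the actual ebtables/iptables rules.

```python
from dataclasses import dataclass, field


@dataclass
class FilterRegistry:
    """Hypothetical in-memory view of the rules nwfilter has active."""

    # (hypervisor driver name, guest uuid) -> filter name
    bindings: dict = field(default_factory=dict)

    def bind(self, driver, uuid, filter_name):
        self.bindings[(driver, uuid)] = filter_name

    def reset(self, driver, active_uuids):
        """Reconcile one driver's bindings against its list of live guests.

        Returns (removed, reloaded): the uuids whose rules were dropped
        and the uuids whose rules would be reloaded.
        """
        active = set(active_uuids)
        # Only touch bindings owned by the driver that sent the list.
        stale = [key for key in self.bindings
                 if key[0] == driver and key[1] not in active]
        for key in stale:
            del self.bindings[key]
        removed = sorted(uuid for _, uuid in stale)
        reloaded = sorted(uuid for d, uuid in self.bindings if d == driver)
        return removed, reloaded


registry = FilterRegistry()
registry.bind("qemu", "uuid-a", "clean-traffic")
registry.bind("qemu", "uuid-b", "clean-traffic")
registry.bind("lxc", "uuid-c", "clean-traffic")

# qemu reconnects and reports that only uuid-a is still running.
removed, reloaded = registry.reset("qemu", ["uuid-a"])
```

Scoping the reset to the sending driver is the reason for recording the driver identity at all: the lxc binding above is left alone when qemu sends its list.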