[RFC] Faster libvirtd restart with nwfilter rules, one more time

Hi, all. Some time ago I posted RFC [1] concerning an issue of unresponsive libvird during restart if there is large number of VMs that have network filters on their interfaces. It was identified that in most cases we don't need actually to reinstall network filter rules on daemon restart. Thus I proposed patches [2] that check whether we need to reapply rules or not. The first version has a drawback that daemon won't reapply rules if someone mangled them between daemon stop and start (and this can be done just by restarting firewalld). The second one is just ugly :) Around that time Florian Westphal in a letter off the mailing list suggested to use {iptables|ebtables}-restore to apply rules in one binary call. These binaries has --noflush option so that we won't reset current state of tables. We also need one more -L call for iptables/ebtables to query current filter state to be able to construct input for restore binaries. I wonder can we use this approach? I see currently only one issue - we won't use firealld to spawn rules. But why we need to spawn rules through firewalld if it present? We use passthrough mode anyway. I tried to dig history for hints but didn't found anything. Patch [3] introduced spawning rules thru firewalld-cmd. Nikolay [1] [RFC] Faster libvirtd restart with nwfilter rules https://www.redhat.com/archives/libvir-list/2018-September/msg01206.html [2] nwfilter: don't reinstantiate filters if they are not changed v1: https://www.redhat.com/archives/libvir-list/2018-October/msg00904.html v2: https://www.redhat.com/archives/libvir-list/2018-October/msg01317.html [3] network: use firewalld instead of iptables, when available v0: https://www.redhat.com/archives/libvir-list/2012-April/msg01236.html v1: https://www.redhat.com/archives/libvir-list/2012-August/msg00447.html ... v4: https://www.redhat.com/archives/libvir-list/2012-August/msg01097.html

ping On 20.03.2020 12:25, nshirokovskiy wrote:
Hi, all.
Some time ago I posted RFC [1] concerning an issue of unresponsive libvird during restart if there is large number of VMs that have network filters on their interfaces. It was identified that in most cases we don't need actually to reinstall network filter rules on daemon restart. Thus I proposed patches [2] that check whether we need to reapply rules or not.
The first version has a drawback that daemon won't reapply rules if someone mangled them between daemon stop and start (and this can be done just by restarting firewalld). The second one is just ugly :)
Around that time Florian Westphal in a letter off the mailing list suggested to use {iptables|ebtables}-restore to apply rules in one binary call. These binaries has --noflush option so that we won't reset current state of tables. We also need one more -L call for iptables/ebtables to query current filter state to be able to construct input for restore binaries.
I wonder can we use this approach? I see currently only one issue - we won't use firealld to spawn rules. But why we need to spawn rules through firewalld if it present? We use passthrough mode anyway. I tried to dig history for hints but didn't found anything. Patch [3] introduced spawning rules thru firewalld-cmd.
Nikolay
[1] [RFC] Faster libvirtd restart with nwfilter rules https://www.redhat.com/archives/libvir-list/2018-September/msg01206.html
[2] nwfilter: don't reinstantiate filters if they are not changed v1: https://www.redhat.com/archives/libvir-list/2018-October/msg00904.html v2: https://www.redhat.com/archives/libvir-list/2018-October/msg01317.html
[3] network: use firewalld instead of iptables, when available v0: https://www.redhat.com/archives/libvir-list/2012-April/msg01236.html v1: https://www.redhat.com/archives/libvir-list/2012-August/msg00447.html ... v4: https://www.redhat.com/archives/libvir-list/2012-August/msg01097.html

Copying Eric Garver as a knowledgeable maintainer of firewalld to confirm a question I have.... On Fri, Mar 20, 2020 at 12:25:49PM +0300, nshirokovskiy wrote:
Hi, all.
Some time ago I posted RFC [1] concerning an issue of unresponsive libvird during restart if there is large number of VMs that have network filters on their interfaces. It was identified that in most cases we don't need actually to reinstall network filter rules on daemon restart. Thus I proposed patches [2] that check whether we need to reapply rules or not.
The first version has a drawback that daemon won't reapply rules if someone mangled them between daemon stop and start (and this can be done just by restarting firewalld). The second one is just ugly :)
Around that time Florian Westphal in a letter off the mailing list suggested to use {iptables|ebtables}-restore to apply rules in one binary call. These binaries has --noflush option so that we won't reset current state of tables. We also need one more -L call for iptables/ebtables to query current filter state to be able to construct input for restore binaries.
I wonder can we use this approach? I see currently only one issue - we won't use firealld to spawn rules. But why we need to spawn rules through firewalld if it present? We use passthrough mode anyway. I tried to dig history for hints but didn't found anything. Patch [3] introduced spawning rules thru firewalld-cmd.
For as long as libvirt has done firewall stuff we've have the issue with other apps on the system breaking / discarding our rules. Originally this was the "firewall" sysvinit script. When firewalld came along, libvirt switched to creating rules using the firwalld passthrough mode API, in the belief that any time firewalld re-creates its rules, it would preserve any rules we'd created via the passthrough mode. I vaguely recall some recentish discussions though where I think Eric Garver mentioned we were mistaken, and that firewalld does *nothing* to preserve passthrough mode rules. Eric, from firewalld's POV, is there any functional difference between an application directly creating rules by spawning "iptables", vs creating the same rules via the firewalld passthrough API ? If there is no difference, then libvirt could stop using the firewalld passthrough API, and switch to the iptables bulk load tools.
Nikolay
[1] [RFC] Faster libvirtd restart with nwfilter rules https://www.redhat.com/archives/libvir-list/2018-September/msg01206.html
[2] nwfilter: don't reinstantiate filters if they are not changed v1: https://www.redhat.com/archives/libvir-list/2018-October/msg00904.html v2: https://www.redhat.com/archives/libvir-list/2018-October/msg01317.html
[3] network: use firewalld instead of iptables, when available v0: https://www.redhat.com/archives/libvir-list/2012-April/msg01236.html v1: https://www.redhat.com/archives/libvir-list/2012-August/msg00447.html ... v4: https://www.redhat.com/archives/libvir-list/2012-August/msg01097.html
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Apr 09, 2020 at 11:53:46AM +0100, Daniel P. Berrangé wrote:
Copying Eric Garver as a knowledgeable maintainer of firewalld to confirm a question I have....
On Fri, Mar 20, 2020 at 12:25:49PM +0300, nshirokovskiy wrote:
Hi, all.
Some time ago I posted RFC [1] concerning an issue of unresponsive libvird during restart if there is large number of VMs that have network filters on their interfaces. It was identified that in most cases we don't need actually to reinstall network filter rules on daemon restart. Thus I proposed patches [2] that check whether we need to reapply rules or not.
The first version has a drawback that daemon won't reapply rules if someone mangled them between daemon stop and start (and this can be done just by restarting firewalld). The second one is just ugly :)
Around that time Florian Westphal in a letter off the mailing list suggested to use {iptables|ebtables}-restore to apply rules in one binary call. These binaries has --noflush option so that we won't reset current state of tables.
Firewalld does exactly this. It's a significant performance gain. iptables updates are effectively: grab the _entire_ ruleset from kernel, modify the ruleset in userspace, then push the _entire_ ruleset back to the kernel. This is very costly. Using iptables-restore means there are less round trips to the kernel.
We also need one more -L call for iptables/ebtables to query current filter state to be able to construct input for restore binaries.
I wonder can we use this approach? I see currently only one issue - we won't use firealld to spawn rules. But why we need to spawn rules through firewalld if it present? We use passthrough mode anyway. I tried to dig history for hints but didn't found anything. Patch [3] introduced spawning rules thru firewalld-cmd.
For as long as libvirt has done firewall stuff we've have the issue with other apps on the system breaking / discarding our rules. Originally this was the "firewall" sysvinit script.
When firewalld came along, libvirt switched to creating rules using the firwalld passthrough mode API, in the belief that any time firewalld re-creates its rules, it would preserve any rules we'd created via the passthrough mode.
The confusion may stem from the fact that there are _three_ different "direct rule" APIs: 1) direct.addRule() These are tracked by firewalld and will survive firewalld reloads. It allows specifying a "priority", but rule indexing is handled by firewalld. 2) direct.addPassthrough() These are tracked by firewalld and will survive firewalld reloads. The user must specify the "index" in the iptables ruleset. 3) direct.passthrough() There are _NOT_ tracked by firewalld. They will _NOT_ survive a firewalld reload. This is a dumb passthrough to iptables. Firewalld has no memory of their execution immediately after they complete. I honestly don't understand why it exists as it provides no value. Unfortunately, libvirt currently uses #3 [1]. If libvirt used #2 then libvirt rules would persist firewalld reloads with libvirt taking zero action. firewalld reinstalls the addPassthrough rules after reload. The downside to #2 is that if a firewalld user executes "firewall-cmd --runtime-to-permanent" the libvirt rules will be copied to firewalld's _permanent_ configuration. This may be the reason libvirt is using passthrough (#3). [1] https://gitlab.com/libvirt/libvirt/-/blob/1e2ae2e3/src/util/virfirewalld.c#L...
I vaguely recall some recentish discussions though where I think Eric Garver mentioned we were mistaken, and that firewalld does *nothing* to preserve passthrough mode rules.
See above.
Eric, from firewalld's POV, is there any functional difference between an application directly creating rules by spawning "iptables", vs creating the same rules via the firewalld passthrough API ?
If using direct.passthrough (#3 above), then no.
If there is no difference, then libvirt could stop using the firewalld passthrough API, and switch to the iptables bulk load tools.
Seem sensible to me. [..] Hope that was helpful. Eric.

On 4/9/20 8:44 AM, Eric Garver wrote:
Copying Eric Garver as a knowledgeable maintainer of firewalld to confirm a question I have....
On Fri, Mar 20, 2020 at 12:25:49PM +0300, nshirokovskiy wrote:
Hi, all.
Some time ago I posted RFC [1] concerning an issue of unresponsive libvird during restart if there is large number of VMs that have network filters on their interfaces. It was identified that in most cases we don't need actually to reinstall network filter rules on daemon restart. Thus I proposed patches [2] that check whether we need to reapply rules or not.
The first version has a drawback that daemon won't reapply rules if someone mangled them between daemon stop and start (and this can be done just by restarting firewalld). The second one is just ugly :)
Around that time Florian Westphal in a letter off the mailing list suggested to use {iptables|ebtables}-restore to apply rules in one binary call. These binaries has --noflush option so that we won't reset current state of tables. Firewalld does exactly this. It's a significant performance gain. iptables updates are effectively: grab the _entire_ ruleset from kernel, modify the ruleset in userspace, then push the _entire_ ruleset back to
On Thu, Apr 09, 2020 at 11:53:46AM +0100, Daniel P. Berrangé wrote: the kernel. This is very costly. Using iptables-restore means there are less round trips to the kernel.
We also need one more -L call for iptables/ebtables to query current filter state to be able to construct input for restore binaries.
I wonder can we use this approach? I see currently only one issue - we won't use firealld to spawn rules. But why we need to spawn rules through firewalld if it present? We use passthrough mode anyway. I tried to dig history for hints but didn't found anything. Patch [3] introduced spawning rules thru firewalld-cmd. For as long as libvirt has done firewall stuff we've have the issue with other apps on the system breaking / discarding our rules. Originally this was the "firewall" sysvinit script.
When firewalld came along, libvirt switched to creating rules using the firwalld passthrough mode API, in the belief that any time firewalld re-creates its rules, it would preserve any rules we'd created via the passthrough mode. The confusion may stem from the fact that there are _three_ different "direct rule" APIs:
1) direct.addRule() These are tracked by firewalld and will survive firewalld reloads. It allows specifying a "priority", but rule indexing is handled by firewalld.
2) direct.addPassthrough() These are tracked by firewalld and will survive firewalld reloads. The user must specify the "index" in the iptables ruleset.
3) direct.passthrough() There are _NOT_ tracked by firewalld. They will _NOT_ survive a firewalld reload. This is a dumb passthrough to iptables. Firewalld has no memory of their execution immediately after they complete. I honestly don't understand why it exists as it provides no value.
Unfortunately, libvirt currently uses #3 [1]. If libvirt used #2 then libvirt rules would persist firewalld reloads with libvirt taking zero action. firewalld reinstalls the addPassthrough rules after reload.
The downside to #2 is that if a firewalld user executes "firewall-cmd --runtime-to-permanent" the libvirt rules will be copied to firewalld's _permanent_ configuration. This may be the reason libvirt is using passthrough (#3).
[1] https://gitlab.com/libvirt/libvirt/-/blob/1e2ae2e3/src/util/virfirewalld.c#L...
I vaguely recall some recentish discussions though where I think Eric Garver mentioned we were mistaken, and that firewalld does *nothing* to preserve passthrough mode rules. See above.
Eric, from firewalld's POV, is there any functional difference between an application directly creating rules by spawning "iptables", vs creating the same rules via the firewalld passthrough API ? If using direct.passthrough (#3 above), then no.
If there is no difference, then libvirt could stop using the firewalld passthrough API, and switch to the iptables bulk load tools. Seem sensible to me.
So in the end the only interaction we should have with firewalld is to watch for it to restart, and reload all our rules when that happens. Sigh. It will definitely be nice when firewalld is able to support all the rules libvirt needs (including output, forward, and nat) via addRule() [*]. (nudge nudge, wink wink :-)) The lack of a central authority for managing iptables (and now nftables) rules has been a thorn in the side for a very long time, and although it's obvious from this exchange that switching back to always using iptables directly is the right thing to do for now, it still feels like a step in the wrong direction. [*] BTW, this points out that there needs to be a way to addRule() rules such that they don't get moved into the permanent config when "firewall-cmd --runtime-to-permanent" is run. It would be impossible for libvirt to keep track of that.

On Mon, Apr 13, 2020 at 02:20:08PM -0400, Laine Stump wrote:
On 4/9/20 8:44 AM, Eric Garver wrote:
Copying Eric Garver as a knowledgeable maintainer of firewalld to confirm a question I have....
On Fri, Mar 20, 2020 at 12:25:49PM +0300, nshirokovskiy wrote:
Hi, all.
Some time ago I posted RFC [1] concerning an issue of unresponsive libvird during restart if there is large number of VMs that have network filters on their interfaces. It was identified that in most cases we don't need actually to reinstall network filter rules on daemon restart. Thus I proposed patches [2] that check whether we need to reapply rules or not.
The first version has a drawback that daemon won't reapply rules if someone mangled them between daemon stop and start (and this can be done just by restarting firewalld). The second one is just ugly :)
Around that time Florian Westphal in a letter off the mailing list suggested to use {iptables|ebtables}-restore to apply rules in one binary call. These binaries has --noflush option so that we won't reset current state of tables. Firewalld does exactly this. It's a significant performance gain. iptables updates are effectively: grab the _entire_ ruleset from kernel, modify the ruleset in userspace, then push the _entire_ ruleset back to
On Thu, Apr 09, 2020 at 11:53:46AM +0100, Daniel P. Berrangé wrote: the kernel. This is very costly. Using iptables-restore means there are less round trips to the kernel.
We also need one more -L call for iptables/ebtables to query current filter state to be able to construct input for restore binaries.
I wonder can we use this approach? I see currently only one issue - we won't use firealld to spawn rules. But why we need to spawn rules through firewalld if it present? We use passthrough mode anyway. I tried to dig history for hints but didn't found anything. Patch [3] introduced spawning rules thru firewalld-cmd. For as long as libvirt has done firewall stuff we've have the issue with other apps on the system breaking / discarding our rules. Originally this was the "firewall" sysvinit script.
When firewalld came along, libvirt switched to creating rules using the firwalld passthrough mode API, in the belief that any time firewalld re-creates its rules, it would preserve any rules we'd created via the passthrough mode. The confusion may stem from the fact that there are _three_ different "direct rule" APIs:
1) direct.addRule() These are tracked by firewalld and will survive firewalld reloads. It allows specifying a "priority", but rule indexing is handled by firewalld.
2) direct.addPassthrough() These are tracked by firewalld and will survive firewalld reloads. The user must specify the "index" in the iptables ruleset.
3) direct.passthrough() There are _NOT_ tracked by firewalld. They will _NOT_ survive a firewalld reload. This is a dumb passthrough to iptables. Firewalld has no memory of their execution immediately after they complete. I honestly don't understand why it exists as it provides no value.
Unfortunately, libvirt currently uses #3 [1]. If libvirt used #2 then libvirt rules would persist firewalld reloads with libvirt taking zero action. firewalld reinstalls the addPassthrough rules after reload.
The downside to #2 is that if a firewalld user executes "firewall-cmd --runtime-to-permanent" the libvirt rules will be copied to firewalld's _permanent_ configuration. This may be the reason libvirt is using passthrough (#3).
[1] https://gitlab.com/libvirt/libvirt/-/blob/1e2ae2e3/src/util/virfirewalld.c#L...
I vaguely recall some recentish discussions though where I think Eric Garver mentioned we were mistaken, and that firewalld does *nothing* to preserve passthrough mode rules. See above.
Eric, from firewalld's POV, is there any functional difference between an application directly creating rules by spawning "iptables", vs creating the same rules via the firewalld passthrough API ? If using direct.passthrough (#3 above), then no.
If there is no difference, then libvirt could stop using the firewalld passthrough API, and switch to the iptables bulk load tools. Seem sensible to me.
So in the end the only interaction we should have with firewalld is to watch for it to restart, and reload all our rules when that happens. Sigh.
Correct. For now that seems like the best approach. I acknowledge that this is very far from ideal. If libvirt were to use its own nftables table, then libvirts rules would be unaffected by firewalld's reload. firewalld only touches its own nftables table called "firewalld". Of course you'd still need the "libvirt" zone to allow the traffic through firewalld.
It will definitely be nice when firewalld is able to support all the rules libvirt needs (including output, forward, and nat) via addRule() [*]. (nudge nudge, wink wink :-))
First class forward/output support is in development. It's just a large amount of work that I've only been able to do in the background. You can track the progress via this card though: https://github.com/orgs/firewalld/projects/1#card-25963208 libirt will still have the issue of reloading the runtime only rules after firewalld restarts though. At the moment there is no API to batch changes.
The lack of a central authority for managing iptables (and now nftables) rules has been a thorn in the side for a very long time, and although it's obvious from this exchange that switching back to always using iptables directly is the right thing to do for now, it still feels like a step in the wrong direction.
Agreed.
[*] BTW, this points out that there needs to be a way to addRule() rules such that they don't get moved into the permanent config when "firewall-cmd --runtime-to-permanent" is run. It would be impossible for libvirt to keep track of that.
I agree this is problematic. firewalld should not be saving runtime changes done by non-users (e.g. libvirt). I filed an upstream issue [2]. At the moment I'm not sure how to address the issue. I don't think firewalld differentiates configuration changes that comes from users (firewall-cmd) from libvirt et al. [2]: https://github.com/firewalld/firewalld/issues/606

On 3/20/20 5:25 AM, nshirokovskiy wrote:
Hi, all.
Some time ago I posted RFC [1] concerning an issue of unresponsive libvird during restart if there is large number of VMs that have network filters on their interfaces. It was identified that in most cases we don't need actually to reinstall network filter rules on daemon restart. Thus I proposed patches [2] that check whether we need to reapply rules or not.
The first version has a drawback that daemon won't reapply rules if someone mangled them between daemon stop and start (and this can be done just by restarting firewalld). The second one is just ugly :)
Around that time Florian Westphal in a letter off the mailing list suggested to use {iptables|ebtables}-restore to apply rules in one binary call. These binaries has --noflush option so that we won't reset current state of tables. We also need one more -L call for iptables/ebtables to query current filter state to be able to construct input for restore binaries.
So are you considering doing something with this idea? At the end of our discussion, both libvirt and firewalld people agreed that we're gaining nothing from setting our rules via firewalld passthrough, and we would be potentially gaining *a lot* by setting them in batch mode with "iptables-restore -n". Perhaps we could just add a new firewall backend (in util/virfirewall.c) that checked for the presence of iptables-restore (and ip6tables-restore and ebtables-restore), and if they are found it would use a backend that just put all the rules for each layer together in a temporary file and send them to *-restore (the internals would need to be reorganized a bit, so that args like -w, -l, and -n could be added in during virFirewallApply (if necessary) rather than when initially adding rules). Ooh! I just tried it, and iptables-restore also accepts (and acts on) lines with "-D" to delete rules! So we could do everything in a single go - intermixing -D and -A rules in the same file (to minimize the time when the firewall would be incorrect while still taking advantage of the efficiency of doing everything in a batch).
I wonder can we use this approach? I see currently only one issue - we won't use firealld to spawn rules. But why we need to spawn rules through firewalld if it present? We use passthrough mode anyway. I tried to dig history for hints but didn't found anything. Patch [3] introduced spawning rules thru firewalld-cmd.
Nikolay
[1] [RFC] Faster libvirtd restart with nwfilter rules https://www.redhat.com/archives/libvir-list/2018-September/msg01206.html
[2] nwfilter: don't reinstantiate filters if they are not changed v1: https://www.redhat.com/archives/libvir-list/2018-October/msg00904.html v2: https://www.redhat.com/archives/libvir-list/2018-October/msg01317.html
[3] network: use firewalld instead of iptables, when available v0: https://www.redhat.com/archives/libvir-list/2012-April/msg01236.html v1: https://www.redhat.com/archives/libvir-list/2012-August/msg00447.html ... v4: https://www.redhat.com/archives/libvir-list/2012-August/msg01097.html

On 08.05.2020 06:46, Laine Stump wrote:
On 3/20/20 5:25 AM, nshirokovskiy wrote:
Hi, all. Some time ago I posted RFC [1] concerning an issue of unresponsive libvird during restart if there is large number of VMs that have network filters on their interfaces. It was identified that in most cases we don't need actually to reinstall network filter rules on daemon restart. Thus I proposed patches [2] that check whether we need to reapply rules or not. The first version has a drawback that daemon won't reapply rules if someone mangled them between daemon stop and start (and this can be done just by restarting firewalld). The second one is just ugly :) Around that time Florian Westphal in a letter off the mailing list suggested to use {iptables|ebtables}-restore to apply rules in one binary call. These binaries has --noflush option so that we won't reset current state of tables. We also need one more -L call for iptables/ebtables to query current filter state to be able to construct input for restore binaries.
So are you considering doing something with this idea? At the end of our discussion, both libvirt and firewalld people agreed that we're gaining nothing from setting our rules via firewalld passthrough, and we would be potentially gaining *a lot* by setting them in batch mode with "iptables-restore -n".
Perhaps we could just add a new firewall backend (in util/virfirewall.c) that checked for the presence of iptables-restore (and ip6tables-restore and ebtables-restore), and if they are found it would use a backend that just put all the rules for each layer together in a temporary file and send them to *-restore (the internals would need to be reorganized a bit, so that args like -w, -l, and -n could be added in during virFirewallApply (if necessary) rather than when initially adding rules).
Ooh! I just tried it, and iptables-restore also accepts (and acts on) lines with "-D" to delete rules! So we could do everything in a single go - intermixing -D and -A rules in the same file (to minimize the time when the firewall would be incorrect while still taking advantage of the efficiency of doing everything in a batch).
Hi! Yeah, I'd want to write such a patch. Just not sure when I have time to start get started. Nikolay
participants (5)
-
Daniel P. Berrangé
-
Eric Garver
-
Laine Stump
-
Nikolay Shirokovskiy
-
nshirokovskiy