[RFC] nftables: switch to nft -f
Hi,

I'd like to propose switching from individual nft commands to nft -f in the nftables firewall backend, either as a new backend or as an extension of the existing one.

The main motivation is performance. In benchmarking, nft -f is about 46% faster:

The following commands are run on the .args files in the nwfilter nftables v6 patch.

$ time for i in *.args; do n=$(echo $i | sed s/.args$//); sudo ./reset-tables.sh; sudo sh -e $i; done
real 0m8.335s
user 0m0.320s
sys 0m0.564s

$ time for i in *.output; do sudo ./reset-tables.sh skipvmap; sudo nft -f $i; done
real 0m4.518s
user 0m0.274s
sys 0m0.498s

On top of that, loading a full ruleset via nft -f is atomic for the whole set of changes, which allows us to remove the rollback logic and removes the need for tmp rules.

One issue I see is that the current approach allows certain commands to fail silently: deleting something that doesn't exist won't abort the operation. That's not the case with nft -f, where a failure stops the whole load. One approach here is to use nft -f only for sections that don't contain ignore-errors commands, and to run the ignore-errors commands separately as individual commands.

As described earlier: to handle the current non-atomic nature, we have a few extra commands in place (a temporary jump rule, delete vmap entry, add vmap entry). The tmp jump can be replaced by running nft -f instead.

I wonder what your opinions are about adding nft -f.

Regards,
Dion
On 4/8/26 10:08 AM, Dion Bosschieter wrote:
Hi,
I'd like to propose switching from individual nft commands to nft -f in the nftables firewall backend — either as a new backend or as an extension of the existing one.
The main motivation is performance. In benchmarking, nft -f is about 46% faster:
Interesting - I've always assumed the difference would be much greater.

We talked about doing that with the iptables backend of the network driver at some point many years ago. I don't remember if we decided against it for some concrete reason, or simply because nobody ever had sufficient motivation to do it. (I think maybe even someone did a proof of concept at some time?)

One thing that I've started a few messages about wrt your nwfilter nftables backend (and then got interrupted in the middle, never sent it, and lost the message :-() is the idea of keeping track of rules that have been added so they can be properly removed before new rules are added; this is especially important when restarting the daemon after switching the backend from iptables to nftables or vice versa. This may seem unimportant ("just reboot the host to clean everything out"), but users with many guests running on their host will *not* be happy about needing to migrate, suspend, or shut off all their guests in order to reboot just because the new libvirt version has switched the default nwfilter backend from iptables to nftables! :-O. (I did notice you and Daniel conversing about it, but was just breezing past the message, thought "I should come back to this!" to myself, but then completely and totally forgot :-/)

Anyway, I digress. The reason I bring this up is because in the case of iptables, you can remove a rule or a chain simply by repeating the command you used to *add* the rule, replacing "-A" or "-I" with "-D". nftables doesn't allow this though - while you can remove an entire table, or a chain, by regurgitating the same command with s/add/delete/, you can't remove a rule that way. Instead you need to provide the table, chain and handle to "nft delete rule". The handle can be retrieved from stdout when the nft add rule command is run with "-a". (It can also be parsed out of "nft -a list ruleset", but that would be even more complicated.)
So, for any rules that are in a self-contained chain, you can just "nft delete chain blah", but if there are any individual rules, you'll need to add them with nft -a add rule, keep track of the handle, and delete them with "nft delete rule $table $chain $handle" (or something like that - it's been awhile)

.....aaaannnnnndddd now I've digressed *even further*!! What I've been working up to saying is that adding all the rules at once from a file may preclude retrieving the handles for those rules, and they may actually be essential. If so, maybe there could be a hybrid approach, where everything that's self-contained in a chain used only for this particular interface is added using nft -f and a file, while a few rules that are mixed in with other rules in a common chain could be added individually so their handles could be saved.

Alternately, another possibility would be to skip exec'ing of the nft command completely, and directly use the same API that the nft command uses (libnftables) (firewalld also uses this API - via python bindings of course). You'd get the advantages of atomic operations, and also likely even better performance than nft -f. I think in the long run this would be much nicer to have (although I don't know how handles of individual rules are handled in the libnftables API when you add multiple rules at a time).

So, I guess my response to your question is "Yes, but..." :-)
The following commands are run on the .args files in the nwfilter nftables v6 patch.
$ time for i in *.args; do n=$(echo $i | sed s/.args$//); sudo ./reset-tables.sh; sudo sh -e $i; done
real 0m8.335s
user 0m0.320s
sys 0m0.564s

$ time for i in *.output; do sudo ./reset-tables.sh skipvmap; sudo nft -f $i; done
real 0m4.518s
user 0m0.274s
sys 0m0.498s
On top of that, loading a full ruleset via nft -f is atomic for the whole set of changes, which allows us to remove the rollback logic and removes the need for tmp rules.
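To make the atomic load concrete, here is a minimal Python sketch (not libvirt code) that collects argv-style commands into one ruleset text and hands it to a single nft -f - invocation, which reads the script from stdin. The helper names build_batch and load_atomically and the table and chain names are made up for illustration, and the sketch assumes that no argument needs quoting inside an nft script.

```python
import subprocess


def build_batch(commands):
    """Join argv-style nft commands into one script, one command per line."""
    return "".join(" ".join(argv) + "\n" for argv in commands)


def load_atomically(commands, nft="nft"):
    """Feed the whole batch to one 'nft -f -' call (script read from stdin).

    The file is applied as a single transaction, so either every command
    takes effect or none does. Needs root and an nft binary; it is defined
    here but deliberately not called below.
    """
    script = build_batch(commands)
    subprocess.run([nft, "-f", "-"], input=script, text=True, check=True)


# Hypothetical commands; the table and chain names are illustrative only.
commands = [
    ["add", "table", "bridge", "example_table"],
    ["add", "chain", "bridge", "example_table", "example_in"],
    ["add", "rule", "bridge", "example_table", "example_in",
     "ether", "type", "arp", "accept"],
]
script = build_batch(commands)
print(script, end="")
```

Because the whole script succeeds or fails as one unit, a caller written this way would not need to record per-command rollback steps for the commands inside the batch.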
One issue I see is that the current approach allows certain commands to fail silently: deleting something that doesn't exist won't abort the operation. That's not the case with nft -f, where a failure stops the whole load. One approach here is to use nft -f only for sections that don't contain ignore-errors commands, and to run the ignore-errors commands separately as individual commands.
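The splitting idea above could look roughly like the following Python sketch. It is only an illustration under assumptions: the (argv, ignore_errors) pair representation and the split_into_runs name are invented here and are not how virFirewall stores its commands.

```python
def split_into_runs(commands):
    """Group commands into runs while preserving their order.

    `commands` is a list of (argv, ignore_errors) pairs. Consecutive strict
    commands are merged into one ("batch", [argv, ...]) run that could be
    loaded with a single nft -f; each ignore-errors command becomes its own
    ("single", argv) run so that its failure cannot abort a batch.
    """
    runs = []
    batch = []
    for argv, ignore_errors in commands:
        if ignore_errors:
            if batch:
                # Close the batch collected so far before the lone command.
                runs.append(("batch", batch))
                batch = []
            runs.append(("single", argv))
        else:
            batch.append(argv)
    if batch:
        runs.append(("batch", batch))
    return runs


# Hypothetical input: a best-effort delete followed by two strict adds.
commands = [
    (["delete", "chain", "bridge", "example_table", "example_in"], True),
    (["add", "chain", "bridge", "example_table", "example_in"], False),
    (["add", "rule", "bridge", "example_table", "example_in", "accept"], False),
]
runs = split_into_runs(commands)
for kind, payload in runs:
    print(kind, payload)
```

The trade-off is that atomicity then holds only within each batch: an ignore-errors command in the middle of a sequence splits it into two separately committed loads.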
As described earlier: to handle the current non-atomic nature, we have a few extra commands in place — a temporary jump rule, delete vmap entry, add vmap entry. The tmp jump can be replaced by running nft -f instead.
I wonder what your opinions are about adding nft -f.
Regards,
Dion
On 4/9/26 03:26, Laine Stump wrote:
On 4/8/26 10:08 AM, Dion Bosschieter wrote:
Hi,
I'd like to propose switching from individual nft commands to nft -f in the nftables firewall backend — either as a new backend or as an extension of the existing one.
The main motivation is performance. In benchmarking, nft -f is about 46% faster:
Interesting - I've always assumed the difference would be much greater.
Still significant enough to investigate, in my opinion :)
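For reference, the 46% figure follows from the two "real" times in the benchmark. A small Python check of the arithmetic (the variable names are just for illustration):

```python
separate_s = 8.335  # "real" time of the loop running individual commands
batch_s = 4.518     # "real" time of the loop running nft -f

# Share of wall-clock time saved, and the equivalent speedup factor.
time_saved_pct = (separate_s - batch_s) / separate_s * 100
speedup = separate_s / batch_s

print(f"time saved: {time_saved_pct:.1f}%")  # time saved: 45.8%
print(f"speedup:    {speedup:.2f}x")         # speedup:    1.84x
```

So "46% faster" here means roughly 46% less wall-clock time, or about 1.84x the throughput. Both timed loops also include the reset-tables.sh call and the sudo overhead per file, so the difference attributable to nft alone may well be larger.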
We talked about doing that with the iptables backend of the network driver at some point many years ago. I don't remember if we decided against it for some concrete reason, or simply because nobody ever had sufficient motivation to do it. (I think maybe even someone did a proof of concept at some time?)
One thing that I've started a few messages about wrt your nwfilter nftables backend (and then got interrupted in the middle, never sent it, and lost the message :-() is the idea of keeping track of rules that have been added so they can be properly removed before new rules are added; this is especially important when restarting the daemon after switching the backend from iptables to nftables or vice versa. This may seem unimportant ("just reboot the host to clean everything out"), but users with many guests running on their host will *not* be happy about needing to migrate, suspend, or shut off all their guests in order to reboot just because the new libvirt version has switched the default nwfilter backend from iptables to nftables! :-O. (I did notice you and Daniel conversing about it, but was just breezing past the message, thought "I should come back to this!" to myself, but then completely and totally forgot :-/)
I did indeed solve this by calling ->allTeardown on the old driver after a daemon restart. That ensures that the old firewall gets removed after the new firewall is defined. I've sent a patch in a thread on the v6 version. I could also add the patch and resend the series as v7, but I was wondering if I could get some feedback on that specific patch first, specifically because in that patch I'm now changing the nwfilter binding inside _gentech_driver.
Anyway, I digress. The reason I bring this up is because in the case of iptables, you can remove a rule or a chain simply by repeating the command you used to *add* the rule, replacing "-A" or "-I" with "-D". nftables doesn't allow this though - while you can remove an entire table, or a chain, by regurgitating the same command with s/add/delete/, you can't remove a rule that way. Instead you need to provide the table, chain and handle to "nft delete rule". The handle can be retrieved from stdout when the nft add rule command is run with "-a". (It can also be parsed out of "nft -a list ruleset", but that would be even more complicated.)
In the nwfilter nftables driver I submitted, it does indeed do a list -a, grabs the "# handle" value, and uses that to issue a delete. For chains it is a bit easier: we can remove chains without the handle, which then also removes all of the rules in that chain. That happens inside nftablesHandleRemoveAll.
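As a rough illustration of the list-and-delete approach (this is not the code from the submitted driver), the following Python sketch pulls rule handles out of a listing and builds the matching delete commands. The sample listing is hand-written to approximate what nft -a list chain prints rather than captured from a real run, and the function, table and chain names are hypothetical.

```python
import re

# A listed object ends in "# handle N" when nft is run with -a.
HANDLE_RE = re.compile(r"^(?P<body>.*\S)\s+# handle (?P<handle>\d+)$")

# Hand-written approximation of "nft -a list chain ..." output.
SAMPLE = """\
table bridge example_table { # handle 3
\tchain example_in { # handle 5
\t\tether type arp accept # handle 7
\t\tip saddr 192.0.2.1 drop # handle 8
\t}
}
"""


def rule_handles(listing):
    """Return (rule_text, handle) pairs, skipping table and chain headers."""
    rules = []
    for line in listing.splitlines():
        match = HANDLE_RE.match(line.strip())
        # Header lines end in "{" before the handle comment; they are not rules.
        if match and not match.group("body").endswith("{"):
            rules.append((match.group("body"), int(match.group("handle"))))
    return rules


def delete_commands(family, table, chain, listing):
    """Build one 'nft delete rule ... handle N' argv per listed rule."""
    return [
        ["nft", "delete", "rule", family, table, chain, "handle", str(handle)]
        for _, handle in rule_handles(listing)
    ]


for argv in delete_commands("bridge", "example_table", "example_in", SAMPLE):
    print(" ".join(argv))
```

A rule whose own text happens to end in something that looks like "# handle N" (for example inside a comment) would confuse this simple pattern, so a real implementation would need a stricter parser or the JSON output.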
So, for any rules that are in a self-contained chain, you can just "nft delete chain blah", but if there are any individual rules, you'll need to add them with nft -a add rule, keep track of the handle, and delete them with "nft delete rule $table $chain $handle" (or something like that - it's been awhile)
If "list" commands are run separately without nft -f, and users of the nft -f firewall backend delete based on the output of nft list -a, I think that should work. It could work for the nwfilter nftables backend; I'm not sure if that is also the case for the network nftables backend and the network bridge driver.
.....aaaannnnnndddd now I've digressed *even further*!! What I've been working up to saying is that adding all the rules at once from a file may preclude retrieving the handles for those rules, and they may actually be essential. If so, maybe there could be a hybrid approach, where everything that's self-contained in a chain used only for this particular interface is added using nft -f and a file, while a few rules that are mixed in with other rules in a common chain could be added individually so their handles could be saved.
That could also work, maybe via a way for users to specify whether or not rules should be run via nft -f? To me this sounds a bit dirty, as it would require another boolean flag on the virFirewallAddCmdFull function. I'm not too familiar with how virFirewall is set up, so I can't suggest an implementation yet, but I would be willing to try and write it.
Alternately, another possibility would be to skip exec'ing of the nft command completely, and directly use the same API that the nft command uses (libnftables) (firewalld also uses this API - via python bindings of course). You'd get the advantages of atomic operations, and also likely even better performance than nft -f. I think in the long run this would be much nicer to have (although I don't know how handles of individual rules are handled in the libnftables API when you add multiple rules at a time).
Yes indeed, that would be the nicest solution, but I think that this requires a bigger overhaul. We would get a dependency on libnftables, and on specific versions of the lib because of the interface that we depend on. Then virFirewall transactions can be atomic as well. I wonder if most of the virFirewall nft usages can be kept the same, if libnftables allows for calls similar to how one would call the nft command. For example:

    fwrule = virFirewallAddCmd(fw, layer, "add", "chain", "bridge", tableName, IN_CHAIN, NULL);