Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver

Wednesday, 3 May 2023

On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote:
...
 When I first started on this (long, protracted, repeatedly
interrupted
 for extended periods - many of these patches are > a year old) task, I
 considered doing an all-at-once complete replacement of iptables with
 nftables, since all the Linux distros we support have had nftables for
 several years, and I'm pretty sure nobody has it disabled (not even
 sure if it's possible to disable nftables while still enabling
 iptables, since they both use xtables in the kernel). But due to
 libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit
 fd5b15ff all the way back in July 2010 for details) which has no
 equivalent in nftables rules (and we don't *want* it to!!), and the
 desire to be able to easily switch back to iptables in case of an
 unforeseen regression, we decided that both iptables and nftables need
 to be supported (for now), with the default (for now) remaining as
 iptables. 
What happens if you use iptables -j CHECKSUM --checksum-fill, and
this is just the iptables compat shim that is secretly talking to
NFT in the kernel ?  Is the checksum stuff just ignored entirely
then ?

If so, then the key decision factor for us, is whether any of the
supported distros still officially support use of the iptables KMOD
instead of nft KMOD.

...
 So, what I've ended up with is:

 1) a network.conf file (didn't exist before) with a single setting
 "firewall_backend". If unset, the network driver tries to use iptables
 on the backend, and if that's missing, then tries to use nftables. 
For most distros I feel like nftables should be the default
and not require a user opt in.  That raises the question of
how do we deal with the upgrade path.

Historically when starting libvirt we'll blow away our existing
iptables rules and create them fresh. For an upgrade path we'll
need a variant which blows away iptables rules and instead creates
nftables rules. If that works, then we could default to nftables
without user config.

...
 We had spent some time in IRC discussing different ways of using new
 functionality available in nftables to make a more
 efficient/performant implemention of the desired filtering, and there
 are some really great possibilities that need to be explored, but in
 the end there were too many details up in the air, and I decided that
 it would be more "accomplishable" (coined a new word there!) to first
 replicate existing behavior with nftables, but do it inside a
 framework that makes it easy to modify the details in the future (in
 particular making it painless to switch back and forth between builds
 with differing filter models at runtime) - this way we'll be able to
 separate the infrastructure work from the details of the rules (which
 we can then more easily work on and experiment with). (This implies
 that the main objective right now is "get rid of iptables
 dependencies", not "make the filtering faster and more efficient").

I feel like filtering wrt virtual networks isn't our main
performance bottleneck. Most machines only have a couple
of virtual networks, so most host traffic only has to
match a few rules.

With nwfilter though, if we have 1000 VMs we'll have
1000 top level rules that need evaluating for every
packet. IOW, the performance optimizations allowed
by nftables are much more relevant to nwfilter.

...
 * allows switching between iptables/nftables backends without
   rebooting or restarting networks/guests.

   Because the commands required to remove a network's filter rules are
   now saved in the network status XML, each time libvirtd (or
   virtnetworkd) is restarted, it will execute exactly the commands
   needed to remove the filter rules that had been added by the
   previous libvirtd/virtnetworkd (rather than just making a guess, as
   we've always done up until now), and then add new rules using the
   current backend+binary's set of rules (while also saving the info
   needed for future removal of these new rules back into the network's
   status XML). 
Ok that's answering my question about upgrades then. It looks like we
should be able to default to nftables out of the box, if we assume
that absence of info in the network status XML implies iptables.

...
 * virFirewall does *not* provide a backend-agnostic interface [this
is fine]

   * We need to maintain a backward-compatible API for virFirewall so
     that we don't have to touch nwfilter code. Trying to make its API
     backend-agnostic would require individually considering/changing
     every nwfilter use of virFirewall.

   * instead virFirewall objects are just a way to build a collection
     of commands to execute to build a firewall, then execute them
     while collecting info for and building a collection of commands
     that will tear down that firewall in the future.

   Do I want to "fix" this in the future by making virFirewall a higher
   level interface that accepts tokens describing the type of rule to
   add (rather than backend-specific arguments to a backend-specific
   command)?  No. I think I like the way virFirewall works (as
   described in that previous bullet-point), instead I'm thinking that
   it is just slightly mis-named - I've lately been thinking of it as a
   "virNetFilterCmdList". Similarly, the virFirewallRules that it has a
   list of aren't really "rules", they are better described as commands
   or actions, so maybe they should be renamed to virNetfilterCmd or
   virNetfilterAction. But that is just cosmetic, so I didn't want to
   get into it in these patches (especially in case someone disagrees,
   or has a better idea for naming). 
The main thing I'd like to see is to maintain the clean split of
code between what is general purpose, and what is specific to the
virtual network driver and what is specific to the nwfilter
driver. AFAICT, this series mixes that up with circular dependancies
between viriptables.c, virfirewall.c and virnetfilter.c.

I'd like virfirwall.c to remain indepent of anything related to
the virtual network driver.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver