On Tue, Apr 23, 2024 at 01:27:05PM -0400, Laine Stump wrote:
On 4/23/24 7:15 AM, Daniel P. Berrangé wrote:
> On Sun, Apr 21, 2024 at 10:53:32PM -0400, Laine Stump wrote:
> > Support using nftables to setup the firewall for each virtual network,
> > rather than iptables. The initial implementation of the nftables
> > backend creates (almost) exactly the same ruleset as the iptables
> > backend, determined by running the following commands on a host that
> > has an active virtual network:
> >
> > iptables-save >iptables.txt
> > iptables-restore-translate -f iptables.txt
> >
> > (and the similar ip6tables-save/ip6tables-restore-translate for an
> > IPv6 network). Correctness of the new backend was checked by comparing
> > the output of:
> >
> > nft list ruleset
> >
> > when the backend is set to iptables and when it is set to nftables.
> >
> > This page was used as a guide:
> >
> >
https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to...
> >
> > The only differences between the rules created by the nftables backed
> > vs. the iptables backend (aside from a few inconsequential changes in
> > display order of some chains/options) are:
> >
> > 1) When we add nftables rules, rather than adding them in the
> > system-created "filter" and "nat" tables, we add them in a
private
> > table (ie only we should be using it) created by us called "libvirt"
> > (the system-created "filter" and "nat" tables can't be
used because
> > adding any rules to those tables directly with nft will cause failure
> > of any legacy application attempting to use iptables when it tries to
> > list the iptables rules (e.g. "iptables -S").
> >
> > (NB: in nftables only a single table is required for both nat and
> > filter rules - the chains for each are differentiated by specifying
> > different "hook" locations for the toplevel chain of each)
> >
> > 2) nftables doesn't support the "checksum mangle" rule (or any
> > equivalent functionality) that we have historically added to our
> > iptables rules, so the nftables rules we add have nothing related to
> > checksum mangling.
> >
> > (NB: The result of (2) is that if you a) have a very old guest (RHEL5
> > era or earlier) and b) that guest is using a virtio-net network
> > device, and c) the virtio-net device is using vhost packet processing
> > (the default) then DHCP on the guest will fail. You can work around
> > this by adding <driver name='qemu'/> to the <interface> XML
for the
> > guest).
> >
> > There are certainly much better nftables rulesets that could be used
> > instead of those implemented here, and everything is in place to make
> > future changes to the rules that are used simple and free of surprises
> > (e.g. the rules that are added have coresponding "removal" commands
> > added to the network status so that we will always remove exactly the
> > rules that were previously added rather than trying to remove the
> > rules that "this build of libvirt would have added" (which will be
> > incorrect the first time we run a libvirt with a newly modified
> > ruleset). For this initial implementation though, I wanted the
> > nftables rules to be as identical to the iptables rules as possible,
> > just to make it easier to verify that everything is working.
> >
> > The backend can be manually chosen using the firewall_backend setting
> > in /etc/libvirt/network.conf. libvirtd/virtnetworkd will read this
> > setting when it starts; if there is no explicit setting, it will look
> > for iptables commands on the host and, if they are found, will select
> > the iptables backend (this is the default for the sake of 100%
> > backward compatibility), but if iptables commands aren't found, then
> > it will use the nftables backend.
> >
> > (Since libvirt will automatically remove all its previous filter rules
> > and re-add rules using the current backend setting for all active
> > networks when it starts up, and the only noticeable change in behavior
> > between the iptables and nftables backends is that noted in item (2)
> > above, we could instead decide to make nftables the default backend
> > rather than iptables - it all depends on how important it is to work
> > properly on 15 year old guest OSes using DHCP with virtio-net
> > interfaces).
> >
> > Signed-off-by: Laine Stump <laine(a)redhat.com>
> > ---
> > po/POTFILES | 1 +
> > src/network/bridge_driver_conf.c | 3 +
> > src/network/bridge_driver_linux.c | 18 +-
> > src/network/meson.build | 1 +
> > src/network/network.conf | 17 +-
> > src/network/network_nftables.c | 940 ++++++++++++++++++++++++++++++
> > src/network/network_nftables.h | 28 +
> > src/util/virfirewall.c | 169 +++++-
> > src/util/virfirewall.h | 2 +
> > 9 files changed, 1174 insertions(+), 5 deletions(-)
> > create mode 100644 src/network/network_nftables.c
> > create mode 100644 src/network/network_nftables.h
> >
>
> > + if (needRollback) {
> > + virFirewallCmd *rollback = virFirewallAddRollbackCmd(firewall,
fwCmd->layer, NULL);
> > + const char *handleStart = NULL;
> > + size_t handleLen = 0;
> > + g_autofree char *handleStr = NULL;
> > + g_autofree char *rollbackStr = NULL;
> > +
> > + /* Search for "# handle n" in stdout of the nft add command
-
> > + * that is the handle of the table/rule/chain that will later
> > + * need to be deleted.
> > + */
>
> What are the uniqueness guarantees of handle numbers.
Each table has a monotonically increasing counter (I'd assume at least 32
bits), so it shouldn't be a problem.
But WAIT!!! - While typing this reply I've discovered something new!
Until about 45 minutes ago, I had assumed there was a single systemwide
counter. But based on your question, I asked egarver who told me that there
is a counter for each table. And we create our own tables (called "libvirt",
for both ip and ip6). I just tried manually deleting the libvirt table, and
in that case the counter does start over from 0! :-O
Oh, that's not terrible at all, as the unique constraint is thus
("libvirt", <handle>)
which eliminates any risk of us accidentally deleting stuff belonging
to the sysadmin or another app. If someone else creates a table
called 'libvirt' they get to keep all the broken pieces :-)
I can't decide if this is a case of "Ooh! We'd better
try to protect against
this!", or "Well, you deliberately broke it, so you get to pick up the
pieces!"
The latter.
With regards,
Daniel
--
|:
https://berrange.com -o-
https://www.flickr.com/photos/dberrange :|
|:
https://libvirt.org -o-
https://fstop138.berrange.com :|
|:
https://entangle-photo.org -o-
https://www.instagram.com/dberrange :|