On Sun, Apr 21, 2024 at 10:53:32PM -0400, Laine Stump wrote:
Support using nftables to set up the firewall for each virtual network,
rather than iptables. The initial implementation of the nftables
backend creates (almost) exactly the same ruleset as the iptables
backend, determined by running the following commands on a host that
has an active virtual network:
  iptables-save >iptables.txt
  iptables-restore-translate -f iptables.txt
(and the similar ip6tables-save/ip6tables-restore-translate for an
IPv6 network). Correctness of the new backend was checked by comparing
the output of:
  nft list ruleset
when the backend is set to iptables and when it is set to nftables.
This page was used as a guide:
https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to...
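For illustration, one way such a comparison could be scripted (the
network and file names here are arbitrary, not anything the patch
prescribes):

  # with firewall_backend = "iptables" in /etc/libvirt/network.conf:
  systemctl restart virtnetworkd && virsh net-start default
  nft list ruleset >ruleset-iptables.txt
  virsh net-destroy default
  # then the same with firewall_backend = "nftables":
  systemctl restart virtnetworkd && virsh net-start default
  nft list ruleset >ruleset-nftables.txt
  diff -u ruleset-iptables.txt ruleset-nftables.txt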
The only differences between the rules created by the nftables backend
vs. the iptables backend (aside from a few inconsequential changes in
display order of some chains/options) are:
1) When we add nftables rules, rather than adding them in the
system-created "filter" and "nat" tables, we add them in a private
table (i.e. only we should be using it) created by us called
"libvirt". (The system-created "filter" and "nat" tables can't be
used, because adding any rules to those tables directly with nft will
cause failure of any legacy application attempting to use iptables
when it tries to list the iptables rules (e.g. "iptables -S").)
(NB: in nftables only a single table is required for both nat and
filter rules - the chains for each are differentiated by specifying
different "hook" locations for the toplevel chain of each; see the
sketch after these notes.)
2) nftables doesn't support the "checksum mangle" rule (or any
equivalent functionality) that we have historically added to our
iptables rules, so the nftables rules we add have nothing related to
checksum mangling.
(NB: The result of (2) is that if you a) have a very old guest (RHEL5
era or earlier) and b) that guest is using a virtio-net network
device, and c) the virtio-net device is using vhost packet processing
(the default), then DHCP on the guest will fail. You can work around
this by adding <driver name='qemu'/> to the <interface> XML for the
guest).
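As a rough sketch of the private-table idea in (1) and the
single-table note above (the chain names and priorities here are
illustrative, not necessarily what the patch creates):

  nft add table ip libvirt
  nft 'add chain ip libvirt forward { type filter hook forward priority 0; }'
  nft 'add chain ip libvirt postroute { type nat hook postrouting priority 100; }'

One table, two base chains - the "hook" clause is what makes one act
as a filter chain and the other as a nat chain.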
There are certainly much better nftables rulesets that could be used
instead of those implemented here, and everything is in place to make
future changes to the rules that are used simple and free of surprises
(e.g. the rules that are added have corresponding "removal" commands
added to the network status, so that we will always remove exactly the
rules that were previously added rather than trying to remove the
rules that "this build of libvirt would have added" (which would be
incorrect the first time we run a libvirt with a newly modified
ruleset)). For this initial implementation though, I wanted the
nftables rules to be as identical to the iptables rules as possible,
just to make it easier to verify that everything is working.
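As a rough illustration of that add/remove pairing (the rule content
here is only an example): nft's --echo/--handle output reports the
handle of whatever was just added, and that handle is what gets
recorded for the later delete:

  $ nft -e -a add rule ip libvirt forward ct state established,related accept
  add rule ip libvirt forward ct state established,related accept # handle 42
  # the matching removal command recorded in the network status:
  $ nft delete rule ip libvirt forward handle 42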
The backend can be manually chosen using the firewall_backend setting
in /etc/libvirt/network.conf. libvirtd/virtnetworkd will read this
setting when it starts; if there is no explicit setting, it will look
for iptables commands on the host and, if they are found, will select
the iptables backend (this is the default for the sake of 100%
backward compatibility), but if iptables commands aren't found, then
it will use the nftables backend.
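For example, to force the new backend explicitly (the setting name
comes from this patch; the quoted-string value follows the usual
libvirt conf-file syntax):

  # /etc/libvirt/network.conf
  firewall_backend = "nftables"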
(Since libvirt will automatically remove all its previous filter rules
and re-add rules using the current backend setting for all active
networks when it starts up, and the only noticeable change in behavior
between the iptables and nftables backends is that noted in item (2)
above, we could instead decide to make nftables the default backend
rather than iptables - it all depends on how important it is to work
properly on 15-year-old guest OSes using DHCP with virtio-net
interfaces).
Signed-off-by: Laine Stump <laine@redhat.com>
---
 po/POTFILES                       |   1 +
 src/network/bridge_driver_conf.c  |   3 +
 src/network/bridge_driver_linux.c |  18 +-
 src/network/meson.build           |   1 +
 src/network/network.conf          |  17 +-
 src/network/network_nftables.c    | 940 ++++++++++++++++++++++++++++++
 src/network/network_nftables.h    |  28 +
 src/util/virfirewall.c            | 169 +++++-
 src/util/virfirewall.h            |   2 +
 9 files changed, 1174 insertions(+), 5 deletions(-)
 create mode 100644 src/network/network_nftables.c
 create mode 100644 src/network/network_nftables.h
+    if (needRollback) {
+        virFirewallCmd *rollback = virFirewallAddRollbackCmd(firewall,
+                                                             fwCmd->layer,
+                                                             NULL);
+        const char *handleStart = NULL;
+        size_t handleLen = 0;
+        g_autofree char *handleStr = NULL;
+        g_autofree char *rollbackStr = NULL;
+
+        /* Search for "# handle n" in stdout of the nft add command -
+         * that is the handle of the table/rule/chain that will later
+         * need to be deleted.
+         */
What are the uniqueness guarantees of handle numbers?

Consider the scenario:

 1. Libvirt adds a rule, and the kernel reports "handle 1729".
 2. Some admin or other application deletes *our* rule.
 3. We now stop our network and delete handle 1729.

Step (3) is obviously going to be redundant, but what happens in
practice? Will libvirt be *guaranteed* to get an error from trying
to delete handle 1729? It would be bad if, between steps 2 and 3,
the kernel could assign handle 1729 to a completely different
application adding rules - it would cause libvirt to delete someone
else's rule by mistake.

Obviously handles can be reused across reboots too. The status XML
should be on /run, which should be a tmpfs that's purged. This makes
me wonder if, for safety, we should record the current system boot
timestamp, and validate that it matches before trying to delete based
on handle number?
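Sketching the kind of check I mean - using the kernel's boot_id here
rather than a timestamp, and with a made-up value standing in for
wherever libvirt would have recorded it:

  # boot id hypothetically saved in the network status XML at
  # rule-add time (no such field exists in this patch)
  recorded="5e4fa2a8-0000-0000-0000-000000000000"
  current=$(cat /proc/sys/kernel/random/boot_id)
  if test "$current" = "$recorded"; then
      nft delete rule ip libvirt forward handle 1729
  fi

i.e. skip the delete entirely when the recorded boot doesn't match
the current one.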
With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|