On 5/4/23 6:44 AM, Daniel P. Berrangé wrote:
On Sun, Apr 30, 2023 at 11:19:33PM -0400, Laine Stump wrote:
> In the past virFirewall required all rollback rules for a group (those
> commands necessary to "undo" any rules that had been added in that
> group in case of a later failure) to be manually added by switching
> into "rollback mode" and then re-calling the inverse of the exact
> virFirewallAddRule*() APIs that had been called to add the original
> rules (ie. for each --insert command, for rollback we would need to
> add a rule with all arguments identical except that "--insert" would
> be replaced by "--delete").
>
> Because nftables can't search for rules to remove by comparing all the
> arguments (it instead expects *only* a handle that was issued when the
> rule was originally added), we want for the backends' vir*ApplyRule()
> functions to be able to automatically add a single rollback rule to
> the virFirewall object while applying its existing rules (this
> automatically added rule would then be able to include the handle
> returned by "nft add rule").
I think the mistake here is that we're trying to replicate the
iptables approach for rules 1:1.
Well, my idea was to *initially* replicate it 1:1 so that we could more
easily verify we haven't changed behavior in some way that we might miss
during any testing, but in a way that we could also easily change it later.
This was required in iptables world because there was only a single
global set of tables. libvirt's rules were mixed in with rules from
non-libvirt apps. Libvirt's rules for different virtual networks also
had to be mixed together, as we needed special ordering for the
forward rules.
With nft we can drastically simplify this all with two changes to
our approach
* Each virtual network should have a top level chain
ie instead of
table ip libvirt
we should have
table ip libvirt_net_default
My understanding has always been that each packet must get an ACCEPT
result from *all* of the tables, and if this was the case, then what
you're suggesting wouldn't work.
But after a short conversation with Eric Garver, I see that I've been
conflating "table" with "hook" in some strange way, and so my
understanding wasn't exactly correct.
nft table name lengths appear to have no size limit that
will matter to us
* Don't have the INPUT/FORWARD/OUTPUT/POSTROUTING chains at
all. We should be directly wiring up LIBVIRT_{INP,OUT,FWO,FWI,FWX}
The FWO, FWI, and FWX chains must have distinct prorities ie
Instead of
table ip libvirt_net_default {
chain INPUT {
type filter hook input priority filter; policy accept;
counter packets 173 bytes 12843 jump LIBVIRT_INP
}
chain FORWARD {
type filter hook forward priority filter; policy accept;
counter packets 0 bytes 0 jump LIBVIRT_FWX
counter packets 0 bytes 0 jump LIBVIRT_FWI
counter packets 0 bytes 0 jump LIBVIRT_FWO
}
chain OUTPUT {
type filter hook output priority filter; policy accept;
counter packets 27 bytes 2322 jump LIBVIRT_OUT
}
chain LIBVIRT_INP {
}
chain LIBVIRT_OUT {
}
chain LIBVIRT_FWO {
}
chain LIBVIRT_FWI {
}
chain LIBVIRT_FWX {
}
chain POSTROUTING {
type nat hook postrouting priority srcnat; policy accept;
counter packets 9 bytes 1026 jump LIBVIRT_PRT
}
chain LIBVIRT_PRT {
}
}
We should have
table ip libvirt_net_default {
chain LIBVIRT_INP {
type filter hook input priority filter; policy accept;
}
chain LIBVIRT_OUT {
type filter hook output priority filter; policy accept;
}
chain LIBVIRT_FWX {
type filter hook forward priority -10; policy accept;
}
chain LIBVIRT_FWI {
type filter hook forward priority -5; policy accept;
}
chain LIBVIRT_FWO {
type filter hook forward priority 0; policy accept;
}
chain LIBVIRT_PRT {
type nat hook postrouting priority srcnat; policy accept;
}
}
by giving different priorties to LIBVIRT_FWO/FWI/FWX, we ensure
that packets for different virtual networks get evaluated in the
desired order, without needing to be placed into the same top
level table.
I still haven't convinced myself that what you're proposing will work as
you say, but I'm going to try it to find out. If it does, then that does
simplify things quite a bit! :-)
With this change in approach, all this logic around dealing with
nftables handles during rollback goes away. The rollback
transaction is hardcoded to precisely:
nft delete table ip libvirt_net_default
nft delete table ip6 libvirt_net_default
This also mean we don't need to worry about tracking any rules
in the status XML for NFT. I wouldn't bother tracking rules
for iptables either, because we've done without it for years and
will hopefully delete iptables support entirely not too far away.
Yeah, that's a good point, although it would mean that we would want to
keep around the "delete this implied ruleset" code for longer, rather
than just stripping it out and relying completely on the exact recording
in the status XML within a few years (when we decide that enough new
versions have been released that it would be highly likely / impossible
for someone to upgrade libvirt from pre-9.3.0 without also rebooting the
host).
(Even if we can special case old iptables-based networks, in general we
will need *some kind* of info in the status XML for to tell us "what
kind of rules" were added when a network was started; my opinion is that
it might as well be verbose rather than some kind of high level opaque
token whose meaning could later be unintentionally changed by code changes)