.conf file setting(s) for packet filtering backend(s)

(this probably will make no sense to anyone who hasn't spent time looking at daemonConfig*, in which case you can go ahead and hit Delete now. At any rate I'm just tossing this out into the void to see if anyone has any ideas/opinions, so in *any* case feel free to hit delete!) Happy New Year! and time for another bit of confused ramblings trying to figure out how to do something that ends up being non-confused and straightforward. I'm currently working on switching the backend of the network driver from using iptables to using nftables. Due to some functionality that is not available with nftables (the rule that fixes up the checksum of DHCP packets which, btw, is only relevant for *very* old guests, e.g. RHEL5), this needs to be opt-in via a config file setting. In the meantime, in order to make this doable in a reasonable amount of time, I am *not* converting the nwfilter driver right away, and when I do it will need its own config file setting for opt-in. I've never before looked at the code for the .conf file settings at all. I had assumed there would be some sort of "pull" API, where code in the drivers could call, e.g. virConfGetString("filter_backend") and it would return the config setting to the caller. But when I look at it, I see that all daemons use the same daemonConfigLoadFile() called from remote_daemon.c:main() (which is shared by all the daemons) and the daemonConfig object that is created to hold the config settings that are read is only visible within main() - the only way that a config setting is used is by main() "pushing" it out to a static variable somewhere else where it is later retrieved by the interested party, e.g. the way that main() calls daemonSetupNetDevOpenvswitch(config), which then sets the static virNetDevOpenvswitchTimeout in util/virnetdevopenvswitch.c. (NB: util/virnetdevopenvswitch.c is linked into every deamon, so even for the daemons that don't use it, calls to virnetdevopenvswitch.c functions still compile properly (and calling them is harmless), so virNetDevOpenvswitchTimeout is set even for daemons that never call openvswitch APIs). If I could count on all builds using split daemons (i.e. separate virtnetworkd and virtnwfilterd) then I could add a similar API in virfirewall.c that remote_daemon.c:main() could use to set "filter_backend" into a static in virfirewall.c (which is used by both drivers) and everything would just happily work: virtnetworkd.conf: filter_backend = nftables virtnwfilterd.conf filter_backend = iptables However, I need to also deal with the possibility that the nwfilter and network drivers are in the same unified libvirtd binary, and in that case both drivers would have the same virfirewall.c:filter_backend setting, thus making it impossible to use the iptables backend for the nwfilter driver and nftables backend for the network driver. For that case I would need separate settings in the config for each driver, e.g. libvirtd.conf: network_filter_backend = nftables nwfilter_backend = iptables and then those would need to be stored off somewhere different for each driver, then they would use it to set the backend for each virFirewall object as it is created. Organizationally, it would make the most sense for these settings (and the API to set them) to be located in the drivers that use them (so, for example, network_filter_backend could live in network/bridge_driver_linux.c and nwfilter_backend could live in nwfilter/nwfilter_driver.c). But that would mean that remote_daemon.c:main() would need to directly call functions in those files, which is a no-no (because, in the case of split daemons, you either have one or the other at build time, but never both). So should I perhaps declare the nftables backend for nwfilter to be a lost cause until everyone moves to split daemons, add a "filter_backend" setting that is directly set in virfirewall.c (by remote_daemon.c:main()), and then provide some sort of override in virFirewallNew so calls from the nwfilter driver can say "ignore the filter_backend setting and use iptables"? Or should we make the virConf APIs beefier, and add facilities to save off the entire daemonConfig object and make its contents available via something like virConfGetString("network_filter_backend")? But if I did that, it would mean two differently-named config entries, and it would certainly be nice if I didn't have to introduce daemon-specific names like this that would need to be carried over from libvirtd.conf into virtnwfilterd.conf and virtnetworkd.conf (where differing names would no longer be required). I suppose I could go "full MS" and introduce the concept of sections to the conf file, so libvirtd.conf could have something like this: [network] filter_backend = nftables [nwfilter] filter_backend = iptables but that seems like a lot of work for something that will be obsolete in a year or so anyway (when the unified daemon is a thing of the past). Am I thinking about details too much and missing an obvious simple solution? (I think until/unless I or someone else comes up with a better idea, I'm going to go with a single filter_backend setting set in virfirewall.c that is somehow overridden when the virFirewall calls come from the nwfilter driver. Then if everyone moves to split daemons before I get around to making the nwfilter driver support nftables, the problem will end up just solving itself; otherwise I can think of some other piece of ugliness to make it work).

On Sun, Jan 02, 2022 at 09:41:37PM -0500, Laine Stump wrote:
I'm currently working on switching the backend of the network driver from using iptables to using nftables. Due to some functionality that is not available with nftables (the rule that fixes up the checksum of DHCP packets which, btw, is only relevant for *very* old guests, e.g. RHEL5), this needs to be opt-in via a config file setting. In the meantime, in order to make this doable in a reasonable amount of time, I am *not* converting the nwfilter driver right away, and when I do it will need its own config file setting for opt-in.
I've never before looked at the code for the .conf file settings at all. I had assumed there would be some sort of "pull" API, where code in the drivers could call, e.g. virConfGetString("filter_backend") and it would return the config setting to the caller. But when I look at it, I see that all daemons use the same daemonConfigLoadFile() called from remote_daemon.c:main() (which is shared by all the daemons) and the daemonConfig object that is created to hold the config settings that are read is only visible within main() - the only way that a config setting is used is by main() "pushing" it out to a static variable somewhere else where it is later retrieved by the interested party, e.g. the way that main() calls daemonSetupNetDevOpenvswitch(config), which then sets the static virNetDevOpenvswitchTimeout in util/virnetdevopenvswitch.c.
I'd consider the OVS approach to be a bad example Global state needing configurable parameters for stuff in util/ should generally be considered a design flaw. Global state should be exclusively in the drivers, and then the desired values passed into the util APIs explicitly. ie ovs_timeout should have been in qemu.conf (any other drivers' config files if appropriate).
(NB: util/virnetdevopenvswitch.c is linked into every deamon, so even for the daemons that don't use it, calls to virnetdevopenvswitch.c functions still compile properly (and calling them is harmless), so virNetDevOpenvswitchTimeout is set even for daemons that never call openvswitch APIs).
This is another bit of technical debt. We've been lazy with putting stuff into util that really ought not to be there. This stuff even gets into the libvirt.so that's used client side, so the argument that we had a single monolithic libvirtd didn't apply either.
If I could count on all builds using split daemons (i.e. separate virtnetworkd and virtnwfilterd) then I could add a similar API in virfirewall.c that remote_daemon.c:main() could use to set "filter_backend" into a static in virfirewall.c (which is used by both drivers) and everything would just happily work:
virtnetworkd.conf: filter_backend = nftables
virtnwfilterd.conf filter_backend = iptables
Putting these settings into virtnetworkd.conf and virtnwfilterd.conf certainly makes conceptual sense. The problem you mention is avoided by not having global state in virtfirewall.c. Just pass the setting into every API call whuere it is relevant.
However, I need to also deal with the possibility that the nwfilter and network drivers are in the same unified libvirtd binary, and in that case both drivers would have the same virfirewall.c:filter_backend setting, thus making it impossible to use the iptables backend for the nwfilter driver and nftables backend for the network driver. For that case I would need separate settings in the config for each driver, e.g.
libvirtd.conf: network_filter_backend = nftables nwfilter_backend = iptables
Definitely don't want this, as its just follwing thue mistake we did with ovs.
So should I perhaps declare the nftables backend for nwfilter to be a lost cause until everyone moves to split daemons, add a "filter_backend" setting that is directly set in virfirewall.c (by remote_daemon.c:main()), and then provide some sort of override in virFirewallNew so calls from the nwfilter driver can say "ignore the filter_backend setting and use iptables"?
I'm wondering how you're integrating nftables into virfirewall in general ? Currently we just have VIR_FIREWALL_LAYER_ETHERNET, VIR_FIREWALL_LAYER_IPV4, VIR_FIREWALL_LAYER_IPV6, which get mapped to ebtables, iptables and ip6tables internally. Previously they could also get mapped to firewalld but we removed that. This worked because both firewalld passthrough and the native commands took the same syntax, so the choice of backends was transparent to the caller. Now with use of nftables, we have completely different syntax for adding rules. IOW, the caller needs to decide which backend to use, in order to decide what syntax to use with virFirewallAddRule. IIUC, with nftables there is no split between ethernet, ipv4 and ipv6 filtering. This makes the VIR_FIREWALL_LAYER_* enum somewhat redundant/inappropriate as a high level conceptual thing. Since the arguments to virFirewallAddRule are inherantly tied to the specific firewall command, we shoudl probably just admit this in the API. IOW, rename typedef enum { VIR_FIREWALL_LAYER_ETHERNET, VIR_FIREWALL_LAYER_IPV4, VIR_FIREWALL_LAYER_IPV6, VIR_FIREWALL_LAYER_LAST, } virFirewallLayer; to typedef enum { VIR_FIREWALL_TOOL_EBTABLES, VIR_FIREWALL_TOOL_IPTABLES, VIR_FIREWALL_TOOL_IP6TABLES, VIR_FIREWALL_TOOL_LAST, } virFirewallTool; at which point we can just add VIR_FIREWALL_TOOL_NFTABLES Now we don't need any global config in firewall.c to select between nftables and traditional xtables commands - it is always explicitly given by the caller Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 1/4/22 7:57 AM, Daniel P. Berrangé wrote:
On Sun, Jan 02, 2022 at 09:41:37PM -0500, Laine Stump wrote:
I'm currently working on switching the backend of the network driver from using iptables to using nftables. Due to some functionality that is not available with nftables (the rule that fixes up the checksum of DHCP packets which, btw, is only relevant for *very* old guests, e.g. RHEL5), this needs to be opt-in via a config file setting. In the meantime, in order to make this doable in a reasonable amount of time, I am *not* converting the nwfilter driver right away, and when I do it will need its own config file setting for opt-in.
I've never before looked at the code for the .conf file settings at all. I had assumed there would be some sort of "pull" API, where code in the drivers could call, e.g. virConfGetString("filter_backend") and it would return the config setting to the caller. But when I look at it, I see that all daemons use the same daemonConfigLoadFile() called from remote_daemon.c:main() (which is shared by all the daemons) and the daemonConfig object that is created to hold the config settings that are read is only visible within main() - the only way that a config setting is used is by main() "pushing" it out to a static variable somewhere else where it is later retrieved by the interested party, e.g. the way that main() calls daemonSetupNetDevOpenvswitch(config), which then sets the static virNetDevOpenvswitchTimeout in util/virnetdevopenvswitch.c.
I'd consider the OVS approach to be a bad example
Global state needing configurable parameters for stuff in util/ should generally be considered a design flaw.
Okay, so then the setting of the host uuid is also a bad example (set into a static in util/viruuid.c), as is all the config for logging (set in statics in util/virlog.c by calling a function in util/virdaemon.c) :-) (Of course, those are things that *are* conceivably needed by all daemons, not just a subset of them, so I guess it's different, but still if you want to be pedantic, they do fit your description)
Global state should be exclusively in the drivers,
It just occurred to me that two different things are being discussed in parallel here without really thinking about it (at least by me) - daemon config, and driver config. In the past there was a single daemon, with its config in libvirtd.conf, and many drivers, each potentially having its own separate .conf file (but in practice there is only qemu.conf, lxc.conf, and libxl.conf - no .conf files for other drivers that I can see). Now with split daemons, each driver has its own daemon, and each daemon has its own .conf file (virtqemud.conf, virtlxc.conf, virtnetworkd.conf, ...) while drivers continue to have (or not have) their own separate conf file (qemu.conf, lxc.conf, libxl.conf). When the daemon split happened, I made the (incorrect) leap that the new virt*d.conf files were a convenient place for driver-specific config. Instead, I guess the virt*d.conf files should only be used for config related to operation of the daemon process, how it's connected to, logging of its error messages, etc., but shouldn't have any config specific to the driver that happens to be running in that daemon; for driver-specific things there should be a *.conf file (qemu.conf, and now it seems I will need network.conf) which is read by the driver itself. (not sure what should be done with ovs_timeout, which is, as you point out, in the wrong place) That seems like a lot of files, but I guess as long as it's got a (well documented) logic to it, it shouldn't be any worse than having fewer files each of a greater length :-) Anyway, rather than looking at what happens when virtnetworkd.conf is read and adding a new knob there, I really should be looking at qemu_conf.c and using that as an example to add parsing of a new network.conf (which doesn't currently exist) to the network driver (and later nwfilter.conf to the nwfilter driver)
and then the desired values passed into the util APIs explicitly.
Well, that *is* being done with ovs_timeout. It's just that the API being called is setting a static in the util/virnetdevopenvswitch.c (just as is done with logging config and host uuid), and then later used implicitly by other functions in the same file, rather than sending it as an arg to each API call that needs it. My guess is that since the setting is used for both qemu and lxc, the author figured putting a single instance of the config in libvirtd.conf would "make life simpler".
ie ovs_timeout should have been in qemu.conf (any other drivers' config files if appropriate).
Somehow I had always considered qemu.conf to be specifically for things related to starting the qemu process *only* (and not necessarily pertaining to the entire qemu driver), although even with that interpretation I guess ovs_timeout could be considered to be in that group as well (since it's used when running ovs-vsctl as part of preparing the network connection for a qemu process that will soon be started). I see now I've just been too narrow minded all this time.
(NB: util/virnetdevopenvswitch.c is linked into every deamon, so even for the daemons that don't use it, calls to virnetdevopenvswitch.c functions still compile properly (and calling them is harmless), so virNetDevOpenvswitchTimeout is set even for daemons that never call openvswitch APIs).
This is another bit of technical debt. We've been lazy with putting stuff into util that really ought not to be there.
I don't think it's a problem putting virnetdevopenvswitch.c into util, since it is used already by two different drivers (qemu and lxc) and could be used by others in the future. The problem is that a function from util/virnetdevopenvswitch.c gets called by virdeamon.c, even when nothing in the driver being run by the daemon ever *uses* the rest of the openvswitch API. util is a good place to put functions that are used by more than one driver (or conf); any file (function? I've truthfully never paid attention to that detail) that isn't referenced from a particular daemon binary just shouldn't be linked into it.
This stuff even gets into the libvirt.so that's used client side, so the argument that we had a single monolithic libvirtd didn't apply either.
Really? I have always just assumed that if nothing in a particular .o was referenced, then that .o wouldn't show up in the binary. And even if that isn't the case, then we could tailor the build to only include those sources that are actually used (although that would be cumbersome to maintain).
If I could count on all builds using split daemons (i.e. separate virtnetworkd and virtnwfilterd) then I could add a similar API in virfirewall.c that remote_daemon.c:main() could use to set "filter_backend" into a static in virfirewall.c (which is used by both drivers) and everything would just happily work:
virtnetworkd.conf: filter_backend = nftables
virtnwfilterd.conf filter_backend = iptables
Putting these settings into virtnetworkd.conf and virtnwfilterd.conf certainly makes conceptual sense.
Or maybe, based on what I say about "virtqemud.conf vs. qemu.conf" (and thus "virtnetworkd.conf vs. network.conf") above, they should be put in networkd.conf and nwfilter.conf. (again, I'm loathe to create "yet another" config file, but that seems the most logical thing to do).
The problem you mention is avoided by not having global state in virtfirewall.c. Just pass the setting into every API call whuere it is relevant.
Yes. That has been my plan - virFirewallNew() will have a single arg "backend" which is set to either VIR_NETFILTER_BACKEND_NFTABLES or VIR_NETFILTER_BACKEND_IPTABLES. That would be set in each virFirewall object when its created and used as the rules are added, and later when they are executed.
However, I need to also deal with the possibility that the nwfilter and network drivers are in the same unified libvirtd binary, and in that case both drivers would have the same virfirewall.c:filter_backend setting, thus making it impossible to use the iptables backend for the nwfilter driver and nftables backend for the network driver. For that case I would need separate settings in the config for each driver, e.g.
libvirtd.conf: network_filter_backend = nftables nwfilter_backend = iptables
Definitely don't want this, as its just follwing thue mistake we did with ovs.
Yes, I really want to avoid it. And after my Aha moment about virtnetworkd.conf vs. network.conf I think I can.
So should I perhaps declare the nftables backend for nwfilter to be a lost cause until everyone moves to split daemons, add a "filter_backend" setting that is directly set in virfirewall.c (by remote_daemon.c:main()), and then provide some sort of override in virFirewallNew so calls from the nwfilter driver can say "ignore the filter_backend setting and use iptables"?
I'm wondering how you're integrating nftables into virfirewall in general ?
Currently we just have
VIR_FIREWALL_LAYER_ETHERNET, VIR_FIREWALL_LAYER_IPV4, VIR_FIREWALL_LAYER_IPV6,
which get mapped to ebtables, iptables and ip6tables internally. Previously they could also get mapped to firewalld but we removed that. This worked because both firewalld passthrough and the native commands took the same syntax, so the choice of backends was transparent to the caller.
Now with use of nftables, we have completely different syntax for adding rules. IOW, the caller needs to decide which backend to use, in order to decide what syntax to use with virFirewallAddRule.
IIUC, with nftables there is no split between ethernet, ipv4 and ipv6 filtering. This makes the VIR_FIREWALL_LAYER_* enum somewhat redundant/inappropriate as a high level conceptual thing.
Since the arguments to virFirewallAddRule are inherantly tied to the specific firewall command, we shoudl probably just admit this in the API. IOW, rename
typedef enum { VIR_FIREWALL_LAYER_ETHERNET, VIR_FIREWALL_LAYER_IPV4, VIR_FIREWALL_LAYER_IPV6,
VIR_FIREWALL_LAYER_LAST, } virFirewallLayer;
to
typedef enum { VIR_FIREWALL_TOOL_EBTABLES, VIR_FIREWALL_TOOL_IPTABLES, VIR_FIREWALL_TOOL_IP6TABLES,
VIR_FIREWALL_TOOL_LAST, } virFirewallTool;
at which point we can just add
VIR_FIREWALL_TOOL_NFTABLES
Now we don't need any global config in firewall.c to select between nftables and traditional xtables commands - it is always explicitly given by the caller
But the "layer" still needs to be conveyed in some way because it conveys important information about the rule - the same rule could be used for IPv6 or IPv4, and the way we give that information now is via "layer" in the args when adding a rule. If we changed the meaning of layer, then we would just need to add a different arg in its place for the old meaning, so there's really no gain. The way I'm dealing with this in the first pass is to replace all the "iptablesBlahRules" calls with "virNetfilterBlahRules", which are backend-agnostic functions that use the backend setting in the virFirewall object they're operating on (set in virFirewallNew) to call either the old iptablesBlahRules() or a new virNftablesBlahRules(). Then when virFirewallApply() is called, the backend is checked to determine which command to add to the list of args (in the case of iptables it will be either iptables/ip6tables/ebtables depending on the layer, and in the case of nftables it will always be nft, but a family will need to be added (or maybe substituted) into the middle of the args depending on whether its ipv4 or ipv6 - this is slightly ugly but minimizes changes required to the nwfilter code (and will be eliminated as part of supporting nftables in nwfilter). Eventually, 1) virFirewallApply() should create a file containing all the rules and call nft once with that file rather than calling it over and over with each individual rule, 2) it should use the dbus API for nftables rather than the nft command, and 3) the virFirewallRule object should become more intelligent - rather than being just a list of backend-specific args, it can be an abstract representation of a rule, so that all the "add rules" functions can be unified (no separate functions for iptables vs nftables), and then the final virFirewallApply() would create the backend-specific set of args for each rule. (1) (and/or possibly (2) should happen right away, while (3) can only be done as a part of converting nwfilter to support the nftables backend.

On Thu, Jan 06, 2022 at 02:24:18PM -0500, Laine Stump wrote:
On 1/4/22 7:57 AM, Daniel P. Berrangé wrote:
On Sun, Jan 02, 2022 at 09:41:37PM -0500, Laine Stump wrote:
I'm currently working on switching the backend of the network driver from using iptables to using nftables. Due to some functionality that is not available with nftables (the rule that fixes up the checksum of DHCP packets which, btw, is only relevant for *very* old guests, e.g. RHEL5), this needs to be opt-in via a config file setting. In the meantime, in order to make this doable in a reasonable amount of time, I am *not* converting the nwfilter driver right away, and when I do it will need its own config file setting for opt-in.
I've never before looked at the code for the .conf file settings at all. I had assumed there would be some sort of "pull" API, where code in the drivers could call, e.g. virConfGetString("filter_backend") and it would return the config setting to the caller. But when I look at it, I see that all daemons use the same daemonConfigLoadFile() called from remote_daemon.c:main() (which is shared by all the daemons) and the daemonConfig object that is created to hold the config settings that are read is only visible within main() - the only way that a config setting is used is by main() "pushing" it out to a static variable somewhere else where it is later retrieved by the interested party, e.g. the way that main() calls daemonSetupNetDevOpenvswitch(config), which then sets the static virNetDevOpenvswitchTimeout in util/virnetdevopenvswitch.c.
I'd consider the OVS approach to be a bad example
Global state needing configurable parameters for stuff in util/ should generally be considered a design flaw.
Okay, so then the setting of the host uuid is also a bad example (set into a static in util/viruuid.c), as is all the config for logging (set in statics in util/virlog.c by calling a function in util/virdaemon.c) :-)
Ok true, there are reasonable exceptions like logging / uuid.
ie ovs_timeout should have been in qemu.conf (any other drivers' config files if appropriate).
Somehow I had always considered qemu.conf to be specifically for things related to starting the qemu process *only* (and not necessarily pertaining to the entire qemu driver), although even with that interpretation I guess ovs_timeout could be considered to be in that group as well (since it's used when running ovs-vsctl as part of preparing the network connection for a qemu process that will soon be started). I see now I've just been too narrow minded all this time.
I think I'd describe 'libvirtd.conf' as being for settings related to operation of the daemon / RPC system, while 'qemu.conf' is for settings related to operation of the QEMU driver or any features it uses. With this view point logging / host uuid is appropriate for libvirtd.conf but OVS timeouts, firewall settings are driver conf things
This stuff even gets into the libvirt.so that's used client side, so the argument that we had a single monolithic libvirtd didn't apply either.
Really? I have always just assumed that if nothing in a particular .o was referenced, then that .o wouldn't show up in the binary. And even if that isn't the case, then we could tailor the build to only include those sources that are actually used (although that would be cumbersome to maintain).
What you describe is true if your linking to a static library. The src/util stuff does indeed get into a static libvirt_util.a library, but this is then built into a dynamic library libvirt.so and all the APIs in src/util are exported (with @LIBVIRT_PRIVATE version tag). As a result no .o files in src/util will ever get dropped.
If I could count on all builds using split daemons (i.e. separate virtnetworkd and virtnwfilterd) then I could add a similar API in virfirewall.c that remote_daemon.c:main() could use to set "filter_backend" into a static in virfirewall.c (which is used by both drivers) and everything would just happily work:
virtnetworkd.conf: filter_backend = nftables
virtnwfilterd.conf filter_backend = iptables
Putting these settings into virtnetworkd.conf and virtnwfilterd.conf certainly makes conceptual sense.
Or maybe, based on what I say about "virtqemud.conf vs. qemu.conf" (and thus "virtnetworkd.conf vs. network.conf") above, they should be put in networkd.conf and nwfilter.conf. (again, I'm loathe to create "yet another" config file, but that seems the most logical thing to do).
Sorry, yes, I mis-typed. virtnetworkd.conf / virtnwfilterd.conf are the direct equivalent of libvirtd.conf so I shouldn't have written that. I meand network.conf / nwfilter.conf which would be driver config files, not daemon config files. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (2)
-
Daniel P. Berrangé
-
Laine Stump