On Wed, Dec 2, 2020 at 8:48 PM Laine Stump <laine(a)redhat.com> wrote:
netcf (the backend of libvirt's virInterface*() APIs) hasn't been
modified in over 2 years, and the last time there was a change
significant enough for an upstream release was in 2015 (!). It has never
been possible to reliably translate back and forth between native and
netcf/libvirt XML config for interfaces without losing some information,
and impossible to keep up with new functionality being added to host
network configuration in NetworkManager (especially since the modelling
was slightly different - netcf is based on the idea of each physical
interface having one configuration, while NetworkManager has potentially
several different "Connections" for any given hardware interface, and at
most one of these connections can be active at a time for the interface.
Or something like that.)
The libvirt virInterface*() API (and netcf behind it) originally arose
out of a request from the oVirt project that we (libvirt) provide a way
to provision the networking config on a compute node (I *think* this is
the case - I started working on libvirt when they were in the middle of
these discussions; netcf was originally implemented mostly by David
Lutterkort, and then handed off to me when he moved on to other
pastures). oVirt network provisioning usually meant adding a bridge,
assigning it an IP address, and attaching an ethernet (or two via a
bond) to that bridge. They wanted libvirt to provide this functionality
because (I guess?) they wanted to have a single connection to the node
that could perform all the setup they needed.
Although netcf could do that, in the end it didn't provide exactly what
they needed, so they "rolled their own" host network interface config
and didn't use netcf. In the meantime, the idea of using a
virtualization API to configure host network interfaces never took off
(Who'da thunk?).
netcf was designed so that a single C API + XML frontend could be
compiled with multiple different backends, for different network
configuration paradigms. A few other drivers were (partially) written
(e.g. SuSE and Windows), but in the real world, the only backends that
have been used have been the one that uses ifcfg files in
Fedora/RHEL/CentOS, and the one that uses the /etc/network/interfaces
file on Debian/Ubuntu.
In both cases, the backend understands a subset of what is possible in
those files, so some information from the config doesn't show up in the
XML, and thus won't be preserved if netcf is used to modify (i.e.
redefine) the interface. In addition, since a functionally identical
configuration can be represented in multiple ways in both those formats,
sometimes a config file is deemed unreadable by netcf, or it is read and
interpreted correctly, but the modified config is written back using the
"other" method.
Since oVirt chose not to use it, the only users of its interface
configuration capabilities that I've ever noticed in the wild were the
occasional user wanting to use the "virsh iface-bridge" command to put a
bridge behind their ethernet - certainly none of the higher level
virtualization management applications (that I'm aware of) use any of
the libvirt APIs that call to netcf (that means anything starting with
"virInterface").
The only virInterface APIs that have been used with some
regularity are the functions that list current interfaces and get
their current live status (i.e. based on sending/receiving netlink
messages, not on the contents of the host network config files). Even on
that front, the one commonly used example that I know of is
virt-manager, which had used virInterface*() to get a list of
bridges/ethernets available for guest network connections, but that
functionality was removed from virt-manager-3.0, which was released in
September of this year.
In spite of this sporadic use, there are occasional netcf BZes that get
filed complaining about certain device options missing, or erroneous
config resulting in a confusing error message. These are usually filed
by Red Hat virt QE, simply because it's a part of their test plan (i.e.
these reports generally don't reflect a failure on a production system).
Because the "fix" wouldn't provide any gain in the real world, these
BZes just sit in the queue and serve only to make me (the netcf
maintainer) feel like even more of a procrastinator than I actually am.
(A sidebar: netcf was originally made into a separate library, rather
than just a few files within libvirt itself, because there was at least
shrugging verbal agreement that it would be used in places other than
libvirt (and thus there would be a community benefit in eliminating
duplicate code in the multiple projects); this also never materialized,
so in the end, it is a separate library that is only consumed by
libvirt, but because it's a separate library the "barrier to entry" for
anyone to make any changes to it is very high, and so it (effectively)
never sees any contributions from the outside.)
Because of all the above, I've thought for quite a while that we should
deprecate netcf itself, along with all of libvirt's virInterface APIs
*except* the one that lists interfaces (virConnectListAllInterfaces) and
Switching to networkd/networkmanager from e/n/i caused
virConnectListAllInterfaces and such to fail for Ubuntu in 2018 [1].
This is similar to the "occasional netcf BZes" that Laine mentioned,
except that we decided to cut netcf out altogether, and since then we
actively disable the netcf backend to avoid the various issues and
misunderstandings it can cause.
Beyond "it won't make a difference for Ubuntu", what makes this worth
mentioning in the thread is that the original use case apparently
wasn't important enough to draw any complaints in the 2.5 years since
we disabled it early in Ubuntu 18.10.
IMHO, as Daniel later suggested, maybe keeping the API but
neglecting the netcf backend is a good tradeoff to go for?
[1]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1764314
the one that outputs an XML dump of the current status of interfaces
(virInterfaceGetXMLDesc). Since netcf is only used by libvirt anyway,
the part of functionality that performs those two tasks could be moved
into one or two C files within libvirt, removing the dependency on netcf
and making updates easier and more accessible to 3rd parties. As a
followup to this, we might provide another backend that would use
NetworkManager APIs to retrieve all this information rather than netlink
messages (which is what netcf currently does).
Alternately, Cole suggested in a separate email that since libvirt's
node device driver already reports various status information about
devices on the host, we could just beef up the output of
virNodeDeviceGetXMLDesc() for net_* devices to include more of the info
that's visible in "ip link" and "ip addr".
Or, we could just decide that it's okay for a management application
(the main consumer of libvirt) to need to use other APIs to get that
information (especially since that's what they already do anyway!)
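As a rough illustration of that last option (this is not code from libvirt or netcf), an application can already get a basic interface list without any virInterface*() call. The sketch below uses only the Python standard library and assumes a Linux host where sysfs exposes /sys/class/net/<name>/operstate; the helper name list_interfaces is made up for this example.

```python
import os
import socket


def list_interfaces():
    """Return (name, operstate) pairs for the host's network interfaces.

    list_interfaces is a hypothetical helper, not a libvirt or netcf API.
    """
    result = []
    # socket.if_nameindex() yields (index, name) tuples from the kernel.
    for _index, name in socket.if_nameindex():
        # Assumption: Linux sysfs layout. Fall back to "unknown" if the
        # operstate file is missing or unreadable.
        path = os.path.join("/sys/class/net", name, "operstate")
        try:
            with open(path) as f:
                state = f.read().strip()
        except OSError:
            state = "unknown"
        result.append((name, state))
    return result


if __name__ == "__main__":
    for name, state in list_interfaces():
        print(f"{name}: {state}")
```

This only covers names and link state; addresses and bridge/bond membership would need netlink or a tool such as "ip addr", which is beyond this sketch.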
So, that's my piece to speak. I'm looking for opinions and ideas on a
few different fronts:
1) Does this generally sound like a good direction? Or is there
something I'm ignoring that renders my points moot?
2) If we are going to do it, how should we proceed?
We obviously can't simply *remove* the virInterface API from libvirt
(since that would destroy backward compatibility guarantees), but we could
immediately begin logging some sort of "this API is deprecated" message
when any of the functions are called, and then in a later release change
the APIs to return an error (while simultaneously removing netcf from
the build and dependency lists). At the same time, we would need to
decide if the "interface status" functionality needs to be maintained
within appropriate virInterface*() APIs, reproduced in
virNodeDeviceGetXMLDesc(), or just dropped altogether.
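The two-step approach described here (warn first, return an error in a later release) can be sketched generically. This is only an illustration in Python, not how libvirt implements deprecation; API_REMOVED, InterfaceAPIUnsupported, deprecated_interface_api and interface_define_xml are all hypothetical names invented for the example.

```python
import functools
import warnings

# Hypothetical switch: False in the release that only warns,
# True in the later release where the calls start failing.
API_REMOVED = False


class InterfaceAPIUnsupported(RuntimeError):
    """Raised once the deprecated entry points stop working."""


def deprecated_interface_api(func):
    """Wrap a function so it warns now and fails once API_REMOVED is set."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if API_REMOVED:
            raise InterfaceAPIUnsupported(
                f"{func.__name__} is no longer supported")
        warnings.warn(f"{func.__name__} is deprecated",
                      DeprecationWarning, stacklevel=2)
        return func(*args, **kwargs)
    return wrapper


@deprecated_interface_api
def interface_define_xml(xml):
    # Stand-in for a virInterface*()-style call; it only echoes its input.
    return xml
```

The function signature stays the same in both phases, so callers keep compiling and linking; only the runtime behaviour changes between releases.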
On the netcf side, there are several small patches that have been
sitting in git for a few years without being in any official release; it
would probably be nice to make one final release before closing up shop.
The mailing list could then be closed down, and some final message put
in a README in the git repo (on pagure.io) before putting it into some
archival state.
After those things are done, the various distros could be notified of
the newfound irrelevance of netcf, and given the opportunity to remove
the package from their releases.
Anything else?
--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd