On Fri, Jan 05, 2024 at 03:20:04 -0500, Laine Stump wrote:
Historically libvirt hasn't differentiated between the name of a
loadable kernel module, and the name of the device driver that module
implements, but these two names can be (and usually are) at least
subtly different.
For example, the loadable module called "vfio_pci" implements a PCI
driver called "vfio-pci". We have always used the name "vfio-pci"
both
to load the module (with modprobe) and to check (in
/sys/bus/pci/drivers) if the driver is available. (This has happened
to work because modprobe "normalizes" all the names it is given by
replacing "-" with "_", so "vfio-pci" works for both
loading the
module and checking for the driver.)
When we recently gained the ability to manually specify the driver for
"virsh nodedev-detach", the fragility of this system became apparent -
if a user gave the "driver name" as "vfio_pci", then we would
modprobe
the module correctly, but then erroneously believe it hadn't been
loaded because /sys/bus/pci/drivers/vfio_pci didn't exist. For manual
specification of the driver name, we could deal with this by telling
the user "always use the correct name for the driver, don't assume
that it has the same name as the module", but it would still end up
confusing people, especially since some drivers do use underscore in
their name (e.g. the mlx5_vfio_pci driver/module).
This will only get worse when an upcoming patch starts automatically
determining the driver to use for VFIO-assigned devices - it will look
in the kernel's modules.alias file to find "best" VFIO variant
*module* for a device, and 3 out of 4 current examples of
vfio-pci/variant drivers have a mismatch between module name and
driver name, so the current code would end up properly loading the
module, but then erroneously think that the driver wasn't available.
This patch makes the code more forgiving by
1) checking for both $drivername and underscore($drivername) in
/sys/bus/pci/drivers
2) when we determine a module needs to be loaded, look at the link in
/sys/module/$modulename/driver/pci:$drivername to determine the
name of the driver we need to bind to the device(rather than just
assuming the driver has the same name as the module
Signed-off-by: Laine Stump <laine(a)redhat.com>
---
Change from V1: I tried to simplify the explanation in the commit log message
I don't think this addresses Jason's comment from V1, stating that we
should only deal with driver names and let modprobe find the correct
module.
Same thing when you are looking for the best *driver* for the device.
Yes you find a module name, but you can query the module itself for the
driver and just use the driver.
As Jason stated, we should not deal with modules at all.
Without addressing this I will not give a R-b.