(Sigh, resent because I forgot to update the list address in the reply :-/)
Sorry, I meant to respond to this earlier and forgot. Now I'm sending v2
of the patches, and wanted to make sure you didn't think I was just
ignoring your comments :-) (since this patch is still there in V2)
On 10/23/23 9:39 AM, Jason Gunthorpe wrote:
On Mon, Oct 23, 2023 at 12:54:37AM -0400, Laine Stump wrote:
> When we recently gained the ability to manually specify a driver to
> bind to with virsh nodedev-detach, the fragility of this system became
> apparent - if a user gives the driver name as "vfio_pci", then we
> would modprobe the module, but then erroneously believe it hadn't been
> loaded because /sys/bus/pci/drivers/vfio_pci didn't exist. For manual
> specification of the driver name, this could be dealt with by telling
> the user "always use the correct name for the driver, don't assume
> that it is the same as the modulename", but it would still end up
> confusing people, especially since some drivers do use underscore in
> their name (e.g. the mlx5_vfio_pci driver/module).
I think you are creating more confusion by allowing this. The driver
name from module.alias should be used consistently in its exact
format.
But the name in the modules.alias file *is not* the name of the driver.
It is the name of the module, and the name of the driver may or may not
be the same - that's the entire reason this issue came to light.
And in the end I don't do anything more than what modprobe itself does.
(Well, actually that's not true, since modprobe never has to actually
*use* the drivers that are implemented by the modules that it loads, it
doesn't need to understand that the name of the driver may be different
from the name of the module, and so it doesn't have to look at the link
in /sys/modules/$modulename/drivers/pci:$drivername to get the proper
name that must be written to sysfs to get the driver bound to the
device. It just unilaterally decides that all horizontal lines in a
module name are "_", and so it replaces all "-" with "_")
Really it is modprobe's practice of auto-translating all "-" to
"_",
combined with the inconsistency of naming between module names and
driver names that is causing the confusion; what this patch is doing is
helping to clear up that confusion, not make it worse. Essentially,
whether someone uses the module name or the driver name, libvirt will
get the correct driver bound to the device. (and we do have to allow
both module name and driver name; the former in cases where the module
hasn't been loaded yet and so the driver isn't available, and the latter
in cases where the driver is statically linked into the kernel, as
discussed below).
Log a 'did you mean XYZ" or something if it doesn't work out.
Pass the "driver name" directly to modprobe and let it sort it out
In the cases (e.g. vfio_pci/vfio-pci) where the module name and driver
name mismatch only due to -/_, what you say will work as long as the
*driver* name is used (i.e. "vfio-pci"), but if we use the name
discovered from modules.alias ("vfio_pci") then the modprobe will
succeed, but the bind will fail.
And we can't just assume that the driver name will always have "-" in
place of "_" and do an auto-replace in the name prior to binding the
device to the driver, because some drivers (in particular,
mlx5_vfio_pci/mlx5_vfio_pci) *don't* use "-" in the driver name. libvirt
has to do *something* in order to work in both of these cases, and I
think anything less than what I've done would be "half baked" (well,
half *something*, but I'd rather not swear on a public forum :-))
Forget about modules entirely in the libvirt level, don't even
check
it.
Yes, if we're manually specifying the driver, then (at least in the case
of vfio-pci and mlx5_vfio_pci) it *does* work to always use the driver
name (because modprobe will change vfio-pci to vfio_pci, and leave
mlx5_vfio_pci alone). But when we're discovering the variant driver via
modules.alias (which is the official method of finding the best VFIO
variant driver for a device), we don't start out with the driver name,
we start out with (a modalias string, that leads us to) the *module*
name, not a driver name. And while modprobe will of course accept the
name of the module to load, you can't use the module name to bind the
driver to the kernel - you have to do *something* to derive the driver
name from the module name. That's what this code does.
(I haven't seen an example where module name and driver name have more
differences than just "-" vs "_", and it appears to be convention to
"not do that", however it theoretically is completely doable - the
driver name is just a string in a C file. I'd rather not have my code
break at some random time in the future when somebody writes such a driver.)
> c) All of this is conveniently ignoring the possibility of a VFIO
> variant driver that is statically linked in the kernel. The entire
> design of variant driver auto-detection is based on doing a lookup
> in modules.alias, and that only lists *loadable modules* (not
> drivers), so unless I'm missing something, it would be impossible
> to auto-detect a VFIO variant driver that was statically
> linked. This is beyond libvirt's ability to fix; the best that
> could be done would be to manually specify the driver name in the
> libvirt config, which I suppose is better than nothing :-)
Why? This seems wrong. Statically linked drivers appear in
/sys/bus/drivers/XX too
Yes, they appear in /sys/bus/drivers. But there isn't (as far as I know
so far) a way to programmatically associate a statically linked variant
driver with a device using the information available from the running
kernel.
What I've been told until now is that the only way to find the proper
VFIO variant driver for a particular device is to take the modalias
string from that device's sysfs directory, and search for a line in the
modules.alias file that starts with "alias vfio_pci:" and has the "most
specific match" for the device's modalias - *that* is the name of the
*module* that contains the best VFIO variant driver for the device.
If a driver is statically linked into the kernel, then it doesn't have a
module, so it doesn't get a line in modules.alias (does it? I suppose I
should try that to verify, but it doesn't make sense), and so there is
no way to identify that driver as being a match for a device.
Or am I missing something?
In this case, as I've said in the commit logs somewhere, it's still
possible to use a statically linked variant driver for a device, you
just have to manually specify the driver name rather than having code
that discovers it (which is what you have to do for *all* variant
drivers until the final patch of this series anyway :-)