On 8/4/22 2:36 PM, Jason Gunthorpe wrote:
On Thu, Aug 04, 2022 at 12:18:26PM -0600, Alex Williamson wrote:
> On Thu, 4 Aug 2022 13:51:20 -0300
> Jason Gunthorpe <jgg(a)nvidia.com> wrote:
>
>> On Mon, Aug 01, 2022 at 09:49:28AM -0600, Alex Williamson wrote:
>>
>>>>>>> Fortunately these new vendor/device-specific drivers can be easily
>>>>>>> identified as being "vfio-pci + extra stuff" - all that's needed is to
>>>>>>> look at the output of the "modinfo $driver_name" command to see if
>>>>>>> "vfio_pci" is in the alias list for the driver.
>>
>> We are moving in a direction on the kernel side to expose a sysfs
>> under the PCI device that definitively says it is VFIO enabled, eg
>> something like
>>
>> /sys/devices/pci0000:00/0000:00:1f.6/vfio/<N>
>>
>> Which is how every other subsystem in the kernel works. When this
>> lands libvirt can simply stat the vfio directory and confirm that the
>> device handle it is looking at is vfio enabled, for all things that
>> vfio supports.
>>
>> My thinking had been to do the above work a bit later, but if libvirt
>> needs it right now then let's do it right away so we don't have to
>> worry about this hacky modprobe stuff down the road?
>
> That seems like a pretty long gap, there are vfio-pci variant drivers
> since v5.18 and this hasn't even been proposed for v6.0 (aka v5.20)
> midway through the merge window. We therefore have at least 3 kernels
> exposing devices in a way that libvirt can't make use of simply due to
> a driver matching test.

That is reasonable, but I'd say those three kernels only have two
drivers and they both have vfio as a substring in their name - so the
simple thing of just substring searching 'vfio' would get us over that
gap.

Looking at the aliases for exactly "vfio_pci" isn't that much more
complicated, and "feels" a lot more reliable than just doing a substring
search for "vfio" in the driver's name. (It would be, uh, .... "not
smart" to name a driver "vfio<anything>" if it wasn't actually a vfio
variant driver (or the opposite), but I could imagine it happening; :-/)

> might be leveraged for managed='yes' with variant drivers. Once vfio
> devices expose a chardev themselves, libvirt might order the tests as:

I wasn't thinking to include the chardev part if we are to expedite
this. The struct device bit alone is enough and it doesn't have the
complex bits needed to make the cdev.
If you say you want to do it we'll do it for v6.1..

Since we already need to do something else as a stop-gap for the interim
(in order to avoid making driver developers wait any longer if for no
other reason), my opinion would be to not spend extra time splitting up
patches just to give us this functionality slightly sooner; we'll have
something at least workable in place anyway.

Definitely once it is there, libvirt should check for it, since it would
be quicker and just "feels even more reliable".
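
For illustration only, the sysfs check could be as small as the sketch
below. It assumes the layout quoted above ("something like"
<pci device dir>/vfio/<N>), which had not landed at the time of writing,
so the final name may differ. The function name is hypothetical, and this
is Python rather than libvirt's C.

```python
import os


def is_vfio_enabled(pci_sysfs_dir):
    """Return True if the device directory has a vfio/ subdirectory.

    ASSUMPTION: the kernel exposes <pci_sysfs_dir>/vfio/<N> for devices
    bound to vfio-pci or a variant driver, as proposed in this thread.
    """
    return os.path.isdir(os.path.join(pci_sysfs_dir, "vfio"))
```

A caller would pass the device's sysfs directory, e.g. the
/sys/devices/pci0000:00/0000:00:1f.6 path from the example above.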
I'm updating my patches to directly look at modules.alias and will
resend based on that.
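
Roughly, the modules.alias lookup amounts to the following sketch (Python
for brevity; the actual patches are C and may differ). It assumes each
entry has the form "alias <pattern> <module>" and that variant drivers
carry a pattern starting with "vfio_pci:". The function name and the
sample lines are illustrative only.

```python
def vfio_variant_drivers(modules_alias_text):
    """Return the set of module names that have a "vfio_pci:" alias.

    ASSUMPTION: entries look like "alias <pattern> <module>"; comment
    lines and anything with a different shape are ignored.
    """
    drivers = set()
    for line in modules_alias_text.splitlines():
        parts = line.split()
        if len(parts) != 3 or parts[0] != "alias":
            continue
        pattern, module = parts[1], parts[2]
        # Match the alias prefix exactly, not a substring of the module name.
        if pattern.startswith("vfio_pci:"):
            drivers.add(module)
    return drivers


# Illustrative input only, not copied from a real system.
SAMPLE = """\
# Aliases extracted from modules themselves.
alias pci:v00008086d000015B7sv*sd*bc*sc*i* e1000e
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
alias vfio_pci:v000015B3d0000101Esv*sd*bc*sc*i* mlx5_vfio_pci
"""

variant_drivers = vfio_variant_drivers(SAMPLE)
```

Note that vfio_pci itself also matches here, so a caller that wants only
the variant drivers would need to filter it out.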