On Thu, 4 Aug 2022 15:11:07 -0400
Laine Stump <laine(a)redhat.com> wrote:
On 8/4/22 2:36 PM, Jason Gunthorpe wrote:
> On Thu, Aug 04, 2022 at 12:18:26PM -0600, Alex Williamson wrote:
>> On Thu, 4 Aug 2022 13:51:20 -0300
>> Jason Gunthorpe <jgg(a)nvidia.com> wrote:
>>
>>> On Mon, Aug 01, 2022 at 09:49:28AM -0600, Alex Williamson wrote:
>>>
>>>>>>>> Fortunately these new vendor/device-specific drivers can be easily
>>>>>>>> identified as being "vfio-pci + extra stuff" - all that's needed is to
>>>>>>>> look at the output of the "modinfo $driver_name" command to see if
>>>>>>>> "vfio_pci" is in the alias list for the driver.
>>>
>>> We are moving in a direction on the kernel side to expose a sysfs
>>> under the PCI device that definitively says it is VFIO enabled, eg
>>> something like
>>>
>>> /sys/devices/pci0000:00/0000:00:1f.6/vfio/<N>
>>>
>>> Which is how every other subsystem in the kernel works. When this
>>> lands libvirt can simply stat the vfio directory and confirm that the
>>> device handle it is looking at is vfio enabled, for all things that
>>> vfio support.
>>>
>>> My thinking had been to do the above work a bit later, but if libvirt
>>> needs it right now then let's do it right away so we don't have to
>>> worry about this hacky modprobe stuff down the road?
>>
>> That seems like a pretty long gap, there are vfio-pci variant drivers
>> since v5.18 and this hasn't even been proposed for v6.0 (aka v5.20)
>> midway through the merge window. We therefore have at least 3 kernels
>> exposing devices in a way that libvirt can't make use of simply due to
>> a driver matching test.
>
> That is reasonable, but I'd say those three kernels only have two
> drivers and they both have vfio as a substring in their name - so the
> simple thing of just substring searching 'vfio' would get us over that
> gap.
Looking at the aliases for exactly "vfio_pci" isn't that much more
complicated, and "feels" a lot more reliable than just doing a substring
search for "vfio" in the driver's name. (It would be, uh, ... "not
smart" to name a driver "vfio<anything>" if it wasn't actually a vfio
variant driver (or the opposite), but I could imagine it happening. :-/)
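For illustration only, here is a rough sketch of what the alias check could look like, as opposed to a substring match on the driver name. It is not libvirt's actual code. It assumes the aliases of a variant driver show up in "modinfo" output as lines of the form "alias: vfio_pci:<pci id pattern>"; the helper names (parse_aliases, has_vfio_pci_alias, driver_is_vfio_variant) are made up for this example.

```python
import subprocess

# Assumption: a vfio-pci variant driver carries at least one module alias
# starting with "vfio_pci:". Adjust if the real alias format differs.
VFIO_ALIAS_PREFIX = "vfio_pci:"


def parse_aliases(modinfo_output):
    """Return the values of all 'alias:' fields in modinfo-style output."""
    aliases = []
    for line in modinfo_output.splitlines():
        # Split only on the first colon; the alias value itself contains colons.
        field, sep, value = line.partition(":")
        if sep and field.strip() == "alias":
            aliases.append(value.strip())
    return aliases


def has_vfio_pci_alias(aliases):
    """True if any alias is exactly in the 'vfio_pci:' namespace."""
    return any(alias.startswith(VFIO_ALIAS_PREFIX) for alias in aliases)


def driver_is_vfio_variant(driver_name):
    """Run modinfo for driver_name and check its alias list (hypothetical helper)."""
    try:
        result = subprocess.run(
            ["modinfo", driver_name],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # modinfo is not installed; we cannot tell, so report "no".
        return False
    if result.returncode != 0:
        # Unknown module, or modinfo failed for some other reason.
        return False
    return has_vfio_pci_alias(parse_aliases(result.stdout))
```

The point of matching on the alias prefix rather than the name is that a driver called "vfio<anything>" with no such alias is rejected, while a variant driver whose name doesn't mention vfio at all is still accepted.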
>
>> might be leveraged for managed='yes' with variant drivers. Once vfio
>> devices expose a chardev themselves, libvirt might order the tests as:
>
> I wasn't thinking to include the chardev part if we are to expedite
> this. The struct device bit alone is enough and it doesn't have the
> complex bits needed to make the cdev.
>
> If you say you want to do it we'll do it for v6.1..
Since we already need to do something else as a stop-gap for the interim
(in order to avoid making driver developers wait any longer if for no
other reason), my opinion would be to not spend extra time splitting up
patches just to give us this functionality slightly sooner; we'll anyway
have something at least workable in place.
We also need to be careful, when adding things piecemeal, that libvirt can
determine when new functionality, such as vfio device chardevs, is
actually available and not simply a placeholder to fill a gap
elsewhere. Thanks,
Alex