On Fri, Aug 05, 2022 at 11:24:08AM -0600, Alex Williamson wrote:
> On Thu, 4 Aug 2022 21:11:07 -0300
> Jason Gunthorpe <jgg(a)nvidia.com> wrote:
> 
> > On Thu, Aug 04, 2022 at 01:36:24PM -0600, Alex Williamson wrote:
> > 
> > > > > That is reasonable, but I'd say those three kernels only have two
> > > > > drivers and they both have vfio as a substring in their name - so the
> > > > > simple thing of just substring searching 'vfio' would get us over that
> > > > > gap.
> > > > 
> > > > Looking at the aliases for exactly "vfio_pci" isn't that much more
> > > > complicated, and "feels" a lot more reliable than just doing a substring
> > > > search for "vfio" in the driver's name. (It would be, uh, .... "not
> > > > smart" to name a driver "vfio<anything>" if it wasn't actually a vfio
> > > > variant driver (or the opposite), but I could imagine it happening; :-/)
> > 
> > This is still pretty hacky. I'm worried about what happens to the
> > kernel if this becomes some crazy unintended uAPI that we never really
> > thought about carefully... This was not a use case when we designed
> > the modules.alias stuff at least.
> > 
> > BTW - why not do things the normal way?
> > 
> > 1. readlink /sys/bus/pci/devices/XX/iommu_group
> > 2. Compute basename of #1
> > 3. Check if /dev/vfio/#2 exists (or /sys/class/vfio/#2)
> > 
> > It has a small edge case where a multi-device group might give a false
> > positive for an undrivered device, but for the purposes of libvirt
> > that seems pretty obscure.. (while the above has false negative
> > issues, obviously)
> This is not a small edge case, it's extremely common. We have a *lot*
> of users assigning desktop GPUs and other consumer grade hardware, which
> are usually multi-function devices without isolation exposed via ACS or
> quirks.

The edge case is that the user has created a multi-device group,
manually assigned device 1 in the group to VFIO, left device 2 with no
driver and then told libvirt to manually use device 2. With the above
approach libvirt won't detect this misconfiguration and qemu will
fail.

> The vfio group exists if any devices in the group are bound to a vfio
> driver, but the device is not accessible from the group unless the
> viability test passes. That means QEMU may not be able to get access
> to the device because the device we want isn't actually bound to a vfio
> driver or another device in the group is not in a viable state. Thanks,

This is a different misconfiguration that libvirt also won't detect,
right? In this case ownership claiming in the kernel will fail and
qemu will fail too, like above.
This, and the above, could be handled by having libvirt also open the
group FD and get the device. It would prove both correct binding and
viability.
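A sketch of what that could look like, using the legacy vfio group uAPI
(VFIO_GROUP_GET_STATUS / VFIO_GROUP_SET_CONTAINER / VFIO_SET_IOMMU /
VFIO_GROUP_GET_DEVICE_FD are the real ioctls; the helper name and
structure are mine, not libvirt code):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Returns a device FD on success, -1 on any failure. */
static int probe_vfio_device(int group, const char *bdf)
{
	char path[64];
	struct vfio_group_status status = { .argsz = sizeof(status) };
	int container_fd = -1, group_fd, dev_fd = -1;

	snprintf(path, sizeof(path), "/dev/vfio/%d", group);
	group_fd = open(path, O_RDWR);
	if (group_fd < 0)
		return -1;	/* no device in the group is bound to vfio */

	if (ioctl(group_fd, VFIO_GROUP_GET_STATUS, &status) < 0 ||
	    !(status.flags & VFIO_GROUP_FLAGS_VIABLE))
		goto out;	/* some device in the group is not viable */

	/* GET_DEVICE_FD requires the group to be attached to a
	 * container with an IOMMU backend set. */
	container_fd = open("/dev/vfio/vfio", O_RDWR);
	if (container_fd < 0)
		goto out;
	if (ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &container_fd) < 0 ||
	    ioctl(container_fd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) < 0)
		goto out;

	/* Fails if <bdf> itself is not bound to a vfio driver. */
	dev_fd = ioctl(group_fd, VFIO_GROUP_GET_DEVICE_FD, bdf);
out:
	if (container_fd >= 0)
		close(container_fd);
	close(group_fd);
	return dev_fd;
}
```

Opening the group proves some device in it is bound to vfio, the VIABLE
flag catches the unbound-peer case, and GET_DEVICE_FD proves the specific
BDF is the one bound - covering both misconfigurations discussed above.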
I had understood the point of this logic was to give better error
reporting to users so that common misconfigurations could be diagnosed
earlier. When I say 'small edge case' I mean it seems like an unlikely
misconfiguration that someone would know to set up VFIO but then use
the wrong BDFs to do it - arguably less likely than that someone would
know to set up VFIO but forget to unbind the other drivers in the
group?
But maybe I don't get it at all ...
Jason