The Linux kernel has recently added support for device-specific VFIO
drivers to be used (instead of the current generic, one-size-fits-all
vfio_pci driver) when assigning a device to a guest with VFIO. The
intent of this is to (for example) support APIs for device-specific
setup that needs to be done before the device is handed over to the
guest, and to support migration of guests using these devices. More
details can be found in the comments of the patch emails here:
https://lore.kernel.org/kvm/20210826103912.128972-1-yishaih@nvidia.com/
There are a couple of issues that need to be resolved before that will
work well with libvirt-managed guests:
1) Although they've outlined a system for determining the best / most
specific driver for a device by extending the existing module aliases,
the kernel people have left it up to userspace to actually parse the
modules.alias info to determine the best driver for a device. They've
even "thrown us a bone" in the form of an example Python script to do
just that:
https://github.com/maxgurtovoy/linux_tools/blob/main/vfio/bind_vfio_pci_d...
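To make the matching step concrete, here is a rough sketch of the idea
(this is NOT the script linked above, just my own illustration). It
assumes the device-specific drivers show up in modules.alias with a
"vfio_pci:" prefix in place of the usual "pci:", and it assumes that
"most specific" can be approximated as "fewest wildcards in the matching
pattern". The function names and the sample alias lines are made up for
the example.

```python
import fnmatch

def parse_vfio_aliases(text):
    """Collect (pattern, module) pairs from 'alias vfio_pci:... module' lines."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if (len(fields) == 3 and fields[0] == "alias"
                and fields[1].startswith("vfio_pci:")):
            pairs.append((fields[1], fields[2]))
    return pairs

def best_vfio_driver(modalias, aliases):
    """Return the module whose vfio_pci alias best matches a PCI modalias.

    Assumption: the pattern with the fewest '*' wildcards is the most
    specific one. Returns None if the device is not PCI or nothing matches.
    """
    if not modalias.startswith("pci:"):
        return None
    wanted = "vfio_" + modalias  # "pci:..." -> "vfio_pci:..."
    matches = [(pattern, module) for pattern, module in aliases
               if fnmatch.fnmatchcase(wanted, pattern)]
    if not matches:
        return None
    return min(matches, key=lambda pm: pm[0].count("*"))[1]

# Illustrative data only; real entries come from
# /lib/modules/<kernel version>/modules.alias
SAMPLE_ALIASES = """\
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
alias vfio_pci:v000015B3d0000101Esv*sd*bc*sc*i* mlx5_vfio_pci
alias pci:v000015B3d0000101Esv*sd*bc*sc*i* mlx5_core
"""
```

The modalias string for a device can be read from
/sys/bus/pci/devices/<address>/modalias. With the sample data above, a
device matching the second alias line would get mlx5_vfio_pci, and any
other PCI device would fall back to the generic vfio_pci.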
Currently, if a device is assigned to a guest with <hostdev
managed='yes'> (the default) then libvirt will unbind the device from
whatever host driver it's bound to, and bind it to the vfio_pci driver
(since, up to now, that has been the only driver available). In the
future, if we want to be able to take advantage of the extended
capabilities provided by the device-specific vfio drivers, we will need
to figure out which driver to load.
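For anyone not familiar with what the unbind/bind step involves, it can
be sketched roughly as below using the kernel's sysfs interface
(driver_override plus drivers_probe). This is only an illustration, not
libvirt's actual code; the function name and the "sysfs" parameter are
invented for the example, and the caller is assumed to pass the driver
name exactly as it is registered under /sys/bus/pci/drivers (which may
differ from the module name in its use of '-' vs '_').

```python
import os

def rebind_device(pci_addr, driver, sysfs="/sys"):
    """Detach a PCI device from its current driver and bind it to 'driver'.

    'sysfs' is a parameter only so the function can be exercised against
    a fake directory tree; on a real host it needs root privileges.
    """
    dev = os.path.join(sysfs, "bus/pci/devices", pci_addr)

    # 1. Unbind from whatever driver currently owns the device, if any.
    unbind = os.path.join(dev, "driver", "unbind")
    if os.path.exists(unbind):
        with open(unbind, "w") as f:
            f.write(pci_addr)

    # 2. Tell the kernel that only this driver may claim the device.
    with open(os.path.join(dev, "driver_override"), "w") as f:
        f.write(driver)

    # 3. Ask the PCI bus to re-probe the device so the override takes effect.
    with open(os.path.join(sysfs, "bus/pci/drivers_probe"), "w") as f:
        f.write(pci_addr)
```

The only part that changes with device-specific drivers is the value
written to driver_override; the module providing that driver also has to
be loaded before the probe, which the sketch does not handle.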
I personally don't like the idea of embedding the logic from the above
example program into libvirt - this really seems to me like something
that should be implemented in one place for use by many consumers
(libvirt being one of the consumers). One suggestion made by Alex
Williamson (in a private discussion, not on a mailing list) was that
possibly we could get the driverctl utility to add this functionality,
and then call driverctl to learn the name of the best-fit vfio driver
for a device. I haven't yet looked at it, but I've been told that
driverctl is a bash script :-O. If we decide that's a good route,
perhaps we could also convince someone to convert driverctl to Rust,
similar to what jjongsma did with mdevctl (nudge nudge, hint hint).
Or maybe I'm making it into a bigger deal than it needs to be, and we
should just implement the logic of the python script up above directly
in libvirt. Does anyone have an opinion?
(NB: If a device is assigned to a guest by libvirt with <hostdev
managed='no'> then libvirt will just blindly assume that it is already
bound to the correct driver, so until we get internal support for using
the device-specific drivers, users can still get the same functionality
by binding their devices to the device-specific vfio driver externally
to libvirt).
=====
2) There may be cases where, in spite of a device-specific driver being
available, the user prefers to use the generic vfio_pci driver. To
support that we will need to have a place in our config to set the
driver name.
We already have a <driver> subelement of <hostdev> that was originally
added to allow choosing between VFIO device assignment and legacy KVM
device assignment:
<driver name='vfio|kvm'/>
All support for KVM device assignment was removed a few years ago, so in
practice the driver name is always "vfio". The most natural looking way
to support device-specific drivers would be to use this name attribute
to specify the driver name, e.g. if you wanted to let libvirt select the
best driver, you would use:
<driver name='vfio'/>
(what it's currently set to in everyone's configs). But if you wanted to
force use of the generic driver, you'd use:
<driver name='vfio_pci'/>
or if you wanted to force use of another driver that wasn't the 'best
fit' according to the module aliases, you could use, e.g.:
<driver name='vfio_pci_xyzzy'/>
I'm uncomfortable with the fact that we're effectively "re-using" the
name attribute for a new purpose though - up until now it has been
"which device assignment method?", but this changes it to "which vfio
driver should be loaded?". So maybe we need a new attribute. I'm not sure
what to call it though. How about this?
<driver name='vfio' modname='vfio_pci'/>
Does anyone have a better idea for the name?
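To spell out how I imagine the two variants above being interpreted,
here is a small sketch. Everything in it is hypothetical: "modname" is
only the attribute name floated above, choose_driver() is an invented
helper rather than any libvirt API, and "best_fit" stands for whatever
the alias lookup (however we end up doing it) returns for the device.

```python
import xml.etree.ElementTree as ET

def choose_driver(driver_xml, best_fit):
    """Decide which vfio driver to bind, given a <driver> element.

    Hypothetical rules:
      - an explicit modname='...' always wins;
      - name='vfio' (or no name at all) means "use the best-fit driver";
      - any other name is taken as the driver to force.
    """
    elem = ET.fromstring(driver_xml)
    modname = elem.get("modname")
    if modname:
        return modname
    name = elem.get("name", "vfio")
    if name == "vfio":
        return best_fit
    return name
```

With those rules, existing configs containing <driver name='vfio'/> would
silently start getting the device-specific driver, while either
name='vfio_pci' or modname='vfio_pci' would pin the generic one.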