On Mon, Sep 13, 2021 at 10:20:08AM -0400, Laine Stump wrote:
The Linux kernel has recently added support for device-specific VFIO drivers
to be used (instead of the current generic, one-size-fits-all vfio_pci
driver) when assigning a device to a guest with VFIO. The intent of this is
to (for example) support APIs for device-specific setup that needs to be
done before the device is handed over to the guest, and to support migration
of guests using these devices. More details can be found in the comments of
the patch emails here:
https://lore.kernel.org/kvm/20210826103912.128972-1-yishaih@nvidia.com/
There are a couple of issues that need to be resolved before that will work
well with libvirt-managed guests:
1) Although they've outlined a system for determining the best / most
specific driver for a device by extending the existing modules aliases, the
kernel people have left it up to userspace to actually parse the
modules.alias info to determine the best driver for a device. They've even
"thrown us a bone" in the form of an example python script to do just that:
https://github.com/maxgurtovoy/linux_tools/blob/main/vfio/bind_vfio_pci_d...
Currently, if a device is assigned to a guest with <hostdev managed='yes'>
(the default) then libvirt will unbind the device from whatever host driver
it is bound to, and bind it to the vfio_pci driver (since, up to now, that has
been the only driver available). In the future, if we want to be able to
take advantage of the extended capabilities provided by the device-specific
vfio drivers, we will need to figure out which driver to load.
I personally don't like the idea of embedding the logic from the above
example program into libvirt - this really seems to me like something that
should be implemented in one place for use by many consumers (libvirt being
one of the consumers). One suggestion made by Alex Williamson (in a private
discussion, not on a mailing list) was that possibly we could get the
driverctl utility to add this functionality, and then call driverctl to
learn the name of the best-fit vfio driver for a device. I haven't yet
looked at it, but I've been told that driverctl is a bash script :-O. If we
decide that's a good route, perhaps we could also convince someone to
convert driverctl to Rust, similar to what jjongsma did with mdevctl (nudge
nudge, hint hint).
Or maybe I'm making it into a bigger deal than it needs to be, and we should
just implement the logic of the python script up above directly in libvirt.
Does anyone have an opinion?
The python script does a whole lot more than what we would want,
so it isn't usable in any case, e.g. it loads drivers and rebinds
them itself. All we want is to know the name of the vfio_BLAH
module that is applicable, as we already take care of everything
else.
So we have to parse the modules.alias file to extract the vfio lines
that contain the glob match rule.
If we then format the info about the device we have in the matching
structure, potentially all we need do is invoke g_pattern_match_simple()
to compare the two.
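
As a rough illustration of that idea (not libvirt code), here is a minimal
Python sketch. It assumes the device-specific drivers show up in
modules.alias as "alias vfio_pci:<glob> <module>" lines, and that the
device's own sysfs modalias ("pci:v...") can be compared by swapping its
bus prefix for "vfio_pci:". The sample alias lines and module names are
made up for the example. Treating the pattern with the most literal
characters as the "best" match is also an assumption of this sketch.
fnmatchcase() stands in for g_pattern_match_simple(); unlike the GLib
function it also interprets [...] character classes.

```python
from fnmatch import fnmatchcase

# Illustrative modules.alias content; these lines are assumptions for the sketch.
SAMPLE_ALIASES = """\
alias pci:v000015B3d0000101Esv*sd*bc*sc*i* mlx5_core
alias vfio_pci:v000015B3d0000101Esv*sd*bc*sc*i* mlx5_vfio_pci
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
"""


def parse_vfio_aliases(text, prefix="vfio_pci:"):
    """Return (glob_pattern, module_name) pairs for alias lines with the prefix."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "alias" and fields[1].startswith(prefix):
            rules.append((fields[1], fields[2]))
    return rules


def best_vfio_module(device_modalias, rules, prefix="vfio_pci:"):
    """Return the module whose pattern matches most specifically, or None."""
    # Turn the sysfs modalias "pci:v..." into "vfio_pci:v..." before matching.
    _bus, _sep, rest = device_modalias.partition(":")
    candidate = prefix + rest
    matches = [(pat, mod) for pat, mod in rules if fnmatchcase(candidate, pat)]
    if not matches:
        return None
    # Assumption: more literal (non-wildcard) characters means a more specific rule.
    return max(matches, key=lambda pm: len(pm[0].replace("*", "")))[1]


if __name__ == "__main__":
    rules = parse_vfio_aliases(SAMPLE_ALIASES)
    dev = "pci:v000015B3d0000101Esv000015B3sd00000001bc02sc00i00"
    print(best_vfio_module(dev, rules))
```

In real use the text would come from the modules.alias file of the running
kernel and the modalias from the device's sysfs directory. A device that
only matches the catch-all rule falls back to vfio_pci.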
So I'd ignore the demo program referenced above. I don't particularly
see a need to wait for some new tool to be written either, especially
if it is going to be in bash.
2) There may be cases where, in spite of a device-specific driver being
available, the user prefers to use the generic vfio_pci driver. To support
that we will need to have a place in our config to set the driver name.
We already have a <driver> subelement of <hostdev> that was originally added
to allow choosing between VFIO device assignment and legacy KVM device
assignment:
<driver name='vfio|kvm'/>
All support for KVM device assignment was removed a few years ago, so in
practice the driver name is always "vfio". The most natural looking way to
support device-specific drivers would be to use this name attribute to
specify the driver name, e.g. if you wanted to let libvirt select the best
driver, you would use:
<driver name='vfio'/>
(what it's currently set to in everyone's configs). But if you wanted to
force use of the generic driver, you'd use:
<driver name='vfio_pci'/>
or if you wanted to force use of another driver that wasn't the 'best fit'
according to the module aliases, you could use, e.g.:
<driver name='vfio_pci_xyzzy'/>
I'm uncomfortable with the fact that we're effectively "re-using" the name
attribute for a new purpose though - up until now it has been "which device
assignment method?", but this changes it to "which vfio driver should be
loaded?".
Yes, that definitely isn't right, as this existing attribute is
describing the assignment type. That it happens to match the kmod
name was just coincidence.
So maybe we need a new element. I'm not sure what to call it though. How
about this?
<driver name='vfio' modname='vfio_pci'/>
Does anyone have a better idea for the name?
That name seems fine to me.
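
To make the intended split concrete, here is a small Python sketch of how
a consumer might read the proposed attribute. It is only an illustration
under assumptions: "name" keeps meaning the assignment type, an optional
"modname" forces a specific kernel module, and a missing "modname" means
"pick the best fit automatically". The function name is hypothetical and
this is not how libvirt parses its XML.

```python
import xml.etree.ElementTree as ET


def requested_vfio_module(hostdev_xml):
    """Return the explicitly requested module name, or None for 'auto-select'."""
    driver = ET.fromstring(hostdev_xml).find("driver")
    if driver is None:
        return None
    # 'name' stays the assignment type; only 'modname' selects a kernel module.
    return driver.get("modname")


if __name__ == "__main__":
    forced = "<hostdev><driver name='vfio' modname='vfio_pci'/></hostdev>"
    auto = "<hostdev><driver name='vfio'/></hostdev>"
    print(requested_vfio_module(forced))
    print(requested_vfio_module(auto))
```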
Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|