On Wed, 18 Apr 2018 12:31:53 -0600
Alex Williamson <alex.williamson(a)redhat.com> wrote:
On Mon, 9 Apr 2018 12:35:10 +0200
Gerd Hoffmann <kraxel(a)redhat.com> wrote:
> This little series adds three drivers, for demo-ing and testing vfio
> display interface code. There is one mdev device for each interface
> type (mdpy.ko for region and mbochs.ko for dmabuf).
Erik Skultety brought up a good question today regarding how libvirt is
meant to handle these different flavors of display interfaces and
knowing whether a given mdev device has display support at all. It
seems that we cannot simply use the default display=auto because
libvirt needs to specifically configure gl support for a dmabuf type
interface versus not having such a requirement for a region interface,
perhaps even removing the emulated graphics in some cases (though I
don't think we have boot graphics through either solution yet).
Additionally, GVT-g seems to need the x-igd-opregion support
enabled(?), which is a non-starter for libvirt as it's an experimental
option!
Currently the only way to determine display support is through the
VFIO_DEVICE_QUERY_GFX_PLANE ioctl, but for libvirt to probe that on
their own they'd need to get to the point where they could open the
vfio device and perform the ioctl. That means opening a vfio
container, adding the group, setting the iommu type, and getting the
device. I was initially a bit appalled at asking libvirt to do that,
but the alternative is to put this information in sysfs. Doing that,
we risk needing to describe every nuance of the mdev device through
sysfs, turning it into a dumping ground for every possible feature an
mdev device might have.
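For reference, the probe sequence described above might look roughly
like the sketch below. It is only an illustration, not code from
libvirt or QEMU. The function name vfio_probe_display(), the enum, and
the parameters are hypothetical, and the use of VFIO_TYPE1_IOMMU is an
assumption about the host. The ioctls and flags themselves come from
<linux/vfio.h>.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Hypothetical result codes, used only for this sketch. */
enum vfio_display_kind {
    VFIO_DISPLAY_ERROR  = -1, /* could not get as far as the ioctl */
    VFIO_DISPLAY_NONE   = 0,  /* device opened, no display support */
    VFIO_DISPLAY_DMABUF = 1,
    VFIO_DISPLAY_REGION = 2,
};

/* Ask the device whether it supports the given plane type. */
static int probe_plane(int device_fd, unsigned int type_flag)
{
    struct vfio_device_gfx_plane_info plane;

    memset(&plane, 0, sizeof(plane));
    plane.argsz = sizeof(plane);
    plane.flags = VFIO_GFX_PLANE_TYPE_PROBE | type_flag;
    return ioctl(device_fd, VFIO_DEVICE_QUERY_GFX_PLANE, &plane) == 0;
}

/*
 * group_path:  e.g. "/dev/vfio/<group number>"
 * device_name: for an mdev device, its UUID
 */
int vfio_probe_display(const char *group_path, const char *device_name)
{
    struct vfio_group_status status;
    int container = -1, group = -1, device = -1;
    int result = VFIO_DISPLAY_ERROR;

    /* 1. Open a vfio container. */
    container = open("/dev/vfio/vfio", O_RDWR);
    if (container < 0)
        goto out;
    if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION)
        goto out;

    /* 2. Open the group and add it to the container. */
    group = open(group_path, O_RDWR);
    if (group < 0)
        goto out;
    memset(&status, 0, sizeof(status));
    status.argsz = sizeof(status);
    if (ioctl(group, VFIO_GROUP_GET_STATUS, &status) < 0 ||
        !(status.flags & VFIO_GROUP_FLAGS_VIABLE))
        goto out;
    if (ioctl(group, VFIO_GROUP_SET_CONTAINER, &container) < 0)
        goto out;

    /* 3. Set the iommu type (type1 is an assumption here). */
    if (ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) < 0)
        goto out;

    /* 4. Get the device; this is the device open step. */
    device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, device_name);
    if (device < 0)
        goto out;

    /* 5. Finally, probe the display interface type. */
    if (probe_plane(device, VFIO_GFX_PLANE_TYPE_DMABUF))
        result = VFIO_DISPLAY_DMABUF;
    else if (probe_plane(device, VFIO_GFX_PLANE_TYPE_REGION))
        result = VFIO_DISPLAY_REGION;
    else
        result = VFIO_DISPLAY_NONE;

out:
    if (device >= 0)
        close(device);
    if (group >= 0)
        close(group);
    if (container >= 0)
        close(container);
    return result;
}
```

A caller would pass the group node and the mdev UUID, and treat
VFIO_DISPLAY_ERROR the same as "no display support". Step 4 is where
an open that depends on KVM or QEMU being present would fail.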
So I was ready to return and suggest that maybe libvirt should probe
the device to know about these ancillary configuration details, but
then I remembered that both mdev vGPU vendors had external dependencies
to even allow probing the device. KVMGT will fail to open the device
if it's not associated with an instance of KVM, and NVIDIA vGPU, I
believe, will fail if the vGPU manager process cannot find the QEMU
instance to extract the VM UUID. (Both of these were bad ideas.)
Here's another proposal that's really growing on me:
* Fix the vendor drivers! Allow devices to be opened and probed
  without these external dependencies.
* Libvirt uses the existing vfio API to open the device and probe the
  necessary ioctls. If it can't probe the device, the feature is
  unavailable, i.e. display=off, no migration.
I'm really having a hard time getting behind inventing a secondary API
just to work around arbitrary requirements from mdev vendor drivers.
vfio was never intended to be locked to QEMU or KVM; these two vendor
drivers are the only examples of such requirements, and we're only
encouraging this behavior if we add a redundant API for device
probing. Any solution on the table currently would require changes to
the mdev vendor drivers, so why not this change? Please defend why
each driver needs these external dependencies and why the device open
callback is the best, or only, place in the stack to enforce that
dependency. Let's see what we're really dealing with here. Thanks,
Alex