On 6/18/19 8:04 PM, Daniel Henrique Barboza wrote:
> Hi,
>
> This is labeled as RFC but it's more like a FYI to let people know and
> comment beforehand. Shiva sent a 28 patch series last year that
> implements
> hotplug/unplug support for PCI multifunction devices [1]. The design
> motivation of his work was based in a RFC sent to this mailing list back
> in 2016 [2].
>
> I'll briefly summarize the goals and motivations here. What we have
> today
> in Libvirt:
>
> - no hotplug/unplug support for multifunction PCI devices
>
> This is explained in details in [2]. When hotplugging a multifunction
> device, QEMU will queue the hotplug operation of all non-zero
> functions and,
> when function 0 is hotplugged, all functions are hotplugged together.
> This
> is true for all archs that supports PCI multifunction devices in
> QEMU. For
> unplug it varies: x86 will unplug all functions if any function is
> unplugged,
> ppc64 needs to unplug each one.
>
> Due to the nature of how Libvirt hotplug works now, hotplug of these
> devices
> is not possible. All hotplugs are considered in an isolated manner.
> Even if
> we hotplug each function in the proper order (i.e. leaving function 0
> last),
> Libvirt can assign different slots in the guest for each. Similar
> problems
> happens with hot-unplug.
>
> This feature aims to address these by creating a new <devices> element,
> exclusive for multifunction devices, that aggregates all functions of
> a device
> in a single operation. To handle this new element, the existing
> attach/detach
> functions in the QEMU driver now handles multiple devices.
> Attaching/detaching
> a single device is routed away from the specialized multifunction
> code to be
> handled to the existing attach/detach code base.
>
> - no support for partial assignment of functions
>
> We can't make the assumption that the guest will always assign all
> devices
> of a multifunction device. Some functions might be a security risk to
> expose
> to a guest, or the device can behave differently depending on the amount
> of functions assigned.
>
> Even if the 'leftover' functions can't be used to anything else in
> the host,
> the decision of full/partial assignment of functions should come from
> the
> user, not us. We can't predict how any other hardware vendor will setup
> its devices.
>
> This patch series also handles this case.
>
> ------
>
> The latest version of this feature, rebase to cdd362e0e7a (the current
> master as I'm writing this), can be found at:
>
>
https://github.com/danielhb/libvirt/tree/multifunction-rc2
>
> This is a work still ongoing that it's not ready for contribution yet
> (first patches that changes the unit test code are breaking existing
> tests).
I haven't looked at your patches that closely, but I think I might
have a fix for you. I mean, I'm working around this area too (see my
latest upstream patches) and I've found that our vfio-pci driver has
'bind' action suspended - it fails. This is because RHEL kernel
behaves like that (but not the vanilla one). I've spoke to a kernel
developer and he experienced the same behaviour, but did not know from
the top of his head what is the root cause. Anyway, it's not that big
of a deal because libvirt prefers 'driver_override' and thuse it'll
never try 'bind' on non-ancient kernels.
And my patch does exactly that - it introduces 'driver_override' to
virpcimock.c:
https://github.com/zippy2/libvirt/commit/2b8ca7d4598ef479b19b61f942a17d35...
Actually, there are two more patches needed:
https://github.com/zippy2/libvirt/commits/nvme
2b8ca7d459 virpcimock: Create driver_override file in device dirs
9631289730 Revert "virpcitest: Test virPCIDeviceDetach failure"
4310fbbc40 virpcimock: Move actions checking one level up
Thanks for that. I was getting test errors after applying a patch that reads
"virpcitest: Change the stub driver to vfio from pci-stub", and that was the
reason. Applying your patches fixed the issue.
Small detail: patch 2b8ca7d459 fails to compile standalone in my machine
because 'ret' in pci_driver_handle_change can be returned uninitialized
after
adding the 'driver_override' conditional that does nothing. Declaring
'int ret = 0;' in pci_driver_handle_change fixed it for me.
Thanks again,
DHB