
On 12/16/19 6:03 PM, Daniel Henrique Barboza wrote:
On 12/16/19 7:28 PM, Cole Robinson wrote:
On 12/16/19 8:36 AM, Daniel Henrique Barboza wrote:
changes from version 3 [1]:
- removed last 2 patches that made function 0 of PCI multifunction devices mandatory
- new patch: news.xml update
- changed 'since' version to 6.0.0 in patch 4
- unassigned hostdevs are now getting qemu aliases
[1] https://www.redhat.com/archives/libvir-list/2019-November/msg01263.html
Daniel Henrique Barboza (5):
  Introducing new address type='unassigned' for PCI hostdevs
  qemu: handle unassigned PCI hostdevs in command line
  virhostdev.c: check all IOMMU devs in virHostdevPreparePCIDevices
  formatdomain.html.in: document <address type='unassigned'/>
  news.xml: add address type='unassigned' entry
Codewise it looks fine now. But I'm looking more closely at patch #3 and realizing that it can explicitly reject a previously accepted VM config. And indeed, now that I give it a test with my GPU passthrough setup, it is rejecting my previously working config.
error: Requested operation is not valid: All devices of the same IOMMU group 1 of the PCI device 0000:01:00.0 must belong to domain win10
I've attached the nodedev XML for the three devices with iommuGroup 1. Only the two nvidia devices are assigned to my VM, but not the PCIe controller device.
Is the libvirt heuristic missing something? Or is this acting as expected?
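For readers following along, the check that produces this error can be approximated outside libvirt. The sketch below is not the code from patch 3 (which lives in C, in virHostdevPreparePCIDevices); it only reads the kernel's /sys/kernel/iommu_groups/<N>/devices directory and reports which group members are absent from a list of hostdev addresses. The function names and the sysfs_root parameter are hypothetical, added so the logic can be tried against a fake directory tree.

```python
from pathlib import Path


def iommu_group_devices(group, sysfs_root="/sys"):
    """Return the sorted PCI addresses that the kernel lists in one IOMMU group."""
    devices_dir = Path(sysfs_root) / "kernel" / "iommu_groups" / str(group) / "devices"
    return sorted(entry.name for entry in devices_dir.iterdir())


def missing_from_domain(group, domain_hostdevs, sysfs_root="/sys"):
    """Return group members that are not among the domain's hostdev addresses.

    domain_hostdevs is an iterable of PCI addresses such as "0000:01:00.0".
    A non-empty result is the situation the error message above complains about.
    """
    listed = set(domain_hostdevs)
    return [addr for addr in iommu_group_devices(group, sysfs_root)
            if addr not in listed]
```

On a real host, missing_from_domain(1, [...]) called with the two assigned GPU functions would be expected to name the remaining group member, assuming the group layout described in the message above.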
You mentioned that you declared 3 devices of IOMMU group 1. Unless the code in patch 3 has a bug, there are more PCI hostdevs in IOMMU group 1 that were left out of the domain XML.
I didn't quite gather that this is a change to reject previously accepted configurations, so I will defer to Laine and Alex as to whether this should be committed.
I mentioned in the commit msg of patch 03 that this would break working configurations that didn't comply with the new 'all devices of the IOMMU group must be included in the domain XML' directive. Perhaps this is worth mentioning in the 'news' page to warn users about it.
No, this shouldn't be a requirement at all. In my mind the purpose of these patches is to make something work (in a safe manner) that failed before, *not* to add new restrictions that break things that already work. (Sorry I wasn't paying more attention to the patches earlier).
About breaking existing configurations, there is the possibility of not going forward with patch 03, which is enforcing this rule of declaring every device of the IOMMU group. Existing domains will keep working as usual, the option to unassign devices will still be present, but the user will have to deal with the potential QEMU errors if not all PCI devices were detached from the host.
In this case, the 'unassigned' type will become more of an ON/OFF switch to add/remove the PCI hostdev from the guest without removing it from the domain XML. It is still useful, but we lose the idea of all the IOMMU devices being described in the domain XML, which is something Laine mentioned would be desirable in one of the RFCs.
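To make the ON/OFF behaviour concrete, here is a minimal sketch that splits the hostdevs of a domain XML fragment into those that would reach the guest and those marked unassigned. The XML fragment is an assumed example built around the <address type='unassigned'/> element named in the patch titles, and the PCI addresses and helper names are hypothetical. This is not libvirt's own parser.

```python
import xml.etree.ElementTree as ET

# Assumed example fragment: two functions of one card, the second left unassigned.
EXAMPLE_XML = """
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </source>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
    </source>
    <address type='unassigned'/>
  </hostdev>
</devices>
"""


def host_address(hostdev):
    """Format the host-side <source><address> as dddd:bb:ss.f."""
    src = hostdev.find("source/address")
    parts = [int(src.get(key), 16) for key in ("domain", "bus", "slot", "function")]
    return "{:04x}:{:02x}:{:02x}.{:x}".format(*parts)


def split_hostdevs(xml_text):
    """Return (assigned, unassigned) lists of host PCI addresses."""
    assigned, unassigned = [], []
    for hostdev in ET.fromstring(xml_text.strip()).iter("hostdev"):
        # find("address") matches only the direct child, i.e. the guest-side address.
        guest_addr = hostdev.find("address")
        if guest_addr is not None and guest_addr.get("type") == "unassigned":
            unassigned.append(host_address(hostdev))
        else:
            assigned.append(host_address(hostdev))
    return assigned, unassigned
```

Flipping a device between the two lists then only requires adding or dropping the one <address type='unassigned'/> line, which is the switch-like use described above.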
I don't actually recall saying that :-). I haven't looked in the list archives, but what I *can* imagine myself saying is that only devices mentioned in the XML should be manipulated in any way by libvirt. So, for example, you shouldn't unbind device X from its host driver if there is nothing in the XML telling you to do that. But if a device isn't mentioned in the XML, and is already bound to some driver that is acceptable to the VFIO subsystem (e.g. vfio-pci, pci-stub or no driver at all (? is that right Alex?)) then that should not create any problem. Doing otherwise would break too many existing configs. (For example, my own assigned-GPU config, which assumes that all the devices are already bound to the proper driver, and uses "managed='no'")
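The rule proposed in this message can be written down as a small predicate. The sketch below encodes only what the message states: a group member that is absent from the XML is harmless if it is bound to vfio-pci or pci-stub, or has no driver (the last case being the one the message itself marks as uncertain). It is not the kernel's actual VFIO viability check, and the function name and input shape are hypothetical.

```python
# Drivers the message names as acceptable for group members not in the XML.
# None stands for "no driver bound", which the message flags as unconfirmed.
ACCEPTABLE_DRIVERS = {"vfio-pci", "pci-stub", None}


def blocking_devices(group_drivers, domain_hostdevs):
    """Return group members that would be a problem under the stated rule.

    group_drivers maps each PCI address in the IOMMU group to the name of
    its bound host driver, or None when no driver is bound.
    Devices listed in domain_hostdevs are skipped, since those are the only
    ones libvirt is expected to manipulate.
    """
    listed = set(domain_hostdevs)
    return sorted(addr for addr, driver in group_drivers.items()
                  if addr not in listed and driver not in ACCEPTABLE_DRIVERS)
```

Under this rule a managed='no' setup in which every unlisted group member is already bound to vfio-pci yields an empty list, so the domain would still start.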