
On 11/20/2015 11:30 PM, Alex Williamson wrote:
> On Fri, 2015-11-20 at 12:24 -0500, Laine Stump wrote:
>> On 11/20/2015 11:58 AM, Andrea Bolognani wrote:
>>> On Fri, 2015-11-20 at 11:33 -0500, Laine Stump wrote:
>>>> Seems safe, but is this really what we want to do? I haven't read/understood the remaining patches yet, but this makes it sound like what is going to happen is that all of the devices will be unbound from vfio-pci immediately, so they are "in limbo", and will then be reprobed once all devices are unused (and therefore unbound from vfio-pci).
>>>>
>>>> I think that may be a bit dangerous. Instead, we should leave the devices bound to vfio-pci until all of them are unused, and at that time, we should unbind them all from vfio-pci, then reprobe them all. (again, I may have misunderstood the direction, if so ignore this).
>>>
>>> I agree, we should not unbind any device from vfio-pci until all the devices in the IOMMU group have been detached from the guest.
>>
>> ... and I've just looked back at my original comment about this in the BZ, and see that at that time I only suggested delaying the reprobe, but said nothing about delaying the unbind. And I'm not as sure about the necessity of waiting as I was 1/2 an hour ago. I suppose the issue is that it brings all those unbound devices one step closer to getting bound to the host driver. However, that will happen only if those devices' PCI addresses are written to "drivers_probe" in sysfs (right? is there any other way a more "global" reprobe could happen and snatch up everything that's currently unbound?)
>
> Any load of a module will snatch up any unclaimed devices that match it, so if you unbind and leave the devices orphaned, a random module load could cause much badness. Adding a new_id will also cause a device scan, so if that happened to match the device: random badness.
>
>> So maybe I'd better ask someone who knows more about this than me - Alex, is there an issue with unbinding some devices in an iommu group from vfio-pci at an earlier time, and leaving them unbound to any driver at all while some other devices in the group are still in use by the guest? Is there an advantage to keeping them all bound to vfio-pci until none of them are used, and then unbinding/reprobing them all at the same time? Or should we unbind each from vfio-pci immediately when they are detached from the guest, and reprobe them all once they're all unbound?
>
> Unbinding them from vfio-pci leaves them susceptible to random bad things happening, as outlined above, and potentially limits vfio's ability to do things like bus resets. For instance, imagine a 2-port NIC where each port is a PCI function, the functions are grouped together and the devices don't support any sort of internal reset. If both devices are bound to vfio-pci, then the user owns them both and we can do a bus reset. If one of those devices gets released from the user, as soon as it's unbound from vfio-pci it's no longer in our control and the bus reset option is gone.
>
> The best course of action would be to leave any managed devices bound to vfio-pci until all of the devices within the group are no longer in use. Thanks,
>
> Alex

Hi Laine, Alex,

I am actually queuing the unbind from vfio until the last device reattach is requested when any device in the iommu group is in use by the guest. So, I believe this is taken care of. Patch 9 is doing this.

Thanks,
Shiva
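
To make the policy discussed in this thread concrete, here is a minimal Python sketch of the bookkeeping it implies: a detached device stays bound to vfio-pci, and is only queued for unbind, until no device in its IOMMU group is still in use by a guest. This is an illustration under stated assumptions, not the code from the patch series. The class name `DeferredUnbind`, its methods, and the address-to-group mapping passed in are all hypothetical; a real implementation would discover group membership from the host and perform the actual unbind and reprobe.

```python
from collections import defaultdict


class DeferredUnbind:
    """Toy model: defer the vfio-pci unbind until the whole IOMMU group is idle."""

    def __init__(self, group_of):
        # group_of: PCI address -> IOMMU group id (hypothetical input; a real
        # implementation would look this up on the host).
        self.group_of = dict(group_of)
        self.in_use = set()               # devices currently attached to a guest
        self.pending = defaultdict(list)  # group id -> detached devices awaiting unbind

    def attach(self, addr):
        """Record that a device has been handed to a guest."""
        self.in_use.add(addr)
        queued = self.pending[self.group_of[addr]]
        if addr in queued:
            # The device was reattached before its group went idle.
            queued.remove(addr)

    def detach(self, addr):
        """Record a detach; return the devices that may now be unbound and reprobed."""
        self.in_use.discard(addr)
        group = self.group_of[addr]
        if addr not in self.pending[group]:
            self.pending[group].append(addr)
        # While any group member is still in use, everything stays bound to vfio-pci.
        if any(self.group_of[dev] == group for dev in self.in_use):
            return []
        # Group is idle: release every queued device together.
        return self.pending.pop(group)


# Example: the two functions of a 2-port NIC sharing one IOMMU group.
nic = DeferredUnbind({"0000:01:00.0": 7, "0000:01:00.1": 7})
nic.attach("0000:01:00.0")
nic.attach("0000:01:00.1")
first = nic.detach("0000:01:00.0")   # other port still in use: nothing to unbind
second = nic.detach("0000:01:00.1")  # group idle: both ports released together
```

In this example `first` is an empty list and `second` contains both addresses, which mirrors the 2-port NIC case above: both functions stay under vfio-pci, so a bus reset remains possible, until the last one is released.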