On 11/20/2015 11:30 PM, Alex Williamson wrote:
On Fri, 2015-11-20 at 12:24 -0500, Laine Stump wrote:
> On 11/20/2015 11:58 AM, Andrea Bolognani wrote:
>> On Fri, 2015-11-20 at 11:33 -0500, Laine Stump wrote:
>>> Seems safe, but is this really what we want to do? I haven't
>>> read/understood the remaining patches yet, but this makes it sound like
>>> what is going to happen is that all of the devices will be unbound from
>>> vfio-pci immediately, so they are "in limbo", and will then be reprobed
>>> once all devices are unused (and therefore unbound from vfio-pci).
>>>
>>> I think that may be a bit dangerous. Instead, we should leave the
>>> devices bound to vfio-pci until all of them are unused, and at that
>>> time, we should unbind them all from vfio-pci, then reprobe them all.
>>> (again, I may have misunderstood the direction, if so ignore this).
>> I agree, we should not unbind any device from vfio-pci until
>> all the devices in the IOMMU group have been detached from
>> the guest.
> ... and I've just looked back at my original comment about this in the
> BZ, and see that at that time I only suggested delaying the reprobe, but
> said nothing about delaying the unbind. And I'm not as sure about the
> necessity of waiting as I was 1/2 an hour ago. I suppose the issue is
> that it brings all those unbound devices one step closer to getting
> bound to the host driver. However, that will happen only if those
> devices' PCI addresses are written to "drivers_reprobe" in sysfs (right?
> is there any other way a more "global" reprobe could happen and snatch
> up everything that's currently unbound?)
Any load of a module will snatch up any unclaimed devices that match it,
so if you unbind and leave the devices orphaned, a random module load
could cause much badness. Adding a new_id will also cause a device
scan, so if that happened to match the device: random badness.
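
To illustrate the exposure described above, here is a minimal sketch that lists the devices in an IOMMU group that currently have no driver bound, i.e. the ones a module load or new_id scan could claim. It assumes the usual Linux sysfs layout (a "driver" symlink in each bound device's directory, and an "iommu_group/devices" directory listing group members). The function name and the sysfs_root parameter are hypothetical, introduced only so the sketch can be pointed at a test tree; this is not code from libvirt or from the patches under discussion.

```python
import os


def unbound_devices_in_group(pci_addr, sysfs_root="/sys"):
    """Return PCI addresses in pci_addr's IOMMU group with no driver bound.

    Hypothetical helper: assumes the standard sysfs layout under sysfs_root.
    """
    devices_dir = os.path.join(sysfs_root, "bus", "pci", "devices")
    group_dir = os.path.join(devices_dir, pci_addr, "iommu_group", "devices")
    orphans = []
    for addr in sorted(os.listdir(group_dir)):
        # A bound device exposes a "driver" entry in its sysfs directory;
        # if it is missing, the device is unclaimed.
        if not os.path.lexists(os.path.join(devices_dir, addr, "driver")):
            orphans.append(addr)
    return orphans
```

Any address this returns while other members of the group are still in use by a guest is in the "limbo" state discussed in this thread.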
> So maybe I'd better ask someone who knows more about this than me -
> Alex, is there an issue with unbinding some devices in an iommu group
> from vfio-pci at an earlier time, and leaving them unbound to any driver
> at all while some other devices in the group are still in use by the
> guest? Is there an advantage to keeping them all bound to vfio-pci until
> none of them are used, and then unbinding/reprobing them all at the same
> time? Or should we unbind each from vfio-pci immediately when they are
> detached from the guest, and reprobe them all once they're all unbound?
Unbinding them from vfio-pci leaves them susceptible to random bad
things happening, as outlined above, and potentially limits vfio's ability
to do things like bus resets. For instance imagine a 2-port NIC where
each port is a PCI function, the functions are grouped together and the
devices don't support any sort of internal reset. If both devices are
bound to vfio-pci, then the user owns them both and we can do a bus
reset. If one of those devices gets released from the user, as soon as
it's unbound from vfio-pci it's no longer in our control and the bus
reset option is gone.
The best course of action would be to leave any managed devices bound to
vfio-pci until all of the devices within the group are no longer in use.
Thanks,
Hi Laine, Alex,
I am actually queuing the unbind from vfio until the last device
reattach is requested, as long as any device in the iommu group is
still in use by the guest.
So I believe this is taken care of. Patch 9 is doing this.
Thanks,
Shiva
Alex
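
For readers following the thread, the policy agreed on above (keep every managed device bound to vfio-pci until no device in the IOMMU group is in use, then unbind them together) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, method name, and data layout are hypothetical and are not taken from patch 9 or from libvirt.

```python
class GroupUnbindQueue:
    """Defer vfio-pci unbinds until no device in the IOMMU group is in use.

    Hypothetical illustration of the queuing idea, not libvirt's code.
    """

    def __init__(self, devices_in_use):
        self.in_use = set(devices_in_use)  # devices still attached to a guest
        self.pending = []                  # reattach requests being held back

    def request_reattach(self, addr):
        """Record a reattach request; return the devices safe to unbind now."""
        self.in_use.discard(addr)
        if addr not in self.pending:
            self.pending.append(addr)
        if self.in_use:
            # The group is still busy: leave everything bound to vfio-pci.
            return []
        # Last device released: the whole queue can be unbound and reprobed.
        ready, self.pending = self.pending, []
        return ready
```

With a two-function NIC in one group, the first request_reattach() call returns an empty list and the second returns both addresses, so both functions stay under vfio's control (and a bus reset stays possible) until the guest has released the whole group.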