On Fri, 2015-11-20 at 12:26 +0530, Shivaprasad bhat wrote:
On Fri, Nov 20, 2015 at 4:54 AM, Alex Williamson
<alex.williamson(a)redhat.com> wrote:
> On Sat, 2015-11-14 at 14:06 +0530, Shivaprasad G Bhat wrote:
>> Before unbind from stub driver libvirt should be sure the guest is not using
>> the device anymore. The libvirt today waits for pci-stub driver alone in
/proc/iommu.
>> The same wait is needed for vfio devices too.
>>
>> Signed-off-by: Shivaprasad G Bhat <sbhat(a)linux.vnet.ibm.com>
>> ---
>> src/util/virhostdev.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
>> index f24ccd8..9f15f34 100644
>> --- a/src/util/virhostdev.c
>> +++ b/src/util/virhostdev.c
>> @@ -756,6 +756,13 @@ virHostdevReattachPCIDevice(virPCIDevicePtr dev,
virHostdevManagerPtr mgr)
>> usleep(100*1000);
>> retries--;
>> }
>> + } else if (STREQ(virPCIDeviceGetStubDriver(dev), "vfio-pci")) {
>> + int retries = 100;
>> + while (virPCIDeviceWaitForCleanup(dev, "vfio-pci")
>> + && retries) {
>> + usleep(100*1000);
>> + retries--;
>> + }
>> }
>>
>> if (virPCIDeviceReattach(dev, mgr->activePCIHostdevs,
>
> Hi,
>
> Laine pointed out this patch to me, and it seems entirely unnecessary.
> Can you explain what it fixes?
>
> The reason that pci-stub and legacy KVM device assignment needs to do
> this little delay is because pci-stub itself is not in control of the
> device, it will happily release the device at any point in time,
> regardless of whether KVM is still making use of it. With vfio-pci, all
> the device access occurs through the vfio-pci driver, so you can be sure
> that if the vfio-pci driver unbinds from the device it is unused. In
> fact, the unbind will block until it is unused.
>
Hi Alex,
Thanks for taking a look.
I didn't want to leave out any case where the host crash would occur because
of early release of the device. I could see the /dev/iommu reporting
the vfio-pci
using the device when the guest is actually using it. The behavior of vfio-pci
that you mentioned, is it true across architectures? If yes, we can confidently
drop this wait.
As Laine suggests, if we paper over every kernel bug with random delays
instead of fixing the actual problem, we're left with a very fragile
solution that nobody understands.
There are two aspects to returning devices to host drivers, the first is
the integrity of the group, which is libvirt's responsibility. If
libvirt reassigns a device within a group to the host driver while other
devices within the group are still in use, it compromises the isolation
of the group and the kernel will necessarily BUG_ON. This is improper
group management. Adding delays and testing unintentional markers just
seems like fragile ways to avoid addressing the real issue here.
The other aspect is a device that should be free and clear of any sort
of group management issues and we're simply unbinding from the vfio
driver and re-binding to the host driver. As I mentioned, vfio-pci
won't release a device until it's unused, so needing to add a delay and
test another file to avoid a kernel crash either sounds like a bug in
the kernel or a hack to work around not addressing the problem of proper
group management above. Thanks,
Alex