On Mon, 16 Dec 2019 21:09:20 -0300
Daniel Henrique Barboza <danielhb413(a)gmail.com> wrote:
On 12/16/19 8:43 PM, Alex Williamson wrote:
> On Mon, 16 Dec 2019 20:24:56 -0300
> Daniel Henrique Barboza <danielhb413(a)gmail.com> wrote:
>
>>
>> The code isn't forcing a device to be assigned to the guest. It is forcing
>> all the IOMMU devices to be declared in the domain XML to be detached from
>> the host.
>
> Detached from the host by unbinding from host drivers and binding to
> vfio-pci and "partially" assigned to the guest? That's wrong, we can't
> do that. Not only will vfio-pci not bind to anything but endpoints,
> you'll break the host binding bridges that might be part of the group,
> and there are valid use cases for sequestering a device with pci-stub
> rather than vfio-pci to add another barrier to the user getting access
> to the device.
>
>> What I did was to extend a verification Libvirt already does, to check for
>> PCI devices of the same IOMMU group X being used by other domains, to check
>> the host as well. Guest start fails if there is any device left in IOMMU
>> group X that's not present in the domain.
>
> Yep, can't do that.
Thanks for the info.
To keep the discussion focused, this is the error I'm trying to dodge:
error: internal error: qemu unexpectedly closed the monitor: 2019-10-04T12:39:41.091312Z
qemu-system-ppc64: -device
vfio-pci,host=0001:09:00.3,id=hostdev0,bus=pci.2.0,addr=0x1.0x3:
vfio 0001:09:00.3: group 1 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.
This happens when not all PCI devices from IOMMU group 1 are bound to vfio-pci,
regardless of whether QEMU is going to use all of them in the guest. Binding
all the devices in the IOMMU group to vfio-pci makes QEMU satisfied, in this
particular case.
What is the minimal condition to avoid this error? What Libvirt is doing ATM is
not enough (it will fail to launch with this QEMU error above), and what I'm
proposing is wrong.
Can we say that all PCI endpoints of the same IOMMU group must be bound to vfio-pci?

Yes, but libvirt should not assume that it can manipulate the bindings
of adjacent devices without being explicitly directed to do so. The
error may be a hindrance to you, but it might also prevent, for
example, the only other NIC in the system being detached from the host
driver. Is it worth making the VM run without explicitly listing all
devices to assign, at the cost of disrupting host services or subverting
the additional isolation a user might be attempting to configure by
having unused devices bound to vfio-pci? This seems like a bad idea;
the VM should be configured to explicitly list every device it needs to
have assigned or partially assigned. Thanks,
Alex
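
For illustration only, here is a minimal Python sketch of a check that reports which endpoints in an IOMMU group would keep the group from being viable, without touching any bindings (in line with the point above that adjacent devices should not be rebound implicitly). The function names `blocking_endpoints` and `read_group` are hypothetical and not part of libvirt or QEMU. The sketch assumes the usual sysfs layout under /sys/kernel/iommu_groups, assumes that a PCI class code starting with 0x06 identifies a bridge, and assumes that an endpoint with no driver bound does not block the group; vfio-pci and pci-stub are treated as acceptable drivers, as discussed in the thread.

```python
import os

# Drivers treated as acceptable for endpoints in the group (per the thread).
ACCEPTED_DRIVERS = {"vfio-pci", "pci-stub"}


def blocking_endpoints(devices):
    """Return the sorted PCI addresses of endpoints still bound to a host driver.

    `devices` maps a PCI address to a (driver_name_or_None, is_bridge) tuple.
    Bridges are skipped, and unbound endpoints are assumed not to block.
    """
    return sorted(
        addr
        for addr, (driver, is_bridge) in devices.items()
        if not is_bridge and driver is not None and driver not in ACCEPTED_DRIVERS
    )


def read_group(group, sysfs_root="/sys/kernel/iommu_groups"):
    """Collect (driver, is_bridge) for every device in an IOMMU group via sysfs."""
    base = os.path.join(sysfs_root, str(group), "devices")
    devices = {}
    for addr in os.listdir(base):
        dev = os.path.join(base, addr)
        link = os.path.join(dev, "driver")
        # The "driver" symlink is absent when no driver is bound.
        driver = os.path.basename(os.readlink(link)) if os.path.islink(link) else None
        with open(os.path.join(dev, "class")) as f:
            # Assumption: class codes 0x06xxxx are bridges, everything else is an endpoint.
            is_bridge = f.read().strip().startswith("0x06")
        devices[addr] = (driver, is_bridge)
    return devices
```

On a host, `blocking_endpoints(read_group(1))` would list the endpoints in group 1 that an administrator still needs to declare in the domain XML or rebind by hand. It only reports and never changes a binding.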