...
> >
> > Sorry for the delay in responding. The problem is that all the V100 GPUs
> > support NVLink, but it may or may not be connected up. This is detected
> > at runtime during GPU initialization, which seems like much too heavy of
> > an operation to perform as part of passthrough initialization. And that's
> > why vfio-pci pieces rely on device tree information to figure it out.
> >
> > Alexey, would it be possible for vfio-pci to export the information in a
> > way more friendly to libvirt?
> The only information needed here is whether a specific GPU has RAM or
> not. This can easily be found in the device tree, which is imho quite
> friendly already. VFIO only gets to know about these new capabilities
> when the VFIO PCI device is opened, and we would rather avoid going that
> far in libvirt (open a VFIO container, attach a group, get a vfio-pci fd
> from it, enumerate the regions - 2 PCI resets along the way, delays, meh).
Agreed, we already discussed this when dealing with other features: we really
don't want libvirt to have to open a VFIO container just to query a few
attributes/settings which it would then process or pass directly to QEMU. We'd
therefore need a different way of exposing this information that libvirt can
consume...
> Btw the first "find" for "ibm,npu" can be skipped - the NVLinks have to
> be passed through too, or the entire RAM thing won't work. The "find" for
> the memory node can also really be dropped - if the NVLink bridge's OF
> node has a "memory-region" property, then VFIO will most likely expose
> the RAM and QEMU will try using it anyway.
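For illustration, the check described above could be sketched roughly like
this (a minimal sketch, not libvirt code; the sysfs "of_node" path layout and
the function name are assumptions - on a powerpc host the flattened device
tree is also visible under /proc/device-tree):

```python
import os

def nvlink_bridge_has_gpu_ram(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return True if the NVLink bridge's device-tree node carries a
    "memory-region" property, i.e. VFIO will most likely expose the GPU RAM.

    pci_addr is the bridge's PCI address, e.g. "0000:00:01.0" (hypothetical);
    sysfs_root is parameterized here only so the sketch is testable.
    """
    prop = os.path.join(sysfs_root, pci_addr, "of_node", "memory-region")
    return os.path.exists(prop)
```

The point being that libvirt would only need a property-existence check on a
per-device device-tree node, rather than opening a VFIO container and
enumerating regions.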
I'm not sure I follow ^this - can you be more specific about how you imagine
libvirt detecting it?
Thanks,
Erik