Hi,

I am using Fedora 33 with the following KVM, QEMU and libvirt versions:

QEMU 5.1.0

libvirt 6.6.0

KVM 5.14.18

 

We passed a PCIe NVMe device through to the guest running on FC33,

using either virt-manager or virsh, and then hot-unplugged the device

while it was still attached to the guest.
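
For reference, the passthrough step can be sketched as follows. This is only a minimal illustration, assuming the device is attached through a libvirt <hostdev> element in managed mode; the helper name hostdev_xml is hypothetical and not part of libvirt.

```python
import re
import xml.etree.ElementTree as ET

# Matches a full PCI address: domain:bus:slot.function (hex digits).
_PCI_ADDR = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def hostdev_xml(pci_addr, managed=True):
    """Build a libvirt <hostdev> element for a host PCI device.

    hostdev_xml is a hypothetical helper written for this sketch.
    """
    m = _PCI_ADDR.match(pci_addr)
    if m is None:
        raise ValueError(f"not a valid PCI address: {pci_addr!r}")
    domain, bus, slot, function = m.groups()

    hostdev = ET.Element(
        "hostdev",
        mode="subsystem",
        type="pci",
        managed="yes" if managed else "no",
    )
    source = ET.SubElement(hostdev, "source")
    # libvirt expects each address component as a 0x-prefixed hex string.
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain}",
        bus=f"0x{bus}",
        slot=f"0x{slot}",
        function=f"0x{function}",
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    print(hostdev_xml("0000:c4:00.0"))
```

The printed XML could be saved to a file and passed to "virsh attach-device <domain> <file> --live"; the domain and file names are placeholders.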

 

The device is then no longer shown in the guest's hardware device list in virt-manager.

When we hotplug the device again, we are able to use it on the host,

but when we try to re-attach it to the guest, we get the following error message:

 

Requested operation is not valid, PCI device 0000:c4:00.0 is in use by driver QEMU,

Domain fedora 33.

 

So somehow libvirt still thinks the hot-unplugged device is attached.
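
One way to confirm this from the libvirt side is to check whether the live domain XML (as printed by "virsh dumpxml <domain>") still lists the device. The sketch below is only an illustration: SAMPLE_XML is a hypothetical, trimmed-down domain definition and pci_hostdevs is a helper made up for this example.

```python
import xml.etree.ElementTree as ET


def pci_hostdevs(domain_xml):
    """Return the host PCI addresses of all PCI <hostdev> entries."""
    root = ET.fromstring(domain_xml)
    found = []
    for hostdev in root.findall("./devices/hostdev[@type='pci']"):
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        # libvirt writes these attributes as hex strings such as 0x0000.
        domain = int(addr.get("domain", "0x0"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        function = int(addr.get("function"), 16)
        found.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return found


# Hypothetical sample; real input would come from "virsh dumpxml".
SAMPLE_XML = """
<domain type='kvm'>
  <name>guest</name>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0xc4' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
    </hostdev>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(pci_hostdevs(SAMPLE_XML))
```

If the address of the unplugged device still appears in this list after the unplug, the domain definition has not been updated.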

              

Tracing the flow of the hot-unplug event from guest to host:

 

-> The guest's PCIe hotplug support detected the NVMe drive unplug (from guest kernel logs):

 

pciehp: Slot (0-6): Attention button pressed

 

pciehp: Slot (0-6): Powering off due to button press.

 

-> It also looks like the guest notified the host/KVM (from host kernel logs):

 

pcieport: 0000:c4:0000.0: pciehp: Slot (208): Card not present

 

-> Correspondingly, the vfio-pci module notified QEMU:

 

vfio-pci: 0000:c4:00.0: Relaying device request to user (#0)

 

-> Then the unplugged device was reset:

 

vfio-pci: vfio_bar_restore: reset recovery - restoring BARs

 

pci 0000:c4:00.0: Removing from iommu group 105.

 

-> Next, I tried to verify whether libvirt received the DEVICE_DELETED event from QEMU.

 

Running a SystemTap script to capture events between QEMU and libvirt:

 

stap examples/systemtap/qemu-monitor.stp

 

When the NVMe drive is attached to the VM, the following log output is seen from SystemTap:

 

execute "device-add", driver: "vfio-pci", host: "0000:c4:00.0", id: "hostdev0", bus: "pci.7", addr: "0".

 

When we hot-unplug the NVMe drive, the following log output is seen from SystemTap:

 

event: DEVICE_DELETED, device: "hostdev0", path: "/machine/peripheral/hostdev0".

 

So it looks like QEMU sent the "DEVICE_DELETED" event to libvirt, but libvirt still has not removed the attached

device from its bookkeeping list.
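
The behaviour I expected can be sketched as a toy model. This is not libvirt's actual implementation; HostdevTracker and its methods are hypothetical names. The only thing taken from QEMU is the shape of the DEVICE_DELETED event, which carries the device id and QOM path under "data".

```python
import json


class HostdevTracker:
    """Toy model of per-domain bookkeeping keyed by QEMU device id."""

    def __init__(self):
        self.attached = {}  # device id -> host PCI address

    def on_device_add(self, device_id, host_addr):
        """Record a host device as in use by the guest."""
        self.attached[device_id] = host_addr

    def on_qmp_message(self, line):
        """Handle one QMP JSON message.

        Returns the host address released by a DEVICE_DELETED event,
        or None for any other message or an unknown device id.
        """
        msg = json.loads(line)
        if msg.get("event") != "DEVICE_DELETED":
            return None
        device_id = msg.get("data", {}).get("device")
        return self.attached.pop(device_id, None)

    def in_use(self, host_addr):
        """True while some attached device still maps to host_addr."""
        return host_addr in self.attached.values()


if __name__ == "__main__":
    tracker = HostdevTracker()
    tracker.on_device_add("hostdev0", "0000:c4:00.0")
    event = json.dumps({
        "event": "DEVICE_DELETED",
        "data": {"device": "hostdev0",
                 "path": "/machine/peripheral/hostdev0"},
    })
    tracker.on_qmp_message(event)
    print(tracker.in_use("0000:c4:00.0"))
```

In this model the host address is free for re-attachment as soon as the event is processed, which is the opposite of what we observe.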

 

I understand there is already a thread from 2020 discussing a similar issue:

https://www.spinics.net/linux/fedora/libvirt-users/msg12590.html

 

But I am not sure if there is any fix/support added for this recently.

 

Looking for any feedback on the above and on PCI device passthrough and hotplug support.

 

Thanks,

Ashish