Hey,
I'm out of my QOM depth, so I'll just beg for help in advance. I
noticed in testing vfio-pci hotunplug that the host seems to be trying
to reclaim the device before QEMU is actually done with it, there's a
very short race where libvirt has seen the DEVICE_DELETED event and
tries to unbind the physical device from vfio-pci, the use count is
clearly non-zero because the host driver tries to send a device
request, but that event channel has already been torn down. Nearly
immediately after, QEMU finally releases the device, but we can't do a
proper reset due to some issues with device references in the kernel.
When I run gdb on QEMU with breakpoints at
qapi_event_send_device_deleted() and vfio_instance_finalize(), the
QAPI even happens first. Clearly this is horribly wrong, right? I
can't unmap my references to the vfio device file until my
instance_finalize is called, so I'm always going to have that open when
libvirt takes the DEVICE_DELETED event as a cue to return the device to
host drivers. The call chains look like this:
#0 qapi_event_send_device_deleted (has_device=true,
device=0x7f5ca3e36fb0 "hostdev0",
path=0x7f5c89e84fe0 "/machine/peripheral/hostdev0",
errp=0x7f5ca241f9e8 <error_abort>) at qapi-event.c:412
#1 0x00007f5ca1701608 in device_unparent (obj=0x7f5ca43ffc00)
at hw/core/qdev.c:1115
#2 0x00007f5ca18b7891 in object_finalize_child_property (obj=0x7f5ca380f500,
name=0x7f5ca3f21da0 "hostdev0", opaque=0x7f5ca43ffc00) at qom/object.c:1362
#3 0x00007f5ca18b56b2 in object_property_del_child (obj=0x7f5ca380f500,
child=0x7f5ca43ffc00, errp=0x0) at qom/object.c:422
#4 0x00007f5ca18b5790 in object_unparent (obj=0x7f5ca43ffc00)
at qom/object.c:441
#5 0x00007f5ca16c1f31 in acpi_pcihp_eject_slot (s=0x7f5ca4c41268, bsel=0,
slots=4) at hw/acpi/pcihp.c:139
#0 vfio_instance_finalize (obj=0x7f5ca43ffc00)
at /net/gimli/home/alwillia/Work/qemu.git/hw/vfio/pci.c:2731
#1 0x00007f5ca18b57c0 in object_deinit (obj=0x7f5ca43ffc00,
type=0x7f5ca376f490) at qom/object.c:448
#2 0x00007f5ca18b5831 in object_finalize (data=0x7f5ca43ffc00)
at qom/object.c:462
#3 0x00007f5ca18b6782 in object_unref (obj=0x7f5ca43ffc00) at qom/object.c:896
#4 0x00007f5ca1550cc0 in memory_region_unref (mr=0x7f5ca43fff00)
at /net/gimli/home/alwillia/Work/qemu.git/memory.c:1476
#5 0x00007f5ca1553886 in do_address_space_destroy (as=0x7f5ca43ffe10)
at /net/gimli/home/alwillia/Work/qemu.git/memory.c:2272
It appears that DEVICE_DELETED only means the VM is done with the
device but libvirt is interpreting it as QEMU is done with the device.
Which is correct? Do we need a new event or do we need to fix the
ordering of this event? An ordering fix would be more compatible with
existing libvirt. Thanks,
Alex