On 2013-03-08 17:25, Jiri Denemark wrote:
> On Fri, Mar 08, 2013 at 09:50:55 +0100, Markus Armbruster wrote:
>> Osier Yang <jyang(a)redhat.com> writes:
>>
>>> I'm wondering whether it could take a long time for device_del to
>>> complete (AFAIK from previous bugs, it can, though it should be
>>> fine in most cases). If it takes too long, it will be a problem
>>> for management, because it looks like a hang. We can have a timeout
>>> for the device_del in libvirt, but the problem is that the device_del
>>> can still be in progress in qemu, which could cause an inconsistency,
>>> unless qemu has some command to cancel the device_del.
>>
>> I'm afraid cancelling isn't possible, at least not for PCI.
>
> I don't think we need anything like that. We just need the device
> deletion API to return immediately without actually removing stuff from
> the domain definition (unless the device was really removed fast enough,
> e.g., USB devices are removed before device_del returns) and then remove
> the device from the domain definition when we get the event from QEMU or
> when libvirtd reconnects to a domain and sees a particular device is no
> longer present. After all, devices may be removed even if we didn't ask
> for it (when the removal is initiated by a guest OS). And we should also
> provide a similar event for higher level apps.
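
The flow described above could be sketched roughly as follows. This is
only a toy model under stated assumptions: the Domain class and every
method name in it are hypothetical and are not libvirt or QEMU APIs. It
shows a detach request returning at once, with the device leaving the
domain definition only when a removal event arrives or when a reconnect
finds the device gone.

```python
# Hypothetical model for illustration; none of these names exist in libvirt or QEMU.

class Domain:
    def __init__(self, devices):
        self.definition = set(devices)  # devices listed in the domain definition
        self.pending = set()            # detach requested, removal not yet confirmed

    def detach(self, alias, removed_synchronously=False):
        """Request removal and return immediately."""
        if alias not in self.definition:
            raise KeyError(alias)
        if removed_synchronously:
            # e.g. a USB device that is already gone when device_del returns
            self.definition.discard(alias)
        else:
            # keep the device in the definition until removal is confirmed
            self.pending.add(alias)

    def on_device_removed(self, alias):
        """Handle a removal event, whether or not we asked for the removal."""
        self.pending.discard(alias)
        self.definition.discard(alias)

    def on_reconnect(self, present):
        """After reconnecting, drop devices that are no longer reported as present."""
        for alias in self.definition - set(present):
            self.on_device_removed(alias)


dom = Domain({"net0", "usb0", "disk1"})
dom.detach("usb0", removed_synchronously=True)  # removed right away
dom.detach("net0")                              # still defined, marked pending
dom.on_device_removed("net0")                   # event arrives, now removed
```

Guest-initiated removals take the same on_device_removed path, so the
definition is updated even when nobody called detach.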

Removing the device from the domain config only when we get the event
from qemu, or find by polling that the device has disappeared, makes
sense. That's actually the main reason we want the event and polling.

But the problem is that our APIs shouldn't hang for a long time.
If we don't change the APIs and keep returning quickly, as we do
currently, it's confusing for the user: when he tries to attach the
device again while the device_del is still in progress, he will get
an error like "Device ID *** is in use", even though our detach API
returned success before that.

I.e., what if device_del takes a long time to complete in some cases?
Can we live with it?

Osier