
On 2013-03-08 17:25, Jiri Denemark wrote:
On Fri, Mar 08, 2013 at 09:50:55 +0100, Markus Armbruster wrote:
Osier Yang <jyang@redhat.com> writes:
I'm wondering whether waiting for device_del to complete could take a long time (AFAIK from previous bugs it can, though it should be fine in most cases). If it takes too long, it will be a problem for management applications, because the call looks like it is hanging. We could add a timeout for device_del in libvirt, but then the device_del could still be in progress in qemu, which could cause inconsistency, unless qemu has some command to cancel the device_del.
I'm afraid cancelling isn't possible, at least not for PCI.
I don't think we need anything like that. We just need the device deletion API to return immediately without actually removing stuff from the domain definition (unless the device was really removed fast enough, e.g., USB devices are removed before device_del returns) and then remove the device from the domain definition when we get the event from QEMU or when libvirtd reconnects to a domain and sees that a particular device is no longer present. After all, devices may be removed even if we didn't ask for it (when the removal is initiated by a guest OS). And we should also provide a similar event for higher-level apps.
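To make the proposed flow concrete, here is a minimal sketch of the bookkeeping, assuming a simple in-memory model. The class and method names (DomainDevices, detach, on_device_deleted, reconcile) are hypothetical and are not libvirt or QEMU APIs; the sketch only illustrates "return immediately, drop the device from the definition on the event or on reconnect".

```python
from dataclasses import dataclass, field


@dataclass
class DomainDevices:
    """Hypothetical model of the device list in a domain definition."""
    present: set = field(default_factory=set)  # devices in the definition
    pending: set = field(default_factory=set)  # device_del sent, not confirmed

    def detach(self, alias, removed_synchronously=False):
        """Request removal and return immediately.

        removed_synchronously stands in for the case where the device is
        already gone when device_del returns (e.g. USB devices).
        """
        if alias not in self.present:
            raise KeyError(alias)
        if removed_synchronously:
            self.present.discard(alias)
            return "removed"
        # Keep the device in the definition until removal is confirmed.
        self.pending.add(alias)
        return "pending"

    def on_device_deleted(self, alias):
        """Handle the removal event; also covers guest-initiated removal."""
        self.pending.discard(alias)
        self.present.discard(alias)

    def reconcile(self, devices_seen_in_qemu):
        """After a reconnect, drop devices the hypervisor no longer reports."""
        gone = self.present - set(devices_seen_in_qemu)
        for alias in gone:
            self.on_device_deleted(alias)
        return gone
```

With this model, a second attach of the same device could check the pending set and report that removal is still in progress rather than failing on a duplicate ID; whether that is acceptable behaviour for the API is a separate question.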
Not removing the device from the domain config until we get the event from qemu, or find by polling that the device has disappeared, makes sense. That is actually the main reason we want the event and the polling. But the problem is that our APIs should not hang for a long time. If we don't change the APIs and keep returning quickly, as we do currently, it is confusing for the user: when he tries to attach the device again while the device_del is still in progress, he will get an error like "Device ID *** is in used", even though our detach API returned success before that. I.e., if device_del takes a long time to complete in some cases, can we live with that?

Osier