On 2013-03-12 23:11, Laine Stump wrote:
On 03/08/2013 04:25 AM, Jiri Denemark wrote:
> On Fri, Mar 08, 2013 at 09:50:55 +0100, Markus Armbruster wrote:
>> Osier Yang<jyang(a)redhat.com> writes:
>>
>>> I'm wondering whether it could take a long time for the device_del to
>>> complete (AFAIK from previous bugs it can, though it should be fine in
>>> most cases). If it takes too long, it will be a problem for management,
>>> because it looks like a hang. We can have a timeout for the device_del
>>> in libvirt, but the problem is that the device_del can still be in
>>> progress in qemu, which could cause inconsistency, unless qemu has
>>> some command to cancel the device_del.
>> I'm afraid cancelling isn't possible, at least not for PCI.
> I don't think we need anything like that. We just need the device
> deletion API to return immediately without actually removing stuff from
> domain definition
I don't think we can do that - it changes the user-visible semantics. I
think we need to continue to remove the device from the XML immediately,
but internally keep track of the fact that this device (and the qemu id
used to refer to it) can't yet be re-used.

Yeah, I think there is agreement on that now, either in this thread (I
pasted the conclusion from talking with Jirka) or in the comments of the
related bug (BZ #813752).

The qemu driver currently has activeHostPciDevs and inactiveHostPciDevs.
Maybe we also need a "zombieHostPciDevs" for devices that we've sent the
device_del command for, but haven't yet received notice that they're
actually removed.
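
To make the idea concrete, here is a minimal sketch of such a list in Python (not libvirt code). The class and function names are hypothetical, and it assumes each device can be identified by a (domain name, device alias) pair and that removal is confirmed later by an event or by polling.

```python
class PendingRemovals:
    """Devices we have sent device_del for but that are not confirmed gone.

    Hypothetical helper for illustration only; not part of libvirt.
    """

    def __init__(self):
        # Set of (domain name, device alias) pairs awaiting confirmation.
        self._pending = set()

    def mark_requested(self, domain, alias):
        """Record that removal was requested; the alias must not be reused yet."""
        self._pending.add((domain, alias))

    def mark_removed(self, domain, alias):
        """Record that the hypervisor confirmed removal (event or polling)."""
        self._pending.discard((domain, alias))

    def can_reuse(self, domain, alias):
        """Return True if the alias is not waiting on a pending removal."""
        return (domain, alias) not in self._pending


def attach(pending, domain, alias):
    """Refuse early, with a clear message, if the alias is still being removed."""
    if not pending.can_reuse(domain, alias):
        raise RuntimeError(f"device {alias} in {domain} is still being removed")
    return alias
```

The device would already be gone from the user-visible XML at this point; the set only stops the id from being handed out again too early.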

Having an internal list may help us improve the error message and bail
out earlier instead of going all the way to qemu, but we will need the
internal XMLs of the domain anyway; otherwise there is no way to know
which devices are still waiting for the qemu event or need polling.

OTOH, I'm wondering how much benefit we would get from the new internal
list. Is there any benefit other than quitting a bit earlier and giving
a more sensible error message than the one from qemu? If not, is it
worth maintaining a hairy internal list (judging from the experience
with activePciHostDevs)? I would say it's not.

(BTW, shouldn't these lists of devices be global to all of libvirt,
rather than qemu-specific?)

Good point, it should be global to avoid conflicts between VMs of
different drivers.

Osier