On 3/12/19 7:57 PM, Peter Krempa wrote:
On Tue, Mar 12, 2019 at 16:41:15 +0100, Peter Krempa wrote:
> On Tue, Mar 12, 2019 at 16:13:20 +0100, Michal Privoznik wrote:
>>
https://bugzilla.redhat.com/show_bug.cgi?id=1623389
>>
>> If a device is detached twice from the same domain the following
>> race condition may happen:
>>
>> 1) The first DetachDevice() call will issue "device_del" on qemu
>> monitor, but since the DEVICE_DELETED event did not arrive in
>> time, the API ends claiming "Device detach request sent
>> successfully".
>>
>> 2) The second DetachDevice() therefore still find the device in
>> the domain and thus proceeds to detaching it again. It calls
>> EnterMonitor() and qemuMonitorSend() trying to issue "device_del"
>> command again. This gets both domain lock and monitor lock
>> released.
>>
>> 3) At this point, qemu sends us the DEVICE_DELETED event which is
>> going to be handled by the event loop which ends up calling
>> qemuDomainSignalDeviceRemoval() to determine who is going to
>> remove the device from domain definition. Whether it is the
>> caller that marked the device for removal or whether it is going
>> to be the event processing thread.
>>
>> 4) Because the device was marked for removal,
>> qemuDomainSignalDeviceRemoval() returns true, which means the
>> event is to be processed by the thread that has marked the device
>> for removal (and is currently still trying to issue "device_del"
>> command)
>>
>> 5) The thread finally issues the "device_del" command, which
>> fails (obviously) and therefore it calls
>> qemuDomainResetDeviceRemoval() to reset the device marking and
>> quits immediately after, NOT removing any device from the domain
>> definition.
>>
>> At this point, the device is still present in the domain
>> definition but doesn't exist in qemu anymore. Worse, there is no
>> way to remove it from the domain definition.
>>
>> Solution is to note down that we've seen the event and if the
>> second "device_del" fails, not take it as a failure but carry on
>> with the usual execution.
>>
>> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
>> ---
>> src/qemu/qemu_domain.h | 1 +
>> src/qemu/qemu_hotplug.c | 83 +++++++++++++++++++++++++++++------------
>> 2 files changed, 60 insertions(+), 24 deletions(-)
>>
>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>> index 9f468e5661..fb361515ba 100644
>> --- a/src/qemu/qemu_domain.h
>> +++ b/src/qemu/qemu_domain.h
>> @@ -218,6 +218,7 @@ typedef qemuDomainUnpluggingDevice
*qemuDomainUnpluggingDevicePtr;
>> struct _qemuDomainUnpluggingDevice {
>> const char *alias;
>> qemuDomainUnpluggingDeviceStatus status;
>> + bool eventSeen; /* True if DEVICE_DELETED event arrived. */
>> };
>>
>>
>> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
>> index 574477e916..93c0e14adf 100644
>> --- a/src/qemu/qemu_hotplug.c
>> +++ b/src/qemu/qemu_hotplug.c
>> @@ -70,22 +70,47 @@ unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull *
5;
>> /**
>> * qemuDomainDeleteDevice:
>> * @mon: qemu monitor
>> + * @vm: domain object
>> * @alias: device to remove
>> *
>> * A simple wrapper around qemuMonitorDelDevice().
>> - * @mon must be locked upon entry.
>> + * @mon must be locked upon entry, @vm shan't.
>
> Please use either the more contemporary short form or the long form.
>
>> *
>> * Returns: 0 on success,
>> * -1 otherwise.
>> */
>> static inline int
>> qemuDomainDeleteDevice(qemuMonitorPtr mon,
>> + virDomainObjPtr vm,
>> const char *alias)
>> {
>> - if (qemuMonitorDelDevice(mon, alias) < 0)
>> - return -1;
>> + qemuDomainObjPrivatePtr priv;
>> + int ret = 0;
>>
>> - return 0;
>> + if (qemuMonitorDelDevice(mon, alias) < 0) {
>> + if (vm) {
>> + /* It is safe to lock and unlock both @mon and @vm
>> + * here because:
>> + * a) qemuDomainObjEnterMonitor() ensures @mon is
>> + * ref()'d
>> + * b) The API that is calling us ensures that @vm is
>> + * ref()'d
>
> Don't break the two lines.
>
>> + */
>> + virObjectUnlock(mon);
>> + virObjectLock(vm);
>> + priv = vm->privateData;
>> + if (priv->unplug.eventSeen)
>
> You can't do this without also clearing priv->unplug.alias. If the event
> was not seen at this point you need to make sure that it will
> unconditionally be handled via the separate event handler thread prior
> to giving up the lock on VM. By not clearing it, you still have another
> chance at the event handler thinking the unplug will be handled here
> without actually doing so.
>
>> + virResetLastError();
>
> Additionally it would be better if qemuMonitorJSONDelDevice actually
> returns the error string or object without reporting it so that we can
> decide here not to report it at all rather than resetting it.
After some more thinking I think that:
1) The new helper should encapsulate also the call to enter the monitor
- that way you can avoid the super-hacky way that re-locks the
monitor.
- you can then intergrate setting of the alias into private data
- resetting of it after exit of the monitor
- checking that the event was seen
This should work nicely as AFAIK all code paths using 'device_del'
should only ever call this one API and do everything else.
2) Refactor the monitor APIs for device_del so that they return the
error message reported by the monitor as success and return the copy
of the error message. This will probably also require a different
monitor error checking function
That way you can get the error message and report it if you decide
so.
Alternatively, I can introduce an argument to qemuMonitorDelDevice()
which would supress error reporting for this specific case (we still
want to report errors from QEMU_CHECK_MONITOR(),
qemuMonitorJSONMakeCommand(), ...), return say -2 so that my code knows
device_del failed because the device is already missing.
If any code path requires more than one monitor call, the 1) will also
ensure that the alias which we are waiting for is set only during
controlled amount of time and thus we are sure that the event thread
handles any DEVICE_DELETED events otherwise.
I started writing it this way. But problem is the
qemuDomainDetachExtensionDevice() which is called in some rollback
codes. For instance in qemuDomainAttachDiskGeneric:
qemuDomainObjEnterMonitor(driver, vm);
if (qemuHotplugDiskSourceAttach(priv->mon, diskdata) < 0)
goto exit_monitor;
if (qemuDomainAttachExtensionDevice(priv->mon, &disk->info) < 0)
goto exit_monitor;
if (qemuMonitorAddDevice(priv->mon, devstr) < 0) {
ignore_value(qemuDomainDetachExtensionDevice(priv->mon,
&disk->info));
goto exit_monitor;
}
if (qemuDomainObjExitMonitor(driver, vm) < 0) {
But I guess I can do what I'm doing now - if some argument is NULL then
I can assume caller has done EnterMonitor() and not call it again.
Michal