[libvirt] [PATCH v2 0/4] fix ejecting removable media and some cleanup

Pavel Hrdina (4): virCondWaitUntil: add another return value virDomainObjSignal: drop this function monitor: detect that eject fails because the tray is locked qemu_hotplug: try harder to eject media src/conf/domain_conf.c | 27 ++++++++-------- src/conf/domain_conf.h | 1 - src/libvirt_private.syms | 1 - src/qemu/qemu_hotplug.c | 73 +++++++++++++++++++++----------------------- src/qemu/qemu_monitor_json.c | 14 +++++++++ src/qemu/qemu_monitor_text.c | 10 ++++++ src/qemu/qemu_process.c | 6 ++-- 7 files changed, 77 insertions(+), 55 deletions(-) -- 2.4.4

We should distinguish between success and timeout, to let the user handle those two events differently. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/conf/domain_conf.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 2c3b96b..988818c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2686,15 +2686,25 @@ virDomainObjWait(virDomainObjPtr vm) } +/** + * Waits for domain condition to be triggered for a specific period of time. + * + * Returns: + * -1 in case of error + * 0 on success + * 1 on timeout + */ int virDomainObjWaitUntil(virDomainObjPtr vm, unsigned long long whenms) { - if (virCondWaitUntil(&vm->cond, &vm->parent.lock, whenms) < 0 && - errno != ETIMEDOUT) { - virReportSystemError(errno, "%s", - _("failed to wait for domain condition")); - return -1; + if (virCondWaitUntil(&vm->cond, &vm->parent.lock, whenms) < 0) { + if (errno != ETIMEDOUT) { + virReportSystemError(errno, "%s", + _("failed to wait for domain condition")); + return -1; + } + return 1; } return 0; } -- 2.4.4

There are multiple consumers for the domain condition and we should always wake them all. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/conf/domain_conf.c | 7 ------- src/conf/domain_conf.h | 1 - src/libvirt_private.syms | 1 - src/qemu/qemu_process.c | 4 ++-- 4 files changed, 2 insertions(+), 11 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 988818c..9efe175 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2661,13 +2661,6 @@ virDomainObjEndAPI(virDomainObjPtr *vm) void -virDomainObjSignal(virDomainObjPtr vm) -{ - virCondSignal(&vm->cond); -} - - -void virDomainObjBroadcast(virDomainObjPtr vm) { virCondBroadcast(&vm->cond); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index aeba5a5..59655c1 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2442,7 +2442,6 @@ void virDomainObjEndAPI(virDomainObjPtr *vm); bool virDomainObjTaint(virDomainObjPtr obj, virDomainTaintFlags taint); -void virDomainObjSignal(virDomainObjPtr vm); void virDomainObjBroadcast(virDomainObjPtr vm); int virDomainObjWait(virDomainObjPtr vm); int virDomainObjWaitUntil(virDomainObjPtr vm, diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 1566d11..720afdf 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -412,7 +412,6 @@ virDomainObjParseNode; virDomainObjSetDefTransient; virDomainObjSetMetadata; virDomainObjSetState; -virDomainObjSignal; virDomainObjTaint; virDomainObjUpdateModificationImpact; virDomainObjWait; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index ba84182..a688feb 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1004,7 +1004,7 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED, /* We have a SYNC API waiting for this event, dispatch it back */ diskPriv->blockJobType = type; diskPriv->blockJobStatus = status; - virDomainObjSignal(vm); + virDomainObjBroadcast(vm); } else { /* there is no waiting SYNC API, dispatch the update to a thread */ if (VIR_ALLOC(processEvent) < 0) @@ -1500,7 +1500,7 @@ qemuProcessHandleSpiceMigrated(qemuMonitorPtr mon ATTRIBUTE_UNUSED, } priv->job.spiceMigrated = true; - virDomainObjSignal(vm); + virDomainObjBroadcast(vm); cleanup: virObjectUnlock(vm); -- 2.4.4

On 06/29/2015 11:17 AM, Pavel Hrdina wrote:
There are multiple consumers for the domain condition and we should always wake them all.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/conf/domain_conf.c | 7 ------- src/conf/domain_conf.h | 1 - src/libvirt_private.syms | 1 - src/qemu/qemu_process.c | 4 ++-- 4 files changed, 2 insertions(+), 11 deletions(-)
Hmmm... This is perhaps quite a difference as you're now unblocking all threads for HandleBlockJob and HandleSpiceMigrated. Using virCondBroadcast means all threads waiting on the condition will be unblocked as opposed to unblocking 'at least one of the threads' when using virCondSignal - for some reason that just doesn't feel right. I can certainly understand unblocking everything waiting for HandleTrayChange since it's a single physical block and perhaps HandleSpiceMigrate since that seems to be a single SPICE server transferring things (such as perhaps ports). Whereas, it just feels like there could be multiple causes for BlockJob wanting synchronized access. I guess I'd like to read others input on this. The other patches seem mostly reasonable - it's just this one that's causing me to think longer and question things. John
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 988818c..9efe175 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2661,13 +2661,6 @@ virDomainObjEndAPI(virDomainObjPtr *vm)
void -virDomainObjSignal(virDomainObjPtr vm) -{ - virCondSignal(&vm->cond); -} - - -void virDomainObjBroadcast(virDomainObjPtr vm) { virCondBroadcast(&vm->cond); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index aeba5a5..59655c1 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2442,7 +2442,6 @@ void virDomainObjEndAPI(virDomainObjPtr *vm); bool virDomainObjTaint(virDomainObjPtr obj, virDomainTaintFlags taint);
-void virDomainObjSignal(virDomainObjPtr vm); void virDomainObjBroadcast(virDomainObjPtr vm); int virDomainObjWait(virDomainObjPtr vm); int virDomainObjWaitUntil(virDomainObjPtr vm, diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 1566d11..720afdf 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -412,7 +412,6 @@ virDomainObjParseNode; virDomainObjSetDefTransient; virDomainObjSetMetadata; virDomainObjSetState; -virDomainObjSignal; virDomainObjTaint; virDomainObjUpdateModificationImpact; virDomainObjWait; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index ba84182..a688feb 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1004,7 +1004,7 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED, /* We have a SYNC API waiting for this event, dispatch it back */ diskPriv->blockJobType = type; diskPriv->blockJobStatus = status; - virDomainObjSignal(vm); + virDomainObjBroadcast(vm); } else { /* there is no waiting SYNC API, dispatch the update to a thread */ if (VIR_ALLOC(processEvent) < 0) @@ -1500,7 +1500,7 @@ qemuProcessHandleSpiceMigrated(qemuMonitorPtr mon ATTRIBUTE_UNUSED, }
priv->job.spiceMigrated = true; - virDomainObjSignal(vm); + virDomainObjBroadcast(vm);
cleanup: virObjectUnlock(vm);

On Wed, Jul 01, 2015 at 17:29:09 -0400, John Ferlan wrote:
On 06/29/2015 11:17 AM, Pavel Hrdina wrote:
There are multiple consumers for the domain condition and we should always wake them all.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/conf/domain_conf.c | 7 ------- src/conf/domain_conf.h | 1 - src/libvirt_private.syms | 1 - src/qemu/qemu_process.c | 4 ++-- 4 files changed, 2 insertions(+), 11 deletions(-)
Hmmm... This is perhaps quite a difference as you're now unblocking all threads for HandleBlockJob and HandleSpiceMigrated.
Using virCondBroadcast means all threads waiting on the condition will be unblocked as opposed to unblocking 'at least one of the threads' when using virCondSignal - for some reason that just doesn't feel right.
Well, since a later patch in this series adds a code path that may signal a condition asyncrhonously to any known combination of threads and in the future we might add more than one waiting thread the change is desired.
I can certainly understand unblocking everything waiting for HandleTrayChange since it's a single physical block and perhaps HandleSpiceMigrate since that seems to be a single SPICE server transferring things (such as perhaps ports). Whereas, it just feels like there could be multiple causes for BlockJob wanting synchronized access.
Every single instance of waiting needs to check a extra variable to decide whether it should be unblocked, since pthreads may cause spurious wakeup in some cases so the existing code is robust to that. Peter

Modify the eject monitor functions to parse the return code and detect, whether the error contains "is locked" to report this type of failure to upper layers. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/qemu/qemu_monitor_json.c | 14 ++++++++++++++ src/qemu/qemu_monitor_text.c | 10 ++++++++++ 2 files changed, 24 insertions(+) diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index d3e98d4..9aac342 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -44,6 +44,7 @@ #include "virprobe.h" #include "virstring.h" #include "cpu/cpu_x86.h" +#include "c-strcasestr.h" #ifdef WITH_DTRACE_PROBES # include "libvirt_qemu_probes.h" @@ -2116,6 +2117,14 @@ int qemuMonitorJSONSetCPU(qemuMonitorPtr mon, } +/** + * Run QMP command to eject a media from ejectable device. + * + * Returns: + * -2 on error, when the tray is locked + * -1 on all other errors + * 0 on success + */ int qemuMonitorJSONEjectMedia(qemuMonitorPtr mon, const char *dev_name, bool force) @@ -2134,6 +2143,11 @@ int qemuMonitorJSONEjectMedia(qemuMonitorPtr mon, if (ret == 0) ret = qemuMonitorJSONCheckError(cmd, reply); + VIR_DEBUG("%s", virJSONValueToString(reply, false)); + + if (ret < 0 && c_strcasestr(virJSONValueToString(reply, false), "is locked")) + ret = -2; + virJSONValueFree(cmd); virJSONValueFree(reply); return ret; diff --git a/src/qemu/qemu_monitor_text.c b/src/qemu/qemu_monitor_text.c index 2e77534..2aa0460 100644 --- a/src/qemu/qemu_monitor_text.c +++ b/src/qemu/qemu_monitor_text.c @@ -1169,6 +1169,14 @@ int qemuMonitorTextSetCPU(qemuMonitorPtr mon, int cpu, bool online) } +/** + * Run HMP command to eject a media from ejectable device. + * + * Returns: + * -2 on error, when the tray is locked + * -1 on all other errors + * 0 on success + */ int qemuMonitorTextEjectMedia(qemuMonitorPtr mon, const char *dev_name, bool force) @@ -1187,6 +1195,8 @@ int qemuMonitorTextEjectMedia(qemuMonitorPtr mon, * device not found, device is locked ... * No message is printed on success it seems */ if (c_strcasestr(reply, "device ")) { + if (c_strcasestr(reply, "is locked")) + ret = -2; virReportError(VIR_ERR_OPERATION_FAILED, _("could not eject media on %s: %s"), dev_name, reply); goto cleanup; -- 2.4.4

Some guests lock the tray and QEMU eject command will simply fail to eject the media. But the guest OS can handle this attempt to eject the media and can unlock the tray and open it. In this case, we should try again to actually eject the media. If the first attempt fails to detect a tray_open we will fail with error, from monitor. If we receive that event, we know, that the guest properly reacted to the eject request, unlocked the tray and opened it. In this case, we need to run the command again to actually eject the media from the device. The reason to call it again is, that QEMU don't wait for the guest to react and report an error, that the tray is locked. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1147471 Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/qemu/qemu_hotplug.c | 73 +++++++++++++++++++++++-------------------------- src/qemu/qemu_process.c | 2 ++ 2 files changed, 36 insertions(+), 39 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 0628964..17595b7 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -59,7 +59,7 @@ VIR_LOG_INIT("qemu.qemu_hotplug"); -#define CHANGE_MEDIA_RETRIES 10 +#define CHANGE_MEDIA_TIMEOUT 5000 /* Wait up to 5 seconds for device removal to finish. */ unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5; @@ -166,12 +166,13 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virStorageSourcePtr newsrc, bool force) { - int ret = -1; + int ret = -1, rc; char *driveAlias = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - int retries = CHANGE_MEDIA_RETRIES; const char *format = NULL; char *sourcestr = NULL; + bool ejectRetry = false; + unsigned long long now; if (!disk->info.alias) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -193,36 +194,31 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, if (!(driveAlias = qemuDeviceDriveHostAlias(disk, priv->qemuCaps))) goto error; - qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); - if (qemuDomainObjExitMonitor(driver, vm) < 0) { - ret = -1; - goto cleanup; - } - - if (ret < 0) - goto error; + do { + qemuDomainObjEnterMonitor(driver, vm); + rc = qemuMonitorEjectMedia(priv->mon, driveAlias, force); + if (qemuDomainObjExitMonitor(driver, vm) < 0) + goto cleanup; - virObjectRef(vm); - /* we don't want to report errors from media tray_open polling */ - while (retries) { - if (disk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) - break; + if (rc == -2) { + /* we've already tried, error out */ + if (ejectRetry) + goto error; + VIR_DEBUG("tray is locked, wait for the guest to unlock " + "the tray and try to eject it again"); + ejectRetry = true; + } else if (rc < 0) { + goto error; + } - retries--; - virObjectUnlock(vm); - VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); - usleep(500 * 1000); /* sleep 500ms */ - virObjectLock(vm); - } - virObjectUnref(vm); + if (virTimeMillisNow(&now) < 0) + goto error; - if (retries <= 0) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("Unable to eject media")); - ret = -1; - goto error; - } + while (disk->tray_status != VIR_DOMAIN_DISK_TRAY_OPEN) { + if (virDomainObjWaitUntil(vm, now + CHANGE_MEDIA_TIMEOUT) != 0) + goto error; + } + } while (ejectRetry && rc != 0); if (!virStorageSourceIsEmpty(newsrc)) { if (qemuGetDriveSourceString(newsrc, conn, &sourcestr) < 0) @@ -237,19 +233,17 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, } } qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorChangeMedia(priv->mon, - driveAlias, - sourcestr, - format); - if (qemuDomainObjExitMonitor(driver, vm) < 0) { - ret = -1; + rc = qemuMonitorChangeMedia(priv->mon, + driveAlias, + sourcestr, + format); + if (qemuDomainObjExitMonitor(driver, vm) < 0) goto cleanup; - } } - virDomainAuditDisk(vm, disk->src, newsrc, "update", ret >= 0); + virDomainAuditDisk(vm, disk->src, newsrc, "update", rc >= 0); - if (ret < 0) + if (rc < 0) goto error; /* remove the old source from shared device list */ @@ -259,6 +253,7 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virStorageSourceFree(disk->src); disk->src = newsrc; newsrc = NULL; + ret = 0; cleanup: VIR_FREE(driveAlias); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index a688feb..648ba00 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1152,6 +1152,8 @@ qemuProcessHandleTrayChange(qemuMonitorPtr mon ATTRIBUTE_UNUSED, VIR_WARN("Unable to save status on vm %s after tray moved event", vm->def->name); } + + virDomainObjBroadcast(vm); } virObjectUnlock(vm); -- 2.4.4

On 06/29/2015 11:17 AM, Pavel Hrdina wrote:
Some guests lock the tray and QEMU eject command will simply fail to eject the media. But the guest OS can handle this attempt to eject the media and can unlock the tray and open it. In this case, we should try again to actually eject the media.
If the first attempt fails to detect a tray_open we will fail with error, from monitor. If we receive that event, we know, that the guest properly reacted to the eject request, unlocked the tray and opened it. In this case, we need to run the command again to actually eject the media from the device. The reason to call it again is, that QEMU don't wait for the guest to react and report an error, that the tray is locked.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1147471
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/qemu/qemu_hotplug.c | 73 +++++++++++++++++++++++-------------------------- src/qemu/qemu_process.c | 2 ++ 2 files changed, 36 insertions(+), 39 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 0628964..17595b7 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -59,7 +59,7 @@
VIR_LOG_INIT("qemu.qemu_hotplug");
-#define CHANGE_MEDIA_RETRIES 10 +#define CHANGE_MEDIA_TIMEOUT 5000
/* Wait up to 5 seconds for device removal to finish. */ unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5; @@ -166,12 +166,13 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virStorageSourcePtr newsrc, bool force) { - int ret = -1; + int ret = -1, rc; char *driveAlias = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - int retries = CHANGE_MEDIA_RETRIES; const char *format = NULL; char *sourcestr = NULL; + bool ejectRetry = false; + unsigned long long now;
if (!disk->info.alias) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -193,36 +194,31 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, if (!(driveAlias = qemuDeviceDriveHostAlias(disk, priv->qemuCaps))) goto error;
- qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); - if (qemuDomainObjExitMonitor(driver, vm) < 0) { - ret = -1; - goto cleanup; - } - - if (ret < 0) - goto error; + do { + qemuDomainObjEnterMonitor(driver, vm); + rc = qemuMonitorEjectMedia(priv->mon, driveAlias, force); + if (qemuDomainObjExitMonitor(driver, vm) < 0) + goto cleanup;
- virObjectRef(vm); - /* we don't want to report errors from media tray_open polling */ - while (retries) { - if (disk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) - break; + if (rc == -2) { + /* we've already tried, error out */ + if (ejectRetry) + goto error; + VIR_DEBUG("tray is locked, wait for the guest to unlock " + "the tray and try to eject it again"); + ejectRetry = true; + } else if (rc < 0) { + goto error; + }
- retries--; - virObjectUnlock(vm); - VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); - usleep(500 * 1000); /* sleep 500ms */ - virObjectLock(vm); - } - virObjectUnref(vm); + if (virTimeMillisNow(&now) < 0) + goto error;
- if (retries <= 0) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("Unable to eject media")); - ret = -1; - goto error; - } + while (disk->tray_status != VIR_DOMAIN_DISK_TRAY_OPEN) { + if (virDomainObjWaitUntil(vm, now + CHANGE_MEDIA_TIMEOUT) != 0) + goto error; + }
Seems this should be if (rc == -2) wait for TRAY_MOVED (need a new event) else wait for TRAY_OPEN
+ } while (ejectRetry && rc != 0);
if (!virStorageSourceIsEmpty(newsrc)) { if (qemuGetDriveSourceString(newsrc, conn, &sourcestr) < 0) @@ -237,19 +233,17 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, } } qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorChangeMedia(priv->mon, - driveAlias, - sourcestr, - format); - if (qemuDomainObjExitMonitor(driver, vm) < 0) { - ret = -1; + rc = qemuMonitorChangeMedia(priv->mon, + driveAlias, + sourcestr, + format); + if (qemuDomainObjExitMonitor(driver, vm) < 0) goto cleanup; - } }
- virDomainAuditDisk(vm, disk->src, newsrc, "update", ret >= 0); + virDomainAuditDisk(vm, disk->src, newsrc, "update", rc >= 0);
- if (ret < 0) + if (rc < 0) goto error;
/* remove the old source from shared device list */ @@ -259,6 +253,7 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virStorageSourceFree(disk->src); disk->src = newsrc; newsrc = NULL; + ret = 0;
cleanup: VIR_FREE(driveAlias); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index a688feb..648ba00 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1152,6 +1152,8 @@ qemuProcessHandleTrayChange(qemuMonitorPtr mon ATTRIBUTE_UNUSED, VIR_WARN("Unable to save status on vm %s after tray moved event", vm->def->name); }
Seems there should be some sort of DEVICE_TRAY_MOVED event... as well as OPEN and CLOSE John
+ + virDomainObjBroadcast(vm); }
virObjectUnlock(vm);

On Wed, Jul 01, 2015 at 05:39:49PM -0400, John Ferlan wrote:
On 06/29/2015 11:17 AM, Pavel Hrdina wrote:
Some guests lock the tray and QEMU eject command will simply fail to eject the media. But the guest OS can handle this attempt to eject the media and can unlock the tray and open it. In this case, we should try again to actually eject the media.
If the first attempt fails to detect a tray_open we will fail with error, from monitor. If we receive that event, we know, that the guest properly reacted to the eject request, unlocked the tray and opened it. In this case, we need to run the command again to actually eject the media from the device. The reason to call it again is, that QEMU don't wait for the guest to react and report an error, that the tray is locked.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1147471
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/qemu/qemu_hotplug.c | 73 +++++++++++++++++++++++-------------------------- src/qemu/qemu_process.c | 2 ++ 2 files changed, 36 insertions(+), 39 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 0628964..17595b7 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -59,7 +59,7 @@
VIR_LOG_INIT("qemu.qemu_hotplug");
-#define CHANGE_MEDIA_RETRIES 10 +#define CHANGE_MEDIA_TIMEOUT 5000
/* Wait up to 5 seconds for device removal to finish. */ unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5; @@ -166,12 +166,13 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virStorageSourcePtr newsrc, bool force) { - int ret = -1; + int ret = -1, rc; char *driveAlias = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - int retries = CHANGE_MEDIA_RETRIES; const char *format = NULL; char *sourcestr = NULL; + bool ejectRetry = false; + unsigned long long now;
if (!disk->info.alias) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -193,36 +194,31 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, if (!(driveAlias = qemuDeviceDriveHostAlias(disk, priv->qemuCaps))) goto error;
- qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); - if (qemuDomainObjExitMonitor(driver, vm) < 0) { - ret = -1; - goto cleanup; - } - - if (ret < 0) - goto error; + do { + qemuDomainObjEnterMonitor(driver, vm); + rc = qemuMonitorEjectMedia(priv->mon, driveAlias, force); + if (qemuDomainObjExitMonitor(driver, vm) < 0) + goto cleanup;
- virObjectRef(vm); - /* we don't want to report errors from media tray_open polling */ - while (retries) { - if (disk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) - break; + if (rc == -2) { + /* we've already tried, error out */ + if (ejectRetry) + goto error; + VIR_DEBUG("tray is locked, wait for the guest to unlock " + "the tray and try to eject it again"); + ejectRetry = true; + } else if (rc < 0) { + goto error; + }
- retries--; - virObjectUnlock(vm); - VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); - usleep(500 * 1000); /* sleep 500ms */ - virObjectLock(vm); - } - virObjectUnref(vm); + if (virTimeMillisNow(&now) < 0) + goto error;
- if (retries <= 0) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("Unable to eject media")); - ret = -1; - goto error; - } + while (disk->tray_status != VIR_DOMAIN_DISK_TRAY_OPEN) { + if (virDomainObjWaitUntil(vm, now + CHANGE_MEDIA_TIMEOUT) != 0) + goto error; + }
Seems this should be
if (rc == -2) wait for TRAY_MOVED (need a new event) else wait for TRAY_OPEN
There is no event TRAY_OPEN or TRAY_CLOSE. QEMU has only one event, DEVICE_TRAY_MOVED. This event is handled by qemuMonitorJSONHandleTrayChange, which will set the reason to OPEN/CLOSE and emit a qemu monitor event. For this qemu monitor event there is a handler called qemuProcessHandleTrayChange, where the tray_status is set to VIR_DOMAIN_DISK_TRAY_OPEN or VIR_DOMAIN_DISK_TRAY_CLOSED. This handler also wakes up that condition we are waiting in this code. There is no need for a new event. We are already using everything, that is available. Pavel
+ } while (ejectRetry && rc != 0);
if (!virStorageSourceIsEmpty(newsrc)) { if (qemuGetDriveSourceString(newsrc, conn, &sourcestr) < 0) @@ -237,19 +233,17 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, } } qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorChangeMedia(priv->mon, - driveAlias, - sourcestr, - format); - if (qemuDomainObjExitMonitor(driver, vm) < 0) { - ret = -1; + rc = qemuMonitorChangeMedia(priv->mon, + driveAlias, + sourcestr, + format); + if (qemuDomainObjExitMonitor(driver, vm) < 0) goto cleanup; - } }
- virDomainAuditDisk(vm, disk->src, newsrc, "update", ret >= 0); + virDomainAuditDisk(vm, disk->src, newsrc, "update", rc >= 0);
- if (ret < 0) + if (rc < 0) goto error;
/* remove the old source from shared device list */ @@ -259,6 +253,7 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virStorageSourceFree(disk->src); disk->src = newsrc; newsrc = NULL; + ret = 0;
cleanup: VIR_FREE(driveAlias); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index a688feb..648ba00 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1152,6 +1152,8 @@ qemuProcessHandleTrayChange(qemuMonitorPtr mon ATTRIBUTE_UNUSED, VIR_WARN("Unable to save status on vm %s after tray moved event", vm->def->name); }
Seems there should be some sort of DEVICE_TRAY_MOVED event... as well as OPEN and CLOSE
John
+ + virDomainObjBroadcast(vm); }
virObjectUnlock(vm);

On 07/03/2015 09:51 AM, Pavel Hrdina wrote:
On Wed, Jul 01, 2015 at 05:39:49PM -0400, John Ferlan wrote:
On 06/29/2015 11:17 AM, Pavel Hrdina wrote:
Some guests lock the tray and QEMU eject command will simply fail to eject the media. But the guest OS can handle this attempt to eject the media and can unlock the tray and open it. In this case, we should try again to actually eject the media.
If the first attempt fails to detect a tray_open we will fail with error, from monitor. If we receive that event, we know, that the guest properly reacted to the eject request, unlocked the tray and opened it. In this case, we need to run the command again to actually eject the media from the device. The reason to call it again is, that QEMU don't
s/don't/doesn't
wait for the guest to react and report an error, that the tray is locked.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1147471
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/qemu/qemu_hotplug.c | 73 +++++++++++++++++++++++-------------------------- src/qemu/qemu_process.c | 2 ++ 2 files changed, 36 insertions(+), 39 deletions(-)
I suppose if I had gone back to the commit message and reread it in light of what I was reading in the qemu-devel message linked in the bz and what I saw in the code, then perhaps it would all make sense. Of course the text you add below now makes it clear what the order of processing is and why the while loop is used. Personally I think this is a case where that commit message could be placed into the code to make it "clearer" what is being done and why. I know there are those against that because it's in the commit message or perhaps some linked bug, but if I'm reading code and wondering what it does, I'm not necessarily going back to the commit/bug that added the code (OK, at least not at first). Whether you add a comment into the code to indicate why this while loop is important and how the processing works due to the event and locked media... ACK series, John
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 0628964..17595b7 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -59,7 +59,7 @@
VIR_LOG_INIT("qemu.qemu_hotplug");
-#define CHANGE_MEDIA_RETRIES 10 +#define CHANGE_MEDIA_TIMEOUT 5000
/* Wait up to 5 seconds for device removal to finish. */ unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5; @@ -166,12 +166,13 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virStorageSourcePtr newsrc, bool force) { - int ret = -1; + int ret = -1, rc; char *driveAlias = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - int retries = CHANGE_MEDIA_RETRIES; const char *format = NULL; char *sourcestr = NULL; + bool ejectRetry = false; + unsigned long long now;
if (!disk->info.alias) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -193,36 +194,31 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, if (!(driveAlias = qemuDeviceDriveHostAlias(disk, priv->qemuCaps))) goto error;
- qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); - if (qemuDomainObjExitMonitor(driver, vm) < 0) { - ret = -1; - goto cleanup; - } - - if (ret < 0) - goto error; + do { + qemuDomainObjEnterMonitor(driver, vm); + rc = qemuMonitorEjectMedia(priv->mon, driveAlias, force); + if (qemuDomainObjExitMonitor(driver, vm) < 0) + goto cleanup;
- virObjectRef(vm); - /* we don't want to report errors from media tray_open polling */ - while (retries) { - if (disk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) - break; + if (rc == -2) { + /* we've already tried, error out */ + if (ejectRetry) + goto error; + VIR_DEBUG("tray is locked, wait for the guest to unlock " + "the tray and try to eject it again"); + ejectRetry = true; + } else if (rc < 0) { + goto error; + }
- retries--; - virObjectUnlock(vm); - VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); - usleep(500 * 1000); /* sleep 500ms */ - virObjectLock(vm); - } - virObjectUnref(vm); + if (virTimeMillisNow(&now) < 0) + goto error;
- if (retries <= 0) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("Unable to eject media")); - ret = -1; - goto error; - } + while (disk->tray_status != VIR_DOMAIN_DISK_TRAY_OPEN) { + if (virDomainObjWaitUntil(vm, now + CHANGE_MEDIA_TIMEOUT) != 0) + goto error; + }
Seems this should be
if (rc == -2) wait for TRAY_MOVED (need a new event) else wait for TRAY_OPEN
There is no event TRAY_OPEN or TRAY_CLOSE. QEMU has only one event, DEVICE_TRAY_MOVED. This event is handled by qemuMonitorJSONHandleTrayChange, which will set the reason to OPEN/CLOSE and emit a qemu monitor event. For this qemu monitor event there is a handler called qemuProcessHandleTrayChange, where the tray_status is set to VIR_DOMAIN_DISK_TRAY_OPEN or VIR_DOMAIN_DISK_TRAY_CLOSED. This handler also wakes up that condition we are waiting in this code. There is no need for a new event. We are already using everything, that is available.
Pavel
participants (3)
-
John Ferlan
-
Pavel Hrdina
-
Peter Krempa