Re: [libvirt] [PATCH] qemu_hotplug: try harder to eject media

Thursday, 25 June 2015

On 06/25/2015 01:52 PM, Pavel Hrdina wrote:
...
 On Thu, Jun 25, 2015 at 12:06:49PM -0400, John Ferlan wrote:
>
>
> On 06/23/2015 07:13 AM, Pavel Hrdina wrote:
>> Some guests lock the tray and QEMU eject command will simply fail to
>> eject the media.  But the guest OS can handle this attempt to eject the
>> media and can unlock the tray and open it. In this case, we should try
>> again to actually eject the media.
>>
>> If the first attempt fails we will wait for tray to open and try again
>> to eject the media, but if the second attempt fails, returns with error.
>>
>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1147471
>>
>> Signed-off-by: Pavel Hrdina <phrdina(a)redhat.com&gt;
>> ---
>>  src/qemu/qemu_hotplug.c | 33 ++++++++++++++++++++++++++++-----
>>  1 file changed, 28 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
>> index 0628964..0cebaad 100644
>> --- a/src/qemu/qemu_hotplug.c
>> +++ b/src/qemu/qemu_hotplug.c
>> @@ -201,13 +201,17 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver,
>>      }
>>  
>>      if (ret < 0)
>> -        goto error;
>> +        VIR_DEBUG("tray is probably locked, wait for the guest to unlock
"
>> +                  "the tray and open it");
>
> There are more reasons than waiting for the unlock that would cause a
> negative value to be returned, so this seems "dangerous"..

 Yes, I know that.  That's why the code will wait for the event DEVICE_TRAY_MOVED
 and only if that event where emitted from QEMU, then we will tray again.
 Otherwise the original error from the first attempt will be used.

I didn't see DEVICE_TRAY_MOVED in this patch...  The wait loop I see
below was VIR_DOMAIN_DISK_TRAY_OPEN.

...
>
> !mon
> if json
>    !cmd
>    qemuMonitorJSONCommand failure
>    qemuMonitorJSONCheckError failure
> else
>    virAsprintf Failure
>    qemuMonitorHMPCommand failure
>    c_strcasestr failure
>
>
> Is there no way to determine that the failure was due to a locked tray
> from the error message?  The bz notes a DEVICE_TRAY_MOVED event, but I
> don't see that here.

 Yes, there is a way, that we can parse the error message and wait only, if it
 contains the phrase "is locked".  For that QEMU event we have a handler
 registered and this handler updates the disk->tray_status.

 IOW, checking the disk->tray_status is the same as waiting for DEVICE_TRAY_MOVED
 event.

Right and I didn't see that so it wasn't clear to me how this patch
fixed the problem....

...
>
> I would think that if this is "handle-able" from qemu/json only, then
> the qemuMonitorJSONEjectMedia would be the best place to at least check
> for this path of retry logic.
>
> You'll have the json error returned (per the link in bz comment 3):
>
> {"error": {"class": "GenericError", "desc":
"Device 'cd' is locked"}}
>
> That should be parse-able by this code and either handled back in the
> caller or in this code.

 Also HMP returns the same error, so it's definitely parse-able for both
 monitors.  At first I didn't want to go through parsing the error message, but
 it makes sense to retry only in case, that the error message contains "is
 locked" message.

>
> Since qemuMonitorJSONEjectMedia doesn't list return values, perhaps use
> "-1" to mean total failure, 0 to mean pseudo-success as it does today,
> and 1 to indicate tray was locked and that the caller can/should retry.
>

> FWIW: By pseudo-success... I see that if 0 is currently returned, the
> while loop will look for the "tray_status == VIR_DOMAIN_DISK_TRAY_OPEN"
> and fail if not eventually seen
>
> Consider the following...
>
>     bool retry_lock_tray = false;
>
>     do {
>         qemuDomainObjEnterMonitor(driver, vm);
>         ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force);
>         if (qemuDomainObjExitMonitor(driver, vm) < 0) {
>             ret = -1;
>             goto cleanup;
>         }
>         /* We already tried, but got the locked error again */
>         if (ret == 1 && retry_lock_tray) {
>             some sort of failure since you retried
>             ret = -1;
>             goto error;
>         }
>         /* We were locked, but haven't retried */
>         if (ret == 1) {
>             Wait some period of time for the DEVICE_TRAY_MOVED event
>             if event found
>                 retry_lock_tray = true;
>         }
>     } while (retry_lock_tray);
>
>     if (ret < 0)
>         goto error;
>
>     if (ret == 1) {
>         ERROR tray locked by guest, never got DEVICE_TRAY_MOVED
>         ret = -1;
>         got error;
>     }
>

 I'll post a v2.

>
>
>>  
>>      virObjectRef(vm);
>>      /* we don't want to report errors from media tray_open polling */
>>      while (retries) {
>> -        if (disk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN)
>> +        if (disk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) {
>> +            ret = 0;
>> +            virResetLastError();
>>              break;
>> +        }
>>  
>>          retries--;
>>          virObjectUnlock(vm);
>
> If it's locked how would this loop ever succeed?  That is how would the
> tray open "magically" if it's locked and we haven't sent another
eject
> request?

 I don't understand this question??  The vm object is locked before we enter this
 function and this loop only unlock the vm object for the time it is waiting for
 the event to occur.

So given the patch that I read, I saw nothing that polled for
DEVICE_TRAY_MOVED, thus how would tray_status become TRAY_OPEN? I think
the answer is - it won't under this scenario, hence the code beyond it
which does the eject again under the assumption that the reason we had a
-1 return was because it was locked.  That second eject doesn't go
through this wait for TRAY_OPEN event either, so hence my suggestion for
a do...while loop.

Or are you indicating that TRAY_OPEN setting occurs for a TRAY_MOVED.
It just wasn't clear from what I read in the patch

John
...
>
>> @@ -218,10 +222,29 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver,
>>      virObjectUnref(vm);
>>  
>>      if (retries <= 0) {
>> -        virReportError(VIR_ERR_OPERATION_FAILED, "%s",
>> -                       _("Unable to eject media"));
>> -        ret = -1;
>> +        /* Report a new error only if the first attempt don't fail and we
don't
>> +         * receive VIR_DOMAIN_DISK_TRAY_OPEN event, otherwise report the error
>> +         * from first attempt. */
>> +        if (ret == 0) {
>> +            virReportError(VIR_ERR_OPERATION_FAILED, "%s",
>> +                           _("Unable to eject media"));
>> +            ret = -1;
>> +        }
>>          goto error;
>> +    } else {
>> +        /* QEMU will send eject request to guest, but if the tray is locked,
>> +         * always returns with error.  However the guest can handle the eject
>> +         * request, unlock the tray and open it.  In case this happens, we
>> +         * should try to eject the media once more. */
>> +        qemuDomainObjEnterMonitor(driver, vm);
>> +        ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force);
>> +        if (qemuDomainObjExitMonitor(driver, vm) < 0) {
>> +            ret = -1;
>> +            goto cleanup;
>> +        }
>> +
>> +        if (ret < 0)
>> +            goto error;
>>      }
>>  
>>      if (!virStorageSourceIsEmpty(newsrc)) {
>> 

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [libvirt] [PATCH] qemu_hotplug: try harder to eject media