[libvirt] [PATCH 0/2] qemu: Rework CD/DVD changing

The first patch is bare bug fix (without any bug reported though). The second is again a bug fix, but not so easy to spot. Basically, when we change a media, we should issue 'eject' first, then wait until the tray gets open, after which we can finally issue 'change' command. Even for bare 'eject' we ought to check status, shouldn't we? Michal Privoznik (2): qemu_monitor: Fix tray-open attribute in query-block qemu_hotplug: Rework media changing process src/qemu/qemu_hotplug.c | 51 +++++++++++++++++++++++++++++++++++++++++--- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 3 files changed, 52 insertions(+), 7 deletions(-) -- 1.8.0.2

With our code, we fail to query for tray-open attribute currently. That's because in HMP it is 'tray-open' and in QMP it's 'tray_open'. It always has been. However, we got it exactly the opposite. --- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index de5f115..0aa3bcb 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -1588,7 +1588,7 @@ int qemuMonitorJSONGetBlockInfo(qemuMonitorPtr mon, /* Don't check for success here, because 'tray-open' is presented iff * medium is ejected. */ - ignore_value(virJSONValueObjectGetBoolean(dev, "tray-open", + ignore_value(virJSONValueObjectGetBoolean(dev, "tray_open", &info->tray_open)); /* Missing io-status indicates no error */ diff --git a/src/qemu/qemu_monitor_text.c b/src/qemu/qemu_monitor_text.c index 6506f9d..bc0a11d 100644 --- a/src/qemu/qemu_monitor_text.c +++ b/src/qemu/qemu_monitor_text.c @@ -809,10 +809,10 @@ int qemuMonitorTextGetBlockInfo(qemuMonitorPtr mon, VIR_DEBUG("error reading locked: %s", p); else info->locked = (tmp != 0); - } else if (STRPREFIX(p, "tray_open=")) { - p += strlen("tray_open="); + } else if (STRPREFIX(p, "tray-open=")) { + p += strlen("tray-open="); if (virStrToLong_i(p, &dummy, 10, &tmp) == -1) - VIR_DEBUG("error reading tray_open: %s", p); + VIR_DEBUG("error reading tray-open: %s", p); else info->tray_open = (tmp != 0); } else if (STRPREFIX(p, "io-status=")) { -- 1.8.0.2

On 01/25/13 13:20, Michal Privoznik wrote:
With our code, we fail to query for tray-open attribute currently. That's because in HMP it is 'tray-open' and in QMP it's 'tray_open'. It always has been. However, we got it exactly the opposite. --- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index de5f115..0aa3bcb 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -1588,7 +1588,7 @@ int qemuMonitorJSONGetBlockInfo(qemuMonitorPtr mon, /* Don't check for success here, because 'tray-open' is presented iff * medium is ejected. */ - ignore_value(virJSONValueObjectGetBoolean(dev, "tray-open", + ignore_value(virJSONValueObjectGetBoolean(dev, "tray_open", &info->tray_open));
Indeed. The returned value is with the underscore, while the documentation uses the dash.
/* Missing io-status indicates no error */ diff --git a/src/qemu/qemu_monitor_text.c b/src/qemu/qemu_monitor_text.c index 6506f9d..bc0a11d 100644 --- a/src/qemu/qemu_monitor_text.c +++ b/src/qemu/qemu_monitor_text.c
+ } else if (STRPREFIX(p, "tray-open=")) {
This too matches the reality. ACK. I think it would be worth notifying the qemu folks as the docs are mistaken. Peter

On 01/25/2013 06:05 AM, Peter Krempa wrote:
On 01/25/13 13:20, Michal Privoznik wrote:
With our code, we fail to query for tray-open attribute currently. That's because in HMP it is 'tray-open' and in QMP it's 'tray_open'. It always has been. However, we got it exactly the opposite. --- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index de5f115..0aa3bcb 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -1588,7 +1588,7 @@ int qemuMonitorJSONGetBlockInfo(qemuMonitorPtr mon, /* Don't check for success here, because 'tray-open' is presented iff * medium is ejected. */ - ignore_value(virJSONValueObjectGetBoolean(dev, "tray-open", + ignore_value(virJSONValueObjectGetBoolean(dev, "tray_open", &info->tray_open));
Indeed. The returned value is with the underscore, while the documentation uses the dash.
Where did you find that? Because qemu.git/qapi-schema.json says: { 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', 'locked': 'bool', '*inserted': 'BlockDeviceInfo', '*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus', '*dirty': 'BlockDirtyInfo' } } with the underscore, and that's the file I refer to for documentation of QMP. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 25.01.2013 14:31, Eric Blake wrote:
On 01/25/2013 06:05 AM, Peter Krempa wrote:
On 01/25/13 13:20, Michal Privoznik wrote:
With our code, we fail to query for tray-open attribute currently. That's because in HMP it is 'tray-open' and in QMP it's 'tray_open'. It always has been. However, we got it exactly the opposite. --- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index de5f115..0aa3bcb 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -1588,7 +1588,7 @@ int qemuMonitorJSONGetBlockInfo(qemuMonitorPtr mon, /* Don't check for success here, because 'tray-open' is presented iff * medium is ejected. */ - ignore_value(virJSONValueObjectGetBoolean(dev, "tray-open", + ignore_value(virJSONValueObjectGetBoolean(dev, "tray_open", &info->tray_open));
Indeed. The returned value is with the underscore, while the documentation uses the dash.
Where did you find that? Because qemu.git/qapi-schema.json says:
{ 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', 'locked': 'bool', '*inserted': 'BlockDeviceInfo', '*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus', '*dirty': 'BlockDirtyInfo' } }
with the underscore, and that's the file I refer to for documentation of QMP.
qmp-commands.hx:- "tray-open": only present if removable, true if the device has a tray,

On 01/25/13 14:31, Eric Blake wrote:
On 01/25/2013 06:05 AM, Peter Krempa wrote:
On 01/25/13 13:20, Michal Privoznik wrote:
With our code, we fail to query for tray-open attribute currently. That's because in HMP it is 'tray-open' and in QMP it's 'tray_open'. It always has been. However, we got it exactly the opposite. --- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index de5f115..0aa3bcb 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -1588,7 +1588,7 @@ int qemuMonitorJSONGetBlockInfo(qemuMonitorPtr mon, /* Don't check for success here, because 'tray-open' is presented iff * medium is ejected. */ - ignore_value(virJSONValueObjectGetBoolean(dev, "tray-open", + ignore_value(virJSONValueObjectGetBoolean(dev, "tray_open", &info->tray_open));
Indeed. The returned value is with the underscore, while the documentation uses the dash.
Where did you find that? Because qemu.git/qapi-schema.json says:
{ 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', 'locked': 'bool', '*inserted': 'BlockDeviceInfo', '*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus', '*dirty': 'BlockDirtyInfo' } }
with the underscore, and that's the file I refer to for documentation of QMP.
I usualy refer to qemu.git/qmp-commands.hx: query-block ----------- Show the block devices. Each block device information is stored in a json-object and the returned value is a json-array of all devices. Each json-object contain the following: - "device": device name (json-string) - "type": device type (json-string) - deprecated, retained for backward compatibility - Possible values: "unknown" - "removable": true if the device is removable, false otherwise (json-bool) - "locked": true if the device is locked, false otherwise (json-bool) - "tray-open": only present if removable, true if the device has a tray, and it is open (json-bool) - "inserted": only present if the device is inserted, it is a json-object containing the following: [...]

On 25.01.2013 14:05, Peter Krempa wrote:
On 01/25/13 13:20, Michal Privoznik wrote:
With our code, we fail to query for tray-open attribute currently. That's because in HMP it is 'tray-open' and in QMP it's 'tray_open'. It always has been. However, we got it exactly the opposite. --- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index de5f115..0aa3bcb 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -1588,7 +1588,7 @@ int qemuMonitorJSONGetBlockInfo(qemuMonitorPtr mon, /* Don't check for success here, because 'tray-open' is
1: ^^
presented iff * medium is ejected. */ - ignore_value(virJSONValueObjectGetBoolean(dev, "tray-open", + ignore_value(virJSONValueObjectGetBoolean(dev, "tray_open", &info->tray_open));
Indeed. The returned value is with the underscore, while the documentation uses the dash.
/* Missing io-status indicates no error */ diff --git a/src/qemu/qemu_monitor_text.c b/src/qemu/qemu_monitor_text.c index 6506f9d..bc0a11d 100644 --- a/src/qemu/qemu_monitor_text.c +++ b/src/qemu/qemu_monitor_text.c
+ } else if (STRPREFIX(p, "tray-open=")) {
This too matches the reality.
ACK. I think it would be worth notifying the qemu folks as the docs are mistaken.
Peter
Thanks. I've pushed this one since this is pure bug fix unrelated to the latter patch of mine. I've sent patch to qemu devel list as well: http://lists.nongnu.org/archive/html/qemu-devel/2013-01/msg04804.html And I've updated the forgot comment [1] as well. Michal

https://bugzilla.redhat.com/show_bug.cgi?id=892289 It seems like with new udev within guest OS, the tray is locked, so we need to: - 'eject' - wait shortly - 'change' However, the 'wait shortly' step is better to be substituted with active polling on tray_open attribute on the device. Moreover, even when doing bare 'eject', we should check for 'tray_open' as guest may have locked the tray. --- src/qemu/qemu_hotplug.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 48 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 8103183..f373c44 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -48,6 +48,7 @@ #include "virstoragefile.h" #define VIR_FROM_THIS VIR_FROM_QEMU +#define CHANGE_MEDIA_RETRIES 10 int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -59,6 +60,10 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, int ret; char *driveAlias = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; + virHashTablePtr table = NULL; + struct qemuDomainDiskInfo *info = NULL; + int retries = CHANGE_MEDIA_RETRIES; + virErrorPtr origError = NULL; for (i = 0 ; i < vm->def->ndisks ; i++) { if (vm->def->disks[i]->bus == disk->bus && @@ -86,7 +91,7 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, origdisk->device != VIR_DOMAIN_DISK_DEVICE_CDROM) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Removable media not supported for %s device"), - virDomainDiskDeviceTypeToString(disk->device)); + virDomainDiskDeviceTypeToString(disk->device)); return -1; } @@ -105,8 +110,45 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, goto error; qemuDomainObjEnterMonitorWithDriver(driver, vm); + ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); + + /* we don't want to report errors from media tray_open polling */ + origError = virSaveLastError(); + while (retries--) { + virHashFree(table); + table = qemuMonitorGetBlockInfo(priv->mon); + if (!table) + goto exit_monitor; + + info = qemuMonitorBlockInfoLookup(table, origdisk->info.alias); + if (!info) { + virHashFree(table); + goto exit_monitor; + } + + if (info->tray_open) + break; + + usleep(200 * 1000); /* sleep 200ms */ + } + virHashFree(table); + if (disk->src) { + /* deliberately don't depend on 'ret' as 'eject' may have failed for the + * fist time and we are gonna check the drive state anyway */ const char *format = NULL; + + /* We haven't succeeded yet */ + ret = -1; + + if (retries <= 0) { + virFreeError(origError); + origError = NULL; + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("Unable to eject media before changing it")); + goto exit_monitor; + } + if (disk->type != VIR_DOMAIN_DISK_TYPE_DIR) { if (disk->format > 0) format = virStorageFileFormatTypeToString(disk->format); @@ -116,8 +158,11 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, ret = qemuMonitorChangeMedia(priv->mon, driveAlias, disk->src, format); - } else { - ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); + } +exit_monitor: + if (origError) { + virSetError(origError); + virFreeError(origError); } qemuDomainObjExitMonitorWithDriver(driver, vm); -- 1.8.0.2

On Fri, Jan 25, 2013 at 01:20:43PM +0100, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=892289
It seems like with new udev within guest OS, the tray is locked, so we need to: - 'eject' - wait shortly - 'change'
However, the 'wait shortly' step is better to be substituted with active polling on tray_open attribute on the device. Moreover, even when doing bare 'eject', we should check for 'tray_open' as guest may have locked the tray.
QEMU emits an event when the tray state changes, so we should just be listening for that. In the case where we have an old QEMU lacking the tray state event, then we're also missing the tray state in the block info query, so no worse off. So just skip the polling and wait for the event to arrive. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

https://bugzilla.redhat.com/show_bug.cgi?id=892289 It seems like with new udev within guest OS, the tray is locked, so we need to: - 'eject' - wait for tray to open - 'change' Moreover, even when doing bare 'eject', we should check for 'tray_open' as guest may have locked the tray. However, the waiting phase shouldn't be unbounded, so I've chosen 10 retries maximum, each per 500ms. This should give enough time for guest to eject a media and open the tray. --- diff to v1: - listen to TRAY event instead of active polling src/qemu/qemu_hotplug.c | 30 +++++++++++++++++++++++++++--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 8103183..18c4109 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -48,6 +48,7 @@ #include "virstoragefile.h" #define VIR_FROM_THIS VIR_FROM_QEMU +#define CHANGE_MEDIA_RETRIES 10 int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -59,6 +60,7 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, int ret; char *driveAlias = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; + int retries = CHANGE_MEDIA_RETRIES; for (i = 0 ; i < vm->def->ndisks ; i++) { if (vm->def->disks[i]->bus == disk->bus && @@ -86,7 +88,7 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, origdisk->device != VIR_DOMAIN_DISK_DEVICE_CDROM) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Removable media not supported for %s device"), - virDomainDiskDeviceTypeToString(disk->device)); + virDomainDiskDeviceTypeToString(disk->device)); return -1; } @@ -105,8 +107,31 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, goto error; qemuDomainObjEnterMonitorWithDriver(driver, vm); + ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); + + /* we don't want to report errors from media tray_open polling */ + while (retries--) { + if (origdisk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) + break; + + VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); + usleep(500 * 1000); /* sleep 500ms */ + } + if (disk->src) { + /* deliberately don't depend on 'ret' as 'eject' may have failed for the + * fist time and we are gonna check the drive state anyway */ const char *format = NULL; + + /* We haven't succeeded yet */ + ret = -1; + + if (retries <= 0) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("Unable to eject media before changing it")); + goto exit_monitor; + } + if (disk->type != VIR_DOMAIN_DISK_TYPE_DIR) { if (disk->format > 0) format = virStorageFileFormatTypeToString(disk->format); @@ -116,9 +141,8 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, ret = qemuMonitorChangeMedia(priv->mon, driveAlias, disk->src, format); - } else { - ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); } +exit_monitor: qemuDomainObjExitMonitorWithDriver(driver, vm); virDomainAuditDisk(vm, origdisk->src, disk->src, "update", ret >= 0); -- 1.8.0.2

On 01/25/2013 07:53 AM, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=892289
It seems like with new udev within guest OS, the tray is locked, so we need to: - 'eject' - wait for tray to open - 'change'
Moreover, even when doing bare 'eject', we should check for 'tray_open' as guest may have locked the tray. However, the waiting phase shouldn't be unbounded, so I've chosen 10 retries maximum, each per 500ms. This should give enough time for guest to eject a media and open the tray. ---
diff to v1: - listen to TRAY event instead of active polling
ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 26.01.2013 00:07, Eric Blake wrote:
On 01/25/2013 07:53 AM, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=892289
It seems like with new udev within guest OS, the tray is locked, so we need to: - 'eject' - wait for tray to open - 'change'
Moreover, even when doing bare 'eject', we should check for 'tray_open' as guest may have locked the tray. However, the waiting phase shouldn't be unbounded, so I've chosen 10 retries maximum, each per 500ms. This should give enough time for guest to eject a media and open the tray. ---
diff to v1: - listen to TRAY event instead of active polling
ACK.
Thanks, pushed. Michal

Hi Michal, Daniel It seems there is something wrong with the 2/2 part of this patchset. When I do an "change-media" command in virsh, it doesn't do any better than before, but even worse(I must to wait 5 secs to see the error). I'm not family with libvirt, just add some log things in the qemu_hotplug.c and found the tray_status never change to open, but with michal's original patch (which do active poll on tray_status), I can do "change-media" successfully every time. IMHO, there must be some bug in the processing of qemu tray change. Thanks, Wei On 01/25/2013 08:20 PM, Michal Privoznik wrote:
The first patch is bare bug fix (without any bug reported though). The second is again a bug fix, but not so easy to spot. Basically, when we change a media, we should issue 'eject' first, then wait until the tray gets open, after which we can finally issue 'change' command. Even for bare 'eject' we ought to check status, shouldn't we?
Michal Privoznik (2): qemu_monitor: Fix tray-open attribute in query-block qemu_hotplug: Rework media changing process
src/qemu/qemu_hotplug.c | 51 +++++++++++++++++++++++++++++++++++++++++--- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 3 files changed, 52 insertions(+), 7 deletions(-)

On 17.05.2013 07:47, Li Wei wrote:
Hi Michal, Daniel
It seems there is something wrong with the 2/2 part of this patchset. When I do an "change-media" command in virsh, it doesn't do any better than before, but even worse(I must to wait 5 secs to see the error).
I'm not family with libvirt, just add some log things in the qemu_hotplug.c and found the tray_status never change to open, but with michal's original patch (which do active poll on tray_status), I can do "change-media" successfully every time.
Yes & no. Thing is - originally we ignored tray status in the guest. So in case guest has locked the tray and thus ignored our eject request, we have changed the media anyway, resulting in possible I/O errors within guest. With my patch, we send eject request to the qemu, and actively poll for the tray to open. We check for up to 10 times every 200ms. So guest has 2 seconds to open the tray. If the guest doesn't eject media and open the tray we don't proceed with forcibly changing the media. These patches are actually a bug fix for https://bugzilla.redhat.com/show_bug.cgi?id=892289
IMHO, there must be some bug in the processing of qemu tray change.
Thanks, Wei
On 01/25/2013 08:20 PM, Michal Privoznik wrote:
The first patch is bare bug fix (without any bug reported though). The second is again a bug fix, but not so easy to spot. Basically, when we change a media, we should issue 'eject' first, then wait until the tray gets open, after which we can finally issue 'change' command. Even for bare 'eject' we ought to check status, shouldn't we?
Michal Privoznik (2): qemu_monitor: Fix tray-open attribute in query-block qemu_hotplug: Rework media changing process
src/qemu/qemu_hotplug.c | 51 +++++++++++++++++++++++++++++++++++++++++--- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 3 files changed, 52 insertions(+), 7 deletions(-)

On 05/17/2013 04:07 PM, Michal Privoznik wrote:
On 17.05.2013 07:47, Li Wei wrote:
Hi Michal, Daniel
It seems there is something wrong with the 2/2 part of this patchset. When I do an "change-media" command in virsh, it doesn't do any better than before, but even worse(I must to wait 5 secs to see the error).
I'm not family with libvirt, just add some log things in the qemu_hotplug.c and found the tray_status never change to open, but with michal's original patch (which do active poll on tray_status), I can do "change-media" successfully every time.
Yes & no. Thing is - originally we ignored tray status in the guest. So in case guest has locked the tray and thus ignored our eject request, we have changed the media anyway, resulting in possible I/O errors within guest.
With my patch, we send eject request to the qemu, and actively poll for the tray to open. We check for up to 10 times every 200ms. So guest has 2 seconds to open the tray. If the guest doesn't eject media and open the tray we don't proceed with forcibly changing the media.
These patches are actually a bug fix for https://bugzilla.redhat.com/show_bug.cgi?id=892289
Hmm... I think you misunderstood my meaning(sorry for my pool english), simply to say, I can get expected behavior by apply patch V1 but not V2. In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but this never happened. Thanks, Wei
IMHO, there must be some bug in the processing of qemu tray change.
Thanks, Wei
On 01/25/2013 08:20 PM, Michal Privoznik wrote:
The first patch is bare bug fix (without any bug reported though). The second is again a bug fix, but not so easy to spot. Basically, when we change a media, we should issue 'eject' first, then wait until the tray gets open, after which we can finally issue 'change' command. Even for bare 'eject' we ought to check status, shouldn't we?
Michal Privoznik (2): qemu_monitor: Fix tray-open attribute in query-block qemu_hotplug: Rework media changing process
src/qemu/qemu_hotplug.c | 51 +++++++++++++++++++++++++++++++++++++++++--- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 3 files changed, 52 insertions(+), 7 deletions(-)

On 17.05.2013 10:16, Li Wei wrote:
On 05/17/2013 04:07 PM, Michal Privoznik wrote:
On 17.05.2013 07:47, Li Wei wrote:
Hi Michal, Daniel
It seems there is something wrong with the 2/2 part of this patchset. When I do an "change-media" command in virsh, it doesn't do any better than before, but even worse(I must to wait 5 secs to see the error).
I'm not family with libvirt, just add some log things in the qemu_hotplug.c and found the tray_status never change to open, but with michal's original patch (which do active poll on tray_status), I can do "change-media" successfully every time.
Yes & no. Thing is - originally we ignored tray status in the guest. So in case guest has locked the tray and thus ignored our eject request, we have changed the media anyway, resulting in possible I/O errors within guest.
With my patch, we send eject request to the qemu, and actively poll for the tray to open. We check for up to 10 times every 200ms. So guest has 2 seconds to open the tray. If the guest doesn't eject media and open the tray we don't proceed with forcibly changing the media.
These patches are actually a bug fix for https://bugzilla.redhat.com/show_bug.cgi?id=892289
Hmm... I think you misunderstood my meaning(sorry for my pool english), simply to say, I can get expected behavior by apply patch V1 but not V2.
In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but this never happened.
Thanks, Wei
Aaah, now I understand. So you are saying that the event is not being delivered. What's the qemu version? If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them? http://wiki.libvirt.org/page/DebugLogs Or maybe the timeout is just short for your guest to eject the tray. So after a while, does 'virsh change-media' succeed? Michal
IMHO, there must be some bug in the processing of qemu tray change.
Thanks, Wei
On 01/25/2013 08:20 PM, Michal Privoznik wrote:
The first patch is bare bug fix (without any bug reported though). The second is again a bug fix, but not so easy to spot. Basically, when we change a media, we should issue 'eject' first, then wait until the tray gets open, after which we can finally issue 'change' command. Even for bare 'eject' we ought to check status, shouldn't we?
Michal Privoznik (2): qemu_monitor: Fix tray-open attribute in query-block qemu_hotplug: Rework media changing process
src/qemu/qemu_hotplug.c | 51 +++++++++++++++++++++++++++++++++++++++++--- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 3 files changed, 52 insertions(+), 7 deletions(-)
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 05/17/2013 04:52 PM, Michal Privoznik wrote:
On 17.05.2013 10:16, Li Wei wrote:
On 05/17/2013 04:07 PM, Michal Privoznik wrote:
On 17.05.2013 07:47, Li Wei wrote:
Hi Michal, Daniel
It seems there is something wrong with the 2/2 part of this patchset. When I do an "change-media" command in virsh, it doesn't do any better than before, but even worse(I must to wait 5 secs to see the error).
I'm not family with libvirt, just add some log things in the qemu_hotplug.c and found the tray_status never change to open, but with michal's original patch (which do active poll on tray_status), I can do "change-media" successfully every time.
Yes & no. Thing is - originally we ignored tray status in the guest. So in case guest has locked the tray and thus ignored our eject request, we have changed the media anyway, resulting in possible I/O errors within guest.
With my patch, we send eject request to the qemu, and actively poll for the tray to open. We check for up to 10 times every 200ms. So guest has 2 seconds to open the tray. If the guest doesn't eject media and open the tray we don't proceed with forcibly changing the media.
These patches are actually a bug fix for https://bugzilla.redhat.com/show_bug.cgi?id=892289
Hmm... I think you misunderstood my meaning(sorry for my pool english), simply to say, I can get expected behavior by apply patch V1 but not V2.
In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but this never happened.
Thanks, Wei
Aaah, now I understand. So you are saying that the event is not being delivered. What's the qemu version?
I have the following version: virsh # version Compiled against library: libvirt 1.0.5 Using library: libvirt 1.0.5 Using API: QEMU 1.0.5 Running hypervisor: QEMU 1.4.1
If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them? http://wiki.libvirt.org/page/DebugLogs
Or maybe the timeout is just short for your guest to eject the tray. So after a while, does 'virsh change-media' succeed?
Yes, I can see the DEVICE_TRAY_MOVED in the log file, it goes after the debug message "qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0". does that mean the timeout is too short? If the guest needs too much time to eject, I'm wondering why when I use the v1 patch, I can get succeed in about 1 to 2 seconds. I doubt if there are other problems cause tray_status not updated in time? I appended the log file for your reference, if you need any other information, please tell me. Thanks, Wei
Michal
IMHO, there must be some bug in the processing of qemu tray change.
Thanks, Wei
On 01/25/2013 08:20 PM, Michal Privoznik wrote:
The first patch is bare bug fix (without any bug reported though). The second is again a bug fix, but not so easy to spot. Basically, when we change a media, we should issue 'eject' first, then wait until the tray gets open, after which we can finally issue 'change' command. Even for bare 'eject' we ought to check status, shouldn't we?
Michal Privoznik (2): qemu_monitor: Fix tray-open attribute in query-block qemu_hotplug: Rework media changing process
src/qemu/qemu_hotplug.c | 51 +++++++++++++++++++++++++++++++++++++++++--- src/qemu/qemu_monitor_json.c | 2 +- src/qemu/qemu_monitor_text.c | 6 +++--- 3 files changed, 52 insertions(+), 7 deletions(-)
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 17.05.2013 11:37, Li Wei wrote:
On 05/17/2013 04:52 PM, Michal Privoznik wrote:
On 17.05.2013 10:16, Li Wei wrote:
On 05/17/2013 04:07 PM, Michal Privoznik wrote:
On 17.05.2013 07:47, Li Wei wrote:
Hi Michal, Daniel
It seems there is something wrong with the 2/2 part of this patchset. When I do an "change-media" command in virsh, it doesn't do any better than before, but even worse(I must to wait 5 secs to see the error).
I'm not family with libvirt, just add some log things in the qemu_hotplug.c and found the tray_status never change to open, but with michal's original patch (which do active poll on tray_status), I can do "change-media" successfully every time.
Yes & no. Thing is - originally we ignored tray status in the guest. So in case guest has locked the tray and thus ignored our eject request, we have changed the media anyway, resulting in possible I/O errors within guest.
With my patch, we send eject request to the qemu, and actively poll for the tray to open. We check for up to 10 times every 200ms. So guest has 2 seconds to open the tray. If the guest doesn't eject media and open the tray we don't proceed with forcibly changing the media.
These patches are actually a bug fix for https://bugzilla.redhat.com/show_bug.cgi?id=892289
Hmm... I think you misunderstood my meaning(sorry for my pool english), simply to say, I can get expected behavior by apply patch V1 but not V2.
In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but this never happened.
Thanks, Wei
Aaah, now I understand. So you are saying that the event is not being delivered. What's the qemu version?
I have the following version:
virsh # version Compiled against library: libvirt 1.0.5 Using library: libvirt 1.0.5 Using API: QEMU 1.0.5 Running hypervisor: QEMU 1.4.1
If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them? http://wiki.libvirt.org/page/DebugLogs
Or maybe the timeout is just short for your guest to eject the tray. So after a while, does 'virsh change-media' succeed?
Yes, I can see the DEVICE_TRAY_MOVED in the log file, it goes after the debug message "qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0". does that mean the timeout is too short?
If the guest needs too much time to eject, I'm wondering why when I use the v1 patch, I can get succeed in about 1 to 2 seconds. I doubt if there are other problems cause tray_status not updated in time?
I appended the log file for your reference, if you need any other information, please tell me.
Thanks, Wei
I am pasting some interesting knobs from the log: 2013-05-17 09:04:46.234+0000: 27004: debug : qemuMonitorSend:892 : QEMU_MONITOR_SEND_MSG: mon=0x7f7ac4000bd0 msg={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} fd=-1 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOWrite:458 : QEMU_MONITOR_IO_WRITE: mon=0x7f7ac4000bd0 buf={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} len=92 ret=92 errno=2 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"id": "libvirt-7", "error": {"class": "GenericError", "desc": "Device 'drive-ide0-1-0' is locked"}} len=102 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.235+0000: 27004: error : qemuMonitorJSONCheckError:352 : internal error unable to execute QEMU command 'eject': Device 'drive-ide0-1-0' is locked 2013-05-17 09:04:46.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 9 2013-05-17 09:04:46.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 8 2013-05-17 09:04:47.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 7 2013-05-17 09:04:47.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 6 2013-05-17 09:04:48.186+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:48.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 5 2013-05-17 09:04:48.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 4 2013-05-17 09:04:49.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 3 2013-05-17 09:04:49.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 2 2013-05-17 09:04:50.236+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 1 2013-05-17 09:04:50.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0 2013-05-17 09:04:51.236+0000: 27004: error : qemuDomainChangeEjectableMedia:117 : operation failed: Unable to eject media before changing it 2013-05-17 09:04:51.236+0000: 27004: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockDiskDetach:282 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 disk=0x7f7ac4003690 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockManagerNew:128 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 withResources=0 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerPluginGetDriver:295 : plugin=0x7f7acc007c00 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerNew:319 : driver=0x7f7ae53dc5e0 type=0 nparams=5 params=0x7f7ad7d25830 flags=0 2013-05-17 09:04:51.236+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"timestamp": {"seconds": 1368781488, "microseconds": 186399}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "drive-ide0-1-0", "tray-open": true}} len=151 What we can see here is: 1) libvirt asks qemu to eject 2) qemu responds with "Device is locked" - which is correct as it emulates guest request 3) However, after a while, the guest decides it's now safe to eject the media (e.g. all reads have finished, and the device has been unmounted) which is done 4) [not in the pasted log] After a while, you've decided to try again and this time it succeeded, as the tray is already opened. What we may do here is lifting the number of retries up. You can try yourself: diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d037c9d..ddd8d17 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -50,7 +50,7 @@ #include "virstring.h" #define VIR_FROM_THIS VIR_FROM_QEMU -#define CHANGE_MEDIA_RETRIES 10 +#define CHANGE_MEDIA_RETRIES 20 int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virDomainObjPtr vm, Michal

On 05/17, Michal Privoznik wrote:
On 17.05.2013 11:37, Li Wei wrote:
On 05/17/2013 04:52 PM, Michal Privoznik wrote:
On 17.05.2013 10:16, Li Wei wrote:
On 05/17/2013 04:07 PM, Michal Privoznik wrote:
On 17.05.2013 07:47, Li Wei wrote:
Hi Michal, Daniel
It seems there is something wrong with the 2/2 part of this patchset. When I do an "change-media" command in virsh, it doesn't do any better than before, but even worse(I must to wait 5 secs to see the error).
I'm not family with libvirt, just add some log things in the qemu_hotplug.c and found the tray_status never change to open, but with michal's original patch (which do active poll on tray_status), I can do "change-media" successfully every time.
Yes & no. Thing is - originally we ignored tray status in the guest. So in case guest has locked the tray and thus ignored our eject request, we have changed the media anyway, resulting in possible I/O errors within guest.
With my patch, we send eject request to the qemu, and actively poll for the tray to open. We check for up to 10 times every 200ms. So guest has 2 seconds to open the tray. If the guest doesn't eject media and open the tray we don't proceed with forcibly changing the media.
These patches are actually a bug fix for https://bugzilla.redhat.com/show_bug.cgi?id=892289
Hmm... I think you misunderstood my meaning(sorry for my pool english), simply to say, I can get expected behavior by apply patch V1 but not V2.
In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but this never happened.
Thanks, Wei
Aaah, now I understand. So you are saying that the event is not being delivered. What's the qemu version?
I have the following version:
virsh # version Compiled against library: libvirt 1.0.5 Using library: libvirt 1.0.5 Using API: QEMU 1.0.5 Running hypervisor: QEMU 1.4.1
If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them? http://wiki.libvirt.org/page/DebugLogs
Or maybe the timeout is just short for your guest to eject the tray. So after a while, does 'virsh change-media' succeed?
Yes, I can see the DEVICE_TRAY_MOVED in the log file, it goes after the debug message "qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0". does that mean the timeout is too short?
If the guest needs too much time to eject, I'm wondering why when I use the v1 patch, I can get succeed in about 1 to 2 seconds. I doubt if there are other problems cause tray_status not updated in time?
I appended the log file for your reference, if you need any other information, please tell me.
Thanks, Wei
I am pasting some interesting knobs from the log:
2013-05-17 09:04:46.234+0000: 27004: debug : qemuMonitorSend:892 : QEMU_MONITOR_SEND_MSG: mon=0x7f7ac4000bd0 msg={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} fd=-1 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOWrite:458 : QEMU_MONITOR_IO_WRITE: mon=0x7f7ac4000bd0 buf={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} len=92 ret=92 errno=2 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"id": "libvirt-7", "error": {"class": "GenericError", "desc": "Device 'drive-ide0-1-0' is locked"}} len=102 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.235+0000: 27004: error : qemuMonitorJSONCheckError:352 : internal error unable to execute QEMU command 'eject': Device 'drive-ide0-1-0' is locked 2013-05-17 09:04:46.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 9 2013-05-17 09:04:46.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 8 2013-05-17 09:04:47.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 7 2013-05-17 09:04:47.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 6 2013-05-17 09:04:48.186+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:48.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 5 2013-05-17 09:04:48.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 4 2013-05-17 09:04:49.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 3 2013-05-17 09:04:49.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 2 2013-05-17 09:04:50.236+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 1 2013-05-17 09:04:50.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0 2013-05-17 09:04:51.236+0000: 27004: error : qemuDomainChangeEjectableMedia:117 : operation failed: Unable to eject media before changing it 2013-05-17 09:04:51.236+0000: 27004: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockDiskDetach:282 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 disk=0x7f7ac4003690 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockManagerNew:128 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 withResources=0 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerPluginGetDriver:295 : plugin=0x7f7acc007c00 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerNew:319 : driver=0x7f7ae53dc5e0 type=0 nparams=5 params=0x7f7ad7d25830 flags=0 2013-05-17 09:04:51.236+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"timestamp": {"seconds": 1368781488, "microseconds": 186399}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "drive-ide0-1-0", "tray-open": true}} len=151
What we can see here is: 1) libvirt asks qemu to eject 2) qemu responds with "Device is locked" - which is correct as it emulates guest request 3) However, after a while, the guest decides it's now safe to eject the media (e.g. all reads have finished, and the device has been unmounted) which is done 4) [not in the pasted log] After a while, you've decided to try again and this time it succeeded, as the tray is already opened.
What we may do here is lifting the number of retries up. You can try yourself:
I have indeed increased retries to 20 and unfortunately that didn't work either, we can see in the log file that the DEVICE_TRAY_MOVED event received immediately after the final retry. After we doubled the retries, that would be more that enough time for the guest to eject the tray. So, maybe there is something wrong otherwhere. Thanks, Wei
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d037c9d..ddd8d17 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -50,7 +50,7 @@ #include "virstring.h"
#define VIR_FROM_THIS VIR_FROM_QEMU -#define CHANGE_MEDIA_RETRIES 10 +#define CHANGE_MEDIA_RETRIES 20
int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, virDomainObjPtr vm,
Michal
--

On 18.05.2013 05:31, Li Wei wrote:
On 05/17, Michal Privoznik wrote:
On 17.05.2013 11:37, Li Wei wrote:
On 05/17/2013 04:52 PM, Michal Privoznik wrote:
On 17.05.2013 10:16, Li Wei wrote:
On 05/17/2013 04:07 PM, Michal Privoznik wrote:
On 17.05.2013 07:47, Li Wei wrote: > Hi Michal, Daniel > > It seems there is something wrong with the 2/2 part of this patchset. > When I do an "change-media" command in virsh, it doesn't do any better > than before, but even worse(I must to wait 5 secs to see the error). > > I'm not family with libvirt, just add some log things in the qemu_hotplug.c > and found the tray_status never change to open, but with michal's original > patch (which do active poll on tray_status), I can do "change-media" successfully > every time.
Yes & no. Thing is - originally we ignored tray status in the guest. So in case guest has locked the tray and thus ignored our eject request, we have changed the media anyway, resulting in possible I/O errors within guest.
With my patch, we send eject request to the qemu, and actively poll for the tray to open. We check for up to 10 times every 200ms. So guest has 2 seconds to open the tray. If the guest doesn't eject media and open the tray we don't proceed with forcibly changing the media.
These patches are actually a bug fix for https://bugzilla.redhat.com/show_bug.cgi?id=892289
Hmm... I think you misunderstood my meaning(sorry for my pool english), simply to say, I can get expected behavior by apply patch V1 but not V2.
In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but this never happened.
Thanks, Wei
Aaah, now I understand. So you are saying that the event is not being delivered. What's the qemu version?
I have the following version:
virsh # version Compiled against library: libvirt 1.0.5 Using library: libvirt 1.0.5 Using API: QEMU 1.0.5 Running hypervisor: QEMU 1.4.1
If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them? http://wiki.libvirt.org/page/DebugLogs
Or maybe the timeout is just short for your guest to eject the tray. So after a while, does 'virsh change-media' succeed?
Yes, I can see the DEVICE_TRAY_MOVED in the log file, it goes after the debug message "qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0". does that mean the timeout is too short?
If the guest needs too much time to eject, I'm wondering why when I use the v1 patch, I can get succeed in about 1 to 2 seconds. I doubt if there are other problems cause tray_status not updated in time?
I appended the log file for your reference, if you need any other information, please tell me.
Thanks, Wei
I am pasting some interesting knobs from the log:
2013-05-17 09:04:46.234+0000: 27004: debug : qemuMonitorSend:892 : QEMU_MONITOR_SEND_MSG: mon=0x7f7ac4000bd0 msg={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} fd=-1 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOWrite:458 : QEMU_MONITOR_IO_WRITE: mon=0x7f7ac4000bd0 buf={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} len=92 ret=92 errno=2 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"id": "libvirt-7", "error": {"class": "GenericError", "desc": "Device 'drive-ide0-1-0' is locked"}} len=102 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.235+0000: 27004: error : qemuMonitorJSONCheckError:352 : internal error unable to execute QEMU command 'eject': Device 'drive-ide0-1-0' is locked 2013-05-17 09:04:46.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 9 2013-05-17 09:04:46.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 8 2013-05-17 09:04:47.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 7 2013-05-17 09:04:47.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 6 2013-05-17 09:04:48.186+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:48.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 5 2013-05-17 09:04:48.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 4 2013-05-17 09:04:49.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 3 2013-05-17 09:04:49.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 2 2013-05-17 09:04:50.236+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 1 2013-05-17 09:04:50.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0 2013-05-17 09:04:51.236+0000: 27004: error : qemuDomainChangeEjectableMedia:117 : operation failed: Unable to eject media before changing it 2013-05-17 09:04:51.236+0000: 27004: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockDiskDetach:282 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 disk=0x7f7ac4003690 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockManagerNew:128 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 withResources=0 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerPluginGetDriver:295 : plugin=0x7f7acc007c00 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerNew:319 : driver=0x7f7ae53dc5e0 type=0 nparams=5 params=0x7f7ad7d25830 flags=0 2013-05-17 09:04:51.236+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"timestamp": {"seconds": 1368781488, "microseconds": 186399}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "drive-ide0-1-0", "tray-open": true}} len=151
What we can see here is: 1) libvirt asks qemu to eject 2) qemu responds with "Device is locked" - which is correct as it emulates guest request 3) However, after a while, the guest decides it's now safe to eject the media (e.g. all reads have finished, and the device has been unmounted) which is done 4) [not in the pasted log] After a while, you've decided to try again and this time it succeeded, as the tray is already opened.
What we may do here is lifting the number of retries up. You can try yourself:
I have indeed increased retries to 20 and unfortunately that didn't work either, we can see in the log file that the DEVICE_TRAY_MOVED event received immediately after the final retry. After we doubled the retries, that would be more that enough time for the guest to eject the tray. So, maybe there is something wrong otherwhere.
Thanks, Wei
I think I know where the bug is. When waiting on tray to open, we don't unlock the domain. So later, when an event arrives, the first thing that the event handler does is locking the domain. Hence, it seems like the event is delivered right after we've failed waiting. That's just because after we've finished waiting, we've unlocked the domain. And I think same applies for the domain monitor. So can you please try this out: diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d037c9d..767d5a1 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -95,14 +95,20 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); + virObjectRef(vm); /* we don't want to report errors from media tray_open polling */ while (retries--) { if (origdisk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) break; + qemuDomainObjExitMonitor(driver, vm); + virObjectUnlock(vm); VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); usleep(500 * 1000); /* sleep 500ms */ + virObjectLock(vm); + qemuDomainObjEnterMonitor(driver, vm); } + virObjectUnref(vm); if (disk->src) { /* deliberately don't depend on 'ret' as 'eject' may have failed for the Michal

On 05/19/2013 05:51 PM, Michal Privoznik wrote:
On 18.05.2013 05:31, Li Wei wrote:
On 05/17, Michal Privoznik wrote:
On 17.05.2013 11:37, Li Wei wrote:
On 05/17/2013 04:52 PM, Michal Privoznik wrote:
On 17.05.2013 10:16, Li Wei wrote:
On 05/17/2013 04:07 PM, Michal Privoznik wrote: > On 17.05.2013 07:47, Li Wei wrote: >> Hi Michal, Daniel >> >> It seems there is something wrong with the 2/2 part of this patchset. >> When I do an "change-media" command in virsh, it doesn't do any better >> than before, but even worse(I must to wait 5 secs to see the error). >> >> I'm not family with libvirt, just add some log things in the qemu_hotplug.c >> and found the tray_status never change to open, but with michal's original >> patch (which do active poll on tray_status), I can do "change-media" successfully >> every time. > > Yes & no. Thing is - originally we ignored tray status in the guest. So > in case guest has locked the tray and thus ignored our eject request, we > have changed the media anyway, resulting in possible I/O errors within > guest. > > With my patch, we send eject request to the qemu, and actively poll for > the tray to open. We check for up to 10 times every 200ms. So guest has > 2 seconds to open the tray. If the guest doesn't eject media and open > the tray we don't proceed with forcibly changing the media. > > These patches are actually a bug fix for > https://bugzilla.redhat.com/show_bug.cgi?id=892289
Hmm... I think you misunderstood my meaning(sorry for my pool english), simply to say, I can get expected behavior by apply patch V1 but not V2.
In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but this never happened.
Thanks, Wei
Aaah, now I understand. So you are saying that the event is not being delivered. What's the qemu version?
I have the following version:
virsh # version Compiled against library: libvirt 1.0.5 Using library: libvirt 1.0.5 Using API: QEMU 1.0.5 Running hypervisor: QEMU 1.4.1
If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them? http://wiki.libvirt.org/page/DebugLogs
Or maybe the timeout is just short for your guest to eject the tray. So after a while, does 'virsh change-media' succeed?
Yes, I can see the DEVICE_TRAY_MOVED in the log file, it goes after the debug message "qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0". does that mean the timeout is too short?
If the guest needs too much time to eject, I'm wondering why when I use the v1 patch, I can get succeed in about 1 to 2 seconds. I doubt if there are other problems cause tray_status not updated in time?
I appended the log file for your reference, if you need any other information, please tell me.
Thanks, Wei
I am pasting some interesting knobs from the log:
2013-05-17 09:04:46.234+0000: 27004: debug : qemuMonitorSend:892 : QEMU_MONITOR_SEND_MSG: mon=0x7f7ac4000bd0 msg={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} fd=-1 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOWrite:458 : QEMU_MONITOR_IO_WRITE: mon=0x7f7ac4000bd0 buf={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} len=92 ret=92 errno=2 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"id": "libvirt-7", "error": {"class": "GenericError", "desc": "Device 'drive-ide0-1-0' is locked"}} len=102 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:46.235+0000: 27004: error : qemuMonitorJSONCheckError:352 : internal error unable to execute QEMU command 'eject': Device 'drive-ide0-1-0' is locked 2013-05-17 09:04:46.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 9 2013-05-17 09:04:46.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 8 2013-05-17 09:04:47.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 7 2013-05-17 09:04:47.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 6 2013-05-17 09:04:48.186+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0 2013-05-17 09:04:48.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 5 2013-05-17 09:04:48.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 4 2013-05-17 09:04:49.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 3 2013-05-17 09:04:49.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 2 2013-05-17 09:04:50.236+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 1 2013-05-17 09:04:50.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0 2013-05-17 09:04:51.236+0000: 27004: error : qemuDomainChangeEjectableMedia:117 : operation failed: Unable to eject media before changing it 2013-05-17 09:04:51.236+0000: 27004: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockDiskDetach:282 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 disk=0x7f7ac4003690 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockManagerNew:128 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 withResources=0 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerPluginGetDriver:295 : plugin=0x7f7acc007c00 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerNew:319 : driver=0x7f7ae53dc5e0 type=0 nparams=5 params=0x7f7ad7d25830 flags=0 2013-05-17 09:04:51.236+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"timestamp": {"seconds": 1368781488, "microseconds": 186399}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "drive-ide0-1-0", "tray-open": true}} len=151
What we can see here is: 1) libvirt asks qemu to eject 2) qemu responds with "Device is locked" - which is correct as it emulates guest request 3) However, after a while, the guest decides it's now safe to eject the media (e.g. all reads have finished, and the device has been unmounted) which is done 4) [not in the pasted log] After a while, you've decided to try again and this time it succeeded, as the tray is already opened.
What we may do here is lifting the number of retries up. You can try yourself:
I have indeed increased retries to 20 and unfortunately that didn't work either, we can see in the log file that the DEVICE_TRAY_MOVED event received immediately after the final retry. After we doubled the retries, that would be more that enough time for the guest to eject the tray. So, maybe there is something wrong otherwhere.
Thanks, Wei
I think I know where the bug is. When waiting on tray to open, we don't unlock the domain. So later, when an event arrives, the first thing that the event handler does is locking the domain. Hence, it seems like the event is delivered right after we've failed waiting. That's just because after we've finished waiting, we've unlocked the domain. And I think same applies for the domain monitor. So can you please try this out:
This patch works fine, thanks :) Tested-by: Li Wei <lw@cn.fujitsu.com> Thanks, Wei
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d037c9d..767d5a1 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -95,14 +95,20 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force);
+ virObjectRef(vm); /* we don't want to report errors from media tray_open polling */ while (retries--) { if (origdisk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) break;
+ qemuDomainObjExitMonitor(driver, vm); + virObjectUnlock(vm); VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); usleep(500 * 1000); /* sleep 500ms */ + virObjectLock(vm); + qemuDomainObjEnterMonitor(driver, vm); } + virObjectUnref(vm);
if (disk->src) { /* deliberately don't depend on 'ret' as 'eject' may have failed for the
Michal

On 05/19/2013 03:51 AM, Michal Privoznik wrote:
I think I know where the bug is. When waiting on tray to open, we don't unlock the domain. So later, when an event arrives, the first thing that the event handler does is locking the domain. Hence, it seems like the event is delivered right after we've failed waiting. That's just because after we've finished waiting, we've unlocked the domain. And I think same applies for the domain monitor. So can you please try this out:
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d037c9d..767d5a1 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -95,14 +95,20 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force);
+ virObjectRef(vm); /* we don't want to report errors from media tray_open polling */ while (retries--) { if (origdisk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) break;
+ qemuDomainObjExitMonitor(driver, vm); + virObjectUnlock(vm); VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); usleep(500 * 1000); /* sleep 500ms */
Yep, we absolutely must NOT usleep() while holding locks.
+ virObjectLock(vm); + qemuDomainObjEnterMonitor(driver, vm); } + virObjectUnref(vm);
ACK -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 20.05.2013 18:25, Eric Blake wrote:
On 05/19/2013 03:51 AM, Michal Privoznik wrote:
I think I know where the bug is. When waiting on tray to open, we don't unlock the domain. So later, when an event arrives, the first thing that the event handler does is locking the domain. Hence, it seems like the event is delivered right after we've failed waiting. That's just because after we've finished waiting, we've unlocked the domain. And I think same applies for the domain monitor. So can you please try this out:
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d037c9d..767d5a1 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -95,14 +95,20 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force);
+ virObjectRef(vm); /* we don't want to report errors from media tray_open polling */ while (retries--) { if (origdisk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) break;
+ qemuDomainObjExitMonitor(driver, vm); + virObjectUnlock(vm); VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries); usleep(500 * 1000); /* sleep 500ms */
Yep, we absolutely must NOT usleep() while holding locks.
+ virObjectLock(vm); + qemuDomainObjEnterMonitor(driver, vm); } + virObjectUnref(vm);
ACK
In fact, I am gonna be sending a slightly modified & more readable version in a little while. Michal
participants (6)
-
Daniel P. Berrange
-
Eric Blake
-
Li Wei
-
Li Wei
-
Michal Privoznik
-
Peter Krempa