On 09/05/2016 07:48 AM, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1372613
Apparently, some management applications use the following code
pattern when waiting for a block job to finish:
while (1) {
virDomainGetBlockJobInfo(dom, disk, info, flags);
if (info.cur == info.end)
break;
sleep(1);
}
Problem with this approach is in its corner cases. In case of
QEMU, libvirt merely pass what has been reported on the monitor.
However, if the block job hasn't started yet, qemu reports cur ==
end == 0 which tricks mgmt apps into thinking job is complete.
The solution is to mangle cur/end values as described here [1].
1:
https://www.redhat.com/archives/libvir-list/2016-September/msg00017.html
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
src/libvirt-domain.c | 7 +++++++
src/qemu/qemu_driver.c | 12 ++++++++++++
2 files changed, 19 insertions(+)
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
index 46f0318..fa82e49 100644
--- a/src/libvirt-domain.c
+++ b/src/libvirt-domain.c
@@ -9896,6 +9896,13 @@ virDomainBlockJobAbort(virDomainPtr dom, const char *disk,
* can be found by calling virDomainGetXMLDesc() and inspecting
* elements within //domain/devices/disk.
*
+ * As a corner case underlying hypervisor may report cur == 0 and
+ * end == 0 when the block job hasn't been started yet. In this
+ * case libvirt reports cur = 0 and end = 1. However, hypervisor
+ * may return cur == 0 and end == 0 if the block job has finished
+ * and was no-op. In this case libvirt reports cur = 1 and end = 1.
+ * Since 2.3.0.
+ *
* Returns -1 in case of failure, 0 when nothing found, 1 when info was found.
*/
int
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 4535eb8..df0d916 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16338,6 +16338,18 @@ qemuBlockJobInfoTranslate(qemuMonitorBlockJobInfoPtr rawInfo,
info->cur = rawInfo->cur;
info->end = rawInfo->end;
+ /* Fix job completeness reporting. If cur == end mgmt
+ * applications think job is completed. Except when both cur
+ * and end are zero, in which case qemu hasn't started the
+ * job yet. */
+ if (!info->cur && !info->end) {
+ if (rawInfo->ready > 0) {
+ info->cur = info->end = 1;
+ } else if (rawInfo->ready < 0) {
+ info->end = 1;
+ }
Can info->ready == 0 ? w/ info->cur = info->end = 0
If so, then we're in the same mess or some other weird condition.
Seems like "ready" will be set in qemu during block_job_event_ready, so
that would say to me that as long as the structure is allocated, ready
will be false and conceivably info->cur = info->end = 0.
Wouldn't that mean the < 0 should be <= 0
ACK (both patches) in principle, just need a clarification of what's
being seen...
John
+ }
+
info->type = rawInfo->type;
if (info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT &&
disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT)