[libvirt] [PATCH] virsh: report 0-length active block-commit job status

At least with live block commit, it is possible to have a block job that reports 0 status: namely, when the active image contains no sectors that differ from the backing image it is being committed into [1]. I'm not sure if that represents a qemu bug, but it leads to weird virsh output where 'virsh blockjob $dom vda' has no output during a no-op commit job. It appears that the special case for a zero total was first introduced for migration, where it does sort of make sense (when we do storage migration, the job is broken up into two pieces where the first half of migrating storage has no clue what the total length of the second phase will be, and where qemu migration always reports a non-zero total length but only once we complete the first phase to start actual migration), but it doesn't seem to make sense for any of the block jobs. [1] https://www.redhat.com/archives/libvir-list/2015-January/msg00348.html * tools/virsh-domain.c (vshPrintJobProgress): Move special-case of 0 total... (vshWatchJob): ...into lone caller where it matters. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 6733cfa..ec62aae 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1,7 +1,7 @@ /* * virsh-domain.c: Commands to manage domain * - * Copyright (C) 2005, 2007-2014 Red Hat, Inc. + * Copyright (C) 2005, 2007-2015 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -1678,10 +1678,6 @@ vshPrintJobProgress(const char *label, unsigned long long remaining, { int progress; - if (total == 0) - /* migration has not been started */ - return; - if (remaining == 0) { /* migration has completed */ progress = 100; @@ -4189,7 +4185,7 @@ vshWatchJob(vshControl *ctl, ret = virDomainGetJobInfo(dom, &jobinfo); pthread_sigmask(SIG_SETMASK, &oldsigmask, NULL); if (ret == 0) { - if (verbose) + if (verbose && jobinfo.dataTotal) vshPrintJobProgress(label, jobinfo.dataRemaining, jobinfo.dataTotal); -- 2.1.0

On 01/12/2015 05:54 PM, Eric Blake wrote:
At least with live block commit, it is possible to have a block job that reports 0 status: namely, when the active image contains no sectors that differ from the backing image it is being committed into [1]. I'm not sure if that represents a qemu bug, but it leads to weird virsh output where 'virsh blockjob $dom vda' has no output during a no-op commit job. It appears that the special case for a zero total was first introduced for migration, where it does sort of make sense (when we do storage migration, the job is broken up into two pieces where the first half of migrating storage has no clue what the total length of the second phase will be, and where qemu migration always reports a non-zero total length but only once we complete the first phase to start actual migration), but it doesn't seem to make sense for any of the block jobs.
[1] https://www.redhat.com/archives/libvir-list/2015-January/msg00348.html
* tools/virsh-domain.c (vshPrintJobProgress): Move special-case of 0 total... (vshWatchJob): ...into lone caller where it matters.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-domain.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 6733cfa..ec62aae 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1,7 +1,7 @@ /* * virsh-domain.c: Commands to manage domain * - * Copyright (C) 2005, 2007-2014 Red Hat, Inc. + * Copyright (C) 2005, 2007-2015 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -1678,10 +1678,6 @@ vshPrintJobProgress(const char *label, unsigned long long remaining, { int progress;
- if (total == 0) - /* migration has not been started */ - return; -
Would it be necessarily true that remaining was zero at this point? Because if it wasn't then, the else condition will divide by zero if total == 0... More than 1 caller to this function... Perhaps safer to say if "remaining == 0 || total == 0"? I was just reviewing Michal's patch from last week: http://www.redhat.com/archives/libvir-list/2015-January/msg00230.html where it seems a zero could imply some sort of failure. If you did the total == 0 check, then the && jobinfo.dataTotal isn't necessary... I would suppose that an error would mean we're 100% done... John
if (remaining == 0) { /* migration has completed */ progress = 100; @@ -4189,7 +4185,7 @@ vshWatchJob(vshControl *ctl, ret = virDomainGetJobInfo(dom, &jobinfo); pthread_sigmask(SIG_SETMASK, &oldsigmask, NULL); if (ret == 0) { - if (verbose) + if (verbose && jobinfo.dataTotal) vshPrintJobProgress(label, jobinfo.dataRemaining, jobinfo.dataTotal);

On 01/19/2015 03:01 PM, John Ferlan wrote:
Revisiting, now that the release is done.
On 01/12/2015 05:54 PM, Eric Blake wrote:
At least with live block commit, it is possible to have a block job that reports 0 status: namely, when the active image contains no sectors that differ from the backing image it is being committed into [1]. I'm not sure if that represents a qemu bug, but it leads to weird virsh output where 'virsh blockjob $dom vda' has no output during a no-op commit job. It appears that the special case for a zero total was first introduced for migration, where it does sort of make sense (when we do storage migration, the job is broken up into two pieces where the first half of migrating storage has no clue what the total length of the second phase will be, and where qemu migration always reports a non-zero total length but only once we complete the first phase to start actual migration), but it doesn't seem to make sense for any of the block jobs.
@@ -1678,10 +1678,6 @@ vshPrintJobProgress(const char *label, unsigned long long remaining, { int progress;
- if (total == 0) - /* migration has not been started */ - return; -
Would it be necessarily true that remaining was zero at this point?
I've never seen a case where qemu (and thus libvirt) reported remaining
total.
Because if it wasn't then, the else condition will divide by zero if total == 0... More than 1 caller to this function...
So I think we are safe in avoiding the divide by 0 potential. Of the multiple callers, many are related to block jobs (where 0 status is likely synonymous with no work to do) and the only caller I modified (migration) is where 0 status means not yet started.
Perhaps safer to say if "remaining == 0 || total == 0"?
I was just reviewing Michal's patch from last week:
http://www.redhat.com/archives/libvir-list/2015-January/msg00230.html
where it seems a zero could imply some sort of failure. If you did the total == 0 check, then the && jobinfo.dataTotal isn't necessary... I would suppose that an error would mean we're 100% done...
I'm also debating whether qemu has a bug for reporting 0 (it would be nicer if it always reserved 0 for not started, and non-zero for completed) - but how would we tell the difference between a fixed and broken qemu? -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 01/27/2015 12:48 PM, Eric Blake wrote:
On 01/19/2015 03:01 PM, John Ferlan wrote:
Revisiting, now that the release is done.
On 01/12/2015 05:54 PM, Eric Blake wrote:
At least with live block commit, it is possible to have a block job that reports 0 status: namely, when the active image contains no sectors that differ from the backing image it is being committed into [1]. I'm not sure if that represents a qemu bug, but it leads to weird virsh output where 'virsh blockjob $dom vda' has no output during a no-op commit job. It appears that the special case for a zero total was first introduced for migration, where it does sort of make sense (when we do storage migration, the job is broken up into two pieces where the first half of migrating storage has no clue what the total length of the second phase will be, and where qemu migration always reports a non-zero total length but only once we complete the first phase to start actual migration), but it doesn't seem to make sense for any of the block jobs.
@@ -1678,10 +1678,6 @@ vshPrintJobProgress(const char *label, unsigned long long remaining, { int progress;
- if (total == 0) - /* migration has not been started */ - return; -
Would it be necessarily true that remaining was zero at this point?
I've never seen a case where qemu (and thus libvirt) reported remaining
total.
Because if it wasn't then, the else condition will divide by zero if total == 0... More than 1 caller to this function...
So I think we are safe in avoiding the divide by 0 potential. Of the multiple callers, many are related to block jobs (where 0 status is likely synonymous with no work to do) and the only caller I modified (migration) is where 0 status means not yet started.
What about the following? - int progress; - - if (total == 0) - /* migration has not been started */ - return; + int progress = 0; if (remaining == 0) { /* migration has completed */ progress = 100; - } else { + } else if (remaining > 0) { That way if this is called by "others" which use "... info.end - info.cur, info.end);", we avoid the chance that info.end == 0 and there's a divide by zero. The remaining == 0 still means we're done, the remaining < 0 means perhaps we haven't started (no progress). The only question I'd have is how much of an infinite loop this could be if total never got > 0... John
Perhaps safer to say if "remaining == 0 || total == 0"?
I was just reviewing Michal's patch from last week:
http://www.redhat.com/archives/libvir-list/2015-January/msg00230.html
where it seems a zero could imply some sort of failure. If you did the total == 0 check, then the && jobinfo.dataTotal isn't necessary... I would suppose that an error would mean we're 100% done...
I'm also debating whether qemu has a bug for reporting 0 (it would be nicer if it always reserved 0 for not started, and non-zero for completed) - but how would we tell the difference between a fixed and broken qemu?
participants (2)
-
Eric Blake
-
John Ferlan