
On Mon, 4 Jan 2016, Peter Krempa wrote:
On Thu, Dec 31, 2015 at 16:34:49 +1100, Michael Chapman wrote:
Wait for a block job event after the job has reached 100% only if exactly one of the BLOCK_JOB and BLOCK_JOB_2 callbacks were successfully registered.
If neither callback was registered, then no event will ever be received. If both were registered, then any user-supplied path is guaranteed to match one of the events.
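The three cases the commit message distinguishes can be sketched as small predicates. This is a sketch only: `cb_id` and `cb_id2` mirror the callback-ID fields the patch below tests, with -1 modelling a failed registration.

```c
#include <assert.h>
#include <stdbool.h>

/* Neither callback registered: no event will ever arrive, so polling
 * alone must decide completion. */
bool must_poll_only(int cb_id, int cb_id2)
{
    return cb_id < 0 && cb_id2 < 0;
}

/* Both callbacks registered: one of the two events is guaranteed to
 * match the user-supplied path, so we can rely on the event alone. */
bool event_guaranteed(int cb_id, int cb_id2)
{
    return cb_id >= 0 && cb_id2 >= 0;
}

/* Exactly one callback registered: wait for the event after 100%, but
 * keep the polling fallback with a bounded number of retries. */
bool wait_with_fallback(int cb_id, int cb_id2)
{
    return !must_poll_only(cb_id, cb_id2) && !event_guaranteed(cb_id, cb_id2);
}
```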
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
---
I have found that even a 2.5 second timeout isn't always sufficiently long for QEMU to flush a disk at the end of a block job.
I hope I've understood the code properly here, because as far as I can tell the comment I'm removing in this commit isn't right. The path the user supplies *has* to be either the <source file='name'/> or <target dev='name'/> exactly in order for the disk to be matched, and these are precisely the two strings used by the two events.
 tools/virsh-domain.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)
In addition to Andrea's review:
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index edbbc34..60de9ba 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -1942,18 +1942,20 @@ virshBlockJobWait(virshBlockJobWaitDataPtr data)
             goto cleanup;
         }
-        /* since virsh can't guarantee that the path provided by the user will
-         * later be matched in the event we will need to keep the fallback
-         * approach and claim success if the block job finishes or vanishes. */
-        if (result == 0)
-            break;
+        /* if either event could not be registered we can't guarantee that the
+         * path provided by the user will be matched, so keep the fallback
+         * approach and claim success if the block job finishes or vanishes */
The new statement is not true. If the user provides a filesystem path different from what is in the definition but matching the same volume [1], the callback will still report the path present in the configuration, which may never compare equal, and the job would hang forever.
[1] e.g. /path/to/image and /path/to/../to/image are equivalent but won't be equal using strcmp. The problem is even more prominent if you mix in some symlinks.
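Peter's point can be demonstrated with plain POSIX calls. Here `same_file` is a hypothetical helper, not anything in libvirt: it compares two paths by the file they resolve to (device plus inode) rather than by spelling, which is exactly what a byte-wise string comparison cannot do.

```c
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: return 1 if both paths resolve to the same
 * file (same device and inode), 0 if not, -1 if either fails to
 * resolve. strcmp() on the raw strings would miss this equivalence. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) < 0 || stat(b, &sb) < 0)
        return -1; /* at least one path does not resolve */

    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, `/dev/null` and `/dev/../dev/null` are unequal as strings but name the same inode.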
As far as I can tell, the block job won't even be able to be identified if the user specifies a different path than the one in the domain XML.

At present, the only implementation of the virDomainGetBlockJobInfo API is qemuDomainGetBlockJobInfo. The disk definition is found in the call chain:

  qemuDomainGetBlockJobInfo -> virDomainDiskByName -> virDomainDiskIndexByName

and virDomainDiskIndexByName only does STREQ tests to match the supplied path against the <source> or <target> elements. So at present, the disk path supplied by the user of virDomainGetBlockJobInfo has to be *exactly* the source path or the target name, and these are precisely the two strings used in the two events.

The only safe way to allow different-but-equivalent paths to match the one disk definition would be to record the device and inode numbers in the disk definition. We can't simply follow symlinks to resolve the path, since the symlinks could have changed since the domain was started -- e.g. /path/to/../to/image may now be equivalent to /path/to/image, but /path/to/image may not be the same file it was when the domain was started.

So on that basis I think we have to settle for requiring the virDomainGetBlockJobInfo path to match the disk definition exactly.
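A rough model of that lookup may make the exact-match behaviour concrete. This is a simplified sketch, not the actual libvirt code: `disk_def` and `disk_index_by_name` are hypothetical stand-ins for the real disk definition and virDomainDiskIndexByName, keeping only the STREQ logic described above.

```c
#include <string.h>

#define STREQ(a, b) (strcmp((a), (b)) == 0)

/* Simplified model of a disk definition. */
struct disk_def {
    const char *src;  /* <source file='...'/>, may be NULL */
    const char *dst;  /* <target dev='...'/> */
};

/* Mimics virDomainDiskIndexByName's matching: the supplied name must
 * equal the target dev or the source path byte-for-byte. Returns the
 * disk's index, or -1 if nothing matches exactly. */
int disk_index_by_name(const struct disk_def *disks, int ndisks,
                       const char *name)
{
    for (int i = 0; i < ndisks; i++) {
        if (STREQ(disks[i].dst, name) ||
            (disks[i].src && STREQ(disks[i].src, name)))
            return i;
    }
    return -1;
}
```

Note that an equivalent-but-differently-spelled source path falls straight through to the -1 case, which is the point of the whole discussion.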
+        if (data->cb_id < 0 || data->cb_id2 < 0) {
+            if (result == 0)
+                break;
-        /* for two-phase jobs we will try to wait in the synchronized phase
-         * for event arrival since 100% completion doesn't necessarily mean that
-         * the block job has finished and can be terminated with success */
-        if (info.end == info.cur && --retries == 0) {
-            ret = VIR_DOMAIN_BLOCK_JOB_READY;
-            goto cleanup;
+            /* only wait in the synchronized phase for event arrival if
+             * either callback was registered */
+            if (info.end == info.cur &&
+                ((data->cb_id < 0 && data->cb_id2 < 0) || --retries == 0)) {
+                ret = VIR_DOMAIN_BLOCK_JOB_READY;
+                goto cleanup;
+            }
         }
NACK to this approach. If there were a way this ugly stuff could be avoided or deleted, I'd have done so already.
Peter
OK. At any rate, I do think 2.5 seconds is not enough. On a hypervisor with a large amount of memory, I was able to generate block jobs that would take 10-15 seconds to fully flush out to disk.

- Michael