
On 04/13/2012 01:29 AM, Michal Privoznik wrote:
On 12.04.2012 21:59, Eric Blake wrote:
I'm tired of shell-scripting to wait for completion of a block pull, when virsh can be taught to do the same. I couldn't quite reuse vshWatchJob, as this is not a case of a long-running command where a second thread must be used to probe job status (at least, not unless I make virsh start doing blocking waits for an event to fire), but it served as inspiration for my simpler single-threaded loop. There is up to a half-second delay between sending SIGINT and the job being aborted, but I didn't think it worth the complexity of a second thread and use of poll() just to minimize that delay.
@@ -7295,8 +7295,9 @@ repoll: }
GETTIMEOFDAY(&curr); - if ( timeout && ((int)(curr.tv_sec - start.tv_sec) * 1000 + \ - (int)(curr.tv_usec - start.tv_usec) / 1000) > timeout * 1000 ) { + if (timeout && (((int)(curr.tv_sec - start.tv_sec) * 1000 + + (int)(curr.tv_usec - start.tv_usec) / 1000) > + timeout * 1000)) { /* suspend the domain when migration timeouts. */ vshDebug(ctl, VSH_ERR_DEBUG, "%s timeout", label); if (timeout_func)
These two chunks are rather cosmetic but I'm okay with having them in this patch not a separate one.
I noticed them because I was copying and pasting from them. Depending on whether I spot other cleanups for my v2, I may split the cleanups into a separate patch.
+ while (blocking) { + virDomainBlockJobInfo info; + int result = virDomainGetBlockJobInfo(dom, path, &info, 0); + + if (result < 0) { + vshError(ctl, _("failed to query job for disk %s"), path); + goto cleanup; + } + if (result == 0) + break; + + if (verbose) + print_job_progress(_("Block Pull"), info.end - info.cur, info.end); + + GETTIMEOFDAY(&curr); + if (intCaught || (timeout && + (((int)(curr.tv_sec - start.tv_sec) * 1000 + + (int)(curr.tv_usec - start.tv_usec) / 1000) > + timeout * 1000))) { + vshDebug(ctl, VSH_ERR_DEBUG, + intCaught ? "interrupted" : "timeout"); + intCaught = 0; + timeout = 0; + quit = true; + if (virDomainBlockJobAbort(dom, path, 0) < 0) { + vshError(ctl, _("failed to abort job for disk %s"), path); + goto cleanup; + }
Don't we need blocking = false; here? Otherwise we may spin another rounds and issue GetBlockJobInfo API over and over again. But on the other hand, I can imagine this is desired behavior. If BlockJobAbort succeeds and returns 0 (we are enforcing synchronous operation here) it should mean that block job has really been aborted. Therefore we may issue GetBlockJobInfo to ascertain ourselves.
Hmm, interesting question, especially in light of the new async cancel flag. I'm not sure how long a cancellation can take (it has to flush all pending I/O, which may take a while), but you are correct that passing in 0 for the abort flags means to wait until the cancel is done; we may be waiting up to half a second to issue the abort from the time you pressed Ctrl-C, but worse, we are now waiting for the entire duration of the BlockJobAbort call, which might possibly be in the range of seconds, and that can feel awfully slow to a user trying to hit Ctrl-C to regain control of their terminal as fast as possible. I think I need to add an --async flag that controls whether we return control to the user as fast as possible after Ctrl-C (and fails if we are talking to a too-old libvirtd), or omit the flag so that we guarantee to wait until the job is cancelled, even if it takes a while after the Ctrl-C. I'll have to respin this patch, and since I want to do the same for my new 'virsh blockcopy' patch, I'll fold it into my block storage migration series. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org