QEMU does not abandon the mirror. The job carries on in the synchronised
phase and it might be either pivoted again or cancelled. The commit
hints that the described behavior was happening in a downstream version.
If the command returns false there are two possible options:
1) qemu did not reach the point where it would ask the block job to
pivot
2) pivoting failed in the actual qemu coroutine
If either of those happens, we return failure and reset the condition
that waits for the block job to complete. This makes the API fail, but
in the case where qemu actually abandons the mirror, that fact is
reported via the event and handled asynchronously.
Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1202704
---
Notes:
I've spent some time looking at how the active commit and copy jobs actually
work in qemu, but I did not check whether that behavior changed in the upstream
releases. At any rate, the behavior makes sense, so I expect it has been there
all along.
Version 2:
- this version resets the flag that makes libvirt wait on the event. This should
make the API as robust as it can possibly be.
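
For reviewers, the failure path described above can be sketched as a small
standalone model. This is only an illustration of the control flow in the diff
below, not libvirt code: struct disk_model, model_pivot() and
model_abort_pivot() are hypothetical stand-ins for the disk definition,
qemuDomainBlockPivot() and the pivot branch of qemuDomainBlockJobAbort(), and
monitor_ret stands in for the result of the monitor command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libvirt disk state; not the real types. */
enum mirror_state { MIRROR_NONE, MIRROR_READY, MIRROR_PIVOT };

struct disk_model {
    enum mirror_state mirror_state; /* models disk->mirrorState */
    bool block_job_sync;            /* models disk->blockJobSync */
    bool has_mirror;                /* models disk->mirror != NULL */
};

/* Models the failure branch of qemuDomainBlockPivot() after this patch.
 * monitor_ret < 0 stands for a failed pivot command (assumption). */
static int model_pivot(struct disk_model *disk, int monitor_ret)
{
    assert(disk != NULL && disk->has_mirror);

    disk->mirror_state = MIRROR_PIVOT;
    if (monitor_ret < 0) {
        /* The job stays in the synchronised phase: keep the mirror and
         * only undo the state change made above. */
        disk->mirror_state = MIRROR_READY;
        return -1;
    }
    return 0;
}

/* Models the pivot branch of qemuDomainBlockJobAbort() as shown in the
 * diff. *waited records whether the "waitjob" path would be taken. */
static int model_abort_pivot(struct disk_model *disk, int monitor_ret,
                             bool async, bool *waited)
{
    int ret = model_pivot(disk, monitor_ret);

    *waited = false;
    if (ret < 0 && async) {
        disk->block_job_sync = false; /* stop waiting on the event */
        return ret;                   /* corresponds to "goto endjob" */
    }
    *waited = true;                   /* corresponds to "goto waitjob" */
    return ret;
}
```

With a failing monitor_ret, the model keeps the mirror, leaves the state at
READY and clears the wait flag, so the job can be pivoted again or cancelled.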
src/qemu/qemu_driver.c | 27 ++++++---------------------
1 file changed, 6 insertions(+), 21 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2dd8ed4..52c3587 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16128,27 +16128,10 @@ qemuDomainBlockPivot(virQEMUDriverPtr driver,
}
if (ret < 0) {
- /* On failure, qemu abandons the mirror, and reverts back to
- * the source disk (RHEL 6.3 has a bug where the revert could
- * cause catastrophic failure in qemu, but we don't need to
- * worry about it here as it is not an upstream qemu problem. */
- /* XXX should we be parsing the exact qemu error, or calling
- * 'query-block', to see what state we really got left in
- * before killing the mirroring job?
- * XXX We want to revoke security labels and disk lease, as
- * well as audit that revocation, before dropping the original
- * source. But it gets tricky if both source and mirror share
- * common backing files (we want to only revoke the non-shared
- * portion of the chain); so for now, we leak the access to
- * the original. */
- virStorageSourceFree(disk->mirror);
- disk->mirror = NULL;
- disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE;
- disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN;
- disk->blockjob = false;
+ /* The pivot failed. The block job in QEMU remains in the synchronised
+ * phase. Reset the state we changed and return the error to the user */
+ disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY;
}
- if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0)
- ret = -1;
cleanup:
if (oldsrc)
@@ -16354,8 +16337,10 @@ qemuDomainBlockJobAbort(virDomainPtr dom,
if (disk->mirror && (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) {
ret = qemuDomainBlockPivot(driver, vm, device, disk);
- if (ret < 0 && async)
+ if (ret < 0 && async) {
+ disk->blockJobSync = false;
goto endjob;
+ }
goto waitjob;
}
if (disk->mirror) {
--
2.2.2