[libvirt] [PATCH 0/3] qemu: fix broken block job handling

Block job handling violates our usage of domain jobs and changes disk source definition behind our back. Peter Krempa (3): qemu: process: Export qemuProcessFindDomainDiskByAlias qemu: event: Don't fiddle with disk backing trees without a job qemu: Disallow concurrent block jobs on a single disk src/conf/domain_conf.h | 1 + src/qemu/qemu_domain.c | 23 +++++++ src/qemu/qemu_domain.h | 4 ++ src/qemu/qemu_driver.c | 170 +++++++++++++++++++++++++++++++++++++++++++----- src/qemu/qemu_process.c | 131 +++++++------------------------------ src/qemu/qemu_process.h | 3 + 6 files changed, 211 insertions(+), 121 deletions(-) -- 2.2.2

--- src/qemu/qemu_process.c | 2 +- src/qemu/qemu_process.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d1f089d..28c3c27 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -390,7 +390,7 @@ qemuProcessFindDomainDiskByPath(virDomainObjPtr vm, return NULL; } -static virDomainDiskDefPtr +virDomainDiskDefPtr qemuProcessFindDomainDiskByAlias(virDomainObjPtr vm, const char *alias) { diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index 5c70803..3c04179 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -112,4 +112,7 @@ int qemuProcessSPICEAllocatePorts(virQEMUDriverPtr driver, virDomainGraphicsDefPtr graphics, bool allocate); +virDomainDiskDefPtr qemuProcessFindDomainDiskByAlias(virDomainObjPtr vm, + const char *alias); + #endif /* __QEMU_PROCESS_H__ */ -- 2.2.2

On 03/13/2015 10:25 AM, Peter Krempa wrote:
--- src/qemu/qemu_process.c | 2 +- src/qemu/qemu_process.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-)
ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Surprisingly we did not grab a VM job when a block job finished and we'd happily rewrite the backing chain data. This made it possible to crash libvirt when queueing two backing chains tightly and other badness. To fix it, add yet another handler to the helper thread that handles monitor events that require a job. --- src/qemu/qemu_domain.h | 2 + src/qemu/qemu_driver.c | 142 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 129 ++++++++----------------------------------- 3 files changed, 168 insertions(+), 105 deletions(-) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index fe3e2b1..a7ebb47 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -196,6 +196,7 @@ typedef enum { QEMU_PROCESS_EVENT_DEVICE_DELETED, QEMU_PROCESS_EVENT_NIC_RX_FILTER_CHANGED, QEMU_PROCESS_EVENT_SERIAL_CHANGED, + QEMU_PROCESS_EVENT_BLOCK_JOB, QEMU_PROCESS_EVENT_LAST } qemuProcessEventType; @@ -204,6 +205,7 @@ struct qemuProcessEvent { virDomainObjPtr vm; qemuProcessEventType eventType; int action; + int status; void *data; }; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index b3263ac..1aed55f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4448,6 +4448,141 @@ processSerialChangedEvent(virQEMUDriverPtr driver, } +static void +processBlockJobEvent(virQEMUDriverPtr driver, + virDomainObjPtr vm, + char *diskAlias, + int type, + int status) +{ + virObjectEventPtr event = NULL; + virObjectEventPtr event2 = NULL; + const char *path; + virDomainDiskDefPtr disk; + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + virDomainDiskDefPtr persistDisk = NULL; + bool save = false; + + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + VIR_DEBUG("Domain is not running"); + goto endjob; + } + + disk = qemuProcessFindDomainDiskByAlias(vm, diskAlias); + + if (disk) { + /* Have to generate two variants of the event for old vs. new + * client callbacks */ + if (type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && + disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) + type = disk->mirrorJob; + path = virDomainDiskGetSource(disk); + event = virDomainEventBlockJobNewFromObj(vm, path, type, status); + event2 = virDomainEventBlockJob2NewFromObj(vm, disk->dst, type, + status); + + /* If we completed a block pull or commit, then update the XML + * to match. */ + switch ((virConnectDomainEventBlockJobStatus) status) { + case VIR_DOMAIN_BLOCK_JOB_COMPLETED: + if (disk->mirrorState == VIR_DOMAIN_DISK_MIRROR_STATE_PIVOT) { + if (vm->newDef) { + int indx = virDomainDiskIndexByName(vm->newDef, disk->dst, + false); + virStorageSourcePtr copy = NULL; + + if (indx >= 0) { + persistDisk = vm->newDef->disks[indx]; + copy = virStorageSourceCopy(disk->mirror, false); + if (virStorageSourceInitChainElement(copy, + persistDisk->src, + true) < 0) { + VIR_WARN("Unable to update persistent definition " + "on vm %s after block job", + vm->def->name); + virStorageSourceFree(copy); + copy = NULL; + persistDisk = NULL; + } + } + if (copy) { + virStorageSourceFree(persistDisk->src); + persistDisk->src = copy; + } + } + + /* XXX We want to revoke security labels and disk + * lease, as well as audit that revocation, before + * dropping the original source. But it gets tricky + * if both source and mirror share common backing + * files (we want to only revoke the non-shared + * portion of the chain); so for now, we leak the + * access to the original. */ + virStorageSourceFree(disk->src); + disk->src = disk->mirror; + } else { + virStorageSourceFree(disk->mirror); + } + + /* Recompute the cached backing chain to match our + * updates. Better would be storing the chain ourselves + * rather than reprobing, but we haven't quite completed + * that conversion to use our XML tracking. */ + disk->mirror = NULL; + save = disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_NONE; + disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE; + disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; + ignore_value(qemuDomainDetermineDiskChain(driver, vm, disk, + true, true)); + break; + + case VIR_DOMAIN_BLOCK_JOB_READY: + disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY; + save = true; + break; + + case VIR_DOMAIN_BLOCK_JOB_FAILED: + case VIR_DOMAIN_BLOCK_JOB_CANCELED: + virStorageSourceFree(disk->mirror); + disk->mirror = NULL; + disk->mirrorState = status == VIR_DOMAIN_BLOCK_JOB_FAILED ? + VIR_DOMAIN_DISK_MIRROR_STATE_ABORT : VIR_DOMAIN_DISK_MIRROR_STATE_NONE; + disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; + save = true; + break; + + case VIR_DOMAIN_BLOCK_JOB_LAST: + break; + } + } + + if (save) { + if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) + VIR_WARN("Unable to save status on vm %s after block job", + vm->def->name); + if (persistDisk && virDomainSaveConfig(cfg->configDir, + vm->newDef) < 0) + VIR_WARN("Unable to update persistent definition on vm %s " + "after block job", vm->def->name); + } + virObjectUnlock(vm); + virObjectUnref(cfg); + + if (event) + qemuDomainEventQueue(driver, event); + if (event2) + qemuDomainEventQueue(driver, event2); + + endjob: + qemuDomainObjEndJob(driver, vm); + cleanup: + VIR_FREE(diskAlias); +} + + static void qemuProcessEventHandler(void *data, void *opaque) { struct qemuProcessEvent *processEvent = data; @@ -4474,6 +4609,13 @@ static void qemuProcessEventHandler(void *data, void *opaque) case QEMU_PROCESS_EVENT_SERIAL_CHANGED: processSerialChangedEvent(driver, vm, processEvent->data, processEvent->action); + break; + case QEMU_PROCESS_EVENT_BLOCK_JOB: + processBlockJobEvent(driver, vm, + processEvent->data, + processEvent->action, + processEvent->status); + break; case QEMU_PROCESS_EVENT_LAST: break; } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 28c3c27..b841e8d 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1017,123 +1017,42 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED, void *opaque) { virQEMUDriverPtr driver = opaque; - virObjectEventPtr event = NULL; - virObjectEventPtr event2 = NULL; - const char *path; - virDomainDiskDefPtr disk; - virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); - virDomainDiskDefPtr persistDisk = NULL; - bool save = false; + struct qemuProcessEvent *processEvent = NULL; + char *data; virObjectLock(vm); - disk = qemuProcessFindDomainDiskByAlias(vm, diskAlias); - if (disk) { - /* Have to generate two variants of the event for old vs. new - * client callbacks */ - if (type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && - disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) - type = disk->mirrorJob; - path = virDomainDiskGetSource(disk); - event = virDomainEventBlockJobNewFromObj(vm, path, type, status); - event2 = virDomainEventBlockJob2NewFromObj(vm, disk->dst, type, - status); - - /* If we completed a block pull or commit, then update the XML - * to match. */ - switch ((virConnectDomainEventBlockJobStatus) status) { - case VIR_DOMAIN_BLOCK_JOB_COMPLETED: - if (disk->mirrorState == VIR_DOMAIN_DISK_MIRROR_STATE_PIVOT) { - if (vm->newDef) { - int indx = virDomainDiskIndexByName(vm->newDef, disk->dst, - false); - virStorageSourcePtr copy = NULL; - - if (indx >= 0) { - persistDisk = vm->newDef->disks[indx]; - copy = virStorageSourceCopy(disk->mirror, false); - if (virStorageSourceInitChainElement(copy, - persistDisk->src, - true) < 0) { - VIR_WARN("Unable to update persistent definition " - "on vm %s after block job", - vm->def->name); - virStorageSourceFree(copy); - copy = NULL; - persistDisk = NULL; - } - } - if (copy) { - virStorageSourceFree(persistDisk->src); - persistDisk->src = copy; - } - } - - /* XXX We want to revoke security labels and disk - * lease, as well as audit that revocation, before - * dropping the original source. But it gets tricky - * if both source and mirror share common backing - * files (we want to only revoke the non-shared - * portion of the chain); so for now, we leak the - * access to the original. */ - virStorageSourceFree(disk->src); - disk->src = disk->mirror; - } else { - virStorageSourceFree(disk->mirror); - } + VIR_DEBUG("Block job for device %s (domain: %p,%s) type %d status %d", + diskAlias, vm, vm->def->name, type, status); - /* Recompute the cached backing chain to match our - * updates. Better would be storing the chain ourselves - * rather than reprobing, but we haven't quite completed - * that conversion to use our XML tracking. */ - disk->mirror = NULL; - save = disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_NONE; - disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE; - disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; - ignore_value(qemuDomainDetermineDiskChain(driver, vm, disk, - true, true)); - break; - - case VIR_DOMAIN_BLOCK_JOB_READY: - disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY; - save = true; - break; + if (VIR_ALLOC(processEvent) < 0) + goto error; - case VIR_DOMAIN_BLOCK_JOB_FAILED: - case VIR_DOMAIN_BLOCK_JOB_CANCELED: - virStorageSourceFree(disk->mirror); - disk->mirror = NULL; - disk->mirrorState = status == VIR_DOMAIN_BLOCK_JOB_FAILED ? - VIR_DOMAIN_DISK_MIRROR_STATE_ABORT : VIR_DOMAIN_DISK_MIRROR_STATE_NONE; - disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; - save = true; - break; + processEvent->eventType = QEMU_PROCESS_EVENT_BLOCK_JOB; + if (VIR_STRDUP(data, diskAlias) < 0) + goto error; + processEvent->data = data; + processEvent->vm = vm; + processEvent->action = type; + processEvent->status = status; - case VIR_DOMAIN_BLOCK_JOB_LAST: - break; - } + virObjectRef(vm); + if (virThreadPoolSendJob(driver->workerPool, 0, processEvent) < 0) { + ignore_value(virObjectUnref(vm)); + goto error; } - if (save) { - if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) - VIR_WARN("Unable to save status on vm %s after block job", - vm->def->name); - if (persistDisk && virDomainSaveConfig(cfg->configDir, - vm->newDef) < 0) - VIR_WARN("Unable to update persistent definition on vm %s " - "after block job", vm->def->name); - } + cleanup: virObjectUnlock(vm); - virObjectUnref(cfg); - - if (event) - qemuDomainEventQueue(driver, event); - if (event2) - qemuDomainEventQueue(driver, event2); - return 0; + error: + if (processEvent) + VIR_FREE(processEvent->data); + VIR_FREE(processEvent); + goto cleanup; } + static int qemuProcessHandleGraphics(qemuMonitorPtr mon ATTRIBUTE_UNUSED, virDomainObjPtr vm, -- 2.2.2

On 03/13/2015 10:25 AM, Peter Krempa wrote:
Surprisingly we did not grab a VM job when a block job finished and we'd happily rewrite the backing chain data. This made it possible to crash libvirt when queueing two backing chains tightly and other badness.
My fault for violating the rule of 'no VM modifications without a job'. Thanks for finding and fixing this.
To fix it, add yet another handler to the helper thread that handles monitor events that require a job. --- src/qemu/qemu_domain.h | 2 + src/qemu/qemu_driver.c | 142 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 129 ++++++++----------------------------------- 3 files changed, 168 insertions(+), 105 deletions(-)
+processBlockJobEvent(virQEMUDriverPtr driver, + virDomainObjPtr vm, + char *diskAlias, + int type, + int status) +{ + virObjectEventPtr event = NULL; + virObjectEventPtr event2 = NULL; + const char *path; + virDomainDiskDefPtr disk; + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + virDomainDiskDefPtr persistDisk = NULL; + bool save = false; + + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + VIR_DEBUG("Domain is not running"); + goto endjob; + }
Hmm. I suspect we could hit the situation where the pivot after active block commit or block copy happens just before the guest quits, and thus where we fail to update the XML to record that the pivot happened and instead leave the XML in its old state. Probably not the end of the world (at that point, the amount of work done by the guest after pivot is not much, so restarting the guest from the unpivoted disk is hopefully not inconsistent); except that a guest shutting down might make the small modification of marking a file system clean that turns out to have a large impact on how the guest next starts if it starts from the wrong disk. I don't know if we can do any better, and at a minimum this is a strict improvement over what we had before, so I'm not going to reject the patch because of it, but it is food for thought. [Actually, we probably need the notion of persistent bitmaps, which looks like it might hit qemu 2.4, before we can guarantee that we KNOW when a pivot job completed or failed, and thus always update the XML correctly, based on the state of the persistent bitmap file]
virObjectLock(vm); - disk = qemuProcessFindDomainDiskByAlias(vm, diskAlias);
- if (disk) {
The old code was locking the VM, but modifying without a job.
+ processEvent->eventType = QEMU_PROCESS_EVENT_BLOCK_JOB; + if (VIR_STRDUP(data, diskAlias) < 0) + goto error; + processEvent->data = data; + processEvent->vm = vm; + processEvent->action = type; + processEvent->status = status;
- case VIR_DOMAIN_BLOCK_JOB_LAST: - break; - } + virObjectRef(vm); + if (virThreadPoolSendJob(driver->workerPool, 0, processEvent) < 0) { + ignore_value(virObjectUnref(vm)); + goto error;
The new code is now throwing things over the fence to a helper thread, so that blocking while waiting for the job is no longer holding up monitor interactions. Looks correct to me. ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Fri, 13 Mar 2015, Peter Krempa wrote:
Surprisingly we did not grab a VM job when a block job finished and we'd happily rewrite the backing chain data. This made it possible to crash libvirt when queueing two backing chains tightly and other badness.
To fix it, add yet another handler to the helper thread that handles monitor events that require a job.
Hi Peter, Unfortunately, I think this change has broken disk mirroring during migration. When a disk mirror is ready (or aborted), processBlockJobEvent blocks when beginning its MODIFY job because of the ongoing MIGRATION_OUT async job. I'd like to solve this by making use of blockJobSync from within qemuMigrationDriveMirror. This is also going to fix another bug where libvirt doesn't wait for QEMU to complete the drive mirroring before migrating a VM (leading to disk corruption if there's ongoing writes!) The plan would be to move qemuBlockJobEventProcess into a separate qemu_blockjob.c, so it can be used from both qemu_driver and qemu_migration, along with a few helper functions to set up blockJobSync and to wait for it to be notified. Do you think this would be a suitable approach? - Michael

On 08.04.2015 08:35, Michael Chapman wrote:
On Fri, 13 Mar 2015, Peter Krempa wrote:
Surprisingly we did not grab a VM job when a block job finished and we'd happily rewrite the backing chain data. This made it possible to crash libvirt when queueing two backing chains tightly and other badness.
To fix it, add yet another handler to the helper thread that handles monitor events that require a job.
Hi Peter,
Unfortunately, I think this change has broken disk mirroring during migration. When a disk mirror is ready (or aborted), processBlockJobEvent blocks when beginning its MODIFY job because of the ongoing MIGRATION_OUT async job.
Yes, I've already told Peter in person a week ago.
I'd like to solve this by making use of blockJobSync from within qemuMigrationDriveMirror. This is also going to fix another bug where libvirt doesn't wait for QEMU to complete the drive mirroring before migrating a VM (leading to disk corruption if there's ongoing writes!)
The plan would be to move qemuBlockJobEventProcess into a separate qemu_blockjob.c, so it can be used from both qemu_driver and qemu_migration, along with a few helper functions to set up blockJobSync and to wait for it to be notified. Do you think this would be a suitable approach?
Sounds good to me. Michal

While qemu may be prepared to do this libvirt is not. Forbid the block ops until we fix our code. --- src/conf/domain_conf.h | 1 + src/qemu/qemu_domain.c | 23 +++++++++++++++++++++++ src/qemu/qemu_domain.h | 2 ++ src/qemu/qemu_driver.c | 28 +++++++++++++--------------- 4 files changed, 39 insertions(+), 15 deletions(-) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index ea463cb..6f2df46 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -664,6 +664,7 @@ struct _virDomainDiskDef { int tray_status; /* enum virDomainDiskTray */ int removable; /* enum virTristateSwitch */ + bool blockjob; virStorageSourcePtr mirror; int mirrorState; /* enum virDomainDiskMirrorState */ int mirrorJob; /* virDomainBlockJobType */ diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d8a2087..ff4307b 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2756,6 +2756,29 @@ qemuDomainDetermineDiskChain(virQEMUDriverPtr driver, return ret; } + +bool +qemuDomainDiskBlockJobIsActive(virDomainDiskDefPtr disk) +{ + if (disk->mirror) { + virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' already in active block job"), + disk->dst); + + return true; + } + + if (disk->blockjob) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("disk '%s' already in active block job"), + disk->dst); + return true; + } + + return false; +} + + int qemuDomainUpdateDeviceList(virQEMUDriverPtr driver, virDomainObjPtr vm, diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index a7ebb47..41e075b 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -414,6 +414,8 @@ int qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo, ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4); +bool qemuDomainDiskBlockJobIsActive(virDomainDiskDefPtr disk); + void qemuDomObjEndAPI(virDomainObjPtr *vm); #endif /* __QEMU_DOMAIN_H__ */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1aed55f..864ee50 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4537,6 +4537,7 @@ processBlockJobEvent(virQEMUDriverPtr driver, disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; ignore_value(qemuDomainDetermineDiskChain(driver, vm, disk, true, true)); + disk->blockjob = false; break; case VIR_DOMAIN_BLOCK_JOB_READY: @@ -4552,6 +4553,7 @@ processBlockJobEvent(virQEMUDriverPtr driver, VIR_DOMAIN_DISK_MIRROR_STATE_ABORT : VIR_DOMAIN_DISK_MIRROR_STATE_NONE; disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; save = true; + disk->blockjob = false; break; case VIR_DOMAIN_BLOCK_JOB_LAST: @@ -16133,6 +16135,7 @@ qemuDomainBlockPivot(virConnectPtr conn, disk->mirror = NULL; disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE; disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; + disk->blockjob = false; } if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) ret = -1; @@ -16233,12 +16236,9 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, goto endjob; disk = vm->def->disks[idx]; - if (mode == BLOCK_JOB_PULL && disk->mirror) { - virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, - _("disk '%s' already in active block job"), - disk->dst); + if (mode == BLOCK_JOB_PULL && qemuDomainDiskBlockJobIsActive(disk)) goto endjob; - } + if (mode == BLOCK_JOB_ABORT) { if ((flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT) && !(async && disk->mirror)) { @@ -16322,6 +16322,8 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, if (mode == BLOCK_JOB_ABORT && disk->mirror) disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE; goto endjob; + } else if (mode == BLOCK_JOB_PULL) { + disk->blockjob = true; } waitjob: @@ -16573,12 +16575,8 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, if (!device) goto endjob; disk = vm->def->disks[idx]; - if (disk->mirror) { - virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, - _("disk '%s' already in active block job"), - disk->dst); + if (qemuDomainDiskBlockJobIsActive(disk)) goto endjob; - } if (!(virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_MIRROR) && virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_ASYNC))) { @@ -16699,6 +16697,7 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, disk->mirror = mirror; mirror = NULL; disk->mirrorJob = VIR_DOMAIN_BLOCK_JOB_TYPE_COPY; + disk->blockjob = true; if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) VIR_WARN("Unable to save status on vm %s after state change", @@ -16960,12 +16959,9 @@ qemuDomainBlockCommit(virDomainPtr dom, disk->dst); goto endjob; } - if (disk->mirror) { - virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, - _("disk '%s' already in active block job"), - disk->dst); + + if (qemuDomainDiskBlockJobIsActive(disk)) goto endjob; - } if (qemuDomainDetermineDiskChain(driver, vm, disk, false, true) < 0) goto endjob; @@ -17089,6 +17085,8 @@ qemuDomainBlockCommit(virDomainPtr dom, goto endjob; } + disk->blockjob = true; + if (mirror) { if (ret == 0) { virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); -- 2.2.2

On 03/13/2015 10:25 AM, Peter Krempa wrote:
While qemu may be prepared to do this libvirt is not. Forbid the block ops until we fix our code. --- src/conf/domain_conf.h | 1 + src/qemu/qemu_domain.c | 23 +++++++++++++++++++++++ src/qemu/qemu_domain.h | 2 ++ src/qemu/qemu_driver.c | 28 +++++++++++++--------------- 4 files changed, 39 insertions(+), 15 deletions(-)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index ea463cb..6f2df46 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -664,6 +664,7 @@ struct _virDomainDiskDef { int tray_status; /* enum virDomainDiskTray */ int removable; /* enum virTristateSwitch */
+ bool blockjob;
Might be worth a FIXME comment acknowledging that a bool will be insufficient when we update our code to support parallel jobs.
+ +bool +qemuDomainDiskBlockJobIsActive(virDomainDiskDefPtr disk) +{ + if (disk->mirror) { + virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' already in active block job"), + disk->dst); + + return true; + } + + if (disk->blockjob) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("disk '%s' already in active block job"), + disk->dst); + return true; + }
Shorter as 'if (disk->mirror || disk->blockjob)', up to you if you want to compress the common code. ACK. Looks like you have correctly locked out blockpull, blockcopy, and blockcommit as the three jobs that are all mutually exclusive according to our current abilities. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Fri, Mar 13, 2015 at 13:36:13 -0600, Eric Blake wrote:
On 03/13/2015 10:25 AM, Peter Krempa wrote:
While qemu may be prepared to do this libvirt is not. Forbid the block ops until we fix our code. --- src/conf/domain_conf.h | 1 + src/qemu/qemu_domain.c | 23 +++++++++++++++++++++++ src/qemu/qemu_domain.h | 2 ++ src/qemu/qemu_driver.c | 28 +++++++++++++--------------- 4 files changed, 39 insertions(+), 15 deletions(-)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index ea463cb..6f2df46 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -664,6 +664,7 @@ struct _virDomainDiskDef { int tray_status; /* enum virDomainDiskTray */ int removable; /* enum virTristateSwitch */
+ bool blockjob;
Might be worth a FIXME comment acknowledging that a bool will be insufficient when we update our code to support parallel jobs.
Ok, I'll add a note.
+ +bool +qemuDomainDiskBlockJobIsActive(virDomainDiskDefPtr disk) +{ + if (disk->mirror) { + virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' already in active block job"), + disk->dst); + + return true; + } + + if (disk->blockjob) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("disk '%s' already in active block job"), + disk->dst); + return true; + }
Shorter as 'if (disk->mirror || disk->blockjob)', up to you if you want to compress the common code.
The errors have different error codes and I wanted to avoid a ternary operator.
ACK. Looks like you have correctly locked out blockpull, blockcopy, and blockcommit as the three jobs that are all mutually exclusive according to our current abilities.
Peter

On 03/13/2015 10:25 AM, Peter Krempa wrote:
Block job handling violates our usage of domain jobs and changes disk source definition behind our back.
Peter Krempa (3): qemu: process: Export qemuProcessFindDomainDiskByAlias qemu: event: Don't fiddle with disk backing trees without a job qemu: Disallow concurrent block jobs on a single disk
On description alone, this should help solve the crash that Shanzhi observed: https://www.redhat.com/archives/libvir-list/2015-March/msg00656.html
src/conf/domain_conf.h | 1 + src/qemu/qemu_domain.c | 23 +++++++ src/qemu/qemu_domain.h | 4 ++ src/qemu/qemu_driver.c | 170 +++++++++++++++++++++++++++++++++++++++++++----- src/qemu/qemu_process.c | 131 +++++++------------------------------ src/qemu/qemu_process.h | 3 + 6 files changed, 211 insertions(+), 121 deletions(-)
-- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Fri, Mar 13, 2015 at 11:24:25 -0600, Eric Blake wrote:
On 03/13/2015 10:25 AM, Peter Krempa wrote:
Block job handling violates our usage of domain jobs and changes disk source definition behind our back.
Peter Krempa (3): qemu: process: Export qemuProcessFindDomainDiskByAlias qemu: event: Don't fiddle with disk backing trees without a job qemu: Disallow concurrent block jobs on a single disk
On description alone, this should help solve the crash that Shanzhi observed: https://www.redhat.com/archives/libvir-list/2015-March/msg00656.html
I've pushed this series. While my suggestion to check also the length of the backing chain reported by qemu would fix that crash as a symptom, this fixes the reason why it happened. Peter
participants (4)
-
Eric Blake
-
Michael Chapman
-
Michal Privoznik
-
Peter Krempa