[libvirt] [PATCH 0/7] Preparation for new QEMU migration states

Mostly refactoring of the horrible mess in qemuMigrationRun. Jiri Denemark (7): qemu: Use switch in qemuMigrationCompleted qemu: Refactor qemuMigrationRun a bit qemu: Split cleanup and error code in qemuMigrationRun qemu: Unite error handling in qemuMigrationRun qemu: Don't misuse "ret" in qemuMigrationRun qemu: Consistently use exit_monitor in qemuMigrationRun qemu: Set correct job status when qemuMigrationRun fails src/qemu/qemu_migration.c | 196 +++++++++++++++++++++++++--------------------- 1 file changed, 105 insertions(+), 91 deletions(-) -- 2.14.2

When adding a new job state it's useful to let the compiler complain about places where we need to think about what to do with the new state. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 72edbb667..c3f9c38b2 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1531,18 +1531,31 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, return 0; error: - /* state can not be active or completed at this point */ - if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING || - jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { + switch (jobInfo->status) { + case QEMU_DOMAIN_JOB_STATUS_MIGRATING: + case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: /* The migration was aborted by us rather than QEMU itself. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; - } else if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED) { + + case QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED: + /* Something failed after QEMU already finished the migration. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -1; - } else { + + case QEMU_DOMAIN_JOB_STATUS_FAILED: + case QEMU_DOMAIN_JOB_STATUS_CANCELED: + /* QEMU aborted the migration. */ return -1; + + case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + case QEMU_DOMAIN_JOB_STATUS_COMPLETED: + case QEMU_DOMAIN_JOB_STATUS_NONE: + /* Impossible. */ + break; } + + return -1; } -- 2.14.2

On Thu, Oct 19, 2017 at 03:56 PM +0200, Jiri Denemark <jdenemar@redhat.com> wrote:
When adding a new job state it's useful to let the compiler complain about places where we need to think about what to do with the new state.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 72edbb667..c3f9c38b2 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1531,18 +1531,31 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, return 0;
error: - /* state can not be active or completed at this point */ - if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING || - jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { + switch (jobInfo->status) { + case QEMU_DOMAIN_JOB_STATUS_MIGRATING: + case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: /* The migration was aborted by us rather than QEMU itself. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; - } else if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED) { + + case QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED: + /* Something failed after QEMU already finished the migration. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -1; - } else { + + case QEMU_DOMAIN_JOB_STATUS_FAILED: + case QEMU_DOMAIN_JOB_STATUS_CANCELED: + /* QEMU aborted the migration. */ return -1; + + case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + case QEMU_DOMAIN_JOB_STATUS_COMPLETED: + case QEMU_DOMAIN_JOB_STATUS_NONE: + /* Impossible. */ + break; } + + return -1; }
I think you have to add ATTRIBUTE_FALLTHROUGH for the intended fallthroughs (e.g. for gcc 7).
-- 2.14.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- Beste Grüße / Kind regards Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On Fri, Oct 20, 2017 at 11:28:37 +0200, Marc Hartmayer wrote:
On Thu, Oct 19, 2017 at 03:56 PM +0200, Jiri Denemark <jdenemar@redhat.com> wrote:
When adding a new job state it's useful to let the compiler complain about places where we need to think about what to do with the new state.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 72edbb667..c3f9c38b2 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1531,18 +1531,31 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, return 0;
error: - /* state can not be active or completed at this point */ - if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING || - jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { + switch (jobInfo->status) { + case QEMU_DOMAIN_JOB_STATUS_MIGRATING: + case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: /* The migration was aborted by us rather than QEMU itself. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; - } else if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED) { + + case QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED: + /* Something failed after QEMU already finished the migration. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -1; - } else { + + case QEMU_DOMAIN_JOB_STATUS_FAILED: + case QEMU_DOMAIN_JOB_STATUS_CANCELED: + /* QEMU aborted the migration. */ return -1; + + case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + case QEMU_DOMAIN_JOB_STATUS_COMPLETED: + case QEMU_DOMAIN_JOB_STATUS_NONE: + /* Impossible. */ + break; } + + return -1; }
I think you have to add ATTRIBUTE_FALLTHROUGH for the intended fallthroughs (e.g. for gcc 7).
This should only be needed if there was any code between the cases. Jirka

On Fri, Oct 20, 2017 at 12:12 PM +0200, Jiri Denemark <jdenemar@redhat.com> wrote:
On Fri, Oct 20, 2017 at 11:28:37 +0200, Marc Hartmayer wrote:
On Thu, Oct 19, 2017 at 03:56 PM +0200, Jiri Denemark <jdenemar@redhat.com> wrote:
When adding a new job state it's useful to let the compiler complain about places where we need to think about what to do with the new state.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 72edbb667..c3f9c38b2 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1531,18 +1531,31 @@ qemuMigrationCompleted(virQEMUDriverPtr driver, return 0;
error: - /* state can not be active or completed at this point */ - if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING || - jobInfo->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) { + switch (jobInfo->status) { + case QEMU_DOMAIN_JOB_STATUS_MIGRATING: + case QEMU_DOMAIN_JOB_STATUS_POSTCOPY: /* The migration was aborted by us rather than QEMU itself. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -2; - } else if (jobInfo->status == QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED) { + + case QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED: + /* Something failed after QEMU already finished the migration. */ jobInfo->status = QEMU_DOMAIN_JOB_STATUS_FAILED; return -1; - } else { + + case QEMU_DOMAIN_JOB_STATUS_FAILED: + case QEMU_DOMAIN_JOB_STATUS_CANCELED: + /* QEMU aborted the migration. */ return -1; + + case QEMU_DOMAIN_JOB_STATUS_ACTIVE: + case QEMU_DOMAIN_JOB_STATUS_COMPLETED: + case QEMU_DOMAIN_JOB_STATUS_NONE: + /* Impossible. */ + break; } + + return -1; }
I think you have to add ATTRIBUTE_FALLTHROUGH for the intended fallthroughs (e.g. for gcc 7).
This should only be needed if there was any code between the cases.
Thanks for the information.
Jirka
-- Beste Grüße / Kind regards Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

Some code which was supposed to be executed only when migration succeeded was buried inside the cleanup code. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 63 +++++++++++++++++++++++++++-------------------- 1 file changed, 36 insertions(+), 27 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index c3f9c38b2..f0d4f9d98 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3867,8 +3867,36 @@ qemuMigrationRun(virQEMUDriverPtr driver, qemuMigrationSetOffline(driver, vm) < 0) { goto cancelPostCopy; } - if (priv->job.completed) + + if (mig && mig->nbd && + qemuMigrationCancelDriveMirror(driver, vm, true, + QEMU_ASYNC_JOB_MIGRATION_OUT, + dconn) < 0) + goto cancelPostCopy; + + if (iothread) { + qemuMigrationIOThreadPtr io; + + VIR_STEAL_PTR(io, iothread); + if (qemuMigrationStopTunnel(io, false) < 0) + goto cancelPostCopy; + } + + if (priv->job.completed) { priv->job.completed->stopped = priv->job.current->stopped; + qemuDomainJobInfoUpdateTime(priv->job.completed); + qemuDomainJobInfoUpdateDowntime(priv->job.completed); + ignore_value(virTimeMillisNow(&priv->job.completed->sent)); + } + + cookieFlags |= QEMU_MIGRATION_COOKIE_NETWORK | + QEMU_MIGRATION_COOKIE_STATS; + + if (qemuMigrationCookieAddPersistent(mig, &persistDef) < 0 || + qemuMigrationBakeCookie(mig, driver, vm, cookieout, + cookieoutlen, cookieFlags) < 0) { + VIR_WARN("Unable to encode migration cookie"); + } ret = 0; @@ -3877,43 +3905,24 @@ qemuMigrationRun(virQEMUDriverPtr driver, orig_err = virSaveLastError(); /* cancel any outstanding NBD jobs */ - if (mig && mig->nbd) { - if (qemuMigrationCancelDriveMirror(driver, vm, ret == 0, - QEMU_ASYNC_JOB_MIGRATION_OUT, - dconn) < 0) - ret = -1; - } + if (ret < 0 && mig && mig->nbd) + qemuMigrationCancelDriveMirror(driver, vm, false, + QEMU_ASYNC_JOB_MIGRATION_OUT, + dconn); VIR_FREE(tlsAlias); VIR_FREE(secAlias); virObjectUnref(cfg); - if (spec->fwdType != MIGRATION_FWD_DIRECT) { - if (iothread && qemuMigrationStopTunnel(iothread, ret < 0) < 0) - ret = -1; - } - VIR_FORCE_CLOSE(fd); + if (ret < 0 && iothread) + qemuMigrationStopTunnel(iothread, true); - if (priv->job.completed) { - qemuDomainJobInfoUpdateTime(priv->job.completed); - qemuDomainJobInfoUpdateDowntime(priv->job.completed); - ignore_value(virTimeMillisNow(&priv->job.completed->sent)); - } + VIR_FORCE_CLOSE(fd); if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING) priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; - cookieFlags |= QEMU_MIGRATION_COOKIE_NETWORK | - QEMU_MIGRATION_COOKIE_STATS; - - if (ret == 0 && - (qemuMigrationCookieAddPersistent(mig, &persistDef) < 0 || - qemuMigrationBakeCookie(mig, driver, vm, cookieout, - cookieoutlen, cookieFlags) < 0)) { - VIR_WARN("Unable to encode migration cookie"); - } - virDomainDefFree(persistDef); qemuMigrationCookieFree(mig); -- 2.14.2

Let cleanup only do things common to both failure and success paths and move error handling code inside the new "error" section. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 79 ++++++++++++++++++++++++----------------------- 1 file changed, 40 insertions(+), 39 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index f0d4f9d98..6956901be 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3657,20 +3657,20 @@ qemuMigrationRun(virQEMUDriverPtr driver, if (persist_xml) { if (!(persistDef = qemuMigrationPrepareDef(driver, persist_xml, NULL, NULL))) - goto cleanup; + goto error; } else { virDomainDefPtr def = vm->newDef ? vm->newDef : vm->def; if (!(persistDef = qemuDomainDefCopy(driver, def, VIR_DOMAIN_XML_SECURE | VIR_DOMAIN_XML_MIGRATABLE))) - goto cleanup; + goto error; } } mig = qemuMigrationEatCookie(driver, vm, cookiein, cookieinlen, cookieFlags | QEMU_MIGRATION_COOKIE_GRAPHICS); if (!mig) - goto cleanup; + goto error; if (qemuDomainMigrateGraphicsRelocate(driver, vm, mig, graphicsuri) < 0) VIR_WARN("unable to provide data for graphics client relocation"); @@ -3683,7 +3683,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, if (qemuMigrationAddTLSObjects(driver, vm, cfg, false, QEMU_ASYNC_JOB_MIGRATION_OUT, &tlsAlias, &secAlias, migParams) < 0) - goto cleanup; + goto error; /* We need to add tls-hostname whenever QEMU itself does not * connect directly to the destination. */ @@ -3691,17 +3691,17 @@ qemuMigrationRun(virQEMUDriverPtr driver, spec->destType == MIGRATION_DEST_FD) { if (VIR_STRDUP(migParams->migrateTLSHostname, spec->dest.host.name) < 0) - goto cleanup; + goto error; } else { /* Be sure there's nothing from a previous migration */ if (VIR_STRDUP(migParams->migrateTLSHostname, "") < 0) - goto cleanup; + goto error; } } else { if (qemuMigrationSetEmptyTLSParams(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, migParams) < 0) - goto cleanup; + goto error; } if (migrate_flags & (QEMU_MONITOR_MIGRATE_NON_SHARED_DISK | @@ -3715,7 +3715,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, nmigrate_disks, migrate_disks, dconn) < 0) { - goto cleanup; + goto error; } } else { /* Destination doesn't support NBD server. @@ -3729,37 +3729,37 @@ qemuMigrationRun(virQEMUDriverPtr driver, if (!(flags & VIR_MIGRATE_LIVE) && virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { if (qemuMigrationSetOffline(driver, vm) < 0) - goto cleanup; + goto error; } if (qemuMigrationSetCompression(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, compression, migParams) < 0) - goto cleanup; + goto error; if (qemuMigrationSetOption(driver, vm, QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE, flags & VIR_MIGRATE_AUTO_CONVERGE, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) - goto cleanup; + goto error; if (qemuMigrationSetOption(driver, vm, QEMU_MONITOR_MIGRATION_CAPS_RDMA_PIN_ALL, flags & VIR_MIGRATE_RDMA_PIN_ALL, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) - goto cleanup; + goto error; if (qemuMigrationSetPostCopy(driver, vm, flags & VIR_MIGRATE_POSTCOPY, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) - goto cleanup; + goto error; if (qemuMigrationSetParams(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, migParams) < 0) - goto cleanup; + goto error; if (qemuDomainObjEnterMonitorAsync(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) - goto cleanup; + goto error; if (priv->job.abortJob) { /* explicitly do this *after* we entered the monitor, @@ -3770,7 +3770,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), qemuDomainAsyncJobTypeToString(priv->job.asyncJob), _("canceled by client")); - goto cleanup; + goto error; } if (qemuMonitorSetMigrationSpeed(priv->mon, migrate_speed) < 0) @@ -3817,7 +3817,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, if (qemuDomainObjExitMonitor(driver, vm) < 0) ret = -1; if (ret < 0) - goto cleanup; + goto error; ret = -1; /* From this point onwards we *must* call cancel to abort the @@ -3846,7 +3846,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, if (rc == -2) goto cancel; else if (rc == -1) - goto cleanup; + goto error; if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) inPostCopy = true; @@ -3901,28 +3901,10 @@ qemuMigrationRun(virQEMUDriverPtr driver, ret = 0; cleanup: - if (ret < 0 && !orig_err) - orig_err = virSaveLastError(); - - /* cancel any outstanding NBD jobs */ - if (ret < 0 && mig && mig->nbd) - qemuMigrationCancelDriveMirror(driver, vm, false, - QEMU_ASYNC_JOB_MIGRATION_OUT, - dconn); - VIR_FREE(tlsAlias); VIR_FREE(secAlias); virObjectUnref(cfg); - - if (ret < 0 && iothread) - qemuMigrationStopTunnel(iothread, true); - VIR_FORCE_CLOSE(fd); - - if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || - priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING) - priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; - virDomainDefFree(persistDef); qemuMigrationCookieFree(mig); @@ -3936,9 +3918,28 @@ qemuMigrationRun(virQEMUDriverPtr driver, return ret; + error: + if (!orig_err) + orig_err = virSaveLastError(); + + /* cancel any outstanding NBD jobs */ + if (mig && mig->nbd) + qemuMigrationCancelDriveMirror(driver, vm, false, + QEMU_ASYNC_JOB_MIGRATION_OUT, + dconn); + + if (iothread) + qemuMigrationStopTunnel(iothread, true); + + if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || + priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING) + priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; + + goto cleanup; + exit_monitor: ignore_value(qemuDomainObjExitMonitor(driver, vm)); - goto cleanup; + goto error; cancel: orig_err = virSaveLastError(); @@ -3950,14 +3951,14 @@ qemuMigrationRun(virQEMUDriverPtr driver, ignore_value(qemuDomainObjExitMonitor(driver, vm)); } } - goto cleanup; + goto error; cancelPostCopy: priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; if (inPostCopy) goto cancel; else - goto cleanup; + goto error; } /* Perform migration using QEMU's native migrate support, -- 2.14.2

Merge cancel and cancelPostCopy sections with the generic error section, where we can easily decide whether canceling the ongoing migration is required. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 61 ++++++++++++++++++++++------------------------- 1 file changed, 28 insertions(+), 33 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 6956901be..cbf255704 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3616,7 +3616,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, unsigned int cookieFlags = 0; bool abort_on_error = !!(flags & VIR_MIGRATE_ABORT_ON_ERROR); bool events = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); - bool inPostCopy = false; + bool cancel = false; unsigned int waitFlags; virDomainDefPtr persistDef = NULL; char *timestamp; @@ -3822,10 +3822,11 @@ qemuMigrationRun(virQEMUDriverPtr driver, /* From this point onwards we *must* call cancel to abort the * migration on source if anything goes wrong */ + cancel = true; if (spec->fwdType != MIGRATION_FWD_DIRECT) { if (!(iothread = qemuMigrationStartTunnel(spec->fwd.stream, fd))) - goto cancel; + goto error; /* If we've created a tunnel, then the 'fd' will be closed in the * qemuMigrationIOFunc as data->sock. */ @@ -3843,13 +3844,18 @@ qemuMigrationRun(virQEMUDriverPtr driver, rc = qemuMigrationWaitForCompletion(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, dconn, waitFlags); - if (rc == -2) - goto cancel; - else if (rc == -1) + if (rc == -2) { goto error; + } else if (rc == -1) { + /* QEMU reported failed migration, nothing to cancel anymore */ + cancel = false; + goto error; + } - if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) - inPostCopy = true; + if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED) { + /* QEMU finished migration, nothing to cancel anymore */ + cancel = false; + } /* When migration completed, QEMU will have paused the CPUs for us. * Wait for the STOP event to be processed or explicitly stop CPUs @@ -3861,25 +3867,25 @@ qemuMigrationRun(virQEMUDriverPtr driver, rc = virDomainObjWait(vm); priv->signalStop = false; if (rc < 0) - goto cancelPostCopy; + goto error; } } else if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING && qemuMigrationSetOffline(driver, vm) < 0) { - goto cancelPostCopy; + goto error; } if (mig && mig->nbd && qemuMigrationCancelDriveMirror(driver, vm, true, QEMU_ASYNC_JOB_MIGRATION_OUT, dconn) < 0) - goto cancelPostCopy; + goto error; if (iothread) { qemuMigrationIOThreadPtr io; VIR_STEAL_PTR(io, iothread); if (qemuMigrationStopTunnel(io, false) < 0) - goto cancelPostCopy; + goto error; } if (priv->job.completed) { @@ -3919,8 +3925,15 @@ qemuMigrationRun(virQEMUDriverPtr driver, return ret; error: - if (!orig_err) - orig_err = virSaveLastError(); + orig_err = virSaveLastError(); + + if (cancel && + virDomainObjIsActive(vm) && + qemuDomainObjEnterMonitorAsync(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) == 0) { + qemuMonitorMigrateCancel(priv->mon); + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + } /* cancel any outstanding NBD jobs */ if (mig && mig->nbd) @@ -3932,7 +3945,8 @@ qemuMigrationRun(virQEMUDriverPtr driver, qemuMigrationStopTunnel(iothread, true); if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || - priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING) + priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING || + priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; goto cleanup; @@ -3940,25 +3954,6 @@ qemuMigrationRun(virQEMUDriverPtr driver, exit_monitor: ignore_value(qemuDomainObjExitMonitor(driver, vm)); goto error; - - cancel: - orig_err = virSaveLastError(); - - if (virDomainObjIsActive(vm)) { - if (qemuDomainObjEnterMonitorAsync(driver, vm, - QEMU_ASYNC_JOB_MIGRATION_OUT) == 0) { - qemuMonitorMigrateCancel(priv->mon); - ignore_value(qemuDomainObjExitMonitor(driver, vm)); - } - } - goto error; - - cancelPostCopy: - priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; - if (inPostCopy) - goto cancel; - else - goto error; } /* Perform migration using QEMU's native migrate support, -- 2.14.2

On Thu, Oct 19, 2017 at 15:56:29 +0200, Jiri Denemark wrote:
Merge cancel and cancelPostCopy sections with the generic error section, where we can easily decide whether canceling the ongoing migration is required. ... @@ -3843,13 +3844,18 @@ qemuMigrationRun(virQEMUDriverPtr driver, rc = qemuMigrationWaitForCompletion(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, dconn, waitFlags); - if (rc == -2) - goto cancel; - else if (rc == -1) + if (rc == -2) { goto error; + } else if (rc == -1) { + /* QEMU reported failed migration, nothing to cancel anymore */ + cancel = false; + goto error; + }
- if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) - inPostCopy = true; + if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED) { + /* QEMU finished migration, nothing to cancel anymore */ + cancel = false; + }
/* When migration completed, QEMU will have paused the CPUs for us. * Wait for the STOP event to be processed or explicitly stop CPUs
Please, consider the following small patch squashed in: diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index bc57f757a..626b4e3ee 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3855,11 +3855,6 @@ qemuMigrationRun(virQEMUDriverPtr driver, goto error; } - if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED) { - /* QEMU finished migration, nothing to cancel anymore */ - cancel = false; - } - /* When migration completed, QEMU will have paused the CPUs for us. * Wait for the STOP event to be processed or explicitly stop CPUs * (for old QEMU which does not send events) to release the lock state. @@ -3931,6 +3926,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, orig_err = virSaveLastError(); if (cancel && + priv->job.current->status != QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED && virDomainObjIsActive(vm) && qemuDomainObjEnterMonitorAsync(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT) == 0) {

The "ret" variable is used for storing the return value of a function and should not be used as a temporary variable. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index cbf255704..2d8a634f9 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3788,16 +3788,17 @@ qemuMigrationRun(virQEMUDriverPtr driver, VIR_FREE(timestamp); } + rc = -1; switch (spec->destType) { case MIGRATION_DEST_HOST: if (STREQ(spec->dest.host.protocol, "rdma") && virProcessSetMaxMemLock(vm->pid, vm->def->mem.hard_limit << 10) < 0) { goto exit_monitor; } - ret = qemuMonitorMigrateToHost(priv->mon, migrate_flags, - spec->dest.host.protocol, - spec->dest.host.name, - spec->dest.host.port); + rc = qemuMonitorMigrateToHost(priv->mon, migrate_flags, + spec->dest.host.protocol, + spec->dest.host.name, + spec->dest.host.port); break; case MIGRATION_DEST_CONNECT_HOST: @@ -3809,16 +3810,14 @@ qemuMigrationRun(virQEMUDriverPtr driver, fd = spec->dest.fd.local; spec->dest.fd.local = -1; } - ret = qemuMonitorMigrateToFd(priv->mon, migrate_flags, - spec->dest.fd.qemu); + rc = qemuMonitorMigrateToFd(priv->mon, migrate_flags, + spec->dest.fd.qemu); VIR_FORCE_CLOSE(spec->dest.fd.qemu); break; } - if (qemuDomainObjExitMonitor(driver, vm) < 0) - ret = -1; - if (ret < 0) + + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rc < 0) goto error; - ret = -1; /* From this point onwards we *must* call cancel to abort the * migration on source if anything goes wrong */ -- 2.14.2

Almost every failure in qemuMigrationRun while we are talking to QEMU monitor results in a jump to exit_monitor label. The only exception is removed by this patch. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 2d8a634f9..8a529f9ad 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3765,12 +3765,11 @@ qemuMigrationRun(virQEMUDriverPtr driver, /* explicitly do this *after* we entered the monitor, * as this is a critical section so we are guaranteed * priv->job.abortJob will not change */ - ignore_value(qemuDomainObjExitMonitor(driver, vm)); priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_CANCELED; virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), qemuDomainAsyncJobTypeToString(priv->job.asyncJob), _("canceled by client")); - goto error; + goto exit_monitor; } if (qemuMonitorSetMigrationSpeed(priv->mon, migrate_speed) < 0) -- 2.14.2

Instead of enumerating all states which need to be turned into QEMU_DOMAIN_JOB_STATUS_FAILED (and failing to add all of them), it's better to mention just the one which needs to be left alone. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_migration.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 8a529f9ad..f785c308c 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3942,9 +3942,7 @@ qemuMigrationRun(virQEMUDriverPtr driver, if (iothread) qemuMigrationStopTunnel(iothread, true); - if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_ACTIVE || - priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_MIGRATING || - priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_POSTCOPY) + if (priv->job.current->status != QEMU_DOMAIN_JOB_STATUS_CANCELED) priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_FAILED; goto cleanup; -- 2.14.2

On 10/19/2017 09:56 AM, Jiri Denemark wrote:
Mostly refactoring of the horrible mess in qemuMigrationRun.
Jiri Denemark (7): qemu: Use switch in qemuMigrationCompleted qemu: Refactor qemuMigrationRun a bit qemu: Split cleanup and error code in qemuMigrationRun qemu: Unite error handling in qemuMigrationRun qemu: Don't misuse "ret" in qemuMigrationRun qemu: Consistently use exit_monitor in qemuMigrationRun qemu: Set correct job status when qemuMigrationRun fails
src/qemu/qemu_migration.c | 196 +++++++++++++++++++++++++--------------------- 1 file changed, 105 insertions(+), 91 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> (series) John
participants (3)
-
Jiri Denemark
-
John Ferlan
-
Marc Hartmayer