On Thu, Sep 01, 2022 at 16:51:34 +0200, Peter Krempa wrote:
> On Thu, Sep 01, 2022 at 14:47:41 +0200, Jiri Denemark wrote:
> > We have always considered "migrate_cancel" QMP command to return
after
> > successfully cancelling the migration. But this is no longer true (to be
> > honest I'm not sure it ever was) as it just changes the migration state
> > to "cancelling". In most cases the migration is canceled pretty
quickly
> > and we don't really notice anything, but sometimes it takes so long we
> > even get to clearing migration capabilities before the migration is
> > actually canceled, which fails as capabilities can only be changed when
> > no migration is running. So to avoid this issue, we can wait for the
> > migration to be really canceled after sending migrate_cancel. The only
> > place where we don't need synchronous behavior is when we're
cancelling
> > migration on user's request while it is actively watched by another
> > thread.
> >
> >
https://bugzilla.redhat.com/show_bug.cgi?id=2114866
> >
> > Signed-off-by: Jiri Denemark <jdenemar(a)redhat.com>
> > ---
> > src/qemu/qemu_driver.c | 2 +-
> > src/qemu/qemu_migration.c | 45 +++++++++++++++++++++++++++++++++++----
> > src/qemu/qemu_migration.h | 3 ++-
> > src/qemu/qemu_process.c | 2 +-
> > 4 files changed, 45 insertions(+), 7 deletions(-)
>
> [...]
>
> > int
> > qemuMigrationSrcCancel(virDomainObj *vm,
> > - virDomainAsyncJob asyncJob)
> > + virDomainAsyncJob asyncJob,
> > + bool wait)
> > {
> > qemuDomainObjPrivate *priv = vm->privateData;
> >
> > @@ -4625,6 +4653,15 @@ qemuMigrationSrcCancel(virDomainObj *vm,
> > qemuMonitorMigrateCancel(priv->mon);
> > qemuDomainObjExitMonitor(vm);
> >
> > + if (virDomainObjIsActive(vm) && wait) {
>
> Is the call to virDomainObjIsActive() necessary here? IIUC the domain
> shutdown code is always executed in a way to make sure that waiting
> threads are always woken.
>
>
> > + VIR_DEBUG("Waiting for migration to be canceled");
> > +
> > + while (!qemuMigrationSrcIsCanceled(vm)) {
> > + if (qemuDomainObjWait(vm) < 0)
>
> So here if the VM would crash before we wait we'd report success and if
> it crashed during our wait we'll report failure, which seems weird too.
Oh right, qemuDomainObjWait already checks for virDomainObjIsActive so
we don't have to do it explicitly here. Just
if (wait) {
...
}
is enough.