On Thu, Jan 25, 2018 at 19:51:23 +0530, Prerna Saxena wrote:
> In case of non-p2p migration, in case libvirt client gets disconnected from source libvirt
> after PERFORM phase is over, the daemon just resets the current migration job.
> However, the VM could be left paused on both source and destination in such case. In case
> the client reconnects and queries migration status, the job has been blanked out from source libvirt,
> and this reconnected client has no clear way of figuring out if an unclean migration had previously
> been aborted.
The virDomainGetState API should return VIR_DOMAIN_PAUSED with
VIR_DOMAIN_PAUSED_MIGRATION reason. Is this not enough?
> This patch calls out a "potentially" incomplete migration as a failed
> job, so that a client may
As you say it's potentially incomplete, so marking it as failed is not
completely correct. It's a split brain when the source cannot
distinguish whether the migration was successful or not.
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index e8e0313..7c60d17 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -4564,6 +4564,22 @@ qemuDomainObjDiscardAsyncJob(virQEMUDriverPtr driver, virDomainObjPtr obj) This will just leak the memory allocated for priv->job.completed by
> qemuDomainObjSaveJob(driver, obj);
> }
>
> +
> +void
> +qemuDomainObjFailAsyncJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
> +{
> + qemuDomainObjPrivatePtr priv = obj->privateData;
> + VIR_FREE(priv->job.completed);
> + if (VIR_ALLOC(priv->job.completed) == 0) {
> + priv->job.current->type = VIR_DOMAIN_JOB_FAILED;
> + priv->job.completed = priv->job.current;
overwriting the pointer to the one from priv->job.current, ...
> + } else {
> + VIR_WARN("Unable to allocate job.completed for VM %s", obj->def->name);
> + }
> + qemuDomainObjResetAsyncJob(priv); which will point to a freed memory after this call.
> + qemuDomainObjEndJob(driver, obj);
And while this is almost certainly (I didn't really check though) not
something you should call from a close callback, you don't save the
changes to the status XML so once libvirtd restarts, it will think the
domain is still being migrated.