On Wed, Sep 19, 2018 at 16:04:13 -0400, John Ferlan wrote:
> diff --git a/src/qemu/qemu_migration.c
b/src/qemu/qemu_migration.c
> index 825a9d399b..4771f26938 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
[...]
>
> @@ -5074,13 +5060,8 @@ qemuMigrationDstFinish(virQEMUDriverPtr driver,
> goto endjob;
> }
>
> - if (inPostCopy) {
> + if (inPostCopy)
> doKill = false;
> - event = virDomainEventLifecycleNewFromObj(vm,
> - VIR_DOMAIN_EVENT_RESUMED,
> - VIR_DOMAIN_EVENT_RESUMED_POSTCOPY);
> - virObjectEventStateQueue(driver->domainEventState, event);
> - }
> }
>
> if (mig->jobInfo) {
> @@ -5111,11 +5092,6 @@ qemuMigrationDstFinish(virQEMUDriverPtr driver,
>
> dom = virGetDomain(dconn, vm->def->name, vm->def->uuid,
vm->def->id);
>
> - event = virDomainEventLifecycleNewFromObj(vm,
> - VIR_DOMAIN_EVENT_RESUMED,
> - VIR_DOMAIN_EVENT_RESUMED_MIGRATED);
> - virObjectEventStateQueue(driver->domainEventState, event);
> -
> if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PAUSED) {
> virDomainObjSetState(vm, VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_USER);
> event = virDomainEventLifecycleNewFromObj(vm,
For this path 2 events were generated if @inPostCopy == True - one right
after the qemuProcessStartCPUs for RESUMED_POSTCOPY and then one for
SUSPENDED_PAUSED once qemuMigrationDstWaitForCompletion occurred.
I guess you meant s/SUSPENDED_PAUSED/RESUMED_MIGRATED/. Anyway, you made
me think about this and you're right. Or docs even say that two RESUMED
events are sent during post-copy migration. RESUMED_POSTCOPY when the
domain switches from source to the destination host and RESUMED_MIGRATED
once migration is complete... v2 is in progress.
IIUC, the changes here when @inPostCopy == true would only generate
the
POSTCOPY event, but not the MIGRATED event. Or is there something subtle
I'm missing. Will the post copy processing generate two resume events?
If so, we perhaps should document that. If not, then perhaps this second
event that's being deleted needs to be moved inside that inPostCopy
condition just above with the note that this one case is "abnormal"
Yes.
(or you could just do the *StartCPUs again with the RESUMED_MIGRATED
and close your eyes ;-)).
This wouldn't work, the RESUME handler ignores the event if the domain
is not paused.
Curious, for the future, would it be useful to change the FakeReboot
path to generate a different detail reason? It's hard to distinguish
between someone using virsh resume (or recovery from Save, Dump,
Snapshot failure) from that fake reboot path - at least from just the
event. I guess it seems VIR_DOMAIN_EVENT_RESUMED_UNPAUSED is just too
generic since it's more than just a "normal resume due to admin
unpause", but that I would think would be extra given the "scope" of
this particular problem.
Yeah, a new resume reason would be nice in this case, but I don't think
it's a big issue. The guest-agent driven reboot is much better and
hopefully used more often than fake reboot.
Second curiosity... would it be useful/easy to generate an event on
the
failure scenario for StartCPUs? Knowing from whence we were called, we
should be able to generate a suspended event rather than having the
callers decide.
Not really, if StartCPUs fails the vCPUs just remain paused as if
nothing happened. That is, there's no state change to signal using an
event. And I think "cont" can't really fail anyway. Normally it would
just succeed and another STOP event would be sent right away in case the
reason for pausing the CPUs (such as I/O error) is still valid.
Jirka