
On Wed, Sep 19, 2018 at 16:04:13 -0400, John Ferlan wrote:
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 825a9d399b..4771f26938 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c
[...]
@@ -5074,13 +5060,8 @@ qemuMigrationDstFinish(virQEMUDriverPtr driver, goto endjob; }
- if (inPostCopy) { + if (inPostCopy) doKill = false; - event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_RESUMED, - VIR_DOMAIN_EVENT_RESUMED_POSTCOPY); - virObjectEventStateQueue(driver->domainEventState, event); - } }
if (mig->jobInfo) { @@ -5111,11 +5092,6 @@ qemuMigrationDstFinish(virQEMUDriverPtr driver,
dom = virGetDomain(dconn, vm->def->name, vm->def->uuid, vm->def->id);
- event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_RESUMED, - VIR_DOMAIN_EVENT_RESUMED_MIGRATED); - virObjectEventStateQueue(driver->domainEventState, event); - if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PAUSED) { virDomainObjSetState(vm, VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_USER); event = virDomainEventLifecycleNewFromObj(vm,
For this path 2 events were generated if @inPostCopy == True - one right after the qemuProcessStartCPUs for RESUMED_POSTCOPY and then one for SUSPENDED_PAUSED once qemuMigrationDstWaitForCompletion occurred.
I guess you meant s/SUSPENDED_PAUSED/RESUMED_MIGRATED/. Anyway, you made me think about this and you're right. Or docs even say that two RESUMED events are sent during post-copy migration. RESUMED_POSTCOPY when the domain switches from source to the destination host and RESUMED_MIGRATED once migration is complete... v2 is in progress.
IIUC, the changes here when @inPostCopy == true would only generate the POSTCOPY event, but not the MIGRATED event. Or is there something subtle I'm missing. Will the post copy processing generate two resume events? If so, we perhaps should document that. If not, then perhaps this second event that's being deleted needs to be moved inside that inPostCopy condition just above with the note that this one case is "abnormal"
Yes.
(or you could just do the *StartCPUs again with the RESUMED_MIGRATED and close your eyes ;-)).
This wouldn't work, the RESUME handler ignores the event if the domain is not paused.
Curious, for the future, would it be useful to change the FakeReboot path to generate a different detail reason? It's hard to distinguish between someone using virsh resume (or recovery from Save, Dump, Snapshot failure) from that fake reboot path - at least from just the event. I guess it seems VIR_DOMAIN_EVENT_RESUMED_UNPAUSED is just too generic since it's more than just a "normal resume due to admin unpause", but that I would think would be extra given the "scope" of this particular problem.
Yeah, a new resume reason would be nice in this case, but I don't think it's a big issue. The guest-agent driven reboot is much better and hopefully used more often than fake reboot.
Second curiosity... would it be useful/easy to generate an event on the failure scenario for StartCPUs? Knowing from whence we were called, we should be able to generate a suspended event rather than having the callers decide.
Not really, if StartCPUs fails the vCPUs just remain paused as if nothing happened. That is, there's no state change to signal using an event. And I think "cont" can't really fail anyway. Normally it would just succeed and another STOP event would be sent right away in case the reason for pausing the CPUs (such as I/O error) is still valid. Jirka