
On 06/23/2011 04:58 AM, Daniel P. Berrange wrote:
Migration is a multi-step process:

 1. Begin(src)
 2. Prepare(dst)
 3. Perform(src)
 4. Finish(dst)
 5. Confirm(src)
At step 2, a QEMU process is launched on the destination to accept the incoming migration. Occasionally the process that is controlling the migration workflow aborts, and fails to call step 4, Finish. This leaves a QEMU process running on the target (albeit with paused CPUs). Unfortunately, because step 2 activates a job on the QEMU process, it is unkillable by normal means.
By registering the VM for autokill against the src virConnectPtr in step 2, we can ensure that the guest is forcefully killed off if the connection is closed without step 4 being invoked.
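To make the register/unregister lifecycle described above concrete, here is a minimal standalone sketch. It is not libvirt's implementation: the `struct vm` type, the fixed-size table, and the `autokill_add`, `autokill_remove`, and `autokill_run` names are all hypothetical, chosen only to show how a guest can be tied to a connection in Prepare, released in Finish, and killed if the connection closes first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VMS 8

/* Hypothetical stand-in for a guest: only the state this sketch needs. */
struct vm {
    const char *name;
    bool running;           /* true while the QEMU process exists */
};

/* One registration: which connection owns which guest. */
struct autokill_entry {
    const void *conn;       /* owning connection, NULL when the slot is free */
    struct vm *vm;
};

static struct autokill_entry autokill_table[MAX_VMS];

/* Step 2 (Prepare): tie the incoming guest to the controlling connection.
 * Returns 0 on success, -1 if the table is full. */
int autokill_add(const void *conn, struct vm *vm)
{
    assert(conn != NULL && vm != NULL);
    for (size_t i = 0; i < MAX_VMS; i++) {
        if (autokill_table[i].conn == NULL) {
            autokill_table[i].conn = conn;
            autokill_table[i].vm = vm;
            return 0;
        }
    }
    return -1;
}

/* Step 4 (Finish) succeeded, or the process was stopped some other way:
 * drop the registration so a later connection close leaves the guest alone. */
void autokill_remove(struct vm *vm)
{
    for (size_t i = 0; i < MAX_VMS; i++) {
        if (autokill_table[i].vm == vm) {
            autokill_table[i].conn = NULL;
            autokill_table[i].vm = NULL;
        }
    }
}

/* Connection closed: kill every guest still registered against it.
 * Returns the number of guests killed. */
int autokill_run(const void *conn)
{
    int killed = 0;

    assert(conn != NULL);
    for (size_t i = 0; i < MAX_VMS; i++) {
        if (autokill_table[i].conn == conn) {
            autokill_table[i].vm->running = false;  /* stand-in for a forced kill */
            autokill_table[i].conn = NULL;
            autokill_table[i].vm = NULL;
            killed++;
        }
    }
    return killed;
}
```

In this sketch a guest whose Finish step ran is removed from the table before the connection closes, so `autokill_run` skips it; a guest whose controller died between Prepare and Finish is still registered and gets killed.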
* src/qemu/qemu_migration.c: Register autokill in PrepareDirect
  and PrepareTunnel. Unregister autokill on successful run of
  Finish
* src/qemu/qemu_process.c: Unregister autokill when stopping
  a process
---
 src/qemu/qemu_migration.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

See comments on 0/3 about whether this patch is complete or whether we need more restrictions on migration when autokill is active. But for what you have here: ACK with nits.

@@ -2549,6 +2549,9 @@ qemuMigrationFinish(struct qemud_driver *driver,
                 VIR_WARN("Failed to save status on vm %s", vm->def->name);
                 goto endjob;
             }
+
+            /* Guest is sucessfully running, so cancel previous autokill */

s/sucessfully/successfully/

+            qemuProcessAutokillRemove(driver, vm);
         } else {
             qemuProcessStop(driver, vm, 1, VIR_DOMAIN_SHUTOFF_FAILED);
             qemuAuditDomainStop(vm, "failed");

--
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org