
On Thu, Jun 28, 2018 at 16:06:47 +0200, Jiri Denemark wrote:
Once we called qemuDomainObjEnterRemote to talk to the destination daemon during a peer to peer migration, the vm lock is released and we only hold an async job. If the source domain dies at this point the monitor EOF callback is allowed to do its job and (among other things) clear all private data irrelevant for stopped domain. Thus when we call qemuDomainObjExitRemote, the domain may already be gone and we should avoid touching runtime private data (such as current job info).
In other words after acquiring the lock in qemuDomainObjExitRemote, we need to check the domain is still alive. Unless we're doing offline migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1589730
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_domain.c | 14 +++++++++++- src/qemu/qemu_domain.h | 5 +++-- src/qemu/qemu_migration.c | 45 ++++++++++++++++++++++----------------- 3 files changed, 41 insertions(+), 23 deletions(-)
[...]
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index c9aaa38029..cafab2f3e8 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3920,13 +3920,15 @@ qemuMigrationSrcPerformPeer2Peer2(virQEMUDriverPtr driver, qemuDomainObjEnterRemote(vm); ret = dconn->driver->domainMigratePrepareTunnel (dconn, st, destflags, dname, resource, dom_xml); - qemuDomainObjExitRemote(vm); + if (qemuDomainObjExitRemote(vm, true) < 0) + goto cleanup; } else { qemuDomainObjEnterRemote(vm); ret = dconn->driver->domainMigratePrepare2 (dconn, &cookie, &cookielen, NULL, &uri_out, destflags, dname, resource, dom_xml); - qemuDomainObjExitRemote(vm); + if (qemuDomainObjExitRemote(vm, true) < 0) + goto cleanup; } VIR_FREE(dom_xml); if (ret == -1)
There is a check below this which does !virDomainObjIsActive which seems pointless now.
@@ -3987,7 +3989,8 @@ qemuMigrationSrcPerformPeer2Peer2(virQEMUDriverPtr driver, ddomain = dconn->driver->domainMigrateFinish2 (dconn, dname, cookie, cookielen, uri_out ? uri_out : dconnuri, destflags, cancelled); - qemuDomainObjExitRemote(vm); + /* The domain is already gone at this point */ + ignore_value(qemuDomainObjExitRemote(vm, false)); if (cancelled && ddomain) VIR_ERROR(_("finish step ignored that migration was cancelled"));
[...]
@@ -4488,20 +4500,13 @@ qemuMigrationSrcPerformPeer2Peer(virQEMUDriverPtr driver, goto cleanup; }
- if (flags & VIR_MIGRATE_OFFLINE && !dstOffline) { + if (offline && !dstOffline) {
This slightly mixes two distinct refactors ...
virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", _("offline migration is not supported by " "the destination host")); goto cleanup; }
ACK