On Thu, Jun 28, 2018 at 16:06:47 +0200, Jiri Denemark wrote:
Once we called qemuDomainObjEnterRemote to talk to the destination
daemon during a peer to peer migration, the vm lock is released and we
only hold an async job. If the source domain dies at this point the
monitor EOF callback is allowed to do its job and (among other things)
clear all private data irrelevant for stopped domain. Thus when we call
qemuDomainObjExitRemote, the domain may already be gone and we should
avoid touching runtime private data (such as current job info).
In other words after acquiring the lock in qemuDomainObjExitRemote, we
need to check the domain is still alive. Unless we're doing offline
migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1589730
Signed-off-by: Jiri Denemark <jdenemar(a)redhat.com>
---
src/qemu/qemu_domain.c | 14 +++++++++++-
src/qemu/qemu_domain.h | 5 +++--
src/qemu/qemu_migration.c | 45 ++++++++++++++++++++++-----------------
3 files changed, 41 insertions(+), 23 deletions(-)
[...]
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index c9aaa38029..cafab2f3e8 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -3920,13 +3920,15 @@ qemuMigrationSrcPerformPeer2Peer2(virQEMUDriverPtr driver,
qemuDomainObjEnterRemote(vm);
ret = dconn->driver->domainMigratePrepareTunnel
(dconn, st, destflags, dname, resource, dom_xml);
- qemuDomainObjExitRemote(vm);
+ if (qemuDomainObjExitRemote(vm, true) < 0)
+ goto cleanup;
} else {
qemuDomainObjEnterRemote(vm);
ret = dconn->driver->domainMigratePrepare2
(dconn, &cookie, &cookielen, NULL, &uri_out,
destflags, dname, resource, dom_xml);
- qemuDomainObjExitRemote(vm);
+ if (qemuDomainObjExitRemote(vm, true) < 0)
+ goto cleanup;
}
VIR_FREE(dom_xml);
if (ret == -1)
There is a check below this which does !virDomainObjIsActive which seems
pointless now.
@@ -3987,7 +3989,8 @@
qemuMigrationSrcPerformPeer2Peer2(virQEMUDriverPtr driver,
ddomain = dconn->driver->domainMigrateFinish2
(dconn, dname, cookie, cookielen,
uri_out ? uri_out : dconnuri, destflags, cancelled);
- qemuDomainObjExitRemote(vm);
+ /* The domain is already gone at this point */
+ ignore_value(qemuDomainObjExitRemote(vm, false));
if (cancelled && ddomain)
VIR_ERROR(_("finish step ignored that migration was cancelled"));
[...]
@@ -4488,20 +4500,13 @@
qemuMigrationSrcPerformPeer2Peer(virQEMUDriverPtr driver,
goto cleanup;
}
- if (flags & VIR_MIGRATE_OFFLINE && !dstOffline) {
+ if (offline && !dstOffline) {
This slightly mixes two distinct refactors ...
virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED,
"%s",
_("offline migration is not supported by "
"the destination host"));
goto cleanup;
}
ACK