[libvirt] [PATCH for release 0/3] Fix broken migration

Currently, we are unable to migrate ANY domain. These should get in before the release. Okay, the first one is not that crucial. The last one is! Michal Privoznik (3): qemuMigrationSrcIsSafe: Check local storage more thoroughly qemuProcessLaunch: Print all arguments to debug qemu: Add virConnectPtr back to some migration methods src/qemu/qemu_driver.c | 12 ++++++------ src/qemu/qemu_migration.c | 45 ++++++++++++++++++++++++++++----------------- src/qemu/qemu_migration.h | 2 ++ src/qemu/qemu_process.c | 4 ++-- 4 files changed, 38 insertions(+), 25 deletions(-) -- 2.16.1

https://bugzilla.redhat.com/show_bug.cgi?id=1494454 If a domain disk is stored on local filesystem (e.g. ext4) but is not being migrated it is very likely that domain is not able to run on destination. Regardless of share/cache mode. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index b22a327b5..e98b1e4ce 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1215,25 +1215,26 @@ qemuMigrationSrcIsSafe(virDomainDefPtr def, virDomainDiskDefPtr disk = def->disks[i]; const char *src = virDomainDiskGetSource(disk); - /* Our code elsewhere guarantees shared disks are either readonly (in - * which case cache mode doesn't matter) or used with cache=none or used with cache=directsync */ + /* Disks without any source (i.e. floppies and CD-ROMs) + * OR readonly are safe. */ if (virStorageSourceIsEmpty(disk->src) || - disk->src->readonly || - disk->src->shared || - disk->cachemode == VIR_DOMAIN_DISK_CACHE_DISABLE || - disk->cachemode == VIR_DOMAIN_DISK_CACHE_DIRECTSYNC) + disk->src->readonly) continue; - /* disks which are migrated by qemu are safe too */ + /* Disks which are migrated by qemu are safe too. */ if (storagemigration && qemuMigrationAnyCopyDisk(disk, nmigrate_disks, migrate_disks)) continue; + /* However, disks on local FS (e.g. ext4) are not safe. */ if (virDomainDiskGetType(disk) == VIR_STORAGE_TYPE_FILE) { - if ((rc = virFileIsSharedFS(src)) < 0) + if ((rc = virFileIsSharedFS(src)) < 0) { return false; - else if (rc == 0) - continue; + } else if (rc == 0) { + virReportError(VIR_ERR_MIGRATE_UNSAFE, "%s", + _("Migration without shared storage is unsafe")); + return false; + } if ((rc = virStorageFileIsClusterFS(src)) < 0) return false; else if (rc == 1) @@ -1243,6 +1244,13 @@ qemuMigrationSrcIsSafe(virDomainDefPtr def, continue; } + /* Our code elsewhere guarantees shared disks are either readonly (in + * which case cache mode doesn't matter) or used with cache=none or used with cache=directsync */ + if (disk->src->shared || + disk->cachemode == VIR_DOMAIN_DISK_CACHE_DISABLE || + disk->cachemode == VIR_DOMAIN_DISK_CACHE_DIRECTSYNC) + continue; + virReportError(VIR_ERR_MIGRATE_UNSAFE, "%s", _("Migration may lead to data corruption if disks" " use cache != none or cache != directsync")); -- 2.16.1

On Mon, Feb 26, 2018 at 11:05:49AM +0100, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1494454
If a domain disk is stored on local filesystem (e.g. ext4) but is not being migrated it is very likely that domain is not able to run on destination. Regardless of share/cache mode.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index b22a327b5..e98b1e4ce 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1215,25 +1215,26 @@ qemuMigrationSrcIsSafe(virDomainDefPtr def, virDomainDiskDefPtr disk = def->disks[i]; const char *src = virDomainDiskGetSource(disk);
- /* Our code elsewhere guarantees shared disks are either readonly (in - * which case cache mode doesn't matter) or used with cache=none or used with cache=directsync */ + /* Disks without any source (i.e. floppies and CD-ROMs) + * OR readonly are safe. */ if (virStorageSourceIsEmpty(disk->src) || - disk->src->readonly || - disk->src->shared || - disk->cachemode == VIR_DOMAIN_DISK_CACHE_DISABLE || - disk->cachemode == VIR_DOMAIN_DISK_CACHE_DIRECTSYNC) + disk->src->readonly) continue;
- /* disks which are migrated by qemu are safe too */ + /* Disks which are migrated by qemu are safe too. */ if (storagemigration && qemuMigrationAnyCopyDisk(disk, nmigrate_disks, migrate_disks)) continue;
+ /* However, disks on local FS (e.g. ext4) are not safe. */ if (virDomainDiskGetType(disk) == VIR_STORAGE_TYPE_FILE) { - if ((rc = virFileIsSharedFS(src)) < 0) + if ((rc = virFileIsSharedFS(src)) < 0) { return false; - else if (rc == 0) - continue; + } else if (rc == 0) { + virReportError(VIR_ERR_MIGRATE_UNSAFE, "%s", + _("Migration without shared storage is unsafe")); + return false; + } if ((rc = virStorageFileIsClusterFS(src)) < 0) return false; else if (rc == 1) @@ -1243,6 +1244,13 @@ qemuMigrationSrcIsSafe(virDomainDefPtr def, continue; }
+ /* Our code elsewhere guarantees shared disks are either readonly (in + * which case cache mode doesn't matter) or used with cache=none or used with cache=directsync */ + if (disk->src->shared || + disk->cachemode == VIR_DOMAIN_DISK_CACHE_DISABLE || + disk->cachemode == VIR_DOMAIN_DISK_CACHE_DIRECTSYNC) + continue; + virReportError(VIR_ERR_MIGRATE_UNSAFE, "%s", _("Migration may lead to data corruption if disks" " use cache != none or cache != directsync"));
A little hard to follow but the key difference is that the old code would immediately return saying it is "safe" if the cache mode was set to a reasonable value, and not get to checking whether disk is local or not. We've essentially reversed that so we detect unsafe scenarios first, and then later check safe options. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 1923c8e35..57c06c7c1 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -5909,11 +5909,11 @@ qemuProcessLaunch(virConnectPtr conn, int *nicindexes = NULL; size_t i; - VIR_DEBUG("vm=%p name=%s id=%d asyncJob=%d " + VIR_DEBUG("conn=%p driver=%p vm=%p name=%s if=%d asyncJob=%d " "incoming.launchURI=%s incoming.deferredURI=%s " "incoming.fd=%d incoming.path=%s " "snapshot=%p vmop=%d flags=0x%x", - vm, vm->def->name, vm->def->id, asyncJob, + conn, driver, vm, vm->def->name, vm->def->id, asyncJob, NULLSTR(incoming ? incoming->launchURI : NULL), NULLSTR(incoming ? incoming->deferredURI : NULL), incoming ? incoming->fd : -1, -- 2.16.1

On Mon, Feb 26, 2018 at 11:05:50AM +0100, Michal Privoznik wrote:
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 1923c8e35..57c06c7c1 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -5909,11 +5909,11 @@ qemuProcessLaunch(virConnectPtr conn, int *nicindexes = NULL; size_t i;
- VIR_DEBUG("vm=%p name=%s id=%d asyncJob=%d " + VIR_DEBUG("conn=%p driver=%p vm=%p name=%s if=%d asyncJob=%d " "incoming.launchURI=%s incoming.deferredURI=%s " "incoming.fd=%d incoming.path=%s " "snapshot=%p vmop=%d flags=0x%x", - vm, vm->def->name, vm->def->id, asyncJob, + conn, driver, vm, vm->def->name, vm->def->id, asyncJob, NULLSTR(incoming ? incoming->launchURI : NULL), NULLSTR(incoming ? incoming->deferredURI : NULL), incoming ? incoming->fd : -1,
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

This partially reverts 82592551cb8c4112cfa2264d50b8dce5349533d5. When migrating a domain, qemuMigrationDstPrepareAny() is called which eventually calls qemuProcessLaunch(conn = NULL, flags = VIR_QEMU_PROCESS_START_AUTODESTROY); But the very first thing that qemuProcessLaunch does is check if AUTODESTROY flag is set and @conn is not NULL. Well, it is. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 12 ++++++------ src/qemu/qemu_migration.c | 17 ++++++++++------- src/qemu/qemu_migration.h | 2 ++ 3 files changed, 18 insertions(+), 13 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 313d730c7..96454c17c 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -12138,7 +12138,7 @@ qemuDomainMigratePrepareTunnel(virConnectPtr dconn, if (virDomainMigratePrepareTunnelEnsureACL(dconn, def) < 0) goto cleanup; - ret = qemuMigrationDstPrepareTunnel(driver, + ret = qemuMigrationDstPrepareTunnel(driver, dconn, NULL, 0, NULL, NULL, /* No cookies in v2 */ st, &def, origname, flags); @@ -12201,7 +12201,7 @@ qemuDomainMigratePrepare2(virConnectPtr dconn, * length was not sufficiently large, causing failures * migrating between old & new libvirtd */ - ret = qemuMigrationDstPrepareDirect(driver, + ret = qemuMigrationDstPrepareDirect(driver, dconn, NULL, 0, NULL, NULL, /* No cookies */ uri_in, uri_out, &def, origname, NULL, 0, NULL, 0, @@ -12439,7 +12439,7 @@ qemuDomainMigratePrepare3(virConnectPtr dconn, if (virDomainMigratePrepare3EnsureACL(dconn, def) < 0) goto cleanup; - ret = qemuMigrationDstPrepareDirect(driver, + ret = qemuMigrationDstPrepareDirect(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, uri_in, uri_out, @@ -12525,7 +12525,7 @@ qemuDomainMigratePrepare3Params(virConnectPtr dconn, if (virDomainMigratePrepare3ParamsEnsureACL(dconn, def) < 0) goto cleanup; - ret = qemuMigrationDstPrepareDirect(driver, + ret = qemuMigrationDstPrepareDirect(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, uri_in, uri_out, @@ -12574,7 +12574,7 @@ qemuDomainMigratePrepareTunnel3(virConnectPtr dconn, if (virDomainMigratePrepareTunnel3EnsureACL(dconn, def) < 0) goto cleanup; - ret = qemuMigrationDstPrepareTunnel(driver, + ret = qemuMigrationDstPrepareTunnel(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, st, &def, origname, flags); @@ -12627,7 +12627,7 @@ qemuDomainMigratePrepareTunnel3Params(virConnectPtr dconn, if (virDomainMigratePrepareTunnel3ParamsEnsureACL(dconn, def) < 0) goto cleanup; - ret = qemuMigrationDstPrepareTunnel(driver, + ret = qemuMigrationDstPrepareTunnel(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, st, &def, origname, flags); diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index e98b1e4ce..bf89e184e 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2600,6 +2600,7 @@ qemuMigrationParamsResetTLS(virQEMUDriverPtr driver, static int qemuMigrationDstPrepareAny(virQEMUDriverPtr driver, + virConnectPtr dconn, const char *cookiein, int cookieinlen, char **cookieout, @@ -2809,7 +2810,7 @@ qemuMigrationDstPrepareAny(virQEMUDriverPtr driver, if (qemuProcessPrepareHost(driver, vm, startFlags) < 0) goto stopjob; - rv = qemuProcessLaunch(NULL, driver, vm, QEMU_ASYNC_JOB_MIGRATION_IN, + rv = qemuProcessLaunch(dconn, driver, vm, QEMU_ASYNC_JOB_MIGRATION_IN, incoming, NULL, VIR_NETDEV_VPORT_PROFILE_OP_MIGRATE_IN_START, startFlags); @@ -2993,6 +2994,7 @@ qemuMigrationDstPrepareAny(virQEMUDriverPtr driver, */ int qemuMigrationDstPrepareTunnel(virQEMUDriverPtr driver, + virConnectPtr dconn, const char *cookiein, int cookieinlen, char **cookieout, @@ -3005,10 +3007,10 @@ qemuMigrationDstPrepareTunnel(virQEMUDriverPtr driver, qemuMigrationCompressionPtr compression = NULL; int ret; - VIR_DEBUG("driver=%p, cookiein=%s, cookieinlen=%d, " + VIR_DEBUG("driver=%p, dconn=%p, cookiein=%s, cookieinlen=%d, " "cookieout=%p, cookieoutlen=%p, st=%p, def=%p, " "origname=%s, flags=0x%lx", - driver, NULLSTR(cookiein), cookieinlen, + driver, dconn, NULLSTR(cookiein), cookieinlen, cookieout, cookieoutlen, st, *def, origname, flags); if (st == NULL) { @@ -3020,7 +3022,7 @@ qemuMigrationDstPrepareTunnel(virQEMUDriverPtr driver, if (!(compression = qemuMigrationAnyCompressionParse(NULL, 0, flags))) return -1; - ret = qemuMigrationDstPrepareAny(driver, cookiein, cookieinlen, + ret = qemuMigrationDstPrepareAny(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, def, origname, st, NULL, 0, false, NULL, 0, NULL, 0, compression, flags); @@ -3054,6 +3056,7 @@ qemuMigrationAnyParseURI(const char *uri, bool *wellFormed) int qemuMigrationDstPrepareDirect(virQEMUDriverPtr driver, + virConnectPtr dconn, const char *cookiein, int cookieinlen, char **cookieout, @@ -3077,11 +3080,11 @@ qemuMigrationDstPrepareDirect(virQEMUDriverPtr driver, virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); const char *migrateHost = cfg->migrateHost; - VIR_DEBUG("driver=%p, cookiein=%s, cookieinlen=%d, " + VIR_DEBUG("driver=%p, dconn=%p, cookiein=%s, cookieinlen=%d, " "cookieout=%p, cookieoutlen=%p, uri_in=%s, uri_out=%p, " "def=%p, origname=%s, listenAddress=%s, " "nmigrate_disks=%zu, migrate_disks=%p, nbdPort=%d, flags=0x%lx", - driver, NULLSTR(cookiein), cookieinlen, + driver, dconn, NULLSTR(cookiein), cookieinlen, cookieout, cookieoutlen, NULLSTR(uri_in), uri_out, *def, origname, NULLSTR(listenAddress), nmigrate_disks, migrate_disks, nbdPort, flags); @@ -3185,7 +3188,7 @@ qemuMigrationDstPrepareDirect(virQEMUDriverPtr driver, if (*uri_out) VIR_DEBUG("Generated uri_out=%s", *uri_out); - ret = qemuMigrationDstPrepareAny(driver, cookiein, cookieinlen, + ret = qemuMigrationDstPrepareAny(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, def, origname, NULL, uri ? uri->scheme : "tcp", port, autoPort, listenAddress, diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index ecb176e06..a075aec12 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -170,6 +170,7 @@ qemuMigrationAnyPrepareDef(virQEMUDriverPtr driver, int qemuMigrationDstPrepareTunnel(virQEMUDriverPtr driver, + virConnectPtr dconn, const char *cookiein, int cookieinlen, char **cookieout, @@ -181,6 +182,7 @@ qemuMigrationDstPrepareTunnel(virQEMUDriverPtr driver, int qemuMigrationDstPrepareDirect(virQEMUDriverPtr driver, + virConnectPtr dconn, const char *cookiein, int cookieinlen, char **cookieout, -- 2.16.1

On Mon, Feb 26, 2018 at 11:05:51AM +0100, Michal Privoznik wrote:
This partially reverts 82592551cb8c4112cfa2264d50b8dce5349533d5.
When migrating a domain, qemuMigrationDstPrepareAny() is called which eventually calls qemuProcessLaunch(conn = NULL, flags = VIR_QEMU_PROCESS_START_AUTODESTROY); But the very first thing that qemuProcessLaunch does is check if AUTODESTROY flag is set and @conn is not NULL. Well, it is.
Doh, I was over zealous removing it then :-)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 12 ++++++------ src/qemu/qemu_migration.c | 17 ++++++++++------- src/qemu/qemu_migration.h | 2 ++ 3 files changed, 18 insertions(+), 13 deletions(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (2)
-
Daniel P. Berrangé
-
Michal Privoznik