[libvirt PATCH 0/3] Introduce VIR_MIGRATE_ASSUME_SHARED_STORAGE

This was initially motivated by a KubeVirt issue[1] concerning integration with the Portworx storage provide, but it turns out to be more generally applicable: since mounting an NFS share on the same host that is exporting it is known to cause issues and is therefore not recommended, we need a way to allow migration in such a configuration while still not going quite as far as VIR_MIGRATE_UNSAFE does and losing all handrails. [1] https://issues.redhat.com/browse/CNV-34322 Andrea Bolognani (3): include: Introduce VIR_MIGRATE_ASSUME_SHARED_STORAGE qemu: Implement VIR_MIGRATE_ASSUME_SHARED_STORAGE support virsh: Wire up VIR_MIGRATE_ASSUME_SHARED_STORAGE support docs/manpages/virsh.rst | 5 ++++- include/libvirt/libvirt-domain.h | 14 ++++++++++++++ src/qemu/qemu_migration.c | 5 +++++ src/qemu/qemu_migration.h | 1 + tools/virsh-domain.c | 5 +++++ 5 files changed, 29 insertions(+), 1 deletion(-) -- 2.41.0

Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- include/libvirt/libvirt-domain.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index a1902546bb..37c55c5ae8 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1100,6 +1100,20 @@ typedef enum { * Since: 8.5.0 */ VIR_MIGRATE_ZEROCOPY = (1 << 20), + + /* Assume that the storage used for VM disks is shared even when + * it appears to be local storage. + * + * This can be the case, for example, when a directory is mounted + * using NFS on the destination host but is bind-mounted on the + * source host, because the latter happens to be the host that + * storage is physically attached to. + * + * Effectively a subset of VIR_MIGRATE_UNSAFE. + * + * Since: 9.10.0 + */ + VIR_MIGRATE_ASSUME_SHARED_STORAGE = (1 << 21), } virDomainMigrateFlags; -- 2.41.0

On Tue, Oct 31, 2023 at 18:12:58 +0100, Andrea Bolognani wrote:
Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- include/libvirt/libvirt-domain.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index a1902546bb..37c55c5ae8 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1100,6 +1100,20 @@ typedef enum { * Since: 8.5.0 */ VIR_MIGRATE_ZEROCOPY = (1 << 20), + + /* Assume that the storage used for VM disks is shared even when + * it appears to be local storage.
I'll try to come up with a safety label to put here as this has the potential to be a very shiny tempting foot-gun for anyone getting the warning about migration being unsafe.
+ * + * This can be the case, for example, when a directory is mounted + * using NFS on the destination host but is bind-mounted on the + * source host, because the latter happens to be the host that + * storage is physically attached to.
Technically it doesn't even need to be bind mounted, you can simply use directly what NFS is exporting. I presume bind-mount is what CNV does in this case, but please document the common one.
+ * + * Effectively a subset of VIR_MIGRATE_UNSAFE. + * + * Since: 9.10.0 + */ + VIR_MIGRATE_ASSUME_SHARED_STORAGE = (1 << 21), } virDomainMigrateFlags;
-- 2.41.0

Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/qemu/qemu_migration.c | 5 +++++ src/qemu/qemu_migration.h | 1 + 2 files changed, 6 insertions(+) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index ac58aa1a8c..bf6f1de310 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1600,6 +1600,11 @@ qemuMigrationSrcIsSafe(virDomainDef *def, if ((rc = virFileIsSharedFS(src)) < 0) { return false; } else if (rc == 0) { + /* Ignore the outcome of this check if we've been + * asked to assume that storage is shared */ + if (flags & VIR_MIGRATE_ASSUME_SHARED_STORAGE) + break; + unsafe = true; } if ((rc = virFileIsClusterFS(src)) < 0) diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index ed62fd4a91..c21417084a 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -62,6 +62,7 @@ VIR_MIGRATE_NON_SHARED_SYNCHRONOUS_WRITES | \ VIR_MIGRATE_POSTCOPY_RESUME | \ VIR_MIGRATE_ZEROCOPY | \ + VIR_MIGRATE_ASSUME_SHARED_STORAGE | \ 0) /* All supported migration parameters and their types. */ -- 2.41.0

On Tue, Oct 31, 2023 at 18:12:59 +0100, Andrea Bolognani wrote:
Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/qemu/qemu_migration.c | 5 +++++ src/qemu/qemu_migration.h | 1 + 2 files changed, 6 insertions(+)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index ac58aa1a8c..bf6f1de310 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1600,6 +1600,11 @@ qemuMigrationSrcIsSafe(virDomainDef *def, if ((rc = virFileIsSharedFS(src)) < 0) { return false; } else if (rc == 0) { + /* Ignore the outcome of this check if we've been + * asked to assume that storage is shared */
What purpose does it serve to do the check if we're about to ignore it? Just assume it's shared right at the beginning.

On Tue, Oct 31, 2023 at 06:12:59PM +0100, Andrea Bolognani wrote:
Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- src/qemu/qemu_migration.c | 5 +++++ src/qemu/qemu_migration.h | 1 + 2 files changed, 6 insertions(+)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index ac58aa1a8c..bf6f1de310 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1600,6 +1600,11 @@ qemuMigrationSrcIsSafe(virDomainDef *def, if ((rc = virFileIsSharedFS(src)) < 0) { return false; } else if (rc == 0) { + /* Ignore the outcome of this check if we've been + * asked to assume that storage is shared */ + if (flags & VIR_MIGRATE_ASSUME_SHARED_STORAGE) + break; + unsafe = true; } if ((rc = virFileIsClusterFS(src)) < 0)
So this is essentially telling this particular bit of code to ignore what virFileIsSharedFS reported. The problem this is not the only piece of code relevant to live migration that cares about the virFileIsSharedFS answer. The TPM code also checks virFileIsSharedFS in the context of migration. Also in the security manager, we check virFileIsSharedFS and skip restoring ownership on the src host, as that would kill access on the dst host. So while VIR_MIGRATE_ASSUME_SHARED_STORAGE may work in the precise scenario you've tested, it looks like an incomplete solution to the described problem, and liable to cause new problems. Of course FORCE is no better, because that'll share all the same problems with code not behaving correctly due to not detecting a shared fs. The new flag though is promoting itself as a safer variant of FORCE, and that is giving apps a misleading impression of what its actually able todo. I feel like the root problem here needs addressing in the virFileIsSharedFS method in some manner. Whether or not a given location is shared between two hosts cannot be reliably determined from stat() alone, we need some other inputs. Maybe this is as simple as having a qemu.conf parameter shared_storage_paths = [ "/some/dir", "/other/dir"....] and then we register those paths via virFile during startup. Extending virFileIsSharedFS to work correctly in special deployment scenarios can now be done by the admin when configuring the host. Alternatively, we could make it potentially "just work" for NFS at least by reading the "/etc/exports" file and identfiying which local paths have been exported to other hosts. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, Nov 01, 2023 at 12:52:15PM +0000, Daniel P. Berrangé wrote:
On Tue, Oct 31, 2023 at 06:12:59PM +0100, Andrea Bolognani wrote:
if ((rc = virFileIsSharedFS(src)) < 0) { return false; } else if (rc == 0) { + /* Ignore the outcome of this check if we've been + * asked to assume that storage is shared */ + if (flags & VIR_MIGRATE_ASSUME_SHARED_STORAGE) + break;
So this is essentially telling this particular bit of code to ignore what virFileIsSharedFS reported.
The problem this is not the only piece of code relevant to live migration that cares about the virFileIsSharedFS answer. The TPM code also checks virFileIsSharedFS in the context of migration.
Also in the security manager, we check virFileIsSharedFS and skip restoring ownership on the src host, as that would kill access on the dst host.
So while VIR_MIGRATE_ASSUME_SHARED_STORAGE may work in the precise scenario you've tested, it looks like an incomplete solution to the described problem, and liable to cause new problems.
That's excellent insight, thanks!
I feel like the root problem here needs addressing in the virFileIsSharedFS method in some manner. Whether or not a given location is shared between two hosts cannot be reliably determined from stat() alone, we need some other inputs.
Maybe this is as simple as having a qemu.conf parameter
shared_storage_paths = [ "/some/dir", "/other/dir"....]
and then we register those paths via virFile during startup.
Extending virFileIsSharedFS to work correctly in special deployment scenarios can now be done by the admin when configuring the host.
Yes, this sounds like a much more reasonably sized hammer. I'll look into it. One concern that I have is that if this is configured statically at the daemon level, i.e. not fully under control of the management app, there might be scenarios where things don't work as expected. For example, if /some/dir is exported for hostA only, then trying to migrate towards hostB will be allowed even though it's clearly not going to work. Same for the scenario in which you're trying to migrate towards hostA but the mount is not active on the remote side for whatever reason. Would the migration parameter mechanism perhaps be a better fit? As in, the management app would pass a list of paths only after performing all the relevant checks, with the advantage of having the full information about the migration at its disposal. -- Andrea Bolognani / Red Hat / Virtualization

On Mon, Nov 13, 2023 at 09:03:53AM -0800, Andrea Bolognani wrote:
On Wed, Nov 01, 2023 at 12:52:15PM +0000, Daniel P. Berrangé wrote:
On Tue, Oct 31, 2023 at 06:12:59PM +0100, Andrea Bolognani wrote:
if ((rc = virFileIsSharedFS(src)) < 0) { return false; } else if (rc == 0) { + /* Ignore the outcome of this check if we've been + * asked to assume that storage is shared */ + if (flags & VIR_MIGRATE_ASSUME_SHARED_STORAGE) + break;
So this is essentially telling this particular bit of code to ignore what virFileIsSharedFS reported.
The problem this is not the only piece of code relevant to live migration that cares about the virFileIsSharedFS answer. The TPM code also checks virFileIsSharedFS in the context of migration.
Also in the security manager, we check virFileIsSharedFS and skip restoring ownership on the src host, as that would kill access on the dst host.
So while VIR_MIGRATE_ASSUME_SHARED_STORAGE may work in the precise scenario you've tested, it looks like an incomplete solution to the described problem, and liable to cause new problems.
That's excellent insight, thanks!
I feel like the root problem here needs addressing in the virFileIsSharedFS method in some manner. Whether or not a given location is shared between two hosts cannot be reliably determined from stat() alone, we need some other inputs.
Maybe this is as simple as having a qemu.conf parameter
shared_storage_paths = [ "/some/dir", "/other/dir"....]
and then we register those paths via virFile during startup.
Extending virFileIsSharedFS to work correctly in special deployment scenarios can now be done by the admin when configuring the host.
Yes, this sounds like a much more reasonably sized hammer. I'll look into it.
One concern that I have is that if this is configured statically at the daemon level, i.e. not fully under control of the management app, there might be scenarios where things don't work as expected.
For example, if /some/dir is exported for hostA only, then trying to migrate towards hostB will be allowed even though it's clearly not going to work. Same for the scenario in which you're trying to migrate towards hostA but the mount is not active on the remote side for whatever reason.
Don't we do the shared storage check on both hosts ? ie, if both hostA and hostB libvirts' check /some/dir, then it will see that hostB's /some/dir is not an NFS volume. Of course both hostA and hostB could have an /etc/exports for their respective /some/dir. At some point we have to decide how much protection we want to try to give. Even today's shared storage check is not foolproof. Ie we check both src & target are shared storage, but we don't check they're actually the same shared storage.
Would the migration parameter mechanism perhaps be a better fit? As in, the management app would pass a list of paths only after performing all the relevant checks, with the advantage of having the full information about the migration at its disposal.
Mgmt apps don't neccessarily know any better than libvirt what config the admin has done on the host OS wrt NFS mounts. We're just as liable to have mgmt apps blindly pass in a set of paths just to bypass the libvirt check. Making this a host admin config task gets us alignment with what the admin has actually setup, as well as what a mgmt app might have done. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- docs/manpages/virsh.rst | 5 ++++- tools/virsh-domain.c | 5 +++++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index 3e7a4c6c22..849933fa2e 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -3382,6 +3382,7 @@ migrate [--parallel [--parallel-connections connections]] [--bandwidth bandwidth] [--tls-destination hostname] [--disks-uri URI] [--copy-storage-synchronous-writes] + [--assume-shared-storage] Migrate domain to another host. Add *--live* for live migration; <--p2p> for peer-2-peer migration; *--direct* for direct migration; or *--tunnelled* @@ -3461,7 +3462,9 @@ considered unsafe. For QEMU domain, this may happen if the domain uses disks without explicitly setting cache mode to "none". Migrating such domains is unsafe unless the disk images are stored on coherent clustered filesystem, such as GFS2 or GPFS. If you are sure the migration is safe or you just do not -care, use *--unsafe* to force the migration. +care, use *--unsafe* to force the migration. *--assume-shared-storage* is a +weaker version of *--unsafe* which allows migration if the source disk is +detected as local while retaining all other checks. *dname* is used for renaming the domain to new name during migration, which also usually can be omitted. Likewise, *--xml* ``file`` is usually diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 66f933dead..cccf079c34 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -11128,6 +11128,10 @@ static const vshCmdOptDef opts_migrate[] = { .type = VSH_OT_INT, .help = N_("compress level for zstd compression") }, + {.name = "assume-shared-storage", + .type = VSH_OT_BOOL, + .help = N_("assume that disk storage is shared even if libvirt detects it as local") + }, {.name = NULL} }; @@ -11177,6 +11181,7 @@ doMigrate(void *opaque) { "tls", VIR_MIGRATE_TLS }, { "parallel", VIR_MIGRATE_PARALLEL }, { "suspend", VIR_MIGRATE_PAUSED }, + { "assume-shared-storage", VIR_MIGRATE_ASSUME_SHARED_STORAGE }, }; #ifndef WIN32 -- 2.41.0

On Tue, Oct 31, 2023 at 18:13:00 +0100, Andrea Bolognani wrote:
Signed-off-by: Andrea Bolognani <abologna@redhat.com> --- docs/manpages/virsh.rst | 5 ++++- tools/virsh-domain.c | 5 +++++ 2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index 3e7a4c6c22..849933fa2e 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -3382,6 +3382,7 @@ migrate [--parallel [--parallel-connections connections]] [--bandwidth bandwidth] [--tls-destination hostname] [--disks-uri URI] [--copy-storage-synchronous-writes] + [--assume-shared-storage]
Migrate domain to another host. Add *--live* for live migration; <--p2p> for peer-2-peer migration; *--direct* for direct migration; or *--tunnelled* @@ -3461,7 +3462,9 @@ considered unsafe. For QEMU domain, this may happen if the domain uses disks without explicitly setting cache mode to "none". Migrating such domains is unsafe unless the disk images are stored on coherent clustered filesystem, such as GFS2 or GPFS. If you are sure the migration is safe or you just do not -care, use *--unsafe* to force the migration. +care, use *--unsafe* to force the migration. *--assume-shared-storage* is a +weaker version of *--unsafe* which allows migration if the source disk is +detected as local while retaining all other checks.
The docs here don't emphasisze just imply that the storage must be shared for the migration to work properly even if the check is bypassed.
*dname* is used for renaming the domain to new name during migration, which also usually can be omitted. Likewise, *--xml* ``file`` is usually diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 66f933dead..cccf079c34 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -11128,6 +11128,10 @@ static const vshCmdOptDef opts_migrate[] = { .type = VSH_OT_INT, .help = N_("compress level for zstd compression") }, + {.name = "assume-shared-storage", + .type = VSH_OT_BOOL, + .help = N_("assume that disk storage is shared even if libvirt detects it as local") + }, {.name = NULL} };

On Tue, Oct 31, 2023 at 18:12:57 +0100, Andrea Bolognani wrote:
This was initially motivated by a KubeVirt issue[1] concerning integration with the Portworx storage provide, but it turns out to be more generally applicable: since mounting an NFS share on the same host that is exporting it is known to cause issues and is therefore not recommended, we need a way to allow migration in such a configuration while still not going quite as far as VIR_MIGRATE_UNSAFE does and losing all handrails.
[1] https://issues.redhat.com/browse/CNV-34322
Andrea Bolognani (3): include: Introduce VIR_MIGRATE_ASSUME_SHARED_STORAGE qemu: Implement VIR_MIGRATE_ASSUME_SHARED_STORAGE support virsh: Wire up VIR_MIGRATE_ASSUME_SHARED_STORAGE support
Reviewed-by: Peter Krempa <pkrempa@redhat.com> but please add some wording being more explicit about the need for the storage to be still shared. While it's implied in all places users tend to ignore any non-obvious messaging.

On Wed, Nov 01, 2023 at 01:42:07PM +0100, Peter Krempa wrote:
On Tue, Oct 31, 2023 at 18:12:57 +0100, Andrea Bolognani wrote:
This was initially motivated by a KubeVirt issue[1] concerning integration with the Portworx storage provide, but it turns out to be more generally applicable: since mounting an NFS share on the same host that is exporting it is known to cause issues and is therefore not recommended, we need a way to allow migration in such a configuration while still not going quite as far as VIR_MIGRATE_UNSAFE does and losing all handrails.
[1] https://issues.redhat.com/browse/CNV-34322
Andrea Bolognani (3): include: Introduce VIR_MIGRATE_ASSUME_SHARED_STORAGE qemu: Implement VIR_MIGRATE_ASSUME_SHARED_STORAGE support virsh: Wire up VIR_MIGRATE_ASSUME_SHARED_STORAGE support
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
but please add some wording being more explicit about the need for the storage to be still shared. While it's implied in all places users tend to ignore any non-obvious messaging.
I don't think we should be adding this flag, but instead provide a way for an host admin to express to libvirt which local paths are exported to other hosts and also detect this from /etc/exports where available. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (3)
-
Andrea Bolognani
-
Daniel P. Berrangé
-
Peter Krempa