[PATCH v2 0/9] qemu: tpm: Add support for migration across shared storage

This series of patches adds support for migrating vTPMs across hosts whose storage has been set up to share the directory structure holding the state of the TPM (swtpm). A new migration flag VIR_MIGRATE_TPM_SHARED_STORAGE is added to enable this. This flag influences the management of the directory structure holding the TPM state, which for example is only removed when a domain is undefined and not when a VM is removed on the migration source host. Further, when shared storage is used, security labeling on the destination side is skipped, assuming that the labeling was already done on the source side.

I have tested this with an NFS setup where I had to turn SELinux off on the hosts, since the SELinux MLS range labeling is not supported by NFS.

Shared storage migration requires the (upcoming) swtpm v0.8.

   Stefan

Stefan Berger (9):
  util: Add parsing support for swtpm's cmdarg-migration capability
  qemu: Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for TPM migration
  qemu: tpm: Conditionally create storage on incoming migration
  qemu: tpm: Pass --migration option to swtpm if supported
  qemu: tpm: Avoid security labels on incoming migration with shared storage
  qemu: tpm: Require UNDEFINE_TPM to be set to remove TPM state
  qemu: tpm: Determine whether to remove TPM state during migration
  qemu: tpm: Enable migration with VIR_MIGRATE_TPM_SHARED_STORAGE
  virsh: Add support for --tpm-shared-storage flag for migration

 docs/manpages/virsh.rst          |  6 +++
 include/libvirt/libvirt-domain.h |  8 +++
 src/conf/domain_conf.h           |  1 +
 src/qemu/qemu_driver.c           |  4 +-
 src/qemu/qemu_extdevice.c        |  5 +-
 src/qemu/qemu_extdevice.h        |  3 +-
 src/qemu/qemu_migration.c        | 23 +++++++--
 src/qemu/qemu_migration.h        |  1 +
 src/qemu/qemu_process.c          | 10 ++--
 src/qemu/qemu_process.h          |  6 ++-
 src/qemu/qemu_saveimage.c        |  2 +-
 src/qemu/qemu_snapshot.c         |  4 +-
 src/qemu/qemu_tpm.c              | 87 ++++++++++++++++++++++++++++----
 src/qemu/qemu_tpm.h              | 24 ++++++++-
 src/util/virtpm.c                |  1 +
 src/util/virtpm.h                |  1 +
 tools/virsh-domain.c             |  7 +++
 17 files changed, 164 insertions(+), 29 deletions(-)

--
2.37.3

Add support for parsing swtpm 'cmdarg-migration' capability (since v0.8).

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 src/util/virtpm.c | 1 +
 src/util/virtpm.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/util/virtpm.c b/src/util/virtpm.c
index 91db0f31eb..19850de1c8 100644
--- a/src/util/virtpm.c
+++ b/src/util/virtpm.c
@@ -39,6 +39,7 @@ VIR_LOG_INIT("util.tpm");
 VIR_ENUM_IMPL(virTPMSwtpmFeature,
               VIR_TPM_SWTPM_FEATURE_LAST,
               "cmdarg-pwd-fd",
+              "cmdarg-migration",
 );
 
 VIR_ENUM_IMPL(virTPMSwtpmSetupFeature,
diff --git a/src/util/virtpm.h b/src/util/virtpm.h
index a873881b23..fb330effa8 100644
--- a/src/util/virtpm.h
+++ b/src/util/virtpm.h
@@ -30,6 +30,7 @@ bool virTPMHasSwtpm(void);
 
 typedef enum {
     VIR_TPM_SWTPM_FEATURE_CMDARG_PWD_FD,
+    VIR_TPM_SWTPM_FEATURE_CMDARG_MIGRATION,
 
     VIR_TPM_SWTPM_FEATURE_LAST
 } virTPMSwtpmFeature;
-- 
2.37.3
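Once parsed, the new capability is available through the same lookup used for the existing cmdarg-pwd-fd feature. A minimal sketch of the query (the wrapper function name is illustrative only; the virTPMSwtpmCapsGet() call is the one a later patch in this series relies on):

    #include "virtpm.h"

    static bool
    swtpmSupportsMigrationArg(void)
    {
        /* true when the installed swtpm advertises 'cmdarg-migration',
         * i.e. it understands the --migration option (swtpm >= v0.8) */
        return virTPMSwtpmCapsGet(VIR_TPM_SWTPM_FEATURE_CMDARG_MIGRATION);
    }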

Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for migrating a TPM across shared storage.
At this point do not support this flag in 'virsh', yet.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/libvirt/libvirt-domain.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index 8357aea797..110929039d 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -1098,6 +1098,14 @@ typedef enum {
      * Since: 8.5.0
      */
     VIR_MIGRATE_ZEROCOPY = (1 << 20),
+
+    /* Support TPM migration across hosts that have shared storage setup for
+     * the directory structure holding the state of TPMs. Typically this would
+     * mean that the directory /var/lib/libvirt/swtpm is shared.
+     *
+     * Since: 8.9.0
+     */
+    VIR_MIGRATE_TPM_SHARED_STORAGE = (1 << 21),
 } virDomainMigrateFlags;
-- 
2.37.3
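For illustration, a minimal sketch of how a management application would request this behaviour with the flag as proposed here (the destination URI is a placeholder and error handling is reduced to a message):

    #include <stdio.h>
    #include <libvirt/libvirt.h>
    #include <libvirt/virterror.h>

    /* 'dom' obtained earlier, e.g. via virDomainLookupByName();
     * the destination URI is a placeholder */
    static int
    migrateWithSharedTPM(virDomainPtr dom)
    {
        unsigned int flags = VIR_MIGRATE_LIVE |
                             VIR_MIGRATE_PEER2PEER |
                             VIR_MIGRATE_TPM_SHARED_STORAGE;

        if (virDomainMigrateToURI3(dom, "qemu+ssh://dst.example.com/system",
                                   NULL, 0, flags) < 0) {
            fprintf(stderr, "migration failed: %s\n", virGetLastErrorMessage());
            return -1;
        }
        return 0;
    }

The last patch in the series exposes the same flag in virsh as --tpm-shared-storage.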

On Wed, Oct 05, 2022 at 10:02:00AM -0400, Stefan Berger wrote:
Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for migrating a TPM across shared storage.
At this point do not support this flag in 'virsh', yet.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- include/libvirt/libvirt-domain.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 8357aea797..110929039d 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1098,6 +1098,14 @@ typedef enum { * Since: 8.5.0 */ VIR_MIGRATE_ZEROCOPY = (1 << 20), + + /* Support TPM migration across hosts that have shared storage setup for + * the directory structure holding the state of TPMs. Typically this would + * mean that the directory /var/lib/libvirt/swtpm is shared. + * + * Since: 8.9.0 + */ + VIR_MIGRATE_TPM_SHARED_STORAGE = (1 << 21),
Why do we need this flag at all? We don't require users to set any flag when dealing with shared storage for disks; we just make sure we detect shared storage and "do the right thing" with it.

Adding this flag introduces failure scenarios, such as the mgmt app not setting the flag but still having the TPM on shared storage, in which case we'd be liable to take incorrect action on labelling.

With regards,
Daniel

On 10/6/22 15:47, Daniel P. Berrangé wrote:
On Wed, Oct 05, 2022 at 10:02:00AM -0400, Stefan Berger wrote:
Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for migrating a TPM across shared storage.
At this point do not support this flag in 'virsh', yet.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- include/libvirt/libvirt-domain.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 8357aea797..110929039d 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1098,6 +1098,14 @@ typedef enum { * Since: 8.5.0 */ VIR_MIGRATE_ZEROCOPY = (1 << 20), + + /* Support TPM migration across hosts that have shared storage setup for + * the directory structure holding the state of TPMs. Typically this would + * mean that the directory /var/lib/libvirt/swtpm is shared. + * + * Since: 8.9.0 + */ + VIR_MIGRATE_TPM_SHARED_STORAGE = (1 << 21),
Why do we need this flag at all. We don't require users to set any flag when dealing with shared storage for disks, we just make sure we detect shared storage and "do the right thing" with it.
That could work, until the state is stored on a shared FS that is not actually shared across the migration hosts. But I guess our disk migration code would fail then too, wouldn't it?

Michal

On Thu, Oct 06, 2022 at 04:07:13PM +0200, Michal Prívozník wrote:
On 10/6/22 15:47, Daniel P. Berrangé wrote:
On Wed, Oct 05, 2022 at 10:02:00AM -0400, Stefan Berger wrote:
Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for migrating a TPM across shared storage.
At this point do not support this flag in 'virsh', yet.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- include/libvirt/libvirt-domain.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 8357aea797..110929039d 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1098,6 +1098,14 @@ typedef enum { * Since: 8.5.0 */ VIR_MIGRATE_ZEROCOPY = (1 << 20), + + /* Support TPM migration across hosts that have shared storage setup for + * the directory structure holding the state of TPMs. Typically this would + * mean that the directory /var/lib/libvirt/swtpm is shared. + * + * Since: 8.9.0 + */ + VIR_MIGRATE_TPM_SHARED_STORAGE = (1 << 21),
Why do we need this flag at all. We don't require users to set any flag when dealing with shared storage for disks, we just make sure we detect shared storage and "do the right thing" with it.
That could work. Until the state is stored on a shared FS but not shared across migration hosts. But I guess our disk migration code would fail then too, wouldn't it?
Exactly, if our existing code has been good enough for disks for the last NNN years, then it's good enough for TPM too. If someone finds a broken scenario, then we'll need to fix it for all cases, and that'll require something more general than a VIR_MIGRATE_TPM_SHARED_STORAGE flag.

It is basically akin to a "make it work=yes" setting, and actually introduces failure scenarios that would not otherwise exist, e.g. if someone sets VIR_MIGRATE_TPM_SHARED_STORAGE when the TPM is on local-only storage, or fails to set it when using shared storage.

Ergo, we should not add this flag to our public API.

With regards,
Daniel

On 10/14/22 11:28, Daniel P. Berrangé wrote:
On Thu, Oct 06, 2022 at 04:07:13PM +0200, Michal Prívozník wrote:
On 10/6/22 15:47, Daniel P. Berrangé wrote:
On Wed, Oct 05, 2022 at 10:02:00AM -0400, Stefan Berger wrote:
Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for migrating a TPM across shared storage.
At this point do not support this flag in 'virsh', yet.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- include/libvirt/libvirt-domain.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 8357aea797..110929039d 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1098,6 +1098,14 @@ typedef enum { * Since: 8.5.0 */ VIR_MIGRATE_ZEROCOPY = (1 << 20), + + /* Support TPM migration across hosts that have shared storage setup for + * the directory structure holding the state of TPMs. Typically this would + * mean that the directory /var/lib/libvirt/swtpm is shared. + * + * Since: 8.9.0 + */ + VIR_MIGRATE_TPM_SHARED_STORAGE = (1 << 21),
Why do we need this flag at all. We don't require users to set any flag when dealing with shared storage for disks, we just make sure we detect shared storage and "do the right thing" with it.
That could work. Until the state is stored on a shared FS but not shared across migration hosts. But I guess our disk migration code would fail then too, wouldn't it?
Exactly, if our existing code is good enough for disks for the last NNN years, then its good enough for TPM too. If someone finds a broken scenario, then we'll need to fix it for all cases, and that'll require something more general than a VIR_MIGRATE_TPM_SHARED_STORAGE flag.
It is basically akin to a "make it work=yes" setting, and actually introduces failure scenarios that would not otherwise exist. eg if someone sets VIR_MIGRATE_TPM_SHARED_STORAGE when the TPM is on local only storage, or fails to set it when using shared storage.
I was trying to find out how storage is being copied and whether there is any testing of whether disk storage is shared or not and how that's done. However, it seems there's an explicit hint coming from the user, encoded in the virsh command line options, about non-shared disk storage. Or maybe I didn't dig deep enough?

    if (vshCommandOptBool(cmd, "copy-storage-all"))
        flags |= VIR_MIGRATE_NON_SHARED_DISK;

    if (vshCommandOptBool(cmd, "copy-storage-inc"))
        flags |= VIR_MIGRATE_NON_SHARED_INC;

    if (vshCommandOptBool(cmd, "copy-storage-synchronous-writes")) {
        if (!(flags & VIR_MIGRATE_NON_SHARED_DISK) &&
            !(flags & VIR_MIGRATE_NON_SHARED_INC)) {
            vshError(ctl, "'--copy-storage-synchronous-writes' requires one of '--copy-storage-all', 'copy-storage-inc'");
            goto out;
        }
        flags |= VIR_MIGRATE_NON_SHARED_SYNCHRONOUS_WRITES;
    }

So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side:

a) a simple file and detect whether the file is at the destination
b) a file with either a name or content that only the source and destination libvirts would know at this point

b) is to prevent stale files from becoming indicators for shared storage.

If you have any suggestions on how to do it or what data to write into the test file, let me know. A hash over some shared knowledge would do the trick, I guess.

   Stefan
Ergo, we should not add this flag to our public API.
With regards, Daniel

On Mon, Oct 17, 2022 at 09:39:52AM -0400, Stefan Berger wrote:
On 10/14/22 11:28, Daniel P. Berrangé wrote:
On Thu, Oct 06, 2022 at 04:07:13PM +0200, Michal Prívozník wrote:
On 10/6/22 15:47, Daniel P. Berrangé wrote:
On Wed, Oct 05, 2022 at 10:02:00AM -0400, Stefan Berger wrote:
Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for migrating a TPM across shared storage.
At this point do not support this flag in 'virsh', yet.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- include/libvirt/libvirt-domain.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 8357aea797..110929039d 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1098,6 +1098,14 @@ typedef enum { * Since: 8.5.0 */ VIR_MIGRATE_ZEROCOPY = (1 << 20), + + /* Support TPM migration across hosts that have shared storage setup for + * the directory structure holding the state of TPMs. Typically this would + * mean that the directory /var/lib/libvirt/swtpm is shared. + * + * Since: 8.9.0 + */ + VIR_MIGRATE_TPM_SHARED_STORAGE = (1 << 21),
Why do we need this flag at all. We don't require users to set any flag when dealing with shared storage for disks, we just make sure we detect shared storage and "do the right thing" with it.
That could work. Until the state is stored on a shared FS but not shared across migration hosts. But I guess our disk migration code would fail then too, wouldn't it?
Exactly, if our existing code is good enough for disks for the last NNN years, then its good enough for TPM too. If someone finds a broken scenario, then we'll need to fix it for all cases, and that'll require something more general than a VIR_MIGRATE_TPM_SHARED_STORAGE flag.
It is basically akin to a "make it work=yes" setting, and actually introduces failure scenarios that would not otherwise exist. eg if someone sets VIR_MIGRATE_TPM_SHARED_STORAGE when the TPM is on local only storage, or fails to set it when using shared storage.
I was trying to find out how storage is being copied and whether there is any testing going on regarding whether storage for storage is shared or not and how that's done. However, it seems there's an explicit hint coming from the user encoded in the virsh command line options about non-shared storage for storage. Or maybe I not dig deep enough?
if (vshCommandOptBool(cmd, "copy-storage-all")) flags |= VIR_MIGRATE_NON_SHARED_DISK;
if (vshCommandOptBool(cmd, "copy-storage-inc")) flags |= VIR_MIGRATE_NON_SHARED_INC;
if (vshCommandOptBool(cmd, "copy-storage-synchronous-writes")) { if (!(flags & VIR_MIGRATE_NON_SHARED_DISK) && !(flags & VIR_MIGRATE_NON_SHARED_INC)) { vshError(ctl, "'--copy-storage-synchronous-writes' requires one of '--copy-storage-all', 'copy-storage-inc'"); goto out; } flags |= VIR_MIGRATE_NON_SHARED_SYNCHRONOUS_WRITES; }
The key is in qemuMigrationSrcIsSafe(), and how it determines if a migration is safe.

 * Disk on local storage, no flags => unsafe, migration error
 * Disk on local storage, VIR_MIGRATE_NON_SHARED_DISK => ok, copies disk storage
 * Disk on shared storage, no flags => safe
 * Disk on shared storage, VIR_MIGRATE_NON_SHARED_DISK => ok, but needlessly copies disk storage

The key helper methods are virFileIsSharedFS and virFileIsClusterFS which check the filesystem type for the path against a known list of shared/cluster FS.
So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side
a) a simple file and detect whether the file is at the destination b) a file with either a name or content that only the source and destination libvirts would know at this point
b) is to prevent stale files from becoming indicators for shared storage.
No, please use the virFileIsSharedFS/ClusterFS helpers.

With regards,
Daniel
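Applied to the TPM case, the analogous check would look at the swtpm state directory instead of the disk sources. A rough sketch only (the helper name is made up; virFileIsSharedFS() and the emulator storagepath field are the ones used elsewhere in this thread):

    #include "virfile.h"

    /* 1 = path sits on a known shared FS (NFS, GlusterFS, ...),
     * 0 = local storage, -1 = error looking up the filesystem type */
    static int
    qemuTPMStateIsOnSharedFS(virDomainTPMDef *tpm)
    {
        if (tpm->type != VIR_DOMAIN_TPM_TYPE_EMULATOR)
            return 0;

        return virFileIsSharedFS(tpm->data.emulator.storagepath);
    }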

On 10/17/22 09:48, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 09:39:52AM -0400, Stefan Berger wrote:
The key is in qemuMigrationSrcIsSafe(), and how it determines if a migration is safe.
* Disk on local storage, no flags => unsafe, migration error * Disk on local storage, VIR_MIGRATE_NON_SHARED_DISK => ok, copies disk storage * Disk on shared storage, no flags => safe * Disk on shared storage, VIR_MIGRATE_NON_SHARED_DISK => ok, but needlessly copies disk storage
The key helper methods are virFileIsSharedFS and virFileIsClusterFS which check the filesystem type for the path against a known list of shared/cluster FS.
So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side
a) a simple file and detect whether the file is at the destination b) a file with either a name or content that only the source and destination libvirts would know at this point
b) is to prevent stale files from becoming indicators for shared storage.
No, please use the virFileIsSharedFS/ClusterFS helpers.
I tried to use virFileIsSharedFS on the source and destination side of my NFS setup to see how they work. The issue here is that the NFS server side, which exports /var/lib/libvirt/swtpm and is also the migration source at some point, says this:

/var/lib/libvirt/swtpm/ecc221c4-6bb9-423f-ac31-72244fdbb1a1/tpm2 is shared: 0

the NFS client side then says this:

/var/lib/libvirt/swtpm/ecc221c4-6bb9-423f-ac31-72244fdbb1a1/tpm2 is shared: 1

The latter is correct, the former obviously not. Is this an illegal NFS setup or a shortcoming of the API? I suppose both sides should be able to run the API (and come up with the same result), rather than one side setting an additional migration flag when shared storage is found and that flag then appearing on the migration destination side, which would avoid having to run the API again there?

   Stefan

+#include "virfile.h"
+
+static inline void
+qemuTestSharedStorage(virQEMUDriver *driver, virDomainObj *vm)
+{
+    virDomainDef *def = vm->def;
+    size_t i;
+    int n;
+
+    for (i = 0; i < def->ntpms; i++) {
+        if (def->tpms[i]->type == VIR_DOMAIN_TPM_TYPE_EMULATOR) {
+            n = qemuExtTPMInitPaths(driver, def, (def->tpms[i]));
+            n = virFileIsSharedFS(def->tpms[i]->data.emulator.storagepath);
+            fprintf(stderr, "%s is shared: %d\n",
+                    def->tpms[i]->data.emulator.storagepath, n);
+        }
+    }
+}
+
With regards, Daniel

On 10/17/22 17:17, Stefan Berger wrote:
On 10/17/22 09:48, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 09:39:52AM -0400, Stefan Berger wrote:
The key is in qemuMigrationSrcIsSafe(), and how it determines if a migration is safe.
* Disk on local storage, no flags => unsafe, migration error * Disk on local storage, VIR_MIGRATE_NON_SHARED_DISK => ok, copies disk storage * Disk on shared storage, no flags => safe * Disk on shared storage, VIR_MIGRATE_NON_SHARED_DISK => ok, but needlessly copies disk storage
The key helper methods are virFileIsSharedFS and virFileIsClusterFS which check the filesystem type for the path against a known list of shared/cluster FS.
So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side
a) a simple file and detect whether the file is at the destination b) a file with either a name or content that only the source and destination libvirts would know at this point
b) is to prevent stale files from becoming indicators for shared storage.
No, please use the virFileIsSharedFS/ClusterFS helpers.
I tried to use virFileIsSharedFS on the source and destination side of my NFS setup to see how they work. The issue here is that the NFS server side, that exports /var/lib/libvirt/swtpm and is also the migration source at some point, says this:
/var/lib/libvirt/swtpm/ecc221c4-6bb9-423f-ac31-72244fdbb1a1/tpm2 is shared: 0
the NFS client side then says this:
/var/lib/libvirt/swtpm/ecc221c4-6bb9-423f-ac31-72244fdbb1a1/tpm2 is shared: 1
Are both sides mounted as NFS or just one of them? I think it's safe to assume that either both sides are on a shared FS or neither is. Otherwise I'm quite sure file locking won't work anyway. We can go even further and test whether shared_fs(source) == shared_fs(dst), and throw an error if they don't match.

And since I've misguided you on the way with the flag, I am willing to help you write patches. Please let me know if you need any help.

Michal
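A rough illustration of that source/destination comparison (purely a sketch: the tpmOnSharedFS cookie field and its transport do not exist in this series, and the placement inside the migration phases is only indicative):

    /* source side, Begin phase: record our view of the TPM state directory */
    mig->tpmOnSharedFS =
        virFileIsSharedFS(tpm->data.emulator.storagepath) == 1;

    /* destination side, Prepare phase: re-run the check and compare it with
     * what the source sent in the migration cookie */
    if (mig->tpmOnSharedFS !=
        (virFileIsSharedFS(tpm->data.emulator.storagepath) == 1)) {
        virReportError(VIR_ERR_OPERATION_FAILED, "%s",
                       _("TPM state is on shared storage on only one side of the migration"));
        goto error;
    }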

On 10/17/22 11:28, Michal Prívozník wrote:
On 10/17/22 17:17, Stefan Berger wrote:
On 10/17/22 09:48, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 09:39:52AM -0400, Stefan Berger wrote:
The key is in qemuMigrationSrcIsSafe(), and how it determines if a migration is safe.
* Disk on local storage, no flags => unsafe, migration error * Disk on local storage, VIR_MIGRATE_NON_SHARED_DISK => ok, copies disk storage * Disk on shared storage, no flags => safe * Disk on shared storage, VIR_MIGRATE_NON_SHARED_DISK => ok, but needlessly copies disk storage
The key helper methods are virFileIsSharedFS and virFileIsClusterFS which check the filesystem type for the path against a known list of shared/cluster FS.
So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side
a) a simple file and detect whether the file is at the destination b) a file with either a name or content that only the source and destination libvirts would know at this point
b) is to prevent stale files from becoming indicators for shared storage.
No, please use the virFileIsSharedFS/ClusterFS helpers.
I tried to use virFileIsSharedFS on the source and destination side of my NFS setup to see how they work. The issue here is that the NFS server side, that exports /var/lib/libvirt/swtpm and is also the migration source at some point, says this:
/var/lib/libvirt/swtpm/ecc221c4-6bb9-423f-ac31-72244fdbb1a1/tpm2 is shared: 0
the NFS client side then says this:
/var/lib/libvirt/swtpm/ecc221c4-6bb9-423f-ac31-72244fdbb1a1/tpm2 is shared: 1
Are both sides mounted as NFS or just one of them? I think it's safe to
It's just one of them that is the client; the other is the server exporting the file system. So it will be a matter of documentation on how to set up the shared filesystem. For NFS this means that both sides of the migration must be NFS clients, and not one side acting as the server exporting the filesystem to the client.
assume that either both sides are on a shared FS or none. Otherwise I'm quite sure file locking won't work anyways. We can go even further and
I saw an error about the locks on cleanup (due to missing code in the series to avoid security label cleanup on TPM stop) but otherwise it's working fine.
test whether shared_fs(source) == shared_fs(dst); and throw an error if it doesn't.
You mean we would set the (now non-public? or not allowed to be set by the user?) flag VIR_MIGRATE_TPM_SHARED_STORAGE on the source side if we determine shared storage, and again test on the destination side whether it thinks about shared storage (hopefully with the source host and not some other host -- presumably an excluded case) in the same way as the source host did?
And since I've misguided you on the way with flag I am willing to help you write patches. Please let me know if you need any help.
I am not sure at this point; probably you have a better feeling for how to do this correctly than I do. If we could run virFileIsSharedFS on both sides and get the same result, this would be easy, but I think it's not that simple. It looks like we need to determine VIR_MIGRATE_TPM_SHARED_STORAGE on the source side and pass it to the destination side and possibly test again there...

   Stefan
Michal

On Mon, Oct 17, 2022 at 11:17:56AM -0400, Stefan Berger wrote:
On 10/17/22 09:48, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 09:39:52AM -0400, Stefan Berger wrote:
The key is in qemuMigrationSrcIsSafe(), and how it determines if a migration is safe.
* Disk on local storage, no flags => unsafe, migration error * Disk on local storage, VIR_MIGRATE_NON_SHARED_DISK => ok, copies disk storage * Disk on shared storage, no flags => safe * Disk on shared storage, VIR_MIGRATE_NON_SHARED_DISK => ok, but needlessly copies disk storage
The key helper methods are virFileIsSharedFS and virFileIsClusterFS which check the filesystem type for the path against a known list of shared/cluster FS.
So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side
a) a simple file and detect whether the file is at the destination b) a file with either a name or content that only the source and destination libvirts would know at this point
b) is to prevent stale files from becoming indicators for shared storage.
No, please use the virFileIsSharedFS/ClusterFS helpers.
I tried to use virFileIsSharedFS on the source and destination side of my NFS setup to see how they work. The issue here is that the NFS server side, that exports /var/lib/libvirt/swtpm and is also the migration source at some point, says this:
We expect both sides to have the same storage configuration, i.e. both must be NFS. I'm pretty sure our code for handling disks does not work when you have a local FS on one side which is exported to the other side as NFS.

Conceptually that's not something someone would do in production, since then you make the target host dependent on the src compute host, which is poor for resilience.

With regards,
Daniel

On 10/18/22 04:15, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 11:17:56AM -0400, Stefan Berger wrote:
On 10/17/22 09:48, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 09:39:52AM -0400, Stefan Berger wrote:
The key is in qemuMigrationSrcIsSafe(), and how it determines if a migration is safe.
* Disk on local storage, no flags => unsafe, migration error * Disk on local storage, VIR_MIGRATE_NON_SHARED_DISK => ok, copies disk storage * Disk on shared storage, no flags => safe * Disk on shared storage, VIR_MIGRATE_NON_SHARED_DISK => ok, but needlessly copies disk storage
The key helper methods are virFileIsSharedFS and virFileIsClusterFS which check the filesystem type for the path against a known list of shared/cluster FS.
So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side
a) a simple file and detect whether the file is at the destination b) a file with either a name or content that only the source and destination libvirts would know at this point
b) is to prevent stale files from becoming indicators for shared storage.
No, please use the virFileIsSharedFS/ClusterFS helpers.
I tried to use virFileIsSharedFS on the source and destination side of my NFS setup to see how they work. The issue here is that the NFS server side, that exports /var/lib/libvirt/swtpm and is also the migration source at some point, says this:
We expect both sides to have the same storage configurtion. ie both must be NFS. I'm pretty sure our code for handling disks does not work when you have a local FS on one side, which is exported to the other side as NFS. Conceptally that's not something someone would do in production, since they you make the target host dependant on the src compute host, which is poor for resilience.
Ok, so then we can replace (accesses to) VIR_MIGRATE_TPM_SHARED_STORAGE with calls to virFileIsSharedFS() and simply assume that this function will return the same results on both sides of the migration (without testing it), and if it doesn't and failures occur, then it's an unsupported shared storage setup (that is documented somewhere). I hope not to receive reports from people that don't describe their setup appropriately but see some odd failures because of this.

   Stefan
With regards, Daniel

On 10/18/22 14:23, Stefan Berger wrote:
On 10/18/22 04:15, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 11:17:56AM -0400, Stefan Berger wrote:
On 10/17/22 09:48, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 09:39:52AM -0400, Stefan Berger wrote:
The key is in qemuMigrationSrcIsSafe(), and how it determines if a migration is safe.
* Disk on local storage, no flags => unsafe, migration error * Disk on local storage, VIR_MIGRATE_NON_SHARED_DISK => ok, copies disk storage * Disk on shared storage, no flags => safe * Disk on shared storage, VIR_MIGRATE_NON_SHARED_DISK => ok, but needlessly copies disk storage
The key helper methods are virFileIsSharedFS and virFileIsClusterFS which check the filesystem type for the path against a known list of shared/cluster FS.
So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side
a) a simple file and detect whether the file is at the destination b) a file with either a name or content that only the source and destination libvirts would know at this point
b) is to prevent stale files from becoming indicators for shared storage.
No, please use the virFileIsSharedFS/ClusterFS helpers.
I tried to use virFileIsSharedFS on the source and destination side of my NFS setup to see how they work. The issue here is that the NFS server side, that exports /var/lib/libvirt/swtpm and is also the migration source at some point, says this:
We expect both sides to have the same storage configurtion. ie both must be NFS. I'm pretty sure our code for handling disks does not work when you have a local FS on one side, which is exported to the other side as NFS. Conceptally that's not something someone would do in production, since they you make the target host dependant on the src compute host, which is poor for resilience.
Ok, so then we can replace (accesses to) VIR_MIGARTE_TPM_SHARED_STORAGE with calls to virFilesIsSharedFS() and simply assume that this function will return the same results on both sides of the migration (without testing it) and if it doesn't and failures occur then it's an unsupported shared storage setup (that is documented somewhere). I hope to not receive reports from ppl that don't describe their setup appropriately but see some odd failures because of this.
Just FYI, I'm testing the following patches:

https://gitlab.com/MichalPrivoznik/libvirt/-/commits/tpm_migration_mine_v1

There are still some parts missing, but I'm continuing to work on them.

Michal

On 10/18/22 09:47, Michal Prívozník wrote:
On 10/18/22 14:23, Stefan Berger wrote:
On 10/18/22 04:15, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 11:17:56AM -0400, Stefan Berger wrote:
On 10/17/22 09:48, Daniel P. Berrangé wrote:
On Mon, Oct 17, 2022 at 09:39:52AM -0400, Stefan Berger wrote:
The key is in qemuMigrationSrcIsSafe(), and how it determines if a migration is safe.
* Disk on local storage, no flags => unsafe, migration error * Disk on local storage, VIR_MIGRATE_NON_SHARED_DISK => ok, copies disk storage * Disk on shared storage, no flags => safe * Disk on shared storage, VIR_MIGRATE_NON_SHARED_DISK => ok, but needlessly copies disk storage
The key helper methods are virFileIsSharedFS and virFileIsClusterFS which check the filesystem type for the path against a known list of shared/cluster FS.
So we won't do it this way for TPM state migration. Instead we can try to write on the source migration side
a) a simple file and detect whether the file is at the destination b) a file with either a name or content that only the source and destination libvirts would know at this point
b) is to prevent stale files from becoming indicators for shared storage.
No, please use the virFileIsSharedFS/ClusterFS helpers.
I tried to use virFileIsSharedFS on the source and destination side of my NFS setup to see how they work. The issue here is that the NFS server side, that exports /var/lib/libvirt/swtpm and is also the migration source at some point, says this:
We expect both sides to have the same storage configurtion. ie both must be NFS. I'm pretty sure our code for handling disks does not work when you have a local FS on one side, which is exported to the other side as NFS. Conceptally that's not something someone would do in production, since they you make the target host dependant on the src compute host, which is poor for resilience.
Ok, so then we can replace (accesses to) VIR_MIGARTE_TPM_SHARED_STORAGE with calls to virFilesIsSharedFS() and simply assume that this function will return the same results on both sides of the migration (without testing it) and if it doesn't and failures occur then it's an unsupported shared storage setup (that is documented somewhere). I hope to not receive reports from ppl that don't describe their setup appropriately but see some odd failures because of this.
Just FYI, I'm testing the following patches:
https://gitlab.com/MichalPrivoznik/libvirt/-/commits/tpm_migration_mine_v1
There're still some parts missing. but I'm continuing to work on them.
Oh well, I just morphed my series to get rid of the flag: https://github.com/stefanberger/libvirt-tpm/tree/swtpm_shared_storage.v3 Stefan
Michal

Do not create storage if TPM_SHARED_STORAGE migration flag is set and on incoming migration since in this case the storage directory must already exist. Also do not run swtpm_setup in this case. Pass the migration flag from migration related functions all the way down to TPM related functions. If no migration flags exist on higher layers, pass down '0'. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/qemu/qemu_driver.c | 4 ++-- src/qemu/qemu_extdevice.c | 5 +++-- src/qemu/qemu_extdevice.h | 3 ++- src/qemu/qemu_migration.c | 2 +- src/qemu/qemu_process.c | 10 ++++++---- src/qemu/qemu_process.h | 6 ++++-- src/qemu/qemu_saveimage.c | 2 +- src/qemu/qemu_snapshot.c | 4 ++-- src/qemu/qemu_tpm.c | 27 +++++++++++++++++++++------ src/qemu/qemu_tpm.h | 3 ++- 10 files changed, 44 insertions(+), 22 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 40d23b5723..3f163a4664 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1633,7 +1633,7 @@ static virDomainPtr qemuDomainCreateXML(virConnectPtr conn, if (qemuProcessStart(conn, driver, vm, NULL, VIR_ASYNC_JOB_START, NULL, -1, NULL, NULL, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, - start_flags) < 0) { + start_flags, 0) < 0) { virDomainAuditStart(vm, "booted", false); qemuDomainRemoveInactive(driver, vm, 0); qemuProcessEndJob(vm); @@ -6555,7 +6555,7 @@ qemuDomainObjStart(virConnectPtr conn, ret = qemuProcessStart(conn, driver, vm, NULL, asyncJob, NULL, -1, NULL, NULL, - VIR_NETDEV_VPORT_PROFILE_OP_CREATE, start_flags); + VIR_NETDEV_VPORT_PROFILE_OP_CREATE, start_flags, 0); virDomainAuditStart(vm, "booted", ret >= 0); if (ret >= 0) { virObjectEvent *event = diff --git a/src/qemu/qemu_extdevice.c b/src/qemu/qemu_extdevice.c index 24a57b0f74..0bafe2b7b0 100644 --- a/src/qemu/qemu_extdevice.c +++ b/src/qemu/qemu_extdevice.c @@ -168,7 +168,8 @@ qemuExtDevicesCleanupHost(virQEMUDriver *driver, int qemuExtDevicesStart(virQEMUDriver *driver, virDomainObj *vm, - bool incomingMigration) + bool incomingMigration, + virDomainMigrateFlags flags) { virDomainDef *def = vm->def; size_t i; @@ -186,7 +187,7 @@ qemuExtDevicesStart(virQEMUDriver *driver, virDomainTPMDef *tpm = def->tpms[i]; if (tpm->type == VIR_DOMAIN_TPM_TYPE_EMULATOR && - qemuExtTPMStart(driver, vm, tpm, incomingMigration) < 0) + qemuExtTPMStart(driver, vm, tpm, incomingMigration, flags) < 0) return -1; } diff --git a/src/qemu/qemu_extdevice.h b/src/qemu/qemu_extdevice.h index 6b05b59cd6..723e21d42c 100644 --- a/src/qemu/qemu_extdevice.h +++ b/src/qemu/qemu_extdevice.h @@ -47,7 +47,8 @@ void qemuExtDevicesCleanupHost(virQEMUDriver *driver, int qemuExtDevicesStart(virQEMUDriver *driver, virDomainObj *vm, - bool incomingMigration) + bool incomingMigration, + virDomainMigrateFlags flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 33105cf07b..efb27a24aa 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3095,7 +3095,7 @@ qemuMigrationDstPrepareActive(virQEMUDriver *driver, rv = qemuProcessLaunch(dconn, driver, vm, VIR_ASYNC_JOB_MIGRATION_IN, incoming, NULL, VIR_NETDEV_VPORT_PROFILE_OP_MIGRATE_IN_START, - startFlags); + startFlags, flags); if (rv < 0) { if (rv == -2) relabel = true; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 97336e2622..f278b73858 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -7421,7 +7421,8 @@ qemuProcessLaunch(virConnectPtr conn, qemuProcessIncomingDef *incoming, 
virDomainMomentObj *snapshot, virNetDevVPortProfileOp vmop, - unsigned int flags) + unsigned int flags, + virDomainMigrateFlags migFlags) { int ret = -1; int rv; @@ -7485,7 +7486,7 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessGenID(vm, flags) < 0) goto cleanup; - if (qemuExtDevicesStart(driver, vm, incoming != NULL) < 0) + if (qemuExtDevicesStart(driver, vm, incoming != NULL, migFlags) < 0) goto cleanup; if (!(cmd = qemuBuildCommandLine(vm, @@ -7849,7 +7850,8 @@ qemuProcessStart(virConnectPtr conn, const char *migratePath, virDomainMomentObj *snapshot, virNetDevVPortProfileOp vmop, - unsigned int flags) + unsigned int flags, + virDomainMigrateFlags migFlags) { qemuDomainObjPrivate *priv = vm->privateData; qemuProcessIncomingDef *incoming = NULL; @@ -7901,7 +7903,7 @@ qemuProcessStart(virConnectPtr conn, } if ((rv = qemuProcessLaunch(conn, driver, vm, asyncJob, incoming, - snapshot, vmop, flags)) < 0) { + snapshot, vmop, flags, migFlags)) < 0) { if (rv == -2) relabel = true; goto stop; diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index 421efc6016..76fcbd56e6 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -88,7 +88,8 @@ int qemuProcessStart(virConnectPtr conn, const char *stdin_path, virDomainMomentObj *snapshot, virNetDevVPortProfileOp vmop, - unsigned int flags); + unsigned int flags, + virDomainMigrateFlags migFlags); int qemuProcessCreatePretendCmdPrepare(virQEMUDriver *driver, virDomainObj *vm, @@ -130,7 +131,8 @@ int qemuProcessLaunch(virConnectPtr conn, qemuProcessIncomingDef *incoming, virDomainMomentObj *snapshot, virNetDevVPortProfileOp vmop, - unsigned int flags); + unsigned int flags, + virDomainMigrateFlags migFlags); int qemuProcessFinishStartup(virQEMUDriver *driver, virDomainObj *vm, diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c index 79567bf17d..af2394f829 100644 --- a/src/qemu/qemu_saveimage.c +++ b/src/qemu/qemu_saveimage.c @@ -632,7 +632,7 @@ qemuSaveImageStartVM(virConnectPtr conn, if (qemuProcessStart(conn, driver, vm, cookie ? cookie->cpu : NULL, asyncJob, "stdio", *fd, path, NULL, VIR_NETDEV_VPORT_PROFILE_OP_RESTORE, - start_flags) == 0) + start_flags, 0) == 0) started = true; if (intermediatefd != -1) { diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c index 06b5c180ff..471e14e22c 100644 --- a/src/qemu/qemu_snapshot.c +++ b/src/qemu/qemu_snapshot.c @@ -1999,7 +1999,7 @@ qemuSnapshotRevertActive(virDomainObj *vm, cookie ? 
cookie->cpu : NULL, VIR_ASYNC_JOB_START, NULL, -1, NULL, snap, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, - start_flags); + start_flags, 0); virDomainAuditStart(vm, "from-snapshot", rc >= 0); detail = VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT; event = virDomainEventLifecycleNewFromObj(vm, @@ -2122,7 +2122,7 @@ qemuSnapshotRevertInactive(virDomainObj *vm, rc = qemuProcessStart(snapshot->domain->conn, driver, vm, NULL, VIR_ASYNC_JOB_START, NULL, -1, NULL, NULL, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, - start_flags); + start_flags, 0); virDomainAuditStart(vm, "from-snapshot", rc >= 0); if (rc < 0) { qemuDomainRemoveInactive(driver, vm, 0); diff --git a/src/qemu/qemu_tpm.c b/src/qemu/qemu_tpm.c index dc09c94a4d..07def3c840 100644 --- a/src/qemu/qemu_tpm.c +++ b/src/qemu/qemu_tpm.c @@ -536,6 +536,7 @@ qemuTPMEmulatorReconfigure(const char *storagepath, * @swtpm_user: The uid for the swtpm to run as (drop privileges to from root) * @swtpm_group: The gid for the swtpm to run as * @incomingMigration: whether we have an incoming migration + * @flags: migration flags * * Create the virCommand use for starting the emulator * Do some initializations on the way, such as creation of storage @@ -548,7 +549,8 @@ qemuTPMEmulatorBuildCommand(virDomainTPMDef *tpm, bool privileged, uid_t swtpm_user, gid_t swtpm_group, - bool incomingMigration) + bool incomingMigration, + virDomainMigrateFlags flags) { g_autoptr(virCommand) cmd = NULL; bool created = false; @@ -556,11 +558,19 @@ qemuTPMEmulatorBuildCommand(virDomainTPMDef *tpm, int pwdfile_fd = -1; int migpwdfile_fd = -1; const unsigned char *secretuuid = NULL; + bool create_storage = true; if (!swtpm) return NULL; - if (qemuTPMEmulatorCreateStorage(tpm, &created, swtpm_user, swtpm_group) < 0) + /* Do not create storage and run swtpm_setup on incoming migration over + * shared storage + */ + if (incomingMigration && (flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + create_storage = false; + + if (create_storage && + qemuTPMEmulatorCreateStorage(tpm, &created, swtpm_user, swtpm_group) < 0) return NULL; if (tpm->data.emulator.hassecretuuid) @@ -854,6 +864,7 @@ qemuExtTPMEmulatorSetupCgroup(const char *swtpmStateDir, * @tpm: TPM definition * @shortName: short and unique name of the domain * @incomingMigration: whether we have an incoming migration + * @flags: migration flags * * Start the external TPM Emulator: * - have the command line built @@ -864,7 +875,8 @@ qemuTPMEmulatorStart(virQEMUDriver *driver, virDomainObj *vm, const char *shortName, virDomainTPMDef *tpm, - bool incomingMigration) + bool incomingMigration, + virDomainMigrateFlags flags) { g_autoptr(virCommand) cmd = NULL; VIR_AUTOCLOSE errfd = -1; @@ -884,7 +896,8 @@ qemuTPMEmulatorStart(virQEMUDriver *driver, driver->privileged, cfg->swtpm_user, cfg->swtpm_group, - incomingMigration))) + incomingMigration, + flags))) return -1; if (qemuExtDeviceLogCommand(driver, vm, cmd, "TPM Emulator") < 0) @@ -1011,14 +1024,16 @@ int qemuExtTPMStart(virQEMUDriver *driver, virDomainObj *vm, virDomainTPMDef *tpm, - bool incomingMigration) + bool incomingMigration, + virDomainMigrateFlags flags) { g_autofree char *shortName = virDomainDefGetShortName(vm->def); if (!shortName) return -1; - return qemuTPMEmulatorStart(driver, vm, shortName, tpm, incomingMigration); + return qemuTPMEmulatorStart(driver, vm, shortName, tpm, incomingMigration, + flags); } diff --git a/src/qemu/qemu_tpm.h b/src/qemu/qemu_tpm.h index f068f3ca5a..410c9ec1c6 100644 --- a/src/qemu/qemu_tpm.h +++ b/src/qemu/qemu_tpm.h @@ -42,7 +42,8 @@ void 
qemuExtTPMCleanupHost(virDomainTPMDef *tpm, int qemuExtTPMStart(virQEMUDriver *driver, virDomainObj *vm, virDomainTPMDef *def, - bool incomingMigration) + bool incomingMigration, + virDomainMigrateFlags flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) G_GNUC_WARN_UNUSED_RESULT; -- 2.37.3

On 10/5/22 16:02, Stefan Berger wrote:
Do not create storage if TPM_SHARED_STORAGE migration flag is set and on incoming migration since in this case the storage directory must already exist. Also do not run swtpm_setup in this case.
Pass the migration flag from migration related functions all the way down to TPM related functions. If no migration flags exist on higher layers, pass down '0'.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/qemu/qemu_driver.c | 4 ++-- src/qemu/qemu_extdevice.c | 5 +++-- src/qemu/qemu_extdevice.h | 3 ++- src/qemu/qemu_migration.c | 2 +- src/qemu/qemu_process.c | 10 ++++++---- src/qemu/qemu_process.h | 6 ++++-- src/qemu/qemu_saveimage.c | 2 +- src/qemu/qemu_snapshot.c | 4 ++-- src/qemu/qemu_tpm.c | 27 +++++++++++++++++++++------ src/qemu/qemu_tpm.h | 3 ++- 10 files changed, 44 insertions(+), 22 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 40d23b5723..3f163a4664 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1633,7 +1633,7 @@ static virDomainPtr qemuDomainCreateXML(virConnectPtr conn, if (qemuProcessStart(conn, driver, vm, NULL, VIR_ASYNC_JOB_START, NULL, -1, NULL, NULL, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, - start_flags) < 0) { + start_flags, 0) < 0) { virDomainAuditStart(vm, "booted", false); qemuDomainRemoveInactive(driver, vm, 0); qemuProcessEndJob(vm); @@ -6555,7 +6555,7 @@ qemuDomainObjStart(virConnectPtr conn,
ret = qemuProcessStart(conn, driver, vm, NULL, asyncJob, NULL, -1, NULL, NULL, - VIR_NETDEV_VPORT_PROFILE_OP_CREATE, start_flags); + VIR_NETDEV_VPORT_PROFILE_OP_CREATE, start_flags, 0); virDomainAuditStart(vm, "booted", ret >= 0); if (ret >= 0) { virObjectEvent *event = diff --git a/src/qemu/qemu_extdevice.c b/src/qemu/qemu_extdevice.c index 24a57b0f74..0bafe2b7b0 100644 --- a/src/qemu/qemu_extdevice.c +++ b/src/qemu/qemu_extdevice.c @@ -168,7 +168,8 @@ qemuExtDevicesCleanupHost(virQEMUDriver *driver, int qemuExtDevicesStart(virQEMUDriver *driver, virDomainObj *vm, - bool incomingMigration) + bool incomingMigration, + virDomainMigrateFlags flags) { virDomainDef *def = vm->def; size_t i; @@ -186,7 +187,7 @@ qemuExtDevicesStart(virQEMUDriver *driver, virDomainTPMDef *tpm = def->tpms[i];
if (tpm->type == VIR_DOMAIN_TPM_TYPE_EMULATOR && - qemuExtTPMStart(driver, vm, tpm, incomingMigration) < 0) + qemuExtTPMStart(driver, vm, tpm, incomingMigration, flags) < 0) return -1; }
diff --git a/src/qemu/qemu_extdevice.h b/src/qemu/qemu_extdevice.h index 6b05b59cd6..723e21d42c 100644 --- a/src/qemu/qemu_extdevice.h +++ b/src/qemu/qemu_extdevice.h @@ -47,7 +47,8 @@ void qemuExtDevicesCleanupHost(virQEMUDriver *driver,
int qemuExtDevicesStart(virQEMUDriver *driver, virDomainObj *vm, - bool incomingMigration) + bool incomingMigration, + virDomainMigrateFlags flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT;
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 33105cf07b..efb27a24aa 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3095,7 +3095,7 @@ qemuMigrationDstPrepareActive(virQEMUDriver *driver, rv = qemuProcessLaunch(dconn, driver, vm, VIR_ASYNC_JOB_MIGRATION_IN, incoming, NULL, VIR_NETDEV_VPORT_PROFILE_OP_MIGRATE_IN_START, - startFlags); + startFlags, flags); if (rv < 0) { if (rv == -2) relabel = true; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 97336e2622..f278b73858 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -7421,7 +7421,8 @@ qemuProcessLaunch(virConnectPtr conn, qemuProcessIncomingDef *incoming, virDomainMomentObj *snapshot, virNetDevVPortProfileOp vmop, - unsigned int flags) + unsigned int flags, + virDomainMigrateFlags migFlags)
To avoid having to invent a new argument, we have @flags already, which is a mixture of internal and public flags. Then, each caller has logic that sets internal flags and possibly translates public ones. The same applies to qemuProcessStart().

Michal
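To illustrate the suggestion (VIR_QEMU_PROCESS_START_TPM_SHARED below is a hypothetical new bit next to the existing VIR_QEMU_PROCESS_START_* values; it is not part of this series): the migration destination could fold the public flag into the start flags it already passes down, instead of threading a separate migFlags argument through every caller:

    /* qemuMigrationDstPrepareActive(), simplified */
    unsigned int startFlags = VIR_QEMU_PROCESS_START_AUTODESTROY;

    if (flags & VIR_MIGRATE_TPM_SHARED_STORAGE)
        startFlags |= VIR_QEMU_PROCESS_START_TPM_SHARED;  /* hypothetical bit */

    rv = qemuProcessLaunch(dconn, driver, vm, VIR_ASYNC_JOB_MIGRATION_IN,
                           incoming, NULL,
                           VIR_NETDEV_VPORT_PROFILE_OP_MIGRATE_IN_START,
                           startFlags);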

Always pass the --migration option to swtpm, if swptm supports it (staring with v0.8). Always apply the 'release-lock-outgoing' parameter with this option and apply the 'incoming' parameter for incoming migration so that swtpm releases the file lock on the source side when the state is migrated and locks the file on the destination side when the state is received. If a started swtpm instance is capable of migrating with share storage then remember this with a flag in the virDomainTPMDef. This flag allows for modifications of the installed swtpm, such as installing a version that does not support migration with shared storage. Report an error if swtpm does not support the --migration option and an incoming migration across shared storage is requested. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/conf/domain_conf.h | 1 + src/qemu/qemu_migration.c | 8 ++++++++ src/qemu/qemu_tpm.c | 40 +++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_tpm.h | 3 +++ 4 files changed, 52 insertions(+) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 352b88eae5..4f9b5c6686 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1461,6 +1461,7 @@ struct _virDomainTPMDef { bool hassecretuuid; bool persistent_state; virBitmap *activePcrBanks; + bool canMigrateWithSharedStorage; } emulator; } data; }; diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index efb27a24aa..431b1b0bcb 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -38,6 +38,7 @@ #include "qemu_security.h" #include "qemu_slirp.h" #include "qemu_block.h" +#include "qemu_tpm.h" #include "domain_audit.h" #include "virlog.h" @@ -2789,6 +2790,13 @@ qemuMigrationSrcBegin(virConnectPtr conn, goto cleanup; } + if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE) && + !qemuTPMCanMigrateSharedStorage(vm->def)) { + virReportError(VIR_ERR_NO_SUPPORT, "%s", + _("the running swtpm does not support migration with shared storage")); + goto cleanup; + } + if (flags & VIR_MIGRATE_POSTCOPY_RESUME) { ret = qemuMigrationSrcBeginResumePhase(conn, driver, vm, xmlin, cookieout, cookieoutlen, flags); diff --git a/src/qemu/qemu_tpm.c b/src/qemu/qemu_tpm.c index 07def3c840..fde15b7587 100644 --- a/src/qemu/qemu_tpm.c +++ b/src/qemu/qemu_tpm.c @@ -644,6 +644,32 @@ qemuTPMEmulatorBuildCommand(virDomainTPMDef *tpm, virCommandAddArgFormat(cmd, "pwdfd=%d,mode=aes-256-cbc", migpwdfile_fd); } + /* If swtpm supports it, start it with --migration release-lock-outgoing + * so it can migrate across shared storage if needed. + */ + tpm->data.emulator.canMigrateWithSharedStorage = false; + if (virTPMSwtpmCapsGet(VIR_TPM_SWTPM_FEATURE_CMDARG_MIGRATION)) { + bool incoming = false; + + if (incomingMigration && (flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + incoming = true; + virCommandAddArg(cmd, "--migration"); + virCommandAddArgFormat(cmd, "release-lock-outgoing%s", + incoming ? ",incoming": ""); + tpm->data.emulator.canMigrateWithSharedStorage = true; + } else { + /* Report an error if there's an incoming migration across shared + * storage and swtpm does not support the --migration option. 
+ */ + if (incomingMigration && (flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, + _("%s (on destination side) does not support the --migration option " + "needed for migration with shared storage"), + swtpm); + goto error; + } + } + return g_steal_pointer(&cmd); error: @@ -967,6 +993,20 @@ qemuTPMEmulatorStart(virQEMUDriver *driver, } +bool +qemuTPMCanMigrateSharedStorage(virDomainDef *def) +{ + size_t i; + + for (i = 0; i < def->ntpms; i++) { + if (def->tpms[i]->type == VIR_DOMAIN_TPM_TYPE_EMULATOR && + !def->tpms[i]->data.emulator.canMigrateWithSharedStorage) { + return false; + } + } + return true; +} + /* --------------------- * Module entry points * --------------------- diff --git a/src/qemu/qemu_tpm.h b/src/qemu/qemu_tpm.h index 410c9ec1c6..630fa7074f 100644 --- a/src/qemu/qemu_tpm.h +++ b/src/qemu/qemu_tpm.h @@ -57,3 +57,6 @@ int qemuExtTPMSetupCgroup(virQEMUDriver *driver, virCgroup *cgroup) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) G_GNUC_WARN_UNUSED_RESULT; + +bool qemuTPMCanMigrateSharedStorage(virDomainDef *def) + ATTRIBUTE_NONNULL(1); -- 2.37.3

On 10/5/22 10:02, Stefan Berger wrote:
Always pass the --migration option to swtpm, if swptm supports it (staring with v0.8). Always apply the 'release-lock-outgoing' parameter with this option and apply the 'incoming' parameter for incoming migration so that swtpm releases the file lock on the source side when the state is migrated and locks the file on the destination side when the state is received.
If a started swtpm instance is capable of migrating with share storage then remember this with a flag in the virDomainTPMDef. This flag allows for modifications of the installed swtpm, such as installing a version that does not support migration with shared storage.
Report an error if swtpm does not support the --migration option and an incoming migration across shared storage is requested.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/conf/domain_conf.h | 1 + src/qemu/qemu_migration.c | 8 ++++++++ src/qemu/qemu_tpm.c | 40 +++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_tpm.h | 3 +++ 4 files changed, 52 insertions(+)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 352b88eae5..4f9b5c6686 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1461,6 +1461,7 @@ struct _virDomainTPMDef { bool hassecretuuid; bool persistent_state; virBitmap *activePcrBanks; + bool canMigrateWithSharedStorage; } emulator; } data; }; diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index efb27a24aa..431b1b0bcb 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -38,6 +38,7 @@ #include "qemu_security.h" #include "qemu_slirp.h" #include "qemu_block.h" +#include "qemu_tpm.h"
#include "domain_audit.h" #include "virlog.h" @@ -2789,6 +2790,13 @@ qemuMigrationSrcBegin(virConnectPtr conn, goto cleanup; }
+ if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE) && + !qemuTPMCanMigrateSharedStorage(vm->def)) { + virReportError(VIR_ERR_NO_SUPPORT, "%s", + _("the running swtpm does not support migration with shared storage")); + goto cleanup; + } + if (flags & VIR_MIGRATE_POSTCOPY_RESUME) { ret = qemuMigrationSrcBeginResumePhase(conn, driver, vm, xmlin, cookieout, cookieoutlen, flags); diff --git a/src/qemu/qemu_tpm.c b/src/qemu/qemu_tpm.c index 07def3c840..fde15b7587 100644 --- a/src/qemu/qemu_tpm.c +++ b/src/qemu/qemu_tpm.c @@ -644,6 +644,32 @@ qemuTPMEmulatorBuildCommand(virDomainTPMDef *tpm, virCommandAddArgFormat(cmd, "pwdfd=%d,mode=aes-256-cbc", migpwdfile_fd); }
+ /* If swtpm supports it, start it with --migration release-lock-outgoing + * so it can migrate across shared storage if needed. + */ + tpm->data.emulator.canMigrateWithSharedStorage = false; + if (virTPMSwtpmCapsGet(VIR_TPM_SWTPM_FEATURE_CMDARG_MIGRATION)) { + bool incoming = false; + + if (incomingMigration && (flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + incoming = true; + virCommandAddArg(cmd, "--migration"); + virCommandAddArgFormat(cmd, "release-lock-outgoing%s", + incoming ? ",incoming": ""); + tpm->data.emulator.canMigrateWithSharedStorage = true;
This flag remembers the capability of the swtpm that was started and allows, for example, downgrading swtpm while the VM is running (which doesn't influence the running swtpm). When libvirt is restarted it unfortunately refuses to migrate the VM when --tpm-shared-storage is set, because this flag was forgotten. How would I persist this flag?

6/9 has the wrong title... I will fix.

   Stefan

On 10/5/22 16:02, Stefan Berger wrote:
Always pass the --migration option to swtpm, if swptm supports it (staring with v0.8). Always apply the 'release-lock-outgoing' parameter with this option and apply the 'incoming' parameter for incoming migration so that swtpm releases the file lock on the source side when the state is migrated and locks the file on the destination side when the state is received.
If a started swtpm instance is capable of migrating with share storage then remember this with a flag in the virDomainTPMDef. This flag allows for modifications of the installed swtpm, such as installing a version that does not support migration with shared storage.
Report an error if swtpm does not support the --migration option and an incoming migration across shared storage is requested.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/conf/domain_conf.h | 1 + src/qemu/qemu_migration.c | 8 ++++++++ src/qemu/qemu_tpm.c | 40 +++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_tpm.h | 3 +++ 4 files changed, 52 insertions(+)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 352b88eae5..4f9b5c6686 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1461,6 +1461,7 @@ struct _virDomainTPMDef { bool hassecretuuid; bool persistent_state; virBitmap *activePcrBanks; + bool canMigrateWithSharedStorage;
I wonder whether this is something that should be stored in privateData. I mean, gradually we are introducing 'void *privateData' not only to virDomainObj but also to devices (lastly in ea3c3f88462153139b36282a22d727e920735f8b). We could then store other information there too, e.g. whether a given TPM is migrating or not, and use that information when removing an inactive domain.
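As a rough illustration of that idea (all names below are hypothetical and not part of this series): a per-device private-data struct attached to the TPM definition could carry the capability, and would be written to the per-domain status XML and parsed back after a daemon restart, which is exactly the restart case raised above:

    /* hypothetical: private data hanging off virDomainTPMDef, following the
     * per-device privateData pattern referenced above */
    typedef struct _qemuDomainTPMPrivate qemuDomainTPMPrivate;
    struct _qemuDomainTPMPrivate {
        virObject parent;

        /* recorded when swtpm is started; formatted into the domain status
         * XML so it survives a libvirtd restart */
        bool canMigrateWithSharedStorage;
    };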
} emulator; } data; }; diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index efb27a24aa..431b1b0bcb 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -38,6 +38,7 @@ #include "qemu_security.h" #include "qemu_slirp.h" #include "qemu_block.h" +#include "qemu_tpm.h"
#include "domain_audit.h" #include "virlog.h" @@ -2789,6 +2790,13 @@ qemuMigrationSrcBegin(virConnectPtr conn, goto cleanup; }
+ if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE) && + !qemuTPMCanMigrateSharedStorage(vm->def)) { + virReportError(VIR_ERR_NO_SUPPORT, "%s", + _("the running swtpm does not support migration with shared storage")); + goto cleanup; + } +
This works, but I think qemuMigrationSrcIsAllowed() is a better place for this check.
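Just to illustrate the relocation, a sketch only — the exact spot in qemuMigrationSrcIsAllowed(), the error code, and whether the migration flags are already available there are assumptions:

    /* inside qemuMigrationSrcIsAllowed(), next to the other migratability checks */
    if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE) &&
        !qemuTPMCanMigrateSharedStorage(vm->def)) {
        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
                       _("the running swtpm does not support migration with shared storage"));
        return false;   /* that function reports a bool, not -1 */
    }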
if (flags & VIR_MIGRATE_POSTCOPY_RESUME) { ret = qemuMigrationSrcBeginResumePhase(conn, driver, vm, xmlin, cookieout, cookieoutlen, flags); diff --git a/src/qemu/qemu_tpm.c b/src/qemu/qemu_tpm.c index 07def3c840..fde15b7587 100644 --- a/src/qemu/qemu_tpm.c +++ b/src/qemu/qemu_tpm.c @@ -644,6 +644,32 @@ qemuTPMEmulatorBuildCommand(virDomainTPMDef *tpm, virCommandAddArgFormat(cmd, "pwdfd=%d,mode=aes-256-cbc", migpwdfile_fd); }
+ /* If swtpm supports it, start it with --migration release-lock-outgoing + * so it can migrate across shared storage if needed. + */ + tpm->data.emulator.canMigrateWithSharedStorage = false; + if (virTPMSwtpmCapsGet(VIR_TPM_SWTPM_FEATURE_CMDARG_MIGRATION)) { + bool incoming = false; + + if (incomingMigration && (flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + incoming = true; + virCommandAddArg(cmd, "--migration"); + virCommandAddArgFormat(cmd, "release-lock-outgoing%s", + incoming ? ",incoming": ""); + tpm->data.emulator.canMigrateWithSharedStorage = true; + } else { + /* Report an error if there's an incoming migration across shared + * storage and swtpm does not support the --migration option. + */ + if (incomingMigration && (flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, + _("%s (on destination side) does not support the --migration option " + "needed for migration with shared storage"),
Error messages are exempt from the long line rule.
+ swtpm); + goto error; + } + } + return g_steal_pointer(&cmd);
error:
Michal

When using shared storage there is no need to apply security labels on the storage since the files must already have been labeled on the source side, and we must assume that the source and destination sides have been set up to use the same uid and gid for running swtpm as well as to share the same security labels. Whether the security labels can be used at all depends on the shared storage and whether and how it supports them. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/qemu/qemu_tpm.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_tpm.c b/src/qemu/qemu_tpm.c index fde15b7587..2b2d2eba5a 100644 --- a/src/qemu/qemu_tpm.c +++ b/src/qemu/qemu_tpm.c @@ -937,10 +937,18 @@ qemuTPMEmulatorStart(virQEMUDriver *driver, virCommandSetPidFile(cmd, pidfile); virCommandSetErrorFD(cmd, &errfd); - if (qemuSecurityStartTPMEmulator(driver, vm, cmd, - cfg->swtpm_user, cfg->swtpm_group, - NULL, &cmdret) < 0) - return -1; + if (incomingMigration && (flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) { + /* security labels must have been set up on source already */ + if (qemuSecurityCommandRun(driver, vm, cmd, + cfg->swtpm_user, cfg->swtpm_group, + NULL, &cmdret) < 0) { + goto error; + } + } else if (qemuSecurityStartTPMEmulator(driver, vm, cmd, + cfg->swtpm_user, cfg->swtpm_group, + NULL, &cmdret) < 0) { + goto error; + } if (cmdret < 0) { /* virCommandRun() hidden in qemuSecurityStartTPMEmulator() -- 2.37.3

When migrating the TPM in a setup that has shared storage for the TPM state files set up between hosts, we never remove the state. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/qemu/qemu_tpm.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/qemu/qemu_tpm.c b/src/qemu/qemu_tpm.c index 2b2d2eba5a..59de13061a 100644 --- a/src/qemu/qemu_tpm.c +++ b/src/qemu/qemu_tpm.c @@ -737,6 +737,10 @@ static void qemuTPMEmulatorCleanupHost(virDomainTPMDef *tpm, virDomainUndefineFlagsValues flags) { + /* Never remove the state in case of migration with shared storage. */ + if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + return; + /* * remove TPM state if: * - persistent_state flag is set and the UNDEFINE_TPM flag is set -- 2.37.3

On 10/5/22 16:02, Stefan Berger wrote:
When migrating the TPM in a setup that has shared storage for the TPM state files set up between hosts, we never remove the state.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/qemu/qemu_tpm.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/src/qemu/qemu_tpm.c b/src/qemu/qemu_tpm.c index 2b2d2eba5a..59de13061a 100644 --- a/src/qemu/qemu_tpm.c +++ b/src/qemu/qemu_tpm.c @@ -737,6 +737,10 @@ static void qemuTPMEmulatorCleanupHost(virDomainTPMDef *tpm, virDomainUndefineFlagsValues flags) { + /* Never remove the state in case of migration with shared storage. */ + if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + return;
This is testing a flag from a different enum. If there's ever an undefine flag like: VIR_DOMAIN_UNDEFINE_EXAMPLE = (1<<21) then this is going to be wrongly evaluated. Can't callers just pass VIR_DOMAIN_UNDEFINE_KEEP_TPM? Alternatively, if we invent private data (see my comment on one of the previous patches), this can be plain: if (QEMU_DOMAIN_TPM_PRIVATE(tpm)->migrating) return; (or whatever member I suggested). Michal
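To make the private-data idea a bit more concrete, a hypothetical sketch — struct, member, and macro names are made up for illustration, and where the privateData pointer would live on the TPM def is an open question:

typedef struct _qemuDomainTPMPrivate qemuDomainTPMPrivate;
struct _qemuDomainTPMPrivate {
    virObject parent;

    bool migrating;                    /* TPM state is currently being migrated */
    bool canMigrateWithSharedStorage;  /* the started swtpm supports --migration */
};

#define QEMU_DOMAIN_TPM_PRIVATE(tpm) \
    ((qemuDomainTPMPrivate *) (tpm)->privateData)

qemuTPMEmulatorCleanupHost() could then test the 'migrating' member instead of a bit borrowed from the migrate-flags enum.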

On 10/6/22 09:26, Michal Prívozník wrote:
On 10/5/22 16:02, Stefan Berger wrote:
When migrating the TPM in a setup that has shared storage for the TPM state files set up between hosts, we never remove the state.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/qemu/qemu_tpm.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/src/qemu/qemu_tpm.c b/src/qemu/qemu_tpm.c index 2b2d2eba5a..59de13061a 100644 --- a/src/qemu/qemu_tpm.c +++ b/src/qemu/qemu_tpm.c @@ -737,6 +737,10 @@ static void qemuTPMEmulatorCleanupHost(virDomainTPMDef *tpm, virDomainUndefineFlagsValues flags) { + /* Never remove the state in case of migration with shared storage. */ + if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + return;
This is testing a flag from a different enum. If there's ever an
Uuuh, right. If we had a function that we could always call to determine whether we migrate across shared storage then any access to VIR_MIGRATE_TPM_SHARED_STORAGE could be replaced with a call to this function.
undefine flag like:
VIR_DOMAIN_UNDEFINE_EXAMPLE = (1<<21)
then this is going to be wrongly evaluated. Can't callers just pass VIR_DOMAIN_UNDEFINE_KEEP_TPM?
We have this here in this function. /* * remove TPM state if: * - persistent_state flag is set and the UNDEFINE_TPM flag is set * - persistent_state flag is not set and the KEEP_TPM flag is not set */ if ((tpm->data.emulator.persistent_state && (flags & VIR_DOMAIN_UNDEFINE_TPM)) || (!tpm->data.emulator.persistent_state && !(flags & VIR_DOMAIN_UNDEFINE_KEEP_TPM))) { qemuTPMEmulatorDeleteStorage(tpm); }
Alternatively, if we invent private data (see my comment to one of previous patches), this can be plain:
if (QEMU_DOMAIN_TPM_PRIVATE(tpm)->migrating) return;
IIRC, I had asked about being able to store info on whether the currently running version of swtpm can handle shared-storage migration (due to the additional flags for the release of the lock), and you suggested that this boolean could be stored there. However, this boolean is NOT an indicator of whether shared storage is actually set up, but of whether swtpm could handle shared-storage migration if it had to. --> QEMU_DOMAIN_TPM_PRIVATE(tpm)->swtpm.shared_storage_migration Stefan
(or whatever member I suggested).
Michal
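A minimal sketch of the helper meant above, assuming a hypothetical per-TPM private member that is set when a migration with VIR_MIGRATE_TPM_SHARED_STORAGE starts (function and member names are made up):

static bool
qemuTPMMigratesAcrossSharedStorage(virDomainTPMDef *tpm)
{
    /* hypothetical private-data member, set at migration start and
     * cleared when the migration completes or is aborted */
    return QEMU_DOMAIN_TPM_PRIVATE(tpm)->sharedStorageMigration;
}

Every current test of VIR_MIGRATE_TPM_SHARED_STORAGE outside of the migration entry points could then call this predicate instead.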

Implement functions to determine whether to remove the TPM state upon migration failure on the destination side or migration success on the source side. In both cases always keep the state when shared storage is used and always remove the state if no shared storage is used. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/qemu/qemu_migration.c | 13 ++++++++++--- src/qemu/qemu_tpm.h | 18 ++++++++++++++++++ 2 files changed, 28 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 431b1b0bcb..44e0488303 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3996,6 +3996,7 @@ qemuMigrationSrcConfirm(virQEMUDriver *driver, { qemuMigrationJobPhase phase; g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + virDomainUndefineFlagsValues undefFlags = 0; int ret = -1; VIR_DEBUG("vm=%p, flags=0x%x, cancelled=%d", vm, flags, cancelled); @@ -4044,7 +4045,9 @@ qemuMigrationSrcConfirm(virQEMUDriver *driver, virDomainDeleteConfig(cfg->configDir, cfg->autostartDir, vm); vm->persistent = 0; } - qemuDomainRemoveInactive(driver, vm, VIR_DOMAIN_UNDEFINE_TPM); + if (!qemuTPMCheckKeepTPMStateMigrationSrcSuccess(flags)) + undefFlags |= VIR_DOMAIN_UNDEFINE_TPM; + qemuDomainRemoveInactive(driver, vm, undefFlags); } cleanup: @@ -6633,6 +6636,7 @@ qemuMigrationDstFinishActive(virQEMUDriver *driver, virObjectEvent *event; bool inPostCopy = false; bool doKill = vm->job->phase != QEMU_MIGRATION_PHASE_FINISH_RESUME; + virDomainUndefineFlagsValues undefFlags = 0; int rc; VIR_DEBUG("vm=%p, flags=0x%lx, retcode=%d", @@ -6709,8 +6713,11 @@ qemuMigrationDstFinishActive(virQEMUDriver *driver, jobPriv->migParams, vm->job->apiFlags); } - if (!virDomainObjIsActive(vm)) - qemuDomainRemoveInactive(driver, vm, VIR_DOMAIN_UNDEFINE_TPM); + if (!virDomainObjIsActive(vm)) { + if (!qemuTPMCheckKeepTPMStateMigrationDstFailure(flags)) + undefFlags |= VIR_DOMAIN_UNDEFINE_TPM; + qemuDomainRemoveInactive(driver, vm, undefFlags); + } virErrorRestore(&orig_err); return NULL; diff --git a/src/qemu/qemu_tpm.h b/src/qemu/qemu_tpm.h index 630fa7074f..0cee08cd5c 100644 --- a/src/qemu/qemu_tpm.h +++ b/src/qemu/qemu_tpm.h @@ -60,3 +60,21 @@ int qemuExtTPMSetupCgroup(virQEMUDriver *driver, bool qemuTPMCanMigrateSharedStorage(virDomainDef *def) ATTRIBUTE_NONNULL(1); + +static inline bool +qemuTPMCheckKeepTPMStateMigrationSrcSuccess(virDomainMigrateFlags flags) +{ + /* always keep state when migrating across shared storage */ + if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + return true; + return false; +} + +static inline bool +qemuTPMCheckKeepTPMStateMigrationDstFailure(virDomainMigrateFlags flags) +{ + /* always keep state when migrating across shared storage */ + if ((flags & VIR_MIGRATE_TPM_SHARED_STORAGE)) + return true; + return false; +} -- 2.37.3

Add the flag VIR_MIGRATE_TPM_SHARED_STORAGE to the collection of supported flags for QEMU VM migration. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- src/qemu/qemu_migration.h | 1 + 1 file changed, 1 insertion(+) diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index fbea45ad4e..4203abcb1a 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -62,6 +62,7 @@ VIR_MIGRATE_NON_SHARED_SYNCHRONOUS_WRITES | \ VIR_MIGRATE_POSTCOPY_RESUME | \ VIR_MIGRATE_ZEROCOPY | \ + VIR_MIGRATE_TPM_SHARED_STORAGE | \ 0) /* All supported migration parameters and their types. */ -- 2.37.3

Add support for --tpm-shared-storage flag for migration across hosts that have shared storage set up for storing the state. Add documentation to the virsh man page. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> --- docs/manpages/virsh.rst | 6 ++++++ tools/virsh-domain.c | 7 +++++++ 2 files changed, 13 insertions(+) diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index 5d11c48803..79626f2510 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -3297,6 +3297,7 @@ migrate [--parallel [--parallel-connections connections]] [--bandwidth bandwidth] [--tls-destination hostname] [--disks-uri URI] [--copy-storage-synchronous-writes] + [--tpm-shared-storage] Migrate domain to another host. Add *--live* for live migration; <--p2p> for peer-2-peer migration; *--direct* for direct migration; or *--tunnelled* @@ -3367,6 +3368,11 @@ For QEMU/KVM this means QEMU will be temporarily allowed to lock all guest pages in host's memory, although only those that are queued for transfer will be locked at the same time. +*--tpm-shared-storage* enables migration of a QEMU VM with TPM whose +persistent state is saved on shared storage set up between the source +and destination hosts. This option must be given when shared storage +is used and must not be given otherwise. + ``Note``: Individual hypervisors usually do not support all possible types of migration. For example, QEMU does not support direct migration. diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 2d22547cc6..05c4a827c3 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -11041,6 +11041,10 @@ static const vshCmdOptDef opts_migrate[] = { .completer = virshCompleteEmpty, .help = N_("override the destination host name used for TLS verification") }, + {.name = "tpm-shared-storage", + .type = VSH_OT_BOOL, + .help = N_("migrate TPM between hosts that have shared storage setup for the TPM's state") + }, {.name = NULL} }; @@ -11345,6 +11349,9 @@ doMigrate(void *opaque) if (vshCommandOptBool(cmd, "parallel")) flags |= VIR_MIGRATE_PARALLEL; + if (vshCommandOptBool(cmd, "tpm-shared-storage")) + flags |= VIR_MIGRATE_TPM_SHARED_STORAGE; + if (flags & VIR_MIGRATE_PEER2PEER || vshCommandOptBool(cmd, "direct")) { if (virDomainMigrateToURI3(dom, desturi, params, nparams, flags) == 0) data->ret = 0; -- 2.37.3
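With the whole series applied, a shared-storage vTPM migration would then be requested with something like the following (domain and host names are placeholders):

    virsh migrate --live --tpm-shared-storage guest1 qemu+ssh://desthost/system

Without shared storage for the swtpm state directory the flag must be omitted, matching the man page text above.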

On 10/5/22 16:01, Stefan Berger wrote:
This series of patches adds support for migrating vTPMs across hosts whose storage has been set up to share the directory structure holding the state of the TPM (swtpm). A new migration flag VIR_MIGRATE_TPM_SHARED_STORAGE is added to enable this. This flag influences the management of the directory structure holding the TPM state, which for example is only removed when a domain is undefined and not when a VM is removed on the migration source host. Further, when shared storage is used then security labeling on the destination side is skipped assuming that the labeling was already done on the source side.
I have tested this with an NFS setup where I had to turn SELinux off on the hosts since the SELinux MLS range labeling is not supported by NFS.
Shared storage migration requires (upcoming) swtpm v0.8.
Stefan
Stefan Berger (9): util: Add parsing support for swtpm's cmdarg-migration capability qemu: Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for TPM migration qemu: tpm: Conditionally create storage on incoming migration qemu: tpm: Pass --migration option to swtpm if supported qemu: tpm: Avoid security labels on incoming migration with shared storage qemu: tpm: Require UNDEFINE_TPM to be set to remove TPM state qemu: tpm: Determine whether to remove TPM state during migration qemu: tpm: Enable migration with VIR_MIGRATE_TPM_SHARED_STORAGE virsh: Add support for --tpm-shared-storage flag for migration
docs/manpages/virsh.rst | 6 +++ include/libvirt/libvirt-domain.h | 8 +++ src/conf/domain_conf.h | 1 + src/qemu/qemu_driver.c | 4 +- src/qemu/qemu_extdevice.c | 5 +- src/qemu/qemu_extdevice.h | 3 +- src/qemu/qemu_migration.c | 23 +++++++-- src/qemu/qemu_migration.h | 1 + src/qemu/qemu_process.c | 10 ++-- src/qemu/qemu_process.h | 6 ++- src/qemu/qemu_saveimage.c | 2 +- src/qemu/qemu_snapshot.c | 4 +- src/qemu/qemu_tpm.c | 87 ++++++++++++++++++++++++++++---- src/qemu/qemu_tpm.h | 24 ++++++++- src/util/virtpm.c | 1 + src/util/virtpm.h | 1 + tools/virsh-domain.c | 7 +++ 17 files changed, 164 insertions(+), 29 deletions(-)
Overall, I like this. I've raised a couple of points in my review. I've made the suggested changes as 'fixup' commits and pushed everything to my gitlab: https://gitlab.com/MichalPrivoznik/libvirt/-/commits/tpm_migration_v2 (except for the private data for TPM which I'm suggesting somewhere in the review). Feel free to take them and squash them in. Or just parts of it. I mean, I wasn't sure where exactly I should stop passing 'flags' and set a 'sharedStorage' bool argument. Maybe I was too aggressive and flags can be passed all the way down. Michal

On 10/6/22 09:26, Michal Prívozník wrote:
On 10/5/22 16:01, Stefan Berger wrote:
This series of patches adds support for migrating vTPMs across hosts whose storage has been set up to share the directory structure holding the state of the TPM (swtpm). A new migration flag VIR_MIGRATE_TPM_SHARED_STORAGE is added to enable this. This flag influences the management of the directory structure holding the TPM state, which for example is only removed when a domain is undefined and not when a VM is removed on the migration source host. Further, when shared storage is used then security labeling on the destination side is skipped assuming that the labeling was already done on the source side.
I have tested this with an NFS setup where I had to turn SELinux off on the hosts since the SELinux MLS range labeling is not supported by NFS.
Shared storage migration requires (upcoming) swtpm v0.8.
Stefan
Stefan Berger (9): util: Add parsing support for swtpm's cmdarg-migration capability qemu: Introduced VIR_MIGRATE_TPM_SHARED_STORAGE for TPM migration qemu: tpm: Conditionally create storage on incoming migration qemu: tpm: Pass --migration option to swtpm if supported qemu: tpm: Avoid security labels on incoming migration with shared storage qemu: tpm: Require UNDEFINE_TPM to be set to remove TPM state qemu: tpm: Determine whether to remove TPM state during migration qemu: tpm: Enable migration with VIR_MIGRATE_TPM_SHARED_STORAGE virsh: Add support for --tpm-shared-storage flag for migration
docs/manpages/virsh.rst | 6 +++ include/libvirt/libvirt-domain.h | 8 +++ src/conf/domain_conf.h | 1 + src/qemu/qemu_driver.c | 4 +- src/qemu/qemu_extdevice.c | 5 +- src/qemu/qemu_extdevice.h | 3 +- src/qemu/qemu_migration.c | 23 +++++++-- src/qemu/qemu_migration.h | 1 + src/qemu/qemu_process.c | 10 ++-- src/qemu/qemu_process.h | 6 ++- src/qemu/qemu_saveimage.c | 2 +- src/qemu/qemu_snapshot.c | 4 +- src/qemu/qemu_tpm.c | 87 ++++++++++++++++++++++++++++---- src/qemu/qemu_tpm.h | 24 ++++++++- src/util/virtpm.c | 1 + src/util/virtpm.h | 1 + tools/virsh-domain.c | 7 +++ 17 files changed, 164 insertions(+), 29 deletions(-)
Overall, I like this. I've raised a couple of points in my review. I've made the suggested changes as 'fixup' commits and pushed everything to my gitlab:
https://gitlab.com/MichalPrivoznik/libvirt/-/commits/tpm_migration_v2
(except for the private data for TPM which I'm suggesting somewhere in the review). Feel free to take them and squash them in. Or just parts of it. I mean, I wasn't sure where exactly I should stop passing 'flags' and set a 'sharedStorage' bool argument. Maybe I was too aggressive and flags can be passed all the way down.
I forgot about your series. I have now put a v2 with fixes here: https://github.com/stefanberger/libvirt-tpm/tree/swtpm_shared_storage.v2.fix... It has the private data support and a fix for post-migration cleanup. The new flag is still in there... Stefan
Michal