
On 8/23/22 08:52, Daniel P. Berrangé wrote:
On Mon, Aug 22, 2022 at 03:17:34PM -0400, Stefan Berger wrote:
On 8/22/22 12:46, Daniel P. Berrangé wrote:
On Mon, Aug 22, 2022 at 08:05:53AM -0400, Stefan Berger wrote:
When shared storage for the TPM state files has been set up between hosts, remove the TPM state files and directory only when undefining a VM and only if the attribute persistent_state is not set. Avoid removing the TPM state files and directory structure when a VM is migrated and shared storage is used, since this would also remove those files and the directory structure on the destination side.
I think our current undefine behaviour is probably flawed. We go to the trouble of refusing to remove the firmware NVRAM when undefining because it contains important VM state, but then happily blow away the TPM state. Totally inconsistent behaviour :-( It's too late to change the default behaviour, but we likely ought to add a flag
VIR_DOMAIN_UNDEFINE_KEEP_TPM
and plumb that through the various code paths, which would remove the need for this specific 'qemuDomainUndefineReason' enum.
I think the granularity encoded in the reason is necessary for the following patch I was going to post later on:
Subject: [PATCH] qemu: tpm: Remove TPM state with non-shared storage
This patch 'fixes' the behavior of the persistent_state TPM domain XML attribute that intends to preserve the state of the TPM but should not keep the state around on all the hosts a VM has been migrated to. It removes the TPM state directory structure from the source host upon successful migration when non-shared storage is used. Similarly, it removes it from the destination host upon migration failure when non-shared storage is used.
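The removal rules described in this patch and in the commit message at the top of the thread can be sketched as one decision function. This is only an illustration under stated assumptions: the enum members, the struct, and the function name are hypothetical and not taken from the patch series, and the handling of an unspecified reason is a guess based on the existing qemuTPMEmulatorCleanupHost() behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reasons; the real members of qemuDomainUndefineReason
 * are not shown in this thread. */
typedef enum {
    SKETCH_REASON_UNSPECIFIED = 0,
    SKETCH_REASON_UNDEFINE,              /* user undefines the VM */
    SKETCH_REASON_MIGRATION_SRC_SUCCESS, /* source host, migration succeeded */
    SKETCH_REASON_MIGRATION_DST_FAILURE  /* destination host, migration failed */
} sketch_undefine_reason;

/* Hypothetical stand-in for the relevant bits of the TPM definition. */
struct sketch_tpm {
    bool persistent_state; /* persistent_state='yes' in the domain XML */
    bool shared_storage;   /* TPM state directory is on shared storage */
};

/* Return true when the TPM state directory should be deleted on this host. */
static bool
sketch_tpm_should_delete_storage(const struct sketch_tpm *tpm,
                                 sketch_undefine_reason reason)
{
    assert(tpm != NULL);

    switch (reason) {
    case SKETCH_REASON_UNDEFINE:
        /* Undefine honours persistent_state, with or without shared storage. */
        return !tpm->persistent_state;

    case SKETCH_REASON_MIGRATION_SRC_SUCCESS:
    case SKETCH_REASON_MIGRATION_DST_FAILURE:
        /* This host no longer runs the VM. With non-shared storage the local
         * copy is removed even if persistent_state is set; with shared
         * storage the other host still needs the files. */
        return !tpm->shared_storage;

    case SKETCH_REASON_UNSPECIFIED:
        break;
    }

    /* Assumption: other callers keep today's behaviour on non-shared storage
     * and never delete on shared storage. */
    return !tpm->shared_storage && !tpm->persistent_state;
}
```

Keeping the whole decision in one place is the point being argued for: callers only pass down why the cleanup happens, and the TPM code decides what that means for the state directory.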
'persistent_state' is something that applies to transient VMs.
It's an attribute set in the TPM's domain XML and affects the TPM state permanently:

    persistent_state
        The persistent_state attribute indicates whether 'swtpm' TPM state is kept or not when a transient domain is powered off or undefined. This option can be used for preserving TPM state. By default the value is no. This attribute only works with the emulator backend. The accepted values are yes and no. Since 7.0.0

The documentation may say transient but the code seems to just not remove the state at all once a user has set this in the domain XML:

    static void
    qemuTPMEmulatorCleanupHost(virDomainTPMDef *tpm)
    {
        if (!tpm->data.emulator.persistent_state)
            qemuTPMEmulatorDeleteStorage(tpm);
    }

https://github.com/libvirt/libvirt/blob/master/src/qemu/qemu_tpm.c#L707

We cannot get rid of the persistent_state in the domain XML but with --keep-tpm we would have a second method to keep the state. My guess is the author of the persistent_state patch did not want to instrument callers but achieve the effect with just producing a different XML.
What I'm suggesting is that we should have some control over this for persistent VMs, when calling 'undefine'.
i.e.

    virsh undefine --keep-nvram --keep-tpm $VMNAME

If we have this feature in the public API, then its implementation should support both this feature and your future patch for 'persistent_state'.
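A minimal sketch of how such an undefine flag could combine with persistent_state follows. The flag names and bit values here are hypothetical placeholders chosen for illustration, not libvirt constants, and the precedence (the flag wins, otherwise persistent_state decides) is an assumption about the proposal rather than something settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration only. */
#define SKETCH_UNDEFINE_KEEP_NVRAM (1u << 0)
#define SKETCH_UNDEFINE_KEEP_TPM   (1u << 1)
#define SKETCH_UNDEFINE_ALL_FLAGS  (SKETCH_UNDEFINE_KEEP_NVRAM | \
                                    SKETCH_UNDEFINE_KEEP_TPM)

/* Return true when undefine should delete the TPM state. */
static bool
sketch_undefine_removes_tpm(unsigned int flags, bool persistent_state)
{
    /* Reject flag bits this sketch does not know about. */
    assert((flags & ~SKETCH_UNDEFINE_ALL_FLAGS) == 0);

    /* An explicit request to keep the TPM state always wins. */
    if (flags & SKETCH_UNDEFINE_KEEP_TPM)
        return false;

    /* Otherwise fall back to the attribute from the domain XML. */
    return !persistent_state;
}
```

With this shape, the flag and the XML attribute are two independent ways to keep the state, and the default (no flag, persistent_state not set) still removes it.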
And how do we handle removal decisions on shared storage, so that the directory structure is not removed when a VM is undefined on the source machine, while we still want the directory structure removed on the source machine upon successful migration when no shared storage is used?

This TPM-specific removal decision seems better made at the qemu_tpm.c level. The alternative is to decide whether to keep the TPM state at all the call sites that patch 1/7 instrumented with the reason parameter (albeit unspecific in most cases) and to pass down a KEEP_TPM parameter instead. Getting the reason for the undefine at the qemu_tpm.c level lets us make this TPM-specific decision there rather than at all sites further up in the call stack.

Regards,
Stefan