On Tue, Aug 23, 2022 at 09:15:52AM -0400, Stefan Berger wrote:
On 8/23/22 08:52, Daniel P. Berrangé wrote:
> On Mon, Aug 22, 2022 at 03:17:34PM -0400, Stefan Berger wrote:
> >
> >
> > On 8/22/22 12:46, Daniel P. Berrangé wrote:
> > > On Mon, Aug 22, 2022 at 08:05:53AM -0400, Stefan Berger wrote:
> > > > When share storage for the TPM state files has been setup betwen
hosts then
> > > > remove the TPM state files and directory only when undefining a VM
and only
> > > > if the attribute persistent_state is not set. Avoid removing the TPM
state
> > > > files and directory structure when a VM is migrated and shared
storage is
> > > > used since this would also remove those files and directory structure
on
> > > > the destination side.
> > >
> > > I think our current undefine behaviour is probably flawed. We go to the
> > > trouble of refusing to remove the firmware NVRAM when undefining because
> > > it contains important VM state, but then happily blow away the TPM state.
> > > Totally inconsistent behaviour :-( Its too late to change the default
> > > behaviour, but we likely ought to add a flag
> > >
> > > VIR_DOMAIN_UNDEFINE_KEEP_TPM
> > >
> > > and plumb that through the varius code paths, which would remove the
> > > need for this specific 'qemuDomainUndefineReason' enum.
> >
> > I think the granularity encoded in the reason is necessary for the following
> > patch I was going to post later on:
> >
> > Subject: [PATCH] qemu: tpm: Remove TPM state with non-shared storage
> >
> > This patch 'fixes' the behavior of the persistent_state TPM domain XML
> > attribute that intends to preserve the state of the TPM but should not
> > keep the state around on all the hosts a VM has been migrated to. It
> > removes the TPM state directory structure from the source host upon
> > successful migration when non-shared storage is used. Similarly, it
> > removes it from the destination host upon migration failure when
> > non-shared storage is used.
>
> 'persistent_state' is something that applies to transient VMs.
It's an attribute set in the TPM's domain XML and affects the TPM state
permanently:
persistent_state
The persistent_state attribute indicates whether 'swtpm' TPM state is
kept or not when a transient domain is powered off or undefined. This option
can be used for preserving TPM state. By default the value is no. This
attribute only works with the emulator backend. The accepted values are yes
and no. Since 7.0.0
The documentation may say transient but the code seems to just not remove
the state at all once a user has set this in the domain XML:
static void
qemuTPMEmulatorCleanupHost(virDomainTPMDef *tpm)
{
if (!tpm->data.emulator.persistent_state)
qemuTPMEmulatorDeleteStorage(tpm);
}
https://github.com/libvirt/libvirt/blob/master/src/qemu/qemu_tpm.c#L707
We cannot get rid of the persistent_state in the domain XML but with
--keep-tpm we would have a second method to keep the state. My guess is the
author of the persistent_state patch did not want to instrument callers but
achieve the effect with just producing a different XML.
Yeah, IMHO, the persistent_state attribute was a mistake, but we can't
get rid of it due to back compatibility. The need of persistence is
something that is contextually dependant, rather than a property of
the VM you can know upfront. When undefining a VM there are reasons to
both delete and preserve the TPM state that can only be known at the
time the undefine operation is invoked.
So we can't delete the XML attr, but we can add new flags to the undefine
API and define how they interact wit the persistent_state attribute.
* persistent_state=absent, flags=0 => delete
* persistent_state=absent, flags=UNDEFINE_TPM => delete
* persistent_state=absent, flags=UNDEFINE_KEEP_TPM => keep
* persistent_state=yes, flags=0, => keep
* persistent_state=yes, flags=UNDEFINE_TPM => delete
* persistent_state=yes, flags=UNDEFINE_KEEP_TPM => keep
* persistent_state=no, flags=0, => delete
* persistent_state=no, flags=UNDEFINE_TPM => delete
* persistent_state=no, flags=UNDEFINE_KEEP_TPM => keep
> What I'm suggesting is that we should have some control over
> this for persistent VMs, when calling 'undefine'.
>
> ie virsh undefine --keep-nvram --keep-tpm $VMNAME
>
> if we have this feature in the public API, then its impl should
> support both this feature and your future patch for
> 'persistent_state'.
And how do we handle removal decisions on shared storage so that the
directory structure is not removed when a VM is undefined on the source
machine versus we want to have the directory structure removed on the source
machine upon successful migration when no shared storage is used? This
TPM-specific decision for removal seems better on the qemu_tpm.c level
rather than having to decide whether to keep the TPM state around at all the
call-sites that patch 1/7 instrumented with the reason parameter (albeit
unspecific in most cases) and pass down a KEEP_TPM parameter instead.
No matter what, we need to plumb in an extra parameter across the call
sites.
Getting the reason for the undefine on the qemu_tpm.c level lets us
make
this TPM-specific decision there rather than at all sites further up in the
call stack.
When exposing the above mentioned flags to virDomainUndefine I don't
think the "undefine reason" makes sense as a concept to pass around,
as compared to a 'bool deleteTPMState'
With regards,
Daniel
--
|:
https://berrange.com -o-
https://www.flickr.com/photos/dberrange :|
|:
https://libvirt.org -o-
https://fstop138.berrange.com :|
|:
https://entangle-photo.org -o-
https://www.instagram.com/dberrange :|