On Fri, Oct 18, 2024 at 10:31:34AM +0200, Peter Krempa wrote:
On Thu, Oct 17, 2024 at 16:16:25 +0100, Daniel P. Berrangé wrote:
> On Thu, Oct 17, 2024 at 05:02:00PM +0200, Peter Krempa wrote:
> > Normally when starting up a VM with a transient disk, if the file
> > libvirt would use as the temp overlay for the original disk image exists
> > libvirt will not touch it and fail startup. This is done in order to
> > avoid any potential removal of user files if they manage to create a
> > colliding filename.
> >
> > In case when the user doesn't want the above behaviour this patch'
> > introduces a 'overwriteTemp' attribute for the
'<transient/>' element
> > which allows users to opt into simply removing the file before the next
> > start.
> >
> > Closes:
https://gitlab.com/libvirt/libvirt/-/issues/684
> > Signed-off-by: Peter Krempa <pkrempa(a)redhat.com>
> > ---
> > docs/formatdomain.rst | 7 +++++++
> > src/conf/domain_conf.c | 7 +++++++
> > src/conf/domain_conf.h | 1 +
> > src/conf/schemas/domaincommon.rng | 5 +++++
> > src/qemu/qemu_snapshot.c | 17 +++++++++++++----
> > .../disk-transient.x86_64-latest.xml | 2 +-
> > tests/qemuxmlconfdata/disk-transient.xml | 2 +-
> > 7 files changed, 35 insertions(+), 6 deletions(-)
> >
> > diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
> > index e6f09a728f..ab2fccdd97 100644
> > --- a/docs/formatdomain.rst
> > +++ b/docs/formatdomain.rst
> > @@ -3479,6 +3479,13 @@ paravirtualized driver is specified via the ``disk``
element.
> > ``shareBacking`` attribute should be set to ``yes``. Note that hypervisor
> > drivers may need to hotplug such disk and thus it works only with
> > configurations supporting hotplug. :since:`Since 7.4.0`
> > +
> > + Hypervisors may need to store a temporary file containing the data written
> > + by the domain while running, which may be stored in the same location
> > + as the original source of the disk. Note that in some cases the temporary
> > + file can't be cleaned up (e.g. when the domain is not shutdown before
the host).
> > + An optional attribute ``overwriteTemp`` set to ``yes`` (:since:`Since
10.10`)
> > + indicates that the hypervisor may overwrite the file rather than fail
startup.
>
> Is this really something we want to express in the guest XML ?
>
> The times at which you want to overwrite an existing file look
> to be tailored to the specific failure scenario hit.
>
> Users never expect to see a failure at first.
>
> If they don't add this attribute, and hit the failure, they
> must go back & edit the guest to add the attr, start the
> guest, and then possibly edit it again to remove the attr.
>
> If they add the attr unconditionally before seeing any failure,
> they'll never have an protection against mistakes.
>
> It feels like the same kind of situation that motivated flags
> VIR_DOMAIN_START_FORCE_BOOT and VIR_DOMAIN_START_RESET_NVRAM.
>
> IOW, suggesting we need VIR_DOMAIN_START_REPLACE_TRANSIENT ?
I thought about this as well. The drawback of this solution is that it's
hard to use in scripts as users would have to parse errors. and/or
always send this flag.
Any other option would require adding another error code and that still
wouldn't help virsh users as they don't get to see the error code, just
the translated string corresponding to it.
And lastly it's harder to implement (wiring up the propagation of the
flag).
Thinking about it a bit more the semantics of the transient disk allow
us to simply delete the file without asking. The only thing I was
paranoid about, which is the reason for the check to exist is if users
would "accidentaly" have file with such name. But .. that is extremely
unlikely. We generate the filename for the temporary overlay as:
$(origpath).TRANSIENT-$(vmname)
So I guess if we just document the above we should be okay simply delete
the file if it exists and create another overlay.
Right, if we have $(vmname) in the template, then the only way we could
delete something that is still in use, is if libvirtd somehow forgot
that the QEMU exists while it was still running. That shouldn't happen,
and if it did, we should fix the bug.
In any case transient overlays should be used for inherantly throw-away
data, since QEMU can crash at any time and libvirt will delete the overlay
then no matter what.
FWIW, it'd be ideal to simply unlink the file after qemu starts
using it
so that it doesn't linger until libvirt cleans that up, but that'd
require going through the block job code ensuring that it works properly
with the file missing, which is something not worth doing IMO.
So I suggest I document the temporary file naming scheme and simply
delete it always instead of the new attribute. Does that work for you?
Yes.
With regards,
Daniel
--
|:
https://berrange.com -o-
https://www.flickr.com/photos/dberrange :|
|:
https://libvirt.org -o-
https://fstop138.berrange.com :|
|:
https://entangle-photo.org -o-
https://www.instagram.com/dberrange :|