[keeping context, because of adding Dan in cc]
On 3/22/19 9:35 AM, John Ferlan wrote:
On 3/20/19 5:32 PM, Eric Blake wrote:
> Rather than one file per snapshot, store all qemu snapshots in a
> single file, using the recently added bulk snapshot list
> operations. For now, this doesn't change how often libvirt writes a
> snapshot file, but it does open the door for the next patch to update
> the signature to qemuDomainSnapshotWriteMetadata() and call it less
> frequently. One of the main benefits for doing a bulk write is that
> you only have to do a single file system write at the end of an
> operation, rather than potentially 3 during
> virDomainSnapshotCreateXML(REDEFINE|CURRENT) (delete the old snapshot
> definition being redefined, rewrite the previous current snapshot to
> longer be current, and store the new snapshot definition) or even more
> during virDomainSnapshotDelete(DESCENDENTS) (a file system hit per
> snapshot being deleted). It also makes it perhaps a bit more feasible
> to roll back to earlier state if something fails horribly midway
> through an operation (until you write the new file, the old file is
> still a reliable record of what state to roll back to), compared to
> the current code which has to track lots of things locally; although I
> did not attempt to play with any patches along those lines.
>
> Another benefit of the bulk write - it's less code to maintain, and
> will make it easier for me to model qemu's checkpoint storage in the
> same way (and for checkpoints, I don't even have to worry about legacy
> parsing).
>
> This is a one-way upgrade - if you have snapshots created by an older
> libvirt, the new libvirt will correctly load those snapshots and
> convert to the new format. But as the new libvirt will no longer
> output the old format, reverting back to the old libvirt will make it
> appear that all snapshots have disappeared (merely hidden until you
> upgrade libvirt again). But then again, we've never promised that
> downgrading libvirt after an upgrade was supposed to work flawlessly.
This kind of information I would think would need to be heavily if not
overly documented. I assume someone could say this should be "stuck
behind" a forced decision by the consumer to do the conversion to the
new style. What worries me most is the issues in the subsequent
paragraphs that could bite someone.
Indeed, I would need to provide a patch to release notes to match this
commit, if we decide to go with it.
To some degree we could be 'stuck with' the model for snapshots, but
that doesn't mean checkpoints couldn't make use of some newer
functionality that stores everything in one file and not allow the
storage into multiple files.
This is one of those gray areas for me that from an overall
architectural level that we struggle with related to dealing with our
technical debt because we've guaranteed some sort of backcompat for
various things "forever".
It's not the first time where upgrading libvirt performs a one-way of
the old-style internals to the new-style, where you then can't downgrade
libvirt, but at least the bulk of the libvirt code no longer has to
worry about generating both old- and new-style output. But yes, getting
a second opinion won't hurt.
I also wonder what harm it does to keep the old style around and *force*
the consumer to do the deletion. Whether that's by them running rmdir on
the old directory or running some libvirt command to clean out the
legacy. I lean towards leaving the old stuff there - if nothing else it
provides some strange sort of comfort. Maybe even rename the directory
to have something like ".old" in the name.
I'm OK with the choice and arguments you make for the algorithm you
chose, but a ping of Daniel to make sure he's OK with it probably would
help here, especially since he commented on the last pile and brought
the issue to the forefront.
added in cc.
>
> There is a slight chance for confusion if a user named two separate
> domains 'foo' and 'foo.xml'; the new scheme expects 'foo.xml'
to be a
> regular file for the former domain, while the old scheme expected
> 'foo.xml' to be a directory for the latter domain; if we are worried
> about that, we could tweak the code to instead output new state in a
> file named 'foo' instead of the more typical 'foo.xml' and just key
> off of whether the file is regular or a directory when deciding if
> this is the first libvirt run after an upgrade. But I felt that the
> chances of a user abusing domain names like that is not worth the
> effort.
I think this is just another thing to document. As in, if you've done
this and use snapshots, then "fix your stupidity" ;-). Why anyone other
that QE to test all possibilities would name a domain with ".xml" in the
name suffix goes beyond common sense to me.
>
> The bulk snapshot file can be huge (over RPC, we allow <domain> to be
> up to 4M, and we allow up to 16k snapshots; since each each snapshot
> includes a <domain>, a worst-case domain would result in gigabytes of
> bulk data); it is no worse on overall filesystem usage than before,
> but now in a single file vs. a series of files it requires more memory
> to read in at once. I don't know if we have to worry about that in
> practice, but my patch does cap things to read in no more than an
> arbitrarily-picked 128M, which we may have to raise in the future.
and even more. I think when I first reviewed this I thought of the RPC
limit stuff - at least w/r/t how the stats code is the other place where
we've hit this limit. But for some reason I felt that perhaps some under
the covers code in that code we solved something that allowed for things
to work, but now I'm not so sure.
I don't know if virXML has some way of visiting a seekable stream, and
parsing in only chunks at a time. You still have to PARSE the entire
file to learn where the chunks are, but if the parser is stream based
and keeps track of offsets where interesting chunks start, it is perhaps
feasable that you could cap your memory usage by revisiting chunks as
needed. (But doing that to save memory ALSO means reparsing portions as
you reload them into memory, so you're slower than if you had kept the
full file in memory anyway).
Or maybe we need to think about ways to compress the information (most
of the <domain>s in the overall <snapshotlist> will share a lot of
commonality, even if the user hot-plugged devices between snapshots).
But that's a big project, which I don't want to tackle.
>
> Signed-off-by: Eric Blake <eblake(a)redhat.com>
> ---
> src/qemu/qemu_domain.c | 59 ++++++++-------------------
> src/qemu/qemu_driver.c | 93 +++++++++++++++++++++++++++++++++---------
> 2 files changed, 91 insertions(+), 61 deletions(-)
>
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index ea7b31dab3..424f839a00 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -8448,45 +8448,28 @@ qemuFindQemuImgBinary(virQEMUDriverPtr driver)
>
> int
> qemuDomainSnapshotWriteMetadata(virDomainObjPtr vm,
> - virDomainMomentObjPtr snapshot,
> + virDomainMomentObjPtr snapshot ATTRIBUTE_UNUSED,
> virCapsPtr caps,
> virDomainXMLOptionPtr xmlopt,
> const char *snapshotDir)
> {
> - char *newxml = NULL;
> - int ret = -1;
> - char *snapDir = NULL;
> - char *snapFile = NULL;
> char uuidstr[VIR_UUID_STRING_BUFLEN];
> - unsigned int flags = VIR_DOMAIN_SNAPSHOT_FORMAT_SECURE |
> - VIR_DOMAIN_SNAPSHOT_FORMAT_INTERNAL;
> - virDomainSnapshotDefPtr def = virDomainSnapshotObjGetDef(snapshot);
> + unsigned int flags = VIR_DOMAIN_SNAPSHOT_FORMAT_SECURE;
> + VIR_AUTOFREE(char *) newxml = NULL;
> + VIR_AUTOFREE(char *) snapDir = NULL;
@snapDir is not used
Rebase fluff (I played around with multiple ideas before settling on
this patch); will scrub.
> + VIR_AUTOFREE(char *) snapFile = NULL;
> + VIR_AUTOCLEAN(virBuffer) buf = VIR_BUFFER_INITIALIZER;
>
> - if (virDomainSnapshotGetCurrent(vm->snapshots) == snapshot)
> - flags |= VIR_DOMAIN_SNAPSHOT_FORMAT_CURRENT;
> virUUIDFormat(vm->def->uuid, uuidstr);
> - newxml = virDomainSnapshotDefFormat(uuidstr, def, caps, xmlopt, flags);
> - if (newxml == NULL)
> + if (virDomainSnapshotObjListFormat(&buf, uuidstr, vm->snapshots, caps,
> + xmlopt, flags) < 0)
> return -1;
>
> - if (virAsprintf(&snapDir, "%s/%s", snapshotDir,
vm->def->name) < 0)
> - goto cleanup;
> - if (virFileMakePath(snapDir) < 0) {
> - virReportSystemError(errno, _("cannot create snapshot directory
'%s'"),
> - snapDir);
> - goto cleanup;
> - }
> -
> - if (virAsprintf(&snapFile, "%s/%s.xml", snapDir,
def->common.name) < 0)
> - goto cleanup;
> -
> - ret = virXMLSaveFile(snapFile, NULL, "snapshot-edit", newxml);
> + if (virAsprintf(&snapFile, "%s/%s.xml", snapshotDir,
vm->def->name) < 0)
> + return -1;
>
> - cleanup:
> - VIR_FREE(snapFile);
> - VIR_FREE(snapDir);
> - VIR_FREE(newxml);
> - return ret;
> + newxml = virBufferContentAndReset(&buf);
> + return virXMLSaveFile(snapFile, NULL, "snapshot-edit", newxml);
> }
>
> /* The domain is expected to be locked and inactive. Return -1 on normal
> @@ -8589,7 +8572,6 @@ qemuDomainSnapshotDiscard(virQEMUDriverPtr driver,
> bool update_parent,
> bool metadata_only)
> {
> - char *snapFile = NULL;
> int ret = -1;
> qemuDomainObjPrivatePtr priv;
> virDomainMomentObjPtr parentsnap = NULL;
> @@ -8610,10 +8592,6 @@ qemuDomainSnapshotDiscard(virQEMUDriverPtr driver,
> }
> }
>
> - if (virAsprintf(&snapFile, "%s/%s/%s.xml", cfg->snapshotDir,
> - vm->def->name, snap->def->name) < 0)
> - goto cleanup;
> -
So "theoretically" we'll never clean up the old way? Maybe this should
survive and have a prefix of virFileExists before the unlink below? or
some sort of rename for posterity (depends on the opinion about keeping
the historical references).
Hmm - you have a point. My upgrade path parsed the old way, created the
new way, then tried to clean the old way. But if the attempt to clean
the old way failed, future starts will see only the new way, leaving the
old way to litter the storage. So yeah, opinions on the best logic to
follow are worthwhile.
> if (snap == virDomainSnapshotGetCurrent(vm->snapshots)) {
> virDomainSnapshotSetCurrent(vm->snapshots, NULL);
> if (update_parent && snap->def->parent) {
> @@ -8635,8 +8613,6 @@ qemuDomainSnapshotDiscard(virQEMUDriverPtr driver,
> }
> }
>
> - if (unlink(snapFile) < 0)
> - VIR_WARN("Failed to unlink %s", snapFile);
> if (update_parent)
> virDomainMomentDropParent(snap);
> virDomainSnapshotObjListRemove(vm->snapshots, snap);
> @@ -8644,7 +8620,6 @@ qemuDomainSnapshotDiscard(virQEMUDriverPtr driver,
> ret = 0;
>
> cleanup:
> - VIR_FREE(snapFile);
> virObjectUnref(cfg);
> return ret;
> }
> @@ -8691,7 +8666,7 @@ qemuDomainRemoveInactiveCommon(virQEMUDriverPtr driver,
> virDomainObjPtr vm)
> {
> virQEMUDriverConfigPtr cfg;
> - VIR_AUTOFREE(char *) snapDir = NULL;
> + VIR_AUTOFREE(char *) snapFile = NULL;
>
FWIW: Similar thoughts here about keeping the historical reference and
then "forcing" the user to delete it themselves or provide some sort of
new flag to delete the old historical files.
I'm open to whatever Dan may suggest.
> cfg = virQEMUDriverGetConfig(driver);
>
> @@ -8699,12 +8674,12 @@ qemuDomainRemoveInactiveCommon(virQEMUDriverPtr driver,
> if (qemuDomainSnapshotDiscardAllMetadata(driver, vm) < 0) {
> VIR_WARN("unable to remove all snapshots for domain %s",
> vm->def->name);
> - } else if (virAsprintf(&snapDir, "%s/%s", cfg->snapshotDir,
> + } else if (virAsprintf(&snapFile, "%s/%s.xml",
cfg->snapshotDir,
> vm->def->name) < 0) {
> - VIR_WARN("unable to remove snapshot directory %s/%s",
> + VIR_WARN("unable to remove snapshots storage %s/%s.xml",
> cfg->snapshotDir, vm->def->name);
> - } else if (rmdir(snapDir) < 0 && errno != ENOENT) {
> - VIR_WARN("unable to remove snapshot directory %s", snapDir);
> + } else if (unlink(snapFile) < 0 && errno != ENOENT) {
> + VIR_WARN("unable to remove snapshots storage %s", snapFile);
> }
> qemuExtDevicesCleanupHost(driver, vm->def);
>
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 9c2245b095..018d1cdc87 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -406,12 +406,14 @@ qemuSecurityInit(virQEMUDriverPtr driver)
> }
>
>
> +/* Older qemu used a series of $dir/snapshot/domainname/snap.xml
> + * files, instead of the modern $dir/snapshot/domainname.xml bulk
> + * file. Called while vm is locked. */
> static int
> -qemuDomainSnapshotLoad(virDomainObjPtr vm,
> - void *data)
> +qemuDomainSnapshotLoadLegacy(virDomainObjPtr vm,
> + char *snapDir,
> + virCapsPtr caps)
> {
> - char *baseDir = (char *)data;
> - char *snapDir = NULL;
> DIR *dir = NULL;
> struct dirent *entry;
> char *xmlStr;
> @@ -424,21 +426,8 @@ qemuDomainSnapshotLoad(virDomainObjPtr vm,
> VIR_DOMAIN_SNAPSHOT_PARSE_DISKS |
> VIR_DOMAIN_SNAPSHOT_PARSE_INTERNAL);
> int ret = -1;
> - virCapsPtr caps = NULL;
> int direrr;
>
> - virObjectLock(vm);
> - if (virAsprintf(&snapDir, "%s/%s", baseDir, vm->def->name)
< 0) {
> - virReportError(VIR_ERR_INTERNAL_ERROR,
> - _("Failed to allocate memory for "
> - "snapshot directory for domain %s"),
> - vm->def->name);
> - goto cleanup;
> - }
> -
> - if (!(caps = virQEMUDriverGetCapabilities(qemu_driver, false)))
> - goto cleanup;
> -
> VIR_INFO("Scanning for snapshots for domain %s in %s",
vm->def->name,
> snapDir);
>
> @@ -503,6 +492,74 @@ qemuDomainSnapshotLoad(virDomainObjPtr vm,
> _("Snapshots have inconsistent relations for domain
%s"),
> vm->def->name);
>
> + virResetLastError();
> +
> + ret = 0;
> + cleanup:
> + VIR_DIR_CLOSE(dir);
> + return ret;
> +}
> +
> +
> +/* Load all snapshots associated with domain */
> +static int
> +qemuDomainSnapshotLoad(virDomainObjPtr vm,
> + void *data)
> +{
> + char *baseDir = (char *)data;
> + unsigned int flags = (VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE |
> + VIR_DOMAIN_SNAPSHOT_PARSE_DISKS);
> + int ret = -1;
> + virCapsPtr caps = NULL;
> + VIR_AUTOFREE(char *) snapFile = NULL;
> + VIR_AUTOFREE(char *) snapDir = NULL;
> + VIR_AUTOFREE(char *) xmlStr = NULL;
> +
> + virObjectLock(vm);
> + VIR_INFO("Scanning for snapshots for domain %s in %s",
vm->def->name,
> + baseDir);
> + if (virAsprintf(&snapFile, "%s/%s.xml", baseDir,
vm->def->name) < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Failed to allocate memory for "
> + "snapshots storage for domain %s"),
> + vm->def->name);
> + goto cleanup;
> + }
> +
> + if (!(caps = virQEMUDriverGetCapabilities(qemu_driver, false)))
> + goto cleanup;
> +
> + if (virFileExists(snapFile)) {
> + /* State last saved by modern libvirt in single file. As each
> + * snapshot contains a <domain>, it can be quite large. */
> + if (virFileReadAll(snapFile, 128*1024*1024*1, &xmlStr) < 0) {
> + /* Nothing we can do here */
> + virReportSystemError(errno,
> + _("Failed to read snapshot file %s"),
> + snapFile);
> + goto cleanup;
> + }
> +
> + ret = virDomainSnapshotObjListParse(xmlStr, vm->def->uuid,
> + vm->snapshots, caps,
> + qemu_driver->xmlopt, flags);
> + if (ret < 0)
> + goto cleanup;
> + VIR_INFO("Read in %d snapshots for domain %s", ret,
vm->def->name);
> + } else if (virAsprintf(&snapDir, "%s/%s", baseDir,
vm->def->name) >= 0 &&
> + virFileExists(snapDir)) {
> + /* State may have been saved by earlier libvirt; if so, try to
> + * read it in, convert to modern style, and remove the old
> + * directory if successful. */
> + if (qemuDomainSnapshotLoadLegacy(vm, snapDir, caps) < 0)
> + goto cleanup;
> + if (qemuDomainSnapshotWriteMetadata(vm, NULL, caps,
> + qemu_driver->xmlopt, baseDir) <
0)
> + goto cleanup;
> + if (virFileDeleteTree(snapDir) < 0)
> + VIR_WARN("Unable to remove legacy snapshot directory %s",
snapDir);
This is where I wonder if it'd be better to keep the historical
reference and then perhaps in the "if" portion of this code do a similar
generation of snapDir, virFileExists (OK, Directory), and then VIR_WARN
that old stuff still exists...
For what's here - sure it seems to do the right thing, but let's be sure
we think we're making the right decision before an R-by, fair enough?
Indeed.
John
> + }
> +
> /* FIXME: qemu keeps internal track of snapshots. We can get access
> * to this info via the "info snapshots" monitor command for running
> * domains, or via "qemu-img snapshot -l" for shutoff domains. It
would
> @@ -516,8 +573,6 @@ qemuDomainSnapshotLoad(virDomainObjPtr vm,
>
> ret = 0;
> cleanup:
> - VIR_DIR_CLOSE(dir);
> - VIR_FREE(snapDir);
> virObjectUnref(caps);
> virObjectUnlock(vm);
> return ret;
>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:
qemu.org |
libvirt.org