On 4/24/19 10:53 AM, Peter Krempa wrote:
On Wed, Apr 17, 2019 at 09:09:06 -0500, Eric Blake wrote:
> Add a new file checkpoint_conf.c that performs the translation to and
> from new XML describing a checkpoint. The code shares a common base
> class with snapshots, since a checkpoint similarly represents the
> domain state at a moment in time. Add some basic testing of round trip
> XML handling through the new code.
>
> + bitmap = virXMLPropString(node, "bitmap");
> + if (bitmap) {
> + if (def->type != VIR_DOMAIN_CHECKPOINT_TYPE_BITMAP) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("disk checkpoint bitmap '%s' requires
"
> + "type='bitmap'"),
> + bitmap);
This should be expressable by schema. If we make schema validation
mandatory for the new APIs which we should anyways we can skip all of
these as we'll automagically get them validated before.
We have several existing APIs that have not yet been taught flags to
perform validation (virDomainManagedSave, for example). Do we really
want all new APIs to perform unconditional schema validation, or do we
want to finish the ongoing project to add flags to existing APIs and
merely insist that new APIs also be given a schema validation flag? So
far, my implementation has not wired in any schema validation.
> +static virDomainCheckpointDefPtr
> +virDomainCheckpointDefParse(xmlXPathContextPtr ctxt,
> + virCapsPtr caps,
> + virDomainXMLOptionPtr xmlopt,
> + bool *current,
> + unsigned int flags)
> +{
> + virDomainCheckpointDefPtr def = NULL;
> + virDomainCheckpointDefPtr ret = NULL;
> + xmlNodePtr *nodes = NULL;
I'm not going to point out that we have more modern approaches here ...
> + size_t i;
> + int n;
> + char *creation = NULL;
> + int active;
> + char *tmp;
> +
> + if (VIR_ALLOC(def) < 0)
> + goto cleanup;
> +
> + def->common.name = virXPathString("string(./name)", ctxt);
> + if (def->common.name == NULL) {
> + if (flags & VIR_DOMAIN_CHECKPOINT_PARSE_REDEFINE) {
... but validation of the XML should not be intermixed with the parser
itself. The parser should only validate what is described by the schema.
(I'm aware that there is plenty prior art, but adding new code just
increases the technical debt and this kind of stuff is hard to refactor
later)
This is copying from the snapshot code. If we desire, it may be worth
cleaning up the snapshot code to use the modern approaches first, and
then the checkpoint code should resemble the cleaned up version (rather
than having to clean up both copies at once).
Do we want mandatory schema validation for the new API? Then again, a
lot of things in the schema are optional - in particular, it is okay for
a user to define a new checkpoint without a name (we create a name
during post-parse processing); but NOT okay to use the redefine flag
without a name - but the schema doesn't really have a way to express a
conditional (aspects of the XML that are optional under one use, but
mandatory under another - where that use is an external flag rather
than something in the XML itself).
> +/**
> + * virDomainCheckpointDefAssignBitmapNames:
> + * @def: checkpoint def object
> + *
> + * Generate default bitmap names for checkpoint targets. Returns 0 on
> + * success, -1 on error.
Isn't a bitmap a qemuism?
Yes, bitmaps are how qemu implements checkpoints. But per patch 1/21,
the XML being proposed uses:
+ <dt><code>checkpoint</code></dt>
+ <dd>An optional attribute; possible values
+ are <code>no</code> when the disk does not participate
+ in this checkpoint; or <code>bitmap</code> if the disk
+ will track all changes since the creation of this
+ checkpoint via a bitmap.</dd>
+ <dt><code>bitmap</code></dt>
+ <dd>The attribute <code>bitmap</code> is only valid
+ if <code>checkpoint='bitmap'</code>; it describes
the
+ name of the tracking bitmap (defaulting to the
+ checkpoint name).</dd>
It is conceivable that adding checkpoint support for another driver may
pick a different (third) checkpoint='MODE', at which point that mode
will add a different attribute that ties in with that mode (other than
qemu's checkpoint='bitmap' bitmap='NAME'). So parsing and assigning
the
'bitmap' name, when checkpoint='bitmap', makes sense here, while
generating a default bitmap name starts to be a bit more specific to
qemu. Is the question whether this function should be moved into
src/qemu/ somewhere, rather than generic to the checkpoint XML
parse/format code?
> +/* Align def->disks to def->domain. Sort the list of
def->disks,
> + * filling in any missing disks with appropriate default. Convert
> + * paths to disk targets for uniformity. Issue an error and return -1
> + * if any def->disks[n]->name appears more than once or does not map
> + * to dom->disks. */
> +int
> +virDomainCheckpointAlignDisks(virDomainCheckpointDefPtr def)
> +{
> + int ret = -1;
> + virBitmapPtr map = NULL;
> + size_t i;
> + int ndisks;
> + int checkpoint_default = VIR_DOMAIN_CHECKPOINT_TYPE_NONE;
> +
> + if (!def->common.dom) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("missing domain in checkpoint"));
> + goto cleanup;
This seems unrelated to what the function is doing.
XPath wise, you can't match "/domaincheckpoint/disks/disk[N]" to
"/domaincheckpoint/domain/devices/disk[N]" unless you have an associated
domain definition from the time the checkpoint was created. This code
matches what snapshots do, except that in snapshots, the domain
definition was optional for back-compat, while for checkpoints, I'm
insisting that the domain definition always be present.
> + }
> +
> + if (def->ndisks > def->common.dom->ndisks) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("too many disk checkpoint requests for
domain"));
> + goto cleanup;
> + }
> +
> + /* Unlikely to have a guest without disks but technically possible. */
> + if (!def->common.dom->ndisks) {
Use an explicit == 0 check here. You used plenty of explicit == NULL
checks above.
Copy and paste from the snapshot code. (Personally, I prefer
minimalisting typing, and would rather drop == NULL where I can).
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("domain must have at least one disk to perform "
> + "checkpoints"));
> + goto cleanup;
> + }
> +
> + /* If <disks> omitted, do bitmap on all disks; otherwise, do nothing
> + * for omitted disks */
> + if (!def->ndisks)
Again, explicit comparison please.
But if adding explicit == 0 makes this code get merged sooner, I can do
that.
> + checkpoint_default = VIR_DOMAIN_CHECKPOINT_TYPE_BITMAP;
> +
> + if (!(map = virBitmapNew(def->common.dom->ndisks)))
> + goto cleanup;
> +
> + /* Double check requested disks. */
> + for (i = 0; i < def->ndisks; i++) {
> + virDomainCheckpointDiskDefPtr disk = &def->disks[i];
> + int idx = virDomainDiskIndexByName(def->common.dom, disk->name,
false);
> +
> + if (idx < 0) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("no disk named '%s'"),
disk->name);
> + goto cleanup;
> + }
> +
> + if (virBitmapIsBitSet(map, idx)) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("disk '%s' specified twice"),
> + disk->name);
> + goto cleanup;
> + }
> + ignore_value(virBitmapSetBit(map, idx));
> + disk->idx = idx;
Storing this is dangerous with disk hotplug.
No, this is the same boat as snapshots. We are correlating the list of
"/domainbackup/disks/disk[N]" in the backup with the list of
"/domainbackup/domain/devices/disk[N]" elements from the point-in-time
domain definition when the checkpoint was created - and that
point-in-time definition is unaffected by later hotplugs. You are right
that later hotplugs can affect the ability to USE a checkpoint (a disk
that is present at time of backup but not at the checkpoint undergoes a
full backup, since it was hotplugged in the meantime), but that is a
different correlation than what is being determined here.
> +
> + if (STRNEQ(disk->name, def->common.dom->disks[idx]->dst)) {
> + VIR_FREE(disk->name);
> + if (VIR_STRDUP(disk->name, def->common.dom->disks[idx]->dst)
< 0)
> + goto cleanup;
> + }
> + }
> +
> + /* Provide defaults for all remaining disks. */
> + ndisks = def->ndisks;
> + if (VIR_EXPAND_N(def->disks, def->ndisks,
> + def->common.dom->ndisks - def->ndisks) < 0)
> + goto cleanup;
> +
> + for (i = 0; i < def->common.dom->ndisks; i++) {
> + virDomainCheckpointDiskDefPtr disk;
> +
> + if (virBitmapIsBitSet(map, i))
> + continue;
> + disk = &def->disks[ndisks++];
> + if (VIR_STRDUP(disk->name, def->common.dom->disks[i]->dst) <
0)
> + goto cleanup;
> + disk->idx = i;
> +
> + /* Don't checkpoint empty drives */
What about readonly drives?
Easy enough to default to no backup on those as well.
> +static int
> +virDomainCheckpointDiskDefFormat(virBufferPtr buf,
> + virDomainCheckpointDiskDefPtr disk,
> + unsigned int flags)
> +{
> + if (!disk->name)
> + return 0;
> +
> + virBufferEscapeString(buf, "<disk name='%s'",
disk->name);
> + if (disk->type)
We should make type mandatory.
Mandatory on output, but with sane defaulting on input? Or mandatory on
input as well?
> +static int
> +virDomainCheckpointDefFormatInternal(virBufferPtr buf,
> + virDomainCheckpointDefPtr def,
> + virCapsPtr caps,
> + virDomainXMLOptionPtr xmlopt,
> + unsigned int flags)
> +{
> + size_t i;
> + unsigned int domainflags = VIR_DOMAIN_DEF_FORMAT_INACTIVE;
> +
> +
> + if (virBufferCheckError(buf) < 0)
> + goto error;
> +
> + return 0;
> +
> + error:
> + virBufferFreeAndReset(buf);
I'm not sure it's fair to the caller to free it's buffer.
Copying from what snapshots did.
> +++
b/tests/domaincheckpointxml2xmlout/internal-active-invalid.xml
> @@ -0,0 +1,53 @@
> +<domaincheckpoint>
> + <name>1525889631</name>
> + <description>Completion of updates after OS install</description>
> + <parent>
> + <name>1525111885</name>
> + </parent>
> + <creationTime>1525889631</creationTime>
> + <disks>
> + <disk name='vda' checkpoint='bitmap'
bitmap='1525889631'/>
> + <disk name='vdb' checkpoint='no'/>
> + </disks>
> + <domain type='qemu'>
Thinking about it a bit. Is it necessary to store domain at all here?
The checkpoint does not allow to roll back state as snapshots do so
having the domain is quite pointless.
Not pointless - a checkpoint can be created independently from a backup
or snapshot, but you HAVE to know which disks were present in the domain
at the time of the checkpoint in order to detect which other disks were
hotplugged since the checkpoint (and where the hotplugged disks get a
full backup, rather than incremental).
Additionally with external snapshots the checkpoint actually needs to be
changed with a snapshot as moving around in the snapshot tree makes some
checkpoints invalid.
Correct, we have to figure out which snapshot interactions may
invalidate checkpoints. If you create a snapshot without also creating a
new checkpoint at that time, then reverting to the snapshot will
invalidate the bitmap covering the period between the checkpoint before
the snapshot to the next checkpoint after the first branch from the
snapshot; otherwise, if you have a checkpoint created in tandem with the
snapshot, then reverting to the snapshot can create a new checkpoint
child of the same parent that the snapshot's original checkpoint was
also child of. Or visually, if we only have:
C1 ... Snap ... C2
the bitmap that covers the delta from C1 to C2 is invalidated when
reverting to Snap (we don't know how much of the bitmap was from C1 to
snap, vs. how much was from Snap to C2); but if we have:
C1 ... Snap/C2 ... C3
the bitmap from C2.C3 is a child of the bitmap from C1.C2, and reverting
to Snap lets us also create:
C1 ... Snap/C4 ... C5
where the bitmap from C4.C5 is also a child of the bitmap from C1.C2 but
now tracks the alternate execution. (Checkpoint C2 and C4 cover the
same moment in guest execution, but given that implementation names the
bitmap after a checkpoint, it means we need two different checkpoint
names at the same point in time to track the two different bitmaps
covering the divergent guest changes)
So in addition to being able to create a checkpoint with a snapshot, we
also need to be able to create a checkpoint at the time of reverting to
a snapshot.
> diff --git a/tests/domaincheckpointxml2xmltest.c
b/tests/domaincheckpointxml2xmltest.c
Since this is not prefixed with qemu ...
> new file mode 100644
> index 0000000000..11ec61658b
> --- /dev/null
> +++ b/tests/domaincheckpointxml2xmltest.c
> @@ -0,0 +1,223 @@
> +#include <config.h>
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +
> +#include <sys/types.h>
> +#include <fcntl.h>
> +
> +#include "testutils.h"
> +
> +#ifdef WITH_QEMU
> +
> +# include "internal.h"
> +# include "qemu/qemu_conf.h"
> +# include "qemu/qemu_domain.h"
... it should not require qemu. Either drop those or rename it to
qemudomaincheckpointxml2xmltest ...
Copied from snapshots, but I can do that.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:
qemu.org |
libvirt.org