[libvirt] [PATCH v4 0/8] bulk snapshot list/redefine (incremental backup saga)

While looking at my work on incremental backups, Nir raised a good point: if we want to recreate a set of known checkpoints on one machine that will be taking over a domain from another machine, my initial proposal required making multiple API calls to list the XML for each checkpoint on the source, and then more API calls to redefine each checkpoint on the destination. It also had the drawback that the list has to be presented in topological order (the API won't let you define a child checkpoint if the parent is not defined first). He asked if I could instead add bulk APIs: one for getting the XML for all checkpoints at once, and one for redefining all checkpoints at once.

Since my checkpoint code borrows heavily from concepts in the snapshot code, I chose to tackle the problem by starting with this series, which does the same thing for snapshots as what I plan to do for checkpoints. That is, since this patch series adds 2 new APIs, the checkpoint series will do likewise with very similar code.

I'm also toying with the idea of followup patches that store all snapshots alongside the persistent <domain> XML in a single file, rather than in one <domainsnapshot> file per snapshot (we'd still support reading from the old method at libvirtd startup, but would not need to output in that manner any more).

Based-on: <20190308060512.17733-1-eblake@redhat.com> [0/5 snapshots: topological sorting]

Changes from v3:
- introduce 2 new APIs instead of flags to 2 existing APIs [Jan]
- rebase on top of minor changes that already landed in the
  non-controversial first half of the v3 series [John]

git backport-diff gets confused by patch renames, but this is quite obviously more or less a rewrite:

001/8:[down] 'snapshot: Add new API for bulk dumpxml/redefine'
002/8:[down] 'snapshot: Support topological virDomainSnapshotForEach()'
003/8:[down] 'snapshot: Tweaks to support new bulk dumpxml/import API'
004/8:[down] 'remote: Wire up snapshot bulk dumpxml/import'
005/8:[down] 'virsh: Expose bulk snapshot dumpxml/import'
006/8:[0087] [FC] 'test: Implement bulk snapshot operations'
007/8:[0040] [FC] 'qemu: Factor out qemuDomainSnapshotValidate() helper'
008/8:[0152] [FC] 'qemu: Implement bulk snapshot operations'

Eric Blake (8):
  snapshot: Add new API for bulk dumpxml/redefine
  snapshot: Support topological virDomainSnapshotForEach()
  snapshot: Tweaks to support new bulk dumpxml/import API
  remote: Wire up snapshot bulk dumpxml/import
  virsh: Expose bulk snapshot dumpxml/import
  test: Implement bulk snapshot operations
  qemu: Factor out qemuDomainSnapshotValidate() helper
  qemu: Implement bulk snapshot operations

 include/libvirt/libvirt-domain-snapshot.h |  16 +-
 src/conf/snapshot_conf.h                  |   1 +
 src/driver-hypervisor.h                   |  13 +-
 src/conf/snapshot_conf.c                  |  34 ++--
 src/libvirt-domain-snapshot.c             | 122 ++++++++++-
 src/libvirt_public.syms                   |   2 +
 src/qemu/qemu_domain.c                    |   2 +-
 src/qemu/qemu_driver.c                    | 236 +++++++++++++++++-----
 src/remote/remote_driver.c                |   6 +-
 src/remote/remote_protocol.x              |  38 +++-
 src/remote_protocol-structs               |  17 ++
 src/test/test_driver.c                    |  62 +++++-
 src/vz/vz_driver.c                        |   3 +-
 tools/virsh-snapshot.c                    | 111 ++++++++--
 tools/virsh.pod                           |  15 +-
 15 files changed, 584 insertions(+), 94 deletions(-)

-- 
2.20.1
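For a concrete feel of the intended workflow, a client using the two new snapshot APIs could look roughly like this (illustrative sketch only: URIs and the domain name are placeholders, and error handling is elided):

  #include <stdio.h>
  #include <stdlib.h>
  #include <libvirt/libvirt.h>

  int main(void)
  {
      /* hypothetical source and destination connections */
      virConnectPtr src = virConnectOpen("qemu+ssh://source/system");
      virConnectPtr dst = virConnectOpen("qemu+ssh://destination/system");
      virDomainPtr sdom = virDomainLookupByName(src, "guest");
      virDomainPtr ddom = virDomainLookupByName(dst, "guest");
      char *xml;
      int count;

      /* one call collects all snapshot metadata ... */
      xml = virDomainGetSnapshotsXMLDesc(sdom,
                                         VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE);

      /* ... and one call reinstates it, in any order (the destination
       * must not have any snapshot metadata yet) */
      count = virDomainImportSnapshotsXML(ddom, xml, 0);
      printf("imported %d snapshots\n", count);

      free(xml);
      virDomainFree(sdom);
      virDomainFree(ddom);
      virConnectClose(src);
      virConnectClose(dst);
      return 0;
  }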

Right now, copying the state of a transient domain with snapshots from one host to another requires multiple API calls on both machines - on the source: get the domain XML, get a list of the snapshots, and then for each snapshot get the snapshot's XML; then on the destination: create the domain, then multiple domain snapshot create calls (must be in topological order) with the REDEFINE flag. This patch aims to make the process use fewer API calls by making it possible to grab the XML for all snapshots at once on the source, then redefine from that same list at the destination. In fact, although it is easy to match recent changes to allow output in topological order, it is also easy to do bulk redefine in arbitrary order.

Consideration was given to adding a flag to existing APIs; but to do that, there would be an asymmetry of dumping <snapshots> as a sub-element of <domain> during virDomainGetXMLDesc(), vs. redefining <snapshots> as a top-level element (and different than the normal top-level element) of virDomainSnapshotCreateXML(). Thus, it is cleaner to have two new APIs:

virDomainGetSnapshotsXMLDesc
virDomainImportSnapshotsXML

Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-snapshot.h | 16 ++- src/driver-hypervisor.h | 13 ++- src/libvirt-domain-snapshot.c | 122 +++++++++++++++++++++- src/libvirt_public.syms | 2 + 4 files changed, 148 insertions(+), 5 deletions(-) diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h index 602e5def59..7c2e001b87 100644 --- a/include/libvirt/libvirt-domain-snapshot.h +++ b/include/libvirt/libvirt-domain-snapshot.h @@ -3,7 +3,7 @@ * Summary: APIs for management of domain snapshots * Description: Provides APIs for the management of domain snapshots * - * Copyright (C) 2006-2014 Red Hat, Inc. + * Copyright (C) 2006-2019 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -78,6 +78,11 @@ virDomainSnapshotPtr virDomainSnapshotCreateXML(virDomainPtr domain, const char *xmlDesc, unsigned int flags); +/* Bulk import a list of snapshots */ +int virDomainImportSnapshotsXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags); + typedef enum { VIR_DOMAIN_SNAPSHOT_XML_SECURE = VIR_DOMAIN_XML_SECURE, /* dump security sensitive information too */ } virDomainSnapshotXMLFlags; @@ -86,6 +91,15 @@ typedef enum { char *virDomainSnapshotGetXMLDesc(virDomainSnapshotPtr snapshot, unsigned int flags); +typedef enum { + VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE = (1 << 0), /* dump security sensitive information too */ + VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL = (1 << 1), /* ensure parents occur before children */ +} virDomainGetSnapshotsXMLFlags; + +/* Dump the XML of all snapshots */ +char *virDomainGetSnapshotsXMLDesc(virDomainPtr domain, + unsigned int flags); + /** * virDomainSnapshotListFlags: * diff --git a/src/driver-hypervisor.h b/src/driver-hypervisor.h index 5315e33dde..c30e16b690 100644 --- a/src/driver-hypervisor.h +++ b/src/driver-hypervisor.h @@ -1,7 +1,7 @@ /* * driver-hypervisor.h: entry points for hypervisor drivers * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2019 Red Hat, Inc.
* * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -797,10 +797,19 @@ typedef virDomainSnapshotPtr const char *xmlDesc, unsigned int flags); +typedef int +(*virDrvDomainImportSnapshotsXML)(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags); + typedef char * (*virDrvDomainSnapshotGetXMLDesc)(virDomainSnapshotPtr snapshot, unsigned int flags); +typedef char * +(*virDrvDomainGetSnapshotsXMLDesc)(virDomainPtr domain, + unsigned int flags); + typedef int (*virDrvDomainSnapshotNum)(virDomainPtr domain, unsigned int flags); @@ -1494,7 +1503,9 @@ struct _virHypervisorDriver { virDrvDomainManagedSaveGetXMLDesc domainManagedSaveGetXMLDesc; virDrvDomainManagedSaveDefineXML domainManagedSaveDefineXML; virDrvDomainSnapshotCreateXML domainSnapshotCreateXML; + virDrvDomainImportSnapshotsXML domainImportSnapshotsXML; virDrvDomainSnapshotGetXMLDesc domainSnapshotGetXMLDesc; + virDrvDomainGetSnapshotsXMLDesc domainGetSnapshotsXMLDesc; virDrvDomainSnapshotNum domainSnapshotNum; virDrvDomainSnapshotListNames domainSnapshotListNames; virDrvDomainListAllSnapshots domainListAllSnapshots; diff --git a/src/libvirt-domain-snapshot.c b/src/libvirt-domain-snapshot.c index d133c84933..1ad8dfbfb8 100644 --- a/src/libvirt-domain-snapshot.c +++ b/src/libvirt-domain-snapshot.c @@ -133,7 +133,8 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot) * not exist, the hypervisor may validate that reverting to the * snapshot appears to be possible (for example, disk images have * snapshot contents by the requested name). Not all hypervisors - * support these flags. + * support these flags; and some hypervisors support + * virDomainImportSnapshotsXML() for redefining all metadata in one call. * * If @flags includes VIR_DOMAIN_SNAPSHOT_CREATE_NO_METADATA, then the * domain's disk images are modified according to @xmlDesc, but then @@ -242,6 +243,65 @@ virDomainSnapshotCreateXML(virDomainPtr domain, } +/** + * virDomainImportSnapshotsXML: + * @domain: a domain object + * @xmlDesc: string containing an XML description of a list of domain snapshots + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Imports the metadata for a list of domain snapshots using + * @xmlDesc with a top-level element of <snapshots>. + * + * This call requires that the domain currently has no snapshot + * metadata, and the snapshots can be listed in any order, whereas + * using virDomainSnapshotCreateXML() with its + * VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE flag requires multiple calls + * and topological sorting. Bulk redefinition is mainly useful for + * reinstating metadata in situations such as transient domains or + * migration. + * + * Generally, the list of snapshots is obtained from + * virDomainGetSnapshotsXMLDesc() prior to a scenario that requires + * removing snapshot metadata (such as virDomainUndefine() on a + * transient domain); although it could also be constructed by + * concatenating virDomainSnapshotGetXMLDesc() for each snapshot and + * wrapping with a <snapshots> element and optional attribute + * current='name' pointing to the current snapshot. + * + * Returns a count of snapshots imported on success, or -1 on failure. 
+ */ +int +virDomainImportSnapshotsXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "xmlDesc=%s, flags=0x%x", xmlDesc, flags); + + virResetLastError(); + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + virCheckNonNullArgGoto(xmlDesc, error); + virCheckReadOnlyGoto(conn->flags, error); + + if (conn->driver->domainImportSnapshotsXML) { + int ret = conn->driver->domainImportSnapshotsXML(domain, xmlDesc, + flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + /** * virDomainSnapshotGetXMLDesc: * @snapshot: a domain snapshot object @@ -254,8 +314,8 @@ virDomainSnapshotCreateXML(virDomainPtr domain, * VIR_DOMAIN_SNAPSHOT_XML_SECURE; this flag is rejected on read-only * connections. * - * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case of error. - * the caller must free() the returned value. + * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case + * of error. The caller must free() the returned value. */ char * virDomainSnapshotGetXMLDesc(virDomainSnapshotPtr snapshot, @@ -291,6 +351,62 @@ virDomainSnapshotGetXMLDesc(virDomainSnapshotPtr snapshot, } +/** + * virDomainGetSnapshotsXMLDesc: + * @domain: a domain object + * @flags: bitwise-OR of virDomainGetSnapshotsXMLFlags + * + * Provide an XML description of all domain snapshots, with a top-level + * element of <snapshots>. + * + * No security-sensitive data will be included unless @flags contains + * VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE; this flag is rejected on read-only + * connections. + * + * If @flags contains VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL, then + * it is guaranteed that no snapshot appears in the resulting XML + * prior to its parent; otherwise, the order of snapshots in the + * resulting list is unspecified. Use of this flag is not required for + * virDomainImportSnapshotsXML() to transfer snapshot metadata to + * another instance of libvirt. + * + * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case + * of error. The caller must free() the returned value. + */ +char * +virDomainGetSnapshotsXMLDesc(virDomainPtr domain, + unsigned int flags) +{ + virConnectPtr conn; + VIR_DOMAIN_DEBUG(domain, "flags=0x%x", flags); + + virResetLastError(); + + virCheckDomainReturn(domain, NULL); + conn = domain->conn; + + if ((conn->flags & VIR_CONNECT_RO) && + (flags & VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE)) { + virReportError(VIR_ERR_OPERATION_DENIED, "%s", + _("virDomainGetSnapshotsXMLDesc with secure flag")); + goto error; + } + + if (conn->driver->domainGetSnapshotsXMLDesc) { + char *ret; + ret = conn->driver->domainGetSnapshotsXMLDesc(domain, flags); + if (!ret) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + /** * virDomainSnapshotNum: * @domain: a domain object diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index dbce3336d5..d40da8b893 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -817,6 +817,8 @@ LIBVIRT_4.10.0 { LIBVIRT_5.2.0 { global: virConnectGetStoragePoolCapabilities; + virDomainGetSnapshotsXMLDesc; + virDomainImportSnapshotsXML; } LIBVIRT_4.10.0; # .... define new API here using predicted next version number .... -- 2.20.1
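For reference, the <snapshots> document that virDomainGetSnapshotsXMLDesc() produces and virDomainImportSnapshotsXML() consumes would look along these lines (a sketch with contents elided; each child follows the existing <domainsnapshot> schema, and the current='name' attribute is optional):

  <snapshots current='s2'>
    <domainsnapshot>
      <name>s1</name>
      ...
    </domainsnapshot>
    <domainsnapshot>
      <name>s2</name>
      <parent>
        <name>s1</name>
      </parent>
      ...
    </domainsnapshot>
  </snapshots>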

On Mon, Mar 11, 2019 at 09:38:32PM -0500, Eric Blake wrote:
Right now, copying the state of a transient domain with snapshots from one host to another requires multiple API calls on both machines - on the source: get the domain XML, get a list of the snapshots, and then for each snapshot get the snapshot's XML; then on the destination: create the domain, then multiple domain snapshot create calls (must be in topological order) with the REDEFINE flag. This patch aims to make the process use fewer API calls by making it possible to grab the XML for all snapshots at once on the source, then redefine from that same list at the destination. In fact, although it is easy to match recent changes to allow output in topological order, it is also easy to do bulk redefine in arbitrary order.
Consideration was given to adding a flag to existing APIs; but to do that, there would be an asymmetry of dumping <snapshots> as a sub-element of <domain> during virDomainGetXMLDesc(), vs. redefining <snapshots> as a top-level element (and different than the normal top-level element) of virDomainSnapshotCreateXML(). Thus, it is cleaner to have two new APIs:
virDomainGetSnapshotsXMLDesc virDomainImportSnapshotsXML
These do not share a common prefix, but that makes them look nicer IMO.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-snapshot.h | 16 ++- src/driver-hypervisor.h | 13 ++- src/libvirt-domain-snapshot.c | 122 +++++++++++++++++++++- src/libvirt_public.syms | 2 + 4 files changed, 148 insertions(+), 5 deletions(-)
diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h index 602e5def59..7c2e001b87 100644 --- a/include/libvirt/libvirt-domain-snapshot.h +++ b/include/libvirt/libvirt-domain-snapshot.h @@ -3,7 +3,7 @@ * Summary: APIs for management of domain snapshots * Description: Provides APIs for the management of domain snapshots * - * Copyright (C) 2006-2014 Red Hat, Inc. + * Copyright (C) 2006-2019 Red Hat, Inc.
Do we actually care about updating these? But an include file will possibly be read by people without git access.
* * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public
[...]
@@ -242,6 +243,65 @@ virDomainSnapshotCreateXML(virDomainPtr domain, }
+/** + * virDomainImportSnapshotsXML: + * @domain: a domain object + * @xmlDesc: string containing an XML description of a list of domain snapshots + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Imports the metadata for a list of domain snapshots using + * @xmlDesc with a top-level element of <snapshots>. + * + * This call requires that the domain currently has no snapshot + * metadata, and the snapshots can be listed in any order, whereas + * using virDomainSnapshotCreateXML() with its + * VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE flag requires multiple calls + * and topological sorting. Bulk redefinition is mainly useful for + * reinstating metadata in situations such as transient domains or + * migration. + * + * Generally, the list of snapshots is obtained from + * virDomainGetSnapshotsXMLDesc() prior to a scenario that requires + * removing snapshot metadata (such as virDomainUndefine() on a + * transient domain); although it could also be constructed by + * concatenating virDomainSnapshotGetXMLDesc() for each snapshot and + * wrapping with a <snapshots> element and optional attribute + * current='name' pointing to the current snapshot. + * + * Returns a count of snapshots imported on success, or -1 on failure.
This says nothing about partial success, i.e. whether the count returned can be different from the number of snapshots present in the file.
+ */ +int +virDomainImportSnapshotsXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "xmlDesc=%s, flags=0x%x", xmlDesc, flags); + + virResetLastError(); + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + virCheckNonNullArgGoto(xmlDesc, error); + virCheckReadOnlyGoto(conn->flags, error); + + if (conn->driver->domainImportSnapshotsXML) { + int ret = conn->driver->domainImportSnapshotsXML(domain, xmlDesc, + flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + /** * virDomainSnapshotGetXMLDesc: * @snapshot: a domain snapshot object @@ -254,8 +314,8 @@ virDomainSnapshotCreateXML(virDomainPtr domain, * VIR_DOMAIN_SNAPSHOT_XML_SECURE; this flag is rejected on read-only * connections. * - * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case of error. - * the caller must free() the returned value. + * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case + * of error. The caller must free() the returned value.
Unrelated change.
*/ char * virDomainSnapshotGetXMLDesc(virDomainSnapshotPtr snapshot,
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On 3/12/19 7:57 AM, Ján Tomko wrote:
On Mon, Mar 11, 2019 at 09:38:32PM -0500, Eric Blake wrote:
Right now, copying the state of a transient domain with snapshots from one host to another requires multiple API calls on both machines - on the source: get the domain XML, get a list of the snapshots, and then for each snapshot get the snapshot's XML; then on the destination: create the domain, then multiple domain snapshot create calls (must be in topological order) with the REDEFINE flag. This patch aims to make the process use fewer API calls by making it possible to grab the XML for all snapshots at once on the source, then redefine from that same list at the destination. In fact, although it is easy to match recent changes to allow output in topological order, it is also easy to do bulk redefine in arbitrary order.
Consideration was given to adding a flag to existing APIs; but to do that, there would be an asymmetry of dumping <snapshots> as a sub-element of <domain> during virDomainGetXMLDesc(), vs. redefining <snapshots> as a top-level element (and different than the normal top-level element) of virDomainSnapshotCreateXML(). Thus, it is cleaner to have two new APIs:
virDomainGetSnapshotsXMLDesc virDomainImportSnapshotsXML
These do not share a common prefix, but that makes them look nicer IMO.
In particular, they do NOT have the virDomainSnapshot prefix, which means I don't have to special-case them in the python generator :)
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-snapshot.h | 16 ++- src/driver-hypervisor.h | 13 ++- src/libvirt-domain-snapshot.c | 122 +++++++++++++++++++++- src/libvirt_public.syms | 2 + 4 files changed, 148 insertions(+), 5 deletions(-)
diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h index 602e5def59..7c2e001b87 100644 --- a/include/libvirt/libvirt-domain-snapshot.h +++ b/include/libvirt/libvirt-domain-snapshot.h @@ -3,7 +3,7 @@ * Summary: APIs for management of domain snapshots * Description: Provides APIs for the management of domain snapshots * - * Copyright (C) 2006-2014 Red Hat, Inc. + * Copyright (C) 2006-2019 Red Hat, Inc.
Do we actually care about updating these?
I have an emacs hook that prompts me to auto-update on files I touch. I know we haven't been insistent on keeping things up-to-date, but I also don't think it hurts.
But an include file will possibly be read by people without git access.
Yeah, especially for public-facing files, having a reliable year in there might matter (although these days, I hope that any court litigating a copyright claim against open source would be smart enough to rely more on version control history than anything in the file in question).
+ * + * Returns a count of snapshots imported on success, or -1 on failure.
This says nothing about partial success, i.e. whether the count returned can be different from the number of snapshots present in the file.
Partial success is not possible; it's an all-or-none return. But I can try to tweak that wording to make it clearer.
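Perhaps along these lines (just one possible wording):

  * Returns the number of snapshots imported on success, or -1 on
  * failure.  The import is all-or-none: on failure, no snapshot
  * metadata is defined.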
/** * virDomainSnapshotGetXMLDesc: * @snapshot: a domain snapshot object @@ -254,8 +314,8 @@ virDomainSnapshotCreateXML(virDomainPtr domain, * VIR_DOMAIN_SNAPSHOT_XML_SECURE; this flag is rejected on read-only * connections. * - * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case of error. - * the caller must free() the returned value. + * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case + * of error. The caller must free() the returned value.
Unrelated change.
Noticed while copy-and-pasting. If needed, I can split it out as a trivial patch.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Previous patches added topological sorting only for existing public API functions, but it turns out that it will also be useful for an upcoming API addition that wants to visit all snapshots. Add a parameter, and update all existing callers (none of which care about ordering).

Signed-off-by: Eric Blake <eblake@redhat.com> --- src/conf/snapshot_conf.h | 1 + src/conf/snapshot_conf.c | 6 +++++- src/qemu/qemu_domain.c | 2 +- src/vz/vz_driver.c | 3 ++- 4 files changed, 9 insertions(+), 3 deletions(-) diff --git a/src/conf/snapshot_conf.h b/src/conf/snapshot_conf.h index 6d79dbb0da..ba9362c744 100644 --- a/src/conf/snapshot_conf.h +++ b/src/conf/snapshot_conf.h @@ -170,6 +170,7 @@ virDomainSnapshotObjPtr virDomainSnapshotFindByName(virDomainSnapshotObjListPtr void virDomainSnapshotObjListRemove(virDomainSnapshotObjListPtr snapshots, virDomainSnapshotObjPtr snapshot); int virDomainSnapshotForEach(virDomainSnapshotObjListPtr snapshots, + bool topological, virHashIterator iter, void *data); int virDomainSnapshotForEachChild(virDomainSnapshotObjPtr snapshot, diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c index e2c91a5072..8235d7c526 100644 --- a/src/conf/snapshot_conf.c +++ b/src/conf/snapshot_conf.c @@ -1050,7 +1050,7 @@ virDomainSnapshotObjListFormat(virBufferPtr buf, current_snapshot->def->name); virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); - if (virDomainSnapshotForEach(snapshots, virDomainSnapshotFormatOne, + if (virDomainSnapshotForEach(snapshots, false, virDomainSnapshotFormatOne, &data) < 0) { virBufferFreeAndReset(buf); return -1; @@ -1293,9 +1293,13 @@ void virDomainSnapshotObjListRemove(virDomainSnapshotObjListPtr snapshots, int virDomainSnapshotForEach(virDomainSnapshotObjListPtr snapshots, + bool topological, virHashIterator iter, void *data) { + if (topological) + return virDomainSnapshotForEachDescendant(&snapshots->metaroot, + iter, data); return virHashForEach(snapshots->objs, iter, data); } diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 1659e88478..34f3669967 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8675,7 +8675,7 @@ qemuDomainSnapshotDiscardAllMetadata(virQEMUDriverPtr driver, rem.vm = vm; rem.metadata_only = true; rem.err = 0; - virDomainSnapshotForEach(vm->snapshots, qemuDomainSnapshotDiscardAll, + virDomainSnapshotForEach(vm->snapshots, false, qemuDomainSnapshotDiscardAll, &rem); if (rem.current) vm->current_snapshot = NULL; diff --git a/src/vz/vz_driver.c b/src/vz/vz_driver.c index 066d617524..2a0fc98f72 100644 --- a/src/vz/vz_driver.c +++ b/src/vz/vz_driver.c @@ -2170,7 +2170,8 @@ vzFindCurrentSnapshot(virDomainSnapshotObjListPtr snapshots) { virDomainSnapshotObjPtr current = NULL; - virDomainSnapshotForEach(snapshots, vzCurrentSnapshotIterator, &current); + virDomainSnapshotForEach(snapshots, false, vzCurrentSnapshotIterator, + &current); return current; } -- 2.20.1

On Mon, Mar 11, 2019 at 09:38:33PM -0500, Eric Blake wrote:
Previous patches added topological sorting only for existing public API functions, but it turns out that it will also be useful for an upcoming API addition that wants to visit all snapshots. Add a parameter, and update all existing callers (none of which care about ordering).
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/conf/snapshot_conf.h | 1 + src/conf/snapshot_conf.c | 6 +++++- src/qemu/qemu_domain.c | 2 +- src/vz/vz_driver.c | 3 ++- 4 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/src/conf/snapshot_conf.h b/src/conf/snapshot_conf.h index 6d79dbb0da..ba9362c744 100644 --- a/src/conf/snapshot_conf.h +++ b/src/conf/snapshot_conf.h @@ -170,6 +170,7 @@ virDomainSnapshotObjPtr virDomainSnapshotFindByName(virDomainSnapshotObjListPtr void virDomainSnapshotObjListRemove(virDomainSnapshotObjListPtr snapshots, virDomainSnapshotObjPtr snapshot); int virDomainSnapshotForEach(virDomainSnapshotObjListPtr snapshots, + bool topological, virHashIterator iter, void *data); int virDomainSnapshotForEachChild(virDomainSnapshotObjPtr snapshot, diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c index e2c91a5072..8235d7c526 100644 --- a/src/conf/snapshot_conf.c +++ b/src/conf/snapshot_conf.c @@ -1050,7 +1050,7 @@ virDomainSnapshotObjListFormat(virBufferPtr buf, current_snapshot->def->name); virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); - if (virDomainSnapshotForEach(snapshots, virDomainSnapshotFormatOne, + if (virDomainSnapshotForEach(snapshots, false, virDomainSnapshotFormatOne, &data) < 0) { virBufferFreeAndReset(buf); return -1; @@ -1293,9 +1293,13 @@ void virDomainSnapshotObjListRemove(virDomainSnapshotObjListPtr snapshots,
int virDomainSnapshotForEach(virDomainSnapshotObjListPtr snapshots, + bool topological,
Rather than introducing a bool parameter (which shows up as non-descriptive 'false' in the caller), I'd put it into the name of the function:

  virDomainSnapshotForEachTopological

which would be a thin wrapper for virDomainSnapshotForEachDescendant. Alternatively, the one caller can call virDomainSnapshotForEachDescendant with &snapshots->metaroot directly.

Jano
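For concreteness, the wrapper could be as thin as (untested sketch):

  int
  virDomainSnapshotForEachTopological(virDomainSnapshotObjListPtr snapshots,
                                      virHashIterator iter,
                                      void *data)
  {
      /* visit parents before children by walking from the metaroot */
      return virDomainSnapshotForEachDescendant(&snapshots->metaroot,
                                                iter, data);
  }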
virHashIterator iter, void *data) { + if (topological) + return virDomainSnapshotForEachDescendant(&snapshots->metaroot, + iter, data); return virHashForEach(snapshots->objs, iter, data); }

On 3/12/19 8:00 AM, Ján Tomko wrote:
On Mon, Mar 11, 2019 at 09:38:33PM -0500, Eric Blake wrote:
Previous patches added topological sorting only for existing public API functions, but it turns out that it will also be useful for an upcoming API addition that wants to visit all snapshots. Add a parameter, and update all existing callers (none of which care about ordering).
Signed-off-by: Eric Blake <eblake@redhat.com> ---
int virDomainSnapshotForEach(virDomainSnapshotObjListPtr snapshots, + bool topological,
Rather than introducing a bool parameter (which shows up as non-descriptive 'false' in the caller), I'd put it into the name of the function:
virDomainSnapshotForEachTopological
which would be a thin wrapper for virDomainSnapshotForEachDescendant. Alternatively, the one caller can call virDomainSnapshotForEachDescendant with &snapshots->metaroot directly.
I was debating whether any caller outside of snapshot_conf.c would need a topological visit (knowing about the metaroot outside of snapshot_conf.c is a leaky abstraction that I want to avoid). But you're right that for THIS series, the only caller resides in snapshot_conf.c, and therefore doing it directly is just as nice. I think that means I can completely drop this patch and go with that approach instead.

By the way, did you have any comments on my other series adding topological sorting to the various virDomainSnapshotList API calls? (I may also end up squashing some of these changes into those, as that series has not been pushed yet).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Change the return value of virDomainSnapshotObjLisParse() to return the number of snapshots imported, and allow a return of 0 (the original proposal of adding a flag to virDomainSnapshotCreateXML required returning an arbitrary non-NULL snapshot, but with a new API that returns a count, we are no longer constrained to a non-empty list). Change virDomainSnapshotObjListFormat()'s flags argument to be the new public virDomainGetSnapshotsXMLFlags, since it is easy to support both flag values. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/conf/snapshot_conf.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c index 8235d7c526..3f24a80f76 100644 --- a/src/conf/snapshot_conf.c +++ b/src/conf/snapshot_conf.c @@ -507,8 +507,10 @@ virDomainSnapshotRedefineValidate(virDomainSnapshotDefPtr def, } -/* Parse a <snapshots> XML entry into snapshots, which must start empty. - * Any <domain> sub-elements of a <domainsnapshot> must match domain_uuid. +/* Parse a <snapshots> XML entry into snapshots, which must start + * empty. Any <domain> sub-elements of a <domainsnapshot> must match + * domain_uuid. @flags is virDomainSnapshotParseFlags. Return the + * number of snapshots parsed, or -1 on error. */ int virDomainSnapshotObjListParse(const char *xmlStr, @@ -562,11 +564,6 @@ virDomainSnapshotObjListParse(const char *xmlStr, if ((n = virXPathNodeSet("./domainsnapshot", ctxt, &nodes)) < 0) goto cleanup; - if (!n) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("expected at least one <domainsnapshot> child")); - goto cleanup; - } for (i = 0; i < n; i++) { virDomainSnapshotDefPtr def; @@ -601,7 +598,7 @@ virDomainSnapshotObjListParse(const char *xmlStr, (*current_snap)->def->current = true; } - ret = 0; + ret = n; cleanup: if (ret < 0) { /* There were no snapshots before this call; so on error, just @@ -1025,8 +1022,9 @@ virDomainSnapshotFormatOne(void *payload, } -/* Format the XML for all snapshots in the list into buf. On error, - * clear the buffer and return -1. */ +/* Format the XML for all snapshots in the list into buf. @flags is + * virDomainGetSnapshotsXMLFlags. On error, clear the buffer and + * return -1. */ int virDomainSnapshotObjListFormat(virBufferPtr buf, const char *uuidstr, @@ -1041,17 +1039,23 @@ virDomainSnapshotObjListFormat(virBufferPtr buf, .uuidstr = uuidstr, .caps = caps, .xmlopt = xmlopt, - .flags = flags, + .flags = 0, }; + bool topological = flags & VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL; + virCheckFlags(VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE | + VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL, -1); + + if (flags & VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE) + data.flags |= VIR_DOMAIN_SNAPSHOT_FORMAT_SECURE; virBufferAddLit(buf, "<snapshots"); if (current_snapshot) virBufferEscapeString(buf, " current='%s'", current_snapshot->def->name); virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); - if (virDomainSnapshotForEach(snapshots, false, virDomainSnapshotFormatOne, - &data) < 0) { + if (virDomainSnapshotForEach(snapshots, topological, + virDomainSnapshotFormatOne, &data) < 0) { virBufferFreeAndReset(buf); return -1; } -- 2.20.1

On Mon, Mar 11, 2019 at 09:38:34PM -0500, Eric Blake wrote:
Change the return value of virDomainSnapshotObjLisParse() to return the number of snapshots imported, and allow a return of 0 (the original proposal of adding a flag to virDomainSnapshotCreateXML required returning an arbitrary non-NULL snapshot, but with a new API that returns a count, we are no longer constrained to a non-empty list).
Change virDomainSnapshotObjListFormat()'s flags argument to be the new public virDomainGetSnapshotsXMLFlags, since it is easy to support both flag values.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/conf/snapshot_conf.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c index 8235d7c526..3f24a80f76 100644 --- a/src/conf/snapshot_conf.c +++ b/src/conf/snapshot_conf.c @@ -507,8 +507,10 @@ virDomainSnapshotRedefineValidate(virDomainSnapshotDefPtr def, }
-/* Parse a <snapshots> XML entry into snapshots, which must start empty. - * Any <domain> sub-elements of a <domainsnapshot> must match domain_uuid. +/* Parse a <snapshots> XML entry into snapshots, which must start + * empty. Any <domain> sub-elements of a <domainsnapshot> must match + * domain_uuid. @flags is virDomainSnapshotParseFlags. Return the
Do we need to pass (and check) the flags here? Given this series, it would also make sense to me to drop the flags argument and just imply the ones that are needed for bulk parse.
+ * number of snapshots parsed, or -1 on error. */ int virDomainSnapshotObjListParse(const char *xmlStr, @@ -562,11 +564,6 @@ virDomainSnapshotObjListParse(const char *xmlStr,
if ((n = virXPathNodeSet("./domainsnapshot", ctxt, &nodes)) < 0) goto cleanup; - if (!n) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("expected at least one <domainsnapshot> child")); - goto cleanup; - }
for (i = 0; i < n; i++) { virDomainSnapshotDefPtr def; @@ -601,7 +598,7 @@ virDomainSnapshotObjListParse(const char *xmlStr, (*current_snap)->def->current = true; }
- ret = 0; + ret = n; cleanup: if (ret < 0) { /* There were no snapshots before this call; so on error, just @@ -1025,8 +1022,9 @@ virDomainSnapshotFormatOne(void *payload, }
-/* Format the XML for all snapshots in the list into buf. On error, - * clear the buffer and return -1. */ +/* Format the XML for all snapshots in the list into buf. @flags is + * virDomainGetSnapshotsXMLFlags. On error, clear the buffer and + * return -1. */ int virDomainSnapshotObjListFormat(virBufferPtr buf, const char *uuidstr, @@ -1041,17 +1039,23 @@ virDomainSnapshotObjListFormat(virBufferPtr buf, .uuidstr = uuidstr, .caps = caps, .xmlopt = xmlopt, - .flags = flags, + .flags = 0, };
You can use virXMLFormatElement here:

    VIR_AUTOCLEAN(virBuffer) attrBuf = VIR_BUFFER_INITIALIZER;
    VIR_AUTOCLEAN(virBuffer) childBuf = VIR_BUFFER_INITIALIZER;

    data.buf = &childBuf;
+ bool topological = flags & VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL;
+ virCheckFlags(VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE | + VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL, -1); + + if (flags & VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE) + data.flags |= VIR_DOMAIN_SNAPSHOT_FORMAT_SECURE;
This:
virBufferAddLit(buf, "<snapshots"); if (current_snapshot) virBufferEscapeString(buf, " current='%s'", current_snapshot->def->name); virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); - if (virDomainSnapshotForEach(snapshots, false, virDomainSnapshotFormatOne, - &data) < 0) { + if (virDomainSnapshotForEach(snapshots, topological, + virDomainSnapshotFormatOne, &data) < 0) {
would become:

    if (current_snapshot) {
        virBufferEscapeString(&attrBuf, " current='%s'",
                              current_snapshot->def->name);
    }

    virBufferSetChildIndent(&childBuf, buf);
    if (virDomainSnapshotForEach() < 0)
        ...

    if (virXMLFormatElement(buf, "snapshots", &attrBuf, &childBuf) < 0) {
        virBufferFreeAndReset(buf); /* as required by the function documentation */
        return -1;
    }

With the benefit of outputting a single self-closing tag for an empty snapshot list.
virBufferFreeAndReset(buf); return -1; }
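That is, an empty snapshot list would collapse to a self-closing <snapshots/> (or, if virXMLFormatElement sees both buffers empty, no element at all) instead of an empty open/close pair.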
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On Mon, Mar 11, 2019 at 09:38:34PM -0500, Eric Blake wrote:
Change the return value of virDomainSnapshotObjLisParse() to return
also s/ObjLis/ObjList/ Jano
the number of snapshots imported, and allow a return of 0 (the original proposal of adding a flag to virDomainSnapshotCreateXML required returning an arbitrary non-NULL snapshot, but with a new API that returns a count, we are no longer constrained to a non-empty list).
Change virDomainSnapshotObjListFormat()'s flags argument to be the new public virDomainGetSnapshotsXMLFlags, since it is easy to support both flag values.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/conf/snapshot_conf.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-)

Typical copy-and-paste addition. Fortunately, the generator handles this one without needing manual overrides. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/remote/remote_driver.c | 6 ++++-- src/remote/remote_protocol.x | 38 ++++++++++++++++++++++++++++++++++-- src/remote_protocol-structs | 17 ++++++++++++++++ 3 files changed, 57 insertions(+), 4 deletions(-) diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index eabe7a3823..51b2a6a448 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -2,7 +2,7 @@ * remote_driver.c: driver to provide access to libvirtd running * on a remote machine * - * Copyright (C) 2007-2015 Red Hat, Inc. + * Copyright (C) 2007-2019 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -8516,7 +8516,9 @@ static virHypervisorDriver hypervisor_driver = { .connectCompareHypervisorCPU = remoteConnectCompareHypervisorCPU, /* 4.4.0 */ .connectBaselineHypervisorCPU = remoteConnectBaselineHypervisorCPU, /* 4.4.0 */ .nodeGetSEVInfo = remoteNodeGetSEVInfo, /* 4.5.0 */ - .domainGetLaunchSecurityInfo = remoteDomainGetLaunchSecurityInfo /* 4.5.0 */ + .domainGetLaunchSecurityInfo = remoteDomainGetLaunchSecurityInfo, /* 4.5.0 */ + .domainGetSnapshotsXMLDesc = remoteDomainGetSnapshotsXMLDesc, /* 5.2.0 */ + .domainImportSnapshotsXML = remoteDomainImportSnapshotsXML, /* 5.2.0 */ }; static virNetworkDriver network_driver = { diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 74be4b37d0..c585ec10ef 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -3,7 +3,7 @@ * remote_internal driver and libvirtd. This protocol is * internal and may change at any time. * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2019 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -3573,6 +3573,26 @@ struct remote_connect_get_storage_pool_capabilities_ret { remote_nonnull_string capabilities; }; +struct remote_domain_get_snapshots_xml_desc_args { + remote_nonnull_domain domain; + unsigned int flags; +}; + +struct remote_domain_get_snapshots_xml_desc_ret { + remote_nonnull_string xml; +}; + +struct remote_domain_import_snapshots_xml_args { + remote_nonnull_domain dom; + remote_nonnull_string xml; + unsigned int flags; +}; + +struct remote_domain_import_snapshots_xml_ret { + int result; +}; + + /*----- Protocol. -----*/ /* Define the program number, protocol version and procedure numbers here. 
*/ @@ -6342,5 +6362,19 @@ enum remote_procedure { * @generate: both * @acl: connect:read */ - REMOTE_PROC_CONNECT_GET_STORAGE_POOL_CAPABILITIES = 403 + REMOTE_PROC_CONNECT_GET_STORAGE_POOL_CAPABILITIES = 403, + + /** + * @generate: both + * @priority: high + * @acl: domain:read + * @acl: domain:read_secure:VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE + */ + REMOTE_PROC_DOMAIN_GET_SNAPSHOTS_XML_DESC = 404, + + /** + * @generate: both + * @acl: domain:snapshot + */ + REMOTE_PROC_DOMAIN_IMPORT_SNAPSHOTS_XML = 405 }; diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 768189c573..b783d43a42 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -2981,6 +2981,21 @@ struct remote_connect_get_storage_pool_capabilities_args { struct remote_connect_get_storage_pool_capabilities_ret { remote_nonnull_string capabilities; }; +struct remote_domain_get_snapshots_xml_desc_args { + remote_nonnull_domain domain; + u_int flags; +}; +struct remote_domain_get_snapshots_xml_desc_ret { + remote_nonnull_string xml; +}; +struct remote_domain_import_snapshots_xml_args { + remote_nonnull_domain dom; + remote_nonnull_string xml; + u_int flags; +}; +struct remote_domain_import_snapshots_xml_ret { + int result; +}; enum remote_procedure { REMOTE_PROC_CONNECT_OPEN = 1, REMOTE_PROC_CONNECT_CLOSE = 2, @@ -3385,4 +3400,6 @@ enum remote_procedure { REMOTE_PROC_CONNECT_LIST_ALL_NWFILTER_BINDINGS = 401, REMOTE_PROC_DOMAIN_SET_IOTHREAD_PARAMS = 402, REMOTE_PROC_CONNECT_GET_STORAGE_POOL_CAPABILITIES = 403, + REMOTE_PROC_DOMAIN_GET_SNAPSHOTS_XML_DESC = 404, + REMOTE_PROC_DOMAIN_IMPORT_SNAPSHOTS_XML = 405, }; -- 2.20.1

On Mon, Mar 11, 2019 at 09:38:35PM -0500, Eric Blake wrote:
Typical copy-and-paste addition. Fortunately, the generator handles this one without needing manual overrides.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/remote/remote_driver.c | 6 ++++-- src/remote/remote_protocol.x | 38 ++++++++++++++++++++++++++++++++++-- src/remote_protocol-structs | 17 ++++++++++++++++ 3 files changed, 57 insertions(+), 4 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Add a new 'snapshot-import' command for bulk import. For bulk XML, it was easier to just add new flags to the existing 'snapshot-dumpxml' than to figure out a new command name. I debated about whether omitting the snapshotname should be enough to trigger the new API call, or whether to require a new bool --all; in the end, I went with the latter. Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-snapshot.c | 111 ++++++++++++++++++++++++++++++++++++----- tools/virsh.pod | 15 +++++- 2 files changed, 112 insertions(+), 14 deletions(-) diff --git a/tools/virsh-snapshot.c b/tools/virsh-snapshot.c index 31153f5b10..2a2fc0108b 100644 --- a/tools/virsh-snapshot.c +++ b/tools/virsh-snapshot.c @@ -1,7 +1,7 @@ /* * virsh-snapshot.c: Commands to manage domain snapshot * - * Copyright (C) 2005, 2007-2016 Red Hat, Inc. + * Copyright (C) 2005-2019 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -444,6 +444,61 @@ cmdSnapshotCreateAs(vshControl *ctl, const vshCmd *cmd) return ret; } +/* + * "snapshot-import" command + */ +static const vshCmdInfo info_snapshot_import[] = { + {.name = "help", + .data = N_("Bulk import snapshots from XML") + }, + {.name = "desc", + .data = N_("Import the metadata for multiple snapshots from XML") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_snapshot_import[] = { + VIRSH_COMMON_OPT_DOMAIN_FULL(0), + {.name = "xmlfile", + .type = VSH_OT_DATA, + .flags = VSH_OFLAG_REQ, + .help = N_("domain snapshots XML"), + }, + {.name = NULL} +}; + +static bool +cmdSnapshotImport(vshControl *ctl, const vshCmd *cmd) +{ + virDomainPtr dom = NULL; + bool ret = false; + const char *from = NULL; + char *buffer = NULL; + int count; + + if (!(dom = virshCommandOptDomain(ctl, cmd, NULL))) + goto cleanup; + + if (vshCommandOptStringReq(ctl, cmd, "xmlfile", &from) < 0) + goto cleanup; + if (virFileReadAll(from, VSH_MAX_XML_FILE, &buffer) < 0) { + vshSaveLibvirtError(); + goto cleanup; + } + + if ((count = virDomainImportSnapshotsXML(dom, buffer, 0)) < 0) + goto cleanup; + vshPrint(ctl, _("Imported %d snapshots"), count); + + ret = true; + + cleanup: + VIR_FREE(buffer); + virshDomainFree(dom); + + return ret; +} + /* Helper for resolving {--current | --ARG name} into a snapshot * belonging to DOM. If EXCLUSIVE, fail if both --current and arg are * present. On success, populate *SNAP and *NAME, before returning 0. 
@@ -1664,11 +1719,18 @@ static const vshCmdInfo info_snapshot_dumpxml[] = { static const vshCmdOptDef opts_snapshot_dumpxml[] = { VIRSH_COMMON_OPT_DOMAIN_FULL(0), {.name = "snapshotname", - .type = VSH_OT_DATA, - .flags = VSH_OFLAG_REQ, + .type = VSH_OT_STRING, .help = N_("snapshot name"), .completer = virshSnapshotNameCompleter, }, + {.name = "all", + .type = VSH_OT_BOOL, + .help = N_("list all snapshots at once"), + }, + {.name = "topological", + .type = VSH_OT_BOOL, + .help = N_("with --all, ensure listing is topologically sorted"), + }, {.name = "security-info", .type = VSH_OT_BOOL, .help = N_("include security sensitive information in XML dump") @@ -1681,32 +1743,49 @@ cmdSnapshotDumpXML(vshControl *ctl, const vshCmd *cmd) { virDomainPtr dom = NULL; bool ret = false; - const char *name = NULL; + const char *snapshotname = NULL; virDomainSnapshotPtr snapshot = NULL; char *xml = NULL; unsigned int flags = 0; + bool all = vshCommandOptBool(cmd, "all"); + + if (vshCommandOptStringReq(ctl, cmd, "snapshotname", &snapshotname) < 0) + return false; + + VSH_EXCLUSIVE_OPTIONS_VAR(snapshotname, all); + VSH_EXCLUSIVE_OPTIONS("snapshotname", "topological"); if (vshCommandOptBool(cmd, "security-info")) - flags |= VIR_DOMAIN_XML_SECURE; - - if (vshCommandOptStringReq(ctl, cmd, "snapshotname", &name) < 0) - return false; + flags |= all ? VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE : + VIR_DOMAIN_XML_SECURE; + if (vshCommandOptBool(cmd, "topological")) + flags |= VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL; if (!(dom = virshCommandOptDomain(ctl, cmd, NULL))) return false; - if (!(snapshot = virDomainSnapshotLookupByName(dom, name, 0))) - goto cleanup; + if (all) { + if (!(xml = virDomainGetSnapshotsXMLDesc(dom, flags))) + goto cleanup; + } else { + if (!snapshotname) { + vshError(ctl, "%s", _("either snapshotname or --all required")); + goto cleanup; + } + if (!(snapshot = virDomainSnapshotLookupByName(dom, snapshotname, 0))) + goto cleanup; - if (!(xml = virDomainSnapshotGetXMLDesc(snapshot, flags))) - goto cleanup; + if (!(xml = virDomainSnapshotGetXMLDesc(snapshot, flags))) + goto cleanup; + } vshPrint(ctl, "%s", xml); ret = true; cleanup: VIR_FREE(xml); - virshDomainSnapshotFree(snapshot); + if (!all) + virshDomainSnapshotFree(snapshot); virshDomainFree(dom); return ret; @@ -1952,6 +2031,12 @@ const vshCmdDef snapshotCmds[] = { .info = info_snapshot_create_as, .flags = 0 }, + {.name = "snapshot-import", + .handler = cmdSnapshotImport, + .opts = opts_snapshot_import, + .info = info_snapshot_import, + .flags = 0, + }, {.name = "snapshot-current", .handler = cmdSnapshotCurrent, .opts = opts_snapshot_current, diff --git a/tools/virsh.pod b/tools/virsh.pod index 66e2bf24ec..28adcb85c3 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -4680,6 +4680,15 @@ If I<--live> is specified, libvirt takes the snapshot while the guest is running. This increases the size of the memory image of the external snapshot. This is currently supported only for external full system snapshots. +=item B<snapshot-import> I<domain> I<xmlfile> + +Import the metadata for all snapshots, for a domain with no existing +snapshot metadata. This is useful for migrating snapshots from one +host (with B<snapshot-dumpxml --all --security-info>) to another. + +This command is a shortcut to using B<snapshot-create --redefine> one +snapshot at a time. + =item B<snapshot-current> I<domain> {[I<--name>] | [I<--security-info>] | [I<snapshotname>]} @@ -4786,9 +4795,13 @@ that use internal storage of existing disk images. 
If I<--external> is specified, the list will be filtered to snapshots that use external files for disk images or memory state. -=item B<snapshot-dumpxml> I<domain> I<snapshot> [I<--security-info>] +=item B<snapshot-dumpxml> I<domain> {I<snapshot> | I<--all> [I<--topological>]} +[I<--security-info>] Output the snapshot XML for the domain's snapshot named I<snapshot>. +Alternatively, if I<--all> is passed instead of I<snapshot> output XML +describing all snapshots at once, where I<--topological> can also be +used to ensure no child is listed before a parent. Using I<--security-info> will also include security sensitive information. Use B<snapshot-current> to easily access the XML of the current snapshot. -- 2.20.1
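Putting the two sides together, carrying snapshot metadata across hosts could then look like this (illustrative session: URIs, the domain name, and the reported count are placeholders):

  $ virsh -c qemu+ssh://src/system snapshot-dumpxml guest \
        --all --topological --security-info > snaps.xml
  $ virsh -c qemu+ssh://dst/system snapshot-import guest snaps.xml
  Imported 2 snapshots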

On Mon, Mar 11, 2019 at 09:38:36PM -0500, Eric Blake wrote:
Add a new 'snapshot-import' command for bulk import. For bulk XML, it was easier to just add new flags to the existing 'snapshot-dumpxml' than to figure out a new command name.
I debated about whether omitting the snapshotname should be enough to trigger the new API call, or whether to require a new bool --all; in the end, I went with the latter.
Signed-off-by: Eric Blake <eblake@redhat.com> --- tools/virsh-snapshot.c | 111 ++++++++++++++++++++++++++++++++++++----- tools/virsh.pod | 15 +++++- 2 files changed, 112 insertions(+), 14 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Implement the new API calls for bulk snapshot dump and import. The bulk of the work is already done by the common code.

Since each connection to test:///default restarts at the same canned state, this can easily be tested with:

$ virsh -c test:///default "
  snapshot-create-as test s1
  snapshot-create-as test s2
  echo witness
  snapshot-dumpxml test --all" | sed '1,/witness/d' > list.xml
$ virsh -c test:///default "
  snapshot-list test
  snapshot-import test list.xml
  snapshot-current --name test
  snapshot-list --parent test
  "
 Name   Creation Time   State
-------------------------------

Imported 2 snapshots
s2
 Name   Creation Time               State     Parent
------------------------------------------------------
 s1     2019-02-20 22:26:52 -0600   running
 s2     2019-02-20 22:26:52 -0600   running   s1

The test driver also makes it easy to test input validation, by modifying list.xml incorrectly (such as attempting to create circular dependencies). Proving that --topological makes a difference is a bit harder (since we randomize the hash seed, it is not 100% reproducible which order you get without it), but I found that creating snapshots s1, s3, s2 in that order tended to be more likely to hash in non-topological order without the flag.

Signed-off-by: Eric Blake <eblake@redhat.com> --- src/test/test_driver.c | 62 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 61 insertions(+), 1 deletion(-) diff --git a/src/test/test_driver.c b/src/test/test_driver.c index 02cd4f4d07..085e228873 100644 --- a/src/test/test_driver.c +++ b/src/test/test_driver.c @@ -1,7 +1,7 @@ /* * test_driver.c: A "mock" hypervisor for use by application unit tests * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2019 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * * This library is free software; you can redistribute it and/or @@ -6222,6 +6222,36 @@ testDomainSnapshotGetXMLDesc(virDomainSnapshotPtr snapshot, return xml; } +static char * +testDomainGetSnapshotsXMLDesc(virDomainPtr domain, + unsigned int flags) +{ + virDomainObjPtr vm = NULL; + char *xml = NULL; + char uuidstr[VIR_UUID_STRING_BUFLEN]; + testDriverPtr privconn = domain->conn->privateData; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + virCheckFlags(VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE | + VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL, NULL); + + if (!(vm = testDomObjFromDomain(domain))) + return NULL; + + virUUIDFormat(domain->uuid, uuidstr); + + if (virDomainSnapshotObjListFormat(&buf, uuidstr, vm->snapshots, + vm->current_snapshot, privconn->caps, + privconn->xmlopt, flags) < 0) + goto cleanup; + + xml = virBufferContentAndReset(&buf); + + cleanup: + virDomainObjEndAPI(&vm); + return xml; +} + static int testDomainSnapshotIsCurrent(virDomainSnapshotPtr snapshot, unsigned int flags) @@ -6409,6 +6439,34 @@ testDomainSnapshotCreateXML(virDomainPtr domain, } + +static int +testDomainImportSnapshotsXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags) +{ + testDriverPtr privconn = domain->conn->privateData; + virDomainObjPtr vm = NULL; + int ret = -1; + unsigned int parse_flags = VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE | + VIR_DOMAIN_SNAPSHOT_PARSE_DISKS; + + virCheckFlags(0, -1); + + if (!(vm = testDomObjFromDomain(domain))) + return -1; + + ret = virDomainSnapshotObjListParse(xmlDesc, + vm->def->uuid, + vm->snapshots, + &vm->current_snapshot, + privconn->caps, + privconn->xmlopt, + parse_flags); + virDomainObjEndAPI(&vm); + return ret; +} + + typedef struct _testSnapRemoveData testSnapRemoveData; typedef testSnapRemoveData *testSnapRemoveDataPtr; struct
_testSnapRemoveData { @@ -6840,6 +6898,7 @@ static virHypervisorDriver testHypervisorDriver = { .domainSnapshotListNames = testDomainSnapshotListNames, /* 1.1.4 */ .domainListAllSnapshots = testDomainListAllSnapshots, /* 1.1.4 */ .domainSnapshotGetXMLDesc = testDomainSnapshotGetXMLDesc, /* 1.1.4 */ + .domainGetSnapshotsXMLDesc = testDomainGetSnapshotsXMLDesc, /* 5.2.0 */ .domainSnapshotNumChildren = testDomainSnapshotNumChildren, /* 1.1.4 */ .domainSnapshotListChildrenNames = testDomainSnapshotListChildrenNames, /* 1.1.4 */ .domainSnapshotListAllChildren = testDomainSnapshotListAllChildren, /* 1.1.4 */ @@ -6850,6 +6909,7 @@ static virHypervisorDriver testHypervisorDriver = { .domainSnapshotIsCurrent = testDomainSnapshotIsCurrent, /* 1.1.4 */ .domainSnapshotHasMetadata = testDomainSnapshotHasMetadata, /* 1.1.4 */ .domainSnapshotCreateXML = testDomainSnapshotCreateXML, /* 1.1.4 */ + .domainImportSnapshotsXML = testDomainImportSnapshotsXML, /* 5.2.0 */ .domainRevertToSnapshot = testDomainRevertToSnapshot, /* 1.1.4 */ .domainSnapshotDelete = testDomainSnapshotDelete, /* 1.1.4 */ -- 2.20.1
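The input-validation case mentioned above can be exercised the same way; for example (a hypothetical negative test, with the exact error text elided), edit list.xml so that s1 names s2 as its <parent> while s2 still names s1, then verify that the circular import is rejected:

  $ virsh -c test:///default snapshot-import test list.xml
  error: ...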

On Mon, Mar 11, 2019 at 09:38:37PM -0500, Eric Blake wrote:
Implement the new API calls for bulk snapshot dump and import. The bulk of the work is already done by the common code.
Since each connection to test:///default restarts at the same canned state, this can easily be tested with:
$ virsh -c test:///default "
  snapshot-create-as test s1
  snapshot-create-as test s2
  echo witness
  snapshot-dumpxml test --all" | sed '1,/witness/d' > list.xml
$ virsh -c test:///default "
  snapshot-list test
  snapshot-import test list.xml
  snapshot-current --name test
  snapshot-list --parent test
  "
 Name   Creation Time   State
-------------------------------

Imported 2 snapshots
s2
 Name   Creation Time               State     Parent
------------------------------------------------------
 s1     2019-02-20 22:26:52 -0600   running
 s2     2019-02-20 22:26:52 -0600   running   s1
The test driver also makes it easy to test input validation, by modifying list.xml incorrectly (such as attempting to create circular dependencies). Proving that --topological makes a difference is a bit harder (since we randomize the hash seed, it is not 100% reproducible which order you get without it), but I found that creating snapshots s1, s3, s2 in that order tended to be more likely to hash in non-topological order without the flag.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/test/test_driver.c | 62 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 61 insertions(+), 1 deletion(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Straight code motion, coupled with changing goto into return -1 as needed. This change will be important to later patches adding bulk redefinition (where each snapshot in a list has to meet the same constraints). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 120 ++++++++++++++++++++++------------------- 1 file changed, 66 insertions(+), 54 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e461fb51b0..d679587819 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -15661,6 +15661,69 @@ qemuDomainSnapshotCreateActiveExternal(virQEMUDriverPtr driver, } +/* Validate that a snapshot object does not violate any qemu-specific + * constraints. */ +static int +qemuDomainSnapshotValidate(virDomainSnapshotDefPtr def, + virDomainSnapshotState state, + unsigned int flags) +{ + /* reject snapshot names containing slashes or starting with dot as + * snapshot definitions are saved in files named by the snapshot name */ + if (!(flags & VIR_DOMAIN_SNAPSHOT_CREATE_NO_METADATA)) { + if (strchr(def->name, '/')) { + virReportError(VIR_ERR_XML_DETAIL, + _("invalid snapshot name '%s': " + "name can't contain '/'"), + def->name); + return -1; + } + + if (def->name[0] == '.') { + virReportError(VIR_ERR_XML_DETAIL, + _("invalid snapshot name '%s': " + "name can't start with '.'"), + def->name); + return -1; + } + } + + /* allow snapshots only in certain states */ + switch (state) { + /* valid states */ + case VIR_DOMAIN_SNAPSHOT_RUNNING: + case VIR_DOMAIN_SNAPSHOT_PAUSED: + case VIR_DOMAIN_SNAPSHOT_SHUTDOWN: + case VIR_DOMAIN_SNAPSHOT_SHUTOFF: + case VIR_DOMAIN_SNAPSHOT_CRASHED: + break; + + case VIR_DOMAIN_SNAPSHOT_DISK_SNAPSHOT: + if (!(flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)) { + virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid domain state %s"), + virDomainSnapshotStateTypeToString(state)); + return -1; + } + break; + + case VIR_DOMAIN_SNAPSHOT_PMSUSPENDED: + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("qemu doesn't support taking snapshots of " + "PMSUSPENDED guests")); + return -1; + + /* invalid states */ + case VIR_DOMAIN_SNAPSHOT_NOSTATE: + case VIR_DOMAIN_SNAPSHOT_BLOCKED: /* invalid state, unused in qemu */ + case VIR_DOMAIN_SNAPSHOT_LAST: + virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid domain state %s"), + virDomainSnapshotStateTypeToString(state)); + return -1; + } + return 0; +} + + static virDomainSnapshotPtr qemuDomainSnapshotCreateXML(virDomainPtr domain, const char *xmlDesc, @@ -15681,7 +15744,6 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; qemuDomainObjPrivatePtr priv; - virDomainSnapshotState state; virCheckFlags(VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE | VIR_DOMAIN_SNAPSHOT_CREATE_CURRENT | @@ -15736,25 +15798,9 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, parse_flags))) goto cleanup; - /* reject snapshot names containing slashes or starting with dot as - * snapshot definitions are saved in files named by the snapshot name */ - if (!(flags & VIR_DOMAIN_SNAPSHOT_CREATE_NO_METADATA)) { - if (strchr(def->name, '/')) { - virReportError(VIR_ERR_XML_DETAIL, - _("invalid snapshot name '%s': " - "name can't contain '/'"), - def->name); - goto cleanup; - } - - if (def->name[0] == '.') { - virReportError(VIR_ERR_XML_DETAIL, - _("invalid snapshot name '%s': " - "name can't start with '.'"), - def->name); - goto cleanup; - } - } + if (qemuDomainSnapshotValidate(def, redefine ? 
def->state : vm->state.state, + flags) < 0) + goto cleanup; /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */ if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE && @@ -15766,40 +15812,6 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, goto cleanup; } - /* allow snapshots only in certain states */ - state = redefine ? def->state : vm->state.state; - switch (state) { - /* valid states */ - case VIR_DOMAIN_SNAPSHOT_RUNNING: - case VIR_DOMAIN_SNAPSHOT_PAUSED: - case VIR_DOMAIN_SNAPSHOT_SHUTDOWN: - case VIR_DOMAIN_SNAPSHOT_SHUTOFF: - case VIR_DOMAIN_SNAPSHOT_CRASHED: - break; - - case VIR_DOMAIN_SNAPSHOT_DISK_SNAPSHOT: - if (!redefine) { - virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid domain state %s"), - virDomainSnapshotStateTypeToString(state)); - goto cleanup; - } - break; - - case VIR_DOMAIN_SNAPSHOT_PMSUSPENDED: - virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", - _("qemu doesn't support taking snapshots of " - "PMSUSPENDED guests")); - goto cleanup; - - /* invalid states */ - case VIR_DOMAIN_SNAPSHOT_NOSTATE: - case VIR_DOMAIN_SNAPSHOT_BLOCKED: /* invalid state, unused in qemu */ - case VIR_DOMAIN_SNAPSHOT_LAST: - virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid domain state %s"), - virDomainSnapshotStateTypeToString(state)); - goto cleanup; - } - /* We are going to modify the domain below. Internal snapshots would use * a regular job, so we need to set the job mask to disallow query as * 'savevm' blocks the monitor. External snapshot will then modify the -- 2.20.1

On Mon, Mar 11, 2019 at 09:38:38PM -0500, Eric Blake wrote:
Straight code motion, coupled with changing goto into return -1 as needed. This change will be important to later patches adding bulk redefinition (where each snapshot in a list has to meet the same constraints).
Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_driver.c | 120 ++++++++++++++++++++++------------------- 1 file changed, 66 insertions(+), 54 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Implement the new API calls for bulk snapshot dump and import. This borrows from ideas in the test driver, but import is further complicated by the fact that qemu writes snapshot XML to disk, and thus must do additional validation after the initial parse to ensure the user didn't attempt to create a snapshot with "../" or similar. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 116 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 115 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d679587819..e1be3d0c5e 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1,7 +1,7 @@ /* * qemu_driver.c: core driver methods for managing qemu guests * - * Copyright (C) 2006-2016 Red Hat, Inc. + * Copyright (C) 2006-2019 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * * This library is free software; you can redistribute it and/or @@ -15974,6 +15974,84 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } +/* Struct and hash-iterator callback used when bulk redefining snapshots */ +struct qemuDomainSnapshotBulk { + virDomainObjPtr vm; + virQEMUDriverPtr driver; + const char *snapshotDir; +}; + +static int +qemuDomainSnapshotBulkRedefine(void *payload, + const void *name ATTRIBUTE_UNUSED, + void *opaque) +{ + virDomainSnapshotObjPtr snap = payload; + struct qemuDomainSnapshotBulk *data = opaque; + + if (qemuDomainSnapshotValidate(snap->def, snap->def->state, + VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE) < 0) + return -1; + if (qemuDomainSnapshotWriteMetadata(data->vm, snap, data->driver->caps, + data->driver->xmlopt, + data->snapshotDir) < 0) + return -1; + return 0; +} + + +static int +qemuDomainImportSnapshotsXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags) +{ + virQEMUDriverPtr driver = domain->conn->privateData; + virDomainObjPtr vm = NULL; + int ret = -1; + unsigned int parse_flags = VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE | + VIR_DOMAIN_SNAPSHOT_PARSE_DISKS; + struct qemuDomainSnapshotBulk bulk = { .driver = driver, }; + virQEMUDriverConfigPtr cfg = NULL; + + virCheckFlags(0, -1); + + if (!(vm = qemuDomObjFromDomain(domain))) + return -1; + + if (virDomainImportSnapshotsXMLEnsureACL(domain->conn, vm->def) < 0) + goto cleanup; + + cfg = virQEMUDriverGetConfig(driver); + + ret = virDomainSnapshotObjListParse(xmlDesc, vm->def->uuid, vm->snapshots, + &vm->current_snapshot, driver->caps, + driver->xmlopt, parse_flags); + if (ret < 0) + goto cleanup; + + /* Validate and save the snapshots to disk. Since we don't get + * here unless there were no snapshots beforehand, just delete + * everything on the first failure, ignoring further errors. 
*/ + bulk.vm = vm; + bulk.snapshotDir = cfg->snapshotDir; + if (virDomainSnapshotForEach(vm->snapshots, false, + qemuDomainSnapshotBulkRedefine, &bulk) < 0) { + virErrorPtr orig_err = NULL; + + virErrorPreserveLast(&orig_err); + qemuDomainSnapshotDiscardAllMetadata(driver, vm); + virErrorRestore(&orig_err); + ret = 0; + goto cleanup; + } + + cleanup: + virDomainObjEndAPI(&vm); + virObjectUnref(cfg); + return ret; +} + + static int qemuDomainSnapshotListNames(virDomainPtr domain, char **names, @@ -16292,6 +16370,40 @@ qemuDomainSnapshotGetXMLDesc(virDomainSnapshotPtr snapshot, } +static char * +qemuDomainGetSnapshotsXMLDesc(virDomainPtr domain, + unsigned int flags) +{ + virQEMUDriverPtr driver = domain->conn->privateData; + virDomainObjPtr vm = NULL; + char *xml = NULL; + char uuidstr[VIR_UUID_STRING_BUFLEN]; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + virCheckFlags(VIR_DOMAIN_GET_SNAPSHOTS_XML_SECURE | + VIR_DOMAIN_GET_SNAPSHOTS_XML_TOPOLOGICAL, NULL); + + if (!(vm = qemuDomObjFromDomain(domain))) + return NULL; + + if (virDomainGetSnapshotsXMLDescEnsureACL(domain->conn, vm->def, flags) < 0) + goto cleanup; + + virUUIDFormat(domain->uuid, uuidstr); + + if (virDomainSnapshotObjListFormat(&buf, uuidstr, vm->snapshots, + vm->current_snapshot, driver->caps, + driver->xmlopt, flags) < 0) + goto cleanup; + + xml = virBufferContentAndReset(&buf); + + cleanup: + virDomainObjEndAPI(&vm); + return xml; +} + + static int qemuDomainSnapshotIsCurrent(virDomainSnapshotPtr snapshot, unsigned int flags) @@ -22585,7 +22697,9 @@ static virHypervisorDriver qemuHypervisorDriver = { .domainManagedSaveGetXMLDesc = qemuDomainManagedSaveGetXMLDesc, /* 3.7.0 */ .domainManagedSaveDefineXML = qemuDomainManagedSaveDefineXML, /* 3.7.0 */ .domainSnapshotCreateXML = qemuDomainSnapshotCreateXML, /* 0.8.0 */ + .domainImportSnapshotsXML = qemuDomainImportSnapshotsXML, /* 5.2.0 */ .domainSnapshotGetXMLDesc = qemuDomainSnapshotGetXMLDesc, /* 0.8.0 */ + .domainGetSnapshotsXMLDesc = qemuDomainGetSnapshotsXMLDesc, /* 5.2.0 */ .domainSnapshotNum = qemuDomainSnapshotNum, /* 0.8.0 */ .domainSnapshotListNames = qemuDomainSnapshotListNames, /* 0.8.0 */ .domainListAllSnapshots = qemuDomainListAllSnapshots, /* 0.9.13 */ -- 2.20.1

On Mon, Mar 11, 2019 at 09:38:39PM -0500, Eric Blake wrote:
Implement the new API calls for bulk snapshot dump and import. This borrows from ideas in the test driver, but import is further complicated by the fact that qemu writes snapshot XML to disk, and thus must do additional validation after the initial parse to ensure the user didn't attempt to create a snapshot with "../" or similar.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 116 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 115 insertions(+), 1 deletion(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d679587819..e1be3d0c5e 100644 +static int +qemuDomainImportSnapshotsXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags) +{ + virQEMUDriverPtr driver = domain->conn->privateData; + virDomainObjPtr vm = NULL; + int ret = -1; + unsigned int parse_flags = VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE | + VIR_DOMAIN_SNAPSHOT_PARSE_DISKS; + struct qemuDomainSnapshotBulk bulk = { .driver = driver, }; + virQEMUDriverConfigPtr cfg = NULL; + + virCheckFlags(0, -1); + + if (!(vm = qemuDomObjFromDomain(domain))) + return -1; + + if (virDomainImportSnapshotsXMLEnsureACL(domain->conn, vm->def) < 0) + goto cleanup; + + cfg = virQEMUDriverGetConfig(driver); + + ret = virDomainSnapshotObjListParse(xmlDesc, vm->def->uuid, vm->snapshots, + &vm->current_snapshot, driver->caps, + driver->xmlopt, parse_flags); + if (ret < 0) + goto cleanup; + + /* Validate and save the snapshots to disk. Since we don't get + * here unless there were no snapshots beforehand, just delete + * everything on the first failure, ignoring further errors. */ + bulk.vm = vm; + bulk.snapshotDir = cfg->snapshotDir; + if (virDomainSnapshotForEach(vm->snapshots, false, + qemuDomainSnapshotBulkRedefine, &bulk) < 0) { + virErrorPtr orig_err = NULL; + + virErrorPreserveLast(&orig_err); + qemuDomainSnapshotDiscardAllMetadata(driver, vm); + virErrorRestore(&orig_err); + ret = 0;
ret = -1; Alternatively, use a separate 'rc' variable for the return value of virDomainSnapshotObjListParse
+ goto cleanup; + }
and add: ret = rc; here to only have two places where we modify this function's return value.
+ + cleanup: + virDomainObjEndAPI(&vm); + virObjectUnref(cfg); + return ret; +} +
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On Mon, Mar 11, 2019 at 09:38:31PM -0500, Eric Blake wrote:
While looking at my work on incremental backups, Nir raised a good point: if we want to recreate a set of known checkpoints on one machine that will be taking over a domain from another machine, my initial proposal required making multiple API calls to list the XML for each checkpoint on the source, and then more API calls to redefine each checkpoint on the destination; it also had the drawback that the list has to be presented in topological order (the API won't let you define a child checkpoint if the parent is not defined first). He asked if I could instead add bulk APIs, for getting the XML for all checkpoints at once, and then for redefining all checkpoints at once.
To me the key blocking problem here is the topological sorting one as it makes it incredibly painful to run individual APIs. Your other series on list addresses this nicely by adding the topological flag. Using that flag, in pseudo-Python-like code the process of exporting and importing snapshots from one host to another should look approximately like this:

snapshots = srcdom.ListAllSnapshots(libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL)
for snapshot in snapshots:
    xml = snapshot.XMLDesc()
    dstdom.SnapshotCreateXML(xml)

I don't really see a reason why this is so hard that it justifies adding these bulk snapshot list/define APIs. At the same time there is a very real downside to providing these bulk snapshot APIs in terms of scalability. The <domain> XML for a guest can get very large. The <domainsnapshot> XML contains a copy of the <domain> XML from the point in time of the snapshot. Therefore a <snapshots> XML document is going to contain *many* <domain> XML documents concatenated. For a domain with many devices and many snapshots, the bulk list APIs have non-negligible risk of hitting the RPC protocol limit on string size. We've seen this before with the bulk domain stats APIs we added. We nonetheless still added the bulk domain stats APIs because it was a compelling performance benefit for something that is run very frequently. For apps to be scalable though, they have to limit how many guests they request bulk stats for at any time. This in fact makes the code using the bulk stats APIs more complex than if it just called the individual APIs. The performance benefit is still a win though in this case. In the bulk snapshot import/export case though, I'm not seeing a compelling performance reason for their usage. Making the APIs scalable would require limiting how many snapshots are returned in any single call. There would then need to be repeated calls to fetch the rest. This makes it more complicated than just fetching/defining each individual snapshot. IOW, overall I'm not seeing a big enough justification to add these new APIs, that would outweigh the downsides they bring in terms of scalability. The topological sorting flag solves the really big pain point well enough.
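Fleshed out with real libvirt-python calls, that loop might look like the sketch below. The connection URIs and domain name are placeholders, and the sketch assumes the VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL flag from the prerequisite topological-sorting series, together with the existing VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE flag:

import libvirt

# Placeholder URIs and domain name, for illustration only.
src = libvirt.open('qemu+ssh://src-host/system')
dst = libvirt.open('qemu+ssh://dst-host/system')
srcdom = src.lookupByName('fedora')
dstdom = dst.lookupByName('fedora')

# Topological order guarantees each parent is redefined before its
# children, so the destination never rejects a child whose parent
# does not exist yet.
for snapshot in srcdom.listAllSnapshots(
        libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL):
    xml = snapshot.getXMLDesc(libvirt.VIR_DOMAIN_SNAPSHOT_XML_SECURE)
    dstdom.snapshotCreateXML(xml, libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)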
I'm also toying with the idea of followup patches that store all snapshots alongside the persistent <domain> XML in a single file, rather than in one snapshot per <domainsnapshot> file (we'd still support reading from the old method at libvirtd startup, but would not need to output in that manner any more).
What would be the benefit to having it all in one file? During initial libvirtd startup we could read the info from one file, but we still need code to read the info from many files so we're not saving code in that respect. Most of the snapshot APIs that save XML to disk only care about a single <domainsnapshot> document so wouldn't benefit from having all snapshots in one place. Re-writing the entire <domain> XML means that a host failure during a snapshot save can put the entire domain definition at risk of loss. Though we do try to mitigate that by writing to a temp file and doing an atomic rename, that doesn't eliminate the risk entirely. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 3/12/19 1:10 PM, Daniel P. Berrangé wrote:
On Mon, Mar 11, 2019 at 09:38:31PM -0500, Eric Blake wrote:
While looking at my work on incremental backups, Nir raised a good point: if we want to recreate a set of known checkpoints on one machine that will be taking over a domain from another machine, my initial proposal required making multiple API calls to list the XML for each checkpoint on the source, and then more API calls to redefine each checkpoint on the destination; it also had the drawback that the list has to be presented in topological order (the API won't let you define a child checkpoint if the parent is not defined first). He asked if I could instead add bulk APIs, for getting the XML for all checkpoints at once, and then for redefining all checkpoints at once.
To me the key blocking problem here is the topological sorting one as it makes it incredibly painful to run individual APIs. Your other series on list addresses this nicely by adding the topological flag.
Using that flag, in pseudo-Python-like code the process of exporting and importing snapshots from one host to another should look approximately like this:
snapshots = srcdom.ListAllSnapshots(libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL)
for snapshot in snapshots:
    xml = snapshot.XMLDesc()
    dstdom.SnapshotCreateXML(xml)
I don't really see a reason why this is so hard that it justifies adding these bulk snapshot list/define APIs.
Nir, do you want to chime in here?
At the same time there is a very real downside to providing these bulk snapshot APIs in terms of scalability. The <domain> XML for a guest can get very large. The <domainsnapshot> XML contains a copy of the <domain> XML from the point in time of the snapshot. Therefore a <snapshots> XML document is going to contain *many* <domain> XML documents concatenated.
For a domain with many devices and many snapshots, the bulk list APIs have non-negligible risk of hitting the RPC protocol limit on string size.
I did not consider that, but you are definitely correct about it being a real problem (I didn't trip over it in my small domain testing, but combined XML is definitely a repetition of a <domain> per <domainsnapshot>). One idea I had on the checkpoint series was adding a NO_DOMAIN flag which outputs much more compact XML by omitting the <domain> element. With checkpoints, the <domain> at the time of the checkpoint is less important (useful to know if we need it, but we can't revert to that state); but with snapshots, the <domain> at the time of the snapshot is very important (to the point that you have to request a flag for an unsafe revert in virDomainSnapshotRevert() if the <domain> information was incomplete). So using a NO_DOMAIN flag to minimize XML so that the bulk output doesn't exceed RPC limits is a possibility, but not a very nice one (as you'd STILL have to do a one-API-per-snapshot call to reinstate the <domain> of any snapshot that you plan to revert to).
We've seen this before with the bulk domain stats APIs we added. We nonetheless still added the bulk domain stats APIs because it was a compelling performance benefit for something that is run very frequently. For apps to be scalable though, they have to limit how many guests they request bulk stats for at any time. This in fact makes the code using the bulk stats APIs more complex than if it just called the individual APIs. The performance benefit is still a win though in this case.
In the bulk snapshot import/export case though, I'm not seeing a compelling performance reason for their usage. Making the APIs scalable would require limiting how many snapshots are returned in any single call. There would then need to be repeated calls to fetch the rest. This makes it more complicated than just fetching/defining each individual snapshot.
And the frequency of migrating snapshots is less than the frequency of querying bulk stats. It makes the most sense to speed up APIs called frequently.
IOW, overall I'm not seeing a big enough justification to add these new APIs, that would outweigh the downsides they bring in terms of scalability.
You raise some good points.
The topological sorting flag solves the really big pain point well enough.
The topological sorting thing kind of fell on my lap as I was working on the bulk code, but now that I've written it, it definitely solves a problem that we previously had, even if I had not properly identified the problem (randomized list ordering is painful to work with). Either redefine has to accept randomized ordering (which it currently doesn't), or listing should output in sane ordering (which topological adds) - and once we have topological output, subsequent input gets a LOT easier, even if split across multiple API calls.
I'm also toying with the idea of followup patches that store all snapshots alongside the persistent <domain> XML in a single file, rather than in one snapshot per <domainsnapshot> file (we'd still support reading from the old method at libvirtd startup, but would not need to output in that manner any more).
What would be the benefit to having it all in one file? During initial libvirtd startup we could read the info from one file, but we still need code to read the info from many files so we're not saving code in that respect. Most of the snapshot APIs that save XML to disk only care about a single <domainsnapshot> document so wouldn't benefit from having all snapshots in one place.
One possible benefit: in the current virDomainSnapshotCreateXML(REDEFINE) code, we potentially end up changing three separate snapshots in one call: if we are redefining a replacement snapshot (maybe supplying a <domain> subelement that was previously missing), AND marking the redefinition as the new current element, it requires rewriting the XML file of the previous current snapshot (to drop an internal-only <active>1</active> marker), deleting the XML file of the previous state of the snapshot being modified, and writing the new XML file of the redefined snapshot. If any one of those steps dies, it's very hard to roll back to the earlier state. But if we store all snapshots in a single file using <snapshots current='name'> (rather than a per-snapshot <active>1</active> marker), we only have to modify a single file when making any changes to the snapshot hierarchy. If we fail at any step along the way, we have not partially corrupted the snapshot hierarchy in a manner that is hard to roll back. Another (weak) benefit - right now, qemu snapshots cannot be named "../blah", because that would create a filename that escapes the proper subdirectory. With a single <snapshots>, you can allow more flexible snapshot names, as those names no longer influence host file names.
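Concretely, the combined file would essentially be the same <snapshots> document that the bulk dumpxml in this series already formats, persisted to disk. A sketch, with per-snapshot contents elided and the snapshot names borrowed from examples elsewhere in this thread:

<snapshots current='1525889631'>
  <domainsnapshot>
    <name>1525111885</name>
    ...
  </domainsnapshot>
  <domainsnapshot>
    <name>1525889631</name>
    <parent>
      <name>1525111885</name>
    </parent>
    ...
  </domainsnapshot>
</snapshots>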
Re-writing the entire <domain> XML means that a host failure during a snapshot save can put the entire domain definition at risk of loss. Though we do try to mitigate that by writing to a temp file and doing an atomic rename, that doesn't eliminate the risk entirely.
Okay, so it's probably better to store the <snapshots> XML in a file separate from the <domain> itself (failure to write the snapshots file may lose the snapshots, but does not lose the domain), but still, I can't help but think that two files are potentially nicer than a subdirectory full of one file per snapshot. Unless Nir speaks up with strong reasons in favor of new bulk APIs (rather than living with the new topological sort flags), I'm okay dropping this series. The patches that are already in (the head of the v3 posting of this series, for generating/parsing <snapshots>) may still prove to be useful if you agree with me that a single file for ALL snapshots is more robust than one file per snapshot, particularly when we have existing APIs that already have a hard time rolling back to a sane state on partial failure when manipulating more than one file. But as I have not yet posted any series showing how that would work, that doesn't change that these v4 patches are probably not worth pursuing further. And, since it is more important to get new API in by 5.2 (since new APIs are difficult to backport) than it is to do internal-only changes (a subdirectory of multiple files vs. one <snapshots> file is not user-visible, and can be backported without as much hassle), I'll focus more on the checkpoint APIs rather than any refactoring of internals. So that may mean that 5.2 is released with my <snapshots> format/parse code with no consumers of that code, but I don't think it hurts enough to revert them (as we may end up with a consumer for that code, even if not in 5.2). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

On Tue, Mar 12, 2019 at 9:49 PM Eric Blake <eblake@redhat.com> wrote:
On 3/12/19 1:10 PM, Daniel P. Berrangé wrote:
On Mon, Mar 11, 2019 at 09:38:31PM -0500, Eric Blake wrote:
While looking at my work on incremental backups, Nir raised a good point: if we want to recreate a set of known checkpoints on one machine that will be taking over a domain from another machine, my initial proposal required making multiple API calls to list the XML for each checkpoint on the source, and then more API calls to redefine each checkpoint on the destination; it also had the drawback that the list has to be presented in topological order (the API won't let you define a child checkpoint if the parent is not defined first). He asked if I could instead add bulk APIs, for getting the XML for all checkpoints at once, and then for redefining all checkpoints at once.
To me the key blocking problem here is the topological sorting one as it makes it incredibly painful to run individual APIs. Your other series on list addresses this nicely by adding the topological flag.
Using that flag, in pseudo-Python-like code the process of exporting and importing snapshots from one host to another should look approximately like this:
snapshots = srcdom.ListAllSnapshots(libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL)
for snapshot in snapshots:
    xml = snapshot.XMLDesc()
    dstdom.SnapshotCreateXML(xml)
I don't really see a reason why this is so hard that it justifies adding these bulk snapshot list/define APIs.
Nir, do you want to chime in here?
We don't have a need to list or define snapshots since we manage snapshots on the oVirt side. We want an API to list and redefine checkpoints. Our use case is:

1. Start VM on some host
2. When oVirt starts a backup, it adds a checkpoint in the oVirt database
3. When libvirt starts a backup, libvirt adds the checkpoint oVirt created
4. Steps 2 and 3 are repeated while the VM is running...
5. Stop VM - at this point we undefine the VM, so libvirt loses all info about checkpoints
6. Start VM on some host
7. oVirt uses the bulk checkpoint API to redefine all checkpoints in the oVirt database
8. If there is an unknown bitmap on a disk, or a bitmap is missing, it means that someone modified the disk behind oVirt's back, and oVirt will delete the affected checkpoints and use the libvirt checkpoint API to delete the checkpoints and bitmaps. This forces a full backup for the next backup of the affected disks.

We would like libvirt to fail to redefine the checkpoints if the info in the checkpoints does not match the bitmaps on the disks mentioned in the checkpoints. To do this, libvirt must have the complete checkpoint state. It cannot if we define checkpoints one by one. If libvirt cannot fail, we need a way to list all checkpoints and bitmaps, so we can sync the oVirt database with reality. If we don't validate bitmaps against the expected bitmaps, the backup may fail when starting a backup (good), or succeed, including wrong data in the backup. It also does not make sense to redefine checkpoint state in an incremental way; this is like building a VM XML with many devices with one call per device. The same checkpoint management will have to be duplicated in any libvirt client that manages more than one host (e.g. KubeVirt, OpenStack), since libvirt does not handle more than one host. For more info see - https://ovirt.org/develop/release-management/features/storage/incremental-ba... - https://github.com/oVirt/vdsm/blob/master/lib/vdsm/api/vdsm-api.yml#L11525 - https://github.com/oVirt/vdsm/blob/0874b89a19af2d7726ed2f0b9d674cf0bb48a67e/...
At the same time there is a very real downside to providing these bulk snapshot APIs in terms of scalability. The <domain> XML for a guest can get very large. The <domainsnapshot> XML contains a copy of the <domain> XML from the point in time of the snapshot. Therefore a <snapshots> XML document is going to contain *many* <domain> XML documents concatenated.
For a domain with many devices and many snapshots, the bulk list APIs have non-negligible risk of hitting the RPC protocol limit on string size.
I did not consider that, but you are definitely correct about it being a real problem (I didn't trip over it in my small domain testing, but combined XML is definitely a repetition of a <domain> per <domainsnapshot>).
One idea I had on the checkpoint series was adding a NO_DOMAIN flag which outputs much more compact XML by omitting the <domain> element. With checkpoints, the <domain> at the time of the checkpoint is less important (useful to know if we need it, but we can't revert to that state); but with snapshots, the <domain> at the time of the snapshot is very important (to the point that you have to request a flag for an unsafe revert in virDomainSnapshotRevert() if the <domain> information was incomplete). So using a NO_DOMAIN flag to minimize XML so that the bulk output doesn't exceed RPC limits is a possibility, but not a very nice one (as you'd STILL have to do a one-API-per-snapshot call to reinstate the <domain> of any snapshot that you plan to revert to).
We've seen this before with the bulk domain stats APIs we added. We nonetheless still added the bulk domain stats APIs because it was a compelling performance benefit for something that is run very frequently. For apps to be scalable though, they have to limit how many guests they request bulk stats for at any time. This in fact makes the code using the bulk stats APIs more complex than if it just called the individual APIs. The performance benefit is still a win though in this case.
In the bulk snapshot import/export case though, I'm not seeing a compelling performance reason for their usage. Making the APIs scalable would require limiting how many snapshots are returned in any single call. There would then need to be repeated calls to fetch the rest. This makes it more complicated than just fetching/defining each individual snapshot.
And the frequency of migrating snapshots is less than the frequency of querying bulk stats. It makes the most sense to speed up APIs called frequently.
IOW, overall I'm not seeing a big enough justification to add these new APIs, that would outweigh the downsides they bring in terms of scalability.
You raise some good points.
The topological sorting flag solves the really big pain point well enough.
The topological sorting thing kind of fell on my lap as I was working on the bulk code, but now that I've written it, it definitely solves a problem that we previously had, even if I had not properly identified the problem (randomized list ordering is painful to work with). Either redefine has to accept randomized ordering (which it currently doesn't), or listing should output in sane ordering (which topological adds) - and once we have topological output, subsequent input gets a LOT easier, even if split across multiple API calls.
I'm also toying with the idea of followup patches that store all snapshots alongside the persistent <domain> XML in a single file, rather than in one snapshot per <domainsnapshot> file (we'd still support reading from the old method at libvirtd startup, but would not need to output in that manner any more).
What would be the benefit to having it all in one file? During initial libvirtd startup we could read the info from one file, but we still need code to read the info from many files so we're not saving code in that respect. Most of the snapshot APIs that save XML to disk only care about a single <domainsnapshot> document so wouldn't benefit from having all snapshots in one place.
One possible benefit: in the current virDomainSnapshotCreateXML(REDEFINE) code, we potentially end up changing three separate snapshots in one call: if we are redefining a replacement snapshot (maybe supplying a <domain> subelement that was previously missing), AND marking the redefinition as the new current element, it requires rewriting the XML file of the previous current snapshot (to drop an internal-only <active>1</active> marker), deleting the XML file of the previous state of the snapshot being modified, and writing the new XML file of the redefined snapshot. If any one of those steps dies, it's very hard to roll back to the earlier state. But if we store all snapshots in a single file using <snapshots current='name'> (rather than a per-snapshot <active>1</active> marker), we only have to modify a single file when making any changes to the snapshot hierarchy. If we fail at any step along the way, we have not partially corrupted the snapshot hierarchy in a manner that is hard to roll back.
Another (weak) benefit - right now, qemu snapshots cannot be named "../blah", because that would create a filename that escapes the proper subdirectory. With a single <snapshots>, you can allow more flexible snapshot names, as those names no longer influence host file names.
Re-writing the entire <domain> XML means that a host failure during a snapshot save can put the entire domain definition at risk of loss. Though we do try to mitigate that by writing to a temp file and doing an atomic rename, that doesn't eliminate the risk entirely.
Okay, so it's probably better to store the <snapshots> XML in a file separate from the <domain> itself (failure to write the snapshots file may lose the snapshots, but does not lose the domain), but still, I can't help but think that two files are potentially nicer than a subdirectory full of one file per snapshot.
Unless Nir speaks up with strong reasons in favor of new bulk APIs (rather than living with the new topological sort flags), I'm okay dropping this series.
I'm ok with dropping bulk snapshots, but I think libvirt should provide bulk APIs for checkpoints. Nir
The patches that are already in (the head of the v3 posting of this series, for generating/parsing <snapshots>) may still prove to be useful if you agree with me that a single file for ALL snapshots is more robust than one file per snapshot, particularly when we have existing APIs that already have a hard time rolling back to a sane state on partial failure when manipulating more than one file. But as I have not yet posted any series showing how that would work, that doesn't change that these v4 patches are probably not worth pursuing further.
And, since it is more important to get new API in by 5.2 (since new APIs are difficult to backport) than it is to do internal-only changes (a subdirectory of multiple files vs. one <snapshots> file is not user-visible, and can be backported without as much hassle), I'll focus more on the checkpoint APIs rather than any refactoring of internals. So that may mean that 5.2 is released with my <snapshots> format/parse code with no consumers of that code, but I don't think it hurts enough to revert them (as we may end up with a consumer for that code, even if not in 5.2).
-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

On 3/12/19 3:29 PM, Nir Soffer wrote:
snapshots = srcdom.ListAllSnapshots(libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL)
for snapshot in snapshots:
    xml = snapshot.XMLDesc()
    dstdom.SnapshotCreateXML(xml)
I don't really see a reason why this is so hard that it justifies adding these bulk snapshot list/define APIs.
Nir, do you want to chime in here?
We don't have a need to list or define snapshots since we manage snapshots on the oVirt side. We want an API to list and redefine checkpoints.
But the proposed <domaincheckpoint> also has a <domain> subelement, so it has the same problem (the XML for a bulk operation can become prohibitively large).
Our use case is:
1. Start VM on some host
2. When oVirt starts a backup, it adds a checkpoint in the oVirt database
And does this database preserve topological ordering? If not, why not?
3. When libvirt starts a backup, libvirt adds the checkpoint oVirt created
4. Steps 2 and 3 are repeated while the VM is running...
5. Stop VM - at this point we undefine the VM, so libvirt loses all info about checkpoints
6. Start VM on some host
7. oVirt uses the bulk checkpoint API to redefine all checkpoints in the oVirt database
8. If there is an unknown bitmap on a disk, or a bitmap is missing, it means that someone modified the disk behind oVirt's back, and oVirt will delete the affected checkpoints and use the libvirt checkpoint API to delete the checkpoints and bitmaps. This forces a full backup for the next backup of the affected disks.
We would like libvirt to fail to redefine the checkpoints if the info in the checkpoints does not match the bitmaps on the disks mentioned in the checkpoints.
That's a different feature request - you're asking for virDomainCheckpointCreateXML(REDEFINE) to have some additional flag where the metadata re-creation fails if the underlying bitmap is corrupt/missing. Which may still be a good flag to add, but is something you can do with one-at-a-time checkpoint redefinitions, and doesn't, in itself, require bulk redefinition.
To do this, libvirt must have the complete checkpoint state. It cannot if we define checkpoints one by one.
As long as you redefine checkpoints in topological order, I don't see how defining one at a time or bulk changes whether libvirt can validate that the checkpoints are correct. And even then, redefining in topological order is only necessary so that libvirt can ensure that any <parent/> checkpoint exists before attempting to define the metadata for a child checkpoint.
If libvirt cannot fail, we need a way to list all checkpoints and bitmaps, so we can sync the oVirt database with reality.
If we don't validate bitmaps against the expected bitmaps, the backup may fail when starting a backup (good), or succeed, including wrong data in the backup.
It also does not make sense to redefine checkpoint state in an incremental way; this is like building a VM XML with many devices with one call per device.
The argument here is that it will be the same amount of XML whether it is one call per checkpoint or one call for all checkpoints (note - there are multiple devices in one checkpoint, so it is NOT one call per 'device * checkpoint')
The same checkpoint management will have to be duplicated in any libvirt client that manages more than one host (e.g. KubeVirt, OpenStack), since libvirt does not handle more than one host.
But again, if you are moving checkpoint metadata from one host to another, doing so one-at-a-time in topological order shouldn't be much harder than doing so in one single call, but without the drawbacks of exceeding XML size over RPC call limits.
For more info see - https://ovirt.org/develop/release-management/features/storage/incremental-ba... - https://github.com/oVirt/vdsm/blob/master/lib/vdsm/api/vdsm-api.yml#L11525 - https://github.com/oVirt/vdsm/blob/0874b89a19af2d7726ed2f0b9d674cf0bb48a67e/...
Unless Nir speaks up with strong reasons in favor of new bulk APIs (rather than living with the new topological sort flags), I'm okay dropping this series.
I'm ok with dropping bulk snapshots, but I think libvirt should provide bulk APIs for checkpoints.
Nir
Given that the same amount of XML is involved across RPC, I'm hard-pressed to include bulk operations on one object without providing it on both. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

On Tue, Mar 12, 2019 at 11:02 PM Eric Blake <eblake@redhat.com> wrote:
On 3/12/19 3:29 PM, Nir Soffer wrote:
snapshots = srcdom.ListAllSnapshots(libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL)
for snapshot in snapshots:
    xml = snapshot.XMLDesc()
    dstdom.SnapshotCreateXML(xml)
I don't really see a reason why this is so hard that it justifies adding these bulk snapshot list/define APIs.
Nir, do you want to chime in here?
We don't have a need to list or define snapshots since we manage snapshots on the oVirt side. We want an API to list and redefine checkpoints.
But the proposed <domaincheckpoint> also has a <domain> subelement, so it has the same problem (the XML for a bulk operation can become prohibitively large).
Why do we need <domain> in a checkpoint? You can see here what we store in a checkpoint (based on your patches v2 or v3): https://ovirt.org/develop/release-management/features/storage/incremental-ba... Which is:

vm_checkpoints table
- checkpoint_id: UUID
- parent: UUID
- vm_id: UUID
- creation_date: TIMESTAMP

vm_checkpoint_disk_map table
- disk_id: UUID
- checkpoint_id: UUID
Our use case is:
1. Start VM on some host
2. When oVirt starts a backup, it adds a checkpoint in the oVirt database
And does this database preserve topological ordering? If not, why not?
Using the parent UUID we can create the correct chain in the right order, and this is what we send to vdsm. Based on this, vdsm will generate the XML for libvirt. There is no domain info related to these checkpoints, and I hope we are not adding such requirements at this stage.
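As an illustration, reconstructing that chain from the table rows takes only a few lines of Python; this is a sketch in which the dict keys mirror the vm_checkpoints columns above and the root checkpoint has parent None:

def order_checkpoints(rows):
    # Map each parent id to its (single) child; the chain is linear.
    by_parent = {row['parent']: row for row in rows}
    ordered = []
    parent = None  # the root checkpoint has no parent
    while parent in by_parent:
        row = by_parent[parent]
        ordered.append(row)
        parent = row['checkpoint_id']
    return ordered  # oldest first, i.e. topological order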
3. When libvirt starts a backup, libvirt adds the checkpoint oVirt created
4. Steps 2 and 3 are repeated while the VM is running...
5. Stop VM - at this point we undefine the VM, so libvirt loses all info about checkpoints
6. Start VM on some host
7. oVirt uses the bulk checkpoint API to redefine all checkpoints in the oVirt database
8. If there is an unknown bitmap on a disk, or a bitmap is missing, it means that someone modified the disk behind oVirt's back, and oVirt will delete the affected checkpoints and use the libvirt checkpoint API to delete the checkpoints and bitmaps. This forces a full backup for the next backup of the affected disks.
We would like libvirt to fail to redefine the checkpoints if the info in the checkpoints does not match the bitmaps on the disks mentioned in the checkpoints.
That's a different feature request - you're asking for virDomainCheckpointCreateXML(REDEFINE) to have some additional flag where the metadata re-creation fails if the underlying bitmap is corrupt/missing. Which may still be a good flag to add, but is something you can do with one-at-a-time checkpoint redefinitions, and doesn't, in itself, require bulk redefinition.
Checking for an unknown bitmap on disk is not possible unless you have the complete list of checkpoints. Unless you add something like:

virDomainCheckpointValidate()

which will fail if the checkpoints known to libvirt do not match the bitmaps in the underlying qcow2 images. But even if we add this, the user will have to do:

try:
    for checkpoint in ordered_checkpoints:
        generate xml...
        redefine checkpoint...
except ...:
    remove all checkpoints...

validate checkpoints

Unless the XML for a single checkpoint is big, I don't see why all libvirt users need to redefine checkpoints one by one and handle all the possible errors (e.g. some checkpoints redefined, some not). Note that vdsm may be killed in the middle of the redefine loop, and in this case we leave libvirt with partial info about checkpoints, and we need to redefine the checkpoints again, handling this partial state. "We" means all libvirt clients that need to manage more than one host.
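Fleshed out in libvirt-python terms, that loop might look like the sketch below; generate_checkpoint_xml() is a hypothetical oVirt-side helper, and the checkpoint calls and flags follow the checkpoint API as proposed, not anything released at the time of this thread:

import libvirt

def redefine_checkpoints(dom, ordered_checkpoints):
    # ordered_checkpoints: oVirt-side records, parents before children.
    try:
        for record in ordered_checkpoints:
            xml = generate_checkpoint_xml(record)  # hypothetical helper
            dom.checkpointCreateXML(
                xml, libvirt.VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE)
    except libvirt.libvirtError:
        # Partial redefine: wipe the metadata again (leaves first) so the
        # next attempt starts from a clean slate, then report the failure.
        for cp in reversed(dom.listAllCheckpoints(
                libvirt.VIR_DOMAIN_CHECKPOINT_LIST_TOPOLOGICAL)):
            cp.delete(libvirt.VIR_DOMAIN_CHECKPOINT_DELETE_METADATA_ONLY)
        raise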
To do this, libvirt must have the complete checkpoint state. It cannot if we define checkpoints one by one.
As long as you redefine checkpoints in topological order, I don't see how defining one at a time or bulk changes whether libvirt can validate that the checkpoints are correct. And even then, redefining in topological order is only necessary so that libvirt can ensure that any <parent/> checkpoint exists before attempting to define the metadata for a child checkpoint.
If libvirt cannot fail, we need a way to list all checkpoints and bitmaps, so we can sync the oVirt database with reality.
If we don't validate bitmaps against the expected bitmaps, the backup may fail when starting a backup (good), or succeed, including wrong data in the backup.
It also does not make sense to redefine checkpoint state in an incremental way; this is like building a VM XML with many devices with one call per device.
The argument here is that it will be the same amount of XML whether it is one call per checkpoint or one call for all checkpoints (note - there are multiple devices in one checkpoint, so it is NOT one call per 'device * checkpoint')
The same checkpoint management will have to be duplicated in any libvirt client that manages more than one host (e.g. KubeVirt, OpenStack), since libvirt does not handle more than one host.
But again, if you are moving checkpoint metadata from one host to another, doing so one-at-a-time in topological order shouldn't be much harder than doing so in one single call, but without the drawbacks of exceeding XML size over RPC call limits.
For more info see - https://ovirt.org/develop/release-management/features/storage/incremental-ba... - https://github.com/oVirt/vdsm/blob/master/lib/vdsm/api/vdsm-api.yml#L11525 - https://github.com/oVirt/vdsm/blob/0874b89a19af2d7726ed2f0b9d674cf0bb48a67e/...
Unless Nir speaks up with strong reasons in favor of new bulk APIs (rather than living with the new topological sort flags), I'm okay dropping this series.
I'm ok with dropping bulk snapshots, but I think libvirt should provide bulk APIs for checkpoints.
Nir
Given that the same amount of XML is involved across RPC, I'm hard-pressed to include bulk operations on one object without providing it on both.
-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

On 3/12/19 4:35 PM, Nir Soffer wrote:
We don't have a need to list or define snapshots since we manage snapshots on the oVirt side. We want an API to list and redefine checkpoints.
But the proposed <domaincheckpoint> also has a <domain> subelement, so it has the same problem (the XML for a bulk operation can become prohibitively large).
Why do we need <domain> in a checkpoint?
The initial design for snapshots did not have <domain>, and it bit us hard; you cannot safely revert to a snapshot if you don't know the state of the domain at that time. The initial design for checkpoints has thus mandated that <domain> be present (my v5 series fails to redefine a checkpoint if you omit <domain>, even though it has a NO_DOMAIN flag during dumpxml for reduced output). If we are convinced that defining a snapshot without <domain> is safe enough, then I can relax the checkpoint code to allow redefined metadata without <domain> the way snapshots already have to do it, even though I was hoping that checkpoints could start life with fewer back-compats than snapshots have had to carry along. But I'd rather start strict and relax later (require <domain> and then remove it when proven safe), and not start loose (leave <domain> optional, and then wish we had made it mandatory).
You can see here what we store in a checkpoint (based on your patches v2 or v3): https://ovirt.org/develop/release-management/features/storage/incremental-ba...
Which is:
vm_checkpoints table
- checkpoint_id: UUID
- parent: UUID
- vm_id: UUID
- creation_date: TIMESTAMP

vm_checkpoint_disk_map table
- disk_id: UUID
- checkpoint_id: UUID
And no state of the <domain> at the time the checkpoint was created?
Our use case is:
1. Start VM on some host
2. When oVirt starts a backup, it adds a checkpoint in the oVirt database
And does this database preserve topological ordering? If not, why not?
Using the parent UUID we can create the correct chain in the right order, and this is what we send to vdsm. Based on this, vdsm will generate the XML for libvirt.
There is no domain info related to these checkpoints, and I hope we are not adding such requirements at this stage.
Rather, that requirement has been there in all of my drafts, and it sounds like you are trying to get me to relax that requirement to not need a <domain> during checkpoint metadata redefine.
That's a different feature request - you're asking for virDomainCheckpointCreateXML(REDEFINE) to have some additional flag where the metadata re-creation fails if the underlying bitmap is corrupt/missing. Which may still be a good flag to add, but is something you can do with one-at-a-time checkpoint redefinitions, and doesn't, in itself, require bulk redefinition.
Checking for an unknown bitmap on disk is not possible unless you have the complete list of checkpoints. Unless you add something like:
virDomainCheckpointValidate()
Which will fail if the checkpoints known to libvirt do not match the bitmaps in the underlying qcow2 images.
Yes, an API like that may be better than trying to validate on a per-redefine basis whether libvirt metadata matches qcow2 metadata.
But even if we add this, the user will have to do:
try:
    for checkpoint in ordered_checkpoints:
        generate xml...
        redefine checkpoint...
except ...:
    remove all checkpoints...
validate checkpoints
Yes, but that's not hard. And it's no different whether the 'try/except' code is a single libvirt API (with bulk redefine) or multiple libvirt APIs (no bulk redefine, but easy enough to one-at-a-time redefine in topological order). Why you have to generate the XML instead of reusing the XML that libvirt produces is something I'm not sure of (that is, what's wrong with using libvirt's dumpxml output as your input, rather than trying to reconstruct it on the fly?).
Unless the XML for a single checkpoint is big, I don't see why all libvirt users need to redefine checkpoints one by one and handle all the possible errors (e.g. some checkpoints redefined, some not).
As currently proposed, <domaincheckpoint> includes a mandatory <domain> sub-element on redefinition, which means that the XML _is_ big for a series of checkpoints. It also means that you're better off storing the XML produced by libvirt than trying to regenerate it by hand, when it comes to metadata redefinition.
Note that vdsm may be killed in the middle of the redefine loop, and in this case we leave libvirt with partial info about checkpoints, and we need to redefine the checkpoints again, handling this partial state.
But that's relatively easy - if you don't know whether libvirt might have partial data, then wipe the data and start the redefine loop from scratch. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

On 3/12/19 4:52 PM, Eric Blake wrote:
vm_checkpoints table
- checkpoint_id: UUID
- parent: UUID
- vm_id: UUID
- creation_date: TIMESTAMP

vm_checkpoint_disk_map table
- disk_id: UUID
- checkpoint_id: UUID
And no state of the <domain> at the time the checkpoint was created?
I meant to add: once we start playing with checkpoints vs. external snapshots vs. hotplug in anger, then knowing WHICH disks existed at the time of a given checkpoint may prove invaluable to even figure out WHICH disks need to participate in a given backup operation. The initial implementation uses XML (so that adding features can hopefully reuse the same API, rather than needing yet more one-off API additions), even if it is currently limited to not playing nicely with snapshots. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

On Wed, Mar 13, 2019 at 12:01 AM Eric Blake <eblake@redhat.com> wrote:
On 3/12/19 4:52 PM, Eric Blake wrote:
vm_checkpoints table
- checkpoint_id: UUID
- parent: UUID
- vm_id: UUID
- creation_date: TIMESTAMP

vm_checkpoint_disk_map table
- disk_id: UUID
- checkpoint_id: UUID
And no state of the <domain> at the time the checkpoint was created?
I meant to add:
once we start playing with checkpoints vs. external snapshots vs. hotplug in anger, then knowing WHICH disks existed at the time of a given checkpoint may prove invaluable to even figure out WHICH disks need to participate in a given backup operation. The initial implementation uses XML (so that adding features can hopefully reuse the same API, rather than needing yet more one-off API additions), even if it is currently limited to not playing nicely with snapshots.
When you plug a disk, you cannot use any of the bitmaps on the disk because you don't have any guarantee that the user did not attach the disk to another machine that modified the data in some way without updating the bitmaps. So we are going to delete a disk from checkpoints once you unplug it, and start from scratch when you plug a disk. I think what we keep is enough to tell libvirt which bitmaps are needed for a backup. We probably need to keep the VM configuration per checkpoint, or at least ensure consistency, so once a backup has started the configuration cannot change until the backup ends.

On 3/12/19 6:45 PM, Nir Soffer wrote:
When you plug a disk, you cannot use any of the bitmaps on the disk because you don't have any guarantee that the user did not attach the disk to another machine that modified the data in some way without updating the bitmaps. So we are going to delete a disk from checkpoints once you unplug it, and start from scratch when you plug a disk.
Correct - from libvirt's point of view, when a disk is hotplugged, the next backup needs to perform a full backup rather than an incremental backup on that disk, because that disk was not present in the machine on the last checkpoint and therefore does not have bitmap coverage from the last checkpoint to the present moment. But you don't necessarily have to delete bitmaps from the qcow2 file to get this effect. The fact that libvirt stored a full <domain> at the time of the checkpoint is enough for libvirt to tell that you are requesting a backup of a disk that did not exist at the starting checkpoint, and therefore that disk needs a full rather than incremental backup.
I think what we keep is enough to tell libvirt which bitmaps are needed for a backup.
The current design for virDomainBackup() is that you state which checkpoint to start from, and then libvirt computes all bitmaps from that checkpoint to the present in order to perform the incremental backup, or complains that there is no sequence of bitmaps that covers that length in time so you have to do a full backup instead. You do not tell libvirt 'perform a backup of the clusters represented in bitmaps B, C, and D', but 'perform a backup of the clusters modified since time T2'. Libvirt then crawls the chain of checkpoint objects to find T2 and verify which bitmaps between T2 and the present (if that be bitmaps B, C, and D) that it needs to use to perform that differential backup.
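That chain walk is easy to picture in code. The sketch below illustrates the logic only; it is not an actual libvirt function, and the Checkpoint record type is invented for the example:

import collections

# Invented record type: each checkpoint knows the bitmap it created
# and its parent checkpoint (None for the oldest one).
Checkpoint = collections.namedtuple('Checkpoint', 'name bitmap parent')

def bitmaps_since(leaf, start_name):
    # Walk parent links from the most recent checkpoint back to the
    # named starting point, collecting one bitmap per checkpoint.
    bitmaps = []
    node = leaf
    while node is not None:
        bitmaps.append(node.bitmap)
        if node.name == start_name:
            return list(reversed(bitmaps))  # oldest first
        node = node.parent
    raise ValueError("no bitmap chain covers checkpoint '%s'" % start_name)

With checkpoints T1 <- T2 <- T3, bitmaps_since(T3, 'T2') returns the bitmaps of T2 and T3, which together cover every cluster written since T2; a missing starting point raises an error instead of silently producing a partial backup.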
We probably need to keep the VM configuration per checkpoint, or at least ensure consistency, so once a backup has started the configuration cannot change until the backup ends.
And that configuration per checkpoint IS the <domain> subelement. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

On Tue, Mar 12, 2019 at 11:52 PM Eric Blake <eblake@redhat.com> wrote:
On 3/12/19 4:35 PM, Nir Soffer wrote:
We don't have a need to list or define snapshots since we manage snapshots on the oVirt side. We want an API to list and redefine checkpoints.
But the proposed <domaincheckpoint> also has a <domain> subelement, so it has the same problem (the XML for a bulk operation can become prohibitively large).
Why do we need <domain> in a checkpoint?
The initial design for snapshots did not have <domain>, and it bit us hard; you cannot safely revert to a snapshot if you don't know the state of the domain at that time. The initial design for checkpoints has thus mandated that <domain> be present (my v5 series fails to redefine a checkpoint if you omit <domain>, even though it has a NO_DOMAIN flag during dumpxml for reduced output). If we are convinced that defining a snapshot without <domain> is safe enough, then I can relax the checkpoint code to allow redefined metadata without <domain> the way snapshots already have to do it, even though I was hoping that checkpoints could start life with fewer back-compats than snapshots have had to carry along. But I'd rather start strict and relax later (require <domain> and then remove it when proven safe), and not start loose (leave <domain> optional, and then wish we had made it mandatory).
You keep mentioning snapshots, but we are not going to redefine snapshots, only checkpoints. We need to do this only because qcow2 does not store a chain of bitmaps, but unordered bitmaps without any metadata, so we need to remind libvirt which bitmaps we have on every disk, and what the order of the bitmaps is. We would be happy if we could just specify the bitmaps and disks in the backup API, and avoid rebuilding the checkpoint history, just like we avoid rebuilding the snapshot history.
You can see here what we store in a checkpoint (based on your patches v2 or v3): https://ovirt.org/develop/release-management/features/storage/incremental-ba...
Which is:

vm_checkpoints table
- checkpoint_id: UUID
- parent: UUID
- vm_id: UUID
- creation_date: TIMESTAMP

vm_checkpoint_disk_map table
- disk_id: UUID
- checkpoint_id: UUID
And no state of the <domain> at the time the checkpoint was created?
No, but the configuration of the VM is stored in oVirt, so the backup application can get this configuration from oVirt at the time of the backup.
Our use case is:
1. Start VM on some host
2. When oVirt starts a backup, it adds a checkpoint in the oVirt database
And does this database preserve topological ordering? If not, why not?
Using the parent UUID we can create the correct chain in the right order, and this is what we send to vdsm. Based on this, vdsm will generate the XML for libvirt.
There is no domain info related to these checkpoints, and I hope we are not adding such requirements at this stage.
Rather, that requirement has been there in all of my drafts, and it sounds like you are trying to get me to relax that requirement to not need a <domain> during checkpoint metadata redefine.
We never understood this requirement from the HTML documentation built from your patches. Looking again at v3, I see:

domain
    The inactive domain configuration at the time the checkpoint was created. Readonly.

And we have an example:

<domaincheckpoint>
  <description>Completion of updates after OS install</description>
  <disks>
    <disk name='vda' checkpoint='bitmap'/>
    <disk name='vdb' checkpoint='no'/>
  </disks>
</domaincheckpoint>

Which does not have a domain, since it is read-only. There is an example of the XML we get from libvirt:

<domaincheckpoint>
  <name>1525889631</name>
  <description>Completion of updates after OS install</description>
  <creationTime>1525889631</creationTime>
  <parent>
    <name>1525111885</name>
  </parent>
  <disks>
    <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/>
    <disk name='vdb' checkpoint='no'/>
  </disks>
  <domain type='qemu'>
    <name>fedora</name>
    <uuid>93a5c045-6457-2c09-e56c-927cdf34e178</uuid>
    <memory>1048576</memory>
    ...
    <devices>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/path/to/file1'/>
        <target dev='vda' bus='virtio'/>
      </disk>
      <disk type='file' device='disk' snapshot='external'>
        <driver name='qemu' type='raw'/>
        <source file='/path/to/file2'/>
        <target dev='vdb' bus='virtio'/>
      </disk>
      ...
    </devices>
  </domain>
</domaincheckpoint>

But as the domain is read-only, we assumed that we don't need to pass it in the domaincheckpoint XML. We also don't need it, since we keep the disk UUIDs included in every checkpoint, and from this you can find the information needed to generate the tiny checkpoint XML above. Can you explain why historic domain XML is needed for performing a backup? The caller of the API is responsible for giving you the list of bitmaps that should be included for every disk. Libvirt needs to create a new active bitmap and start the NBD server exposing the bitmaps specified by the caller. How is having the XML for every checkpoint/bitmap needed to perform these steps? Note also that libvirt does not have snapshot history on an oVirt host, since we manage snapshots externally. When libvirt starts a VM, the only thing it knows about snapshots is the chain of disks. Even if we restore the domain XML for every checkpoint, libvirt cannot validate it against the non-existing snapshots. While we have used persistent VMs since 4.2, our VMs are actually transient. We started to use persistent VMs because they are better tested, and we can use the VM <metadata> to persist stuff, instead of maintaining our own temporary metadata persistence solution.
That's a different feature request - you're asking for virDomainCheckpointCreateXML(REDEFINE) to have some additional flag where the metadata re-creation fails if the underlying bitmap is corrupt/missing. That may still be a good flag to add, but it is something you can do with one-at-a-time checkpoint redefinitions, and it doesn't, in itself, require bulk redefinition.
Checking for an unknown bitmap on disk is not possible unless you have the complete list of checkpoints - unless you add something like:

    virDomainCheckpointValidate()

which will fail if the checkpoints known to libvirt do not match the bitmaps in the underlying qcow2 images.
Yes, an API like that may be better than trying to validate on a per-redefine basis whether libvirt metadata matches qcow2 metadata.
But even if we add this, the user will have to do:
    try:
        for checkpoint in ordered_checkpoints:
            generate xml...
            redefine checkpoint...
    except ...:
        remove all checkpoints...

    validate checkpoints
Yes, but that's not hard. And it's no different whether the 'try/except' code is a single libvirt API (with bulk redefine) or multiple libvirt APIs (no bulk redefine, but easy enough to one-at-a-time redefine in topological order). Why you have to generate the XML instead of reusing the XML that libvirt produces is something I'm not sure of (that is, what's wrong with using libvirt's dumpxml output as your input, rather than trying to reconstruct it on the fly?).
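For illustration only, a minimal sketch of that one-at-a-time flow with the libvirt-python bindings, assuming the TOPOLOGICAL list flag from the series this one is based on; the helper name is made up and error handling is elided:

    import libvirt

    def copy_snapshot_metadata(src_dom, dst_dom):
        # Parents are guaranteed to be listed before children.
        flags = libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL
        for snap in src_dom.listAllSnapshots(flags):
            # Reuse libvirt's own dump as the redefine input.
            xml = snap.getXMLDesc(libvirt.VIR_DOMAIN_SNAPSHOT_XML_SECURE)
            dst_dom.snapshotCreateXML(
                xml, libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)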
The XML libvirt produces may not be stable. For example, the disk with UUID "22e5a7d6-5409-482d-bda4-f5939b27aad4" may be called "sda" now, but when the VM was running on another host 2 weeks ago maybe it was called "sdb", since the user added or removed some devices. When we create the XML using the disk UUID, we will always refer to the current name of the disk, regardless of the history of the VM.
Unless the XML for a single checkpoint is big, I don't see why all libvirt users need to redefine checkpoints one by one and handle all the possible errors (e.g. some checkpoints redefined, some not).
As currently proposed, <domaincheckpoint> includes a mandatory <domain> sub-element on redefinition, which means that the XML _is_ big for a series of checkpoints. It also means that you're better off storing the XML produced by libvirt than trying to regenerate it by hand, when it comes to metadata redefinition.
Note that vdsm may be killed in the middle of the redefine loop, and in this case we leave libvirt with partial info about checkpoints, and we need to redefine the checkpoints again, handling this partial state.
But that's relatively easy - if you don't know whether libvirt might have partial data, then wipe the data and start the redefine loop from scratch.
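A sketch of that recovery with the libvirt-python bindings (the helper name is made up); METADATA_ONLY drops only libvirt's records, leaving disk contents untouched, after which the redefine loop can be rerun from scratch:

    import libvirt

    def wipe_snapshot_metadata(dom):
        for snap in dom.listAllSnapshots():
            snap.delete(libvirt.VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY)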
But this easy boilerplate will be duplicated in all libvirt clients, instead of having a single API to restore the state. Of course, if we must keep the historic domain configuration for every checkpoint, a bulk API does not make sense, but I don't understand this requirement.

Nir

On 3/12/19 6:29 PM, Nir Soffer wrote:
On Tue, Mar 12, 2019 at 11:52 PM Eric Blake <eblake@redhat.com> wrote:
On 3/12/19 4:35 PM, Nir Soffer wrote:
We don't have a need to list or define snapshots since we manage snapshots on the oVirt side. We want an API to list and redefine checkpoints.
But the proposed <domaincheckpoint> also has a <domain> subelement, so it has the same problem (the XML for a bulk operation can become prohibitively large).
Why do we need <domain> in a checkpoint?
The initial design for snapshots did not have <domain>, and it bit us hard; you cannot safely revert to a snapshot if you don't know the state of the domain at that time. The initial design for checkpoints has thus mandated that <domain> be present (my v5 series fails to redefine a snapshot if you omit <domain>, even though it has a NO_DOMAIN flag during dumpxml for reduced output). If we are convinced that defining a snapshot without <domain> is safe enough, then I can relax the
s/snapshot/checkpoint/ in that line
checkpoint code to allow redefined metadata without <domain> the way snapshots already have to do it, even though I was hoping that checkpoints could start life with fewer back-compats than snapshot has had to carry along. But I'd rather start strict and relax later (require <domain> and then remove it when proven safe), and not start loose (leave <domain> optional, and then wish we had made it mandatory).
You keep mentioning snapshots, but we are not going to redefine snapshots, only checkpoints.
We need to do this only because qcow2 does not store a chain of bitmaps, but
just as qcow2 does not store a chain of snapshots either.
un-ordered bitmaps without any metadata, so we need to remind libvirt which bitmaps we have in every disk, and what the order of the bitmaps is. We would be happy if we could just specify the bitmaps and disks in the backup API, and avoid rebuilding the checkpoint history, just like we avoid rebuilding the snapshot history.
For persistent domains, it would be better if libvirt migrated snapshots (and therefore checkpoints) automatically, instead of being a coward and refusing the migration because a snapshot exists. The workaround is that you dumpxml then undefine snapshots on the source, then migrate, then createXML(REDEFINE) on the destination using the dumped XML. Since checkpoints share a lot of code with snapshots, it was easier to get the code working by having the same rules (and if we fix it for one, we should fix it for both, again because they share a lot of code). For transient domains, libvirt has nowhere to store the information about the snapshots or checkpoints, so it is relying on you to REDEFINE the SAME state as what it had last time libvirt was running.
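As a sketch of that workaround, assuming the libvirt-python bindings; migrate_with_snapshots is a hypothetical helper, and real code would need error handling to restore the source metadata if the migration fails:

    import libvirt

    def migrate_with_snapshots(dom, dst_uri):
        # Dump the full snapshot XML, parents first, then drop the
        # metadata so libvirt will permit the migration.
        snaps = dom.listAllSnapshots(
            libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL)
        saved = [s.getXMLDesc(libvirt.VIR_DOMAIN_SNAPSHOT_XML_SECURE)
                 for s in snaps]
        for s in snaps:
            s.delete(libvirt.VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY)

        # Migrate, then recreate the identical metadata on the destination.
        dst_conn = libvirt.open(dst_uri)
        new_dom = dom.migrate(dst_conn, libvirt.VIR_MIGRATE_LIVE,
                              None, None, 0)
        for xml in saved:
            new_dom.snapshotCreateXML(
                xml, libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)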
There is no domain info related to these checkpoints, and I hope we are not adding such requirements at this stage.
Rather, that requirement has been there in all of my drafts, and it sounds like you are trying to get me to relax that requirement to not need a <domain> during checkpoint metadata redefine.
We never understood this requirement from the HTML documentation built from your patches.
Looking again in v3 - I see:
    domain
        The inactive domain configuration at the time the checkpoint
        was created. Readonly.
Copied heavily from snapshots, which has the same wording, so I'm going to improve that wording in the next round of patches. But the key point is that:

CreateXML(, 0) takes abbreviated (or full) XML, ignores readonly fields, _and_ populates the remaining fields with information learned AT THE TIME the object was created.

Thereafter, GetXMLDesc() outputs the full XML, and:

CreateXML(, _REDEFINE) says to take the FULL XML from a previous dump (and NOT the abbreviated XML at the initial creation) in order to restore ALL state as libvirt had it. (If you omit fields, libvirt has to guess at what state to recreate for those fields - for some fields, guessing might not be too bad, such as what timestamp to use. But for other fields, guessing what to use for <domain> by using the current state of the domain, which may have had hotplug events in the meantime, is WRONG compared to being told the exact <domain> state that libvirt previously had in memory and which was included in the previous dumpxml.)

In other words, _REDEFINE has never been about inventing state from scratch, but about restoring state that libvirt previously had, as dumped by libvirt.
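To make the round trip concrete, a minimal libvirt-python sketch (the connection URI, domain name, and description are made-up examples):

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("fedora")

    # CreateXML(, 0): abbreviated input; libvirt fills in <name>,
    # <creationTime>, <domain>, ... at creation time.
    snap = dom.snapshotCreateXML(
        "<domainsnapshot>"
        "<description>before upgrade</description>"
        "</domainsnapshot>")

    # GetXMLDesc(): the full XML, read-only fields included.
    full_xml = snap.getXMLDesc()

    # CreateXML(, _REDEFINE): feed back that full dump (e.g. on another
    # host) to restore exactly the state libvirt previously had.
    dom.snapshotCreateXML(full_xml,
                          libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)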
And we have an example:
    <domaincheckpoint>
      <description>Completion of updates after OS install</description>
      <disks>
        <disk name='vda' checkpoint='bitmap'/>
        <disk name='vdb' checkpoint='no'/>
      </disks>
    </domaincheckpoint>
Which does not have a domain, since it is read-only.
Yes, that's an example of CreateXML(, 0). But NOT an example of CreateXML(, _REDEFINE).
There is an example of the xml we get from libvirt:
    <domaincheckpoint>
      <name>1525889631</name>
      <description>Completion of updates after OS install</description>
      <creationTime>1525889631</creationTime>
      <parent>
        <name>1525111885</name>
      </parent>
      <disks>
        <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/>
        <disk name='vdb' checkpoint='no'/>
      </disks>
      <domain type='qemu'>
        <name>fedora</name>
        <uuid>93a5c045-6457-2c09-e56c-927cdf34e178</uuid>
        <memory>1048576</memory>
        ...
        <devices>
          <disk type='file' device='disk'>
            <driver name='qemu' type='qcow2'/>
            <source file='/path/to/file1'/>
            <target dev='vda' bus='virtio'/>
          </disk>
          <disk type='file' device='disk' snapshot='external'>
            <driver name='qemu' type='raw'/>
            <source file='/path/to/file2'/>
            <target dev='vdb' bus='virtio'/>
          </disk>
          ...
        </devices>
      </domain>
    </domaincheckpoint>
And _that's_ the XML that the _REDEFINE flag expects.
But as the domain is read-only, we assumed that we don't need to pass it in the domaincheckpoint XML. We also don't need it, since we keep the disk UUIDs included in every checkpoint, and from those you can find the information needed to generate the tiny checkpoint XML above.
But the tiny original input XML is no longer sufficient during _REDEFINE, because you no longer have the point in time where libvirt can populate missing/readonly fields with sane defaults.
Can you explain why historic domain xml is needed for performing a backup?
In the current implementation for qemu, the full <domain> is needed only for validation that the disks requested in the virBackupBegin() operation have a clean chain of bitmaps from the checkpoint to the present; and you may be right that the bulk of the information about the <domain> is spurious for qemu's needs. But it is not obvious whether throwing away information about the <domain> would be valid for other hypervisor implementations, nor whether some future tweak to qemu may actually require more of the <domain> state as it was at the time the snapshot was created.
The caller of the API is responsible for giving you the list of bitmaps that should be included for every disk. Libvirt needs to create a new active bitmap and start the NBD server exposing the bitmaps specified by the caller. How is having the XML for every checkpoint/bitmap needed to perform these steps?
Because having the <domain> lets libvirt correlate that disk 'vda' at the time of the snapshot maps to whatever filename with UUID in the <domain> definition, to ensure that it is operating on the same device (and that 'vda' was not hot-unplugged and then hot-plugged to some other disk in the meantime).
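For illustration, a stdlib-only sketch of that correlation; the element layout follows the checkpoint example quoted earlier in the thread (a real dump would not contain the literal '...' ellipses), and the function name is made up:

    import xml.etree.ElementTree as ET

    def disks_at_checkpoint(checkpoint_xml):
        """Map each target dev (e.g. 'vda') to the source file it had
        at checkpoint-creation time."""
        root = ET.fromstring(checkpoint_xml)
        mapping = {}
        for disk in root.findall("./domain/devices/disk"):
            target = disk.find("target")
            source = disk.find("source")
            if target is not None and source is not None:
                mapping[target.get("dev")] = source.get("file")
        return mapping

For the example above this would yield {'vda': '/path/to/file1', 'vdb': '/path/to/file2'}, which is what lets libvirt detect that 'vda' now points somewhere else.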
The XML libvirt produces may not be stable. For example, the disk with UUID "22e5a7d6-5409-482d-bda4-f5939b27aad4" may be called "sda" now, but when the VM was running on another host 2 weeks ago maybe it was called "sdb", since the user added or removed some devices.
But at the time a <domainsnapshot> or <domaincheckpoint> was created, the <domain> subelement of that XML _is_ stable - that particular disk name/UUID WAS disk 'sda' at the time the object was created, no matter what name it has later.
Note that vdsm may be killed in the middle of the redefine loop, and in this case we leave libvirt with partial info about checkpoints, and we need to redefine the checkpoints again, handling this partial state.
But that's relatively easy - if you don't know whether libvirt might have partial data, then wipe the data and start the redefine loop from scratch.
But this easy boilerplate will be duplicated in all libvirt clients, instead of having a single API to restore the state.
If the single API didn't have to include a <domain> element for every checkpoint, and thus risk exceeding RPC bounds, then we'd be okay. But at the present moment, bulk operations would require a LOT of XML, regardless of snapshot vs. checkpoint.
Of course, if we must keep the historic domain configuration for every checkpoint, a bulk API does not make sense, but I don't understand this requirement.
The requirement is stronger for snapshot (because you can't revert without either knowing the <domain> setup to revert to or using the UNSAFE flag) than for checkpoint; but my plan was to make <domain> mandatory for checkpoint because experience with snapshot has shown that making <domain> optional on REDEFINE causes more grief than desired, and because I wanted to share as much good code between the two (while leaving the back-compats to just snapshots) as possible.

If it turns out that we want <domain> to be optional on REDEFINE for checkpoints, we can do that as a followup patch, but first I want to focus on getting the API in (we can relax API restrictions later, but only if the API even exists to make it into backports).

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

On Tue, Mar 12, 2019 at 04:52:24PM -0500, Eric Blake wrote:
On 3/12/19 4:35 PM, Nir Soffer wrote:
We don't have a need to list or define snapshots since we manage snapshots on the oVirt side. We want an API to list and redefine checkpoints.
But the proposed <domaincheckpoint> also has a <domain> subelement, so it has the same problem (the XML for a bulk operation can become prohibitively large).
Why do we need <domain> in a checkpoint?
The initial design for snapshots did not have <domain>, and it bit us hard; you cannot safely revert to a snapshot if you don't know the state of the domain at that time. The initial design for checkpoints has thus mandated that <domain> be present (my v5 series fails to redefine a snapshot if you omit <domain>, even though it has a NO_DOMAIN flag during dumpxml for reduced output). If we are convinced that defining a snapshot without <domain> is safe enough, then I can relax the checkpoint code to allow redefined metadata without <domain> the way snapshots already have to do it, even though I was hoping that checkpoints could start life with fewer back-compats than snapshot has had to carry along. But I'd rather start strict and relax later (require <domain> and then remove it when proven safe), and not start loose (leave <domain> optional, and then wish we had made it mandatory).
Given that a guest domain XML can be changed at runtime at any point, I don't see how omitting <domain> from the checkpoint XML is safe in general. Even if apps think it is safe now and omit it, a future version of the app might change in a way that makes omitting the <domain> unsafe. If we didn't historically record the <domain> in the checkpoint in the first place, then the new version of the app is potentially in trouble. So I think it is good that we are strict and mandate the <domain> XML even if it is not technically required in some use cases.
Note that vdsm may be killed in the middle of the redefine loop, and in this case we leave libvirt with partial info about checkpoints, and we need to redefine the checkpoints again, handling this partial state.
But that's relatively easy - if you don't know whether libvirt might have partial data, then wipe the data and start the redefine loop from scratch.
Of course the same failure scenario applies if libvirt is doing it via a bulk operation. The redefine loop still exists, just inside libvirt instead, which might be killed or die part way through. So you're not really fixing a failure scenario, just moving the failure to a different piece. That's no net win.

Regards,
Daniel

--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|