[libvirt] [PATCH 0/8] Work-in-progress: Incremental Backup API additions

I'm offline the rest of this week, but wanted to post the progress I've
made on patches towards the Incremental Backup RFC:
https://www.redhat.com/archives/libvir-list/2018-May/msg01403.html

Comments welcome, including any naming suggestions.

Still to go:
- Add .rng file for validating the XML format used in virDomainBackupBegin()
- Add flags for validating XML
- Add src/conf/checkpoint_conf.c mirroring src/conf/snapshot_conf.c for
  tracking the tree of checkpoints
- Add virsh wrappers for calling everything
- Add qemu implementation - my first addition will probably just be for
  push-model full backups, then additional patches to expand into the pull
  model (on the qemu list, I still need to review and incorporate
  Vladimir's patches for exporting a bitmap over NBD)
- Bug fixes (but why would there be any bugs in the first place? :)

I've got portions of the qemu code working locally, but not polished
enough to post as a patch yet; my end goal is to have a working demo
against current qemu.git showing the use of virDomainBackupBegin() for
incremental backups with the push model prior to the code freeze for
4.5.0 this month, even if that code doesn't get checked into libvirt
until later when the qemu code is changed to drop x- prefixes. (That is,
I'm hoping to demo that my API is sound, and thus we can include the
entry points in libvirt.so for this release, even if the libvirt code
for driving pull mode over qemu waits until after a qemu release where
the pieces are promoted to a stable form.)
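As a rough illustration of where this is heading, the push-model XML handed to virDomainBackupBegin() might look something like the sketch below. This is strictly illustrative - the element and attribute names here are a guess at the shape of the format described in the RFC, and the real schema still needs to land in the .rng file listed above:

```xml
<!-- Hypothetical push-model backup description; names are not final.
     <incremental> names an earlier checkpoint so that only blocks
     dirtied since that checkpoint are copied out. -->
<domainbackup mode="push">
  <incremental>checkpoint1</incremental>
  <disks>
    <disk name="vda" type="file">
      <target file="/var/lib/libvirt/backup/vda.qcow2"/>
      <driver type="qcow2"/>
    </disk>
  </disks>
</domainbackup>
```

Omitting the `<incremental>` element would request a full backup of the listed disks, which matches the plan above of landing push-model full backups first.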
Eric Blake (8):
  snapshots: Avoid term 'checkpoint' for full system snapshot
  backup: Document nuances between different state capture APIs
  backup: Introduce virDomainCheckpointPtr
  backup: Document new XML for backups
  backup: Introduce virDomainCheckpoint APIs
  backup: Introduce virDomainBackup APIs
  backup: Add new domain:checkpoint access control
  backup: Implement backup APIs for remote driver

 docs/Makefile.am                            |   3 +
 docs/apibuild.py                            |   2 +
 docs/docs.html.in                           |   9 +-
 docs/domainstatecapture.html.in             | 190 ++++
 docs/formatcheckpoint.html.in               | 273 +++++++++
 docs/formatsnapshot.html.in                 |  16 +-
 docs/schemas/domaincheckpoint.rng           |  89 +++
 include/libvirt/libvirt-domain-checkpoint.h | 158 +++++
 include/libvirt/libvirt-domain-snapshot.h   |  10 +-
 include/libvirt/libvirt-domain.h            |  14 +-
 include/libvirt/libvirt.h                   |   3 +-
 include/libvirt/virterror.h                 |   5 +-
 libvirt.spec.in                             |   2 +
 mingw-libvirt.spec.in                       |   4 +
 po/POTFILES                                 |   1 +
 src/Makefile.am                             |   2 +
 src/access/viraccessperm.c                  |   5 +-
 src/access/viraccessperm.h                  |   8 +-
 src/conf/snapshot_conf.c                    |   2 +-
 src/datatypes.c                             |  62 +-
 src/datatypes.h                             |  31 +-
 src/driver-hypervisor.h                     |  74 ++-
 src/libvirt-domain-checkpoint.c             | 908 ++++++++++++++++++++++++++++
 src/libvirt-domain-snapshot.c               |   4 +-
 src/libvirt-domain.c                        |   8 +-
 src/libvirt_private.syms                    |   2 +
 src/libvirt_public.syms                     |  19 +
 src/qemu/qemu_driver.c                      |  12 +-
 src/remote/remote_daemon_dispatch.c         |  15 +
 src/remote/remote_driver.c                  |  31 +-
 src/remote/remote_protocol.x                | 237 +++++++-
 src/remote_protocol-structs                 | 129 ++++
 src/rpc/gendispatch.pl                      |  32 +-
 src/util/virerror.c                         |  15 +-
 tests/domaincheckpointxml2xmlin/empty.xml   |   1 +
 tests/domaincheckpointxml2xmlout/empty.xml  |  10 +
 tests/virschematest.c                       |   2 +
 tools/virsh-domain.c                        |   3 +-
 tools/virsh-snapshot.c                      |   2 +-
 tools/virsh.pod                             |  14 +-
 40 files changed, 2347 insertions(+), 60 deletions(-)
 create mode 100644 docs/domainstatecapture.html.in
 create mode 100644 docs/formatcheckpoint.html.in
 create mode 100644 docs/schemas/domaincheckpoint.rng
 create mode 100644 include/libvirt/libvirt-domain-checkpoint.h
 create mode 100644 src/libvirt-domain-checkpoint.c
 create mode 100644 tests/domaincheckpointxml2xmlin/empty.xml
 create mode 100644 tests/domaincheckpointxml2xmlout/empty.xml

-- 
2.14.4

Upcoming patches plan to introduce virDomainCheckpointPtr as a new
object for use in incremental backups, along with documentation how
incremental backups differ from snapshots. But first, we need to
rename any existing mention of a 'system checkpoint' to instead be a
'full system state snapshot', so that we aren't overloading the term
checkpoint.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
Bikeshed suggestions on what to name the new object for use in backups
is welcome, if we would rather keep the term 'checkpoint' for a
disk+memory snapshot.
---
 docs/formatsnapshot.html.in               | 14 +++++++-------
 include/libvirt/libvirt-domain-snapshot.h |  2 +-
 src/conf/snapshot_conf.c                  |  2 +-
 src/libvirt-domain-snapshot.c             |  4 ++--
 src/qemu/qemu_driver.c                    | 12 ++++++------
 tools/virsh-snapshot.c                    |  2 +-
 tools/virsh.pod                           | 14 +++++++-------
 7 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in
index fbbecfd242..f2e51df5ab 100644
--- a/docs/formatsnapshot.html.in
+++ b/docs/formatsnapshot.html.in
@@ -33,7 +33,7 @@
         resume in a consistent state; but if the disks are modified
         externally in the meantime, this is likely to lead to data
         corruption.</dd>
-      <dt>system checkpoint</dt>
+      <dt>full system state</dt>
       <dd>A combination of disk snapshots for all disks as well as VM
         memory state, which can be used to resume the guest from where
         it left off with symptoms similar to hibernation (that is, TCP
@@ -55,7 +55,7 @@
       as <code>virDomainSaveImageGetXMLDesc()</code> to work with
       those files.
     </p>
-    <p>System checkpoints are created
+    <p>Full system state snapshots are created
       by <code>virDomainSnapshotCreateXML()</code> with no flags, and
       disk snapshots are created by the same function with
       the <code>VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY</code> flag; in
@@ -128,13 +128,13 @@
        what file name is created in an external snapshot. On output,
        this is fully populated to show the state of each disk in the
        snapshot, including any properties that were generated by the
-        hypervisor defaults. For system checkpoints, this field is
-        ignored on input and omitted on output (a system checkpoint
+        hypervisor defaults. For full system state snapshots, this field is
+        ignored on input and omitted on output (a full system state snapshot
        implies that all disks participate in the snapshot process,
        and since the current implementation only does internal system
-        checkpoints, there are no extra details to add); a future
+        snapshots, there are no extra details to add); a future
        release may allow the use of <code>disks</code> with a system
-        checkpoint. This element has a list of <code>disk</code>
+        snapshot. This element has a list of <code>disk</code>
        sub-elements, describing anywhere from zero to all of the
        disks associated with the domain.
        <span class="since">Since 0.9.5</span>
@@ -206,7 +206,7 @@
      </dd>
      <dt><code>state</code></dt>
      <dd>The state of the domain at the time this snapshot was taken.
-        If the snapshot was created as a system checkpoint, then this
+        If the snapshot was created with full system state, then this
        is the state of the domain at that time; when the domain is
        reverted to this snapshot, the domain's state will default to
        whatever is in this field unless additional flags are passed
diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h
index 0f73f24b2b..e5a893a767 100644
--- a/include/libvirt/libvirt-domain-snapshot.h
+++ b/include/libvirt/libvirt-domain-snapshot.h
@@ -58,7 +58,7 @@ typedef enum {
     VIR_DOMAIN_SNAPSHOT_CREATE_HALT      = (1 << 3), /* Stop running guest
                                                         after snapshot */
     VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY = (1 << 4), /* disk snapshot, not
-                                                        system checkpoint */
+                                                        full system state */
     VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT = (1 << 5), /* reuse any existing
                                                         external files */
     VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE   = (1 << 6), /* use guest agent to
diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
index 787c3d0feb..5efbef7e09 100644
--- a/src/conf/snapshot_conf.c
+++ b/src/conf/snapshot_conf.c
@@ -1307,7 +1307,7 @@ virDomainSnapshotRedefinePrep(virDomainPtr domain,
             (def->state == VIR_DOMAIN_DISK_SNAPSHOT)) {
             virReportError(VIR_ERR_INVALID_ARG,
                            _("cannot change between disk snapshot and "
-                             "system checkpoint in snapshot %s"),
+                             "full system state in snapshot %s"),
                            def->name);
             goto cleanup;
         }
diff --git a/src/libvirt-domain-snapshot.c b/src/libvirt-domain-snapshot.c
index 100326a5e7..71881b2db2 100644
--- a/src/libvirt-domain-snapshot.c
+++ b/src/libvirt-domain-snapshot.c
@@ -105,7 +105,7 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot)
  * contained in xmlDesc.
  *
  * If @flags is 0, the domain can be active, in which case the
- * snapshot will be a system checkpoint (both disk state and runtime
+ * snapshot will be a full system state snapshot (both disk state and runtime
  * VM state such as RAM contents), where reverting to the snapshot is
  * the same as resuming from hibernation (TCP connections may have
  * timed out, but everything else picks up where it left off); or
@@ -149,7 +149,7 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot)
  * is not paused while creating the snapshot. This increases the size
  * of the memory dump file, but reduces downtime of the guest while
  * taking the snapshot. Some hypervisors only support this flag during
- * external checkpoints.
+ * external snapshots.
  *
  * If @flags includes VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY, then the
  * snapshot will be limited to the disks described in @xmlDesc, and no
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7c79c324e6..978c02fab9 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2167,7 +2167,7 @@ qemuDomainReset(virDomainPtr dom, unsigned int flags)
 }
 
-/* Count how many snapshots in a set are external snapshots or checkpoints. */
+/* Count how many snapshots in a set are external snapshots. */
 static int
 qemuDomainSnapshotCountExternal(void *payload,
                                 const void *name ATTRIBUTE_UNUSED,
@@ -14688,7 +14688,7 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm,
     if ((def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL &&
          !found_internal) ||
         (found_internal && forbid_internal)) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
-                       _("internal snapshots and checkpoints require all "
+                       _("internal and full system state snapshots require all "
                          "disks to be selected for snapshot"));
         goto cleanup;
     }
@@ -15161,7 +15161,7 @@ qemuDomainSnapshotCreateActiveExternal(virQEMUDriverPtr driver,
     if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PMSUSPENDED) {
         pmsuspended = true;
     } else if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) {
-        /* For external checkpoints (those with memory), the guest
+        /* For full system external snapshots (those with memory), the guest
          * must pause (either by libvirt up front, or by qemu after
          * _LIVE converges). For disk-only snapshots with multiple
          * disks, libvirt must pause externally to get all snapshots
@@ -15398,7 +15398,7 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
                 redefine)) {
         virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
                        _("live snapshot creation is supported only "
-                         "with external checkpoints"));
+                         "with external full system state"));
         goto cleanup;
     }
@@ -15518,12 +15518,12 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
     } else if (virDomainObjIsActive(vm)) {
         if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY ||
             snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) {
-            /* external checkpoint or disk snapshot */
+            /* external full system or disk snapshot */
             if (qemuDomainSnapshotCreateActiveExternal(driver,
                                                        vm, snap, flags) < 0)
                 goto endjob;
         } else {
-            /* internal checkpoint */
+            /* internal full system */
             if (qemuDomainSnapshotCreateActiveInternal(driver,
                                                        vm, snap, flags) < 0)
                 goto endjob;
diff --git a/tools/virsh-snapshot.c b/tools/virsh-snapshot.c
index 812fa91333..33e3107045 100644
--- a/tools/virsh-snapshot.c
+++ b/tools/virsh-snapshot.c
@@ -1432,7 +1432,7 @@ static const vshCmdOptDef opts_snapshot_list[] = {
     },
     {.name = "active",
      .type = VSH_OT_BOOL,
-     .help = N_("filter by snapshots taken while active (system checkpoints)")
+     .help = N_("filter by snapshots taken while active (full system snapshots)")
     },
     {.name = "disk-only",
      .type = VSH_OT_BOOL,
diff --git a/tools/virsh.pod b/tools/virsh.pod
index 3f3314a87e..cb0dbfa7dd 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -4468,8 +4468,8 @@ If I<--halt> is specified, the domain will be left in an inactive state
 after the snapshot is created.
 
 If I<--disk-only> is specified, the snapshot will only include disk
-state rather than the usual system checkpoint with vm state. Disk
-snapshots are faster than full system checkpoints, but reverting to a
+state rather than the usual full system state snapshot with vm state. Disk
+snapshots are faster than full system snapshots, but reverting to a
 disk snapshot may require fsck or journal replays, since it is like
 the disk state at the point when the power cord is abruptly pulled;
 and mixing I<--halt> and I<--disk-only> loses any data that was not
@@ -4508,10 +4508,10 @@ this. If this flag is not specified, then some
 hypervisors may fail after partially performing the action, and
 B<dumpxml> must be used to see whether any partial changes occurred.
 
-If I<--live> is specified, libvirt takes the snapshot (checkpoint) while
+If I<--live> is specified, libvirt takes the snapshot while
 the guest is running. Both disk snapshot and domain memory snapshot are
 taken. This increases the size of the memory image of the external
-checkpoint. This is currently supported only for external checkpoints.
+snapshot. This is currently supported only for full system external snapshots.
 
 Existence of snapshot metadata will prevent attempts to B<undefine>
 a persistent domain. However, for transient domains, snapshot
@@ -4531,7 +4531,7 @@ Otherwise, if I<--halt> is specified, the domain will be
 left in an inactive state after the snapshot is created, and if
 I<--disk-only> is specified, the snapshot will not include vm state.
 
-The I<--memspec> option can be used to control whether a checkpoint
+The I<--memspec> option can be used to control whether a full system snapshot
 is internal or external. The I<--memspec> flag is mandatory, followed
 by a B<memspec> of the form B<[file=]name[,snapshot=type]>, where
 type can be B<no>, B<internal>, or B<external>. To include a literal
@@ -4539,7 +4539,7 @@ comma in B<file=name>, escape it with a second comma.
 I<--memspec> cannot be used together with I<--disk-only>.
 
 The I<--diskspec> option can be used to control how I<--disk-only> and
-external checkpoints create external files. This option can occur
+external full system snapshots create external files. This option can occur
 multiple times, according to the number of <disk> elements in the domain
 xml. Each <diskspec> is in the form
 B<disk[,snapshot=type][,driver=type][,file=name]>. A I<diskspec>
@@ -4579,7 +4579,7 @@ see whether any partial changes occurred.
 
 If I<--live> is specified, libvirt takes the snapshot while the guest is
 running. This increases the size of the memory image of the external
-checkpoint. This is currently supported only for external checkpoints.
+snapshot. This is currently supported only for external full system snapshots.
 
 =item B<snapshot-current> I<domain> {[I<--name>] | [I<--security-info>]
 | [I<snapshotname>]}

-- 
2.14.4

On 06/13/2018 12:42 PM, Eric Blake wrote:
Upcoming patches plan to introduce virDomainCheckpointPtr as a new object for use in incremental backups, along with documentation how incremental backups differ from snapshots. But first, we need to rename any existing mention of a 'system checkpoint' to instead be a 'full system state snapshot', so that we aren't overloading the term checkpoint.
Signed-off-by: Eric Blake <eblake@redhat.com>
--- Bikeshed suggestions on what to name the new object for use in backups is welcome, if we would rather keep the term 'checkpoint' for a disk+memory snapshot. ---
"Naming is hard" and opinions can vary greatly - be careful for what you
ask in case you receive something not wanted ;-).

I haven't followed the discussions thus far all that closely, but I'll
give this a go anyway since it's languishing and saying nothing is akin
to implicitly agreeing everything is fine. Fair warning, I'm not all
that familiar with snapshot algorithms, having largely tried to ignore
them since others (Eric and Peter) have far more in-depth knowledge.

In any case, another option for the proposed "checkpoint" could be a
"snapshot reference". One can start or end a reference period and then
set or clear a reference point.

What I'm not clear on yet is whether the intention is to have this
checkpoint (and backup) be integrated in any way with the existing
snapshot algorithms. I guess part of me thinks that if I take a full
system snapshot, then any backup/checkpoint data should be included so
that if/when I go back to that point in time I can start from whence I
left as it relates to my backup. Kind of a superset and/or integrated
model rather than something bolted onto the side to resolve a specific
need.

I suppose a reservation I have about separate virDomainCheckpoint* and
virDomainBackup* APIs is understanding the relationship between the two
naming spaces. IIUC though, a Checkpoint would be a reference point in
time within a Backup period.

I do have more comments in patch2, but I want to make them coherent
before posting. Still I wanted to be sure you got at least "some"
feedback for this and well of course an opinion on checkpoint ;-)
 docs/formatsnapshot.html.in               | 14 +++++++-------
 include/libvirt/libvirt-domain-snapshot.h |  2 +-
 src/conf/snapshot_conf.c                  |  2 +-
 src/libvirt-domain-snapshot.c             |  4 ++--
 src/qemu/qemu_driver.c                    | 12 ++++++------
 tools/virsh-snapshot.c                    |  2 +-
 tools/virsh.pod                           | 14 +++++++-------
 7 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in
index fbbecfd242..f2e51df5ab 100644
--- a/docs/formatsnapshot.html.in
+++ b/docs/formatsnapshot.html.in
@@ -33,7 +33,7 @@
         resume in a consistent state; but if the disks are modified
         externally in the meantime, this is likely to lead to data
         corruption.</dd>
-      <dt>system checkpoint</dt>
+      <dt>full system state</dt>
Is "state" superfluous in this context? IOW: Everywhere that "full system state" exists, it seems "full system" could be used. Other synonyms that came up are complete, entire, integrated, or thorough (hah!). But I think "Full System" conveys enough meaning even though it could convey more meaning than intended.
       <dd>A combination of disk snapshots for all disks as well as VM
         memory state, which can be used to resume the guest from where
         it left off with symptoms similar to hibernation (that is, TCP
@@ -55,7 +55,7 @@
       as <code>virDomainSaveImageGetXMLDesc()</code> to work with
       those files.
     </p>
-    <p>System checkpoints are created
+    <p>Full system state snapshots are created
       by <code>virDomainSnapshotCreateXML()</code> with no flags, and
       disk snapshots are created by the same function with
       the <code>VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY</code> flag; in
BTW: Existing and maybe it's just me, but when I read a conjunctive sentence with only two parts I don't expect to see ", and" or ", or" - it's just "and" or "or" without the comma.... Also the "flag; in both cases...", I think should be a "flag. Regardless of the flags value provided, restoration of the snapshot is handled by the virDomainRevertToSnapshot() function." But that's just me being "particular". ;-) There's bigger fish to fry here other than grammar issues. There's so many usages of the "; " to join two sentences in this page - it'd probably take more effort than desired to go through each one.
@@ -128,13 +128,13 @@
        what file name is created in an external snapshot. On output,
        this is fully populated to show the state of each disk in the
        snapshot, including any properties that were generated by the
-        hypervisor defaults. For system checkpoints, this field is
-        ignored on input and omitted on output (a system checkpoint
+        hypervisor defaults. For full system state snapshots, this field is
+        ignored on input and omitted on output (a full system state snapshot
        implies that all disks participate in the snapshot process,
        and since the current implementation only does internal system
-        checkpoints, there are no extra details to add); a future
+        snapshots, there are no extra details to add); a future
        release may allow the use of <code>disks</code> with a system
-        checkpoint. This element has a list of <code>disk</code>
+        snapshot. This element has a list of <code>disk</code>
        sub-elements, describing anywhere from zero to all of the
        disks associated with the domain.
        <span class="since">Since 0.9.5</span>
@@ -206,7 +206,7 @@
      </dd>
      <dt><code>state</code></dt>
      <dd>The state of the domain at the time this snapshot was taken.
-        If the snapshot was created as a system checkpoint, then this
+        If the snapshot was created with full system state, then this
        is the state of the domain at that time; when the domain is
        reverted to this snapshot, the domain's state will default to
        whatever is in this field unless additional flags are passed
Oy - this is so hard to read... Such as what flags?.... leaves me
searching... ahhh... REVERT_RUNNING or REVERT_PAUSED... So as a
suggestion:

  If a full system snapshot was created, then this is the state of the
  domain at that time. When the domain is reverted to this snapshot,
  then the domain's state will default to this state unless overridden
  by virDomainRevertToSnapshot() flags, such as revert to running or
  to paused state.
diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h
index 0f73f24b2b..e5a893a767 100644
--- a/include/libvirt/libvirt-domain-snapshot.h
+++ b/include/libvirt/libvirt-domain-snapshot.h
@@ -58,7 +58,7 @@ typedef enum {
     VIR_DOMAIN_SNAPSHOT_CREATE_HALT      = (1 << 3), /* Stop running guest
                                                         after snapshot */
     VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY = (1 << 4), /* disk snapshot, not
-                                                        system checkpoint */
+                                                        full system state */
     VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT = (1 << 5), /* reuse any existing
                                                         external files */
     VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE   = (1 << 6), /* use guest agent to
diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
index 787c3d0feb..5efbef7e09 100644
--- a/src/conf/snapshot_conf.c
+++ b/src/conf/snapshot_conf.c
@@ -1307,7 +1307,7 @@ virDomainSnapshotRedefinePrep(virDomainPtr domain,
             (def->state == VIR_DOMAIN_DISK_SNAPSHOT)) {
             virReportError(VIR_ERR_INVALID_ARG,
                            _("cannot change between disk snapshot and "
-                             "system checkpoint in snapshot %s"),
+                             "full system state in snapshot %s"),
"cannot change between disk only and full system snapshots" [honestly, "full system state in snapshot" doesn't read well to me.]
                            def->name);
             goto cleanup;
         }
diff --git a/src/libvirt-domain-snapshot.c b/src/libvirt-domain-snapshot.c
index 100326a5e7..71881b2db2 100644
--- a/src/libvirt-domain-snapshot.c
+++ b/src/libvirt-domain-snapshot.c
@@ -105,7 +105,7 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot)
  * contained in xmlDesc.
  *
  * If @flags is 0, the domain can be active, in which case the
- * snapshot will be a system checkpoint (both disk state and runtime
+ * snapshot will be a full system state snapshot (both disk state and runtime
"disk state"? Should that be disk contents?
  * VM state such as RAM contents), where reverting to the snapshot is
  * the same as resuming from hibernation (TCP connections may have
  * timed out, but everything else picks up where it left off); or
@@ -149,7 +149,7 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot)
  * is not paused while creating the snapshot. This increases the size
  * of the memory dump file, but reduces downtime of the guest while
  * taking the snapshot. Some hypervisors only support this flag during
- * external checkpoints.
+ * external snapshots.
  *
  * If @flags includes VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY, then the
  * snapshot will be limited to the disks described in @xmlDesc, and no
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7c79c324e6..978c02fab9 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2167,7 +2167,7 @@ qemuDomainReset(virDomainPtr dom, unsigned int flags)
 }
-/* Count how many snapshots in a set are external snapshots or checkpoints. */
+/* Count how many snapshots in a set are external snapshots. */
 static int
 qemuDomainSnapshotCountExternal(void *payload,
                                 const void *name ATTRIBUTE_UNUSED,
@@ -14688,7 +14688,7 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm,
     if ((def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL &&
          !found_internal) ||
         (found_internal && forbid_internal)) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
-                       _("internal snapshots and checkpoints require all "
+                       _("internal and full system state snapshots require all "
                          "disks to be selected for snapshot"));
         goto cleanup;
     }
@@ -15161,7 +15161,7 @@ qemuDomainSnapshotCreateActiveExternal(virQEMUDriverPtr driver,
     if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PMSUSPENDED) {
         pmsuspended = true;
     } else if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) {
-        /* For external checkpoints (those with memory), the guest
+        /* For full system external snapshots (those with memory), the guest
          * must pause (either by libvirt up front, or by qemu after
          * _LIVE converges). For disk-only snapshots with multiple
          * disks, libvirt must pause externally to get all snapshots
@@ -15398,7 +15398,7 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
                 redefine)) {
         virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
                        _("live snapshot creation is supported only "
-                         "with external checkpoints"));
+                         "with external full system state"));
live snapshot creation is supported only using a full system snapshot
goto cleanup; }
@@ -15518,12 +15518,12 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
     } else if (virDomainObjIsActive(vm)) {
         if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY ||
             snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) {
-            /* external checkpoint or disk snapshot */
+            /* external full system or disk snapshot */
             if (qemuDomainSnapshotCreateActiveExternal(driver,
                                                        vm, snap, flags) < 0)
                 goto endjob;
         } else {
-            /* internal checkpoint */
+            /* internal full system */
             if (qemuDomainSnapshotCreateActiveInternal(driver,
                                                        vm, snap, flags) < 0)
                 goto endjob;
diff --git a/tools/virsh-snapshot.c b/tools/virsh-snapshot.c
index 812fa91333..33e3107045 100644
--- a/tools/virsh-snapshot.c
+++ b/tools/virsh-snapshot.c
@@ -1432,7 +1432,7 @@ static const vshCmdOptDef opts_snapshot_list[] = {
     },
     {.name = "active",
      .type = VSH_OT_BOOL,
-     .help = N_("filter by snapshots taken while active (system checkpoints)")
+     .help = N_("filter by snapshots taken while active (full system snapshots)")
     },
     {.name = "disk-only",
      .type = VSH_OT_BOOL,
diff --git a/tools/virsh.pod b/tools/virsh.pod
index 3f3314a87e..cb0dbfa7dd 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -4468,8 +4468,8 @@ If I<--halt> is specified, the domain will be left in an inactive state
 after the snapshot is created.
 If I<--disk-only> is specified, the snapshot will only include disk
-state rather than the usual system checkpoint with vm state. Disk
-snapshots are faster than full system checkpoints, but reverting to a
+state rather than the usual full system state snapshot with vm state. Disk
Here, "with vm state" would seem to be redundant. Also here again, is it
really "disk state" or "disk content"?

John
+snapshots are faster than full system snapshots, but reverting to a
 disk snapshot may require fsck or journal replays, since it is like
 the disk state at the point when the power cord is abruptly pulled;
 and mixing I<--halt> and I<--disk-only> loses any data that was not
@@ -4508,10 +4508,10 @@ this. If this flag is not specified, then some
 hypervisors may fail after partially performing the action, and
 B<dumpxml> must be used to see whether any partial changes occurred.
-If I<--live> is specified, libvirt takes the snapshot (checkpoint) while
+If I<--live> is specified, libvirt takes the snapshot while
 the guest is running. Both disk snapshot and domain memory snapshot are
 taken. This increases the size of the memory image of the external
-checkpoint. This is currently supported only for external checkpoints.
+snapshot. This is currently supported only for full system external snapshots.
 Existence of snapshot metadata will prevent attempts to B<undefine>
 a persistent domain. However, for transient domains, snapshot
@@ -4531,7 +4531,7 @@ Otherwise, if I<--halt> is specified, the domain will be
 left in an inactive state after the snapshot is created, and if
 I<--disk-only> is specified, the snapshot will not include vm state.
-The I<--memspec> option can be used to control whether a checkpoint
+The I<--memspec> option can be used to control whether a full system snapshot
 is internal or external. The I<--memspec> flag is mandatory, followed
 by a B<memspec> of the form B<[file=]name[,snapshot=type]>, where
 type can be B<no>, B<internal>, or B<external>. To include a literal
@@ -4539,7 +4539,7 @@ comma in B<file=name>, escape it with a second comma.
 I<--memspec> cannot be used together with I<--disk-only>.
 The I<--diskspec> option can be used to control how I<--disk-only> and
-external checkpoints create external files. This option can occur
+external full system snapshots create external files. This option can occur
 multiple times, according to the number of <disk> elements in the domain
 xml. Each <diskspec> is in the form
 B<disk[,snapshot=type][,driver=type][,file=name]>. A I<diskspec>
@@ -4579,7 +4579,7 @@ see whether any partial changes occurred.
 If I<--live> is specified, libvirt takes the snapshot while the guest is
 running. This increases the size of the memory image of the external
-checkpoint. This is currently supported only for external checkpoints.
+snapshot. This is currently supported only for external full system snapshots.
=item B<snapshot-current> I<domain> {[I<--name>] | [I<--security-info>] | [I<snapshotname>]}

On 06/22/2018 04:16 PM, John Ferlan wrote:
On 06/13/2018 12:42 PM, Eric Blake wrote:
Upcoming patches plan to introduce virDomainCheckpointPtr as a new object for use in incremental backups, along with documentation how incremental backups differ from snapshots. But first, we need to rename any existing mention of a 'system checkpoint' to instead be a 'full system state snapshot', so that we aren't overloading the term checkpoint.
Signed-off-by: Eric Blake <eblake@redhat.com>
--- Bikeshed suggestions on what to name the new object for use in backups is welcome, if we would rather keep the term 'checkpoint' for a disk+memory snapshot. ---
"Naming is hard" and opinions can vary greatly - be careful for what you ask in case you receive something not wanted ;-).
I haven't followed the discussions thus far all that closely, but I'll give this a go anyway since it's languishing and saying nothing is akin to implicitly agreeing everything is fine. Fair warning, I'm not all that familiar with snapshot algorithms having largely tried to ignore it since others (Eric and Peter) have far more in depth knowledge.
In any case, another option for the proposed "checkpoint" could be a "snapshot reference". One can start or end a reference period and then set or clear a reference point.
What I'm not clear on yet is whether the intention is to have this checkpoint (and backup) be integrated in any way to the existing snapshot algorithms. I guess part of me thinks that if I take a full system snapshot, then any backup/checkpoint data should be included so that if/when I go back to that point in time I can start from whence I left as it relates to my backup. Kind of a superset and/or integrated model rather than something bolted onto the side to resolve a specific need.
That's a tough call. My current design has incremental backups
completely separate from the existing checkpoint code for several
reasons:
- the snapshot code is already confusing with lots of flags
  (internal/external, disk/memory, etc)
- snapshots can be reverted to (well, in theory - we STILL can't revert
  to an external snapshot cleanly, even though the design supports it)
- incremental backups are not direct revert points

So, rather than bolt something on to the existing design, I went with a
new concept. As you found later in the series, I then tried to provide
a good summary page describing the different pieces, and what tradeoffs
are involved in order to know which approach will work for a given need.
I suppose a reservation I have about separate virDomainCheckpoint* and virDomainBackup* API's is understanding the relationship between the two naming spaces. IIUC though a Checkpoint would be reference point in time within a Backup period.
A sequence of snapshots are different points in time you can revert to. A sequence of checkpoints are different points in time you can use as the reference point for starting an incremental backup. So if we don't like the term 'checkpoint', maybe virDomainBlockBackupReference would work. But it is longer, and would make for some mouthful API names. Also, you commented elsewhere that 'virDomainBackupBegin' misses out on the fact that under the hood, it is a block operation (only disk state); would 'virDomainBlockBackupBegin' be any better? There are fewer APIs with the term 'Backup' than with 'Checkpoint', if we do want go with that particular rename.
I do have more comments in patch2, but I want to make them coherent before posting. Still I wanted to be sure you got at least "some" feedback for this and well of course an opinion on checkpoint ;-)
docs/formatsnapshot.html.in | 14 +++++++------- include/libvirt/libvirt-domain-snapshot.h | 2 +- src/conf/snapshot_conf.c | 2 +- src/libvirt-domain-snapshot.c | 4 ++-- src/qemu/qemu_driver.c | 12 ++++++------ tools/virsh-snapshot.c | 2 +- tools/virsh.pod | 14 +++++++------- 7 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in index fbbecfd242..f2e51df5ab 100644 --- a/docs/formatsnapshot.html.in +++ b/docs/formatsnapshot.html.in @@ -33,7 +33,7 @@ resume in a consistent state; but if the disks are modified externally in the meantime, this is likely to lead to data corruption.</dd> - <dt>system checkpoint</dt> + <dt>full system state</dt>
Is "state" superfluous in this context? IOW: Everywhere that "full system state" exists, it seems "full system" could be used.
Other synonyms that came up are complete, entire, integrated, or thorough (hah!). But I think "Full System" conveys enough meaning even though it could convey more meaning than intended.
Okay, I can live with shortening the replacement to 'full system'. Don't know if it will happen in the v2 series that I hope to post later tonight, or if it would be done on top (my immediate short-term goal is to get a demo of incremental backups working, to show the API is usable; although the demo depends on unreleased qemu code so only the API would actually go in this month's libvirt release, while the underlying src/qemu/* changes can be delayed and polished to be better than the demo for the time when qemu 3.0 releases the needed bitmap/NBD features).
<dd>A combination of disk snapshots for all disks as well as VM memory state, which can be used to resume the guest from where it left off with symptoms similar to hibernation (that is, TCP @@ -55,7 +55,7 @@ as <code>virDomainSaveImageGetXMLDesc()</code> to work with those files. </p> - <p>System checkpoints are created + <p>Full system state snapshots are created by <code>virDomainSnapshotCreateXML()</code> with no flags, and disk snapshots are created by the same function with the <code>VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY</code> flag; in
BTW: Existing and maybe it's just me, but when I read a conjunctive sentence with only two parts I don't expect to see ", and" or ", or" - it's just "and" or "or" without the comma....
Thanks for the careful grammar/legibility review. I'll try to fold in those suggestions (again, might not make it into v2). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

On 06/25/2018 06:27 PM, Eric Blake wrote:
On 06/22/2018 04:16 PM, John Ferlan wrote:
On 06/13/2018 12:42 PM, Eric Blake wrote:
Upcoming patches plan to introduce virDomainCheckpointPtr as a new object for use in incremental backups, along with documentation of how incremental backups differ from snapshots. But first, we need to rename any existing mention of a 'system checkpoint' to instead be a 'full system state snapshot', so that we aren't overloading the term checkpoint.
Signed-off-by: Eric Blake <eblake@redhat.com>
--- Bikeshed suggestions on what to name the new object for use in backups are welcome, if we would rather keep the term 'checkpoint' for a disk+memory snapshot. ---
"Naming is hard" and opinions can vary greatly - be careful for what you ask in case you receive something not wanted ;-).
I haven't followed the discussions thus far all that closely, but I'll give this a go anyway, since it's languishing and saying nothing is akin to implicitly agreeing everything is fine. Fair warning: I'm not all that familiar with snapshot algorithms, having largely tried to ignore them since others (Eric and Peter) have far more in-depth knowledge.
In any case, another option for the proposed "checkpoint" could be a "snapshot reference". One can start or end a reference period and then set or clear a reference point.
What I'm not clear on yet is whether the intention is to have this checkpoint (and backup) be integrated in any way to the existing snapshot algorithms. I guess part of me thinks that if I take a full system snapshot, then any backup/checkpoint data should be included so that if/when I go back to that point in time I can start from whence I left as it relates to my backup. Kind of a superset and/or integrated model rather than something bolted onto the side to resolve a specific need.
That's a tough call. My current design has incremental backups completely separate from the existing checkpoint code for several reasons:
- the snapshot code is already confusing with lots of flags (internal/external, disk/memory, etc)
- snapshots can be reverted to (well, in theory - we STILL can't revert to an external snapshot cleanly, even though the design supports it)
- incremental backups are not direct revert points
so, rather than bolt something on to the existing design, I went with a new concept. As you found later in the series, I then tried to provide a good summary page describing the different pieces, and what tradeoffs are involved in order to know which approach will work for a given need.
Understood - since I've made it further now. Domain state is somewhat of a tangled or interwoven part of the domain XML lifecycle fabric, so I perhaps process it that way when reading new code. It seems the design is that XML for domain{checkpoint|backup} will be similar to domainstatus, but far more restrictive to the specific needs for each since you're saving the original config in the output.
I suppose a reservation I have about separate virDomainCheckpoint* and virDomainBackup* APIs is understanding the relationship between the two naming spaces. IIUC though, a Checkpoint would be a reference point in time within a Backup period.
A sequence of snapshots are different points in time you can revert to. A sequence of checkpoints are different points in time you can use as the reference point for starting an incremental backup.
So if we don't like the term 'checkpoint', maybe virDomainBlockBackupReference would work. But it is longer, and would make for some mouthful API names.
I saw things as more of "{Create|Start|Set|Clear|End|Destroy}" type operations initially, but as I've gone through the series my viewpoint has changed.

I'm still concerned with over-complicating things with too many flags, which is where we seem to have gotten ourselves in trouble with snapshots as time went on and more functionality was desired. Checkpoint/backup is complex enough without adding levels of features for consumers that may or may not be used. In the long run, it's a question of where you give up control and how much can/should be assumed to be libvirt's "job". If a consumer wants to do something particularly tricky, then maybe we should just hand them the gun with one bullet in the chamber (so to speak) - provide an API that "locks out" other threads instead of doing that for them ;-) - someone always thinks they can do it better!

In the long run, we don't necessarily know how all consumers would like to use this, and so far there have been mixed opinions on what should be done. At some level, the operations of starting checkpoints, setting checkpoints, and allowing/performing backups from a specific checkpoint are fairly straightforward. It's the additional cruft and flags to do "fancy" stuff - the desire to worry about every possible operation option someone could possibly want - that I think causes over-complication. Let the up-the-stack app keep track of things if that's the level of complexity it desires. Keep the libvirt side simple.
Also, you commented elsewhere that 'virDomainBackupBegin' misses out on the fact that under the hood, it is a block operation (only disk state); would 'virDomainBlockBackupBegin' be any better? There are fewer APIs with the term 'Backup' than with 'Checkpoint', if we do want to go with that particular rename.
I think so, but I haven't reached the Backup APIs yet. There's the interaction with NBD that's going to keep things "interesting". Reminding myself that under the covers migrations start/stop an NBD server - that, I assume, is the impetus behind [re]using NBD as the third-party integration point. Perhaps something that becomes part of the documentation describing the difference between the pull/push backup models.
I do have more comments in patch2, but I want to make them coherent before posting. Still I wanted to be sure you got at least "some" feedback for this and well of course an opinion on checkpoint ;-)
 docs/formatsnapshot.html.in               | 14 +++++++-------
 include/libvirt/libvirt-domain-snapshot.h |  2 +-
 src/conf/snapshot_conf.c                  |  2 +-
 src/libvirt-domain-snapshot.c             |  4 ++--
 src/qemu/qemu_driver.c                    | 12 ++++++------
 tools/virsh-snapshot.c                    |  2 +-
 tools/virsh.pod                           | 14 +++++++-------
 7 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in
index fbbecfd242..f2e51df5ab 100644
--- a/docs/formatsnapshot.html.in
+++ b/docs/formatsnapshot.html.in
@@ -33,7 +33,7 @@
         resume in a consistent state; but if the disks are modified
         externally in the meantime, this is likely to lead to data
         corruption.</dd>
-      <dt>system checkpoint</dt>
+      <dt>full system state</dt>
Is "state" superfluous in this context? IOW: Everywhere that "full system state" exists, it seems "full system" could be used.
Other synonyms that came up are complete, entire, integrated, or thorough (hah!). But I think "Full System" conveys enough meaning even though it could convey more meaning than intended.
Okay, I can live with shortening the replacement to 'full system'. Don't know if it will happen in the v2 series that I hope to post later tonight, or if it would be done on top (my immediate short-term goal is to get a demo of incremental backups working, to show the API is usable; although the demo depends on unreleased qemu code so only the API would actually go in this month's libvirt release, while the underlying src/qemu/* changes can be delayed and polished to be better than the demo for the time when qemu 3.0 releases the needed bitmap/NBD features).
OK - it's not 100% clear in my mind all the various pieces needed for this to all work properly. I understand the desire/need for inclusion; however, we do need to consider that in more recent times we have been "waiting" for a more functional solution to be "ready" in QEMU before pushing libvirt changes - especially ones that bake in an API...
       <dd>A combination of disk snapshots for all disks as well as VM
         memory state, which can be used to resume the guest from where it
         left off with symptoms similar to hibernation (that is, TCP
@@ -55,7 +55,7 @@
       as <code>virDomainSaveImageGetXMLDesc()</code> to work with
       those files.
     </p>
-    <p>System checkpoints are created
+    <p>Full system state snapshots are created
       by <code>virDomainSnapshotCreateXML()</code> with no flags, and
       disk snapshots are created by the same function with
       the <code>VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY</code> flag; in
BTW: Existing and maybe it's just me, but when I read a conjunctive sentence with only two parts I don't expect to see ", and" or ", or" - it's just "and" or "or" without the comma....
Thanks for the careful grammar/legibility review. I'll try to fold in those suggestions (again, might not make it into v2).
Understood. John

On Wed, Jun 13, 2018 at 7:42 PM Eric Blake <eblake@redhat.com> wrote:
Upcoming patches plan to introduce virDomainCheckpointPtr as a new object for use in incremental backups, along with documentation of how incremental backups differ from snapshots. But first, we need to rename any existing mention of a 'system checkpoint' to instead be a 'full system state snapshot', so that we aren't overloading the term checkpoint.
I want to refer only to the new concept of checkpoint, compared with snapshot.

I think checkpoint should refer to the current snapshot. When you perform a backup, you should get the changed blocks in the current snapshot. When you restore, you want to restore several complete snapshots, and one partial snapshot, based on the backups of that snapshot.

Let's try to see an example:

T1
- user create new vm marked for incremental backup
- system create base volume (S1)
- system create new dirty bitmap (B1)

T2
- user create a snapshot
- dirty bitmap in original snapshot deactivated (B1)
- system create new snapshot (S2)
- system starts new dirty bitmap in the new snapshot (B2)

T3
- user create new checkpoint
- system deactivate current dirty bitmap (B2)
- system create new dirty bitmap (B3)
- user backups data in snapshot S2 using dirty bitmap B2
- user backups data in snapshot S1 using dirty bitmap B1

T4
- user create new checkpoint
- system deactivate current dirty bitmap (B3)
- system create new dirty bitmap (B4)
- user backups data in snapshot S2 using dirty bitmap B3

Let's say the user wants to restore to the state as it was in T3.

This is the data kept by the backup application:

- snapshots
  - S1
    - checkpoints
      - B1
  - S2
    - checkpoints
      - B2
      - B3

T5
- user start restore to state in time T3
- user create new disk
- user create empty snapshot S1
- user upload snapshot S1 data to storage
- user create empty snapshot disk S2
- user upload snapshot S1 data to storage

John, are dirty bitmaps implemented in this way in qemu?

Nir
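The timeline above can be sketched as a toy model, with snapshots as stacked cluster maps and dirty bitmaps as sets of cluster indices recording writes while active. Everything here (the Disk class, the cluster numbers) is illustrative, not libvirt or qemu API; it just sanity-checks the bookkeeping:

```python
# Toy model of Nir's timeline: snapshot layers behave like a qcow2
# backing chain, and each checkpoint freezes the current dirty bitmap
# and starts a new one. Illustrative only - not libvirt/qemu API.

class Disk:
    def __init__(self):
        self.layers = [{}]        # layers[i]: cluster -> data (S1, S2, ...)
        self.bitmaps = [set()]    # bitmaps[i]: clusters dirtied while active
        self.active = 0

    def write(self, cluster, data):
        self.layers[-1][cluster] = data     # writes land in the top layer
        self.bitmaps[self.active].add(cluster)

    def checkpoint(self):
        """Deactivate the current bitmap and start a fresh one."""
        self.bitmaps.append(set())
        self.active = len(self.bitmaps) - 1

    def snapshot(self):
        """Freeze the top layer; a new layer and a new bitmap start."""
        self.layers.append({})
        self.checkpoint()

    def read(self, cluster):
        for layer in reversed(self.layers):  # backing-chain lookup
            if cluster in layer:
                return layer[cluster]
        return 0                             # unallocated reads as zero

d = Disk()         # T1: base volume S1, bitmap B1
d.write(0, 'A')
d.snapshot()       # T2: snapshot S2, bitmap B2; B1 is frozen
d.write(1, 'B')
d.checkpoint()     # T3: bitmap B3 starts; B2 is frozen
d.write(2, 'D')
```

Each frozen bitmap then records exactly the writes between two points in time, which is the property the backup application relies on.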

On 06/26/2018 10:56 AM, Nir Soffer wrote:
On Wed, Jun 13, 2018 at 7:42 PM Eric Blake <eblake@redhat.com> wrote:
Upcoming patches plan to introduce virDomainCheckpointPtr as a new object for use in incremental backups, along with documentation of how incremental backups differ from snapshots. But first, we need to rename any existing mention of a 'system checkpoint' to instead be a 'full system state snapshot', so that we aren't overloading the term checkpoint.
I want to refer only to the new concept of checkpoint, compared with snapshot.
I think checkpoint should refer to the current snapshot. When you perform a backup, you should get the changed blocks in the current snapshot.
That is an incremental backup (copying only the blocks that have changed since some previous point in time) - and my design was that such points in time are named 'checkpoints', where the most recent checkpoint is the current checkpoint. This is different from a snapshot (which is enough state that you can revert back to that point in time directly) - a checkpoint only carries enough information to perform an incremental backup rather than a rollback to earlier state.
When you restore, you want to restore several complete snapshots, and one partial snapshot, based on the backups of that snapshot.
I'm worried that overloading the term "snapshot" and/or "checkpoint" can make it difficult to see whether we are describing the same data motions. You are correct that rolling a virtual machine back to the state represented by a series of incremental backups will require reconstructing the state present on the machine at the desired point to roll back to. But I'll have to read your example first to see if we're on the same page.
Let's try to see an example:
T1
- user create new vm marked for incremental backup
- system create base volume (S1)
- system create new dirty bitmap (B1)
Why do you need a dirty bitmap on a brand new system? By definition, if the VM is brand new, every sector that the guest touches will be part of the first incremental backup, which is no different than taking a full backup of every sector? But if it makes life easier by following consistent patterns, I also don't see a problem with creating a first checkpoint at the time an image is first created (my API proposal would allow you to create a domain, start it in the paused state, create a checkpoint, and then resume the guest so that it can start executing).
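The point about a brand-new image can be checked with a trivial model (illustrative names only, not any real API): a bitmap started at image-creation time ends up covering exactly the clusters the guest has touched, the same set a full copy of the allocated data would grab.

```python
# Sketch: for a bitmap created together with an empty image, the first
# "incremental" backup is indistinguishable from a full backup of the
# allocated data. Toy model, not libvirt/qemu API.

bitmap = set()   # B1, started when the empty image was created
image = {}       # cluster -> data written by the guest since creation

for cluster, data in [(0, 'A'), (1, 'A'), (3, 'C')]:
    image[cluster] = data
    bitmap.add(cluster)

# Clusters to copy for an incremental backup since creation:
assert bitmap == set(image)
```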
T2
- user create a snapshot
- dirty bitmap in original snapshot deactivated (B1)
- system create new snapshot (S2)
- system starts new dirty bitmap in the new snapshot (B2)
I'm still worried that interactions between snapshots (where the backing chain grows) and bitmaps may present interesting challenges. But what you are describing here is that the act of creating a snapshot (to enlarge the backing chain) also has the effect of creating a snapshot (a new point in time for tracking incremental changes since the creation of the snapshot). Whether we have to copy B1 into image S2, or whether image S2 can get by with just bitmap B2, is an implementation detail.
T3
- user create new checkpoint
- system deactivate current dirty bitmap (B2)
- system create new dirty bitmap (B3)
- user backups data in snapshot S2 using dirty bitmap B2
- user backups data in snapshot S1 using dirty bitmap B1
So here you are performing two incremental backups. Note: the user can already back up S1 without using any new APIs, and without reference to bitmap B1 - that's because B1 was started when S1 was created, and closed out when S1 was no longer modified - but now that S1 is a read-only file in the backing chain, copying S1 is the same as copying the clusters covered by bitmap B1.

Also, my current API additions do NOT make it easy to grab just the incremental data covered by bitmap B1 at time T3; rather, the time to grab the copy of the data covered just by B1 is at time T2 when you create bitmap B2 (whether or not you also create file S2). The API additions as I have proposed them only make it easy to grab a full backup of all data up to time T3 (no checkpoint as its start), an incremental backup of all data since T1 (checkpoint T1 as its start, using the merge of B1 and B2 to learn which clusters to grab), or an incremental backup of all data since T2 (checkpoint T2 as its start, using B2 to learn which clusters to grab).

If you NEED to grab an incremental snapshot whose history is NOT bounded by the current moment in time, then we need to rethink the operations we are offering via my new API. On the bright side, since my API for virDomainBackupBegin() takes an XML description, we DO have the option of enhancing that XML to take a second point in time as the end boundary (it already has an optional <incremental> tag as the first point in time for the start boundary; or a full backup if that tag is omitted) - if we enhance that XML, we'd also have to figure out how to map it to the operations that qemu exposes.
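The three start-point choices just described can be sketched with sets of dirty cluster indices (the cluster numbers are invented for illustration; this is not the qemu interface):

```python
# Which clusters each backup mode copies, per the description above:
# a backup since checkpoint C merges every bitmap started at or after C.

B1 = {0, 1, 2, 3}          # clusters dirtied between T1 and T2
B2 = {3, 4, 5}             # clusters dirtied between T2 and T3
ALL = set(range(8))        # every cluster on the disk

full_backup   = ALL        # no checkpoint as its start
incr_since_T1 = B1 | B2    # checkpoint T1: merge of B1 and B2
incr_since_T2 = B2         # checkpoint T2: just B2

assert incr_since_T1 == {0, 1, 2, 3, 4, 5}
```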
(The blockdev-backup command makes it easy to grab an incremental backup ending at the current moment in time, by using the "sync":"none" option to a temporary scratch file so that further guest writes do not corrupt the data to be grabbed from that point in time - but it does NOT make it easy to see the state of data from an earlier point in time - I'll demonstrate that below).
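The scratch-file trick can be modeled as a tiny copy-on-write store (hypothetical names, not the QMP commands themselves): old data is preserved before the guest overwrites it, so the point-in-time view stays readable while the guest keeps running.

```python
# Toy model of "sync":"none" to a scratch file: before the guest
# overwrites a cluster, its old content is copied out, so reads of the
# backup see the disk as it was when the job started.

live = {0: 'A', 1: 'A', 2: 'B'}   # disk contents when the backup starts
scratch = {}                      # copy-on-write store for old data

def guest_write(cluster, data):
    if cluster not in scratch:    # first overwrite: preserve old content
        scratch[cluster] = live.get(cluster, '-')
    live[cluster] = data

def backup_read(cluster):
    """Read the disk as it was at the moment the backup job started."""
    return scratch.get(cluster, live.get(cluster, '-'))

guest_write(1, 'C')               # guest keeps writing during the backup
assert backup_read(1) == 'A'      # the backup still sees the old data
assert live[1] == 'C'
```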
T4
- user create new checkpoint
- system deactivate current dirty bitmap (B3)
- system create new dirty bitmap (B4)
- user backups data in snapshot S2 using dirty bitmap B3
Yes, this is similar to what was done at T3, without the complication of trying to grab an incremental backup whose end boundary is not the current moment in time.
Let's say the user wants to restore to the state as it was in T3.
This is the data kept by the backup application:
- snapshots
  - S1
    - checkpoints
      - B1
  - S2
    - checkpoints
      - B2
      - B3
T5
- user start restore to state in time T3
- user create new disk
- user create empty snapshot S1
- user upload snapshot S1 data to storage
- user create empty snapshot disk S2
- user upload snapshot S1 data to storage
Presumably, this would be 'user uploads S2 to storage', not S1. But restoring in this manner didn't make any use of your incremental snapshots.

Maybe what I need to do is give a more visual indication of what incremental backups store. At T1, we create S1 and start populating it. As this was a brand new guest, the storage starts empty. Since you mentioned B1, I'll show it here, even though I argued it is pointless other than for fewer differences from later cases:

  S1: |--------|
  B1: |--------|
  guest sees: |--------|

At T2, the guest has written things, so we now have:

  S1: |AAAA----|
  B1: |XXXX----|
  guest sees: |AAAA----|

where A is the contents of the data the guest has written, and X is an indication in the bitmap which sections are dirty. Also at time T2, we create a snapshot S2, making S1 become a read-only picture of the state of the disk at T2; we also started bitmap B2 on S2 to track what the guest does:

  S1: |AAAA----| <- S2: |--------|
  B1: |XXXX----|    B2: |--------|

we can copy S1 to S1.bak at any point in time now that S1 is readonly.

  S1.bak: |AAAA----|

At T3, the guest has written things, so we now have:

  S1: |AAAA----| <- S2: |---BBB--|
  B1: |XXXX----|    B2: |---XXX--|
  guest sees: |AAABBB--|

so at this point, we freeze B2 and create B3; the new virDomainBackupBegin() API will let us also access the following copies at this time:

  S1: |AAAA----| <- S2: |---BBB--|
  B1: |XXXX----|    B2: |---XXX--|
                    B3: |--------|

  full3.bak (no checkpoint as starting point): |AAABBB--|
  B2.bak (checkpoint B2 as starting point): |---BBB--|

B2.bak by itself does not match anything the guest ever saw, but you can string together:

  S1.bak <- S2.bak

to reconstruct the state the guest saw at T3.
By T4, the guest has made more edits:

  S1: |AAAA----| <- S2: |D--BBDD-|
  B1: |XXXX----|    B2: |---XXX--|
                    B3: |X----XX-|
  guest sees: |DAABBDD-|

and as before, we now create B4, and have the option of several backups (usually, you'll only grab the most recent incremental backup, and not multiple backups; this is more an exploration of what is possible):

  full4.bak (no checkpoint as starting): |DAABBDD-|
  S2_3.bak (B2 as starting point, covering merge of B2 and B3): |D--BBDD-|
  S3.bak (B3 as starting point): |D----DD-|

Note that both (S1.bak <- S2_3.bak) and (S1.bak <- S2.bak <- S3.bak) result in the same reconstructed guest image at time T4. Also note that reading the contents of bitmap B2 in isolation at this time is NOT usable (you'd get |---BBD--|, which has mixed the incremental difference from T2 to T3 with a subset of the difference from T3 to T4, so it NO LONGER REPRESENTS the state of the guest at either T2 or T3, even when used as an overlay on top of S1.bak). Hence, my emphasis that it is usually important to create your incremental backup at the same time you start your next bitmap, rather than trying to do it after the fact.

Also, you are starting to see the benefits of incremental backups. Creating S2_3.bak doesn't necessarily need bitmaps (it results in the same image as you would get if you create a temporary overlay [S1 <- S2 <- tmp], copy off S2, then live merge tmp back into S2), but both full4.bak and S2_3.bak had to copy more data than S3.bak. Later on, if you want to roll back to what the guest saw at T4, you just have to restore [S1.bak <- S2.bak <- S3.bak] as your backing chain to provide the data the guest saw at that time.
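The claims in the diagrams above can be cross-checked with a small overlay model (the cluster maps mirror the |....| pictures; all names are illustrative, not qemu API):

```python
# Verify: (S1.bak <- S2_3.bak) and (S1.bak <- S2.bak <- S3.bak) rebuild
# the same T4 image, while reading B2's clusters late mixes two points
# in time. Toy model of backing-chain reads.

def overlay(chain, n=8):
    """Apply images oldest-to-newest, like reading through a backing chain."""
    view = ['-'] * n
    for image in chain:
        for cluster, data in image.items():
            view[cluster] = data
    return ''.join(view)

S1_bak   = {0: 'A', 1: 'A', 2: 'A', 3: 'A'}          # |AAAA----|
S2_bak   = {3: 'B', 4: 'B', 5: 'B'}                  # |---BBB--| at T3
S2_3_bak = {0: 'D', 3: 'B', 4: 'B', 5: 'D', 6: 'D'}  # |D--BBDD-| B2|B3
S3_bak   = {0: 'D', 5: 'D', 6: 'D'}                  # |D----DD-| T3..T4

# Both reconstructions match the guest's view at T4:
assert overlay([S1_bak, S2_3_bak]) == 'DAABBDD-'
assert overlay([S1_bak, S2_bak, S3_bak]) == 'DAABBDD-'

# Grabbing B2's clusters (3..5) from the live T4 disk is stale data:
live_T4 = 'DAABBDD-'
stale_B2 = {c: live_T4[c] for c in (3, 4, 5)}
assert overlay([stale_B2]) == '---BBD--'             # neither T2 nor T3
```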
John, are dirty bitmaps implemented in this way in qemu?
The whole point of the libvirt API proposals is to make it possible to create bitmaps in qcow2 images at the point where you are creating incremental backups, so that the next incremental backup can be created using the previous one as its base.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

On 06/26/2018 08:27 PM, Eric Blake wrote:
Let's try to see an example:
T1
- user create new vm marked for incremental backup
- system create base volume (S1)
- system create new dirty bitmap (B1)
Why do you need a dirty bitmap on a brand new system? By definition, if the VM is brand new, every sector that the guest touches will be part of the first incremental backup, which is no different than taking a full backup of every sector? But if it makes life easier by following consistent patterns, I also don't see a problem with creating a first checkpoint at the time an image is first created (my API proposal would allow you to create a domain, start it in the paused state, create a checkpoint, and then resume the guest so that it can start executing).
T2
- user create a snapshot
- dirty bitmap in original snapshot deactivated (B1)
- system create new snapshot (S2)
- system starts new dirty bitmap in the new snapshot (B2)
I'm still worried that interactions between snapshots (where the backing chain grows) and bitmaps may present interesting challenges. But what you are describing here is that the act of creating a snapshot (to enlarge the backing chain) also has the effect of creating a snapshot (a
that should read "also has the effect of creating a checkpoint"

Except that I'm not quite sure how best to handle the interaction between snapshots and checkpoints using existing qemu primitives. Right now, I'm leaning back to the idea that if you have an external backing file (that is, the act of creating a snapshot expanded the disk chain from 'S1' into 'S1 <- S2'), then creating an incremental backup that covers just the disk changes since that point in time is the same as a "sync":"top" copy of the just-created S2 image (no bitmap is needed to track what needs copying) - which works well for qemu writing out the backup file.

But since we are talking about allowing third-party backups (where we provide an NBD export and the client can query which portions are dirty), then using the snapshot as the start point in time would indeed require that we either have a bitmap to expose (that is, we need to create a bitmap as part of the same transaction as creating the external snapshot file), or that we can resynthesize a bitmap based on the clusters allocated in S2 at the time we start the backup operation (that's an operation that I don't see in qemu right now).

And if we DO want to allow external snapshots to automatically behave as checkpoints for use by incremental backups, that makes me wonder if I need to eventually enhance the existing virDomainSnapshotCreateXML() to also accept XML describing a checkpoint to be created simultaneously with the snapshot (the way my proposal already allows creating a checkpoint simultaneously with virDomainBackupBegin()).

Another point that John and I discussed on IRC is that migrating bitmaps still has some design work to figure out. Remember, right now, there are basically three modes of operation regarding storage between source and endpoint of a migration:

1. Storage is shared. As long as qemu flushes the bitmap before inactivating on the source, then activating on the destination can load the bitmap, and everything is fine.
The migration stream does not have to include the bitmaps.

2. Storage is not shared, but the storage is migrated via flags to the migrate command (we're trying to move away from this version) - there, qemu knows that it has to migrate the bitmaps as part of the migration stream.

3. Storage is not shared, and the storage is migrated via NBD (libvirt favors using this version for non-shared storage). Libvirt starts 'qemu -S' on the destination, pre-creates a destination file large enough to match the source, starts an NBD server at the destination, then on the source starts a block-mirror operation to the destination. When the drive is mirrored, libvirt then kicks off the migration using the same command as in style 1; when all state is transferred, the source then stops the block-mirror, disconnects the NBD client, the destination then stops the NBD server, and the destination can finally start executing. But note that in this mode, no bitmaps are migrated. So we need some way for libvirt to also migrate bitmap state to the destination (perhaps having the NBD server open multiple exports - one for the block itself, but another export for each bitmap that needs to be copied).

At this point, I think the pressure is on me to provide a working demo of incremental backups working without any external snapshots or migration, before we expand into figuring out interactions between features.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

Upcoming patches will add support for incremental backups via a new API; but first, we need a landing page that gives an overview of capturing various pieces of guest state, and which APIs are best suited to which tasks.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/docs.html.in               |   5 ++
 docs/domainstatecapture.html.in | 190 ++++++++++++++++++++++++++++++++++++++++
 docs/formatsnapshot.html.in     |   2 +
 3 files changed, 197 insertions(+)
 create mode 100644 docs/domainstatecapture.html.in

diff --git a/docs/docs.html.in b/docs/docs.html.in
index 40e0e3b82e..4c46b74980 100644
--- a/docs/docs.html.in
+++ b/docs/docs.html.in
@@ -120,6 +120,11 @@
       <dt><a href="secureusage.html">Secure usage</a></dt>
       <dd>Secure usage of the libvirt APIs</dd>
+
+      <dt><a href="domainstatecapture.html">Domain state
+        capture</a></dt>
+      <dd>Comparison between different methods of capturing domain
+        state</dd>
     </dl>
   </div>

diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in
new file mode 100644
index 0000000000..00ab7e8ee1
--- /dev/null
+++ b/docs/domainstatecapture.html.in
@@ -0,0 +1,190 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml">
+  <body>
+
+    <h1>Domain state capture using Libvirt</h1>
+
+    <ul id="toc"></ul>
+
+    <p>
+      This page compares the different means for capturing state
+      related to a domain managed by libvirt, in order to aid
+      application developers to choose which operations best suit
+      their needs.
+    </p>
+
+    <h2><a id="definitions">State capture trade-offs</a></h2>
+
+    <p>One of the features made possible with virtual machines is live
+      migration, or transferring all state related to the guest from
+      one host to another, with minimal interruption to the guest's
+      activity.
+      A clever observer will then note that if all state is
+      available for live migration, there is nothing stopping a user
+      from saving that state at a given point of time, to be able to
+      later rewind guest execution back to the state it previously
+      had. There are several different libvirt APIs associated with
+      capturing the state of a guest, such that the captured state can
+      later be used to rewind that guest to the conditions it was in
+      earlier. But since there are multiple APIs, it is best to
+      understand the tradeoffs and differences between them, in order
+      to choose the best API for a given task.
+    </p>
+
+    <dl>
+      <dt>Timing</dt>
+      <dd>Capturing state can be a lengthy process, so while the
+        captured state ideally represents an atomic point in time
+        corresponding to something the guest was actually executing,
+        some interfaces require up-front preparation (the state
+        captured is not complete until the API ends, which may be some
+        time after the command was first started), while other
+        interfaces track the state when the command was first issued
+        even if it takes some time to finish capturing the state.
+        While it is possible to freeze guest I/O around either point
+        in time (so that the captured state is fully consistent,
+        rather than just crash-consistent), knowing whether the state
+        is captured at the start or end of the command may determine
+        which approach to use. A related concept is the amount of
+        downtime the guest will experience during the capture,
+        particularly since freezing guest I/O has time
+        constraints.</dd>
+
+      <dt>Amount of state</dt>
+      <dd>For an offline guest, only the contents of the guest disks
+        needs to be captured; restoring that state is merely a fresh
+        boot with the disks restored to that state.
But for an online + guest, there is a choice between storing the guest's memory + (all that is needed during live migration where the storage is + shared between source and destination), the guest's disk state + (all that is needed if there are no pending guest I/O + transactions that would be lost without the corresponding + memory state), or both together. Unless guest I/O is quiesced + prior to capturing state, then reverting to captured disk + state of a live guest without the corresponding memory state + is comparable to booting a machine that previously lost power + without a clean shutdown; but for a guest that uses + appropriate journaling methods, this crash-consistent state + may be sufficient to avoid the additional storage and time + needed to capture memory state.</dd> + + <dt>Quantity of files</dt> + <dd>When capturing state, some approaches store all state within + the same file (internal), while others expand a chain of + related files that must be used together (external), for more + files that a management application must track. There are + also differences depending on whether the state is captured in + the same file in use by a running guest, or whether the state + is captured to a distinct file without impacting the files + used to run the guest.</dd> + + <dt>Third-party integration</dt> + <dd>When capturing state, particularly for a running guest, there are + tradeoffs to how much of the process must be done directly by + the hypervisor, and how much can be off-loaded to third-party + software. Since capturing state is not instantaneous, it is + essential that any third-party integration see consistent data + even if the running guest continues to modify that data after + the point in time of the capture.</dd> + + <dt>Full vs.
partial</dt> + <dd>When capturing state, it is useful to minimize the amount of + state that must be captured in relation to a previous capture, + by focusing only on the portions of the disk that the guest + has modified since the previous capture. Some approaches are + able to take advantage of checkpoints to provide an + incremental backup, while others are only capable of a full + backup including portions of the disk that have not changed + since the previous state capture.</dd> + </dl> + + <h2><a id="apis">State capture APIs</a></h2> + <p>With those definitions, the following libvirt APIs have these + properties:</p> + <dl> + <dt>virDomainSnapshotCreateXML()</dt> + <dd>This API wraps several approaches for capturing guest state, + with a general premise of creating a snapshot (where the + current guest resources are frozen in time and a new wrapper + layer is opened for tracking subsequent guest changes). It + can operate on both offline and running guests, can choose + whether to capture the state of memory, disk, or both when + used on a running guest, and can choose between internal and + external storage for captured state. However, it is geared + towards post-event captures (when capturing both memory and + disk state, the disk state is not captured until all memory + state has been collected first). For qemu as the hypervisor, + internal snapshots currently have lengthy downtime that is + incompatible with freezing guest I/O, but external snapshots + are quick. Since creating an external snapshot changes which + disk image resource is in use by the guest, this API can be + coupled with <code>virDomainBlockCommit()</code> to restore + things back to the guest using its original disk image, where + a third-party tool can read the backing file prior to the live + commit. 
See also the <a href="formatsnapshot.html">XML + details</a> used with this command.</dd> + <dt>virDomainBlockCopy()</dt> + <dd>This API wraps approaches for capturing the state of disks + of a running guest, but does not track accompanying guest + memory state, and can only operate on one block device per job + (to get a consistent copy of multiple disks, the domain must + be paused before ending the multiple jobs). The capture is + consistent only at the end of the operation, with a choice to + either pivot to the new file that contains the copy (leaving + the old file as the backup), or to return to the original file + (leaving the new file as the backup).</dd> + <dt>virDomainBackupBegin()</dt> + <dd>This API wraps approaches for capturing the state of disks + of a running guest, but does not track accompanying guest + memory state. The capture is consistent to the start of the + operation, where the captured state is stored independently + from the disk image in use with the guest, and where it can be + easily integrated with a third-party for capturing the disk + state. Since the backup operation is stored externally from + the guest resources, there is no need to commit data back in + at the completion of the operation. When coupled with + checkpoints, this can be used to capture incremental backups + instead of full.</dd> + <dt>virDomainCheckpointCreateXML()</dt> + <dd>This API does not actually capture guest state, so much as + make it possible to track which portions of guest disks have + changed between checkpoints or between a current checkpoint and + the live execution of the guest. When performing incremental + backups, it is easier to create a new checkpoint at the same + time as a new backup, so that the next incremental backup can + refer to the incremental state since the checkpoint created + during the current backup. Guest state is then actually + captured using <code>virDomainBackupBegin()</code>.
<!--See also + the <a href="formatcheckpoint.html">XML details</a> used with + this command.--></dd> + </dl> + + <h2><a id="examples">Examples</a></h2> + <p>The following two sequences both capture the disk state of a + running guest, then complete with the guest running on its + original disk image; but with a difference that an unexpected + interruption during the first mode leaves a temporary wrapper + file that must be accounted for, while interruption of the + second mode has no impact to the guest.</p> + <p>1. Backup via temporary snapshot + <pre> +virDomainFSFreeze() +virDomainSnapshotCreateXML(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) +virDomainFSThaw() +third-party copy the backing file to backup storage # most time spent here +virDomainBlockCommit(VIR_DOMAIN_BLOCK_COMMIT_ACTIVE) per disk +wait for commit ready event per disk +virDomainBlockJobAbort() per disk + </pre></p> + + <p>2. Direct backup + <pre> +virDomainFSFreeze() +virDomainBackupBegin() +virDomainFSThaw() +wait for push mode event, or pull data over NBD # most time spent here +virDomainBackupEnd() + </pre></p> + + </body> +</html> diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in index f2e51df5ab..d7051683a5 100644 --- a/docs/formatsnapshot.html.in +++ b/docs/formatsnapshot.html.in @@ -9,6 +9,8 @@ <h2><a id="SnapshotAttributes">Snapshot XML</a></h2> <p> + Snapshots are one form + of <a href="domainstatecapture.html">domain state capture</a>. There are several types of snapshots: </p> <dl> -- 2.14.4
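For concreteness, the two example sequences can be sketched as ordered call traces. Everything below is a stand-in stub that merely records call order (no real libvirt bindings are invoked; virDomainBackupBegin()/virDomainBackupEnd() are not yet in any released libvirt, so all names are taken from the pseudo-sequences above rather than a working API):

```python
# Stand-in stubs for the calls named in the two example sequences above;
# each stub just records its name so the ordering can be inspected.
calls = []

def record(name):
    calls.append(name)

def backup_via_temporary_snapshot(disks):
    """Sequence 1: disk-only snapshot, third-party copy, then commit back."""
    record("virDomainFSFreeze")
    record("virDomainSnapshotCreateXML(DISK_ONLY)")
    record("virDomainFSThaw")              # guest frozen only around the snapshot
    record("third-party copy")             # most time spent here
    for disk in disks:
        record("virDomainBlockCommit(" + disk + ")")
        record("wait for commit ready event: " + disk)
        record("virDomainBlockJobAbort(" + disk + ")")  # pivot back to original image

def direct_backup():
    """Sequence 2: backup job, no temporary wrapper file to clean up."""
    record("virDomainFSFreeze")
    record("virDomainBackupBegin")
    record("virDomainFSThaw")
    record("wait for push event / pull over NBD")       # most time spent here
    record("virDomainBackupEnd")

backup_via_temporary_snapshot(["vda", "vdb"])
direct_backup()
```

The trace makes the key difference visible: sequence 1 has per-disk cleanup steps after the copy, while sequence 2 ends with a single call and leaves no wrapper file behind if interrupted.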

On 06/13/2018 12:42 PM, Eric Blake wrote:
Upcoming patches will add support for incremental backups via a new API; but first, we need a landing page that gives an overview of capturing various pieces of guest state, and which APIs are best suited to which tasks.
Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/docs.html.in | 5 ++ docs/domainstatecapture.html.in | 190 ++++++++++++++++++++++++++++++++++++++++ docs/formatsnapshot.html.in | 2 + 3 files changed, 197 insertions(+) create mode 100644 docs/domainstatecapture.html.in
This got a lot messier than originally intended. As noted in my response for .1 - I haven't really followed the discussions thus far - so take it with that viewpoint - someone from outside the current discussion trying to make sense of what this topic is all about.
diff --git a/docs/docs.html.in b/docs/docs.html.in index 40e0e3b82e..4c46b74980 100644 --- a/docs/docs.html.in +++ b/docs/docs.html.in @@ -120,6 +120,11 @@
<dt><a href="secureusage.html">Secure usage</a></dt> <dd>Secure usage of the libvirt APIs</dd> + + <dt><a href="domainstatecapture.html">Domain state + capture</a></dt> + <dd>Comparison between different methods of capturing domain + state</dd> </dl> </div>
diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in new file mode 100644 index 0000000000..00ab7e8ee1 --- /dev/null +++ b/docs/domainstatecapture.html.in @@ -0,0 +1,190 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html> +<html xmlns="http://www.w3.org/1999/xhtml"> + <body> + + <h1>Domain state capture using Libvirt</h1> + + <ul id="toc"></ul> + + <p> + This page compares the different means for capturing state + related to a domain managed by libvirt, in order to aid + application developers to choose which operations best suit + their needs.
I would alter the sentence at the comma... IOW: In order to aid ... their needs, this page compares ... by libvirt. Then rather than discussing this below - I think we really need to state right at the top the following: </p> <p> The information here is primarily geared towards capturing the state of the active domain. Capturing state of an inactive domain essentially only requires the contents of the guest disks and then restoring that state is merely a fresh boot with the disks restored to that state. There are aspects of the subsequent functionality that cover the inactive state collection, but it's not the primary focus.
+ </p> + + <h2><a id="definitions">State capture trade-offs</a></h2> + + <p>One of the features made possible with virtual machines is live + migration, or transferring all state related to the guest from + one host to another, with minimal interruption to the guest's
to me the commas are unnecessary.
+ activity. A clever observer will then note that if all state is
s/activity./activity. In this case, state includes domain memory including the current instruction stream and domain storage, whether that is local virtual disks which are not present on a target host or networked storage being updated by the local hypervisor. A clever... [BTW: In rereading my response - I almost want to add - "As it relates to domain checkpoints and backups, state only includes disk state change.". However, I'm not sure if that ties in yet or not. I think it only matters for the two new API's being discussed.]
+ available for live migration, there is nothing stopping a user
, then there is...
+ from saving that state at a given point of time, to be able to
s/,/ in order/
+ later rewind guest execution back to the state it previously + had. There are several different libvirt APIs associated with
[BTW: The following includes something else I pulled up from the list below...] s/had. /had. The astute reader will also realize that state capture at any level requires that the data must be stored and managed by some mechanism. This processing may be to a single file or some set of chained files. This is the inflection point between where Libvirt would (could, should?) integrate with third party tools that are built around managing the volume of data possibly generated by multiple domains with multiple disks. This leaves the task of synchronizing the capture algorithms to Libvirt in order to be able to work seamlessly with the underlying hypervisor. <paragraph break> There are several libvirt APIs associated with ... (different is superfluous)
+ capturing the state of a guest, such that the captured state can
s/, such that the captured state/which
later be used to rewind that guest to the conditions it was in + earlier. But since there are multiple APIs, it is best to + understand the tradeoffs and differences between them, in order + to choose the best API for a given task.
s/But since ... given task./The following is a list of trade-offs and differences between the various facets that affect capturing domain state for active domains:/
+ </p> + + <dl> + <dt>Timing</dt>
"Data Completeness" (or Integrity)
+ <dd>Capturing state can be a lengthy process, so while the + captured state ideally represents an atomic point in time + correpsonding to something the guest was actually executing,
corresponding
+ some interfaces require up-front preparation (the state
s/preparation (the.../preparation. The ...
+ captured is not complete until the API ends, which may be some + time after the command was first started), while other
s/started), while .../started. While .../
+ interfaces track the state when the command was first issued + even if it takes some time to finish capturing the state.
Feels like a paragraph break... Or even a whole new bullet: <dt>Quiescing of Data</dt>
+ While it is possible to freeze guest I/O around either point + in time (so that the captured state is fully consistent,
s/around either point in time/at any point in time/
+ rather than just crash-consistent), knowing whether the state + is captured at the start or end of the command may determine + which approach to use. A related concept is the amount of + downtime the guest will experience during the capture, + particularly since freezing guest I/O has time + constraints.</dd>
That last sentence: Freezing guest I/O can be problematic depending on what the guest's expectations are and the duration of the freeze. Some software will rightfully panic once it is given the chance to realize it lost some number of seconds. In general though long pauses are unacceptable, so reducing the time spent frozen is a goal of management software. Still the balance for data integrity and completeness does require some amount of time.
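The "reducing the time spent frozen" goal above can be sketched as a management-side guard. This is a toy sketch under an assumed 5-second freeze budget; freeze/capture_point/thaw are hypothetical stand-ins for virDomainFSFreeze(), whatever quick operation marks the capture point (e.g. snapshot creation, not the bulk copy), and virDomainFSThaw():

```python
import time

# Assumed budget; real limits depend on what the guest's software tolerates.
MAX_FROZEN_SECONDS = 5.0

def capture_with_bounded_freeze(freeze, capture_point, thaw):
    """Freeze guest I/O only around the quick capture trigger, and always thaw."""
    freeze()
    start = time.monotonic()
    try:
        capture_point()   # must be fast; the lengthy data copy happens after thaw
    finally:
        thaw()            # thaw even if the capture trigger fails
    frozen_for = time.monotonic() - start
    if frozen_for > MAX_FROZEN_SECONDS:
        raise RuntimeError("guest was frozen %.1fs, over budget" % frozen_for)
    return frozen_for
```

The try/finally is the important part: a guest left frozen because the capture call raised an error is exactly the "rightfully panic" scenario described above.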
+ + <dt>Amount of state</dt>
I pulled part of this to earlier - let's face it, the offline guest is not the focus of this work, so I think it's only worth discussing or noting that an API can handle active or inactive guests.
+ <dd>For an offline guest, only the contents of the guest disks + needs to be captured; restoring that state is merely a fresh + boot with the disks restored to that state. But for an online
<dt>Memory State only, Disk change only, or Both</dt>
+ guest, there is a choice between storing the guest's memory + (all that is needed during live migration where the storage is + shared between source and destination), the guest's disk state + (all that is needed if there are no pending guest I/O + transactions that would be lost without the corresponding + memory state), or both together. Unless guest I/O is quiesced + prior to capturing state, then reverting to captured disk + state of a live guest without the corresponding memory state + is comparable to booting a machine that previously lost power + without a clean shutdown; but for a guest that uses + appropriate journaling methods, this crash-consistent state + may be sufficient to avoid the additional storage and time + needed to capture memory state.</dd> + + <dt>Quantity of files</dt>
I ended up essentially moving this concept prior to the list of facets. I suppose it could go here too, but I'm not sure this ends up being so much of a tradeoff as opposed to an overall design decision made when deciding to implement some sort of backup solution. In the long run, the management software "doesn't care" where or how the data is stored - it's just providing the data.
+ <dd>When capturing state, some approaches store all state within + the same file (internal), while others expand a chain of + related files that must be used together (external), for more + files that a management application must track. There are + also differences depending on whether the state is captured in + the same file in use by a running guest, or whether the state + is captured to a distinct file without impacting the files + used to run the guest.</dd>
That last sentence could almost be its own bullet "Impact to Active State". Libvirt already captures active state as part of normal processing by updates to the domain's active XML.
+ + <dt>Third-party integration</dt>
So again, this isn't a facet affecting capturing state - it seems to be more of a statement related to the reality of what Libvirt can/should be expected to do vs. being able to allow configurability for external forces to make certain decisions.
+ <dd>When capturing state, particularly for a running, there are + tradeoffs to how much of the process must be done directly by + the hypervisor, and how much can be off-loaded to third-party + software. Since capturing state is not instantaneous, it is + essential that any third-party integration see consistent data + even if the running guest continues to modify that data after + the point in time of the capture.</dd> + + <dt>Full vs. partial</dt>
How about Full vs. Incremental
+ <dd>When capturing state, it is useful to minimize the amount of + state that must be captured in relation to a previous capture, + by focusing only on the portions of the disk that the guest + has modified since the previous capture. Some approaches are + able to take advantage of checkpoints to provide an + incremental backup, while others are only capable of a full + backup including portions of the disk that have not changed + since the previous state capture.</dd>
Is there a "downside" to the time needed for ma[r]king the "first checkpoint"? Or do we dictate/assume that someone has some sort of backup already before starting the domain and updates thereafter are all incremental. FWIW: A couple of others that come to mind that are facets: <dt>Local or Remote Storage</dt> Domains that completely use remote storage may only need some mechanism to keep track of the guest memory state while using some external means to manage/track the remote storage. Still even with that, it's possible that the hypervisor has I/O's "in flight" to the network storage that could be "important data" in the big picture. So having the capability to have the management software keeping track of all disk state can be important. <dt>Network Latency</dt> Whether it's domain storage or the saving of the domain data into some remote storage, network latency has an impact on snapshot data. Having dedicated network capacity/bandwidth and/or properly set quality of service certainly helps.
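To make the full-vs-incremental distinction above concrete, here is a toy model of change tracking (not libvirt's or qemu's implementation, where dirty bitmaps live in the hypervisor): before the first checkpoint every block counts as changed, so the first capture is necessarily a full one, and each later checkpoint hands back only the blocks dirtied since the previous one:

```python
# Toy dirty-block tracker illustrating checkpoints for incremental backup.
class Disk:
    def __init__(self, nblocks):
        self.data = [0] * nblocks
        self.dirty = set(range(nblocks))  # everything is "new" before any checkpoint

    def write(self, block, value):
        self.data[block] = value
        self.dirty.add(block)             # hypervisor would flip a bitmap bit here

    def checkpoint(self):
        """Start a new tracking epoch; return blocks dirtied since the last one."""
        changed, self.dirty = self.dirty, set()
        return changed

disk = Disk(8)
full = disk.checkpoint()          # first capture must copy all 8 blocks
disk.write(2, 42)
disk.write(5, 7)
incremental = disk.checkpoint()   # later captures copy only the dirtied blocks
```

This also answers the "first checkpoint" question in one sense: the first checkpoint itself is cheap to mark, but the backup paired with it is unavoidably a full one.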
+ </dl> +
Perhaps in some way providing an example of a migration vs. pure save could be helpful - using of course the various facets. Something like: An example of the various facets in action is migration of a running guest. In order for the guest to be able to start on a target from whence it left off on a source, the guest has to get to a point where execution on the source is stopped, the last remaining changes occurring since the migration started are then transferred, and the guest is started on the target. The management software thus must keep track of the starting point and any changes since the starting point. These last changes are often referred to as dirty page tracking or dirty disk block bitmaps. At some point in time during the migration, the management software must freeze the source guest, transfer the dirty data, and then start the guest on the target. This period of time must be minimal. To minimize overall migration time, one is advised to use a dedicated network connection with a high quality of service. Alternatively saving the current state of the running guest can just be a point in time type operation which doesn't require updating the "last vestiges" of state prior to writing out the saved state file. The state file is the point in time of whatever is current and may contain incomplete data which if used to restart the guest could cause confusion or problems because some operation wasn't completed depending upon where in time the operation was commenced.
+ <h2><a id="apis">State capture APIs</a></h2> + <p>With those definitions, the following libvirt APIs have these + properties:</p>
Do you think perhaps it may be a good idea to list the pros and cons of each of the APIs? As in, why someone would want to use one over another? and of course which work together...
+ <dl> + <dt>virDomainSnapshotCreateXML()</dt> + <dd>This API wraps several approaches for capturing guest state, + with a general premise of creating a snapshot (where the + current guest resources are frozen in time and a new wrapper + layer is opened for tracking subsequent guest changes). It + can operate on both offline and running guests, can choose + whether to capture the state of memory, disk, or both when + used on a running guest, and can choose between internal and + external storage for captured state. However, it is geared + towards post-event captures (when capturing both memory and + disk state, the disk state is not captured until all memory + state has been collected first). For qemu as the hypervisor,
s/For qemu/Using QEMU/
+ internal snapshots currently have lengthy downtime that is + incompatible with freezing guest I/O, but external snapshots + are quick. Since creating an external snapshot changes which + disk image resource is in use by the guest, this API can be + coupled with <code>virDomainBlockCommit()</code> to restore + things back to the guest using its original disk image, where + a third-party tool can read the backing file prior to the live + commit. See also the <a href="formatsnapshot.html">XML + details</a> used with this command.</dd>
Some random grumbling from me about the complexity of SnapshotCreateXML interacting with BlockCommit ;-)... Still, should perhaps the Block API's be listed first to create a grounding of what they do before they're discussed as part of this Snapshot section? Of course all that without getting in the gnarly details of using Block APIs. The SnapshotCreateXML API is a "complex maze" of flag usage where it's important to understand the nuances between active/inactive, internal/external, memory/disk/both, and reversion to the point in time.
+ <dt>virDomainBlockCopy()</dt> + <dd>This API wraps approaches for capturing the state of disks + of a running guest, but does not track accompanying guest + memory state, and can only operate on one block device per job
s/, and/ and/ realistically the part about tracking guest memory state is probably unnecessary.
+ (to get a consistent copy of multiple disks, the domain must + be paused before ending the multiple jobs). The capture is
s/job (to .../job. To .../ s/jobs)./jobs./
+ consistent only at the end of the operation, with a choice to
s/, with/ with/
+ either pivot to the new file that contains the copy (leaving + the old file as the backup), or to return to the original file
s/), or/) or/
+ (leaving the new file as the backup).</dd>
s/(// s/)//

The next two aren't even introduced yet... But I'd probably also want to know about virDomain{Save|ManagedSave[Image]} which, while not snapshot level type API's, can be used to save domain state information. And since we discussed quiesce and freeze/thaw above, should virDomainFS{Freeze|Thaw} be discussed - if only to note their usage?

From my perspective, the next two should be added once the two API's are introduced. Still, for grounding this was a good introduction. I think the order should be changed, since before performing an incremental backup is possible there must be some way to set when time begins.
+ <dt>virDomainBackupBegin()</dt> + <dd>This API wraps approaches for capturing the state of disks + of a running guest, but does not track accompanying guest + memory state. The capture is consistent to the start of the + operation, where the captured state is stored independently + from the disk image in use with the guest, and where it can be
s/, and/ and/
+ easily integrated with a third-party for capturing the disk + state. Since the backup operation is stored externally from + the guest resources, there is no need to commit data back in + at the completion of the operation. When coupled with + checkpoints, this can be used to capture incremental backups + instead of full.</dd> + <dt>virDomainCheckpointCreateXML()</dt> + <dd>This API does not actually capture guest state, so much as + make it possible to track which portions of guest disks have
s/, so much as make/, rather it makes/
+ change between checkpoints or between a current checkpoint and
changed?
+ the live execution of the guest. When performing incremental + backups, it is easier to create a new checkpoint at the same + time as a new backup, so that the next incremental backup can + refer to the incremental state since the checkpoint created + during the current backup. Guest state is then actually
The "When performing ... current backup" description is a bit confusing to read for me. So do I create the Checkpoint before or after the Backup, or both? Hard to be simultaneous - one comes first.
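One way to read the intended pairing (a sketch of the apparent design only, with all names hypothetical, not the actual libvirt API): the checkpoint is created together with the backup job, so each backup both consumes the previous checkpoint and produces the one the next incremental backup will reference:

```python
# Toy model of pairing checkpoint creation with backup start.
class BackupManager:
    def __init__(self):
        self.checkpoints = []

    def backup_begin(self):
        """Start a backup; atomically create the checkpoint for the next one."""
        since = self.checkpoints[-1] if self.checkpoints else None
        new_cp = "cp%d" % len(self.checkpoints)
        self.checkpoints.append(new_cp)   # created with the backup job, not after it
        # since is None -> full backup; otherwise incremental since `since`
        kind = "full" if since is None else "incremental-since-" + since
        return (kind, new_cp)

mgr = BackupManager()
first = mgr.backup_begin()    # ("full", "cp0")
second = mgr.backup_begin()   # ("incremental-since-cp0", "cp1")
```

So neither strictly "before" nor "after": in this reading the checkpoint and the backup job begin as one operation, which is what makes the chain of incrementals line up.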
+ captured using <code>virDomainBackupBegin()</code>. <!--See also + the <a href="formatcheckpoint.html">XML details</a> used with + this command.--></dd>
This reliance on one another gets really confusing... Using the term "Guest state" for Checkpoint to mean something different than Snapshot makes for difficult comprehension in the grand scheme of domain management. Of slight concern is that there's nothing in the Checkpoint or Backup API naming scheme that says this is for disk/block only. There's also nothing that would cause me to think they are related without reading their descriptions.
+ </dl> + + <h2><a id="examples">Examples</a></h2> + <p>The following two sequences both capture the disk state of a + running guest, then complete with the guest running on its + original disk image; but with a difference that an unexpected + interruption during the first mode leaves a temporary wrapper + file that must be accounted for, while interruption of the + second mode has no impact to the guest.</p> + <p>1. Backup via temporary snapshot + <pre> +virDomainFSFreeze() +virDomainSnapshotCreateXML(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) +virDomainFSThaw() +third-party copy the backing file to backup storage # most time spent here +virDomainBlockCommit(VIR_DOMAIN_BLOCK_COMMIT_ACTIVE) per disk +wait for commit ready event per disk +virDomainBlockJobAbort() per disk + </pre></p> + + <p>2. Direct backup + <pre> +virDomainFSFreeze() +virDomainBackupBegin() +virDomainFSThaw() +wait for push mode event, or pull data over NBD # most time spent here +virDomainBackeupEnd() + </pre></p> + + </body> +</html>
The examples certainly need more beef w/r/t description. Personally, I like the fact that we're heavily documenting things before writing the code rather than the opposite direction. John
diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in index f2e51df5ab..d7051683a5 100644 --- a/docs/formatsnapshot.html.in +++ b/docs/formatsnapshot.html.in @@ -9,6 +9,8 @@ <h2><a id="SnapshotAttributes">Snapshot XML</a></h2>
<p> + Snapshots are one form + of <a href="domainstatecapture.html">domain state capture</a>. There are several types of snapshots: </p> <dl>

On Wed, Jun 13, 2018 at 7:42 PM Eric Blake <eblake@redhat.com> wrote:
Upcoming patches will add support for incremental backups via a new API; but first, we need a landing page that gives an overview of capturing various pieces of guest state, and which APIs are best suited to which tasks.
Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/docs.html.in | 5 ++ docs/domainstatecapture.html.in | 190 ++++++++++++++++++++++++++++++++++++++++ docs/formatsnapshot.html.in | 2 + 3 files changed, 197 insertions(+) create mode 100644 docs/domainstatecapture.html.in
diff --git a/docs/docs.html.in b/docs/docs.html.in index 40e0e3b82e..4c46b74980 100644 --- a/docs/docs.html.in +++ b/docs/docs.html.in @@ -120,6 +120,11 @@
<dt><a href="secureusage.html">Secure usage</a></dt> <dd>Secure usage of the libvirt APIs</dd> + + <dt><a href="domainstatecapture.html">Domain state + capture</a></dt> + <dd>Comparison between different methods of capturing domain + state</dd> </dl> </div>
diff --git a/docs/domainstatecapture.html.in b/docs/ domainstatecapture.html.in new file mode 100644 index 0000000000..00ab7e8ee1 --- /dev/null +++ b/docs/domainstatecapture.html.in @@ -0,0 +1,190 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html> +<html xmlns="http://www.w3.org/1999/xhtml"> + <body> + + <h1>Domain state capture using Libvirt</h1> + + <ul id="toc"></ul> + + <p> + This page compares the different means for capturing state + related to a domain managed by libvirt, in order to aid + application developers to choose which operations best suit + their needs. + </p> + + <h2><a id="definitions">State capture trade-offs</a></h2> + + <p>One of the features made possible with virtual machines is live + migration, or transferring all state related to the guest from + one host to another, with minimal interruption to the guest's + activity. A clever observer will then note that if all state is + available for live migration, there is nothing stopping a user + from saving that state at a given point of time, to be able to + later rewind guest execution back to the state it previously + had. There are several different libvirt APIs associated with + capturing the state of a guest, such that the captured state can + later be used to rewind that guest to the conditions it was in + earlier. But since there are multiple APIs, it is best to + understand the tradeoffs and differences between them, in order + to choose the best API for a given task. + </p> + + <dl> + <dt>Timing</dt> + <dd>Capturing state can be a lengthy process, so while the + captured state ideally represents an atomic point in time + correpsonding to something the guest was actually executing, + some interfaces require up-front preparation (the state + captured is not complete until the API ends, which may be some + time after the command was first started), while other + interfaces track the state when the command was first issued + even if it takes some time to finish capturing the state. 
+ While it is possible to freeze guest I/O around either point + in time (so that the captured state is fully consistent, + rather than just crash-consistent), knowing whether the state + is captured at the start or end of the command may determine + which approach to use. A related concept is the amount of + downtime the guest will experience during the capture, + particularly since freezing guest I/O has time + constraints.</dd> + + <dt>Amount of state</dt> + <dd>For an offline guest, only the contents of the guest disks + needs to be captured; restoring that state is merely a fresh + boot with the disks restored to that state. But for an online + guest, there is a choice between storing the guest's memory + (all that is needed during live migration where the storage is + shared between source and destination), the guest's disk state + (all that is needed if there are no pending guest I/O + transactions that would be lost without the corresponding + memory state), or both together. Unless guest I/O is quiesced + prior to capturing state, then reverting to captured disk + state of a live guest without the corresponding memory state + is comparable to booting a machine that previously lost power + without a clean shutdown; but for a guest that uses + appropriate journaling methods, this crash-consistent state + may be sufficient to avoid the additional storage and time + needed to capture memory state.</dd> + + <dt>Quantity of files</dt> + <dd>When capturing state, some approaches store all state within + the same file (internal), while others expand a chain of + related files that must be used together (external), for more + files that a management application must track. 
There are + also differences depending on whether the state is captured in + the same file in use by a running guest, or whether the state + is captured to a distinct file without impacting the files + used to run the guest.</dd> + + <dt>Third-party integration</dt> + <dd>When capturing state, particularly for a running, there are + tradeoffs to how much of the process must be done directly by + the hypervisor, and how much can be off-loaded to third-party + software. Since capturing state is not instantaneous, it is + essential that any third-party integration see consistent data + even if the running guest continues to modify that data after + the point in time of the capture.</dd> + + <dt>Full vs. partial</dt> + <dd>When capturing state, it is useful to minimize the amount of + state that must be captured in relation to a previous capture, + by focusing only on the portions of the disk that the guest + has modified since the previous capture. Some approaches are + able to take advantage of checkpoints to provide an + incremental backup, while others are only capable of a full + backup including portions of the disk that have not changed + since the previous state capture.</dd> + </dl> + + <h2><a id="apis">State capture APIs</a></h2> + <p>With those definitions, the following libvirt APIs have these + properties:</p> + <dl> + <dt>virDomainSnapshotCreateXML()</dt> + <dd>This API wraps several approaches for capturing guest state, + with a general premise of creating a snapshot (where the + current guest resources are frozen in time and a new wrapper + layer is opened for tracking subsequent guest changes). It + can operate on both offline and running guests, can choose + whether to capture the state of memory, disk, or both when + used on a running guest, and can choose between internal and + external storage for captured state. 
However, it is geared + towards post-event captures (when capturing both memory and + disk state, the disk state is not captured until all memory + state has been collected first). For qemu as the hypervisor, + internal snapshots currently have lengthy downtime that is + incompatible with freezing guest I/O, but external snapshots + are quick. Since creating an external snapshot changes which + disk image resource is in use by the guest, this API can be + coupled with <code>virDomainBlockCommit()</code> to restore + things back to the guest using its original disk image, where + a third-party tool can read the backing file prior to the live + commit. See also the <a href="formatsnapshot.html">XML + details</a> used with this command.</dd>
Needs blank line between list items for easier reading of the source.
+ <dt>virDomainBlockCopy()</dt> + <dd>This API wraps approaches for capturing the state of disks + of a running guest, but does not track accompanying guest + memory state, and can only operate on one block device per job + (to get a consistent copy of multiple disks, the domain must + be paused before ending the multiple jobs). The capture is + consistent only at the end of the operation, with a choice to + either pivot to the new file that contains the copy (leaving + the old file as the backup), or to return to the original file + (leaving the new file as the backup).</dd>
+ <dt>virDomainBackupBegin()</dt> + <dd>This API wraps approaches for capturing the state of disks + of a running guest, but does not track accompanying guest + memory state. The capture is consistent to the start of the + operation, where the captured state is stored independently + from the disk image in use with the guest, and where it can be + easily integrated with a third-party tool for capturing the disk + state. Since the backup operation is stored externally from + the guest resources, there is no need to commit data back in + at the completion of the operation. When coupled with + checkpoints, this can be used to capture incremental backups + instead of full backups.</dd>
I think we should describe checkpoints before backups, since the expected flow is:

- user starts backup
- system creates checkpoint using virDomainCheckpointCreateXML
- system queries amount of data pointed to by the previous checkpoint bitmaps
- system creates temporary storage for the backup
- system starts backup using virDomainBackupBegin
+ <dt>virDomainCheckpointCreateXML()</dt> + <dd>This API does not actually capture guest state, so much as + make it possible to track which portions of guest disks have + changed between checkpoints or between a current checkpoint and + the live execution of the guest. When performing incremental + backups, it is easier to create a new checkpoint at the same + time as a new backup, so that the next incremental backup can + refer to the incremental state since the checkpoint created + during the current backup. Guest state is then actually + captured using <code>virDomainBackupBegin()</code>. <!--See also + the <a href="formatcheckpoint.html">XML details</a> used with + this command.--></dd> + </dl> + + <h2><a id="examples">Examples</a></h2> + <p>The following two sequences both capture the disk state of a + running guest, then complete with the guest running on its + original disk image; but with a difference that an unexpected + interruption during the first mode leaves a temporary wrapper + file that must be accounted for, while interruption of the + second mode has no impact on the guest.</p>
This is not clear; I read this several times and I'm not sure what you mean here.

Blank line between paragraphs
+ <p>1. Backup via temporary snapshot
+ <pre>
+virDomainFSFreeze()
+virDomainSnapshotCreateXML(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
+virDomainFSThaw()
+third-party copy the backing file to backup storage # most time spent here
+virDomainBlockCommit(VIR_DOMAIN_BLOCK_COMMIT_ACTIVE) per disk
+wait for commit ready event per disk
+virDomainBlockJobAbort() per disk
+ </pre></p>
I think we should mention virDomainFSFreeze and virDomainFSThaw before these examples, in the same way we mention the other APIs.
+
+ <p>2. Direct backup
+ <pre>
+virDomainFSFreeze()
+virDomainBackupBegin()
+virDomainFSThaw()
+wait for push mode event, or pull data over NBD # most time spent here
+virDomainBackupEnd()
+ </pre></p>
This means that virDomainBackupBegin will create a checkpoint, and libvirt will have to create the temporary storage for the backup (e.g. a disk for the push model, or a temporary snapshot for the pull model). Libvirt will most likely use local storage, which may fail if the host does not have enough local storage. But this may be good enough for many users, so maybe it is good to have this. I think we need to show here the more low-level flow that oVirt will use:

Backup using external temporary storage
- virDomainFSFreeze()
- virDomainCheckpointCreateXML()
- virDomainFSThaw()
- Here oVirt will need to query the checkpoints, to understand how much temporary storage is needed for the backup. I hope we have an API for this (did not read the next patches yet).
- virDomainBackupBegin()
- third party copy data...
- virDomainBackupEnd()
+
+ </body> +</html> diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in index f2e51df5ab..d7051683a5 100644 --- a/docs/formatsnapshot.html.in +++ b/docs/formatsnapshot.html.in @@ -9,6 +9,8 @@ <h2><a id="SnapshotAttributes">Snapshot XML</a></h2>
<p> + Snapshots are one form + of <a href="domainstatecapture.html">domain state capture</a>. There are several types of snapshots: </p> <dl> -- 2.14.4
This is great documentation, showing both the APIs and how they are used together; we need more of this!

Nir

On 06/26/2018 11:36 AM, Nir Soffer wrote:
On Wed, Jun 13, 2018 at 7:42 PM Eric Blake <eblake@redhat.com> wrote:
Upcoming patches will add support for incremental backups via a new API; but first, we need a landing page that gives an overview of capturing various pieces of guest state, and which APIs are best suited to which tasks.
Needs blank line between list items for easier reading of the source.
Sure.
I think we should describe checkpoints before backups, since the expected flow is:
- user starts backup
- system creates checkpoint using virDomainCheckpointCreateXML
- system queries amount of data pointed to by the previous checkpoint bitmaps
- system creates temporary storage for the backup
- system starts backup using virDomainBackupBegin
I actually think it will be more common to create checkpoints via virDomainBackupBegin(), and not virDomainCheckpointCreateXML (the latter exists because it is easy, and may have a use independent from incremental backups, but it is the former that makes chains of incremental backups reliable). That is, your first backup will be a full backup (no checkpoint as its start) but will create a checkpoint at the same time; then your second backup is an incremental backup (use the checkpoint created at the first backup as the start) and also creates a checkpoint in anticipation of a third incremental backup.

You do have an interesting step in there - the ability to query how much data is pointed to in the delta between two checkpoints (that is, before I actually create a backup, can I pre-guess how much data it will end up copying). On the other hand, the size of the temporary storage for the backup is not related to the amount of data tracked in the bitmap. Expanding on the examples in my 1/8 reply to you:

At T3, we have:

S1: |AAAA----| <- S2: |---BBB--|
B1: |XXXX----|
B2: |---XXX--|
guest sees: |AAABBB--|

where by T4 we will have:

S1: |AAAA----| <- S2: |D--BBDD-|
B1: |XXXX----|
B2: |---XXX--|
B3: |X----XX-|
guest sees: |DAABBDD-|

Back at T3, using B2 as our dirty bitmap, there are two backup models we can pursue to get at the data tracked by that bitmap. The first is push-model backup (blockdev-backup with "sync":"top" to the actual backup file) - qemu directly writes the |---BBB--| sequence into the destination file (based on the contents of B2), whether or not S2 is modified in the meantime; in this mode, qemu is smart enough to not bother copying clusters to the destination that were not in the bitmap. So the fact that B2 mentions 3 dirty clusters indeed proves to be the right size for the destination file. 
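The push-model sizing argument above can be sketched as a toy model (plain Python, not qemu or libvirt code): the destination receives exactly the clusters marked in the dirty bitmap, so the bitmap's population count is the destination size.

```python
# Toy model of a push-mode incremental backup: only clusters marked in
# the dirty bitmap are copied, so the destination holds exactly that many
# clusters. Cluster contents are single characters, as in the diagrams.

def push_backup(disk, dirty_bitmap):
    """Return a sparse destination mapping cluster -> point-in-time data."""
    return {cluster: disk[cluster] for cluster in sorted(dirty_bitmap)}

# Disk at time T3 as seen by the guest: |AAABBB--|
disk = list("AAABBB--")
b2 = {3, 4, 5}                # bitmap B2: clusters dirtied since checkpoint
dest = push_backup(disk, b2)
assert dest == {3: "B", 4: "B", 5: "B"}   # 3 dirty clusters -> 3 copied, |---BBB--|
```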
The second is pull-model backup (blockdev-backup with "sync":"none" to a temporary file, coupled with a read-only NBD server on the temporary file that also exposes bitmap B2 via NBD_CMD_BLOCK_STATUS) - here, if qemu can guarantee that the client would read only dirty clusters, then it only has to write to the temporary file if the guest changes a cluster that was tracked in B2 (so at most the temporary file would contain |-----B--| if the NBD client finishes before T4); but more likely, qemu will play conservative and write to the temporary file for ANY changes whether or not they are to areas covered by B2 (in which case the temporary file could contain |A----B0-| for the three writes done by T4). Or put another way, if qemu can guarantee a nice client, then the size of B2 probably overestimates the size of the temporary file; but if qemu plays conservative by assuming the client will read even portions of the file that weren't dirty, then keeping those reads constant will require the temporary file to be as large as the guest is able to dirty data while the backup continues, which may be far larger than the size of B2. [And maybe this argues that we want a way for an NBD export to force EIO read errors for anything outside of the exported dirty bitmap, thus making the client play nice, so that the temporary file does not have to grow beyond the size of the bitmap - but that's a future feature request]
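The pull-model scratch-file growth described above can also be sketched as a toy model (again plain Python, not qemu behavior): the scratch file gets a copy-on-write of a cluster's point-in-time contents before the guest overwrites it, either for every write (conservative) or only for writes to clusters in the bitmap (assuming a nice client that reads only dirty clusters).

```python
# Toy model of the pull-mode scratch file. Before the guest overwrites a
# cluster, its point-in-time contents are preserved in the scratch file:
# "conservative" preserves on every guest write; "nice-client" only for
# clusters in dirty bitmap B2 that the client is allowed to read.

def apply_writes(disk, writes, bitmap, conservative):
    scratch = {}
    for cluster, new in writes:
        if cluster not in scratch and (conservative or cluster in bitmap):
            scratch[cluster] = disk[cluster]   # copy-on-write of old data
        disk[cluster] = new                    # guest write proceeds
    return scratch

b2 = {3, 4, 5}
writes = [(0, "D"), (5, "D"), (6, "D")]        # guest writes between T3 and T4
conservative = apply_writes(list("AAABBB--"), writes, b2, True)
nice = apply_writes(list("AAABBB--"), writes, b2, False)
assert conservative == {0: "A", 5: "B", 6: "-"}  # grows with ALL guest writes
assert nice == {5: "B"}                          # bounded by the bitmap, |-----B--|
```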
+ <h2><a id="examples">Examples</a></h2> + <p>The following two sequences both capture the disk state of a + running guest, then complete with the guest running on its + original disk image; but with a difference that an unexpected + interruption during the first mode leaves a temporary wrapper + file that must be accounted for, while interruption of the + second mode has no impact to the guest.</p>
This is not clear; I read this several times and I'm not sure what you mean here.
I'm trying to convey the point that with example 1...
Blank line between paragraphs
+ <p>1. Backup via temporary snapshot
+ <pre>
+virDomainFSFreeze()
+virDomainSnapshotCreateXML(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
...if you are interrupted here, your <domain> XML has changed to point to the snapshot file...
+virDomainFSThaw()
+third-party copy the backing file to backup storage # most time spent here
+virDomainBlockCommit(VIR_DOMAIN_BLOCK_COMMIT_ACTIVE) per disk
+wait for commit ready event per disk
+virDomainBlockJobAbort() per disk
...and it is not until here that your <domain> XML is back to its pre-backup state. If the backup is interrupted for any reason, you have to manually get things back to the pre-backup layout, whether or not you were able to salvage the backup data.
+ </pre></p>
I think we should mention virDomainFSFreeze and virDomainFSThaw before these examples, in the same way we mention the other APIs.
Can do.
+
+ <p>2. Direct backup
+ <pre>
+virDomainFSFreeze()
+virDomainBackupBegin()
+virDomainFSThaw()
+wait for push mode event, or pull data over NBD # most time spent here
+virDomainBackupEnd()
In this example 2, using the new APIs, the <domain> XML is unchanged through the entire operation. If you interrupt things in the middle, you may have to scrap the backup data as not being viable, but you don't have to do any manual cleanup to get your domain back to the pre-backup layout.
+ </pre></p>
This means that virDomainBackupBegin will create a checkpoint, and libvirt will have to create the temporary storage for the backup (e.g. a disk for the push model, or a temporary snapshot for the pull model). Libvirt will most likely use local storage, which may fail if the host does not have enough local storage.
virDomainBackupBegin() has an optional <disks> XML element - if provided, then YOU can control the files (the destination on push model, ultimately including a remote network destination, such as via NBD, gluster, sheepdog, ...; or the scratch file for pull model, which probably only makes sense locally as the file gets thrown away as soon as the 3rd-party NBD client finishes). Libvirt only generates a filename if you don't provide that level of detail. You're right that the local storage running out of space can be a concern - but also remember that incremental backups are designed to be less invasive than full backups, AND that if one backup fails, you can then kick off another backup using the same starting checkpoint as the one that failed (that is, if libvirt used B1 as its basis for a backup but also created B2 at the same time, you can use virDomainCheckpointDelete to remove B2 by merging the B1/B2 bitmaps back into B1, with B1 once again tracking changes from the previous successful backup to the current point in time).
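The bitmap-merge semantics of deleting checkpoint B2 described above amount to a set union, which a toy model (plain Python, not the libvirt implementation) makes concrete:

```python
# Toy model of virDomainCheckpointDelete merging a checkpoint's bitmap
# into its parent: after deleting B2, the parent bitmap B1 once again
# tracks every cluster changed since the last successful backup.

def delete_checkpoint(parent_bitmap, child_bitmap):
    """Merge the deleted checkpoint's dirty clusters into its parent."""
    return parent_bitmap | child_bitmap

b1 = {0, 1, 2, 3}      # dirtied between the full backup and checkpoint B2
b2 = {3, 4, 5}         # dirtied since checkpoint B2 (the failed backup)
assert delete_checkpoint(b1, b2) == {0, 1, 2, 3, 4, 5}
```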
But this may be good enough for many users, so maybe it is good to have this.
I think we need to show here the more low level flow that oVirt will use:
Backup using external temporary storage
- virDomainFSFreeze()
- virDomainCheckpointCreateXML()
- virDomainFSThaw()
- Here oVirt will need to query the checkpoints, to understand how much temporary storage is needed for the backup. I hope we have an API for this (did not read the next patches yet).
I have not exposed one so far, nor do I know if qemu has that easily available. But since it matters to you, we can make it a priority to add that (and the API would need to be added to libvirt.so at the same time as the other new APIs, whether or not I can make it in time for the freeze at the end of this week).
- virDomainBackupBegin()
- third party copy data...
- virDomainBackupEnd()
Again, note that oVirt will probably NOT call virDomainCheckpointCreateXML() directly, but will instead do:

virDomainFSFreeze();
virDomainBackupBegin(dom, "<domainbackup type='pull'/>",
    "<domaincheckpoint><name>B1</name></domaincheckpoint>", 0);
virDomainFSThaw();
third party copy data
virDomainBackupEnd();

for the first full backup, then for the next incremental backup, do:

virDomainFSFreeze();
virDomainBackupBegin(dom,
    "<domainbackup type='pull'><incremental>B1</incremental></domainbackup>",
    "<domaincheckpoint><name>B2</name></domaincheckpoint>", 0);
virDomainFSThaw();
third party copy data
virDomainBackupEnd();

where you are creating bitmap B2 at the time of the first incremental backup (the second backup overall), and that backup consists of the data changed since the creation of bitmap B1 at the time of the earlier full backup. Then, as I mentioned earlier, the minimal XML forces libvirt to generate filenames (which may or may not match what you want), so you can certainly pass in more verbose XML:

<domainbackup type='pull'>
  <incremental>B1</incremental>
  <server transport='unix' socket='/path/to/server'>
  <disks>
    <disk name='vda' type='block'>
      <scratch dev='/path/to/scratch/dev'>
    </disk>
  </disks>
</domainbackup>

and of course, we'll eventually want TLS thrown in the mix (my initial implementation has completely bypassed that, other than the fact that the <server> element is a great place to stick in the information needed for telling qemu's server to only accept clients that know the right TLS magic). If this example helps, I can flesh out the HTML to give these further insights. And, if wrapping FSFreeze/Thaw is that common, we'll probably want to reach the point where we add VIR_DOMAIN_BACKUP_QUIESCE as a flag argument to automatically do it as part of virDomainBackupBegin().
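The full-then-incremental cycle described above - each backup creating the checkpoint that the next backup starts from - can be sketched as a toy model (plain Python with hypothetical helper names, not the libvirt API):

```python
# Toy model of the backup chain: each backup copies the clusters in the
# active dirty bitmap, then starts a fresh bitmap so that the NEXT backup
# copies only what changed after this one.

def backup_cycle(disk, active_bitmap):
    """Copy the clusters in active_bitmap, then start a fresh bitmap."""
    copied = {c: disk[c] for c in sorted(active_bitmap)}
    return copied, set()          # new empty bitmap starts tracking now

disk = list("AAAA----")
bitmap = set(range(len(disk)))    # full backup: treat everything as dirty
full, bitmap = backup_cycle(disk, bitmap)        # backup 1 + checkpoint B1
disk[4] = disk[5] = "B"
bitmap |= {4, 5}                  # guest writes tracked by the new bitmap
incr, bitmap = backup_cycle(disk, bitmap)        # backup 2 + checkpoint B2
assert len(full) == 8 and incr == {4: "B", 5: "B"}   # incremental copies 2 clusters
```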
This is great documentation, showing both the APIs and how they are used together, we need more of this!
Well, and it's also been a great resource for me as I continue to hammer out the (LOADS) of code needed to reach a working demo. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

Prepare for introducing a bunch of new public APIs related to backup checkpoints by first introducing a new internal type and errors associated with that type. Checkpoints are modeled heavily after virDomainSnapshotPtr (both represent a point in time of the guest), although a snapshot exists with the intent of rolling back to that state, while a checkpoint exists to make it possible to create an incremental backup at a later time. Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-snapshot.h | 8 ++-- include/libvirt/libvirt.h | 2 + include/libvirt/virterror.h | 5 ++- src/datatypes.c | 62 ++++++++++++++++++++++++++++++- src/datatypes.h | 31 +++++++++++++++- src/libvirt_private.syms | 2 + src/util/virerror.c | 15 +++++++- 7 files changed, 118 insertions(+), 7 deletions(-) diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h index e5a893a767..ff1e890cfc 100644 --- a/include/libvirt/libvirt-domain-snapshot.h +++ b/include/libvirt/libvirt-domain-snapshot.h @@ -31,15 +31,17 @@ /** * virDomainSnapshot: * - * a virDomainSnapshot is a private structure representing a snapshot of - * a domain. + * A virDomainSnapshot is a private structure representing a snapshot of + * a domain. A snapshot captures the state of the domain at a point in + * time, with the intent that the guest can be reverted back to that + * state at a later time. */ typedef struct _virDomainSnapshot virDomainSnapshot; /** * virDomainSnapshotPtr: * - * a virDomainSnapshotPtr is pointer to a virDomainSnapshot private structure, + * A virDomainSnapshotPtr is pointer to a virDomainSnapshot private structure, * and is the type used to reference a domain snapshot in the API. 
*/ typedef virDomainSnapshot *virDomainSnapshotPtr; diff --git a/include/libvirt/libvirt.h b/include/libvirt/libvirt.h index 36f6d60775..26887a40e7 100644 --- a/include/libvirt/libvirt.h +++ b/include/libvirt/libvirt.h @@ -36,6 +36,8 @@ extern "C" { # include <libvirt/libvirt-common.h> # include <libvirt/libvirt-host.h> # include <libvirt/libvirt-domain.h> +typedef struct _virDomainCheckpoint virDomainCheckpoint; +typedef virDomainCheckpoint *virDomainCheckpointPtr; # include <libvirt/libvirt-domain-snapshot.h> # include <libvirt/libvirt-event.h> # include <libvirt/libvirt-interface.h> diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index 5e58b6a3f9..87ac16be0b 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -4,7 +4,7 @@ * Description: Provides the interfaces of the libvirt library to handle * errors raised while using the library. * - * Copyright (C) 2006-2016 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -133,6 +133,7 @@ typedef enum { VIR_FROM_PERF = 65, /* Error from perf */ VIR_FROM_LIBSSH = 66, /* Error from libssh connection transport */ VIR_FROM_RESCTRL = 67, /* Error from resource control */ + VIR_FROM_DOMAIN_CHECKPOINT = 68,/* Error from domain checkpoint */ # ifdef VIR_ENUM_SENTINELS VIR_ERR_DOMAIN_LAST @@ -321,6 +322,8 @@ typedef enum { to guest-sync command (DEPRECATED)*/ VIR_ERR_LIBSSH = 98, /* error in libssh transport driver */ VIR_ERR_DEVICE_MISSING = 99, /* fail to find the desired device */ + VIR_ERR_INVALID_DOMAIN_CHECKPOINT = 100,/* invalid domain checkpoint */ + VIR_ERR_NO_DOMAIN_CHECKPOINT = 101, /* domain checkpoint not found */ } virErrorNumber; /** diff --git a/src/datatypes.c b/src/datatypes.c index 09b8eea5a2..3c9069c938 100644 --- a/src/datatypes.c +++ b/src/datatypes.c @@ -1,7 +1,7 @@ /* * datatypes.c: management of structs for public data types 
* - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -36,6 +36,7 @@ VIR_LOG_INIT("datatypes"); virClassPtr virConnectClass; virClassPtr virConnectCloseCallbackDataClass; virClassPtr virDomainClass; +virClassPtr virDomainCheckpointClass; virClassPtr virDomainSnapshotClass; virClassPtr virInterfaceClass; virClassPtr virNetworkClass; @@ -49,6 +50,7 @@ virClassPtr virStoragePoolClass; static void virConnectDispose(void *obj); static void virConnectCloseCallbackDataDispose(void *obj); static void virDomainDispose(void *obj); +static void virDomainCheckpointDispose(void *obj); static void virDomainSnapshotDispose(void *obj); static void virInterfaceDispose(void *obj); static void virNetworkDispose(void *obj); @@ -84,6 +86,7 @@ virDataTypesOnceInit(void) DECLARE_CLASS_LOCKABLE(virConnect); DECLARE_CLASS_LOCKABLE(virConnectCloseCallbackData); DECLARE_CLASS(virDomain); + DECLARE_CLASS(virDomainCheckpoint); DECLARE_CLASS(virDomainSnapshot); DECLARE_CLASS(virInterface); DECLARE_CLASS(virNetwork); @@ -887,6 +890,63 @@ virDomainSnapshotDispose(void *obj) } +/** + * virGetDomainCheckpoint: + * @domain: the domain to checkpoint + * @name: pointer to the domain checkpoint name + * + * Allocates a new domain checkpoint object. When the object is no longer needed, + * virObjectUnref() must be called in order to not leak data. + * + * Returns a pointer to the domain checkpoint object, or NULL on error. 
+ */ +virDomainCheckpointPtr +virGetDomainCheckpoint(virDomainPtr domain, const char *name) +{ + virDomainCheckpointPtr ret = NULL; + + if (virDataTypesInitialize() < 0) + return NULL; + + virCheckDomainGoto(domain, error); + virCheckNonNullArgGoto(name, error); + + if (!(ret = virObjectNew(virDomainCheckpointClass))) + goto error; + if (VIR_STRDUP(ret->name, name) < 0) + goto error; + + ret->domain = virObjectRef(domain); + + return ret; + + error: + virObjectUnref(ret); + return NULL; +} + + +/** + * virDomainCheckpointDispose: + * @obj: the domain checkpoint to release + * + * Unconditionally release all memory associated with a checkpoint. + * The checkpoint object must not be used once this method returns. + * + * It will also unreference the associated connection object, + * which may also be released if its ref count hits zero. + */ +static void +virDomainCheckpointDispose(void *obj) +{ + virDomainCheckpointPtr checkpoint = obj; + VIR_DEBUG("release checkpoint %p %s", checkpoint, checkpoint->name); + + VIR_FREE(checkpoint->name); + virObjectUnref(checkpoint->domain); +} + + virAdmConnectPtr virAdmConnectNew(void) { diff --git a/src/datatypes.h b/src/datatypes.h index 192c86be80..fbe842d105 100644 --- a/src/datatypes.h +++ b/src/datatypes.h @@ -1,7 +1,7 @@ /* * datatypes.h: management of structs for public data types * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. 
* * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -31,6 +31,7 @@ extern virClassPtr virConnectClass; extern virClassPtr virDomainClass; +extern virClassPtr virDomainCheckpointClass; extern virClassPtr virDomainSnapshotClass; extern virClassPtr virInterfaceClass; extern virClassPtr virNetworkClass; @@ -292,6 +293,21 @@ extern virClassPtr virAdmClientClass; } \ } while (0) +# define virCheckDomainCheckpointReturn(obj, retval) \ + do { \ + virDomainCheckpointPtr _check = (obj); \ + if (!virObjectIsClass(_check, virDomainCheckpointClass) || \ + !virObjectIsClass(_check->domain, virDomainClass) || \ + !virObjectIsClass(_check->domain->conn, virConnectClass)) { \ + virReportErrorHelper(VIR_FROM_DOMAIN_CHECKPOINT, \ + VIR_ERR_INVALID_DOMAIN_CHECKPOINT, \ + __FILE__, __FUNCTION__, __LINE__, \ + __FUNCTION__); \ + virDispatchError(NULL); \ + return retval; \ + } \ + } while (0) + /* Helper macros to implement VIR_DOMAIN_DEBUG using just C99. 
This * assumes you pass fewer than 15 arguments to VIR_DOMAIN_DEBUG, but @@ -652,6 +668,17 @@ struct _virStream { void *privateData; }; +/** + * _virDomainCheckpoint + * + * Internal structure associated with a domain checkpoint + */ +struct _virDomainCheckpoint { + virObject parent; + char *name; + virDomainPtr domain; +}; + /** * _virDomainSnapshot * @@ -712,6 +739,8 @@ virStreamPtr virGetStream(virConnectPtr conn); virNWFilterPtr virGetNWFilter(virConnectPtr conn, const char *name, const unsigned char *uuid); +virDomainCheckpointPtr virGetDomainCheckpoint(virDomainPtr domain, + const char *name); virDomainSnapshotPtr virGetDomainSnapshot(virDomainPtr domain, const char *name); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ea24f2847c..4686c775a5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1183,10 +1183,12 @@ virConnectCloseCallbackDataClass; virConnectCloseCallbackDataGetCallback; virConnectCloseCallbackDataRegister; virConnectCloseCallbackDataUnregister; +virDomainCheckpointClass; virDomainClass; virDomainSnapshotClass; virGetConnect; virGetDomain; +virGetDomainCheckpoint; virGetDomainSnapshot; virGetInterface; virGetNetwork; diff --git a/src/util/virerror.c b/src/util/virerror.c index 93632dbdf7..1e6fd77abf 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -1,7 +1,7 @@ /* * virerror.c: error handling and reporting code for libvirt * - * Copyright (C) 2006, 2008-2016 Red Hat, Inc. + * Copyright (C) 2006, 2008-2018 Red Hat, Inc. 
* * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -140,6 +140,7 @@ VIR_ENUM_IMPL(virErrorDomain, VIR_ERR_DOMAIN_LAST, "Perf", /* 65 */ "Libssh transport layer", "Resource control", + "Domain Checkpoint", ) @@ -1494,6 +1495,18 @@ virErrorMsg(virErrorNumber error, const char *info) else errmsg = _("device not found: %s"); break; + case VIR_ERR_INVALID_DOMAIN_CHECKPOINT: + if (info == NULL) + errmsg = _("Invalid checkpoint"); + else + errmsg = _("Invalid checkpoint: %s"); + break; + case VIR_ERR_NO_DOMAIN_CHECKPOINT: + if (info == NULL) + errmsg = _("Domain snapshot not found"); + else + errmsg = _("Domain snapshot not found: %s"); + break; } return errmsg; } -- 2.14.4

On 06/13/2018 12:42 PM, Eric Blake wrote:
Prepare for introducing a bunch of new public APIs related to backup checkpoints by first introducing a new internal type and errors associated with that type. Checkpoints are modeled heavily after virDomainSnapshotPtr (both represent a point in time of the guest), although a snapshot exists with the intent of rolling back to that state, while a checkpoint exists to make it possible to create an incremental backup at a later time.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-snapshot.h | 8 ++-- include/libvirt/libvirt.h | 2 + include/libvirt/virterror.h | 5 ++- src/datatypes.c | 62 ++++++++++++++++++++++++++++++- src/datatypes.h | 31 +++++++++++++++- src/libvirt_private.syms | 2 + src/util/virerror.c | 15 +++++++- 7 files changed, 118 insertions(+), 7 deletions(-)
diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h index e5a893a767..ff1e890cfc 100644 --- a/include/libvirt/libvirt-domain-snapshot.h +++ b/include/libvirt/libvirt-domain-snapshot.h @@ -31,15 +31,17 @@ /** * virDomainSnapshot: * - * a virDomainSnapshot is a private structure representing a snapshot of - * a domain. + * A virDomainSnapshot is a private structure representing a snapshot of + * a domain. A snapshot captures the state of the domain at a point in + * time, with the intent that the guest can be reverted back to that + * state at a later time. */ typedef struct _virDomainSnapshot virDomainSnapshot;
/** * virDomainSnapshotPtr: * - * a virDomainSnapshotPtr is pointer to a virDomainSnapshot private structure, + * A virDomainSnapshotPtr is pointer to a virDomainSnapshot private structure, * and is the type used to reference a domain snapshot in the API. */ typedef virDomainSnapshot *virDomainSnapshotPtr;
The above hunk is separable (and push-able) as its own patch, so you can consider it: Reviewed-by: John Ferlan <jferlan@redhat.com> Naming scheme aside, the rest had one minor nit: [...]
diff --git a/src/util/virerror.c b/src/util/virerror.c index 93632dbdf7..1e6fd77abf 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -1,7 +1,7 @@ /* * virerror.c: error handling and reporting code for libvirt * - * Copyright (C) 2006, 2008-2016 Red Hat, Inc. + * Copyright (C) 2006, 2008-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -140,6 +140,7 @@ VIR_ENUM_IMPL(virErrorDomain, VIR_ERR_DOMAIN_LAST, "Perf", /* 65 */ "Libssh transport layer", "Resource control", + "Domain Checkpoint", )
@@ -1494,6 +1495,18 @@ virErrorMsg(virErrorNumber error, const char *info) else errmsg = _("device not found: %s"); break; + case VIR_ERR_INVALID_DOMAIN_CHECKPOINT: + if (info == NULL) + errmsg = _("Invalid checkpoint"); + else + errmsg = _("Invalid checkpoint: %s"); + break; + case VIR_ERR_NO_DOMAIN_CHECKPOINT: + if (info == NULL) + errmsg = _("Domain snapshot not found"); + else + errmsg = _("Domain snapshot not found: %s");
checkpoint
+ break; } return errmsg; }
John

On Wed, Jun 13, 2018 at 7:42 PM Eric Blake <eblake@redhat.com> wrote:
Prepare for introducing a bunch of new public APIs related to backup checkpoints by first introducing a new internal type and errors associated with that type. Checkpoints are modeled heavily after virDomainSnapshotPtr (both represent a point in time of the guest), although a snapshot exists with the intent of rolling back to that state, while a checkpoint exists to make it possible to create an incremental backup at a later time.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-snapshot.h | 8 ++-- include/libvirt/libvirt.h | 2 + include/libvirt/virterror.h | 5 ++- src/datatypes.c | 62 ++++++++++++++++++++++++++++++- src/datatypes.h | 31 +++++++++++++++- src/libvirt_private.syms | 2 + src/util/virerror.c | 15 +++++++- 7 files changed, 118 insertions(+), 7 deletions(-)
diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h index e5a893a767..ff1e890cfc 100644 --- a/include/libvirt/libvirt-domain-snapshot.h +++ b/include/libvirt/libvirt-domain-snapshot.h @@ -31,15 +31,17 @@ /** * virDomainSnapshot: * - * a virDomainSnapshot is a private structure representing a snapshot of - * a domain. + * A virDomainSnapshot is a private structure representing a snapshot of + * a domain. A snapshot captures the state of the domain at a point in + * time, with the intent that the guest can be reverted back to that + * state at a later time.
The extra context is very nice...
*/ typedef struct _virDomainSnapshot virDomainSnapshot;
/** * virDomainSnapshotPtr: * - * a virDomainSnapshotPtr is pointer to a virDomainSnapshot private structure, + * A virDomainSnapshotPtr is pointer to a virDomainSnapshot private structure, * and is the type used to reference a domain snapshot in the API. */
But I think users of this API would like to find it here, explaining the public type.
typedef virDomainSnapshot *virDomainSnapshotPtr; diff --git a/include/libvirt/libvirt.h b/include/libvirt/libvirt.h index 36f6d60775..26887a40e7 100644 --- a/include/libvirt/libvirt.h +++ b/include/libvirt/libvirt.h @@ -36,6 +36,8 @@ extern "C" { # include <libvirt/libvirt-common.h> # include <libvirt/libvirt-host.h> # include <libvirt/libvirt-domain.h> +typedef struct _virDomainCheckpoint virDomainCheckpoint; +typedef virDomainCheckpoint *virDomainCheckpointPtr; # include <libvirt/libvirt-domain-snapshot.h> # include <libvirt/libvirt-event.h> # include <libvirt/libvirt-interface.h> diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index 5e58b6a3f9..87ac16be0b 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -4,7 +4,7 @@ * Description: Provides the interfaces of the libvirt library to handle * errors raised while using the library. * - * Copyright (C) 2006-2016 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -133,6 +133,7 @@ typedef enum { VIR_FROM_PERF = 65, /* Error from perf */ VIR_FROM_LIBSSH = 66, /* Error from libssh connection transport */ VIR_FROM_RESCTRL = 67, /* Error from resource control */ + VIR_FROM_DOMAIN_CHECKPOINT = 68,/* Error from domain checkpoint */
# ifdef VIR_ENUM_SENTINELS VIR_ERR_DOMAIN_LAST @@ -321,6 +322,8 @@ typedef enum { to guest-sync command (DEPRECATED)*/ VIR_ERR_LIBSSH = 98, /* error in libssh transport driver */ VIR_ERR_DEVICE_MISSING = 99, /* fail to find the desired device */ + VIR_ERR_INVALID_DOMAIN_CHECKPOINT = 100,/* invalid domain checkpoint */
What is an invalid checkpoint? It would be nice if there were no such thing. Also, the comment does not add anything. + VIR_ERR_NO_DOMAIN_CHECKPOINT = 101, /* domain checkpoint not found */ } virErrorNumber;
/** diff --git a/src/datatypes.c b/src/datatypes.c index 09b8eea5a2..3c9069c938 100644 --- a/src/datatypes.c +++ b/src/datatypes.c @@ -1,7 +1,7 @@ /* * datatypes.c: management of structs for public data types * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -36,6 +36,7 @@ VIR_LOG_INIT("datatypes"); virClassPtr virConnectClass; virClassPtr virConnectCloseCallbackDataClass; virClassPtr virDomainClass; +virClassPtr virDomainCheckpointClass; virClassPtr virDomainSnapshotClass; virClassPtr virInterfaceClass; virClassPtr virNetworkClass; @@ -49,6 +50,7 @@ virClassPtr virStoragePoolClass; static void virConnectDispose(void *obj); static void virConnectCloseCallbackDataDispose(void *obj); static void virDomainDispose(void *obj); +static void virDomainCheckpointDispose(void *obj); static void virDomainSnapshotDispose(void *obj); static void virInterfaceDispose(void *obj); static void virNetworkDispose(void *obj); @@ -84,6 +86,7 @@ virDataTypesOnceInit(void) DECLARE_CLASS_LOCKABLE(virConnect); DECLARE_CLASS_LOCKABLE(virConnectCloseCallbackData); DECLARE_CLASS(virDomain); + DECLARE_CLASS(virDomainCheckpoint); DECLARE_CLASS(virDomainSnapshot); DECLARE_CLASS(virInterface); DECLARE_CLASS(virNetwork); @@ -887,6 +890,63 @@ virDomainSnapshotDispose(void *obj) }
+/** + * virGetDomainCheckpoint: + * @domain: the domain to checkpoint + * @name: pointer to the domain checkpoint name + * + * Allocates a new domain checkpoint object. When the object is no longer needed, + * virObjectUnref() must be called in order to not leak data. + * + * Returns a pointer to the domain checkpoint object, or NULL on error. + */ +virDomainCheckpointPtr +virGetDomainCheckpoint(virDomainPtr domain, const char *name) +{ + virDomainCheckpointPtr ret = NULL; + + if (virDataTypesInitialize() < 0) + return NULL; + + virCheckDomainGoto(domain, error); + virCheckNonNullArgGoto(name, error);
+
+ if (!(ret = virObjectNew(virDomainCheckpointClass))) + goto error; + if (VIR_STRDUP(ret->name, name) < 0) + goto error; + + ret->domain = virObjectRef(domain); + + return ret; + + error: + virObjectUnref(ret); + return NULL; +} + + +/** + * virDomainCheckpointDispose: + * @obj: the domain checkpoint to release + * + * Unconditionally release all memory associated with a checkpoint. + * The checkpoint object must not be used once this method returns. + * + * It will also unreference the associated connection object, + * which may also be released if its ref count hits zero. + */ +static void +virDomainCheckpointDispose(void *obj) +{ + virDomainCheckpointPtr checkpoint = obj; + VIR_DEBUG("release checkpoint %p %s", checkpoint, checkpoint->name); + + VIR_FREE(checkpoint->name); + virObjectUnref(checkpoint->domain); +} + + virAdmConnectPtr virAdmConnectNew(void) { diff --git a/src/datatypes.h b/src/datatypes.h index 192c86be80..fbe842d105 100644 --- a/src/datatypes.h +++ b/src/datatypes.h @@ -1,7 +1,7 @@ /* * datatypes.h: management of structs for public data types * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -31,6 +31,7 @@
extern virClassPtr virConnectClass; extern virClassPtr virDomainClass; +extern virClassPtr virDomainCheckpointClass; extern virClassPtr virDomainSnapshotClass; extern virClassPtr virInterfaceClass; extern virClassPtr virNetworkClass; @@ -292,6 +293,21 @@ extern virClassPtr virAdmClientClass; } \ } while (0)
+# define virCheckDomainCheckpointReturn(obj, retval) \ + do { \ + virDomainCheckpointPtr _check = (obj); \ + if (!virObjectIsClass(_check, virDomainCheckpointClass) || \ + !virObjectIsClass(_check->domain, virDomainClass) || \ + !virObjectIsClass(_check->domain->conn, virConnectClass)) { \ + virReportErrorHelper(VIR_FROM_DOMAIN_CHECKPOINT, \ + VIR_ERR_INVALID_DOMAIN_CHECKPOINT, \ + __FILE__, __FUNCTION__, __LINE__, \ + __FUNCTION__); \
I guess that this means an invalid domain checkpoint. Isn't this a generic error, providing a pointer of the wrong type?
+ virDispatchError(NULL); \ + return retval; \ + } \ + } while (0) +
/* Helper macros to implement VIR_DOMAIN_DEBUG using just C99. This * assumes you pass fewer than 15 arguments to VIR_DOMAIN_DEBUG, but @@ -652,6 +668,17 @@ struct _virStream { void *privateData; };
+/** + * _virDomainCheckpoint + * + * Internal structure associated with a domain checkpoint + */ +struct _virDomainCheckpoint { + virObject parent; + char *name; + virDomainPtr domain; +}; + /** * _virDomainSnapshot * @@ -712,6 +739,8 @@ virStreamPtr virGetStream(virConnectPtr conn); virNWFilterPtr virGetNWFilter(virConnectPtr conn, const char *name, const unsigned char *uuid); +virDomainCheckpointPtr virGetDomainCheckpoint(virDomainPtr domain, + const char *name);
I guess this is implemented and documented elsewhere.
virDomainSnapshotPtr virGetDomainSnapshot(virDomainPtr domain, const char *name);
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ea24f2847c..4686c775a5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1183,10 +1183,12 @@ virConnectCloseCallbackDataClass; virConnectCloseCallbackDataGetCallback; virConnectCloseCallbackDataRegister; virConnectCloseCallbackDataUnregister; +virDomainCheckpointClass; virDomainClass; virDomainSnapshotClass; virGetConnect; virGetDomain; +virGetDomainCheckpoint; virGetDomainSnapshot; virGetInterface; virGetNetwork; diff --git a/src/util/virerror.c b/src/util/virerror.c index 93632dbdf7..1e6fd77abf 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -1,7 +1,7 @@ /* * virerror.c: error handling and reporting code for libvirt * - * Copyright (C) 2006, 2008-2016 Red Hat, Inc. + * Copyright (C) 2006, 2008-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -140,6 +140,7 @@ VIR_ENUM_IMPL(virErrorDomain, VIR_ERR_DOMAIN_LAST, "Perf", /* 65 */ "Libssh transport layer", "Resource control", + "Domain Checkpoint", )
@@ -1494,6 +1495,18 @@ virErrorMsg(virErrorNumber error, const char *info) else errmsg = _("device not found: %s"); break; + case VIR_ERR_INVALID_DOMAIN_CHECKPOINT: + if (info == NULL) + errmsg = _("Invalid checkpoint"); + else + errmsg = _("Invalid checkpoint: %s"); + break; + case VIR_ERR_NO_DOMAIN_CHECKPOINT: + if (info == NULL) + errmsg = _("Domain checkpoint not found"); + else + errmsg = _("Domain checkpoint not found: %s"); + break; } return errmsg; } -- 2.14.4
Nir

On 06/26/2018 02:03 PM, Nir Soffer wrote:
On Wed, Jun 13, 2018 at 7:42 PM Eric Blake <eblake@redhat.com> wrote:
Prepare for introducing a bunch of new public APIs related to backup checkpoints by first introducing a new internal type and errors associated with that type. Checkpoints are modeled heavily after virDomainSnapshotPtr (both represent a point in time of the guest), although a snapshot exists with the intent of rolling back to that state, while a checkpoint exists to make it possible to create an incremental backup at a later time.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-snapshot.h | 8 ++-- include/libvirt/libvirt.h | 2 + include/libvirt/virterror.h | 5 ++- src/datatypes.c | 62 ++++++++++++++++++++++++++++++- src/datatypes.h | 31 +++++++++++++++- src/libvirt_private.syms | 2 + src/util/virerror.c | 15 +++++++- 7 files changed, 118 insertions(+), 7 deletions(-)
diff --git a/include/libvirt/libvirt-domain-snapshot.h b/include/libvirt/libvirt-domain-snapshot.h index e5a893a767..ff1e890cfc 100644 --- a/include/libvirt/libvirt-domain-snapshot.h +++ b/include/libvirt/libvirt-domain-snapshot.h @@ -31,15 +31,17 @@ /** * virDomainSnapshot: * - * a virDomainSnapshot is a private structure representing a snapshot of - * a domain. + * A virDomainSnapshot is a private structure representing a snapshot of + * a domain. A snapshot captures the state of the domain at a point in + * time, with the intent that the guest can be reverted back to that + * state at a later time.
The extra context is very nice...
*/ typedef struct _virDomainSnapshot virDomainSnapshot;
/** * virDomainSnapshotPtr: * - * a virDomainSnapshotPtr is pointer to a virDomainSnapshot private structure, + * A virDomainSnapshotPtr is pointer to a virDomainSnapshot private structure, * and is the type used to reference a domain snapshot in the API. */
But I think users of this API would like to find it here, explaining the public type.
That's a pre-existing documentation issue (probably worth a separate cleanup patch to a lot of files, if it really does render better to tie the details to the 'Ptr' typedef rather than the opaque typedef).
@@ -321,6 +322,8 @@ typedef enum { to guest-sync command (DEPRECATED)*/ VIR_ERR_LIBSSH = 98, /* error in libssh transport driver */ VIR_ERR_DEVICE_MISSING = 99, /* fail to find the desired device */ + VIR_ERR_INVALID_DOMAIN_CHECKPOINT = 100,/* invalid domain checkpoint */
What is an invalid checkpoint? It would be nice if there were no such thing.
Copied from the existing VIR_ERR_INVALID_DOMAIN_SNAPSHOT. Sadly, there MUST be such a thing - it exists to (try and) catch bugs such as:

  void *ptr = virAPI1()              (which returns virDomainPtr)
  virDomainCheckpointAPI2(ptr, ...)  (which expects virDomainCheckpointPtr)

where you are passing in the wrong type, or such as:

  virConnectPtr conn = virAPI1()
  virDomainCheckpointPtr chk = virAPI2(conn)
  virConnectClose(conn)
  virDomainCheckpointAPI3(chk)

where you are passing in the right type but wrong order, because the checkpoint depends on a connection that you have closed.
Also the comment does not add anything.
Such is the life of copy-and-paste. My excuse is that the code I copied from has the same sort of poor comment.
@@ -292,6 +293,21 @@ extern virClassPtr virAdmClientClass; } \ } while (0)
+# define virCheckDomainCheckpointReturn(obj, retval) \ + do { \ + virDomainCheckpointPtr _check = (obj); \ + if (!virObjectIsClass(_check, virDomainCheckpointClass) || \ + !virObjectIsClass(_check->domain, virDomainClass) || \ + !virObjectIsClass(_check->domain->conn, virConnectClass)) { \ + virReportErrorHelper(VIR_FROM_DOMAIN_CHECKPOINT, \ + VIR_ERR_INVALID_DOMAIN_CHECKPOINT, \ + __FILE__, __FUNCTION__, __LINE__, \ + __FUNCTION__); \
I guess that this means an invalid domain checkpoint. Isn't this a generic error, providing a pointer of the wrong type?
Yes, except that libvirt already has the practice of distinguishing error messages according to which type was expected.
@@ -712,6 +739,8 @@ virStreamPtr virGetStream(virConnectPtr conn); virNWFilterPtr virGetNWFilter(virConnectPtr conn, const char *name, const unsigned char *uuid); +virDomainCheckpointPtr virGetDomainCheckpoint(virDomainPtr domain, + const char *name);
I guess this is implemented and documented elsewhere.
This is a function for internal use only; it is not exported as a public function, but exists to mirror...
virDomainSnapshotPtr virGetDomainSnapshot(virDomainPtr domain, const char *name);
...this existing snapshot function with the exact same amount of zero comments. It was implemented in this patch, in src/datatypes.c. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

Prepare for new checkpoint and backup APIs by describing the XML that will represent a checkpoint. This is modeled heavily after the XML for virDomainSnapshotPtr, since both represent a point in time of the guest. But while a snapshot exists with the intent of rolling back to that state, a checkpoint instead makes it possible to create an incremental backup at a later time. Add testsuite coverage of a minimal use of the XML. Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/docs.html.in | 3 +- docs/domainstatecapture.html.in | 4 +- docs/formatcheckpoint.html.in | 273 +++++++++++++++++++++++++++++ docs/schemas/domaincheckpoint.rng | 89 ++++++++++ libvirt.spec.in | 1 + mingw-libvirt.spec.in | 2 + tests/domaincheckpointxml2xmlin/empty.xml | 1 + tests/domaincheckpointxml2xmlout/empty.xml | 10 ++ tests/virschematest.c | 2 + 9 files changed, 382 insertions(+), 3 deletions(-) create mode 100644 docs/formatcheckpoint.html.in create mode 100644 docs/schemas/domaincheckpoint.rng create mode 100644 tests/domaincheckpointxml2xmlin/empty.xml create mode 100644 tests/domaincheckpointxml2xmlout/empty.xml diff --git a/docs/docs.html.in b/docs/docs.html.in index 4c46b74980..11dfd27ba6 100644 --- a/docs/docs.html.in +++ b/docs/docs.html.in @@ -79,7 +79,8 @@ <a href="formatdomaincaps.html">domain capabilities</a>, <a href="formatnode.html">node devices</a>, <a href="formatsecret.html">secrets</a>, - <a href="formatsnapshot.html">snapshots</a></dd> + <a href="formatsnapshot.html">snapshots</a>, + <a href="formatcheckpoint.html">checkpoints</a></dd> <dt><a href="uri.html">URI format</a></dt> <dd>The URI formats used for connecting to libvirt</dd> diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in index 00ab7e8ee1..4de93c87c8 100644 --- a/docs/domainstatecapture.html.in +++ b/docs/domainstatecapture.html.in @@ -154,9 +154,9 @@ time as a new backup, so that the next incremental backup can refer to the incremental state since the checkpoint created during 
the current backup. Guest state is then actually - captured using <code>virDomainBackupBegin()</code>. <!--See also + captured using <code>virDomainBackupBegin()</code>. See also the <a href="formatcheckpoint.html">XML details</a> used with - this command.--></dd> + this command.</dd> </dl> <h2><a id="examples">Examples</a></h2> diff --git a/docs/formatcheckpoint.html.in b/docs/formatcheckpoint.html.in new file mode 100644 index 0000000000..34507a9f68 --- /dev/null +++ b/docs/formatcheckpoint.html.in @@ -0,0 +1,273 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html> +<html xmlns="http://www.w3.org/1999/xhtml"> + <body> + <h1>Checkpoint and Backup XML format</h1> + + <ul id="toc"></ul> + + <h2><a id="CheckpointAttributes">Checkpoint XML</a></h2> + + <p> + Domain disk backups, including incremental backups, are one form + of <a href="domainstatecapture.html">domain state capture</a>. + </p> + <p> + Libvirt is able to facilitate incremental backups by tracking + disk checkpoints, or points in time against which it is easy to + compute which portion of the disk has changed. Given a full + backup (a backup created from the creation of the disk to a + given point in time, coupled with the creation of a disk + checkpoint at that time), and an incremental backup (a backup + created from just the dirty portion of the disk between the + first checkpoint and the second backup operation), it is + possible to do an offline reconstruction of the state of the + disk at the time of the second backup, without having to copy as + much data as a second full backup would require. Most disk + checkpoints are created in concert with a backup, + via <code>virDomainBackupBegin()</code>; however, libvirt also + exposes enough support to create disk checkpoints independently + from a backup operation, + via <code>virDomainCheckpointCreateXML()</code>. + </p> + <p> + Attributes of libvirt checkpoints are stored as child elements of + the <code>domaincheckpoint</code> element. 
At checkpoint creation + time, normally only the <code>name</code>, <code>description</code>, + and <code>disks</code> elements are settable; the rest of the + fields are ignored on creation, and will be filled in by + libvirt for informational purposes + by <code>virDomainCheckpointGetXMLDesc()</code>. However, when + redefining a checkpoint, + with the <code>VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE</code> flag + of <code>virDomainCheckpointCreateXML()</code>, all of the XML + described here is relevant. + </p> + <p> + Checkpoints are maintained in a hierarchy. A domain can have a + current checkpoint, which is the most recent checkpoint compared to + the current state of the domain (although a domain might have + checkpoints without a current checkpoint, if checkpoints have been + deleted in the meantime). Creating or reverting to a checkpoint + sets that checkpoint as current, and the prior current checkpoint is + the parent of the new checkpoint. Branches in the hierarchy can + be formed by reverting to a checkpoint with a child, then creating + another checkpoint. + </p> + <p> + The top-level <code>domaincheckpoint</code> element may contain + the following elements: + </p> + <dl> + <dt><code>name</code></dt> + <dd>The name for this checkpoint. If the name is specified when + initially creating the checkpoint, then the checkpoint will have + that particular name. If the name is omitted when initially + creating the checkpoint, then libvirt will make up a name for + the checkpoint, based on the time when it was created. + </dd> + <dt><code>description</code></dt> + <dd>A human-readable description of the checkpoint. If the + description is omitted when initially creating the checkpoint, + then this field will be empty.
+ </dd> + <dt><code>disks</code></dt> + <dd>On input, this is an optional listing of specific + instructions for disk checkpoints; it is needed when making a + checkpoint on only a subset of the disks associated with a + domain (in particular, since qemu checkpoints require qcow2 + disks, this element may be needed on input for excluding guest + disks that are not in qcow2 format); if omitted on input, then + all disks participate in the checkpoint. On output, this is + fully populated to show the state of each disk in the + checkpoint. This element has a list of <code>disk</code> + sub-elements, describing anywhere from one to all of the disks + associated with the domain. + <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the checkpoint properties of + a specific disk. The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the + checkpoint. The attribute <code>checkpoint</code> is + optional on input; possible values are <code>no</code> + when the disk does not participate in this checkpoint; + or <code>bitmap</code> if the disk will track all changes + since the creation of this checkpoint via a bitmap, in + which case another attribute <code>bitmap</code> will be + the name of the tracking bitmap (defaulting to the + checkpoint name). + </dd> + </dl> + </dd> + <dt><code>creationTime</code></dt> + <dd>The time this checkpoint was created. The time is specified + in seconds since the Epoch, UTC (i.e. Unix time). Readonly. + </dd> + <dt><code>parent</code></dt> + <dd>The parent of this checkpoint. If present, this element + contains exactly one child element, name. This specifies the + name of the parent checkpoint of this one, and is used to + represent trees of checkpoints. Readonly. 
+ </dd> + <dt><code>domain</code></dt> + <dd>The inactive <a href="formatdomain.html">domain + configuration</a> at the time the checkpoint was created. + Readonly. + </dd> + </dl> + + <h2><a id="BackupAttributes">Backup XML</a></h2> + + <p> + Creating a backup, whether full or incremental, is done + via <code>virDomainBackupBegin()</code>, which takes an XML + description of the actions to perform. There are two general + modes for backups: a push mode (where the hypervisor writes out + the data to the destination file, which may be local or remote), + and a pull mode (where the hypervisor creates an NBD server that + a third-party client can then read as needed, and which requires + the use of temporary storage, typically local, until the backup + is complete). + </p> + <p> + The instructions for beginning a backup job are provided as + attributes and elements of the + top-level <code>domainbackup</code> element. This element + includes an optional attribute <code>mode</code> which can be + either "push" or "pull" (default push). Where elements are + optional on creation, <code>virDomainBackupGetXMLDesc()</code> + can be used to see the actual values selected (for example, + learning which port the NBD server is using in the pull model, + or what file names libvirt generated when none were supplied). + The following child elements are supported: + </p> + <dl> + <dt><code>incremental</code></dt> + <dd>Optional. If this element is present, it must name an + existing checkpoint of the domain, which will be used to make + this backup an incremental one (in the push model, only + changes since the checkpoint are written to the destination; + in the pull model, the NBD server uses the + NBD_OPT_SET_META_CONTEXT extension to advertise to the client + which portions of the export contain changes since the + checkpoint). If omitted, a full backup is performed. + </dd> + <dt><code>server</code></dt> + <dd>Present only for a pull mode backup. 
Contains the same + attributes as the <code>protocol</code> element of a disk + attached via NBD in the domain (such as transport, socket, + name, port, or tls), necessary to set up an NBD server that + exposes the content of each disk at the time the backup + started. + </dd> + <dt><code>disks</code></dt> + <dd>This is an optional listing of instructions for disks + participating in the backup (if omitted, all disks + participate, and libvirt attempts to generate filenames by + appending the current timestamp as a suffix). When provided on + input, disks omitted from the list do not participate in the + backup. On output, the list is present but contains only the + disks participating in the backup job. This element has a + list of <code>disk</code> sub-elements, describing anywhere + from one to all of the disks associated with the domain. + <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the checkpoint properties of + a specific disk. The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the + checkpoint. The optional attribute <code>type</code> can + be <code>file</code>, <code>block</code>, + or <code>network</code>, similar to a disk declaration + for a domain, and controls what additional sub-elements are + needed to describe the destination (such + as <code>protocol</code> for a network destination). In + push mode backups, the primary sub-element + is <code>target</code>; in pull mode, the primary sub-element + is <code>scratch</code>; but either way, + the primary sub-element describes the file name to be used + during the backup operation, similar to + the <code>source</code> sub-element of a domain disk.
An + optional sub-element <code>driver</code> can also be used to + specify a destination format different from qcow2. + </dd> + </dl> + </dd> + </dl> + + <h2><a id="example">Examples</a></h2> + + <p>Using this XML to create a checkpoint of just vda on a qemu + domain with two disks and a prior checkpoint:</p> + <pre> +<domaincheckpoint> + <description>Completion of updates after OS install</description> + <disks> + <disk name='vda' checkpoint='bitmap'/> + <disk name='vdb' checkpoint='no'/> + </disks> +</domaincheckpoint></pre> + + <p>will result in XML similar to this from + <code>virDomainCheckpointGetXMLDesc()</code>:</p> + <pre> +<domaincheckpoint> + <name>1525889631</name> + <description>Completion of updates after OS install</description> + <creationTime>1525889631</creationTime> + <parent> + <name>1525111885</name> + </parent> + <disks> + <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/> + <disk name='vdb' checkpoint='no'/> + </disks> + <domain> + <name>fedora</name> + <uuid>93a5c045-6457-2c09-e56c-927cdf34e178</uuid> + <memory>1048576</memory> + ... + <devices> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2'/> + <source file='/path/to/file1'/> + <target dev='vda' bus='virtio'/> + </disk> + <disk type='file' device='disk' snapshot='external'> + <driver name='qemu' type='raw'/> + <source file='/path/to/file2'/> + <target dev='vdb' bus='virtio'/> + </disk> + ... + </devices> + </domain> +</domaincheckpoint></pre> + + <p>With that checkpoint created, the qcow2 image is now tracking + all changes that occur in the image since the checkpoint via + the persistent bitmap named <code>1525889631</code>.
Now, we + can make a subsequent call + to <code>virDomainBackupBegin()</code> to perform an incremental + backup of just this data, using the following XML to start a + pull model NBD export of the vda disk: + </p> + <pre> +<domainbackup mode="pull"> + <incremental>1525889631</incremental> + <server transport="unix" socket="/path/to/server"/> + <disks> + <disk name='vda' type='file'> + <scratch file='/path/to/file1.scratch'/> + </disk> + </disks> +</domainbackup> + </pre> + </body> +</html> diff --git a/docs/schemas/domaincheckpoint.rng b/docs/schemas/domaincheckpoint.rng new file mode 100644 index 0000000000..1e2c16e035 --- /dev/null +++ b/docs/schemas/domaincheckpoint.rng @@ -0,0 +1,89 @@ +<?xml version="1.0"?> +<!-- A Relax NG schema for the libvirt domain checkpoint properties XML format --> +<grammar xmlns="http://relaxng.org/ns/structure/1.0"> + <start> + <ref name='domaincheckpoint'/> + </start> + + <include href='domaincommon.rng'/> + + <define name='domaincheckpoint'> + <element name='domaincheckpoint'> + <interleave> + <optional> + <element name='name'> + <text/> + </element> + </optional> + <optional> + <element name='description'> + <text/> + </element> + </optional> + <optional> + <element name='creationTime'> + <text/> + </element> + </optional> + <optional> + <element name='disks'> + <zeroOrMore> + <ref name='diskcheckpoint'/> + </zeroOrMore> + </element> + </optional> + <optional> + <choice> + <element name='domain'> + <element name='uuid'> + <ref name="UUID"/> + </element> + </element> + <!-- Nested grammar ensures that any of our overrides of + storagecommon/domaincommon defines do not conflict + with any domain.rng overrides.
--> + <grammar> + <include href='domain.rng'/> + </grammar> + </choice> + </optional> + <optional> + <element name='parent'> + <element name='name'> + <text/> + </element> + </element> + </optional> + </interleave> + </element> + </define> + + <define name='diskcheckpoint'> + <element name='disk'> + <attribute name='name'> + <choice> + <ref name='diskTarget'/> + <ref name='absFilePath'/> + </choice> + </attribute> + <choice> + <attribute name='checkpoint'> + <value>no</value> + </attribute> + <group> + <optional> + <attribute name='checkpoint'> + <value>bitmap</value> + </attribute> + </optional> + <optional> + <attribute name='bitmap'> + <text/> + </attribute> + </optional> + </group> + </choice> + </element> + </define> + +</grammar> diff --git a/libvirt.spec.in b/libvirt.spec.in index ace05820aa..50bd79a7d7 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -2044,6 +2044,7 @@ exit 0 %{_datadir}/libvirt/schemas/cputypes.rng %{_datadir}/libvirt/schemas/domain.rng %{_datadir}/libvirt/schemas/domaincaps.rng +%{_datadir}/libvirt/schemas/domaincheckpoint.rng %{_datadir}/libvirt/schemas/domaincommon.rng %{_datadir}/libvirt/schemas/domainsnapshot.rng %{_datadir}/libvirt/schemas/interface.rng diff --git a/mingw-libvirt.spec.in b/mingw-libvirt.spec.in index 917d2143d8..6912527cf7 100644 --- a/mingw-libvirt.spec.in +++ b/mingw-libvirt.spec.in @@ -241,6 +241,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw32_datadir}/libvirt/schemas/cputypes.rng %{mingw32_datadir}/libvirt/schemas/domain.rng %{mingw32_datadir}/libvirt/schemas/domaincaps.rng +%{mingw32_datadir}/libvirt/schemas/domaincheckpoint.rng %{mingw32_datadir}/libvirt/schemas/domaincommon.rng %{mingw32_datadir}/libvirt/schemas/domainsnapshot.rng %{mingw32_datadir}/libvirt/schemas/interface.rng @@ -326,6 +327,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw64_datadir}/libvirt/schemas/cputypes.rng %{mingw64_datadir}/libvirt/schemas/domain.rng 
%{mingw64_datadir}/libvirt/schemas/domaincaps.rng +%{mingw64_datadir}/libvirt/schemas/domaincheckpoint.rng %{mingw64_datadir}/libvirt/schemas/domaincommon.rng %{mingw64_datadir}/libvirt/schemas/domainsnapshot.rng %{mingw64_datadir}/libvirt/schemas/interface.rng diff --git a/tests/domaincheckpointxml2xmlin/empty.xml b/tests/domaincheckpointxml2xmlin/empty.xml new file mode 100644 index 0000000000..dc36449142 --- /dev/null +++ b/tests/domaincheckpointxml2xmlin/empty.xml @@ -0,0 +1 @@ +<domaincheckpoint/> diff --git a/tests/domaincheckpointxml2xmlout/empty.xml b/tests/domaincheckpointxml2xmlout/empty.xml new file mode 100644 index 0000000000..a26c7caab0 --- /dev/null +++ b/tests/domaincheckpointxml2xmlout/empty.xml @@ -0,0 +1,10 @@ +<domaincheckpoint> + <name>1525889631</name> + <creationTime>1525889631</creationTime> + <disks> + <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/> + </disks> + <domain> + <uuid>9d37b878-a7cc-9f9a-b78f-49b3abad25a8</uuid> + </domain> +</domaincheckpoint> diff --git a/tests/virschematest.c b/tests/virschematest.c index 2d35833919..b866db4326 100644 --- a/tests/virschematest.c +++ b/tests/virschematest.c @@ -223,6 +223,8 @@ mymain(void) "genericxml2xmloutdata", "xlconfigdata", "libxlxml2domconfigdata", "qemuhotplugtestdomains"); DO_TEST_DIR("domaincaps.rng", "domaincapsschemadata"); + DO_TEST_DIR("domaincheckpoint.rng", "domaincheckpointxml2xmlin", + "domaincheckpointxml2xmlout"); DO_TEST_DIR("domainsnapshot.rng", "domainsnapshotxml2xmlin", "domainsnapshotxml2xmlout"); DO_TEST_DIR("interface.rng", "interfaceschemadata"); -- 2.14.4

On 06/13/2018 12:42 PM, Eric Blake wrote:
Prepare for new checkpoint and backup APIs by describing the XML that will represent a checkpoint. This is modeled heavily after the XML for virDomainSnapshotPtr, since both represent a point in time of the guest. But while a snapshot exists with the intent of rolling back to that state, a checkpoint instead makes it possible to create an incremental backup at a later time.
Add testsuite coverage of a minimal use of the XML.
Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/docs.html.in | 3 +- docs/domainstatecapture.html.in | 4 +- docs/formatcheckpoint.html.in | 273 +++++++++++++++++++++++++++++ docs/schemas/domaincheckpoint.rng | 89 ++++++++++ libvirt.spec.in | 1 + mingw-libvirt.spec.in | 2 + tests/domaincheckpointxml2xmlin/empty.xml | 1 + tests/domaincheckpointxml2xmlout/empty.xml | 10 ++ tests/virschematest.c | 2 + 9 files changed, 382 insertions(+), 3 deletions(-) create mode 100644 docs/formatcheckpoint.html.in create mode 100644 docs/schemas/domaincheckpoint.rng create mode 100644 tests/domaincheckpointxml2xmlin/empty.xml create mode 100644 tests/domaincheckpointxml2xmlout/empty.xml
diff --git a/docs/docs.html.in b/docs/docs.html.in index 4c46b74980..11dfd27ba6 100644 --- a/docs/docs.html.in +++ b/docs/docs.html.in @@ -79,7 +79,8 @@ <a href="formatdomaincaps.html">domain capabilities</a>, <a href="formatnode.html">node devices</a>, <a href="formatsecret.html">secrets</a>, - <a href="formatsnapshot.html">snapshots</a></dd> + <a href="formatsnapshot.html">snapshots</a>, + <a href="formatcheckpoint.html">checkpoints</a></dd>
<dt><a href="uri.html">URI format</a></dt> <dd>The URI formats used for connecting to libvirt</dd>
Add a link in the format.html.in and index.html.in pages too.
diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in index 00ab7e8ee1..4de93c87c8 100644 --- a/docs/domainstatecapture.html.in +++ b/docs/domainstatecapture.html.in @@ -154,9 +154,9 @@ time as a new backup, so that the next incremental backup can refer to the incremental state since the checkpoint created during the current backup. Guest state is then actually - captured using <code>virDomainBackupBegin()</code>. <!--See also + captured using <code>virDomainBackupBegin()</code>. See also the <a href="formatcheckpoint.html">XML details</a> used with - this command.--></dd> + this command.</dd> </dl>
<h2><a id="examples">Examples</a></h2> diff --git a/docs/formatcheckpoint.html.in b/docs/formatcheckpoint.html.in new file mode 100644 index 0000000000..34507a9f68 --- /dev/null +++ b/docs/formatcheckpoint.html.in @@ -0,0 +1,273 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html> +<html xmlns="http://www.w3.org/1999/xhtml"> + <body> + <h1>Checkpoint and Backup XML format</h1> + + <ul id="toc"></ul> + + <h2><a id="CheckpointAttributes">Checkpoint XML</a></h2> + + <p> + Domain disk backups, including incremental backups, are one form
+ of <a href="domainstatecapture.html">domain state capture</a>. + </p>
IMO: Strange opening line for something describing checkpoints. As I've read further, the fact that checkpoints and backups are only supported for qcow2 domain disks I would think needs to be up at the top here - front and center. No sense reading any further for raw disks. Yet another patch 2 "factor" related to choosing a backup plan: what to do if you have raw devices and how those should be handled. This would seem to preclude LUKS encrypted devices, true? What about disks with various levels of backingStore logic? I can only imagine some of the depth issues causing problems with that logic would be applicable here too with the hierarchical approach.
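To the reviewer's point about qcow2-only support: a caller could inspect the domain XML up front and exclude any non-qcow2 disks from the checkpoint request. A minimal Python sketch (the domain XML is adapted from the example later in this patch; this is illustrative tooling, not part of libvirt):

```python
import xml.etree.ElementTree as ET

DOMAIN_XML = """
<domain type='kvm'>
  <name>fedora</name>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/path/to/file1'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/path/to/file2'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

def non_qcow2_targets(domain_xml):
    """Return target dev names of disks that are not qcow2-backed,
    i.e. candidates for <disk name='...' checkpoint='no'/>."""
    root = ET.fromstring(domain_xml)
    skipped = []
    for disk in root.findall("./devices/disk"):
        driver = disk.find("driver")
        target = disk.find("target")
        if target is not None and (driver is None or driver.get("type") != "qcow2"):
            skipped.append(target.get("dev"))
    return skipped

print(non_qcow2_targets(DOMAIN_XML))  # ['vdb']
```

The same walk could also flag disks with deep backing chains, per the backingStore concern above.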
+ <p> + Libvirt is able to facilitate incremental backups by tracking + disk checkpoints, or points in time against which it is easy to
s/, or/ or/
+ compute which portion of the disk has changed. Given a full + backup (a backup created from the creation of the disk to a + given point in time, coupled with the creation of a disk
s/time,/time)
+ checkpoint at that time), and an incremental backup (a backup
s/time),/time,
+ created from just the dirty portion of the disk between the + first checkpoint and the second backup operation), it is + possible to do an offline reconstruction of the state of the + disk at the time of the second backup, without having to copy as
s/backup, without/backup without/
+ much data as a second full backup would require. Most disk + checkpoints are created in concert with a backup,
s/backup,/backup/
+ via <code>virDomainBackupBegin()</code>; however, libvirt also + exposes enough support to create disk checkpoints independently + from a backup operation,
s/operation,/operation/
+ via <code>virDomainCheckpointCreateXML()</code>. + </p>
NB: virDomainBackupBegin doesn't exist yet. Still a few patches away.
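The full-plus-incremental reconstruction described in the paragraph above can be modeled abstractly. A toy Python sketch (disk blocks as list entries, the checkpoint's dirty bitmap as a set of block indexes; this is a conceptual model, not libvirt code):

```python
def take_full_backup(disk):
    """A full backup copies every block."""
    return list(disk)

def reconstruct(full_backup, incremental):
    """Overlay the dirty blocks from the incremental onto the full backup."""
    disk = list(full_backup)
    for index, data in incremental.items():
        disk[index] = data
    return disk

disk = ["a0", "b0", "c0", "d0"]
full = take_full_backup(disk)   # full backup; checkpoint created at the same time
dirty = set()                   # the checkpoint's dirty-block bitmap

# guest writes after the checkpoint are tracked in the bitmap
disk[1] = "b1"; dirty.add(1)
disk[3] = "d1"; dirty.add(3)

# the incremental backup copies only the dirty blocks
incremental = {i: disk[i] for i in dirty}

# offline reconstruction matches the disk at the time of the second backup
assert reconstruct(full, incremental) == disk
```

The point of the design is visible in the last two steps: the incremental copies two blocks instead of four, yet the reconstruction is exact.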
+ <p> + Attributes of libvirt checkpoints are stored as child elements of + the <code>domaincheckpoint</code> element. At checkpoint creation + time, normally only the <code>name</code>, <code>description</code>, + and <code>disks</code> elements are settable; the rest of the
s/; the/. The
+ fields are ignored on creation, and will be filled in by
s/creation,/creation/
+ libvirt for informational purposes + by <code>virDomainCheckpointGetXMLDesc()</code>. However, when + redefining a checkpoint, + with the <code>VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE</code> flag + of <code>virDomainCheckpointCreateXML()</code>, all of the XML
s/XML/XML fields/
+ described here is relevant.
s/is/are/
+ </p>
"All" of them? Even the readonly ones?
+ <p> + Checkpoints are maintained in a hierarchy. A domain can have a
First sentence I think should start previous paragraph...
+ current checkpoint, which is the most recent checkpoint compared to + the current state of the domain (although a domain might have + checkpoints without a current checkpoint, if checkpoints have been + deleted in the meantime). Creating or reverting to a checkpoint
What is inside the parenthesis is quite confusing and perhaps "forward thinking" considering not all checkpoint concepts are described yet.
+ sets that checkpoint as current, and the prior current checkpoint is + the parent of the new checkpoint. Branches in the hierarchy can
So you can create a "current" checkpoint or revert to a checkpoint making it current - so how then does a domain ever have checkpoints without a current checkpoint? Is this a chicken and egg problem?
+ be formed by reverting to a checkpoint with a child, then creating + another checkpoint.
Can one merge two or more checkpoints? Let's say I have A, B, C, D, and E. I no longer care about B, C, and D and I want to merge them to create F, which is essentially all 3 sets of changes, leaving the time-oriented order as A, F, E. I still don't see how a domain doesn't have a current checkpoint even though there are checkpoints.
+ </p>
Does the live domain maintain a pointer to the checkpoint? Or are domaincheckpoint xml files only valid with their own context? How would the checkpoint code know if a domain was removed? Following events?
+ <p> + The top-level <code>domaincheckpoint</code> element may contain + the following elements:
s/following/following optional/
+ </p> + <dl> + <dt><code>name</code></dt> + <dd>The name for this checkpoint. If the name is specified when + initially creating the checkpoint, then the checkpoint will have + that particular name. If the name is omitted when initially + creating the checkpoint, then libvirt will make up a name for + the checkpoint, based on the time when it was created.
Assuming epoch and that seems to be confirmed by the example.
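Assuming the generated name really is the decimal seconds-since-Epoch creation time (the example's matching `<name>1525889631</name>` and `<creationTime>1525889631</creationTime>` suggest so, but the patch text does not spell it out), the default would be:

```python
import time

def default_checkpoint_name(created=None):
    # Assumption: libvirt names an unnamed checkpoint after its creation
    # time in decimal seconds since the Epoch, per the example XML.
    if created is None:
        created = time.time()
    return str(int(created))

assert default_checkpoint_name(1525889631) == "1525889631"
# the example timestamp decodes to a plausible date for this thread:
assert time.strftime("%Y-%m-%d", time.gmtime(1525889631)) == "2018-05-09"
```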
+ </dd> + <dt><code>description</code></dt> + <dd>A human-readable description of the checkpoint. If the + description is omitted when initially creating the checkpoint, + then this field will be empty. + </dd> + <dt><code>disks</code></dt> + <dd>On input, this is an optional listing of specific + instructions for disk checkpoints; it is needed when making a
What does specific instructions mean in this context?
+ checkpoint on only a subset of the disks associated with a + domain (in particular, since qemu checkpoints require qcow2 + disks, this element may be needed on input for excluding guest + disks that are not in qcow2 format); if omitted on input, then
Ah, well now you know why I added the note at the top... This is buried inside this description.
+ all disks participate in the checkpoint. On output, this is
I have visions of sugar plum fairies and QE teams using hot (un)plug to make life absolutely miserable in this regard. Setting aside hot [un]plug, this makes it possible to define a subset of domain qcow2 disks to participate in the checkpoint/backup, fair statement? If not provided, then all qcow2 disks participate.
+ fully populated to show the state of each disk in the + checkpoint. This element has a list of <code>disk</code> + sub-elements, describing anywhere from one to all of the disks
s/the/the qcow2 formatted/
+ associated with the domain. + <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the checkpoint properties of + a specific disk. The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the
I honestly think you should just pick one (the target dev) and be done, but I assume there's a reason for both which I don't understand yet.
+ checkpoint. The attribute <code>checkpoint</code> is + optional on input; possible values are <code>no</code> + when the disk does not participate in this checkpoint; + or <code>bitmap</code> if the disk will track all changes + since the creation of this checkpoint via a bitmap, in + which case another attribute <code>bitmap</code> will be + the name of the tracking bitmap (defaulting to the + checkpoint name).
So now I'm confused again. Initially it seemed as though the <disk> sub-elements would be the only ones participating, but now it feels like it's possible to pick and choose to what level something participates. If nothing is provided, can someone change the 'checkpoint' attribute to 'no' (assuming the default is 'bitmap')? This is all rather mind boggling - why can stuff never be simple /-|
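One way to read the documented defaults (an interpretation of the patch text, not confirmed libvirt behavior): a disk omitted from the `<disks>` list participates with `checkpoint='bitmap'`, an explicit `checkpoint='no'` opts out, and a bitmap name defaults to the checkpoint name. A Python sketch of that resolution logic, reproducing the example output:

```python
def resolve_disks(all_disks, requested, checkpoint_name):
    """Expand a partial <disks> request into the fully populated output
    form, applying the defaults described in the patch text (assumed)."""
    requested = {d["name"]: d for d in requested}
    resolved = []
    for name in all_disks:
        d = requested.get(name, {"name": name})
        mode = d.get("checkpoint", "bitmap")   # default: participate
        if mode == "no":
            resolved.append({"name": name, "checkpoint": "no"})
        else:
            resolved.append({"name": name, "checkpoint": "bitmap",
                             "bitmap": d.get("bitmap", checkpoint_name)})
    return resolved

out = resolve_disks(["vda", "vdb"],
                    [{"name": "vdb", "checkpoint": "no"}],
                    "1525889631")
assert out == [
    {"name": "vda", "checkpoint": "bitmap", "bitmap": "1525889631"},
    {"name": "vdb", "checkpoint": "no"},
]
```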
+ </dd> + </dl> + </dd> + <dt><code>creationTime</code></dt> + <dd>The time this checkpoint was created. The time is specified + in seconds since the Epoch, UTC (i.e. Unix time). Readonly. + </dd> + <dt><code>parent</code></dt> + <dd>The parent of this checkpoint. If present, this element + contains exactly one child element, name. This specifies the + name of the parent checkpoint of this one, and is used to + represent trees of checkpoints. Readonly. + </dd> + <dt><code>domain</code></dt> + <dd>The inactive <a href="formatdomain.html">domain + configuration</a> at the time the checkpoint was created. + Readonly.
The whole initial inactive/config XML is loaded here? Or just certain fields. The example XML output has just a UUID, but the data here has a lot more and the RNG shows everything. Did I read something wrong earlier regarding saving domain state? I thought the checkpoint/backup code was purely for the domain disks that are of type qcow2. I can see saving the UUID since that cannot change (unlike the name)... But why does the entire "original" config need to be added here?
+ </dd> + </dl> + + <h2><a id="BackupAttributes">Backup XML</a></h2> + + <p> + Creating a backup, whether full or incremental, is done + via <code>virDomainBackupBegin()</code>, which takes an XML + description of the actions to perform. There are two general + modes for backups: a push mode (where the hypervisor writes out + the data to the destination file, which may be local or remote), + and a pull mode (where the hypervisor creates an NBD server that + a third-party client can then read as needed, and which requires + the use of temporary storage, typically local, until the backup + is complete). + </p>
I think the modes should be described in own bullets rather than part of a long-ish sentence using parentheses to help define what is meant. Perhaps even more verbiage regarding general usage expectations of each model.
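Structurally, the two modes differ mainly in whether a `<server>` element is present. A hedged Python sketch that builds the proposed `<domainbackup>` XML (element and attribute names follow this patch's schema, which is a proposal and not yet a released API):

```python
import xml.etree.ElementTree as ET

def backup_xml(mode="push", incremental=None, socket=None):
    """Build a <domainbackup> request per the proposed schema."""
    top = ET.Element("domainbackup", mode=mode)
    if incremental:
        # name of an existing checkpoint -> incremental backup
        ET.SubElement(top, "incremental").text = incremental
    if mode == "pull":
        # pull mode needs an NBD server for the third-party client
        ET.SubElement(top, "server", transport="unix", socket=socket)
    return ET.tostring(top, encoding="unicode")

xml = backup_xml(mode="pull", incremental="1525889631",
                 socket="/path/to/server")
assert xml.startswith('<domainbackup mode="pull"')
assert "<incremental>1525889631</incremental>" in xml
assert 'transport="unix"' in xml
```

A push-mode request would simply omit the `<server>` element and instead carry per-disk `<target>` sub-elements.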
+ <p> + The instructions for beginning a backup job are provided as + attributes and elements of the + top-level <code>domainbackup</code> element. This element + includes an optional attribute <code>mode</code> which can be + either "push" or "pull" (default push). Where elements are
s/Where/Although ???
+ optional on creation, <code>virDomainBackupGetXMLDesc()</code> + can be used to see the actual values selected (for example, + learning which port the NBD server is using in the pull model,
s/model,/model/
+ or what file names libvirt generated when none were supplied). + The following child elements are supported: + </p> + <dl>
Since all elements are optional, probably don't have to note it again for each description...
+ <dt><code>incremental</code></dt> + <dd>Optional. If this element is present, it must name an + existing checkpoint of the domain, which will be used to make
s/domain,/domain/
+ this backup an incremental one (in the push model, only + changes since the checkpoint are written to the destination; + in the pull model, the NBD server uses the + NBD_OPT_SET_META_CONTEXT extension to advertise to the client + which portions of the export contain changes since the + checkpoint). If omitted, a full backup is performed.
The wording in the parentheses doesn't make complete sense yet. Perhaps it's better to have each as a bullet describing the action of the element for the backup "mode" value.
+ </dd> + <dt><code>server</code></dt> + <dd>Present only for a pull mode backup. Contains the same + attributes as the <code>protocol</code> element of a disk + attached via NBD in the domain (such as transport, socket, + name, port, or tls), necessary to set up an NBD server that + exposes the content of each disk at the time the backup
contents ?
+ started. + </dd>
This hunk above uses <tab>'s not spaces (syntax-check)
+ <dt><code>disks</code></dt> + <dd>This is an optional listing of instructions for disks + participating in the backup (if omitted, all disks + participate, and libvirt attempts to generate filenames by + appending the current timestamp as a suffix). When provided on + input, disks omitted from the list do not participate in the + backup. On output, the list is present but contains only the + disks participating in the backup job. This element has a + list of <code>disk</code> sub-elements, describing anywhere + from one to all of the disks associated with the domain.
again, qcow2 formatted disks, right? Is that true for both modes of backup?
+ <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the checkpoint properties of + a specific disk. The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the + checkpoint. The optional attribute <code>type</code> can + be <code>file</code>, <code>block</code>, + or <code>networks</code>, similar to a disk declaration
network
+ for a domain, controls what additional sub-elements are
s/,/ and/
+ needed to describe the destination (such + as <code>protocol</code> for a network destination). In + push mode backups, the primary subelement
sub-element
+ is <code>target</code>; in pull mode, the primary sublement
sub-element
+ is <code>scratch</code>; but either way, + the primary sub-element describes the file name to be used + during the backup operation, similar to + the <code>source</code> sub-element of a domain disk. An + optional sublement <code>driver</code> can also be used to
sub-element
+ specify a destination format different from qcow2. + </dd> + </dl> + </dd> + </dl>
Well, suffice to say, I'm lost. I guess I at least am getting an idea why no consensus has been reached yet.
+ + <h2><a id="example">Examples</a></h2> + + <p>Using this XML to create a checkpoint of just vda on a qemu + domain with two disks and a prior checkpoint:</p> + <pre> +<domaincheckpoint> + <description>Completion of updates after OS install</description> + <disks> + <disk name='vda' checkpoint='bitmap'/> + <disk name='vdb' checkpoint='no'/> + </disks> +</domaincheckpoint></pre> + + <p>will result in XML similar to this from + <code>virDomainCheckpointGetXMLDesc()</code>:</p> + <pre> +<domaincheckpoint> + <name>1525889631</name> + <description>Completion of updates after OS install</description> + <creationTime>1525889631</creationTime> + <parent> + <name>1525111885</name> + </parent> + <disks> + <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/> + <disk name='vdb' checkpoint='no'/> + </disks> + <domain> + <name>fedora</name> + <uuid>93a5c045-6457-2c09-e56c-927cdf34e178</uuid> + <memory>1048576</memory> + ... + <devices> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2'/> + <source file='/path/to/file1'/> + <target dev='vda' bus='virtio'/> + </disk> + <disk type='file' device='disk' snapshot='external'> + <driver name='qemu' type='raw'/> + <source file='/path/to/file2'/> + <target dev='vdb' bus='virtio'/> + </disk> + ... + </devices> + </domain> +</domaincheckpoint></pre> + + <p>With that checkpoint created, the qcow2 image is now tracking + all changes that occur in the image since the checkpoint via + the persistent bitmap named <code>1525889631</code>. 
Now, we + can make a subsequent call + to <code>virDomainBackupBegin()</code> to perform an incremental + backup of just this data, using the following XML to start a + pull model NBD export of the vda disk: + </p> + <pre> +<domainbackup mode="pull"> + <incremental>1525889631</incremental> + <server transport="unix" socket="/path/to/server"/> + <disks> + <disk name='vda' type='file'> + <scratch file='/path/to/file1.scratch'/> + </disk> + </disks> +</domainbackup> + </pre> + </body> +</html> diff --git a/docs/schemas/domaincheckpoint.rng b/docs/schemas/domaincheckpoint.rng new file mode 100644 index 0000000000..1e2c16e035 --- /dev/null +++ b/docs/schemas/domaincheckpoint.rng @@ -0,0 +1,89 @@ +<?xml version="1.0"?> +<!-- A Relax NG schema for the libvirt domain checkpoint properties XML format --> +<grammar xmlns="http://relaxng.org/ns/structure/1.0"> + <start> + <ref name='domaincheckpoint'/> + </start> + + <include href='domaincommon.rng'/> + + <define name='domaincheckpoint'> + <element name='domaincheckpoint'> + <interleave> + <optional> + <element name='name'> + <text/> + </element> + </optional> + <optional> + <element name='description'> + <text/> + </element> + </optional> + <optional> + <element name='creationTime'> + <text/> + </element> + </optional> + <optional> + <element name='disks'> + <zeroOrMore> + <ref name='diskcheckpoint'/> + </zeroOrMore> + </element> + </optional> + <optional> + <choice> + <element name='domain'> + <element name='uuid'> + <ref name="UUID"/> + </element> + </element> + <!-- Nested grammar ensures that any of our overrides of + storagecommon/domaincommon defines do not conflict + with any domain.rng overrides. --> + <grammar> + <include href='domain.rng'/> + </grammar> + </choice> + </optional> + <optional> + <element name='parent'> + <element name='name'> + <text/> + </element> + </element> + </optional> + </interleave> + </element> + </define>
Since everything is optional, does <optional> have to be supplied for each element or can there be one <optional> before the <interleave>? I see no <domainbackup> definition yet either - I assume it's coming. John
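For consumers of this XML, pulling the tracking-bitmap names back out of the `virDomainCheckpointGetXMLDesc()` output is straightforward. A Python sketch using the example output shown earlier in this patch (client-side parsing, not libvirt code):

```python
import xml.etree.ElementTree as ET

CHECKPOINT_XML = """
<domaincheckpoint>
  <name>1525889631</name>
  <creationTime>1525889631</creationTime>
  <parent><name>1525111885</name></parent>
  <disks>
    <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/>
    <disk name='vdb' checkpoint='no'/>
  </disks>
</domaincheckpoint>
"""

def bitmaps_by_disk(xmldesc):
    """Map each participating disk to its tracking bitmap name."""
    root = ET.fromstring(xmldesc)
    return {d.get("name"): d.get("bitmap")
            for d in root.findall("./disks/disk")
            if d.get("checkpoint") == "bitmap"}

assert bitmaps_by_disk(CHECKPOINT_XML) == {"vda": "1525889631"}

# the readonly <parent> element links the checkpoint into the hierarchy
parent = ET.fromstring(CHECKPOINT_XML).findtext("./parent/name")
assert parent == "1525111885"
```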
+ + <define name='diskcheckpoint'> + <element name='disk'> + <attribute name='name'> + <choice> + <ref name='diskTarget'/> + <ref name='absFilePath'/> + </choice> + </attribute> + <choice> + <attribute name='checkpoint'> + <value>no</value> + </attribute> + <group> + <optional> + <attribute name='checkpoint'> + <value>bitmap</value> + </attribute> + </optional> + <optional> + <attribute name='bitmap'> + <text/> + </attribute> + </optional> + </group> + </choice> + </element> + </define> + +</grammar> diff --git a/libvirt.spec.in b/libvirt.spec.in index ace05820aa..50bd79a7d7 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -2044,6 +2044,7 @@ exit 0 %{_datadir}/libvirt/schemas/cputypes.rng %{_datadir}/libvirt/schemas/domain.rng %{_datadir}/libvirt/schemas/domaincaps.rng +%{_datadir}/libvirt/schemas/domaincheckpoint.rng %{_datadir}/libvirt/schemas/domaincommon.rng %{_datadir}/libvirt/schemas/domainsnapshot.rng %{_datadir}/libvirt/schemas/interface.rng diff --git a/mingw-libvirt.spec.in b/mingw-libvirt.spec.in index 917d2143d8..6912527cf7 100644 --- a/mingw-libvirt.spec.in +++ b/mingw-libvirt.spec.in @@ -241,6 +241,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw32_datadir}/libvirt/schemas/cputypes.rng %{mingw32_datadir}/libvirt/schemas/domain.rng %{mingw32_datadir}/libvirt/schemas/domaincaps.rng +%{mingw32_datadir}/libvirt/schemas/domaincheckpoint.rng %{mingw32_datadir}/libvirt/schemas/domaincommon.rng %{mingw32_datadir}/libvirt/schemas/domainsnapshot.rng %{mingw32_datadir}/libvirt/schemas/interface.rng @@ -326,6 +327,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw64_datadir}/libvirt/schemas/cputypes.rng %{mingw64_datadir}/libvirt/schemas/domain.rng %{mingw64_datadir}/libvirt/schemas/domaincaps.rng +%{mingw64_datadir}/libvirt/schemas/domaincheckpoint.rng %{mingw64_datadir}/libvirt/schemas/domaincommon.rng %{mingw64_datadir}/libvirt/schemas/domainsnapshot.rng %{mingw64_datadir}/libvirt/schemas/interface.rng
diff --git a/tests/domaincheckpointxml2xmlin/empty.xml b/tests/domaincheckpointxml2xmlin/empty.xml new file mode 100644 index 0000000000..dc36449142 --- /dev/null +++ b/tests/domaincheckpointxml2xmlin/empty.xml @@ -0,0 +1 @@ +<domaincheckpoint/> diff --git a/tests/domaincheckpointxml2xmlout/empty.xml b/tests/domaincheckpointxml2xmlout/empty.xml new file mode 100644 index 0000000000..a26c7caab0 --- /dev/null +++ b/tests/domaincheckpointxml2xmlout/empty.xml @@ -0,0 +1,10 @@ +<domaincheckpoint> + <name>1525889631</name> + <creationTime>1525889631</creationTime> + <disks> + <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/> + </disks> + <domain> + <uuid>9d37b878-a7cc-9f9a-b78f-49b3abad25a8</uuid> + </domain> +</domaincheckpoint> diff --git a/tests/virschematest.c b/tests/virschematest.c index 2d35833919..b866db4326 100644 --- a/tests/virschematest.c +++ b/tests/virschematest.c @@ -223,6 +223,8 @@ mymain(void) "genericxml2xmloutdata", "xlconfigdata", "libxlxml2domconfigdata", "qemuhotplugtestdomains"); DO_TEST_DIR("domaincaps.rng", "domaincapsschemadata"); + DO_TEST_DIR("domaincheckpoint.rng", "domaincheckpointxml2xmlin", + "domaincheckpointxml2xmlout"); DO_TEST_DIR("domainsnapshot.rng", "domainsnapshotxml2xmlin", "domainsnapshotxml2xmlout"); DO_TEST_DIR("interface.rng", "interfaceschemadata");

Prepare for new checkpoint and backup APIs by describing the XML that will represent a checkpoint and backup. The checkpoint XML is modeled heavily after virDomainSnapshotPtr, since both represent a point in time of the guest (however, a snapshot exists with the intent to roll back to that point, while a checkpoint exists to facilitate later incremental backups). Meanwhile, the backup XML has enough information to represent both the push model (the hypervisor writes the backup file to a location of the user's choice) and the pull model (the hypervisor needs local temporary storage, and also creates an NBD server that the user can use to read the backup via a third-party client). Add testsuite coverage for some minimal uses of both XML. Ultimately, I'd love for push model backups to target a network driver rather than just a local file or block device; but doing that got hairy (while <domain> uses <source> as the description of a host or network resource, I picked <target> as the description of a push model backup target [defaults to qcow2 but can also be raw or any other format], and <scratch> as the description of a pull model backup scratch space [must be qcow2]). The ideal refactoring would be a way to parameterize RNG to accept <disk type='FOO'>...</disk> so that the name of the sub-element can be <source> for domain, or <target> or <scratch> as needed for backups. Future patches may improve this area of code. Signed-off-by: Eric Blake <eblake@redhat.com> --- An updated version of the XML descriptions, while I still work on piecing together the pieces needed to demo the XML in action. This version is still missing changes that were pointed out by John, but it also supplies the <domainbackup> XML that was previously a black box.
docs/docs.html.in | 3 +- docs/domainstatecapture.html.in | 4 +- docs/format.html.in | 1 + docs/formatcheckpoint.html.in | 273 +++++++++++++++++++++++++++ docs/index.html.in | 3 +- docs/schemas/domainbackup.rng | 180 ++++++++++++++++++ docs/schemas/domaincheckpoint.rng | 89 +++++++++ libvirt.spec.in | 2 + mingw-libvirt.spec.in | 4 + tests/domainbackupxml2xmlin/backup-pull.xml | 9 + tests/domainbackupxml2xmlin/backup-push.xml | 9 + tests/domainbackupxml2xmlin/empty.xml | 1 + tests/domainbackupxml2xmlout/backup-pull.xml | 9 + tests/domainbackupxml2xmlout/backup-push.xml | 9 + tests/domainbackupxml2xmlout/empty.xml | 7 + tests/domaincheckpointxml2xmlin/empty.xml | 1 + tests/domaincheckpointxml2xmlin/sample.xml | 7 + tests/domaincheckpointxml2xmlout/empty.xml | 10 + tests/domaincheckpointxml2xmlout/sample.xml | 16 ++ tests/virschematest.c | 4 + 20 files changed, 637 insertions(+), 4 deletions(-) create mode 100644 docs/formatcheckpoint.html.in create mode 100644 docs/schemas/domainbackup.rng create mode 100644 docs/schemas/domaincheckpoint.rng create mode 100644 tests/domainbackupxml2xmlin/backup-pull.xml create mode 100644 tests/domainbackupxml2xmlin/backup-push.xml create mode 100644 tests/domainbackupxml2xmlin/empty.xml create mode 100644 tests/domainbackupxml2xmlout/backup-pull.xml create mode 100644 tests/domainbackupxml2xmlout/backup-push.xml create mode 100644 tests/domainbackupxml2xmlout/empty.xml create mode 100644 tests/domaincheckpointxml2xmlin/empty.xml create mode 100644 tests/domaincheckpointxml2xmlin/sample.xml create mode 100644 tests/domaincheckpointxml2xmlout/empty.xml create mode 100644 tests/domaincheckpointxml2xmlout/sample.xml diff --git a/docs/docs.html.in b/docs/docs.html.in index 4c46b74980..4914e7dbed 100644 --- a/docs/docs.html.in +++ b/docs/docs.html.in @@ -79,7 +79,8 @@ <a href="formatdomaincaps.html">domain capabilities</a>, <a href="formatnode.html">node devices</a>, <a href="formatsecret.html">secrets</a>, - <a 
href="formatsnapshot.html">snapshots</a></dd> + <a href="formatsnapshot.html">snapshots</a>, + <a href="formatcheckpoint.html">backups and checkpoints</a></dd> <dt><a href="uri.html">URI format</a></dt> <dd>The URI formats used for connecting to libvirt</dd> diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in index 00ab7e8ee1..4de93c87c8 100644 --- a/docs/domainstatecapture.html.in +++ b/docs/domainstatecapture.html.in @@ -154,9 +154,9 @@ time as a new backup, so that the next incremental backup can refer to the incremental state since the checkpoint created during the current backup. Guest state is then actually - captured using <code>virDomainBackupBegin()</code>. <!--See also + captured using <code>virDomainBackupBegin()</code>. See also the <a href="formatcheckpoint.html">XML details</a> used with - this command.--></dd> + this command.</dd> </dl> <h2><a id="examples">Examples</a></h2> diff --git a/docs/format.html.in b/docs/format.html.in index 22b23e3fc7..8c4e15e079 100644 --- a/docs/format.html.in +++ b/docs/format.html.in @@ -24,6 +24,7 @@ <li><a href="formatnode.html">Node devices</a></li> <li><a href="formatsecret.html">Secrets</a></li> <li><a href="formatsnapshot.html">Snapshots</a></li> + <li><a href="formatcheckpoint.html">Backups and checkpoints</a></li> </ul> <h2>Command line validation</h2> diff --git a/docs/formatcheckpoint.html.in b/docs/formatcheckpoint.html.in new file mode 100644 index 0000000000..4d65c8e581 --- /dev/null +++ b/docs/formatcheckpoint.html.in @@ -0,0 +1,273 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html> +<html xmlns="http://www.w3.org/1999/xhtml"> + <body> + <h1>Checkpoint and Backup XML format</h1> + + <ul id="toc"></ul> + + <h2><a id="CheckpointAttributes">Checkpoint XML</a></h2> + + <p> + Domain disk backups, including incremental backups, are one form + of <a href="domainstatecapture.html">domain state capture</a>. 
+ </p> + <p> + Libvirt is able to facilitate incremental backups by tracking + disk checkpoints, or points in time against which it is easy to + compute which portion of the disk has changed. Given a full + backup (a backup created from the creation of the disk to a + given point in time, coupled with the creation of a disk + checkpoint at that time), and an incremental backup (a backup + created from just the dirty portion of the disk between the + first checkpoint and the second backup operation), it is + possible to do an offline reconstruction of the state of the + disk at the time of the second backup, without having to copy as + much data as a second full backup would require. Most disk + checkpoints are created in concert with a backup, + via <code>virDomainBackupBegin()</code>; however, libvirt also + exposes enough support to create disk checkpoints independently + from a backup operation, + via <code>virDomainCheckpointCreateXML()</code>. + </p> + <p> + Attributes of libvirt checkpoints are stored as child elements of + the <code>domaincheckpoint</code> element. At checkpoint creation + time, normally only the <code>name</code>, <code>description</code>, + and <code>disks</code> elements are settable; the rest of the + fields are ignored on creation, and will be filled in by + libvirt for informational purposes + by <code>virDomainCheckpointGetXMLDesc()</code>. However, when + redefining a checkpoint, + with the <code>VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE</code> flag + of <code>virDomainCheckpointCreateXML()</code>, all of the XML + described here is relevant. + </p> + <p> + Checkpoints are maintained in a hierarchy. A domain can have a + current checkpoint, which is the most recent checkpoint compared to + the current state of the domain (although a domain might have + checkpoints without a current checkpoint, if checkpoints have been + deleted in the meantime).
Creating or reverting to a checkpoint + sets that checkpoint as current, and the prior current checkpoint is + the parent of the new checkpoint. Branches in the hierarchy can + be formed by reverting to a checkpoint with a child, then creating + another checkpoint. + </p> + <p> + The top-level <code>domaincheckpoint</code> element may contain + the following elements: + </p> + <dl> + <dt><code>name</code></dt> + <dd>The name for this checkpoint. If the name is specified when + initially creating the checkpoint, then the checkpoint will have + that particular name. If the name is omitted when initially + creating the checkpoint, then libvirt will make up a name for + the checkpoint, based on the time when it was created. + </dd> + <dt><code>description</code></dt> + <dd>A human-readable description of the checkpoint. If the + description is omitted when initially creating the checkpoint, + then this field will be empty. + </dd> + <dt><code>disks</code></dt> + <dd>On input, this is an optional listing of specific + instructions for disk checkpoints; it is needed when making a + checkpoint on only a subset of the disks associated with a + domain (in particular, since qemu checkpoints require qcow2 + disks, this element may be needed on input for excluding guest + disks that are not in qcow2 format); if omitted on input, then + all disks participate in the checkpoint. On output, this is + fully populated to show the state of each disk in the + checkpoint. This element has a list of <code>disk</code> + sub-elements, describing anywhere from one to all of the disks + associated with the domain. + <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the checkpoint properties of + a specific disk. 
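The hierarchy rules in the text above (the prior current checkpoint becomes the parent of a new one; reverting and then creating forms a branch) can be modeled in a few lines. A toy sketch of that bookkeeping, not libvirt's implementation:

```python
class CheckpointTree:
    """Toy model of the checkpoint hierarchy and 'current' pointer."""
    def __init__(self):
        self.parents = {}    # checkpoint name -> parent name (or None)
        self.current = None

    def create(self, name):
        # the prior current checkpoint becomes the parent of the new one
        self.parents[name] = self.current
        self.current = name

    def revert(self, name):
        assert name in self.parents
        self.current = name

t = CheckpointTree()
t.create("A"); t.create("B"); t.create("C")   # linear chain A <- B <- C
t.revert("A")                                  # A is current again
t.create("D")                                  # branch: A now has children B and D
assert t.parents["B"] == "A" and t.parents["D"] == "A"
assert t.current == "D"
```

Deleting whatever checkpoint `current` points at, without reverting elsewhere, is one way a domain ends up with checkpoints but no current checkpoint, per the parenthetical above.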
The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the + checkpoint. The attribute <code>checkpoint</code> is + optional on input; possible values are <code>no</code> + when the disk does not participate in this checkpoint; + or <code>bitmap</code> if the disk will track all changes + since the creation of this checkpoint via a bitmap, in + which case another attribute <code>bitmap</code> will be + the name of the tracking bitmap (defaulting to the + checkpoint name). + </dd> + </dl> + </dd> + <dt><code>creationTime</code></dt> + <dd>The time this checkpoint was created. The time is specified + in seconds since the Epoch, UTC (i.e. Unix time). Readonly. + </dd> + <dt><code>parent</code></dt> + <dd>The parent of this checkpoint. If present, this element + contains exactly one child element, name. This specifies the + name of the parent checkpoint of this one, and is used to + represent trees of checkpoints. Readonly. + </dd> + <dt><code>domain</code></dt> + <dd>The inactive <a href="formatdomain.html">domain + configuration</a> at the time the checkpoint was created. + Readonly. + </dd> + </dl> + + <h2><a id="BackupAttributes">Backup XML</a></h2> + + <p> + Creating a backup, whether full or incremental, is done + via <code>virDomainBackupBegin()</code>, which takes an XML + description of the actions to perform. There are two general + modes for backups: a push mode (where the hypervisor writes out + the data to the destination file, which may be local or remote), + and a pull mode (where the hypervisor creates an NBD server that + a third-party client can then read as needed, and which requires + the use of temporary storage, typically local, until the backup + is complete). 
+ </p> + <p> + The instructions for beginning a backup job are provided as + attributes and elements of the + top-level <code>domainbackup</code> element. This element + includes an optional attribute <code>mode</code> which can be + either "push" or "pull" (default push). Where elements are + optional on creation, <code>virDomainBackupGetXMLDesc()</code> + can be used to see the actual values selected (for example, + learning which port the NBD server is using in the pull model, + or what file names libvirt generated when none were supplied). + The following child elements are supported: + </p> + <dl> + <dt><code>incremental</code></dt> + <dd>Optional. If this element is present, it must name an + existing checkpoint of the domain, which will be used to make + this backup an incremental one (in the push model, only + changes since the checkpoint are written to the destination; + in the pull model, the NBD server uses the + NBD_OPT_SET_META_CONTEXT extension to advertise to the client + which portions of the export contain changes since the + checkpoint). If omitted, a full backup is performed. + </dd> + <dt><code>server</code></dt> + <dd>Present only for a pull mode backup. Contains the same + attributes as the <code>protocol</code> element of a disk + attached via NBD in the domain (such as transport, socket, + name, port, or tls), necessary to set up an NBD server that + exposes the content of each disk at the time the backup + started. + </dd> + <dt><code>disks</code></dt> + <dd>This is an optional listing of instructions for disks + participating in the backup (if omitted, all disks + participate, and libvirt attempts to generate filenames by + appending the current timestamp as a suffix). When provided on + input, disks omitted from the list do not participate in the + backup. On output, the list is present but contains only the + disks participating in the backup job. 
This element has a +          list of <code>disk</code> sub-elements, describing anywhere +          from one to all of the disks associated with the domain. +          <dl> +            <dt><code>disk</code></dt> +            <dd>This sub-element describes the backup properties of +              a specific disk. The attribute <code>name</code> is +              mandatory, and must match either the <code><target +              dev='name'/></code> or an unambiguous <code><source +              file='name'/></code> of one of +              the <a href="formatdomain.html#elementsDisks">disk +              devices</a> specified for the domain at the time of the +              backup. The optional attribute <code>type</code> can +              be <code>file</code>, <code>block</code>, +              or <code>network</code>, similar to a disk declaration +              for a domain, and controls what additional sub-elements are +              needed to describe the destination (such +              as <code>protocol</code> for a network destination). In +              push mode backups, the primary sub-element +              is <code>target</code>; in pull mode, the primary sub-element +              is <code>scratch</code>; but either way, +              the primary sub-element describes the file name to be used +              during the backup operation, similar to +              the <code>source</code> sub-element of a domain disk. An +              optional sub-element <code>driver</code> can also be used to +              specify a destination format different from qcow2.
+ </dd> + </dl> + </dd> + </dl> + + <h2><a id="example">Examples</a></h2> + + <p>Using this XML to create a checkpoint of just vda on a qemu + domain with two disks and a prior checkpoint:</p> + <pre> +<domaincheckpoint> + <description>Completion of updates after OS install</description> + <disks> + <disk name='vda' checkpoint='bitmap'/> + <disk name='vdb' checkpoint='no'/> + </disks> +</domaincheckpoint></pre> + + <p>will result in XML similar to this from + <code>virDomainCheckpointGetXMLDesc()</code>:</p> + <pre> +<domaincheckpoint> + <name>1525889631</name> + <description>Completion of updates after OS install</description> + <creationTime>1525889631</creationTime> + <parent> + <name>1525111885</name> + </parent> + <disks> + <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/> + <disk name='vdb' checkpoint='no'/> + </disks> + <domain type='qemu'> + <name>fedora</name> + <uuid>93a5c045-6457-2c09-e56c-927cdf34e178</uuid> + <memory>1048576</memory> + ... + <devices> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2'/> + <source file='/path/to/file1'/> + <target dev='vda' bus='virtio'/> + </disk> + <disk type='file' device='disk' snapshot='external'> + <driver name='qemu' type='raw'/> + <source file='/path/to/file2'/> + <target dev='vdb' bus='virtio'/> + </disk> + ... + </devices> + </domain> +</domaincheckpoint></pre> + + <p>With that checkpoint created, the qcow2 image is now tracking + all changes that occur in the image since the checkpoint via + the persistent bitmap named <code>1525889631</code>. 
Now, we +      can make a subsequent call +      to <code>virDomainBackupBegin()</code> to perform an incremental +      backup of just this data, using the following XML to start a +      pull model NBD export of the vda disk: +    </p> +    <pre> +<domainbackup mode="pull"> +  <incremental>1525889631</incremental> +  <server transport="unix" socket="/path/to/server"/> +  <disks> +    <disk name='vda' type='file'> +      <scratch file='/path/to/file1.scratch'/> +    </disk> +  </disks> +</domainbackup> +    </pre> +  </body> +</html> diff --git a/docs/index.html.in b/docs/index.html.in index 1f9f448399..6c5d3a6dc3 100644 --- a/docs/index.html.in +++ b/docs/index.html.in @@ -68,7 +68,8 @@ <a href="formatdomaincaps.html">domain capabilities</a>, <a href="formatnode.html">node devices</a>, <a href="formatsecret.html">secrets</a>, - <a href="formatsnapshot.html">snapshots</a></dd> + <a href="formatsnapshot.html">snapshots</a>, + <a href="formatcheckpoint.html">backups and checkpoints</a></dd> <dt><a href="http://wiki.libvirt.org">Wiki</a></dt> <dd>Read further community contributed content</dd> </dl> diff --git a/docs/schemas/domainbackup.rng b/docs/schemas/domainbackup.rng new file mode 100644 index 0000000000..8e6d4b15a2 --- /dev/null +++ b/docs/schemas/domainbackup.rng @@ -0,0 +1,180 @@ +<?xml version="1.0"?> +<!-- A Relax NG schema for the libvirt domain backup properties XML format --> +<grammar xmlns="http://relaxng.org/ns/structure/1.0"> + <start> + <ref name='domainbackup'/> + </start> + + <include href='domaincommon.rng'/> + + <define name='domainbackup'> + <element name='domainbackup'> + <interleave> + <optional> + <element name='incremental'> + <text/> + </element> + </optional> + <choice> + <group> + <optional> + <attribute name='mode'> + <value>push</value> + </attribute> + </optional> + <ref name='backupDisksPush'/> + </group> + <group> + <attribute name='mode'> + <value>pull</value> + </attribute> + <interleave> + <element name='server'> + <choice> + <group> + <optional> + <attribute
name='transport'> + <value>tcp</value> + </attribute> + </optional> + <attribute name='name'> + <choice> + <ref name='dnsName'/> + <ref name='ipAddr'/> + </choice> + </attribute> + <optional> + <attribute name='port'> + <ref name='unsignedInt'/> + </attribute> + </optional> + <!-- add tls? --> + </group> + <group> + <attribute name='transport'> + <value>unix</value> + </attribute> + <attribute name='socket'> + <ref name='absFilePath'/> + </attribute> + </group> + </choice> + </element> + <ref name='backupDisksPull'/> + </interleave> + </group> + </choice> + </interleave> + </element> + </define> + + <define name='backupPushDriver'> + <optional> + <element name='driver'> + <attribute name='type'> + <ref name='storageFormat'/> + </attribute> + </element> + </optional> + </define> + + <define name='backupDisksPush'> + <optional> + <element name='disks'> + <oneOrMore> + <element name='disk'> + <attribute name='name'> + <choice> + <ref name='diskTarget'/> + <ref name='absFilePath'/> + </choice> + </attribute> + <choice> + <!-- FIXME allow push to a network location, by + refactoring 'diskSource' to take element name by a + per-grammar ref --> + <group> + <optional> + <attribute name='type'> + <value>file</value> + </attribute> + </optional> + <interleave> + <optional> + <element name='target'> + <attribute name='file'> + <ref name='absFilePath'/> + </attribute> + </element> + </optional> + <ref name='backupPushDriver'/> + </interleave> + </group> + <group> + <attribute name='type'> + <value>disk</value> + </attribute> + <interleave> + <optional> + <element name='target'> + <attribute name='dev'> + <ref name='absFilePath'/> + </attribute> + </element> + </optional> + <ref name='backupPushDriver'/> + </interleave> + </group> + </choice> + </element> + </oneOrMore> + </element> + </optional> + </define> + + <define name='backupDisksPull'> + <optional> + <element name='disks'> + <oneOrMore> + <element name='disk'> + <attribute name='name'> + <choice> + <ref 
name='diskTarget'/> + <ref name='absFilePath'/> + </choice> + </attribute> + <choice> + <group> + <optional> + <attribute name='type'> + <value>file</value> + </attribute> + </optional> + <optional> + <element name='scratch'> + <attribute name='file'> + <ref name='absFilePath'/> + </attribute> + </element> + </optional> + </group> + <group> + <attribute name='type'> + <value>disk</value> + </attribute> + <optional> + <element name='scratch'> + <attribute name='dev'> + <ref name='absFilePath'/> + </attribute> + </element> + </optional> + </group> + </choice> + </element> + </oneOrMore> + </element> + </optional> + </define> + +</grammar> diff --git a/docs/schemas/domaincheckpoint.rng b/docs/schemas/domaincheckpoint.rng new file mode 100644 index 0000000000..1e2c16e035 --- /dev/null +++ b/docs/schemas/domaincheckpoint.rng @@ -0,0 +1,89 @@ +<?xml version="1.0"?> +<!-- A Relax NG schema for the libvirt domain checkpoint properties XML format --> +<grammar xmlns="http://relaxng.org/ns/structure/1.0"> + <start> + <ref name='domaincheckpoint'/> + </start> + + <include href='domaincommon.rng'/> + + <define name='domaincheckpoint'> + <element name='domaincheckpoint'> + <interleave> + <optional> + <element name='name'> + <text/> + </element> + </optional> + <optional> + <element name='description'> + <text/> + </element> + </optional> + <optional> + <element name='creationTime'> + <text/> + </element> + </optional> + <optional> + <element name='disks'> + <zeroOrMore> + <ref name='diskcheckpoint'/> + </zeroOrMore> + </element> + </optional> + <optional> + <choice> + <element name='domain'> + <element name='uuid'> + <ref name="UUID"/> + </element> + </element> + <!-- Nested grammar ensures that any of our overrides of + storagecommon/domaincommon defines do not conflict + with any domain.rng overrides. 
--> + <grammar> + <include href='domain.rng'/> + </grammar> + </choice> + </optional> + <optional> + <element name='parent'> + <element name='name'> + <text/> + </element> + </element> + </optional> + </interleave> + </element> + </define> + + <define name='diskcheckpoint'> + <element name='disk'> + <attribute name='name'> + <choice> + <ref name='diskTarget'/> + <ref name='absFilePath'/> + </choice> + </attribute> + <choice> + <attribute name='checkpoint'> + <value>no</value> + </attribute> + <group> + <optional> + <attribute name='checkpoint'> + <value>bitmap</value> + </attribute> + </optional> + <optional> + <attribute name='bitmap'> + <text/> + </attribute> + </optional> + </group> + </choice> + </element> + </define> + +</grammar> diff --git a/libvirt.spec.in b/libvirt.spec.in index ace05820aa..e3b3ba19e0 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -2043,7 +2043,9 @@ exit 0 %{_datadir}/libvirt/schemas/capability.rng %{_datadir}/libvirt/schemas/cputypes.rng %{_datadir}/libvirt/schemas/domain.rng +%{_datadir}/libvirt/schemas/domainbackup.rng %{_datadir}/libvirt/schemas/domaincaps.rng +%{_datadir}/libvirt/schemas/domaincheckpoint.rng %{_datadir}/libvirt/schemas/domaincommon.rng %{_datadir}/libvirt/schemas/domainsnapshot.rng %{_datadir}/libvirt/schemas/interface.rng diff --git a/mingw-libvirt.spec.in b/mingw-libvirt.spec.in index 917d2143d8..39d02094f9 100644 --- a/mingw-libvirt.spec.in +++ b/mingw-libvirt.spec.in @@ -240,7 +240,9 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw32_datadir}/libvirt/schemas/capability.rng %{mingw32_datadir}/libvirt/schemas/cputypes.rng %{mingw32_datadir}/libvirt/schemas/domain.rng +%{mingw32_datadir}/libvirt/schemas/domainbackup.rng %{mingw32_datadir}/libvirt/schemas/domaincaps.rng +%{mingw32_datadir}/libvirt/schemas/domaincheckpoint.rng %{mingw32_datadir}/libvirt/schemas/domaincommon.rng %{mingw32_datadir}/libvirt/schemas/domainsnapshot.rng %{mingw32_datadir}/libvirt/schemas/interface.rng @@ -325,7 
+327,9 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw64_datadir}/libvirt/schemas/capability.rng %{mingw64_datadir}/libvirt/schemas/cputypes.rng %{mingw64_datadir}/libvirt/schemas/domain.rng +%{mingw64_datadir}/libvirt/schemas/domainbackup.rng %{mingw64_datadir}/libvirt/schemas/domaincaps.rng +%{mingw64_datadir}/libvirt/schemas/domaincheckpoint.rng %{mingw64_datadir}/libvirt/schemas/domaincommon.rng %{mingw64_datadir}/libvirt/schemas/domainsnapshot.rng %{mingw64_datadir}/libvirt/schemas/interface.rng diff --git a/tests/domainbackupxml2xmlin/backup-pull.xml b/tests/domainbackupxml2xmlin/backup-pull.xml new file mode 100644 index 0000000000..2ce5cd6711 --- /dev/null +++ b/tests/domainbackupxml2xmlin/backup-pull.xml @@ -0,0 +1,9 @@ +<domainbackup mode="pull"> + <incremental>1525889631</incremental> + <server transport='tcp' name='localhost' port='10809'/> + <disks> + <disk name='vda' type='file'> + <scratch file='/path/to/file'/> + </disk> + </disks> +</domainbackup> diff --git a/tests/domainbackupxml2xmlin/backup-push.xml b/tests/domainbackupxml2xmlin/backup-push.xml new file mode 100644 index 0000000000..1b7d3061fd --- /dev/null +++ b/tests/domainbackupxml2xmlin/backup-push.xml @@ -0,0 +1,9 @@ +<domainbackup mode="push"> + <incremental>1525889631</incremental> + <disks> + <disk name='vda' type='file'> + <driver type='raw'/> + <target file='/path/to/file'/> + </disk> + </disks> +</domainbackup> diff --git a/tests/domainbackupxml2xmlin/empty.xml b/tests/domainbackupxml2xmlin/empty.xml new file mode 100644 index 0000000000..7ed511f97b --- /dev/null +++ b/tests/domainbackupxml2xmlin/empty.xml @@ -0,0 +1 @@ +<domainbackup/> diff --git a/tests/domainbackupxml2xmlout/backup-pull.xml b/tests/domainbackupxml2xmlout/backup-pull.xml new file mode 100644 index 0000000000..2ce5cd6711 --- /dev/null +++ b/tests/domainbackupxml2xmlout/backup-pull.xml @@ -0,0 +1,9 @@ +<domainbackup mode="pull"> + <incremental>1525889631</incremental> + <server 
transport='tcp' name='localhost' port='10809'/> + <disks> + <disk name='vda' type='file'> + <scratch file='/path/to/file'/> + </disk> + </disks> +</domainbackup> diff --git a/tests/domainbackupxml2xmlout/backup-push.xml b/tests/domainbackupxml2xmlout/backup-push.xml new file mode 100644 index 0000000000..1b7d3061fd --- /dev/null +++ b/tests/domainbackupxml2xmlout/backup-push.xml @@ -0,0 +1,9 @@ +<domainbackup mode="push"> + <incremental>1525889631</incremental> + <disks> + <disk name='vda' type='file'> + <driver type='raw'/> + <target file='/path/to/file'/> + </disk> + </disks> +</domainbackup> diff --git a/tests/domainbackupxml2xmlout/empty.xml b/tests/domainbackupxml2xmlout/empty.xml new file mode 100644 index 0000000000..13600fbb1c --- /dev/null +++ b/tests/domainbackupxml2xmlout/empty.xml @@ -0,0 +1,7 @@ +<domainbackup mode="push"> + <disks> + <disk name="vda" type="file"> + <target file="/path/to/file1.copy"/> + </disk> + </disks> +</domainbackup> diff --git a/tests/domaincheckpointxml2xmlin/empty.xml b/tests/domaincheckpointxml2xmlin/empty.xml new file mode 100644 index 0000000000..dc36449142 --- /dev/null +++ b/tests/domaincheckpointxml2xmlin/empty.xml @@ -0,0 +1 @@ +<domaincheckpoint/> diff --git a/tests/domaincheckpointxml2xmlin/sample.xml b/tests/domaincheckpointxml2xmlin/sample.xml new file mode 100644 index 0000000000..70ed964e1e --- /dev/null +++ b/tests/domaincheckpointxml2xmlin/sample.xml @@ -0,0 +1,7 @@ +<domaincheckpoint> + <description>Completion of updates after OS install</description> + <disks> + <disk name='vda' checkpoint='bitmap'/> + <disk name='vdb' checkpoint='no'/> + </disks> +</domaincheckpoint> diff --git a/tests/domaincheckpointxml2xmlout/empty.xml b/tests/domaincheckpointxml2xmlout/empty.xml new file mode 100644 index 0000000000..a26c7caab0 --- /dev/null +++ b/tests/domaincheckpointxml2xmlout/empty.xml @@ -0,0 +1,10 @@ +<domaincheckpoint> + <name>1525889631</name> + <creationTime>1525889631</creationTime> + <disks> + <disk name='vda' 
checkpoint='bitmap' bitmap='1525889631'/> + </disks> + <domain> + <uuid>9d37b878-a7cc-9f9a-b78f-49b3abad25a8</uuid> + </domain> +</domaincheckpoint> diff --git a/tests/domaincheckpointxml2xmlout/sample.xml b/tests/domaincheckpointxml2xmlout/sample.xml new file mode 100644 index 0000000000..559b29c8c1 --- /dev/null +++ b/tests/domaincheckpointxml2xmlout/sample.xml @@ -0,0 +1,16 @@ +<domaincheckpoint> + <name>1525889631</name> + <description>Completion of updates after OS install</description> + <creationTime>1525889631</creationTime> + <parent> + <name>1525111885</name> + </parent> + <disks> + <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/> + <disk name='vdb' checkpoint='no'/> + </disks> + <domain type='qemu'> + <name>fedora</name> + <uuid>93a5c045-6457-2c09-e56c-927cdf34e178</uuid> + </domain> +</domaincheckpoint> diff --git a/tests/virschematest.c b/tests/virschematest.c index aa65a434ff..545301204a 100644 --- a/tests/virschematest.c +++ b/tests/virschematest.c @@ -222,7 +222,11 @@ mymain(void) "lxcxml2xmloutdata", "bhyvexml2argvdata", "genericxml2xmlindata", "genericxml2xmloutdata", "xlconfigdata", "libxlxml2domconfigdata", "qemuhotplugtestdomains"); + DO_TEST_DIR("domainbackup.rng", "domainbackupxml2xmlin", + "domainbackupxml2xmlout"); DO_TEST_DIR("domaincaps.rng", "domaincapsschemadata"); + DO_TEST_DIR("domaincheckpoint.rng", "domaincheckpointxml2xmlin", + "domaincheckpointxml2xmlout"); DO_TEST_DIR("domainsnapshot.rng", "domainsnapshotxml2xmlin", "domainsnapshotxml2xmlout"); DO_TEST_DIR("interface.rng", "interfaceschemadata"); -- 2.14.4

On Wed, Jun 13, 2018 at 7:42 PM Eric Blake <eblake@redhat.com> wrote:
Prepare for new checkpoint and backup APIs by describing the XML that will represent a checkpoint. This is modeled heavily after the XML for virDomainSnapshotPtr, since both represent a point in time of the guest. But while a snapshot exists with the intent of rolling back to that state, a checkpoint instead makes it possible to create an incremental backup at a later time.
Add testsuite coverage of a minimal use of the XML.
Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/docs.html.in | 3 +- docs/domainstatecapture.html.in | 4 +- docs/formatcheckpoint.html.in | 273 +++++++++++++++++++++++++++++ docs/schemas/domaincheckpoint.rng | 89 ++++++++++ libvirt.spec.in | 1 + mingw-libvirt.spec.in | 2 + tests/domaincheckpointxml2xmlin/empty.xml | 1 + tests/domaincheckpointxml2xmlout/empty.xml | 10 ++ tests/virschematest.c | 2 + 9 files changed, 382 insertions(+), 3 deletions(-) create mode 100644 docs/formatcheckpoint.html.in create mode 100644 docs/schemas/domaincheckpoint.rng create mode 100644 tests/domaincheckpointxml2xmlin/empty.xml create mode 100644 tests/domaincheckpointxml2xmlout/empty.xml
diff --git a/docs/docs.html.in b/docs/docs.html.in index 4c46b74980..11dfd27ba6 100644 --- a/docs/docs.html.in +++ b/docs/docs.html.in @@ -79,7 +79,8 @@ <a href="formatdomaincaps.html">domain capabilities</a>, <a href="formatnode.html">node devices</a>, <a href="formatsecret.html">secrets</a>, - <a href="formatsnapshot.html">snapshots</a></dd> + <a href="formatsnapshot.html">snapshots</a>, + <a href="formatcheckpoint.html">checkpoints</a></dd>
<dt><a href="uri.html">URI format</a></dt> <dd>The URI formats used for connecting to libvirt</dd> diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in index 00ab7e8ee1..4de93c87c8 100644 --- a/docs/domainstatecapture.html.in +++ b/docs/domainstatecapture.html.in @@ -154,9 +154,9 @@ time as a new backup, so that the next incremental backup can refer to the incremental state since the checkpoint created during the current backup. Guest state is then actually - captured using <code>virDomainBackupBegin()</code>. <!--See also + captured using <code>virDomainBackupBegin()</code>. See also the <a href="formatcheckpoint.html">XML details</a> used with - this command.--></dd> + this command.</dd> </dl>
<h2><a id="examples">Examples</a></h2> diff --git a/docs/formatcheckpoint.html.in b/docs/formatcheckpoint.html.in new file mode 100644 index 0000000000..34507a9f68 --- /dev/null +++ b/docs/formatcheckpoint.html.in @@ -0,0 +1,273 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html> +<html xmlns="http://www.w3.org/1999/xhtml"> + <body> + <h1>Checkpoint and Backup XML format</h1> + + <ul id="toc"></ul> + + <h2><a id="CheckpointAttributes">Checkpoint XML</a></h2>
id=CheckpointXML?
+ + <p> + Domain disk backups, including incremental backups, are one form + of <a href="domainstatecapture.html">domain state capture</a>. + </p> + <p> + Libvirt is able to facilitate incremental backups by tracking + disk checkpoints, or points in time against which it is easy to + compute which portion of the disk has changed. Given a full + backup (a backup created from the creation of the disk to a + given point in time, coupled with the creation of a disk + checkpoint at that time),
Not clear.
and an incremental backup (a backup + created from just the dirty portion of the disk between the + first checkpoint and the second backup operation),
Also not clear.
it is + possible to do an offline reconstruction of the state of the + disk at the time of the second backup, without having to copy as + much data as a second full backup would require. Most disk + checkpoints are created in concert with a backup, + via <code>virDomainBackupBegin()</code>; however, libvirt also + exposes enough support to create disk checkpoints independently + from a backup operation, + via <code>virDomainCheckpointCreateXML()</code>.
Thanks for the extra context.
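The full-plus-incremental reconstruction the quoted paragraph describes can be sketched with a toy model (plain Python, not libvirt code; the cluster granularity and helper names are made up for illustration): a full backup is a complete copy taken when the checkpoint is created, an incremental backup stores only the clusters the checkpoint's bitmap marked dirty, and overlaying the two rebuilds the disk as it was at the time of the second backup.

```python
# Toy model of full + incremental backup reconstruction.
# A "disk" is a list of clusters; the incremental backup stores only
# the clusters that changed after the checkpoint, keyed by index.

def take_full_backup(disk):
    return list(disk)  # complete copy at checkpoint-creation time

def take_incremental_backup(disk, dirty):
    # dirty: cluster indices recorded by the checkpoint's bitmap
    return {i: disk[i] for i in dirty}

def reconstruct(full, incremental):
    disk = list(full)
    for i, data in incremental.items():
        disk[i] = data  # overlay dirty clusters onto the full copy
    return disk

disk = ["a", "b", "c", "d"]
full = take_full_backup(disk)                 # backup 1 + checkpoint
disk[1] = "B"; disk[3] = "D"                  # guest writes; bitmap marks 1, 3
incr = take_incremental_backup(disk, {1, 3})  # backup 2, dirty clusters only
assert reconstruct(full, incr) == disk        # state at time of backup 2
```

The point is that the second backup transfers only the dirty clusters, yet the reconstructed image matches what a second full backup would have captured.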
+ </p> + <p> + Attributes of libvirt checkpoints are stored as child elements of + the <code>domaincheckpoint</code> element. At checkpoint creation + time, normally only the <code>name</code>, <code>description</code>, + and <code>disks</code> elements are settable; the rest of the + fields are ignored on creation, and will be filled in by + libvirt for informational purposes
So the user is responsible for creating checkpoint names? Do we use the same names in qcow2?
+ by <code>virDomainCheckpointGetXMLDesc()</code>. However, when + redefining a checkpoint, + with the <code>VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE</code> flag + of <code>virDomainCheckpointCreateXML()</code>, all of the XML + described here is relevant. + </p> + <p> + Checkpoints are maintained in a hierarchy. A domain can have a + current checkpoint, which is the most recent checkpoint compared to + the current state of the domain (although a domain might have + checkpoints without a current checkpoint, if checkpoints have been + deleted in the meantime). Creating or reverting to a checkpoint + sets that checkpoint as current, and the prior current checkpoint is + the parent of the new checkpoint. Branches in the hierarchy can + be formed by reverting to a checkpoint with a child, then creating + another checkpoint.
This seems too complex. Why do we need arbitrary trees of checkpoints? What is the meaning of reverting to a checkpoint?
+ </p> + <p> + The top-level <code>domaincheckpoint</code> element may contain + the following elements: + </p> + <dl> + <dt><code>name</code></dt> + <dd>The name for this checkpoint. If the name is specified when + initially creating the checkpoint, then the checkpoint will have + that particular name. If the name is omitted when initially + creating the checkpoint, then libvirt will make up a name for + the checkpoint, based on the time when it was created. + </dd>
Why not simplify and require the user to provide a name?
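For what it's worth, the default the quoted text describes (and the 1525889631 visible in the examples further down) appears to be simply the creation time in seconds since the Epoch. A hypothetical sketch of that fallback (`default_checkpoint_name` is my name, not a libvirt function):

```python
import time

def default_checkpoint_name(now=None):
    # Hypothetical helper mimicking the described fallback: when no
    # <name> is given, name the checkpoint after its creation time
    # (seconds since the Epoch, UTC), matching <creationTime>.
    if now is None:
        now = time.time()
    return str(int(now))

# The example checkpoint in this patch was created at Unix time 1525889631:
assert default_checkpoint_name(1525889631.0) == "1525889631"
```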
+ <dt><code>description</code></dt> + <dd>A human-readable description of the checkpoint. If the + description is omitted when initially creating the checkpoint, + then this field will be empty. + </dd> + <dt><code>disks</code></dt> + <dd>On input, this is an optional listing of specific + instructions for disk checkpoints; it is needed when making a + checkpoint on only a subset of the disks associated with a + domain (in particular, since qemu checkpoints require qcow2 + disks, this element may be needed on input for excluding guest + disks that are not in qcow2 format); if omitted on input, then + all disks participate in the checkpoint. On output, this is + fully populated to show the state of each disk in the + checkpoint. This element has a list of <code>disk</code> + sub-elements, describing anywhere from one to all of the disks + associated with the domain.
Why not always specify the disks?
+ <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the checkpoint properties of + a specific disk. The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the + checkpoint. The attribute <code>checkpoint</code> is + optional on input; possible values are <code>no</code> + when the disk does not participate in this checkpoint; + or <code>bitmap</code> if the disk will track all changes + since the creation of this checkpoint via a bitmap, in + which case another attribute <code>bitmap</code> will be + the name of the tracking bitmap (defaulting to the + checkpoint name).
Seems too complicated. Why do we need to support a checkpoint referencing a bitmap with a different name? Instead we can have a list of disks that will participate in the checkpoint. Anything not specified will not participate in the checkpoint. The name of the checkpoint is always the name of the bitmap.
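To illustrate the defaulting rules the quoted text states (a participating disk defaults to checkpoint='bitmap', and the bitmap attribute defaults to the checkpoint name), here is a rough sketch of how the output form of a <disk> element would be populated; `populate_disk` is a made-up helper under those stated semantics, not libvirt code:

```python
import xml.etree.ElementTree as ET

def populate_disk(disk_xml, checkpoint_name):
    # Sketch of the defaulting rules described in the patch:
    # checkpoint= defaults to 'bitmap', and bitmap= defaults to the
    # checkpoint name when the disk participates.
    disk = ET.fromstring(disk_xml)
    mode = disk.get('checkpoint', 'bitmap')
    disk.set('checkpoint', mode)
    if mode == 'bitmap' and disk.get('bitmap') is None:
        disk.set('bitmap', checkpoint_name)
    return ET.tostring(disk, encoding='unicode')

print(populate_disk("<disk name='vda'/>", '1525889631'))
# e.g. <disk name="vda" checkpoint="bitmap" bitmap="1525889631" />
```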
+ </dd> + </dl> + </dd> + <dt><code>creationTime</code></dt> + <dd>The time this checkpoint was created. The time is specified + in seconds since the Epoch, UTC (i.e. Unix time). Readonly. + </dd> + <dt><code>parent</code></dt> + <dd>The parent of this checkpoint. If present, this element + contains exactly one child element, name. This specifies the + name of the parent checkpoint of this one, and is used to + represent trees of checkpoints. Readonly. + </dd>
I think we are missing the size of the underlying data for every disk here. This probably means how many dirty bits we have in the bitmaps referenced by the checkpoint for each disk.
+ <dt><code>domain</code></dt> + <dd>The inactive <a href="formatdomain.html">domain + configuration</a> at the time the checkpoint was created. + Readonly.
What do you mean by "inactive domain configuration"?
+ </dd> + </dl> + + <h2><a id="BackupAttributes">Backup XML</a></h2> + + <p> + Creating a backup, whether full or incremental, is done + via <code>virDomainBackupBegin()</code>, which takes an XML + description of the actions to perform. There are two general + modes for backups: a push mode (where the hypervisor writes out + the data to the destination file, which may be local or remote), + and a pull mode (where the hypervisor creates an NBD server that + a third-party client can then read as needed, and which requires + the use of temporary storage, typically local, until the backup + is complete).
+ </p>
+ <p> + The instructions for beginning a backup job are provided as + attributes and elements of the + top-level <code>domainbackup</code> element. This element + includes an optional attribute <code>mode</code> which can be + either "push" or "pull" (default push). Where elements are + optional on creation, <code>virDomainBackupGetXMLDesc()</code> + can be used to see the actual values selected (for example, + learning which port the NBD server is using in the pull model, + or what file names libvirt generated when none were supplied). + The following child elements are supported: + </p> + <dl> + <dt><code>incremental</code></dt> + <dd>Optional. If this element is present, it must name an + existing checkpoint of the domain, which will be used to make + this backup an incremental one (in the push model, only + changes since the checkpoint are written to the destination; + in the pull model, the NBD server uses the + NBD_OPT_SET_META_CONTEXT extension to advertise to the client + which portions of the export contain changes since the + checkpoint). If omitted, a full backup is performed.
Just to make it clear, suppose we start with:

    c1 c2 [c3]

where c3 is the active checkpoint. We create a new checkpoint:

    c1 c2 c3 [c4]

So:
- using incremental=c2, we will get data referenced by c2?
- using incremental=c1, we will get data referenced by both c1 and c2?

What if we want to back up only data from c1 to c2, not including c3? I don't have a use case for this, but if we can specify two checkpoints this would be possible. For example:

    <checkpoints from="c1" to="c2">

Or:

    <checkpoints from="c2">
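One plausible reading of "changes since the checkpoint" is that each checkpoint's bitmap records writes between its creation and the next checkpoint, so incremental=cN means merging every bitmap from cN up to the current checkpoint. A sketch of that chain walk (`bitmaps_since` is a hypothetical helper, not libvirt code):

```python
def bitmaps_since(checkpoint, parents, current):
    # parents maps checkpoint -> parent checkpoint (None at the root).
    # Walk back from the current checkpoint and collect every checkpoint
    # at or after the one named; the union of those bitmaps is one
    # plausible meaning of "changes since <checkpoint>".
    chain = []
    node = current
    while node is not None:
        chain.append(node)
        if node == checkpoint:
            return chain
        node = parents.get(node)
    raise ValueError(f"{checkpoint} is not an ancestor of {current}")

parents = {"c4": "c3", "c3": "c2", "c2": "c1", "c1": None}
assert bitmaps_since("c2", parents, "c4") == ["c4", "c3", "c2"]
assert bitmaps_since("c1", parents, "c4") == ["c4", "c3", "c2", "c1"]
```

Under that reading, incremental=c1 would cover the data tracked by c1, c2, and c3, plus anything written since c4 was created; a from/to range as suggested above would require merging a different subset.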
+ </dd> + <dt><code>server</code></dt> + <dd>Present only for a pull mode backup. Contains the same + attributes as the <code>protocol</code> element of a disk + attached via NBD in the domain (such as transport, socket, + name, port, or tls), necessary to set up an NBD server that + exposes the content of each disk at the time the backup + started. + </dd>
To get the list of changed blocks, we planned to use something like:

    qemu-img map nbd+unix:/socket=server.sock

Is this possible now? Planned?

To get the actual data, oVirt needs a device to read from. We don't want to write an nbd-client, and we cannot use qemu-img since it does not support streaming data, and we want to stream data over HTTP to the backup application. I guess we will have to do this:

    qemu-nbd -c /dev/nbd0 nbd+unix:/socket=server.sock

And serve the data from /dev/nbd0.
+      <dt><code>disks</code></dt> + <dd>This is an optional listing of instructions for disks + participating in the backup (if omitted, all disks + participate, and libvirt attempts to generate filenames by + appending the current timestamp as a suffix). When provided on + input, disks omitted from the list do not participate in the + backup. On output, the list is present but contains only the + disks participating in the backup job. This element has a + list of <code>disk</code> sub-elements, describing anywhere + from one to all of the disks associated with the domain. + <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the backup properties of + a specific disk. The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the + backup. The optional attribute <code>type</code> can + be <code>file</code>, <code>block</code>, + or <code>network</code>, similar to a disk declaration + for a domain, and controls what additional sub-elements are + needed to describe the destination (such + as <code>protocol</code> for a network destination). In + push mode backups, the primary sub-element + is <code>target</code>; in pull mode, the primary sub-element + is <code>scratch</code>; but either way, + the primary sub-element describes the file name to be used + during the backup operation, similar to + the <code>source</code> sub-element of a domain disk. An + optional sub-element <code>driver</code> can also be used to + specify a destination format different from qcow2.
This should be similar to the way we specify disks for a VM, right? Anything that works as a VM disk will work for pushing backups? I will continue with the rest of the patch later. Nir + </dd>
+ </dl> + </dd> + </dl> + + <h2><a id="example">Examples</a></h2> + + <p>Using this XML to create a checkpoint of just vda on a qemu + domain with two disks and a prior checkpoint:</p> + <pre> +<domaincheckpoint> + <description>Completion of updates after OS install</description> + <disks> + <disk name='vda' checkpoint='bitmap'/> + <disk name='vdb' checkpoint='no'/> + </disks> +</domaincheckpoint></pre> + + <p>will result in XML similar to this from + <code>virDomainCheckpointGetXMLDesc()</code>:</p> + <pre> +<domaincheckpoint> + <name>1525889631</name> + <description>Completion of updates after OS install</description> + <creationTime>1525889631</creationTime> + <parent> + <name>1525111885</name> + </parent> + <disks> + <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/> + <disk name='vdb' checkpoint='no'/> + </disks> + <domain> + <name>fedora</name> + <uuid>93a5c045-6457-2c09-e56c-927cdf34e178</uuid> + <memory>1048576</memory> + ... + <devices> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2'/> + <source file='/path/to/file1'/> + <target dev='vda' bus='virtio'/> + </disk> + <disk type='file' device='disk' snapshot='external'> + <driver name='qemu' type='raw'/> + <source file='/path/to/file2'/> + <target dev='vdb' bus='virtio'/> + </disk> + ... + </devices> + </domain> +</domaincheckpoint></pre> + + <p>With that checkpoint created, the qcow2 image is now tracking + all changes that occur in the image since the checkpoint via + the persistent bitmap named <code>1525889631</code>. 
Now, we + can make a subsequent call + to <code>virDomainBackupBegin()</code> to perform an incremental + backup of just this data, using the following XML to start a + pull model NBD export of the vda disk: + </p> + <pre> +<domainbackup mode="pull"> + <incremental>1525889631</incremental> + <server transport="unix" socket="/path/to/server"/> + <disks> + <disk name='vda' type='file'> + <scratch file='/path/to/file1.scratch'/> + </disk> + </disks> +</domainbackup> + </pre> + </body> +</html> diff --git a/docs/schemas/domaincheckpoint.rng b/docs/schemas/domaincheckpoint.rng new file mode 100644 index 0000000000..1e2c16e035 --- /dev/null +++ b/docs/schemas/domaincheckpoint.rng @@ -0,0 +1,89 @@ +<?xml version="1.0"?> +<!-- A Relax NG schema for the libvirt domain checkpoint properties XML format --> +<grammar xmlns="http://relaxng.org/ns/structure/1.0"> + <start> + <ref name='domaincheckpoint'/> + </start> + + <include href='domaincommon.rng'/> + + <define name='domaincheckpoint'> + <element name='domaincheckpoint'> + <interleave> + <optional> + <element name='name'> + <text/> + </element> + </optional> + <optional> + <element name='description'> + <text/> + </element> + </optional> + <optional> + <element name='creationTime'> + <text/> + </element> + </optional> + <optional> + <element name='disks'> + <zeroOrMore> + <ref name='diskcheckpoint'/> + </zeroOrMore> + </element> + </optional> + <optional> + <choice> + <element name='domain'> + <element name='uuid'> + <ref name="UUID"/> + </element> + </element> + <!-- Nested grammar ensures that any of our overrides of + storagecommon/domaincommon defines do not conflict + with any domain.rng overrides. 
--> + <grammar> + <include href='domain.rng'/> + </grammar> + </choice> + </optional> + <optional> + <element name='parent'> + <element name='name'> + <text/> + </element> + </element> + </optional> + </interleave> + </element> + </define> + + <define name='diskcheckpoint'> + <element name='disk'> + <attribute name='name'> + <choice> + <ref name='diskTarget'/> + <ref name='absFilePath'/> + </choice> + </attribute> + <choice> + <attribute name='checkpoint'> + <value>no</value> + </attribute> + <group> + <optional> + <attribute name='checkpoint'> + <value>bitmap</value> + </attribute> + </optional> + <optional> + <attribute name='bitmap'> + <text/> + </attribute> + </optional> + </group> + </choice> + </element> + </define> + +</grammar> diff --git a/libvirt.spec.in b/libvirt.spec.in index ace05820aa..50bd79a7d7 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -2044,6 +2044,7 @@ exit 0 %{_datadir}/libvirt/schemas/cputypes.rng %{_datadir}/libvirt/schemas/domain.rng %{_datadir}/libvirt/schemas/domaincaps.rng +%{_datadir}/libvirt/schemas/domaincheckpoint.rng %{_datadir}/libvirt/schemas/domaincommon.rng %{_datadir}/libvirt/schemas/domainsnapshot.rng %{_datadir}/libvirt/schemas/interface.rng diff --git a/mingw-libvirt.spec.in b/mingw-libvirt.spec.in index 917d2143d8..6912527cf7 100644 --- a/mingw-libvirt.spec.in +++ b/mingw-libvirt.spec.in @@ -241,6 +241,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw32_datadir}/libvirt/schemas/cputypes.rng %{mingw32_datadir}/libvirt/schemas/domain.rng %{mingw32_datadir}/libvirt/schemas/domaincaps.rng +%{mingw32_datadir}/libvirt/schemas/domaincheckpoint.rng %{mingw32_datadir}/libvirt/schemas/domaincommon.rng %{mingw32_datadir}/libvirt/schemas/domainsnapshot.rng %{mingw32_datadir}/libvirt/schemas/interface.rng @@ -326,6 +327,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw64_datadir}/libvirt/schemas/cputypes.rng %{mingw64_datadir}/libvirt/schemas/domain.rng 
%{mingw64_datadir}/libvirt/schemas/domaincaps.rng +%{mingw64_datadir}/libvirt/schemas/domaincheckpoint.rng %{mingw64_datadir}/libvirt/schemas/domaincommon.rng %{mingw64_datadir}/libvirt/schemas/domainsnapshot.rng %{mingw64_datadir}/libvirt/schemas/interface.rng diff --git a/tests/domaincheckpointxml2xmlin/empty.xml b/tests/domaincheckpointxml2xmlin/empty.xml new file mode 100644 index 0000000000..dc36449142 --- /dev/null +++ b/tests/domaincheckpointxml2xmlin/empty.xml @@ -0,0 +1 @@ +<domaincheckpoint/> diff --git a/tests/domaincheckpointxml2xmlout/empty.xml b/tests/domaincheckpointxml2xmlout/empty.xml new file mode 100644 index 0000000000..a26c7caab0 --- /dev/null +++ b/tests/domaincheckpointxml2xmlout/empty.xml @@ -0,0 +1,10 @@ +<domaincheckpoint> + <name>1525889631</name> + <creationTime>1525889631</creationTime> + <disks> + <disk name='vda' checkpoint='bitmap' bitmap='1525889631'/> + </disks> + <domain> + <uuid>9d37b878-a7cc-9f9a-b78f-49b3abad25a8</uuid> + </domain> +</domaincheckpoint> diff --git a/tests/virschematest.c b/tests/virschematest.c index 2d35833919..b866db4326 100644 --- a/tests/virschematest.c +++ b/tests/virschematest.c @@ -223,6 +223,8 @@ mymain(void) "genericxml2xmloutdata", "xlconfigdata", "libxlxml2domconfigdata", "qemuhotplugtestdomains"); DO_TEST_DIR("domaincaps.rng", "domaincapsschemadata"); + DO_TEST_DIR("domaincheckpoint.rng", "domaincheckpointxml2xmlin", + "domaincheckpointxml2xmlout"); DO_TEST_DIR("domainsnapshot.rng", "domainsnapshotxml2xmlin", "domainsnapshotxml2xmlout"); DO_TEST_DIR("interface.rng", "interfaceschemadata"); -- 2.14.4

On 06/26/2018 02:51 PM, Nir Soffer wrote:
On Wed, Jun 13, 2018 at 7:42 PM Eric Blake <eblake@redhat.com> wrote:
Prepare for new checkpoint and backup APIs by describing the XML that will represent a checkpoint. This is modeled heavily after the XML for virDomainSnapshotPtr, since both represent a point in time of the guest. But while a snapshot exists with the intent of rolling back to that state, a checkpoint instead makes it possible to create an incremental backup at a later time.
Add testsuite coverage of a minimal use of the XML.
+++ b/docs/formatcheckpoint.html.in @@ -0,0 +1,273 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html> +<html xmlns="http://www.w3.org/1999/xhtml"> + <body> + <h1>Checkpoint and Backup XML format</h1> + + <ul id="toc"></ul> + + <h2><a id="CheckpointAttributes">Checkpoint XML</a></h2>
id=CheckpointXML?
Matches what the existing formatsnapshot.html.in named its tag. (If you haven't guessed, I'm heavily relying on snapshots as my template for adding this).
+ + <p> + Domain disk backups, including incremental backups, are one form + of <a href="domainstatecapture.html">domain state capture</a>. + </p> + <p> + Libvirt is able to facilitate incremental backups by tracking + disk checkpoints, or points in time against which it is easy to + compute which portion of the disk has changed. Given a full + backup (a backup created from the creation of the disk to a + given point in time, coupled with the creation of a disk + checkpoint at that time),
Not clear.
and an incremental backup (a backup + created from just the dirty portion of the disk between the + first checkpoint and the second backup operation),
Also not clear.
Okay, I will try to improve these in v2. But (other than answering these good review emails), my current priority is a working demo (to prove the API works) prior to further doc polish.
it is + possible to do an offline reconstruction of the state of the + disk at the time of the second backup, without having to copy as + much data as a second full backup would require. Most disk + checkpoints are created in concert with a backup, + via <code>virDomainBackupBegin()</code>; however, libvirt also + exposes enough support to create disk checkpoints independently + from a backup operation, + via <code>virDomainCheckpointCreateXML()</code>.
Thanks for the extra context.
+ </p> + <p> + Attributes of libvirt checkpoints are stored as child elements of + the <code>domaincheckpoint</code> element. At checkpoint creation + time, normally only the <code>name</code>, <code>description</code>, + and <code>disks</code> elements are settable; the rest of the + fields are ignored on creation, and will be filled in by + libvirt for informational purposes
So the user is responsible for creating checkpoint names? Do we use the same names in qcow2?
My intent is that if the user does not assign a checkpoint name, then libvirt will default it to the current time in seconds-since-the-Epoch. Then, whatever name is given to the checkpoint (whether chosen by libvirt or assigned by the user) will also be the default name of the bitmap created in each qcow2 volume, but the XML also allows you to name the qcow2 bitmaps something different than the checkpoint name (maybe not a wise idea in the common case, but could come in handy later if you use the _REDEFINE flag to teach libvirt about existing bitmaps that are already present in a qcow2 image rather than placed there by libvirt).
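To make that last point concrete, here is a hedged sketch of input XML for the _REDEFINE case (the checkpoint name, the bitmap name 'pre-existing-bitmap', and the specific disks are all invented for illustration):

```xml
<!-- hypothetical input for virDomainCheckpointCreateXML() with the
     VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE flag; names are invented -->
<domaincheckpoint>
  <name>1525889631</name>
  <disks>
    <!-- bind the checkpoint to a bitmap already present in vda's qcow2
         image under a different name than the checkpoint itself -->
    <disk name='vda' checkpoint='bitmap' bitmap='pre-existing-bitmap'/>
    <disk name='vdb' checkpoint='no'/>
  </disks>
</domaincheckpoint>
```

In the common (non-redefine) case the bitmap attribute would simply be omitted and default to the checkpoint name.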
+ <p> + Checkpoints are maintained in a hierarchy. A domain can have a + current checkpoint, which is the most recent checkpoint compared to + the current state of the domain (although a domain might have + checkpoints without a current checkpoint, if checkpoints have been + deleted in the meantime). Creating or reverting to a checkpoint + sets that checkpoint as current, and the prior current checkpoint is + the parent of the new checkpoint. Branches in the hierarchy can + be formed by reverting to a checkpoint with a child, then creating + another checkpoint.
This seems too complex. Why do we need arbitrary trees of checkpoints?
Because snapshots had an arbitrary tree, and it was easier to copy from snapshots. Even if we only use a linear tree for now, it is still feasible that in the future, we can facilitate a domain rolling back to the disk state as captured at checkpoint C1, at which point you could then have multiple children C2 (the bitmap created prior to rolling back) and C3 (the bitmap created for tracking changes made after rolling back). Again, for a first cut, I probably will punt and state that snapshots and incremental backups do not play well together yet; but as we get experience and add more code, the API is flexible enough that down the road we really can offer reverting to an arbitrary snapshot and ALSO updating checkpoints to match.
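The branch described above would surface in checkpoint XML simply as two checkpoints naming the same <parent> (a hypothetical sketch; the names C1/C2/C3 are invented, and only the relevant elements are shown):

```xml
<!-- C2: the checkpoint created just before rolling back to C1's disk state -->
<domaincheckpoint>
  <name>C2</name>
  <parent>
    <name>C1</name>
  </parent>
</domaincheckpoint>

<!-- C3: the checkpoint created after rolling back, tracking post-rollback writes -->
<domaincheckpoint>
  <name>C3</name>
  <parent>
    <name>C1</name>
  </parent>
</domaincheckpoint>
```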
What is the meaning of reverting a checkpoint?
Hmm - right now, you can't (that was one Snapshot API that I intentionally did not copy over to Checkpoint), so I should probably reword that.
+ </p> + <p> + The top-level <code>domaincheckpoint</code> element may contain + the following elements: + </p> + <dl> + <dt><code>name</code></dt> + <dd>The name for this checkpoint. If the name is specified when + initially creating the checkpoint, then the checkpoint will have + that particular name. If the name is omitted when initially + creating the checkpoint, then libvirt will make up a name for + the checkpoint, based on the time when it was created. + </dd>
Why not simplify and require the user to provide a name?
Because we didn't require the user to provide names for snapshots, and generating a name via the current timestamp is still fairly likely to be usable.
+ <dt><code>description</code></dt> + <dd>A human-readable description of the checkpoint. If the + description is omitted when initially creating the checkpoint, + then this field will be empty. + </dd> + <dt><code>disks</code></dt> + <dd>On input, this is an optional listing of specific + instructions for disk checkpoints; it is needed when making a + checkpoint on only a subset of the disks associated with a + domain (in particular, since qemu checkpoints require qcow2 + disks, this element may be needed on input for excluding guest + disks that are not in qcow2 format); if omitted on input, then + all disks participate in the checkpoint. On output, this is + fully populated to show the state of each disk in the + checkpoint. This element has a list of <code>disk</code> + sub-elements, describing anywhere from one to all of the disks + associated with the domain.
Why not always specify the disks?
Because if your guest uses all qcow2 images, and you don't want to exclude any images from the checkpoint, then not specifying <disks> does the right thing with less typing. Just because libvirt tries to have sane defaults doesn't mean you have to rely on them, though.
+ <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the checkpoint properties of + a specific disk. The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the + checkpoint. The attribute <code>checkpoint</code> is + optional on input; possible values are <code>no</code> + when the disk does not participate in this checkpoint; + or <code>bitmap</code> if the disk will track all changes + since the creation of this checkpoint via a bitmap, in + which case another attribute <code>bitmap</code> will be + the name of the tracking bitmap (defaulting to the + checkpoint name).
Seems too complicated. Why do we need to support a checkpoint referencing a bitmap with a different name?
For the same reason that you can support an internal snapshot referencing a qcow2 snapshot with a different name. Yeah, it's probably not a common usage, but there are cases (such as when using _REDEFINE) where it can prove invaluable. You're right that most users won't name qcow2 bitmaps differently from the libvirt checkpoint name.
Instead we can have a list of disks that will participate in the checkpoint. Anything not specified will not participate in the checkpoint. The name of the checkpoint is always the name of the bitmap.
My worry is about future extensibility of the XML. If the XML is too simple, then we may lock ourselves into a corner of not being able to support some other backend implementation of checkpoints (just because qemu implements checkpoints via qcow2 bitmaps does not mean that some other hypervisor won't come along that implements checkpoints via a UUID, so I tried to leave room for <disk checkpoint='uuid' uuid='..-..-...'/> as potential XML for such a hypervisor mapping - and while bitmap names different from checkpoint names are unusual, it is much more likely that UUIDs for multiple disks would have to be different per disk)
+ </dd> + </dl> + </dd> + <dt><code>creationTime</code></dt> + <dd>The time this checkpoint was created. The time is specified + in seconds since the Epoch, UTC (i.e. Unix time). Readonly. + </dd> + <dt><code>parent</code></dt> + <dd>The parent of this checkpoint. If present, this element + contains exactly one child element, name. This specifies the + name of the parent checkpoint of this one, and is used to + represent trees of checkpoints. Readonly. + </dd>
I think we are missing here the size of the underlying data in every disk. This probably means how many dirty bits we have in the bitmaps referenced by the checkpoint for every disk.
That would be an output-only XML element, and only if qemu were even modified to expose that information. But yes, I can see how exposing that could be useful.
+ <dt><code>domain</code></dt> + <dd>The inactive <a href="formatdomain.html">domain + configuration</a> at the time the checkpoint was created. + Readonly.
What do you mean by "inactive domain configuration"?
Copy-and-paste from snapshots, but in general, what it would take to start a new VM using a restoration of the backup images corresponding to that checkpoint (that is, the XML is the smaller persistent form, rather than the larger running form; my classic example used to be that the 'inactive domain configuration' omits <alias> tags while the 'running configuration' does not - but since libvirt recently added support for user-settable <alias> tags, that no longer holds...).
+ <dl> + <dt><code>incremental</code></dt> + <dd>Optional. If this element is present, it must name an + existing checkpoint of the domain, which will be used to make + this backup an incremental one (in the push model, only + changes since the checkpoint are written to the destination; + in the pull model, the NBD server uses the + NBD_OPT_SET_META_CONTEXT extension to advertise to the client + which portions of the export contain changes since the + checkpoint). If omitted, a full backup is performed.
Just to make it clear:
For example we start with:
c1 c2 [c3]
c3 is the active checkpoint.
We create a new checkpoint:
c1 c2 c3 [c4]
So - using incremental=c2, we will get data referenced by c2?
Your incremental backup would get all changes since the point in time c2 was created (that is, the changes recorded by the merge of bitmaps c2 and c3).
- using incremental=c1, we will get data referenced by both c1 and c2?
Your incremental backup would get all changes since the point in time c1 was created (that is, the changes recorded by the merge of bitmaps c1, c2, and c3).
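The merge semantics can be restated with a toy model (not libvirt or qemu code; the fixed-width bit mask over "clusters" is an invented simplification): each checkpoint owns a bitmap of the clusters dirtied while it was current, and a backup since checkpoint N copies the bitwise OR of bitmap N and every later bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not libvirt code: bit i of a bitmap is set when cluster
 * i was written while that checkpoint was the current one. */
static uint64_t
backup_scope(const uint64_t *bitmaps, size_t nbitmaps, size_t from)
{
    uint64_t dirty = 0;
    size_t i;

    /* An incremental backup since checkpoint 'from' must copy every
     * cluster dirtied from that point onward: the union (bitwise OR)
     * of bitmap 'from' and all younger bitmaps. */
    for (i = from; i < nbitmaps; i++)
        dirty |= bitmaps[i];
    return dirty;
}
```

So with bitmaps c1=0x1, c2=0x6, c3=0x8, an incremental backup since c2 covers 0x6|0x8, one since c1 covers all three masks, and a full backup ignores the bitmaps entirely.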
What if we want to backup only data from c1 to c2, not including c3?
Qemu can't do that right now, so this API can't do it either. Maybe there's a way to add it into the API (and the fact that we used XML leaves that door wide open), but not right now.
I don't have a use case for this, but if we can specify two checkpoints this could be possible.
For example:
<checkpoints from="c1" to="c2">
Or
<checkpoints from="c2">
Or the current proposal of <incremental> serves as the 'from', and a new sibling element <limit> becomes the 'to', if it becomes possible to limit a backup to an earlier point in time than the present call to the API.
+ </dd> + <dt><code>server</code></dt> + <dd>Present only for a pull mode backup. Contains the same + attributes as the <code>protocol</code> element of a disk + attached via NBD in the domain (such as transport, socket, + name, port, or tls), necessary to set up an NBD server that + exposes the content of each disk at the time the backup + started. + </dd>
To get the list of changed blocks, we planned to use something like:
qemu-img map nbd+unix:/socket=server.sock
Is this possible now? planned?
Possible via the x-nbd-server-add-bitmap command added in qemu commit 767f0c7, coupled with a client that knows how to request NBD_OPT_SET_META_CONTEXT "qemu:dirty-bitmap:foo" then read the bitmap with NBD_CMD_BLOCK_STATUS (I have a hack patch sitting on the qemu list that lets qemu-img behave as such a client: https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg05993.html)
To get the actual data, oVirt needs a device to read from. We don't want to write nbd-client, and we cannot use qemu-img since it does not support streaming data, and we want to stream data using http to the backup application.
I guess we will have do this:
qemu-nbd -c /dev/nbd0 nbd+unix:/socket=server.sock
And serve the data from /dev/nbd0.
Yes, except that the kernel NBD client plugin does not have support for NBD_CMD_BLOCK_STATUS, so reading /dev/nbd0 won't be able to find the dirty blocks. But you could always do it in two steps: first, connect a client that only reads the bitmap (such as qemu-img with my hack), then connect the kernel client so that you can stream just the portions of /dev/nbd0 referenced in the map of the first step. (Or, since both clients would be read-only, you can have them both connected to the qemu server at once)
+ <dt><code>disks</code></dt> + <dd>This is an optional listing of instructions for disks + participating in the backup (if omitted, all disks + participate, and libvirt attempts to generate filenames by + appending the current timestamp as a suffix). When provided on + input, disks omitted from the list do not participate in the + backup. On output, the list is present but contains only the + disks participating in the backup job. This element has a + list of <code>disk</code> sub-elements, describing anywhere + from one to all of the disks associated with the domain. + <dl> + <dt><code>disk</code></dt> + <dd>This sub-element describes the checkpoint properties of + a specific disk. The attribute <code>name</code> is + mandatory, and must match either the <code><target + dev='name'/></code> or an unambiguous <code><source + file='name'/></code> of one of + the <a href="formatdomain.html#elementsDisks">disk + devices</a> specified for the domain at the time of the + checkpoint. The optional attribute <code>type</code> can + be <code>file</code>, <code>block</code>, + or <code>network</code>, similar to a disk declaration + for a domain, and controls what additional sub-elements are + needed to describe the destination (such + as <code>protocol</code> for a network destination). In + push mode backups, the primary sub-element + is <code>target</code>; in pull mode, the primary sub-element + is <code>scratch</code>; but either way, + the primary sub-element describes the file name to be used + during the backup operation, similar to + the <code>source</code> sub-element of a domain disk. An + optional sub-element <code>driver</code> can also be used to + specify a destination format different from qcow2.
This should be similar to the way we specify disks for a VM, right? Anything that works as a VM disk will work for pushing backups?
Ultimately, yes, I'd like to support gluster/NBD/sheepdog/... destinations. My initial implementation is less ambitious, and supports just local files (because those are easier to test and therefore produce a demo with). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

Introduce a bunch of new public APIs related to backup checkpoints. Checkpoints are modeled heavily after virDomainSnapshotPtr (both represent a point in time of the guest), although a snapshot exists with the intent of rolling back to that state, while a checkpoint exists to make it possible to create an incremental backup at a later time. Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/Makefile.am | 3 + docs/apibuild.py | 2 + docs/docs.html.in | 1 + include/libvirt/libvirt-domain-checkpoint.h | 147 ++++++ include/libvirt/libvirt.h | 5 +- libvirt.spec.in | 1 + mingw-libvirt.spec.in | 2 + po/POTFILES | 1 + src/Makefile.am | 2 + src/driver-hypervisor.h | 60 ++- src/libvirt-domain-checkpoint.c | 708 ++++++++++++++++++++++++++++ src/libvirt_public.syms | 16 + 12 files changed, 944 insertions(+), 4 deletions(-) create mode 100644 include/libvirt/libvirt-domain-checkpoint.h create mode 100644 src/libvirt-domain-checkpoint.c diff --git a/docs/Makefile.am b/docs/Makefile.am index 9620587a77..0df8ebbd64 100644 --- a/docs/Makefile.am +++ b/docs/Makefile.am @@ -25,6 +25,7 @@ apihtml = \ apihtml_generated = \ html/libvirt-libvirt-common.html \ html/libvirt-libvirt-domain.html \ + html/libvirt-libvirt-domain-checkpoint.html \ html/libvirt-libvirt-domain-snapshot.html \ html/libvirt-libvirt-event.html \ html/libvirt-libvirt-host.html \ @@ -318,6 +319,7 @@ $(python_generated_files): $(APIBUILD_STAMP) $(APIBUILD_STAMP): $(srcdir)/apibuild.py \ $(top_srcdir)/include/libvirt/libvirt.h \ $(top_srcdir)/include/libvirt/libvirt-common.h.in \ + $(top_srcdir)/include/libvirt/libvirt-domain-checkpoint.h \ $(top_srcdir)/include/libvirt/libvirt-domain-snapshot.h \ $(top_srcdir)/include/libvirt/libvirt-domain.h \ $(top_srcdir)/include/libvirt/libvirt-event.h \ @@ -334,6 +336,7 @@ $(APIBUILD_STAMP): $(srcdir)/apibuild.py \ $(top_srcdir)/include/libvirt/libvirt-admin.h \ $(top_srcdir)/include/libvirt/virterror.h \ $(top_srcdir)/src/libvirt.c \ + 
$(top_srcdir)/src/libvirt-domain-checkpoint.c \ $(top_srcdir)/src/libvirt-domain-snapshot.c \ $(top_srcdir)/src/libvirt-domain.c \ $(top_srcdir)/src/libvirt-host.c \ diff --git a/docs/apibuild.py b/docs/apibuild.py index 5e218a9ad0..471547cea7 100755 --- a/docs/apibuild.py +++ b/docs/apibuild.py @@ -26,6 +26,7 @@ debugsym = None included_files = { "libvirt-common.h": "header with general libvirt API definitions", "libvirt-domain.h": "header with general libvirt API definitions", + "libvirt-domain-checkpoint.h": "header with general libvirt API definitions", "libvirt-domain-snapshot.h": "header with general libvirt API definitions", "libvirt-event.h": "header with general libvirt API definitions", "libvirt-host.h": "header with general libvirt API definitions", @@ -39,6 +40,7 @@ included_files = { "virterror.h": "header with error specific API definitions", "libvirt.c": "Main interfaces for the libvirt library", "libvirt-domain.c": "Domain interfaces for the libvirt library", + "libvirt-domain-checkpoint.c": "Domain checkpoint interfaces for the libvirt library", "libvirt-domain-snapshot.c": "Domain snapshot interfaces for the libvirt library", "libvirt-host.c": "Host interfaces for the libvirt library", "libvirt-interface.c": "Interface interfaces for the libvirt library", diff --git a/docs/docs.html.in b/docs/docs.html.in index 11dfd27ba6..63dbdf7755 100644 --- a/docs/docs.html.in +++ b/docs/docs.html.in @@ -97,6 +97,7 @@ <dd>Reference manual for the C public API, split in <a href="html/libvirt-libvirt-common.html">common</a>, <a href="html/libvirt-libvirt-domain.html">domain</a>, + <a href="html/libvirt-libvirt-domain-checkpoint.html">domain checkpoint</a>, <a href="html/libvirt-libvirt-domain-snapshot.html">domain snapshot</a>, <a href="html/libvirt-virterror.html">error</a>, <a href="html/libvirt-libvirt-event.html">event</a>, diff --git a/include/libvirt/libvirt-domain-checkpoint.h b/include/libvirt/libvirt-domain-checkpoint.h new file mode 100644 index 
0000000000..4a7dc73089 --- /dev/null +++ b/include/libvirt/libvirt-domain-checkpoint.h @@ -0,0 +1,147 @@ +/* + * libvirt-domain-checkpoint.h + * Summary: APIs for management of domain checkpoints + * Description: Provides APIs for the management of domain checkpoints + * Author: Eric Blake <eblake@redhat.com> + * + * Copyright (C) 2006-2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_LIBVIRT_DOMAIN_CHECKPOINT_H__ +# define __VIR_LIBVIRT_DOMAIN_CHECKPOINT_H__ + +# ifndef __VIR_LIBVIRT_H_INCLUDES__ +# error "Don't include this file directly, only use libvirt/libvirt.h" +# endif + +/** + * virDomainCheckpoint: + * + * A virDomainCheckpoint is a private structure representing a checkpoint of + * a domain. A checkpoint is useful for tracking which portions of the + * domain disks have been altered since a point in time, but by itself does + * not allow reverting back to that point in time. + */ +typedef struct _virDomainCheckpoint virDomainCheckpoint; + +/** + * virDomainCheckpointPtr: + * + * A virDomainCheckpointPtr is a pointer to a virDomainCheckpoint + * private structure, and is the type used to reference a domain + * checkpoint in the API. 
+ */ +typedef virDomainCheckpoint *virDomainCheckpointPtr; + +const char *virDomainCheckpointGetName(virDomainCheckpointPtr checkpoint); +virDomainPtr virDomainCheckpointGetDomain(virDomainCheckpointPtr checkpoint); +virConnectPtr virDomainCheckpointGetConnect(virDomainCheckpointPtr checkpoint); + +typedef enum { + VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE = (1 << 0), /* Restore or alter + metadata */ + VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT = (1 << 1), /* With redefine, make + checkpoint current */ + VIR_DOMAIN_CHECKPOINT_CREATE_NO_METADATA = (1 << 2), /* Make checkpoint without + remembering it */ +} virDomainCheckpointCreateFlags; + +/* Create a checkpoint using the current VM state. */ +virDomainCheckpointPtr virDomainCheckpointCreateXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags); + +/* Dump the XML of a checkpoint */ +char *virDomainCheckpointGetXMLDesc(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +/** + * virDomainCheckpointListFlags: + * + * Flags valid for virDomainListCheckpoints() and + * virDomainCheckpointListChildren(). Note that the interpretation of + * flag (1<<0) depends on which function it is passed to; but serves + * to toggle the per-call default of whether the listing is shallow or + * recursive. Remaining bits come in groups; if all bits from a group + * are 0, then that group is not used to filter results. 
*/ +typedef enum { + VIR_DOMAIN_CHECKPOINT_LIST_ROOTS = (1 << 0), /* Filter by checkpoints + with no parents, when + listing a domain */ + VIR_DOMAIN_CHECKPOINT_LIST_DESCENDANTS = (1 << 0), /* List all descendants, + not just children, when + listing a checkpoint */ + + VIR_DOMAIN_CHECKPOINT_LIST_LEAVES = (1 << 1), /* Filter by checkpoints + with no children */ + VIR_DOMAIN_CHECKPOINT_LIST_NO_LEAVES = (1 << 2), /* Filter by checkpoints + that have children */ + + VIR_DOMAIN_CHECKPOINT_LIST_METADATA = (1 << 3), /* Filter by checkpoints + which have metadata */ + VIR_DOMAIN_CHECKPOINT_LIST_NO_METADATA = (1 << 4), /* Filter by checkpoints + with no metadata */ +} virDomainCheckpointListFlags; + +/* Get all checkpoint objects for this domain */ +int virDomainListCheckpoints(virDomainPtr domain, + virDomainCheckpointPtr **checkpoints, + unsigned int flags); + +/* Get all checkpoint object children for this checkpoint */ +int virDomainCheckpointListChildren(virDomainCheckpointPtr checkpoint, + virDomainCheckpointPtr **children, + unsigned int flags); + +/* Get a handle to a named checkpoint */ +virDomainCheckpointPtr virDomainCheckpointLookupByName(virDomainPtr domain, + const char *name, + unsigned int flags); + +/* Check whether a domain has a checkpoint which is currently used */ +int virDomainHasCurrentCheckpoint(virDomainPtr domain, unsigned int flags); + +/* Get a handle to the current checkpoint */ +virDomainCheckpointPtr virDomainCheckpointCurrent(virDomainPtr domain, + unsigned int flags); + +/* Get a handle to the parent checkpoint, if one exists */ +virDomainCheckpointPtr virDomainCheckpointGetParent(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +/* Determine if a checkpoint is the current checkpoint of its domain. */ +int virDomainCheckpointIsCurrent(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +/* Determine if checkpoint has metadata that would prevent domain deletion. 
*/ +int virDomainCheckpointHasMetadata(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +/* Delete a checkpoint */ +typedef enum { + VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN = (1 << 0), /* Also delete children */ + VIR_DOMAIN_CHECKPOINT_DELETE_METADATA_ONLY = (1 << 1), /* Delete just metadata */ + VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN_ONLY = (1 << 2), /* Delete just children */ +} virDomainCheckpointDeleteFlags; + +int virDomainCheckpointDelete(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +int virDomainCheckpointRef(virDomainCheckpointPtr checkpoint); +int virDomainCheckpointFree(virDomainCheckpointPtr checkpoint); + +#endif /* __VIR_LIBVIRT_DOMAIN_CHECKPOINT_H__ */ diff --git a/include/libvirt/libvirt.h b/include/libvirt/libvirt.h index 26887a40e7..4e7da0afc4 100644 --- a/include/libvirt/libvirt.h +++ b/include/libvirt/libvirt.h @@ -4,7 +4,7 @@ * Description: Provides the interfaces of the libvirt library to handle * virtualized domains * - * Copyright (C) 2005-2006, 2010-2014 Red Hat, Inc. + * Copyright (C) 2005-2006, 2010-2018 Red Hat, Inc. 
* * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -36,8 +36,7 @@ extern "C" { # include <libvirt/libvirt-common.h> # include <libvirt/libvirt-host.h> # include <libvirt/libvirt-domain.h> -typedef struct _virDomainCheckpoint virDomainCheckpoint; -typedef virDomainCheckpoint *virDomainCheckpointPtr; +# include <libvirt/libvirt-domain-checkpoint.h> # include <libvirt/libvirt-domain-snapshot.h> # include <libvirt/libvirt-event.h> # include <libvirt/libvirt-interface.h> diff --git a/libvirt.spec.in b/libvirt.spec.in index 50bd79a7d7..a82e97f3db 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -2102,6 +2102,7 @@ exit 0 %{_includedir}/libvirt/libvirt-admin.h %{_includedir}/libvirt/libvirt-common.h %{_includedir}/libvirt/libvirt-domain.h +%{_includedir}/libvirt/libvirt-domain-checkpoint.h %{_includedir}/libvirt/libvirt-domain-snapshot.h %{_includedir}/libvirt/libvirt-event.h %{_includedir}/libvirt/libvirt-host.h diff --git a/mingw-libvirt.spec.in b/mingw-libvirt.spec.in index 6912527cf7..3b41e69661 100644 --- a/mingw-libvirt.spec.in +++ b/mingw-libvirt.spec.in @@ -269,6 +269,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw32_includedir}/libvirt/libvirt.h %{mingw32_includedir}/libvirt/libvirt-common.h %{mingw32_includedir}/libvirt/libvirt-domain.h +%{mingw32_includedir}/libvirt/libvirt-domain-checkpoint.h %{mingw32_includedir}/libvirt/libvirt-domain-snapshot.h %{mingw32_includedir}/libvirt/libvirt-event.h %{mingw32_includedir}/libvirt/libvirt-host.h @@ -355,6 +356,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw64_includedir}/libvirt/libvirt.h %{mingw64_includedir}/libvirt/libvirt-common.h %{mingw64_includedir}/libvirt/libvirt-domain.h +%{mingw64_includedir}/libvirt/libvirt-domain-checkpoint.h %{mingw64_includedir}/libvirt/libvirt-domain-snapshot.h %{mingw64_includedir}/libvirt/libvirt-event.h %{mingw64_includedir}/libvirt/libvirt-host.h 
diff --git a/po/POTFILES b/po/POTFILES index be2874487c..d246657188 100644 --- a/po/POTFILES +++ b/po/POTFILES @@ -69,6 +69,7 @@ src/interface/interface_backend_netcf.c src/interface/interface_backend_udev.c src/internal.h src/libvirt-admin.c +src/libvirt-domain-checkpoint.c src/libvirt-domain-snapshot.c src/libvirt-domain.c src/libvirt-host.c diff --git a/src/Makefile.am b/src/Makefile.am index db8c8ebd1a..d20d65574e 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -174,6 +174,7 @@ DRIVER_SOURCES += \ $(DATATYPES_SOURCES) \ libvirt.c libvirt_internal.h \ libvirt-domain.c \ + libvirt-domain-checkpoint.c \ libvirt-domain-snapshot.c \ libvirt-host.c \ libvirt-interface.c \ @@ -728,6 +729,7 @@ libvirt_setuid_rpc_client_la_SOURCES = \ datatypes.c \ libvirt.c \ libvirt-domain.c \ + libvirt-domain-checkpoint.c \ libvirt-domain-snapshot.c \ libvirt-host.c \ libvirt-interface.c \ diff --git a/src/driver-hypervisor.h b/src/driver-hypervisor.h index eef31eb1f0..ee8a9a3e0e 100644 --- a/src/driver-hypervisor.h +++ b/src/driver-hypervisor.h @@ -1,7 +1,7 @@ /* * driver-hypervisor.h: entry points for hypervisor drivers * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. 
* * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -1321,6 +1321,53 @@ typedef int int *nparams, unsigned int flags); +typedef virDomainCheckpointPtr +(*virDrvDomainCheckpointCreateXML)(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags); + +typedef char * +(*virDrvDomainCheckpointGetXMLDesc)(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +typedef int +(*virDrvDomainListCheckpoints)(virDomainPtr domain, + virDomainCheckpointPtr **checkpoints, + unsigned int flags); + +typedef int +(*virDrvDomainCheckpointListChildren)(virDomainCheckpointPtr checkpoint, + virDomainCheckpointPtr **children, + unsigned int flags); + +typedef virDomainCheckpointPtr +(*virDrvDomainCheckpointLookupByName)(virDomainPtr domain, + const char *name, + unsigned int flags); + +typedef int +(*virDrvDomainHasCurrentCheckpoint)(virDomainPtr domain, + unsigned int flags); + +typedef virDomainCheckpointPtr +(*virDrvDomainCheckpointGetParent)(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +typedef virDomainCheckpointPtr +(*virDrvDomainCheckpointCurrent)(virDomainPtr domain, + unsigned int flags); + +typedef int +(*virDrvDomainCheckpointIsCurrent)(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +typedef int +(*virDrvDomainCheckpointHasMetadata)(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +typedef int +(*virDrvDomainCheckpointDelete)(virDomainCheckpointPtr checkpoint, + unsigned int flags); typedef struct _virHypervisorDriver virHypervisorDriver; typedef virHypervisorDriver *virHypervisorDriverPtr; @@ -1572,6 +1619,17 @@ struct _virHypervisorDriver { virDrvConnectBaselineHypervisorCPU connectBaselineHypervisorCPU; virDrvNodeGetSEVInfo nodeGetSEVInfo; virDrvDomainGetLaunchSecurityInfo domainGetLaunchSecurityInfo; + virDrvDomainCheckpointCreateXML domainCheckpointCreateXML; + virDrvDomainCheckpointGetXMLDesc domainCheckpointGetXMLDesc; + 
virDrvDomainListCheckpoints domainListCheckpoints; + virDrvDomainCheckpointListChildren domainCheckpointListChildren; + virDrvDomainCheckpointLookupByName domainCheckpointLookupByName; + virDrvDomainHasCurrentCheckpoint domainHasCurrentCheckpoint; + virDrvDomainCheckpointGetParent domainCheckpointGetParent; + virDrvDomainCheckpointCurrent domainCheckpointCurrent; + virDrvDomainCheckpointIsCurrent domainCheckpointIsCurrent; + virDrvDomainCheckpointHasMetadata domainCheckpointHasMetadata; + virDrvDomainCheckpointDelete domainCheckpointDelete; }; diff --git a/src/libvirt-domain-checkpoint.c b/src/libvirt-domain-checkpoint.c new file mode 100644 index 0000000000..12511a13ee --- /dev/null +++ b/src/libvirt-domain-checkpoint.c @@ -0,0 +1,708 @@ +/* + * libvirt-domain-checkpoint.c: entry points for virDomainCheckpointPtr APIs + * + * Copyright (C) 2006-2014, 2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. 
+ */ + +#include <config.h> + +#include "datatypes.h" +#include "virlog.h" + +VIR_LOG_INIT("libvirt.domain-checkpoint"); + +#define VIR_FROM_THIS VIR_FROM_DOMAIN_CHECKPOINT + +/** + * virDomainCheckpointGetName: + * @checkpoint: a checkpoint object + * + * Get the public name for that checkpoint + * + * Returns a pointer to the name or NULL, the string need not be deallocated + * as its lifetime will be the same as the checkpoint object. + */ +const char * +virDomainCheckpointGetName(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p", checkpoint); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + + return checkpoint->name; +} + + +/** + * virDomainCheckpointGetDomain: + * @checkpoint: a checkpoint object + * + * Provides the domain pointer associated with a checkpoint. The + * reference counter on the domain is not increased by this + * call. + * + * Returns the domain or NULL. + */ +virDomainPtr +virDomainCheckpointGetDomain(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p", checkpoint); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + + return checkpoint->domain; +} + + +/** + * virDomainCheckpointGetConnect: + * @checkpoint: a checkpoint object + * + * Provides the connection pointer associated with a checkpoint. The + * reference counter on the connection is not increased by this + * call. + * + * Returns the connection or NULL. + */ +virConnectPtr +virDomainCheckpointGetConnect(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p", checkpoint); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + + return checkpoint->domain->conn; +} + + +/** + * virDomainCheckpointCreateXML: + * @domain: a domain object + * @xmlDesc: description of the checkpoint to create + * @flags: bitwise-OR of supported virDomainCheckpointCreateFlags + * + * Create a new checkpoint using @xmlDesc on a running @domain. 
+ * Typically, it is more common to create a new checkpoint as part of + * kicking off a backup job with virDomainBackupBegin(); however, it + * is also possible to start a checkpoint without a backup. + * + * See formatcheckpoint.html#CheckpointAttributes document for more + * details on @xmlDesc. + * + * If @flags includes VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE, then this + * is a request to reinstate checkpoint metadata that was previously + * discarded, rather than creating a new checkpoint. When redefining + * checkpoint metadata, the current checkpoint will not be altered + * unless the VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT flag is also + * present. It is an error to request the + * VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT flag without + * VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE. + * + * If @flags includes VIR_DOMAIN_CHECKPOINT_CREATE_NO_METADATA, then + * the domain's disk images are modified according to @xmlDesc, but + * then the just-created checkpoint has its metadata deleted. This + * flag is incompatible with VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE. + * + * Returns an (opaque) new virDomainCheckpointPtr on success, or NULL + * on failure. 
+ */ +virDomainCheckpointPtr +virDomainCheckpointCreateXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "xmlDesc=%s, flags=0x%x", xmlDesc, flags); + + virResetLastError(); + + virCheckDomainReturn(domain, NULL); + conn = domain->conn; + + virCheckNonNullArgGoto(xmlDesc, error); + virCheckReadOnlyGoto(conn->flags, error); + + VIR_REQUIRE_FLAG_GOTO(VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT, + VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE, + error); + + VIR_EXCLUSIVE_FLAGS_GOTO(VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE, + VIR_DOMAIN_CHECKPOINT_CREATE_NO_METADATA, + error); + + if (conn->driver->domainCheckpointCreateXML) { + virDomainCheckpointPtr ret; + ret = conn->driver->domainCheckpointCreateXML(domain, xmlDesc, flags); + if (!ret) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainCheckpointGetXMLDesc: + * @checkpoint: a domain checkpoint object + * @flags: bitwise-OR of subset of virDomainXMLFlags + * + * Provide an XML description of the domain checkpoint. + * + * No security-sensitive data will be included unless @flags contains + * VIR_DOMAIN_XML_SECURE; this flag is rejected on read-only + * connections. For this API, @flags should not contain either + * VIR_DOMAIN_XML_INACTIVE or VIR_DOMAIN_XML_UPDATE_CPU. + * + * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case of error. + * The caller must free() the returned value. 
+ */ +char * +virDomainCheckpointGetXMLDesc(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + conn = checkpoint->domain->conn; + + if ((conn->flags & VIR_CONNECT_RO) && (flags & VIR_DOMAIN_XML_SECURE)) { + virReportError(VIR_ERR_OPERATION_DENIED, "%s", + _("virDomainCheckpointGetXMLDesc with secure flag")); + goto error; + } + + if (conn->driver->domainCheckpointGetXMLDesc) { + char *ret; + ret = conn->driver->domainCheckpointGetXMLDesc(checkpoint, flags); + if (!ret) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainListCheckpoints: + * @domain: a domain object + * @checkpoints: pointer to variable to store the array containing checkpoint + * objects, or NULL if the list is not required (just returns + * number of checkpoints) + * @flags: bitwise-OR of supported virDomainCheckpointListFlags + * + * Collect the list of domain checkpoints for the given domain, and allocate + * an array to store those objects. + * + * By default, this command covers all checkpoints; it is also possible to + * limit things to just checkpoints with no parents, when @flags includes + * VIR_DOMAIN_CHECKPOINT_LIST_ROOTS. Additional filters are provided in + * groups, where each group contains bits that describe mutually exclusive + * attributes of a checkpoint, and where all bits within a group describe + * all possible checkpoints. Some hypervisors might reject explicit bits + * from a group where the hypervisor cannot make a distinction. For a + * group supported by a given hypervisor, the behavior when no bits of a + * group are set is identical to the behavior when all bits in that group + * are set. 
When setting bits from more than one group, it is possible to + * select an impossible combination, in that case a hypervisor may return + * either 0 or an error. + * + * The first group of @flags is VIR_DOMAIN_CHECKPOINT_LIST_LEAVES and + * VIR_DOMAIN_CHECKPOINT_LIST_NO_LEAVES, to filter based on checkpoints that + * have no further children (a leaf checkpoint). + * + * The next group of @flags is VIR_DOMAIN_CHECKPOINT_LIST_METADATA and + * VIR_DOMAIN_CHECKPOINT_LIST_NO_METADATA, for filtering checkpoints based on + * whether they have metadata that would prevent the removal of the last + * reference to a domain. + * + * Returns the number of domain checkpoints found or -1 and sets @checkpoints + * to NULL in case of error. On success, the array stored into @checkpoints + * is guaranteed to have an extra allocated element set to NULL but not + * included in the return count, to make iteration easier. The caller is + * responsible for calling virDomainCheckpointFree() on each array element, + * then calling free() on @checkpoints. 
+ */ +int +virDomainListCheckpoints(virDomainPtr domain, + virDomainCheckpointPtr **checkpoints, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "checkpoints=%p, flags=0x%x", checkpoints, flags); + + virResetLastError(); + + if (checkpoints) + *checkpoints = NULL; + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + if (conn->driver->domainListCheckpoints) { + int ret = conn->driver->domainListCheckpoints(domain, checkpoints, + flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointListChildren: + * @checkpoint: a domain checkpoint object + * @children: pointer to variable to store the array containing checkpoint + * objects, or NULL if the list is not required (just returns + * number of checkpoints) + * @flags: bitwise-OR of supported virDomainCheckpointListFlags + * + * Collect the list of domain checkpoints that are children of the given + * checkpoint, and allocate an array to store those objects. + * + * By default, this command covers only direct children; it is also possible + * to expand things to cover all descendants, when @flags includes + * VIR_DOMAIN_CHECKPOINT_LIST_DESCENDANTS. Also, some filters are provided in + * groups, where each group contains bits that describe mutually exclusive + * attributes of a checkpoint, and where all bits within a group describe + * all possible checkpoints. Some hypervisors might reject explicit bits + * from a group where the hypervisor cannot make a distinction. For a + * group supported by a given hypervisor, the behavior when no bits of a + * group are set is identical to the behavior when all bits in that group + * are set. When setting bits from more than one group, it is possible to + * select an impossible combination, in that case a hypervisor may return + * either 0 or an error. 
+ * + * The first group of @flags is VIR_DOMAIN_CHECKPOINT_LIST_LEAVES and + * VIR_DOMAIN_CHECKPOINT_LIST_NO_LEAVES, to filter based on checkpoints that + * have no further children (a leaf checkpoint). + * + * The next group of @flags is VIR_DOMAIN_CHECKPOINT_LIST_METADATA and + * VIR_DOMAIN_CHECKPOINT_LIST_NO_METADATA, for filtering checkpoints based on + * whether they have metadata that would prevent the removal of the last + * reference to a domain. + * + * Returns the number of domain checkpoints found or -1 and sets @children to + * NULL in case of error. On success, the array stored into @children is + * guaranteed to have an extra allocated element set to NULL but not included + * in the return count, to make iteration easier. The caller is responsible + * for calling virDomainCheckpointFree() on each array element, then calling + * free() on @children. + */ +int +virDomainCheckpointListChildren(virDomainCheckpointPtr checkpoint, + virDomainCheckpointPtr **children, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, children=%p, flags=0x%x", + checkpoint, children, flags); + + virResetLastError(); + + if (children) + *children = NULL; + + virCheckDomainCheckpointReturn(checkpoint, -1); + conn = checkpoint->domain->conn; + + if (conn->driver->domainCheckpointListChildren) { + int ret = conn->driver->domainCheckpointListChildren(checkpoint, + children, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointLookupByName: + * @domain: a domain object + * @name: name for the domain checkpoint + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Try to lookup a domain checkpoint based on its name. + * + * Returns a domain checkpoint object or NULL in case of failure. If the + * domain checkpoint cannot be found, then the VIR_ERR_NO_DOMAIN_CHECKPOINT + * error is raised. 
+ */ +virDomainCheckpointPtr +virDomainCheckpointLookupByName(virDomainPtr domain, + const char *name, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "name=%s, flags=0x%x", name, flags); + + virResetLastError(); + + virCheckDomainReturn(domain, NULL); + conn = domain->conn; + + virCheckNonNullArgGoto(name, error); + + if (conn->driver->domainCheckpointLookupByName) { + virDomainCheckpointPtr dom; + dom = conn->driver->domainCheckpointLookupByName(domain, name, flags); + if (!dom) + goto error; + return dom; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainHasCurrentCheckpoint: + * @domain: pointer to the domain object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Determine if the domain has a current checkpoint. + * + * Returns 1 if such checkpoint exists, 0 if it doesn't, -1 on error. + */ +int +virDomainHasCurrentCheckpoint(virDomainPtr domain, unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "flags=0x%x", flags); + + virResetLastError(); + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + if (conn->driver->domainHasCurrentCheckpoint) { + int ret = conn->driver->domainHasCurrentCheckpoint(domain, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointCurrent: + * @domain: a domain object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Get the current checkpoint for a domain, if any. + * + * virDomainCheckpointFree should be used to free the resources after the + * checkpoint object is no longer needed. + * + * Returns a domain checkpoint object or NULL in case of failure. If the + * current domain checkpoint cannot be found, then the + * VIR_ERR_NO_DOMAIN_CHECKPOINT error is raised. 
+ */ +virDomainCheckpointPtr +virDomainCheckpointCurrent(virDomainPtr domain, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "flags=0x%x", flags); + + virResetLastError(); + + virCheckDomainReturn(domain, NULL); + conn = domain->conn; + + if (conn->driver->domainCheckpointCurrent) { + virDomainCheckpointPtr snap; + snap = conn->driver->domainCheckpointCurrent(domain, flags); + if (!snap) + goto error; + return snap; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainCheckpointGetParent: + * @checkpoint: a checkpoint object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Get the parent checkpoint for @checkpoint, if any. + * + * virDomainCheckpointFree should be used to free the resources after the + * checkpoint object is no longer needed. + * + * Returns a domain checkpoint object or NULL in case of failure. If the + * given checkpoint is a root (no parent), then the VIR_ERR_NO_DOMAIN_CHECKPOINT + * error is raised. + */ +virDomainCheckpointPtr +virDomainCheckpointGetParent(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + conn = checkpoint->domain->conn; + + if (conn->driver->domainCheckpointGetParent) { + virDomainCheckpointPtr snap; + snap = conn->driver->domainCheckpointGetParent(checkpoint, flags); + if (!snap) + goto error; + return snap; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainCheckpointIsCurrent: + * @checkpoint: a checkpoint object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Determine if the given checkpoint is the domain's current checkpoint. See + * also virDomainHasCurrentCheckpoint(). + * + * Returns 1 if current, 0 if not current, or -1 on error. 
+ */ +int +virDomainCheckpointIsCurrent(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + conn = checkpoint->domain->conn; + + if (conn->driver->domainCheckpointIsCurrent) { + int ret; + ret = conn->driver->domainCheckpointIsCurrent(checkpoint, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointHasMetadata: + * @checkpoint: a checkpoint object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Determine if the given checkpoint is associated with libvirt metadata + * that would prevent the deletion of the domain. + * + * Returns 1 if the checkpoint has metadata, 0 if the checkpoint exists without + * help from libvirt, or -1 on error. + */ +int +virDomainCheckpointHasMetadata(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + conn = checkpoint->domain->conn; + + if (conn->driver->domainCheckpointHasMetadata) { + int ret; + ret = conn->driver->domainCheckpointHasMetadata(checkpoint, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointDelete: + * @checkpoint: the checkpoint to remove + * @flags: bitwise-OR of supported virDomainCheckpointDeleteFlags + * + * Removes a checkpoint from the domain. + * + * When removing a checkpoint, the record of which portions of the + * disk were dirtied after the checkpoint will be merged into the + * record tracked by the parent checkpoint, if any. 
Likewise, if the + * checkpoint being deleted was the current checkpoint, the parent + * checkpoint becomes the new current checkpoint. + * + * If @flags includes VIR_DOMAIN_CHECKPOINT_DELETE_METADATA_ONLY, then + * any checkpoint metadata tracked by libvirt is removed while keeping + * the checkpoint contents intact; if a hypervisor does not require + * any libvirt metadata to track checkpoints, then this flag is + * silently ignored. + * + * Returns 0 on success, -1 on error. + */ +int +virDomainCheckpointDelete(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + conn = checkpoint->domain->conn; + + virCheckReadOnlyGoto(conn->flags, error); + + VIR_EXCLUSIVE_FLAGS_GOTO(VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN, + VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN_ONLY, + error); + + if (conn->driver->domainCheckpointDelete) { + int ret = conn->driver->domainCheckpointDelete(checkpoint, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointRef: + * @checkpoint: the checkpoint to hold a reference on + * + * Increment the reference count on the checkpoint. For each + * additional call to this method, there shall be a corresponding + * call to virDomainCheckpointFree to release the reference count, once + * the caller no longer needs the reference to this object. + * + * This method is typically useful for applications where multiple + * threads are using a connection, and it is required that the + * connection and domain remain open until all threads have finished + * using the checkpoint. ie, each new thread using a checkpoint would + * increment the reference count. + * + * Returns 0 in case of success and -1 in case of failure. 
+ */ +int +virDomainCheckpointRef(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p, refs=%d", checkpoint, + checkpoint ? checkpoint->parent.u.s.refs : 0); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + + virObjectRef(checkpoint); + return 0; +} + + +/** + * virDomainCheckpointFree: + * @checkpoint: a domain checkpoint object + * + * Free the domain checkpoint object. The checkpoint itself is not modified. + * The data structure is freed and should not be used thereafter. + * + * Returns 0 in case of success and -1 in case of failure. + */ +int +virDomainCheckpointFree(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p", checkpoint); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + + virObjectUnref(checkpoint); + return 0; +} diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 3bf3c3f916..a3e12b9a12 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -794,6 +794,22 @@ LIBVIRT_4.4.0 { LIBVIRT_4.5.0 { global: + virDomainCheckpointCreateXML; + virDomainCheckpointCurrent; + virDomainCheckpointDelete; + virDomainCheckpointFree; + virDomainCheckpointGetConnect; + virDomainCheckpointGetDomain; + virDomainCheckpointGetParent; + virDomainCheckpointGetXMLDesc; + virDomainCheckpointHasMetadata; + virDomainCheckpointIsCurrent; + virDomainCheckpointListChildren; + virDomainCheckpointLookupByName; + virDomainCheckpointRef; + virDomainHasCurrentCheckpoint; + virDomainListCheckpoints; + virDomainCheckpointGetName; virGetLastErrorCode; virGetLastErrorDomain; virNodeGetSEVInfo; -- 2.14.4

On 06/13/2018 12:42 PM, Eric Blake wrote:
Introduce a bunch of new public APIs related to backup checkpoints. Checkpoints are modeled heavily after virDomainSnapshotPtr (both represent a point in time of the guest), although a snapshot exists with the intent of rolling back to that state, while a checkpoint exists to make it possible to create an incremental backup at a later time.
Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/Makefile.am | 3 + docs/apibuild.py | 2 + docs/docs.html.in | 1 + include/libvirt/libvirt-domain-checkpoint.h | 147 ++++++ include/libvirt/libvirt.h | 5 +- libvirt.spec.in | 1 + mingw-libvirt.spec.in | 2 + po/POTFILES | 1 + src/Makefile.am | 2 + src/driver-hypervisor.h | 60 ++- src/libvirt-domain-checkpoint.c | 708 ++++++++++++++++++++++++++++ src/libvirt_public.syms | 16 + 12 files changed, 944 insertions(+), 4 deletions(-) create mode 100644 include/libvirt/libvirt-domain-checkpoint.h create mode 100644 src/libvirt-domain-checkpoint.c
In a word... Overwhelming! I have concerns about committing the API before everyone is sure about the underlying hypervisor code. No sense in baking in an API only to discover needs/issues later. I see the checkpoint code borrows the domain connection - is that similar to snapshots? I won't go too in-depth here - mostly just scanning for obvious issues.
diff --git a/docs/Makefile.am b/docs/Makefile.am index 9620587a77..0df8ebbd64 100644 --- a/docs/Makefile.am +++ b/docs/Makefile.am @@ -25,6 +25,7 @@ apihtml = \ apihtml_generated = \ html/libvirt-libvirt-common.html \ html/libvirt-libvirt-domain.html \ + html/libvirt-libvirt-domain-checkpoint.html \ html/libvirt-libvirt-domain-snapshot.html \ html/libvirt-libvirt-event.html \ html/libvirt-libvirt-host.html \ @@ -318,6 +319,7 @@ $(python_generated_files): $(APIBUILD_STAMP) $(APIBUILD_STAMP): $(srcdir)/apibuild.py \ $(top_srcdir)/include/libvirt/libvirt.h \ $(top_srcdir)/include/libvirt/libvirt-common.h.in \ + $(top_srcdir)/include/libvirt/libvirt-domain-checkpoint.h \ $(top_srcdir)/include/libvirt/libvirt-domain-snapshot.h \ $(top_srcdir)/include/libvirt/libvirt-domain.h \ $(top_srcdir)/include/libvirt/libvirt-event.h \ @@ -334,6 +336,7 @@ $(APIBUILD_STAMP): $(srcdir)/apibuild.py \ $(top_srcdir)/include/libvirt/libvirt-admin.h \ $(top_srcdir)/include/libvirt/virterror.h \ $(top_srcdir)/src/libvirt.c \ + $(top_srcdir)/src/libvirt-domain-checkpoint.c \ $(top_srcdir)/src/libvirt-domain-snapshot.c \ $(top_srcdir)/src/libvirt-domain.c \ $(top_srcdir)/src/libvirt-host.c \ diff --git a/docs/apibuild.py b/docs/apibuild.py index 5e218a9ad0..471547cea7 100755 --- a/docs/apibuild.py +++ b/docs/apibuild.py @@ -26,6 +26,7 @@ debugsym = None included_files = { "libvirt-common.h": "header with general libvirt API definitions", "libvirt-domain.h": "header with general libvirt API definitions", + "libvirt-domain-checkpoint.h": "header with general libvirt API definitions", "libvirt-domain-snapshot.h": "header with general libvirt API definitions", "libvirt-event.h": "header with general libvirt API definitions", "libvirt-host.h": "header with general libvirt API definitions", @@ -39,6 +40,7 @@ included_files = { "virterror.h": "header with error specific API definitions", "libvirt.c": "Main interfaces for the libvirt library", "libvirt-domain.c": "Domain interfaces for the libvirt 
library", + "libvirt-domain-checkpoint.c": "Domain checkpoint interfaces for the libvirt library", "libvirt-domain-snapshot.c": "Domain snapshot interfaces for the libvirt library", "libvirt-host.c": "Host interfaces for the libvirt library", "libvirt-interface.c": "Interface interfaces for the libvirt library", diff --git a/docs/docs.html.in b/docs/docs.html.in index 11dfd27ba6..63dbdf7755 100644 --- a/docs/docs.html.in +++ b/docs/docs.html.in @@ -97,6 +97,7 @@ <dd>Reference manual for the C public API, split in <a href="html/libvirt-libvirt-common.html">common</a>, <a href="html/libvirt-libvirt-domain.html">domain</a>, + <a href="html/libvirt-libvirt-domain-checkpoint.html">domain checkpoint</a>, <a href="html/libvirt-libvirt-domain-snapshot.html">domain snapshot</a>, <a href="html/libvirt-virterror.html">error</a>, <a href="html/libvirt-libvirt-event.html">event</a>, diff --git a/include/libvirt/libvirt-domain-checkpoint.h b/include/libvirt/libvirt-domain-checkpoint.h new file mode 100644 index 0000000000..4a7dc73089 --- /dev/null +++ b/include/libvirt/libvirt-domain-checkpoint.h @@ -0,0 +1,147 @@ +/* + * libvirt-domain-checkpoint.h + * Summary: APIs for management of domain checkpoints + * Description: Provides APIs for the management of domain checkpoints + * Author: Eric Blake <eblake@redhat.com> + * + * Copyright (C) 2006-2018 Red Hat, Inc.
Since it's created in 2018 - shouldn't it just list that? Not my area of expertise by any stretch of the imagination though.
+ * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_LIBVIRT_DOMAIN_CHECKPOINT_H__ +# define __VIR_LIBVIRT_DOMAIN_CHECKPOINT_H__ + +# ifndef __VIR_LIBVIRT_H_INCLUDES__ +# error "Don't include this file directly, only use libvirt/libvirt.h" +# endif + +/** + * virDomainCheckpoint: + * + * A virDomainCheckpoint is a private structure representing a checkpoint of + * a domain. A checkpoint is useful for tracking which portions of the + * domain disks have been altered since a point in time, but by itself does + * not allow reverting back to that point in time. + */ +typedef struct _virDomainCheckpoint virDomainCheckpoint; + +/** + * virDomainCheckpointPtr: + * + * A virDomainCheckpointPtr is pointer to a virDomainCheckpoint + * private structure, and is the type used to reference a domain + * checkpoint in the API. 
+ */ +typedef virDomainCheckpoint *virDomainCheckpointPtr; + +const char *virDomainCheckpointGetName(virDomainCheckpointPtr checkpoint); +virDomainPtr virDomainCheckpointGetDomain(virDomainCheckpointPtr checkpoint); +virConnectPtr virDomainCheckpointGetConnect(virDomainCheckpointPtr checkpoint); + +typedef enum { + VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE = (1 << 0), /* Restore or alter + metadata */ + VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT = (1 << 1), /* With redefine, make + checkpoint current */ + VIR_DOMAIN_CHECKPOINT_CREATE_NO_METADATA = (1 << 2), /* Make checkpoint without + remembering it */ +} virDomainCheckpointCreateFlags; + +/* Create a checkpoint using the current VM state. */ +virDomainCheckpointPtr virDomainCheckpointCreateXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags); + +/* Dump the XML of a checkpoint */ +char *virDomainCheckpointGetXMLDesc(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +/** + * virDomainCheckpointListFlags: + * + * Flags valid for virDomainListCheckpoints() and + * virDomainCheckpointListChildren(). Note that the interpretation of + * flag (1<<0) depends on which function it is passed to; but serves + * to toggle the per-call default of whether the listing is shallow or + * recursive. Remaining bits come in groups; if all bits from a group + * are 0, then that group is not used to filter results. */ ^^ There's an extra space here
+typedef enum { + VIR_DOMAIN_CHECKPOINT_LIST_ROOTS = (1 << 0), /* Filter by checkpoints + with no parents, when + listing a domain */ + VIR_DOMAIN_CHECKPOINT_LIST_DESCENDANTS = (1 << 0), /* List all descendants, + not just children, when + listing a checkpoint */
There are two "1 << 0" entries; ironically, the doc page rendering lists these in the opposite order. Still, I see this is essentially a copy of the SnapshotListFlags. So do we really need to keep that when there are only 2 APIs?
+ + VIR_DOMAIN_CHECKPOINT_LIST_LEAVES = (1 << 1), /* Filter by checkpoints + with no children */ + VIR_DOMAIN_CHECKPOINT_LIST_NO_LEAVES = (1 << 2), /* Filter by checkpoints + that have children */ + + VIR_DOMAIN_CHECKPOINT_LIST_METADATA = (1 << 3), /* Filter by checkpoints + which have metadata */ + VIR_DOMAIN_CHECKPOINT_LIST_NO_METADATA = (1 << 4), /* Filter by checkpoints + with no metadata */ +} virDomainCheckpointListFlags;
Not quite sure where/how metadata comes into play. What metadata? Where/when was that introduced?
+ +/* Get all checkpoint objects for this domain */ +int virDomainListCheckpoints(virDomainPtr domain, + virDomainCheckpointPtr **checkpoints, + unsigned int flags); + +/* Get all checkpoint object children for this checkpoint */ +int virDomainCheckpointListChildren(virDomainCheckpointPtr checkpoint, + virDomainCheckpointPtr **children, + unsigned int flags); + +/* Get a handle to a named checkpoint */ +virDomainCheckpointPtr virDomainCheckpointLookupByName(virDomainPtr domain, + const char *name, + unsigned int flags); + +/* Check whether a domain has a checkpoint which is currently used */ +int virDomainHasCurrentCheckpoint(virDomainPtr domain, unsigned int flags);
Two lines for arguments. (consistency)
+ +/* Get a handle to the current checkpoint */ +virDomainCheckpointPtr virDomainCheckpointCurrent(virDomainPtr domain, + unsigned int flags); + +/* Get a handle to the parent checkpoint, if one exists */ +virDomainCheckpointPtr virDomainCheckpointGetParent(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +/* Determine if a checkpoint is the current checkpoint of its domain. */ +int virDomainCheckpointIsCurrent(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +/* Determine if checkpoint has metadata that would prevent domain deletion. */ +int virDomainCheckpointHasMetadata(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +/* Delete a checkpoint */ +typedef enum { + VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN = (1 << 0), /* Also delete children */ + VIR_DOMAIN_CHECKPOINT_DELETE_METADATA_ONLY = (1 << 1), /* Delete just metadata */ + VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN_ONLY = (1 << 2), /* Delete just children */ +} virDomainCheckpointDeleteFlags;
Again, not sure what the metadata entails here. Although this perhaps answers a different question I had about converging bitmaps by deleting "middle" checkpoints.
+ +int virDomainCheckpointDelete(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +int virDomainCheckpointRef(virDomainCheckpointPtr checkpoint); +int virDomainCheckpointFree(virDomainCheckpointPtr checkpoint); + +#endif /* __VIR_LIBVIRT_DOMAIN_CHECKPOINT_H__ */ diff --git a/include/libvirt/libvirt.h b/include/libvirt/libvirt.h index 26887a40e7..4e7da0afc4 100644 --- a/include/libvirt/libvirt.h +++ b/include/libvirt/libvirt.h @@ -4,7 +4,7 @@ * Description: Provides the interfaces of the libvirt library to handle * virtualized domains * - * Copyright (C) 2005-2006, 2010-2014 Red Hat, Inc. + * Copyright (C) 2005-2006, 2010-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -36,8 +36,7 @@ extern "C" { # include <libvirt/libvirt-common.h> # include <libvirt/libvirt-host.h> # include <libvirt/libvirt-domain.h> -typedef struct _virDomainCheckpoint virDomainCheckpoint; -typedef virDomainCheckpoint *virDomainCheckpointPtr; +# include <libvirt/libvirt-domain-checkpoint.h> # include <libvirt/libvirt-domain-snapshot.h> # include <libvirt/libvirt-event.h> # include <libvirt/libvirt-interface.h> diff --git a/libvirt.spec.in b/libvirt.spec.in index 50bd79a7d7..a82e97f3db 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -2102,6 +2102,7 @@ exit 0 %{_includedir}/libvirt/libvirt-admin.h %{_includedir}/libvirt/libvirt-common.h %{_includedir}/libvirt/libvirt-domain.h +%{_includedir}/libvirt/libvirt-domain-checkpoint.h %{_includedir}/libvirt/libvirt-domain-snapshot.h %{_includedir}/libvirt/libvirt-event.h %{_includedir}/libvirt/libvirt-host.h diff --git a/mingw-libvirt.spec.in b/mingw-libvirt.spec.in index 6912527cf7..3b41e69661 100644 --- a/mingw-libvirt.spec.in +++ b/mingw-libvirt.spec.in @@ -269,6 +269,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw32_includedir}/libvirt/libvirt.h %{mingw32_includedir}/libvirt/libvirt-common.h 
%{mingw32_includedir}/libvirt/libvirt-domain.h +%{mingw32_includedir}/libvirt/libvirt-domain-checkpoint.h %{mingw32_includedir}/libvirt/libvirt-domain-snapshot.h %{mingw32_includedir}/libvirt/libvirt-event.h %{mingw32_includedir}/libvirt/libvirt-host.h @@ -355,6 +356,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw64_includedir}/libvirt/libvirt.h %{mingw64_includedir}/libvirt/libvirt-common.h %{mingw64_includedir}/libvirt/libvirt-domain.h +%{mingw64_includedir}/libvirt/libvirt-domain-checkpoint.h %{mingw64_includedir}/libvirt/libvirt-domain-snapshot.h %{mingw64_includedir}/libvirt/libvirt-event.h %{mingw64_includedir}/libvirt/libvirt-host.h
Not an area I know well, but as long as the various make-with-rpm options are tested, then great. We always seem to forget something related to some obscure option every time we add something new!
diff --git a/po/POTFILES b/po/POTFILES index be2874487c..d246657188 100644 --- a/po/POTFILES +++ b/po/POTFILES @@ -69,6 +69,7 @@ src/interface/interface_backend_netcf.c src/interface/interface_backend_udev.c src/internal.h src/libvirt-admin.c +src/libvirt-domain-checkpoint.c src/libvirt-domain-snapshot.c src/libvirt-domain.c src/libvirt-host.c diff --git a/src/Makefile.am b/src/Makefile.am index db8c8ebd1a..d20d65574e 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -174,6 +174,7 @@ DRIVER_SOURCES += \ $(DATATYPES_SOURCES) \ libvirt.c libvirt_internal.h \ libvirt-domain.c \ + libvirt-domain-checkpoint.c \ libvirt-domain-snapshot.c \ libvirt-host.c \ libvirt-interface.c \ @@ -728,6 +729,7 @@ libvirt_setuid_rpc_client_la_SOURCES = \ datatypes.c \ libvirt.c \ libvirt-domain.c \ + libvirt-domain-checkpoint.c \ libvirt-domain-snapshot.c \ libvirt-host.c \ libvirt-interface.c \ diff --git a/src/driver-hypervisor.h b/src/driver-hypervisor.h index eef31eb1f0..ee8a9a3e0e 100644 --- a/src/driver-hypervisor.h +++ b/src/driver-hypervisor.h @@ -1,7 +1,7 @@ /* * driver-hypervisor.h: entry points for hypervisor drivers * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -1321,6 +1321,53 @@ typedef int int *nparams, unsigned int flags);
+typedef virDomainCheckpointPtr +(*virDrvDomainCheckpointCreateXML)(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags); + +typedef char * +(*virDrvDomainCheckpointGetXMLDesc)(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +typedef int +(*virDrvDomainListCheckpoints)(virDomainPtr domain, + virDomainCheckpointPtr **checkpoints, + unsigned int flags); + +typedef int +(*virDrvDomainCheckpointListChildren)(virDomainCheckpointPtr checkpoint, + virDomainCheckpointPtr **children, + unsigned int flags); + +typedef virDomainCheckpointPtr +(*virDrvDomainCheckpointLookupByName)(virDomainPtr domain, + const char *name, + unsigned int flags); + +typedef int +(*virDrvDomainHasCurrentCheckpoint)(virDomainPtr domain, + unsigned int flags); + +typedef virDomainCheckpointPtr +(*virDrvDomainCheckpointGetParent)(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +typedef virDomainCheckpointPtr +(*virDrvDomainCheckpointCurrent)(virDomainPtr domain, + unsigned int flags); + +typedef int +(*virDrvDomainCheckpointIsCurrent)(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +typedef int +(*virDrvDomainCheckpointHasMetadata)(virDomainCheckpointPtr checkpoint, + unsigned int flags); + +typedef int +(*virDrvDomainCheckpointDelete)(virDomainCheckpointPtr checkpoint, + unsigned int flags);
typedef struct _virHypervisorDriver virHypervisorDriver; typedef virHypervisorDriver *virHypervisorDriverPtr; @@ -1572,6 +1619,17 @@ struct _virHypervisorDriver { virDrvConnectBaselineHypervisorCPU connectBaselineHypervisorCPU; virDrvNodeGetSEVInfo nodeGetSEVInfo; virDrvDomainGetLaunchSecurityInfo domainGetLaunchSecurityInfo; + virDrvDomainCheckpointCreateXML domainCheckpointCreateXML; + virDrvDomainCheckpointGetXMLDesc domainCheckpointGetXMLDesc; + virDrvDomainListCheckpoints domainListCheckpoints; + virDrvDomainCheckpointListChildren domainCheckpointListChildren; + virDrvDomainCheckpointLookupByName domainCheckpointLookupByName; + virDrvDomainHasCurrentCheckpoint domainHasCurrentCheckpoint; + virDrvDomainCheckpointGetParent domainCheckpointGetParent; + virDrvDomainCheckpointCurrent domainCheckpointCurrent; + virDrvDomainCheckpointIsCurrent domainCheckpointIsCurrent; + virDrvDomainCheckpointHasMetadata domainCheckpointHasMetadata; + virDrvDomainCheckpointDelete domainCheckpointDelete; };
diff --git a/src/libvirt-domain-checkpoint.c b/src/libvirt-domain-checkpoint.c new file mode 100644 index 0000000000..12511a13ee --- /dev/null +++ b/src/libvirt-domain-checkpoint.c @@ -0,0 +1,708 @@ +/* + * libvirt-domain-checkpoint.c: entry points for virDomainCheckpointPtr APIs + * + * Copyright (C) 2006-2014, 2018 Red Hat, Inc.
Similar year thing
+ * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#include "datatypes.h" +#include "virlog.h" + +VIR_LOG_INIT("libvirt.domain-checkpoint"); + +#define VIR_FROM_THIS VIR_FROM_DOMAIN_CHECKPOINT + +/** + * virDomainCheckpointGetName: + * @checkpoint: a checkpoint object + * + * Get the public name for that checkpoint + * + * Returns a pointer to the name or NULL, the string need not be deallocated + * as its lifetime will be the same as the checkpoint object. + */ +const char * +virDomainCheckpointGetName(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p", checkpoint); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + + return checkpoint->name; +} + + +/** + * virDomainCheckpointGetDomain: + * @checkpoint: a checkpoint object + * + * Provides the domain pointer associated with a checkpoint. The + * reference counter on the domain is not increased by this + * call.
Seems call could fit on previous line.
+ * + * Returns the domain or NULL. + */ +virDomainPtr +virDomainCheckpointGetDomain(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p", checkpoint); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + + return checkpoint->domain; +} + + +/** + * virDomainCheckpointGetConnect: + * @checkpoint: a checkpoint object + * + * Provides the connection pointer associated with a checkpoint. The + * reference counter on the connection is not increased by this + * call.
Previous line again.
+ * + * Returns the connection or NULL. + */ +virConnectPtr +virDomainCheckpointGetConnect(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p", checkpoint); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + + return checkpoint->domain->conn; +} + + +/** + * virDomainCheckpointCreateXML: + * @domain: a domain object + * @xmlDesc: description of the checkpoint to create + * @flags: bitwise-OR of supported virDomainCheckpointCreateFlags + * + * Create a new checkpoint using @xmlDesc on a running @domain. + * Typically, it is more common to create a new checkpoint as part of + * kicking off a backup job with virDomainBackupBegin(); however, it + * is also possible to start a checkpoint without a backup.
Should we state that the domain needs to have an open connection already, or is that for free? (too lazy to check right now ;-))
+ * + * See formatcheckpoint.html#CheckpointAttributes document for more + * details on @xmlDesc. + * + * If @flags includes VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE, then this + * is a request to reinstate checkpoint metadata that was previously + * discarded, rather than creating a new checkpoint. When redefining + * checkpoint metadata, the current checkpoint will not be altered + * unless the VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT flag is also + * present. It is an error to request the + * VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT flag without + * VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE. + * + * If @flags includes VIR_DOMAIN_CHECKPOINT_CREATE_NO_METADATA, then + * the domain's disk images are modified according to @xmlDesc, but + * then the just-created checkpoint has its metadata deleted. This + * flag is incompatible with VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE. + * + * Returns an (opaque) new virDomainCheckpointPtr on success, or NULL + * on failure. + */
Not sure I quite yet understand the "metadata" reference... Not sure how much of this is cut-n-paste from the snapshot world.
+virDomainCheckpointPtr +virDomainCheckpointCreateXML(virDomainPtr domain, + const char *xmlDesc, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "xmlDesc=%s, flags=0x%x", xmlDesc, flags); + + virResetLastError(); + + virCheckDomainReturn(domain, NULL); + conn = domain->conn; + + virCheckNonNullArgGoto(xmlDesc, error); + virCheckReadOnlyGoto(conn->flags, error); + + VIR_REQUIRE_FLAG_GOTO(VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT, + VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE, + error); + + VIR_EXCLUSIVE_FLAGS_GOTO(VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE, + VIR_DOMAIN_CHECKPOINT_CREATE_NO_METADATA, + error); + + if (conn->driver->domainCheckpointCreateXML) { + virDomainCheckpointPtr ret; + ret = conn->driver->domainCheckpointCreateXML(domain, xmlDesc, flags); + if (!ret) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainCheckpointGetXMLDesc: + * @checkpoint: a domain checkpoint object + * @flags: bitwise-OR of subset of virDomainXMLFlags + * + * Provide an XML description of the domain checkpoint. + * + * No security-sensitive data will be included unless @flags contains + * VIR_DOMAIN_XML_SECURE; this flag is rejected on read-only + * connections. For this API, @flags should not contain either + * VIR_DOMAIN_XML_INACTIVE or VIR_DOMAIN_XML_UPDATE_CPU.
New paragraph for the last sentence, and perhaps just state "This API does not support the xx or xx flags."
+ * + * Returns a 0 terminated UTF-8 encoded XML instance, or NULL in case of error. + * the caller must free() the returned value. + */ +char * +virDomainCheckpointGetXMLDesc(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + conn = checkpoint->domain->conn; + + if ((conn->flags & VIR_CONNECT_RO) && (flags & VIR_DOMAIN_XML_SECURE)) { + virReportError(VIR_ERR_OPERATION_DENIED, "%s", + _("virDomainCheckpointGetXMLDesc with secure flag")); + goto error; + } + + if (conn->driver->domainCheckpointGetXMLDesc) { + char *ret; + ret = conn->driver->domainCheckpointGetXMLDesc(checkpoint, flags); + if (!ret) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainListCheckpoints: + * @domain: a domain object + * @checkpoints: pointer to variable to store the array containing checkpoint + * objects, or NULL if the list is not required (just returns
s/objects,/objects/
+ * number of checkpoints) + * @flags: bitwise-OR of supported virDomainCheckpoinListFlags
CheckpointListFlags
+ * + * Collect the list of domain checkpoints for the given domain, and allocate + * an array to store those objects. + * + * By default, this command covers all checkpoints; it is also possible to + * limit things to just checkpoints with no parents, when @flags includes + * VIR_DOMAIN_CHECKPOINT_LIST_ROOTS. Additional filters are provided in + * groups, where each group contains bits that describe mutually exclusive + * attributes of a checkpoint, and where all bits within a group describe + * all possible checkpoints. Some hypervisors might reject explicit bits + * from a group where the hypervisor cannot make a distinction. For a + * group supported by a given hypervisor, the behavior when no bits of a + * group are set is identical to the behavior when all bits in that group + * are set. When setting bits from more than one group, it is possible to + * select an impossible combination, in that case a hypervisor may return + * either 0 or an error.
Huh, what? This is a really confusing statement. Considering "other factors", rather than returning multiple levels of data, maybe we ought to consider simplifying our lives and only returning the main/top-level checkpoint object, forcing the consumer to handle all the iteration logic if they so desire. The concern is that you end up 100 levels deep, with too much data and timeouts (think backingStore issues). Of course that alters the name of the API a bit. Assuming the driver level could easily return a count of checkpoints, that gives the consumer all the ammunition they need, I would think. The way the logic works now it's an all-or-none type thing. Let the consumer decide how far they want to chase.
+ * + * The first group of @flags is VIR_DOMAIN_CHECKPOINT_LIST_LEAVES and + * VIR_DOMAIN_CHECKPOINT_LIST_NO_LEAVES, to filter based on checkpoints that + * have no further children (a leaf checkpoint). + * + * The next group of @flags is VIR_DOMAIN_CHECKPOINT_LIST_METADATA and + * VIR_DOMAIN_CHECKPOINT_LIST_NO_METADATA, for filtering checkpoints based on + * whether they have metadata that would prevent the removal of the last + * reference to a domain. + * + * Returns the number of domain checkpoints found or -1 and sets @checkpoints + * to NULL in case of error. On success, the array stored into @checkpoints + * is guaranteed to have an extra allocated element set to NULL but not + * included in the return count, to make iteration easier. The caller is + * responsible for calling virDomainCheckpointFree() on each array element, + * then calling free() on @checkpoints. + */ +int +virDomainListCheckpoints(virDomainPtr domain, + virDomainCheckpointPtr **checkpoints, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "checkpoints=%p, flags=0x%x", checkpoints, flags); + + virResetLastError(); + + if (checkpoints) + *checkpoints = NULL; + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + if (conn->driver->domainListCheckpoints) { + int ret = conn->driver->domainListCheckpoints(domain, checkpoints, + flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointListChildren: + * @checkpoint: a domain checkpoint object + * @children: pointer to variable to store the array containing checkpoint + * objects, or NULL if the list is not required (just returns
s/objects,/objects
+ * number of checkpoints) + * @flags: bitwise-OR of supported virDomainCheckpointListFlags + * + * Collect the list of domain checkpoints that are children of the given + * checkpoint, and allocate an array to store those objects.
Assuming this uses the open domain connection pointer from the provided checkpoint.
+ * + * By default, this command covers only direct children; it is also possible + * to expand things to cover all descendants, when @flags includes + * VIR_DOMAIN_CHECKPOINT_LIST_DESCENDANTS. Also, some filters are provided in + * groups, where each group contains bits that describe mutually exclusive + * attributes of a snapshot, and where all bits within a group describe
s/snapshot/checkpoint
+ * all possible snapshots. Some hypervisors might reject explicit bits
s/snapshots/checkpoints
+ * from a group where the hypervisor cannot make a distinction. For a + * group supported by a given hypervisor, the behavior when no bits of a + * group are set is identical to the behavior when all bits in that group + * are set. When setting bits from more than one group, it is possible to + * select an impossible combination, in that case a hypervisor may return + * either 0 or an error.
Similar traversal logic concerns here (with a similar name change, Children to Child).
+ * + * The first group of @flags is VIR_DOMAIN_CHECKPOINT_LIST_LEAVES and + * VIR_DOMAIN_CHECKPOINT_LIST_NO_LEAVES, to filter based on checkpoints that + * have no further children (a leaf checkpoint). + * + * The next group of @flags is VIR_DOMAIN_CHECKPOINT_LIST_METADATA and + * VIR_DOMAIN_CHECKPOINT_LIST_NO_METADATA, for filtering checkpoints based on + * whether they have metadata that would prevent the removal of the last + * reference to a domain. + * + * Returns the number of domain checkpoints found or -1 and sets @children to + * NULL in case of error. On success, the array stored into @children is + * guaranteed to have an extra allocated element set to NULL but not included + * in the return count, to make iteration easier. The caller is responsible + * for calling virDomainCheckpointFree() on each array element, then calling + * free() on @children. + */ +int +virDomainCheckpointListChildren(virDomainCheckpointPtr checkpoint, + virDomainCheckpointPtr **children, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, children=%p, flags=0x%x", + checkpoint, children, flags); + + virResetLastError(); + + if (children) + *children = NULL; + + virCheckDomainCheckpointReturn(checkpoint, -1); + conn = checkpoint->domain->conn; + + if (conn->driver->domainCheckpointListChildren) { + int ret = conn->driver->domainCheckpointListChildren(checkpoint, + children, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointLookupByName: + * @domain: a domain object + * @name: name for the domain checkpoint + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Try to lookup a domain checkpoint based on its name. + * + * Returns a domain checkpoint object or NULL in case of failure. If the + * domain checkpoint cannot be found, then the VIR_ERR_NO_DOMAIN_CHECKPOINT + * error is raised. 
+ */ +virDomainCheckpointPtr +virDomainCheckpointLookupByName(virDomainPtr domain, + const char *name, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "name=%s, flags=0x%x", name, flags); + + virResetLastError(); + + virCheckDomainReturn(domain, NULL); + conn = domain->conn; + + virCheckNonNullArgGoto(name, error); + + if (conn->driver->domainCheckpointLookupByName) { + virDomainCheckpointPtr dom; + dom = conn->driver->domainCheckpointLookupByName(domain, name, flags); + if (!dom) + goto error; + return dom; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainHasCurrentCheckpoint: + * @domain: pointer to the domain object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Determine if the domain has a current checkpoint. + * + * Returns 1 if such checkpoint exists, 0 if it doesn't, -1 on error. + */ +int +virDomainHasCurrentCheckpoint(virDomainPtr domain, unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "flags=0x%x", flags); + + virResetLastError(); + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + if (conn->driver->domainHasCurrentCheckpoint) { + int ret = conn->driver->domainHasCurrentCheckpoint(domain, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointCurrent: + * @domain: a domain object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Get the current checkpoint for a domain, if any. + * + * virDomainCheckpointFree should be used to free the resources after the + * checkpoint object is no longer needed. + * + * Returns a domain checkpoint object or NULL in case of failure. If the + * current domain checkpoint cannot be found, then the + * VIR_ERR_NO_DOMAIN_CHECKPOINT error is raised. 
+ */ +virDomainCheckpointPtr +virDomainCheckpointCurrent(virDomainPtr domain, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "flags=0x%x", flags); + + virResetLastError(); + + virCheckDomainReturn(domain, NULL); + conn = domain->conn; + + if (conn->driver->domainCheckpointCurrent) { + virDomainCheckpointPtr snap;
s/snap/checkpoint
+ snap = conn->driver->domainCheckpointCurrent(domain, flags); + if (!snap) + goto error; + return snap; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainCheckpointGetParent: + * @checkpoint: a checkpoint object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Get the parent checkpoint for @checkpoint, if any. + * + * virDomainCheckpointFree should be used to free the resources after the + * checkpoint object is no longer needed. + * + * Returns a domain checkpoint object or NULL in case of failure. If the + * given checkpoint is a root (no parent), then the VIR_ERR_NO_DOMAIN_CHECKPOINT + * error is raised. + */ +virDomainCheckpointPtr +virDomainCheckpointGetParent(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, NULL); + conn = checkpoint->domain->conn; + + if (conn->driver->domainCheckpointGetParent) { + virDomainCheckpointPtr snap;
s/snap/checkpoint
+ snap = conn->driver->domainCheckpointGetParent(checkpoint, flags); + if (!snap) + goto error; + return snap; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainCheckpointIsCurrent: + * @checkpoint: a checkpoint object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Determine if the given checkpoint is the domain's current checkpoint. See + * also virDomainHasCurrentCheckpoint(). + * + * Returns 1 if current, 0 if not current, or -1 on error. + */ +int +virDomainCheckpointIsCurrent(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + conn = checkpoint->domain->conn; + + if (conn->driver->domainCheckpointIsCurrent) { + int ret; + ret = conn->driver->domainCheckpointIsCurrent(checkpoint, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointHasMetadata: + * @checkpoint: a checkpoint object + * @flags: extra flags; not used yet, so callers should always pass 0 + * + * Determine if the given checkpoint is associated with libvirt metadata + * that would prevent the deletion of the domain. + * + * Returns 1 if the checkpoint has metadata, 0 if the checkpoint exists without + * help from libvirt, or -1 on error. 
+ */ +int +virDomainCheckpointHasMetadata(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + conn = checkpoint->domain->conn; + + if (conn->driver->domainCheckpointHasMetadata) { + int ret; + ret = conn->driver->domainCheckpointHasMetadata(checkpoint, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointDelete: + * @checkpoint: the checkpoint to remove + * @flags: not used yet, pass 0 + * @flags: bitwise-OR of supported virDomainCheckpointDeleteFlags + * + * Removes a checkpoint from the domain. + * + * When removing a checkpoint, the record of which portions of the + * disk were dirtied after the checkpoint will be merged into the + * record tracked by the parent checkpoint, if any. Likewise, if the + * checkpoint being deleted was the current checkpoint, the parent + * checkpoint becomes the new current checkpoint. + * + * If @flags includes VIR_DOMAIN_CHECKPOINT_DELETE_METADATA_ONLY, then + * any checkpoint metadata tracked by libvirt is removed while keeping + * the checkpoint contents intact; if a hypervisor does not require + * any libvirt metadata to track checkpoints, then this flag is + * silently ignored. + * + * Returns 0 on success, -1 on error. 
+ */ +int +virDomainCheckpointDelete(virDomainCheckpointPtr checkpoint, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DEBUG("checkpoint=%p, flags=0x%x", checkpoint, flags); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + conn = checkpoint->domain->conn; + + virCheckReadOnlyGoto(conn->flags, error); + + VIR_EXCLUSIVE_FLAGS_GOTO(VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN, + VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN_ONLY, + error); + + if (conn->driver->domainCheckpointDelete) { + int ret = conn->driver->domainCheckpointDelete(checkpoint, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainCheckpointRef: + * @checkpoint: the checkpoint to hold a reference on + * + * Increment the reference count on the checkpoint. For each + * additional call to this method, there shall be a corresponding + * call to virDomainCheckpointFree to release the reference count, once + * the caller no longer needs the reference to this object. + * + * This method is typically useful for applications where multiple + * threads are using a connection, and it is required that the + * connection and domain remain open until all threads have finished + * using the checkpoint. ie, each new thread using a checkpoint would + * increment the reference count.
Kind of a "gray area" when checkpoints use domain objects, which in turn own the connection. Almost makes me wonder whether, on creating a checkpoint object, we should just take a dom->conn ref to ensure that conn doesn't get freed inadvertently. John
+ * + * Returns 0 in case of success and -1 in case of failure. + */ +int +virDomainCheckpointRef(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p, refs=%d", checkpoint, + checkpoint ? checkpoint->parent.u.s.refs : 0); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + + virObjectRef(checkpoint); + return 0; +} + + +/** + * virDomainCheckpointFree: + * @checkpoint: a domain checkpoint object + * + * Free the domain checkpoint object. The checkpoint itself is not modified. + * The data structure is freed and should not be used thereafter. + * + * Returns 0 in case of success and -1 in case of failure. + */ +int +virDomainCheckpointFree(virDomainCheckpointPtr checkpoint) +{ + VIR_DEBUG("checkpoint=%p", checkpoint); + + virResetLastError(); + + virCheckDomainCheckpointReturn(checkpoint, -1); + + virObjectUnref(checkpoint); + return 0; +} diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 3bf3c3f916..a3e12b9a12 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -794,6 +794,22 @@ LIBVIRT_4.4.0 {
LIBVIRT_4.5.0 { global: + virDomainCheckpointCreateXML; + virDomainCheckpointCurrent; + virDomainCheckpointDelete; + virDomainCheckpointFree; + virDomainCheckpointGetConnect; + virDomainCheckpointGetDomain; + virDomainCheckpointGetParent; + virDomainCheckpointGetXMLDesc; + virDomainCheckpointHasMetadata; + virDomainCheckpointIsCurrent; + virDomainCheckpointListChildren; + virDomainCheckpointLookupByName; + virDomainCheckpointRef; + virDomainHasCurrentCheckpoint; + virDomainListCheckpoints; + virDomainCheckpointGetName; virGetLastErrorCode; virGetLastErrorDomain; virNodeGetSEVInfo;

On 06/25/2018 06:16 PM, John Ferlan wrote:
On 06/13/2018 12:42 PM, Eric Blake wrote:
Introduce a bunch of new public APIs related to backup checkpoints. Checkpoints are modeled heavily after virDomainSnapshotPtr (both represent a point in time of the guest), although a snapshot exists with the intent of rolling back to that state, while a checkpoint exists to make it possible to create an incremental backup at a later time.
Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/Makefile.am | 3 + docs/apibuild.py | 2 + docs/docs.html.in | 1 + include/libvirt/libvirt-domain-checkpoint.h | 147 ++++++ include/libvirt/libvirt.h | 5 +- libvirt.spec.in | 1 + mingw-libvirt.spec.in | 2 + po/POTFILES | 1 + src/Makefile.am | 2 + src/driver-hypervisor.h | 60 ++- src/libvirt-domain-checkpoint.c | 708 ++++++++++++++++++++++++++++ src/libvirt_public.syms | 16 + 12 files changed, 944 insertions(+), 4 deletions(-) create mode 100644 include/libvirt/libvirt-domain-checkpoint.h create mode 100644 src/libvirt-domain-checkpoint.c
In a word... Overwhelming!
Yeah, it's been a lot of code on my end, and more still to come.
I have concerns related to committing the API before everyone is sure about the underlying hypervisor code. No sense in baking in an API only to find out later needs/issues. It seems we
Incomplete sentence? But it definitely explains why I want a working demo of the API in use, even if targeting what is currently experimental qemu commands, to show that the API does what we want.
I see checkpoint code borrows the domain connection - is that similar to snapshots?
Yes, I was very heavily borrowing the code for snapshots, as both items are sub-objects that are related to a domain at a point in time.
I won't go too in depth here - mostly just scan to look for obvious issues.
+++ b/include/libvirt/libvirt-domain-checkpoint.h @@ -0,0 +1,147 @@ +/* + * libvirt-domain-checkpoint.h + * Summary: APIs for management of domain checkpoints + * Description: Provides APIs for the management of domain checkpoints + * Author: Eric Blake <eblake@redhat.com> + * + * Copyright (C) 2006-2018 Red Hat, Inc.
Since it's created in 2018 - shouldn't it just list that? Not my area of expertise by any stretch of the imagination though.
Since I copied-and-pasted from snapshots, I like to keep the full range of years from my template code; I could use just 2018 by calling it new code, but I don't see it being much of an issue either way (no one will use this header in isolation, so whether it has the project's full copyright years, or just this file's earliest year of existence, doesn't really matter when git history will paint a full picture).
+/** + * virDomainCheckpointListFlags: + * + * Flags valid for virDomainListCheckpoints() and + * virDomainCheckpointListChildren(). Note that the interpretation of + * flag (1<<0) depends on which function it is passed to; but serves + * to toggle the per-call default of whether the listing is shallow or + * recursive. Remaining bits come in groups; if all bits from a group + * are 0, then that group is not used to filter results. */ ^^ There's an extra space here
Yeah, I tend to do two spaces after full stop (old-school typewriting convention). I can make it a point to use just one when revising this patch, although I don't think it makes too much difference either way.
+typedef enum { + VIR_DOMAIN_CHECKPOINT_LIST_ROOTS = (1 << 0), /* Filter by checkpoints + with no parents, when + listing a domain */ + VIR_DOMAIN_CHECKPOINT_LIST_DESCENDANTS = (1 << 0), /* List all descendants, + not just children, when + listing a checkpoint */
There's two "1 << 0" entries - ironically the doc page render lists these in the opposite order. Still I see this is essentially a copy of the SnapshotListFlags. So do we really need to keep that when there's only 2 API's?
That's the same way snapshots did it, and yes, two separate names make the most sense for both consistency and usage. That is, you call either:

virDomainListCheckpoints(,0) (list all checkpoints, by recursing)
virDomainListCheckpoints(,_LIST_ROOTS) (list only roots, by not recursing)
virDomainCheckpointListChildren(,0) (list only direct children, by not recursing)
virDomainCheckpointListChildren(,_LIST_DESCENDANTS) (list all generations, by recursing)

but it was easier to declare one set of flags than two separate enums for the one difference. There's also the fact that the recurse-or-not bit has a different sense between the two APIs, so having two different names for the bit makes it easier to see which sense you are getting.
+ + VIR_DOMAIN_CHECKPOINT_LIST_LEAVES = (1 << 1), /* Filter by checkpoints + with no children */ + VIR_DOMAIN_CHECKPOINT_LIST_NO_LEAVES = (1 << 2), /* Filter by checkpoints + that have children */ + + VIR_DOMAIN_CHECKPOINT_LIST_METADATA = (1 << 3), /* Filter by checkpoints + which have metadata */ + VIR_DOMAIN_CHECKPOINT_LIST_NO_METADATA = (1 << 4), /* Filter by checkpoints + with no metadata */ +} virDomainCheckpointListFlags;
Not quite sure where/how metadata comes into play. What metadata? Where/when was that introduced?
I'm still not convinced we need this flag; we could omit it for the initial introduction and add it later if it proves useful; however, it is modeled after what proved useful for snapshots. The idea here is that if qemu bitmaps learn the ability to track their parent bitmap, then libvirt could reconstruct the checkpoint chain purely by reading the qcow2 metadata, instead of having to track XML itself. Such tracking will be limited (libvirt wants to store extra metadata such as 'description', 'timestamp', and the full 'domain' layout at the time of the checkpoint, which there is no room to store in qcow2); but if we allow libvirt to import checkpoints by reading what bitmaps already exist in the qcow2 file, then knowing which checkpoints have no metadata (the ones reconstructed from qcow2) vs. DO have metadata (the ones that libvirt has tracked in secondary XML files) will be useful.
+/* Delete a checkpoint */ +typedef enum { + VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN = (1 << 0), /* Also delete children */ + VIR_DOMAIN_CHECKPOINT_DELETE_METADATA_ONLY = (1 << 1), /* Delete just metadata */ + VIR_DOMAIN_CHECKPOINT_DELETE_CHILDREN_ONLY = (1 << 2), /* Delete just children */ +} virDomainCheckpointDeleteFlags;
Again, not sure what the metadata entails here. Although this perhaps answer a different question I had about converging bitmaps by deleting "middle" checkpoints.
Again, modeled after snapshots. The 8 combinations result in:

children | metadata_only | children_only
   0            0               0        - delete one snapshot (both the bitmap in qcow2 and the libvirt XML)
   0            0               1        - invalid (children_only requires children)
   0            1               0        - delete the libvirt XML, but leave the bitmap in qcow2
   0            1               1        - invalid
   1            0               0        - delete a full tree of snapshots (this one and all its children), including libvirt XML
   1            0               1        - delete a partial tree of snapshots (all the children, but leave this one intact)
   1            1               0        - delete full tree of libvirt XML but leave bitmaps in qcow2
   1            1               1        - delete partial tree of libvirt XML but leave bitmaps

Hmm - I didn't document _CHILDREN or _CHILDREN_ONLY in the .c file. Maybe I should just delete those flags from here for now (again, we can always add flags later).
@@ -355,6 +356,7 @@ rm -rf $RPM_BUILD_ROOT%{mingw64_libexecdir}/libvirt-guests.sh %{mingw64_includedir}/libvirt/libvirt.h %{mingw64_includedir}/libvirt/libvirt-common.h %{mingw64_includedir}/libvirt/libvirt-domain.h +%{mingw64_includedir}/libvirt/libvirt-domain-checkpoint.h %{mingw64_includedir}/libvirt/libvirt-domain-snapshot.h %{mingw64_includedir}/libvirt/libvirt-event.h %{mingw64_includedir}/libvirt/libvirt-host.h
Not an area I know well, but as long as the various make-with-rpm options are tested, then great. We always seem to forget something related to some obscure option every time we add something new!
Also I found a lot of these places by grepping for domain snapshots - if a file mentioned one, it should probably mention the other :)
+/** + * virDomainCheckpointGetDomain: + * @checkpoint: a checkpoint object + * + * Provides the domain pointer associated with a checkpoint. The + * reference counter on the domain is not increased by this + * call.
Seems call could fit on previous line.
Maybe. And sometimes emacs is funny when it reflows paragraphs. It doesn't affect the generated html docs, though.
+/** + * virDomainCheckpointCreateXML: + * @domain: a domain object + * @xmlDesc: description of the checkpoint to create + * @flags: bitwise-OR of supported virDomainCheckpointCreateFlags + * + * Create a new checkpoint using @xmlDesc on a running @domain. + * Typically, it is more common to create a new checkpoint as part of + * kicking off a backup job with virDomainBackupBegin(); however, it + * is also possible to start a checkpoint without a backup.
Should we state that the domain needs to have an open connection already or is that for free (too lazy to check right now ;-))
A valid virDomainCheckpointPtr implies an open connection already.
+ * + * See formatcheckpoint.html#CheckpointAttributes document for more + * details on @xmlDesc. + * + * If @flags includes VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE, then this + * is a request to reinstate checkpoint metadata that was previously + * discarded, rather than creating a new checkpoint. When redefining + * checkpoint metadata, the current checkpoint will not be altered + * unless the VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT flag is also + * present. It is an error to request the + * VIR_DOMAIN_CHECKPOINT_CREATE_CURRENT flag without + * VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE. + * + * If @flags includes VIR_DOMAIN_CHECKPOINT_CREATE_NO_METADATA, then + * the domain's disk images are modified according to @xmlDesc, but + * then the just-created checkpoint has its metadata deleted. This + * flag is incompatible with VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE. + * + * Returns an (opaque) new virDomainCheckpointPtr on success, or NULL + * on failure. + */
Not sure I quite yet understand the "metadata" reference... Not sure how much of this is cut-n-paste from the snapshot world.
I'm not opposed to trimming the METADATA flag from the first revision, then adding it later if it makes sense.
+/** + * virDomainCheckpointGetXMLDesc: + * @checkpoint: a domain checkpoint object + * @flags: bitwise-OR of subset of virDomainXMLFlags + * + * Provide an XML description of the domain checkpoint. + * + * No security-sensitive data will be included unless @flags contains + * VIR_DOMAIN_XML_SECURE; this flag is rejected on read-only + * connections. For this API, @flags should not contain either + * VIR_DOMAIN_XML_INACTIVE or VIR_DOMAIN_XML_UPDATE_CPU.
New paragraph for the last sentence, and perhaps just state "This API does not support the xx or xx flags."
Copy-and-paste from snapshots, but sure, I can clean it up.
+/** + * virDomainListCheckpoints: + * @domain: a domain object + * @checkpoints: pointer to variable to store the array containing checkpoint + * objects, or NULL if the list is not required (just returns
s/objects,/objects/
+ * number of checkpoints) + * @flags: bitwise-OR of supported virDomainCheckpoinListFlags
CheckpointListFlags
+ * + * Collect the list of domain checkpoints for the given domain, and allocate + * an array to store those objects. + * + * By default, this command covers all checkpoints; it is also possible to + * limit things to just checkpoints with no parents, when @flags includes + * VIR_DOMAIN_CHECKPOINT_LIST_ROOTS. Additional filters are provided in + * groups, where each group contains bits that describe mutually exclusive + * attributes of a checkpoint, and where all bits within a group describe + * all possible checkpoints. Some hypervisors might reject explicit bits + * from a group where the hypervisor cannot make a distinction. For a + * group supported by a given hypervisor, the behavior when no bits of a + * group are set is identical to the behavior when all bits in that group + * are set. When setting bits from more than one group, it is possible to + * select an impossible combination, in that case a hypervisor may return + * either 0 or an error.
Huh, what? This is really a confusing statement.
Oh well, copied from snapshots.
Considering "other factors" - rather than returning multiple levels of data, maybe we ought to consider simplifying our lives and only returning the main/top-level checkpoint object, forcing the consumer to handle all the iteration logic if they so desire. The concern being that you end up 100 levels deep, with too much data and timeouts (think backingStore issues). Of course that alters the name of the API a bit.
Assuming the driver level could easily return a count of checkpoints, that allows the consumer all the ammunition they need I would think.
Snapshots already had an API that returned a count - it was racy (by the time you queried the count, another thread could have changed the count). We've already learned that the List* functions should return a filterable list of objects, and then provide enough filters to be useful so that a user can narrow the list rather than getting too much data.
The way the logic works now is a none-or-all type of thing. Let the consumer decide how far they want to chase.
The other issue is that if I have a sequence of checkpoints: Mon .. Tue .. Wed where Mon is the root, but I want to perform an incremental backup of just the data modified since Wed, then having to do virDomainListCheckpoints(mydom) to get Mon, then virDomainCheckpointListChildren(Mon) to get Tue, then virDomainCheckpointListChildren(Tue) to get Wed, is slower than just doing virDomainListCheckpoints(mydom) to get the set 'Mon, Tue, Wed' up front.
+ +/** + * virDomainCheckpointListChildren: + * @checkpoint: a domain checkpoint object + * @children: pointer to variable to store the array containing checkpoint + * objects, or NULL if the list is not required (just returns
s/objects,/objects
+ * number of checkpoints) + * @flags: bitwise-OR of supported virDomainCheckpointListFlags + * + * Collect the list of domain checkpoints that are children of the given + * checkpoint, and allocate an array to store those objects.
Assuming it uses the open domain connection pointer from the provided checkpoint.
Yes, and similar to snapshots.
+ + if (conn->driver->domainCheckpointCurrent) { + virDomainCheckpointPtr snap;
s/snap/checkpoint
Yep, there's probably a few of these still lurking in my series.
+/** + * virDomainCheckpointRef: + * @checkpoint: the checkpoint to hold a reference on + * + * Increment the reference count on the checkpoint. For each + * additional call to this method, there shall be a corresponding + * call to virDomainCheckpointFree to release the reference count, once + * the caller no longer needs the reference to this object. + * + * This method is typically useful for applications where multiple + * threads are using a connection, and it is required that the + * connection and domain remain open until all threads have finished + * using the checkpoint. ie, each new thread using a checkpoint would + * increment the reference count.
Kind of the "gray area" when checkpoints use domain objects which would own the connection. Almost makes me wonder whether, by creating a checkpoint object, we should just take a dom->conn ref to ensure that conn doesn't get freed inadvertently.
This one is copied directly from snapshots; if one should grab a reference to the domain and/or connection, then both should (right now, neither do; the object is valid only as long as your domain object and connection also remain valid). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

Introduce a few more new public APIs related to incremental backups. This builds on the previous notion of a checkpoint (without an existing checkpoint, the new API is a full backup, differing only from virDomainBlockCopy() in the point of time chosen); and also allows creation of a new checkpoint at the same time as starting the backup (after all, an incremental backup is only useful if it covers the state since the previous backup). It also enhances event reporting for signaling when a push model backup completes (where the hypervisor creates the backup); note that the pull model does not have an event (starting the backup lets a third party access the data, and only the third party knows when it is finished). Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-checkpoint.h | 11 ++ include/libvirt/libvirt-domain.h | 14 +- src/driver-hypervisor.h | 14 ++ src/libvirt-domain-checkpoint.c | 200 ++++++++++++++++++++++++++++ src/libvirt-domain.c | 8 +- src/libvirt_public.syms | 3 + tools/virsh-domain.c | 3 +- 7 files changed, 249 insertions(+), 4 deletions(-) diff --git a/include/libvirt/libvirt-domain-checkpoint.h b/include/libvirt/libvirt-domain-checkpoint.h index 4a7dc73089..c1d382fddc 100644 --- a/include/libvirt/libvirt-domain-checkpoint.h +++ b/include/libvirt/libvirt-domain-checkpoint.h @@ -144,4 +144,15 @@ int virDomainCheckpointDelete(virDomainCheckpointPtr checkpoint, int virDomainCheckpointRef(virDomainCheckpointPtr checkpoint); int virDomainCheckpointFree(virDomainCheckpointPtr checkpoint); +/* Begin an incremental backup job, possibly creating a checkpoint. */ +int virDomainBackupBegin(virDomainPtr domain, const char *diskXml, + const char *checkpointXml, unsigned int flags); + +/* Learn about an ongoing backup job. */ +char *virDomainBackupGetXMLDesc(virDomainPtr domain, int id, + unsigned int flags); + +/* Complete an incremental backup job. 
*/ +int virDomainBackupEnd(virDomainPtr domain, int id, unsigned int flags); + #endif /* __VIR_LIBVIRT_DOMAIN_CHECKPOINT_H__ */ diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 3ef7c24528..3644056124 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -4,7 +4,7 @@ * Description: Provides APIs for the management of domains * Author: Daniel Veillard <veillard@redhat.com> * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -3159,6 +3159,7 @@ typedef enum { VIR_DOMAIN_JOB_OPERATION_SNAPSHOT = 6, VIR_DOMAIN_JOB_OPERATION_SNAPSHOT_REVERT = 7, VIR_DOMAIN_JOB_OPERATION_DUMP = 8, + VIR_DOMAIN_JOB_OPERATION_BACKUP = 9, # ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_JOB_OPERATION_LAST @@ -3174,6 +3175,14 @@ typedef enum { */ # define VIR_DOMAIN_JOB_OPERATION "operation" +/** + * VIR_DOMAIN_JOB_ID: + * + * virDomainGetJobStats field: the id of the job (so far, only for jobs + * started by virDomainBackupBegin()), as VIR_TYPED_PARAM_INT. + */ +# define VIR_DOMAIN_JOB_ID "id" + /** * VIR_DOMAIN_JOB_TIME_ELAPSED: * @@ -3988,7 +3997,8 @@ typedef void (*virConnectDomainEventMigrationIterationCallback)(virConnectPtr co * @nparams: size of the params array * @opaque: application specific data * - * This callback occurs when a job (such as migration) running on the domain + * This callback occurs when a job (such as migration or push-model + * virDomainBackupBegin()) running on the domain * is completed. The params array will contain statistics of the just completed * job as virDomainGetJobStats would return. The callback must not free @params * (the array will be freed once the callback finishes). 
diff --git a/src/driver-hypervisor.h b/src/driver-hypervisor.h index ee8a9a3e0e..2ae0b381a9 100644 --- a/src/driver-hypervisor.h +++ b/src/driver-hypervisor.h @@ -1369,6 +1369,17 @@ typedef int (*virDrvDomainCheckpointDelete)(virDomainCheckpointPtr checkpoint, unsigned int flags); +typedef int +(*virDrvDomainBackupBegin)(virDomainPtr domain, const char *diskXml, + const char *checkpointXml, unsigned int flags); + +typedef char * +(*virDrvDomainBackupGetXMLDesc)(virDomainPtr domain, int id, + unsigned int flags); + +typedef int +(*virDrvDomainBackupEnd)(virDomainPtr domain, int id, unsigned int flags); + typedef struct _virHypervisorDriver virHypervisorDriver; typedef virHypervisorDriver *virHypervisorDriverPtr; @@ -1630,6 +1641,9 @@ struct _virHypervisorDriver { virDrvDomainCheckpointIsCurrent domainCheckpointIsCurrent; virDrvDomainCheckpointHasMetadata domainCheckpointHasMetadata; virDrvDomainCheckpointDelete domainCheckpointDelete; + virDrvDomainBackupBegin domainBackupBegin; + virDrvDomainBackupGetXMLDesc domainBackupGetXMLDesc; + virDrvDomainBackupEnd domainBackupEnd; }; diff --git a/src/libvirt-domain-checkpoint.c b/src/libvirt-domain-checkpoint.c index 12511a13ee..0c9c803377 100644 --- a/src/libvirt-domain-checkpoint.c +++ b/src/libvirt-domain-checkpoint.c @@ -706,3 +706,203 @@ virDomainCheckpointFree(virDomainCheckpointPtr checkpoint) virObjectUnref(checkpoint); return 0; } + + +/** + * virDomainBackupBegin: + * @domain: a domain object + * @diskXml: description of storage to utilize and expose during + * the backup, or NULL + * @checkpointXml: description of a checkpoint to create, or NULL + * @flags: not used yet, pass 0 + * + * Start a point-in-time backup job for the specified disks of a + * running domain. + * + * A backup job is mutually exclusive with domain migration + * (particularly when the job sets up an NBD export, since it is not + * possible to tell any NBD clients about a server migrating between + * hosts). 
For now, backup jobs are also mutually exclusive with any + * other block job on the same device, although this restriction may + * be lifted in a future release. Progress of the backup job can be + * tracked via virDomainGetJobStats(). The job remains active until a + * subsequent call to virDomainBackupEnd(), even if it no longer has + * anything to copy. + * + * This API differs from virDomainBlockCopy() in that it can grab the + * state of more than one disk in parallel, and the state is captured + * as of the start of the job, rather than the end. + * + * There are two fundamental backup approaches. The first, called a + * push model, instructs the hypervisor to copy the state of the guest + * disk to the designated storage destination (which may be on the + * local file system or a network device); in this mode, the + * hypervisor writes the content of the guest disk to the destination, + * then emits VIR_DOMAIN_EVENT_ID_JOB_COMPLETED when the backup is + * either complete or failed (the backup image is invalid if the job + * is ended prior to the event being emitted). The second, called a + * pull model, instructs the hypervisor to expose the state of the + * guest disk over an NBD export; a third-party client can then + * connect to this export, and read whichever portions of the disk it + * desires. In this mode, there is no event; libvirt has to be + * informed when the third-party NBD client is done and the backup + * resources can be released. + * + * The @diskXml parameter is optional but usually provided, and + * contains details about the backup, including which backup mode to + * use, whether the backup is incremental from a previous checkpoint, + * which disks participate in the backup, the destination for a push + * model backup, and the temporary storage and NBD server details for + * a pull model backup. 
If omitted, the backup attempts to default to + * a push mode full backup of all disks, where libvirt generates a + * filename for each disk by appending a suffix of a timestamp in + * seconds since the Epoch. virDomainBackupGetXMLDesc() can be called + * to learn the actual values selected. For more information, see + * formatcheckpoint.html#BackupAttributes. + * + * The @checkpointXml parameter is optional; if non-NULL, then libvirt + * behaves as if virDomainCheckpointCreateXML() were called with + * @checkpointXml and no flags, atomically covering the same guest state + * that will be part of the backup. The creation of a new checkpoint + * allows for future incremental backups. + * + * Returns a non-negative job id on success, or negative on failure. + * This operation returns quickly, such that a user can choose to + * start a backup job between virDomainFSFreeze() and + * virDomainFSThaw() in order to create the backup while guest I/O is + * quiesced. + */ +int +virDomainBackupBegin(virDomainPtr domain, const char *diskXml, + const char *checkpointXml, unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "diskXml=%s, checkpointXml=%s, flags=0x%x", + NULLSTR(diskXml), NULLSTR(checkpointXml), flags); + + virResetLastError(); + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + virCheckReadOnlyGoto(conn->flags, error); + + if (conn->driver->domainBackupBegin) { + int ret; + ret = conn->driver->domainBackupBegin(domain, diskXml, checkpointXml, + flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} + + +/** + * virDomainBackupGetXMLDesc: + * @domain: a domain object + * @id: the id of an active backup job previously started with + * virDomainBackupBegin() + * @flags: bitwise-OR of subset of virDomainXMLFlags + * + * In some cases, a user can start a backup job without supplying all + * details, and rely on libvirt to fill in the rest (for example, + * 
selecting the port used for an NBD export). This API can then be + * used to learn what default values were chosen. + * + * No security-sensitive data will be included unless @flags contains + * VIR_DOMAIN_XML_SECURE; this flag is rejected on read-only + * connections. For this API, @flags should not contain either + * VIR_DOMAIN_XML_INACTIVE or VIR_DOMAIN_XML_UPDATE_CPU. + * + * Returns a NUL-terminated UTF-8 encoded XML instance, or NULL in + * case of error. The caller must free() the returned value. + */ +char * +virDomainBackupGetXMLDesc(virDomainPtr domain, int id, unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "id=%d, flags=0x%x", id, flags); + + virResetLastError(); + + virCheckDomainReturn(domain, NULL); + conn = domain->conn; + + if ((conn->flags & VIR_CONNECT_RO) && (flags & VIR_DOMAIN_XML_SECURE)) { + virReportError(VIR_ERR_OPERATION_DENIED, "%s", + _("virDomainBackupGetXMLDesc with secure flag")); + goto error; + } + virCheckNonNegativeArgGoto(id, error); + + if (conn->driver->domainBackupGetXMLDesc) { + char *ret; + ret = conn->driver->domainBackupGetXMLDesc(domain, id, flags); + if (!ret) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return NULL; +} + + +/** + * virDomainBackupEnd: + * @domain: a domain object + * @id: the id of an active backup job previously started with + * virDomainBackupBegin() + * @flags: bitwise-OR of supported virDomainBackupEndFlags + * + * Conclude a point-in-time backup job @id on the given domain. + * + * If the backup job uses the push model, but the event marking that + * all data has been copied has not yet been emitted, then the command + * fails unless @flags includes VIR_DOMAIN_BACKUP_END_ABORT. If the + * event has been issued, or if the backup uses the pull model, the + * flag has no effect. 
+ * + * Returns 1 if the backup job completed successfully (the backup + * destination file in a push model is consistent), 0 if the job was + * aborted successfully (only when VIR_DOMAIN_BACKUP_END_ABORT is + * passed; the destination file is unusable), and -1 on failure. + */ +int +virDomainBackupEnd(virDomainPtr domain, int id, unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "id=%d, flags=0x%x", id, flags); + + virResetLastError(); + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + virCheckReadOnlyGoto(conn->flags, error); + virCheckNonNegativeArgGoto(id, error); + + if (conn->driver->domainBackupEnd) { + int ret; + ret = conn->driver->domainBackupEnd(domain, id, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + error: + virDispatchError(conn); + return -1; +} diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 4a899f31c8..c05e96ba7f 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -1,7 +1,7 @@ /* * libvirt-domain.c: entry points for virDomainPtr APIs * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -10248,6 +10248,12 @@ virDomainBlockRebase(virDomainPtr dom, const char *disk, * over the destination format, the ability to copy to a destination that * is not a local file, and the possibility of additional tuning parameters. * + * The copy created by this API is not finalized until the job ends, + * and does not lend itself to incremental backups (beyond what + * VIR_DOMAIN_BLOCK_COPY_SHALLOW provides) nor to third-party control + * over the data being copied. For those features, use + * virDomainBackupBegin(). + * * Returns 0 if the operation has started, -1 on failure. 
*/ int diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index a3e12b9a12..212bfd691d 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -794,6 +794,9 @@ LIBVIRT_4.4.0 { LIBVIRT_4.5.0 { global: + virDomainBackupBegin; + virDomainBackupEnd; + virDomainBackupGetXMLDesc; virDomainCheckpointCreateXML; virDomainCheckpointCurrent; virDomainCheckpointDelete; diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 6aa79f11b9..ef99fdb3c2 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -5971,7 +5971,8 @@ VIR_ENUM_IMPL(virshDomainJobOperation, N_("Outgoing migration"), N_("Snapshot"), N_("Snapshot revert"), - N_("Dump")) + N_("Dump"), + N_("Backup")) static const char * virshDomainJobOperationToString(int op) -- 2.14.4

On Wed, Jun 13, 2018 at 11:42:27AM -0500, Eric Blake wrote:
Introduce a few more new public APIs related to incremental backups. This builds on the previous notion of a checkpoint (without an existing checkpoint, the new API is a full backup, differing only from virDomainBlockCopy() in the point of time chosen); and also allows creation of a new checkpoint at the same time as starting the backup (after all, an incremental backup is only useful if it covers the state since the previous backup). It also enhances event reporting for signaling when a push model backup completes (where the hypervisor creates the backup); note that the pull model does not have an event (starting the backup lets a third party access the data, and only the third party knows when it is finished).
First, thanks for the work! (And for doing the detailed write-ups, as is your wont.) A super minor note: I hope you'll also add the API names in the commit message itself (like you did in the past, for the older APIs); it will be handy when browsing `git log` later. So far I see the new APIs are: - virDomainBackupBegin() - virDomainBackupGetXMLDesc() - virDomainBackupEnd() So, OpenStack Nova currently still uses virDomainBlockRebase(); it hasn't even moved to the newer virDomainBlockCopy(). But as we know, currently both of them have the limitation of having to undefine and then re-define the guest XML. As you suggested elsewhere, probably I could explore (once they are 'frozen') moving to these proposed APIs, which will work without having to do the undefine + re-define dance.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain-checkpoint.h | 11 ++ include/libvirt/libvirt-domain.h | 14 +- src/driver-hypervisor.h | 14 ++ src/libvirt-domain-checkpoint.c | 200 ++++++++++++++++++++++++++++ src/libvirt-domain.c | 8 +- src/libvirt_public.syms | 3 + tools/virsh-domain.c | 3 +- 7 files changed, 249 insertions(+), 4 deletions(-)
diff --git a/include/libvirt/libvirt-domain-checkpoint.h b/include/libvirt/libvirt-domain-checkpoint.h index 4a7dc73089..c1d382fddc 100644 --- a/include/libvirt/libvirt-domain-checkpoint.h +++ b/include/libvirt/libvirt-domain-checkpoint.h @@ -144,4 +144,15 @@ int virDomainCheckpointDelete(virDomainCheckpointPtr checkpoint, int virDomainCheckpointRef(virDomainCheckpointPtr checkpoint); int virDomainCheckpointFree(virDomainCheckpointPtr checkpoint);
+/* Begin an incremental backup job, possibly creating a checkpoint. */ +int virDomainBackupBegin(virDomainPtr domain, const char *diskXml, + const char *checkpointXml, unsigned int flags); + +/* Learn about an ongoing backup job. */ +char *virDomainBackupGetXMLDesc(virDomainPtr domain, int id, + unsigned int flags); + +/* Complete an incremental backup job. */ +int virDomainBackupEnd(virDomainPtr domain, int id, unsigned int flags);
[...]

--
/kashyap
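For readers following the thread, the intended push-model call sequence for the three new entry points can be sketched as below. This is a compile-checkable sketch only: the stub types and functions are hypothetical stand-ins for libvirt's; real code would include <libvirt/libvirt.h>, operate on a live virDomainPtr, and wait for the job-completion event between begin and end.

```c
#include <string.h>

/* Hypothetical stand-ins mirroring the proposed signatures; they only
 * record the call order so the sequencing can be demonstrated. */
typedef struct { const char *name; } virDomainStub;

static char call_log[128];

static int stub_backup_begin(virDomainStub *dom, const char *disk_xml,
                             const char *checkpoint_xml, unsigned int flags)
{
    (void)dom; (void)disk_xml; (void)checkpoint_xml; (void)flags;
    strcat(call_log, "begin;");
    return 1;   /* job id, as virDomainBackupBegin() is proposed to return */
}

static int stub_backup_end(virDomainStub *dom, int id, unsigned int flags)
{
    (void)dom; (void)flags;
    if (id != 1)
        return -1;          /* ending requires the id begin handed back */
    strcat(call_log, "end;");
    return 0;
}

/* Push-model flow: begin the job (optionally creating a checkpoint via
 * checkpoint_xml), wait for completion, then end the job by id. */
int demo_backup_sequence(void)
{
    virDomainStub dom = { "demo" };
    int id;

    call_log[0] = '\0';
    id = stub_backup_begin(&dom, "<domainbackup/>", (const char *)0, 0);
    if (id < 0)
        return -1;
    /* ... real code waits here for the push-model completion event ... */
    return stub_backup_end(&dom, id, 0);
}

const char *demo_call_log(void) { return call_log; }
```

The pull model would differ only after begin: instead of waiting for an event, a third-party client scrapes the exported data and the caller ends the job when that client says it is done.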

Creating a checkpoint does not modify guest-visible state, but does modify host resources. Rather than reuse existing domain:write, domain:block_write, or domain:snapshot access controls, it seems better to introduce a new access control specific to tasks related to checkpoints and incremental backups of guest disk state. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/access/viraccessperm.c | 5 +++-- src/access/viraccessperm.h | 8 +++++++- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/src/access/viraccessperm.c b/src/access/viraccessperm.c index 0f58290173..cba3c556d8 100644 --- a/src/access/viraccessperm.c +++ b/src/access/viraccessperm.c @@ -1,7 +1,7 @@ /* * viraccessperm.c: access control permissions * - * Copyright (C) 2012-2014 Red Hat, Inc. + * Copyright (C) 2012-2018 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -38,7 +38,8 @@ VIR_ENUM_IMPL(virAccessPermDomain, "getattr", "read", "write", "read_secure", "start", "stop", "reset", "save", "delete", - "migrate", "snapshot", "suspend", "hibernate", "core_dump", "pm_control", + "migrate", "checkpoint", "snapshot", "suspend", "hibernate", + "core_dump", "pm_control", "init_control", "inject_nmi", "send_input", "send_signal", "fs_trim", "fs_freeze", "block_read", "block_write", "mem_read", diff --git a/src/access/viraccessperm.h b/src/access/viraccessperm.h index 1817da73bc..373c76859b 100644 --- a/src/access/viraccessperm.h +++ b/src/access/viraccessperm.h @@ -1,7 +1,7 @@ /* * viraccessperm.h: access control permissions * - * Copyright (C) 2012-2014 Red Hat, Inc. + * Copyright (C) 2012-2018 Red Hat, Inc. 
* * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -180,6 +180,12 @@ typedef enum { */ VIR_ACCESS_PERM_DOMAIN_MIGRATE, /* Host migration */ + /** + * @desc: Checkpoint domain + * @message: Checkpointing domain requires authorization + */ + VIR_ACCESS_PERM_DOMAIN_CHECKPOINT, /* Checkpoint disks */ + /** * @desc: Snapshot domain * @message: Snapshotting domain requires authorization -- 2.14.4
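The string-table edit in viraccessperm.c only works because VIR_ENUM_IMPL keeps the name list in lockstep with the enum values: inserting "checkpoint" between "migrate" and "snapshot" in the table requires inserting VIR_ACCESS_PERM_DOMAIN_CHECKPOINT at the same position in the enum, as this patch does. A minimal stand-alone illustration of that rule (names here are made up, not libvirt's):

```c
#include <string.h>

/* Miniature version of the enum/string-table pairing: the table index
 * must equal the enum value, so a new permission is inserted into both
 * lists at the same position. */
typedef enum {
    DEMO_PERM_MIGRATE,
    DEMO_PERM_CHECKPOINT,   /* new entry, inserted before SNAPSHOT */
    DEMO_PERM_SNAPSHOT,
    DEMO_PERM_LAST
} demoPerm;

static const char *demoPermNames[DEMO_PERM_LAST] = {
    "migrate",
    "checkpoint",           /* must sit at the same index as the enum value */
    "snapshot",
};

const char *demo_perm_to_string(demoPerm p)
{
    return (p < DEMO_PERM_LAST) ? demoPermNames[p] : (const char *)0;
}

/* returns 1 if every enum value maps to its expected name */
int demo_perm_table_in_sync(void)
{
    return strcmp(demo_perm_to_string(DEMO_PERM_MIGRATE), "migrate") == 0 &&
           strcmp(demo_perm_to_string(DEMO_PERM_CHECKPOINT), "checkpoint") == 0 &&
           strcmp(demo_perm_to_string(DEMO_PERM_SNAPSHOT), "snapshot") == 0;
}
```

Appending the new name at the end of the table instead of inserting it in place would silently shift every later permission's name, which is why the diff touches the middle of the list.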

The remote code generator had to be taught about the new virDomainCheckpointPtr type, at which point the remote driver code for backups can be generated. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/remote/remote_daemon_dispatch.c | 15 +++ src/remote/remote_driver.c | 31 ++++- src/remote/remote_protocol.x | 237 +++++++++++++++++++++++++++++++++++- src/remote_protocol-structs | 129 ++++++++++++++++++++ src/rpc/gendispatch.pl | 32 ++--- 5 files changed, 426 insertions(+), 18 deletions(-) diff --git a/src/remote/remote_daemon_dispatch.c b/src/remote/remote_daemon_dispatch.c index f1a5ba2590..b8247c34d9 100644 --- a/src/remote/remote_daemon_dispatch.c +++ b/src/remote/remote_daemon_dispatch.c @@ -90,6 +90,7 @@ static virStoragePoolPtr get_nonnull_storage_pool(virConnectPtr conn, remote_non static virStorageVolPtr get_nonnull_storage_vol(virConnectPtr conn, remote_nonnull_storage_vol vol); static virSecretPtr get_nonnull_secret(virConnectPtr conn, remote_nonnull_secret secret); static virNWFilterPtr get_nonnull_nwfilter(virConnectPtr conn, remote_nonnull_nwfilter nwfilter); +static virDomainCheckpointPtr get_nonnull_domain_checkpoint(virDomainPtr dom, remote_nonnull_domain_checkpoint checkpoint); static virDomainSnapshotPtr get_nonnull_domain_snapshot(virDomainPtr dom, remote_nonnull_domain_snapshot snapshot); static virNodeDevicePtr get_nonnull_node_device(virConnectPtr conn, remote_nonnull_node_device dev); static void make_nonnull_domain(remote_nonnull_domain *dom_dst, virDomainPtr dom_src); @@ -100,6 +101,7 @@ static void make_nonnull_storage_vol(remote_nonnull_storage_vol *vol_dst, virSto static void make_nonnull_node_device(remote_nonnull_node_device *dev_dst, virNodeDevicePtr dev_src); static void make_nonnull_secret(remote_nonnull_secret *secret_dst, virSecretPtr secret_src); static void make_nonnull_nwfilter(remote_nonnull_nwfilter *net_dst, virNWFilterPtr nwfilter_src); +static void make_nonnull_domain_checkpoint(remote_nonnull_domain_checkpoint 
*checkpoint_dst, virDomainCheckpointPtr checkpoint_src); static void make_nonnull_domain_snapshot(remote_nonnull_domain_snapshot *snapshot_dst, virDomainSnapshotPtr snapshot_src); static int @@ -7087,6 +7089,12 @@ get_nonnull_nwfilter(virConnectPtr conn, remote_nonnull_nwfilter nwfilter) return virGetNWFilter(conn, nwfilter.name, BAD_CAST nwfilter.uuid); } +static virDomainCheckpointPtr +get_nonnull_domain_checkpoint(virDomainPtr dom, remote_nonnull_domain_checkpoint checkpoint) +{ + return virGetDomainCheckpoint(dom, checkpoint.name); +} + static virDomainSnapshotPtr get_nonnull_domain_snapshot(virDomainPtr dom, remote_nonnull_domain_snapshot snapshot) { @@ -7159,6 +7167,13 @@ make_nonnull_nwfilter(remote_nonnull_nwfilter *nwfilter_dst, virNWFilterPtr nwfi memcpy(nwfilter_dst->uuid, nwfilter_src->uuid, VIR_UUID_BUFLEN); } +static void +make_nonnull_domain_checkpoint(remote_nonnull_domain_checkpoint *checkpoint_dst, virDomainCheckpointPtr checkpoint_src) +{ + ignore_value(VIR_STRDUP_QUIET(checkpoint_dst->name, checkpoint_src->name)); + make_nonnull_domain(&checkpoint_dst->dom, checkpoint_src->domain); +} + static void make_nonnull_domain_snapshot(remote_nonnull_domain_snapshot *snapshot_dst, virDomainSnapshotPtr snapshot_src) { diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index 1328f910b0..9a4ea68410 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -146,6 +146,7 @@ static virStoragePoolPtr get_nonnull_storage_pool(virConnectPtr conn, remote_non static virStorageVolPtr get_nonnull_storage_vol(virConnectPtr conn, remote_nonnull_storage_vol vol); static virNodeDevicePtr get_nonnull_node_device(virConnectPtr conn, remote_nonnull_node_device dev); static virSecretPtr get_nonnull_secret(virConnectPtr conn, remote_nonnull_secret secret); +static virDomainCheckpointPtr get_nonnull_domain_checkpoint(virDomainPtr domain, remote_nonnull_domain_checkpoint checkpoint); static virDomainSnapshotPtr 
get_nonnull_domain_snapshot(virDomainPtr domain, remote_nonnull_domain_snapshot snapshot); static void make_nonnull_domain(remote_nonnull_domain *dom_dst, virDomainPtr dom_src); static void make_nonnull_network(remote_nonnull_network *net_dst, virNetworkPtr net_src); @@ -156,6 +157,7 @@ static void make_nonnull_node_device(remote_nonnull_node_device *dev_dst, virNodeDevicePtr dev_src); static void make_nonnull_secret(remote_nonnull_secret *secret_dst, virSecretPtr secret_src); static void make_nonnull_nwfilter(remote_nonnull_nwfilter *nwfilter_dst, virNWFilterPtr nwfilter_src); +static void make_nonnull_domain_checkpoint(remote_nonnull_domain_checkpoint *checkpoint_dst, virDomainCheckpointPtr checkpoint_src); static void make_nonnull_domain_snapshot(remote_nonnull_domain_snapshot *snapshot_dst, virDomainSnapshotPtr snapshot_src); /*----------------------------------------------------------------------*/ @@ -8206,6 +8208,12 @@ get_nonnull_nwfilter(virConnectPtr conn, remote_nonnull_nwfilter nwfilter) return virGetNWFilter(conn, nwfilter.name, BAD_CAST nwfilter.uuid); } +static virDomainCheckpointPtr +get_nonnull_domain_checkpoint(virDomainPtr domain, remote_nonnull_domain_checkpoint checkpoint) +{ + return virGetDomainCheckpoint(domain, checkpoint.name); +} + static virDomainSnapshotPtr get_nonnull_domain_snapshot(virDomainPtr domain, remote_nonnull_domain_snapshot snapshot) { @@ -8273,6 +8281,13 @@ make_nonnull_nwfilter(remote_nonnull_nwfilter *nwfilter_dst, virNWFilterPtr nwfi memcpy(nwfilter_dst->uuid, nwfilter_src->uuid, VIR_UUID_BUFLEN); } +static void +make_nonnull_domain_checkpoint(remote_nonnull_domain_checkpoint *checkpoint_dst, virDomainCheckpointPtr checkpoint_src) +{ + checkpoint_dst->name = checkpoint_src->name; + make_nonnull_domain(&checkpoint_dst->dom, checkpoint_src->domain); +} + static void make_nonnull_domain_snapshot(remote_nonnull_domain_snapshot *snapshot_dst, virDomainSnapshotPtr snapshot_src) { @@ -8521,7 +8536,21 @@ static 
virHypervisorDriver hypervisor_driver = { .connectCompareHypervisorCPU = remoteConnectCompareHypervisorCPU, /* 4.4.0 */ .connectBaselineHypervisorCPU = remoteConnectBaselineHypervisorCPU, /* 4.4.0 */ .nodeGetSEVInfo = remoteNodeGetSEVInfo, /* 4.5.0 */ - .domainGetLaunchSecurityInfo = remoteDomainGetLaunchSecurityInfo /* 4.5.0 */ + .domainGetLaunchSecurityInfo = remoteDomainGetLaunchSecurityInfo, /* 4.5.0 */ + .domainCheckpointCreateXML = remoteDomainCheckpointCreateXML, /* 4.5.0 */ + .domainCheckpointGetXMLDesc = remoteDomainCheckpointGetXMLDesc, /* 4.5.0 */ + .domainListCheckpoints = remoteDomainListCheckpoints, /* 4.5.0 */ + .domainCheckpointListChildren = remoteDomainCheckpointListChildren, /* 4.5.0 */ + .domainCheckpointLookupByName = remoteDomainCheckpointLookupByName, /* 4.5.0 */ + .domainHasCurrentCheckpoint = remoteDomainHasCurrentCheckpoint, /* 4.5.0 */ + .domainCheckpointGetParent = remoteDomainCheckpointGetParent, /* 4.5.0 */ + .domainCheckpointCurrent = remoteDomainCheckpointCurrent, /* 4.5.0 */ + .domainCheckpointIsCurrent = remoteDomainCheckpointIsCurrent, /* 4.5.0 */ + .domainCheckpointHasMetadata = remoteDomainCheckpointHasMetadata, /* 4.5.0 */ + .domainCheckpointDelete = remoteDomainCheckpointDelete, /* 4.5.0 */ + .domainBackupBegin = remoteDomainBackupBegin, /* 4.5.0 */ + .domainBackupGetXMLDesc = remoteDomainBackupGetXMLDesc, /* 4.5.0 */ + .domainBackupEnd = remoteDomainBackupEnd, /* 4.5.0 */ }; static virNetworkDriver network_driver = { diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 162cf5e61b..a306a46435 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -3,7 +3,7 @@ * remote_internal driver and libvirtd. This protocol is * internal and may change at any time. * - * Copyright (C) 2006-2015 Red Hat, Inc. + * Copyright (C) 2006-2018 Red Hat, Inc. 
* * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -136,6 +136,9 @@ const REMOTE_AUTH_TYPE_LIST_MAX = 20; /* Upper limit on list of memory stats */ const REMOTE_DOMAIN_MEMORY_STATS_MAX = 1024; +/* Upper limit on lists of domain checkpoints. */ +const REMOTE_DOMAIN_CHECKPOINT_LIST_MAX = 16384; + /* Upper limit on lists of domain snapshots. */ const REMOTE_DOMAIN_SNAPSHOT_LIST_MAX = 16384; @@ -312,6 +315,12 @@ struct remote_nonnull_secret { remote_nonnull_string usageID; }; +/* A checkpoint which may not be NULL. */ +struct remote_nonnull_domain_checkpoint { + remote_nonnull_string name; + remote_nonnull_domain dom; +}; + /* A snapshot which may not be NULL. */ struct remote_nonnull_domain_snapshot { remote_nonnull_string name; @@ -3505,6 +3514,138 @@ struct remote_domain_get_launch_security_info_ret { remote_typed_param params<REMOTE_DOMAIN_LAUNCH_SECURITY_INFO_PARAMS_MAX>; }; +struct remote_domain_checkpoint_create_xml_args { + remote_nonnull_domain dom; + remote_nonnull_string xml_desc; + unsigned int flags; +}; + +struct remote_domain_checkpoint_create_xml_ret { + remote_nonnull_domain_checkpoint checkpoint; +}; + +struct remote_domain_checkpoint_get_xml_desc_args { + remote_nonnull_domain_checkpoint checkpoint; + unsigned int flags; +}; + +struct remote_domain_checkpoint_get_xml_desc_ret { + remote_nonnull_string xml; +}; + +struct remote_domain_list_checkpoints_args { + remote_nonnull_domain dom; + int need_results; + unsigned int flags; +}; + +struct remote_domain_list_checkpoints_ret { /* insert@1 */ + remote_nonnull_domain_checkpoint checkpoints<REMOTE_DOMAIN_CHECKPOINT_LIST_MAX>; + int ret; +}; + +struct remote_domain_checkpoint_list_children_args { + remote_nonnull_domain_checkpoint checkpoint; + int need_results; + unsigned int flags; +}; + +struct remote_domain_checkpoint_list_children_ret { /* insert@1 */ + remote_nonnull_domain_checkpoint 
checkpoints<REMOTE_DOMAIN_CHECKPOINT_LIST_MAX>; + int ret; +}; + +struct remote_domain_checkpoint_lookup_by_name_args { + remote_nonnull_domain dom; + remote_nonnull_string name; + unsigned int flags; +}; + +struct remote_domain_checkpoint_lookup_by_name_ret { + remote_nonnull_domain_checkpoint checkpoint; +}; + +struct remote_domain_has_current_checkpoint_args { + remote_nonnull_domain dom; + unsigned int flags; +}; + +struct remote_domain_has_current_checkpoint_ret { + int result; +}; + +struct remote_domain_checkpoint_get_parent_args { + remote_nonnull_domain_checkpoint checkpoint; + unsigned int flags; +}; + +struct remote_domain_checkpoint_get_parent_ret { + remote_nonnull_domain_checkpoint parent; +}; + +struct remote_domain_checkpoint_current_args { + remote_nonnull_domain dom; + unsigned int flags; +}; + +struct remote_domain_checkpoint_current_ret { + remote_nonnull_domain_checkpoint checkpoint; +}; + +struct remote_domain_checkpoint_is_current_args { + remote_nonnull_domain_checkpoint checkpoint; + unsigned int flags; +}; + +struct remote_domain_checkpoint_is_current_ret { + int current; +}; + +struct remote_domain_checkpoint_has_metadata_args { + remote_nonnull_domain_checkpoint checkpoint; + unsigned int flags; +}; + +struct remote_domain_checkpoint_has_metadata_ret { + int metadata; +}; + +struct remote_domain_checkpoint_delete_args { + remote_nonnull_domain_checkpoint checkpoint; + unsigned int flags; +}; + +struct remote_domain_backup_begin_args { + remote_nonnull_domain dom; + remote_string disk_xml; + remote_string checkpoint_xml; + unsigned int flags; +}; + +struct remote_domain_backup_begin_ret { + int result; +}; + +struct remote_domain_backup_get_xml_desc_args { + remote_nonnull_domain dom; + int id; + unsigned int flags; +}; + +struct remote_domain_backup_get_xml_desc_ret { + remote_nonnull_string xml; +}; + +struct remote_domain_backup_end_args { + remote_nonnull_domain dom; + int id; + unsigned int flags; +}; + +struct 
remote_domain_backup_end_ret { + int retcode; +}; + /*----- Protocol. -----*/ /* Define the program number, protocol version and procedure numbers here. */ @@ -6224,5 +6365,97 @@ enum remote_procedure { * @generate: none * @acl: domain:read */ - REMOTE_PROC_DOMAIN_GET_LAUNCH_SECURITY_INFO = 396 + REMOTE_PROC_DOMAIN_GET_LAUNCH_SECURITY_INFO = 396, + + /** + * @generate: both + * @acl: domain:checkpoint + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_CREATE_XML = 397, + + /** + * @generate: both + * @priority: high + * @acl: domain:read + * @acl: domain:read_secure:VIR_DOMAIN_XML_SECURE + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_GET_XML_DESC = 398, + + /** + * @generate: both + * @priority: high + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_LIST_CHECKPOINTS = 399, + + /** + * @generate: both + * @priority: high + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_LIST_CHILDREN = 400, + + /** + * @generate: both + * @priority: high + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_LOOKUP_BY_NAME = 401, + + /** + * @generate: both + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_HAS_CURRENT_CHECKPOINT = 402, + + /** + * @generate: both + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_CURRENT = 403, + + /** + * @generate: both + * @priority: high + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_GET_PARENT = 404, + + /** + * @generate: both + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_IS_CURRENT = 405, + + /** + * @generate: both + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_HAS_METADATA = 406, + + /** + * @generate: both + * @acl: domain:checkpoint + */ + REMOTE_PROC_DOMAIN_CHECKPOINT_DELETE = 407, + + /** + * @generate: both + * @acl: domain:checkpoint + * @acl: domain:block_read + */ + REMOTE_PROC_DOMAIN_BACKUP_BEGIN = 408, + + /** + * @generate: both + * @acl: domain:read + * @acl: domain:read_secure:VIR_DOMAIN_XML_SECURE + */ + REMOTE_PROC_DOMAIN_BACKUP_GET_XML_DESC = 409, + + /** + * @generate: both + * @acl: 
domain:checkpoint + */ + REMOTE_PROC_DOMAIN_BACKUP_END = 410 }; diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 0c75ad2305..9c7b116151 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -42,6 +42,10 @@ struct remote_nonnull_secret { int usageType; remote_nonnull_string usageID; }; +struct remote_nonnull_domain_checkpoint { + remote_nonnull_string name; + remote_nonnull_domain dom; +}; struct remote_nonnull_domain_snapshot { remote_nonnull_string name; remote_nonnull_domain dom; @@ -2928,6 +2932,117 @@ struct remote_domain_get_launch_security_info_ret { remote_typed_param * params_val; } params; }; +struct remote_domain_checkpoint_create_xml_args { + remote_nonnull_domain dom; + remote_nonnull_string xml_desc; + u_int flags; +}; +struct remote_domain_checkpoint_create_xml_ret { + remote_nonnull_domain_checkpoint checkpoint; +}; +struct remote_domain_checkpoint_get_xml_desc_args { + remote_nonnull_domain_checkpoint checkpoint; + u_int flags; +}; +struct remote_domain_checkpoint_get_xml_desc_ret { + remote_nonnull_string xml; +}; +struct remote_domain_list_checkpoints_args { + remote_nonnull_domain dom; + int need_results; + u_int flags; +}; +struct remote_domain_list_checkpoints_ret { + struct { + u_int checkpoints_len; + remote_nonnull_domain_checkpoint * checkpoints_val; + } checkpoints; + int ret; +}; +struct remote_domain_checkpoint_list_children_args { + remote_nonnull_domain_checkpoint checkpoint; + int need_results; + u_int flags; +}; +struct remote_domain_checkpoint_list_children_ret { + struct { + u_int checkpoints_len; + remote_nonnull_domain_checkpoint * checkpoints_val; + } checkpoints; + int ret; +}; +struct remote_domain_checkpoint_lookup_by_name_args { + remote_nonnull_domain dom; + remote_nonnull_string name; + u_int flags; +}; +struct remote_domain_checkpoint_lookup_by_name_ret { + remote_nonnull_domain_checkpoint checkpoint; +}; +struct remote_domain_has_current_checkpoint_args { + 
remote_nonnull_domain dom; + u_int flags; +}; +struct remote_domain_has_current_checkpoint_ret { + int result; +}; +struct remote_domain_checkpoint_get_parent_args { + remote_nonnull_domain_checkpoint checkpoint; + u_int flags; +}; +struct remote_domain_checkpoint_get_parent_ret { + remote_nonnull_domain_checkpoint parent; +}; +struct remote_domain_checkpoint_current_args { + remote_nonnull_domain dom; + u_int flags; +}; +struct remote_domain_checkpoint_current_ret { + remote_nonnull_domain_checkpoint checkpoint; +}; +struct remote_domain_checkpoint_is_current_args { + remote_nonnull_domain_checkpoint checkpoint; + u_int flags; +}; +struct remote_domain_checkpoint_is_current_ret { + int current; +}; +struct remote_domain_checkpoint_has_metadata_args { + remote_nonnull_domain_checkpoint checkpoint; + u_int flags; +}; +struct remote_domain_checkpoint_has_metadata_ret { + int metadata; +}; +struct remote_domain_checkpoint_delete_args { + remote_nonnull_domain_checkpoint checkpoint; + u_int flags; +}; +struct remote_domain_backup_begin_args { + remote_nonnull_domain dom; + remote_string disk_xml; + remote_string checkpoint_xml; + u_int flags; +}; +struct remote_domain_backup_begin_ret { + int result; +}; +struct remote_domain_backup_get_xml_desc_args { + remote_nonnull_domain dom; + int id; + u_int flags; +}; +struct remote_domain_backup_get_xml_desc_ret { + remote_nonnull_string xml; +}; +struct remote_domain_backup_end_args { + remote_nonnull_domain dom; + int id; + u_int flags; +}; +struct remote_domain_backup_end_ret { + int retcode; +}; enum remote_procedure { REMOTE_PROC_CONNECT_OPEN = 1, REMOTE_PROC_CONNECT_CLOSE = 2, @@ -3325,4 +3440,18 @@ enum remote_procedure { REMOTE_PROC_CONNECT_BASELINE_HYPERVISOR_CPU = 394, REMOTE_PROC_NODE_GET_SEV_INFO = 395, REMOTE_PROC_DOMAIN_GET_LAUNCH_SECURITY_INFO = 396, + REMOTE_PROC_DOMAIN_CHECKPOINT_CREATE_XML = 397, + REMOTE_PROC_DOMAIN_CHECKPOINT_GET_XML_DESC = 398, + REMOTE_PROC_DOMAIN_LIST_CHECKPOINTS = 399, + 
REMOTE_PROC_DOMAIN_CHECKPOINT_LIST_CHILDREN = 400, + REMOTE_PROC_DOMAIN_CHECKPOINT_LOOKUP_BY_NAME = 401, + REMOTE_PROC_DOMAIN_HAS_CURRENT_CHECKPOINT = 402, + REMOTE_PROC_DOMAIN_CHECKPOINT_CURRENT = 403, + REMOTE_PROC_DOMAIN_CHECKPOINT_GET_PARENT = 404, + REMOTE_PROC_DOMAIN_CHECKPOINT_IS_CURRENT = 405, + REMOTE_PROC_DOMAIN_CHECKPOINT_HAS_METADATA = 406, + REMOTE_PROC_DOMAIN_CHECKPOINT_DELETE = 407, + REMOTE_PROC_DOMAIN_BACKUP_BEGIN = 408, + REMOTE_PROC_DOMAIN_BACKUP_GET_XML_DESC = 409, + REMOTE_PROC_DOMAIN_BACKUP_END = 410, }; diff --git a/src/rpc/gendispatch.pl b/src/rpc/gendispatch.pl index b8b83b6b40..b40efd91d0 100755 --- a/src/rpc/gendispatch.pl +++ b/src/rpc/gendispatch.pl @@ -1,6 +1,6 @@ #!/usr/bin/env perl # -# Copyright (C) 2010-2015 Red Hat, Inc. +# Copyright (C) 2010-2018 Red Hat, Inc. # # This library is free software; you can redistribute it and/or # modify it under the terms of the GNU Lesser General Public @@ -567,18 +567,20 @@ elsif ($mode eq "server") { push(@args_list, "$2"); push(@free_list, " virObjectUnref($2);"); - } elsif ($args_member =~ m/^remote_nonnull_domain_snapshot (\S+);$/) { + } elsif ($args_member =~ m/^remote_nonnull_domain_(checkpoint|snapshot) (\S+);$/) { + my $type_name = name_to_TypeName($1); + push(@vars_list, "virDomainPtr dom = NULL"); - push(@vars_list, "virDomainSnapshotPtr snapshot = NULL"); + push(@vars_list, "virDomain${type_name}Ptr ${1} = NULL"); push(@getters_list, - " if (!(dom = get_nonnull_domain($conn, args->${1}.dom)))\n" . + " if (!(dom = get_nonnull_domain($conn, args->${2}.dom)))\n" . " goto cleanup;\n" . "\n" . - " if (!(snapshot = get_nonnull_domain_snapshot(dom, args->${1})))\n" . + " if (!($1 = get_nonnull_domain_${1}(dom, args->$2)))\n" . " goto cleanup;\n"); - push(@args_list, "snapshot"); + push(@args_list, "$1"); push(@free_list, - " virObjectUnref(snapshot);\n" . + " virObjectUnref($1);\n" . 
" virObjectUnref(dom);"); } elsif ($args_member =~ m/^(?:(?:admin|remote)_string|remote_uuid) (\S+)<\S+>;/) { push(@args_list, $conn) if !@args_list; @@ -722,7 +724,7 @@ elsif ($mode eq "server") { if (!$modern_ret_as_list) { push(@ret_list, "ret->$3 = tmp.$3;"); } - } elsif ($ret_member =~ m/(?:admin|remote)_nonnull_(secret|nwfilter|node_device|interface|network|storage_vol|storage_pool|domain_snapshot|domain|server|client) (\S+)<(\S+)>;/) { + } elsif ($ret_member =~ m/(?:admin|remote)_nonnull_(secret|nwfilter|node_device|interface|network|storage_vol|storage_pool|domain_checkpoint|domain_snapshot|domain|server|client) (\S+)<(\S+)>;/) { $modern_ret_struct_name = $1; $single_ret_list_error_msg_type = $1; $single_ret_list_name = $2; @@ -780,7 +782,7 @@ elsif ($mode eq "server") { $single_ret_var = $1; $single_ret_by_ref = 0; $single_ret_check = " == NULL"; - } elsif ($ret_member =~ m/^remote_nonnull_(domain|network|storage_pool|storage_vol|interface|node_device|secret|nwfilter|domain_snapshot) (\S+);/) { + } elsif ($ret_member =~ m/^remote_nonnull_(domain|network|storage_pool|storage_vol|interface|node_device|secret|nwfilter|domain_checkpoint|domain_snapshot) (\S+);/) { my $type_name = name_to_TypeName($1); if ($call->{ProcName} eq "DomainCreateWithFlags") { @@ -1325,13 +1327,13 @@ elsif ($mode eq "client") { $priv_src = "dev->conn"; push(@args_list, "virNodeDevicePtr dev"); push(@setters_list, "args.name = dev->name;"); - } elsif ($args_member =~ m/^remote_nonnull_(domain|network|storage_pool|storage_vol|interface|secret|nwfilter|domain_snapshot) (\S+);/) { + } elsif ($args_member =~ m/^remote_nonnull_(domain|network|storage_pool|storage_vol|interface|secret|nwfilter|domain_checkpoint|domain_snapshot) (\S+);/) { my $name = $1; my $arg_name = $2; my $type_name = name_to_TypeName($name); if ($is_first_arg) { - if ($name eq "domain_snapshot") { + if ($name =~ m/^domain_.*/) { $priv_src = "$arg_name->domain->conn"; } else { $priv_src = "$arg_name->conn"; @@ -1518,7 
+1520,7 @@ elsif ($mode eq "client") { } push(@ret_list, "memcpy(result->$3, ret.$3, sizeof(result->$3));"); - } elsif ($ret_member =~ m/(?:admin|remote)_nonnull_(secret|nwfilter|node_device|interface|network|storage_vol|storage_pool|domain_snapshot|domain|server|client) (\S+)<(\S+)>;/) { + } elsif ($ret_member =~ m/(?:admin|remote)_nonnull_(secret|nwfilter|node_device|interface|network|storage_vol|storage_pool|domain_checkpoint|domain_snapshot|domain|server|client) (\S+)<(\S+)>;/) { my $proc_name = name_to_TypeName($1); if ($structprefix eq "admin") { @@ -1571,7 +1573,7 @@ elsif ($mode eq "client") { push(@ret_list, "VIR_FREE(ret.$1);"); $single_ret_var = "char *rv = NULL"; $single_ret_type = "char *"; - } elsif ($ret_member =~ m/^remote_nonnull_(domain|network|storage_pool|storage_vol|node_device|interface|secret|nwfilter|domain_snapshot) (\S+);/) { + } elsif ($ret_member =~ m/^remote_nonnull_(domain|network|storage_pool|storage_vol|node_device|interface|secret|nwfilter|domain_checkpoint|domain_snapshot) (\S+);/) { my $name = $1; my $arg_name = $2; my $type_name = name_to_TypeName($name); @@ -1585,7 +1587,7 @@ elsif ($mode eq "client") { $single_ret_var = "int rv = -1"; $single_ret_type = "int"; } else { - if ($name eq "domain_snapshot") { + if ($name =~ m/^domain_.*/) { my $dom = "$priv_src"; $dom =~ s/->conn//; push(@ret_list, "rv = get_nonnull_$name($dom, ret.$arg_name);"); @@ -1928,7 +1930,7 @@ elsif ($mode eq "client") { print " }\n"; print "\n"; } elsif ($modern_ret_as_list) { - if ($modern_ret_struct_name =~ m/domain_snapshot|client/) { + if ($modern_ret_struct_name =~ m/domain_checkpoint|domain_snapshot|client/) { $priv_src =~ s/->conn//; } print " if (result) {\n"; -- 2.14.4
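The generated make_nonnull/get_nonnull pair effectively round-trips a checkpoint over the wire as just a (name, domain) tuple: the sending side flattens the object, the receiving side rebuilds one from those two fields. A simplified stand-alone sketch of that symmetry (the types and helper names are illustrative, not the real remote_protocol ones):

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the wire struct and the in-memory object. */
typedef struct { char *name; char *dom_name; } wire_checkpoint;
typedef struct { char *name; char *domain; } checkpoint_obj;

static char *dup_str(const char *s)
{
    char *d = malloc(strlen(s) + 1);
    strcpy(d, s);
    return d;
}

/* sender: flatten an object into the wire struct
 * (cf. make_nonnull_domain_checkpoint) */
static void make_nonnull(wire_checkpoint *dst, const checkpoint_obj *src)
{
    dst->name = dup_str(src->name);
    dst->dom_name = dup_str(src->domain);
}

/* receiver: rebuild an object from the wire struct
 * (cf. get_nonnull_domain_checkpoint) */
static checkpoint_obj *get_nonnull(const wire_checkpoint *src)
{
    checkpoint_obj *obj = malloc(sizeof(*obj));
    obj->name = dup_str(src->name);
    obj->domain = dup_str(src->dom_name);
    return obj;
}

/* returns 1 if a checkpoint survives the flatten/rebuild round trip */
int demo_checkpoint_roundtrip(void)
{
    checkpoint_obj in = { "ckpt1", "guest1" };
    wire_checkpoint wire;
    checkpoint_obj *out;
    int ok;

    make_nonnull(&wire, &in);
    out = get_nonnull(&wire);
    ok = strcmp(out->name, in.name) == 0 &&
         strcmp(out->domain, in.domain) == 0;
    free(wire.name); free(wire.dom_name);
    free(out->name); free(out->domain); free(out);
    return ok;
}
```

This is also why gendispatch.pl only needed the regex generalized from `domain_snapshot` to `domain_(checkpoint|snapshot)`: both object kinds travel as the same name-plus-domain shape.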

CC: Daniel Erez <derez@redhat.com>
CC: Yaniv Dary <ydary@redhat.com>
CC: Allon Mureinik <amureini@redhat.com>

full thread: https://www.redhat.com/archives/libvir-list/2018-June/msg01066.html

On 06/13/2018 12:42 PM, Eric Blake wrote:
I'm offline the rest of this week, but wanted to post the progress I've made on patches towards the Incremental Backup RFC: https://www.redhat.com/archives/libvir-list/2018-May/msg01403.html
Comments welcome, including any naming suggestions
Still to go:
- Add .rng file for validating the XML format used in virDomainBackupBegin()
- Add flags for validating XML
- Add src/conf/checkpoint_conf.c mirroring src/conf/snapshot_conf.c for tracking tree of checkpoints
- Add virsh wrappers for calling everything
- Add qemu implementation - my first addition will probably just be for push model full backups, then additional patches to expand into pull model (on the qemu list, I still need to review and incorporate Vladimir's patches for exporting a bitmap over NBD)
- Bug fixes (but why would there be any bugs in the first place? :)
I've got portions of the qemu code working locally, but not polished enough to post as a patch yet; my end goal is to have a working demo against current qemu.git showing the use of virDomainBackupBegin() for incremental backups with the push model prior to the code freeze for 4.5.0 this month, even if that code doesn't get checked into libvirt until later when the qemu code is changed to drop x- prefixes. (That is, I'm hoping to demo that my API is sound, and thus we can include the entrypoints in the libvirt.so for this release, even if the libvirt code for driving pull mode over qemu waits until after a qemu release where the pieces are promoted to a stable form.)
Eric Blake (8):
  snapshots: Avoid term 'checkpoint' for full system snapshot
  backup: Document nuances between different state capture APIs
  backup: Introduce virDomainCheckpointPtr
  backup: Document new XML for backups
  backup: Introduce virDomainCheckpoint APIs
  backup: Introduce virDomainBackup APIs
  backup: Add new domain:checkpoint access control
  backup: Implement backup APIs for remote driver
 docs/Makefile.am                            |   3 +
 docs/apibuild.py                            |   2 +
 docs/docs.html.in                           |   9 +-
 docs/domainstatecapture.html.in             | 190 ++++++
 docs/formatcheckpoint.html.in               | 273 +++++++++
 docs/formatsnapshot.html.in                 |  16 +-
 docs/schemas/domaincheckpoint.rng           |  89 +++
 include/libvirt/libvirt-domain-checkpoint.h | 158 +++++
 include/libvirt/libvirt-domain-snapshot.h   |  10 +-
 include/libvirt/libvirt-domain.h            |  14 +-
 include/libvirt/libvirt.h                   |   3 +-
 include/libvirt/virterror.h                 |   5 +-
 libvirt.spec.in                             |   2 +
 mingw-libvirt.spec.in                       |   4 +
 po/POTFILES                                 |   1 +
 src/Makefile.am                             |   2 +
 src/access/viraccessperm.c                  |   5 +-
 src/access/viraccessperm.h                  |   8 +-
 src/conf/snapshot_conf.c                    |   2 +-
 src/datatypes.c                             |  62 +-
 src/datatypes.h                             |  31 +-
 src/driver-hypervisor.h                     |  74 ++-
 src/libvirt-domain-checkpoint.c             | 908 ++++++++++++++++++++++++++++
 src/libvirt-domain-snapshot.c               |   4 +-
 src/libvirt-domain.c                        |   8 +-
 src/libvirt_private.syms                    |   2 +
 src/libvirt_public.syms                     |  19 +
 src/qemu/qemu_driver.c                      |  12 +-
 src/remote/remote_daemon_dispatch.c         |  15 +
 src/remote/remote_driver.c                  |  31 +-
 src/remote/remote_protocol.x                | 237 +++++++-
 src/remote_protocol-structs                 | 129 ++++
 src/rpc/gendispatch.pl                      |  32 +-
 src/util/virerror.c                         |  15 +-
 tests/domaincheckpointxml2xmlin/empty.xml   |   1 +
 tests/domaincheckpointxml2xmlout/empty.xml  |  10 +
 tests/virschematest.c                       |   2 +
 tools/virsh-domain.c                        |   3 +-
 tools/virsh-snapshot.c                      |   2 +-
 tools/virsh.pod                             |  14 +-
 40 files changed, 2347 insertions(+), 60 deletions(-)
 create mode 100644 docs/domainstatecapture.html.in
 create mode 100644 docs/formatcheckpoint.html.in
 create mode 100644 docs/schemas/domaincheckpoint.rng
 create mode 100644 include/libvirt/libvirt-domain-checkpoint.h
 create mode 100644 src/libvirt-domain-checkpoint.c
 create mode 100644 tests/domaincheckpointxml2xmlin/empty.xml
 create mode 100644 tests/domaincheckpointxml2xmlout/empty.xml
participants (5):
- Eric Blake
- John Ferlan
- John Snow
- Kashyap Chamarthy
- Nir Soffer