[libvirt] [PATCHv2 00/20] External snapshot support

This is a second spin of the patches. Changes to previous version:
- pushed fix for private_syms file
- patches 1-4 are Eric's patches that this series builds upon
- patch 5 was ACKed in v1
- patches 6, 7, 9, 10, 14 are fixed versions after review
- patch 8 is new in the series
- the rest needs review

You can fetch the changes at:
git fetch git://pipo.sk/pipo/libvirt.git snap-revert

Eric Blake (4):
  snapshot: new XML for external system checkpoint
  snapshot: improve disk align checking
  snapshot: populate new XML info for qemu snapshots
  snapshot: merge pre-snapshot checks

Peter Krempa (16):
  qemu: Split out code to save domain memory to allow reuse
  snapshot: Add flag to enable creating checkpoints in live state
  snapshot: qemu: Add async job type for snapshots
  snapshot: qemu: Rename qemuDomainSnapshotCreateActive
  qemu: Fix possible race when pausing guest
  snapshot: qemu: Add support for external checkpoints
  snapshot: qemu: Remove restrictions preventing external snapshots
  qemu: snapshot: Clean up snapshot retrieval to use the new helper
  qemu: Split out guts of qemuDomainSaveImageStartVM() to allow reuse
  snapshot: qemu: Add flag VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED
  snapshot: qemu: Add support for external inactive snapshots
  conf: Add helper to determine if snapshot is external
  snapshot: qemu: Add detail option for PMSUSPENDED event.
  snapshot: qemu: Fix detection of external snapshots when deleting
  snapshot: qemu: Add support for external snapshot deletion.
  snapshot: qemu: Implement reverting of external snapshots

 docs/formatsnapshot.html.in                        |   11 +
 docs/schemas/domainsnapshot.rng                    |   23 +
 examples/domain-events/events-c/event-test.c       |    3 +
 include/libvirt/libvirt.h.in                       |    7 +
 src/conf/snapshot_conf.c                           |  102 +-
 src/conf/snapshot_conf.h                           |    6 +
 src/libvirt.c                                      |   15 +-
 src/libvirt_private.syms                           |    1 +
 src/qemu/qemu_domain.c                             |  125 +-
 src/qemu/qemu_domain.h                             |    1 +
 src/qemu/qemu_driver.c                             | 1203 ++++++++++++++------
 src/qemu/qemu_process.c                            |   19 +
 tests/domainsnapshotxml2xmlin/external_vm.xml      |   10 +
 tests/domainsnapshotxml2xmlin/noparent.xml         |    9 +
 tests/domainsnapshotxml2xmlout/all_parameters.xml  |    1 +
 tests/domainsnapshotxml2xmlout/disk_snapshot.xml   |    1 +
 tests/domainsnapshotxml2xmlout/external_vm.xml     |   43 +
 tests/domainsnapshotxml2xmlout/full_domain.xml     |    1 +
 tests/domainsnapshotxml2xmlout/metadata.xml        |    1 +
 tests/domainsnapshotxml2xmlout/noparent.xml        |    1 +
 .../noparent_nodescription.xml                     |    1 +
 .../noparent_nodescription_noactive.xml            |    1 +
 tests/domainsnapshotxml2xmltest.c                  |    1 +
 tools/virsh-domain-monitor.c                       |    2 +
 tools/virsh-snapshot.c                             |    9 +
 tools/virsh.pod                                    |   27 +-
 26 files changed, 1259 insertions(+), 365 deletions(-)
 create mode 100644 tests/domainsnapshotxml2xmlin/external_vm.xml
 create mode 100644 tests/domainsnapshotxml2xmlin/noparent.xml
 create mode 100644 tests/domainsnapshotxml2xmlout/external_vm.xml

-- 
1.7.12.4

From: Eric Blake <eblake@redhat.com>

Each <domainsnapshot> can now contain an optional <memory> element that
describes how the VM state was handled, similar to disk snapshots. The
new element will always appear in output; for back-compat, an input
that lacks the element will assume 'no' or 'internal' according to the
domain state.

Along with this change, it is now possible to pass <disks> in the XML
for an offline snapshot; this also needs to be wired up in a future
patch, to make it possible to choose internal vs. external on a
per-disk basis for each disk in an offline domain. At that point, using
the --disk-only flag for an offline domain will be able to work.

For some examples below, remember that qemu supports the following
snapshot actions:

  qemu-img:    offline external and internal disk
  savevm:      online internal VM and disk
  migrate:     online external VM
  transaction: online external disk

=====

<domainsnapshot>
  <memory snapshot='no'/>
  ...
</domainsnapshot>

implies that there is no VM state saved (mandatory for offline and
disk-only snapshots, not possible otherwise); using qemu-img for
offline domains and transaction for online.

=====

<domainsnapshot>
  <memory snapshot='internal'/>
  ...
</domainsnapshot>

state is saved inside one of the disks (as in qemu's 'savevm' system
checkpoint implementation). If needed in the future, we can also add an
attribute pointing out _which_ disk saved the internal state; maybe
disk='vda'.

=====

<domainsnapshot>
  <memory snapshot='external' file='/path/to/state'/>
  ...
</domainsnapshot>

This is not wired up yet, but future patches will allow this to control
a combination of 'virsh save /path/to/state' plus disk snapshots from
the same point in time.
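The back-compat defaulting rule described above (an input lacking <memory> assumes 'no' or 'internal' according to the domain state, and offline snapshots may not carry VM state) can be sketched as follows. This is a hypothetical Python illustration of the parse logic, not code from the series; the function name and string modes are my own:

```python
def default_memory_snapshot(memory_attr, offline):
    """Resolve the effective <memory snapshot='...'/> mode of a parsed
    <domainsnapshot>.

    memory_attr -- attribute value, or None when the element is absent
                   (pre-0.10.2 input)
    offline     -- True for offline or disk-only snapshots
    """
    if memory_attr is not None:
        if memory_attr not in ("no", "internal", "external"):
            raise ValueError("unknown memory snapshot setting '%s'" % memory_attr)
        mode = memory_attr
    else:
        # Back-compat default: depends only on the domain state.
        mode = "no" if offline else "internal"
    # Offline and disk-only snapshots cannot save VM state at all.
    if offline and mode != "no":
        raise ValueError("memory state cannot be saved with offline snapshot")
    return mode
```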
=====

So for 0.10.2, I plan to implement this table of combinations, with '*'
designating new code and '+' designating existing code reached through
new combinations of xml and/or the existing DISK_ONLY flag:

domain  memory  disk    disk-only | result
------------------------------------------
offline omit    omit    any       | memory=no disk=int, via qemu-img
offline no      omit    any       |+memory=no disk=int, via qemu-img
offline omit/no no      any       | invalid combination (nothing to snapshot)
offline omit/no int     any       |+memory=no disk=int, via qemu-img
offline omit/no ext     any       |*memory=no disk=ext, via qemu-img
offline int/ext any     any       | invalid combination (no memory to save)
online  omit    omit    off       | memory=int disk=int, via savevm
online  omit    omit    on        | memory=no disk=default, via transaction
online  omit    no/ext  off       | unsupported for now
online  omit    no      on        | invalid combination (nothing to snapshot)
online  omit    ext     on        | memory=no disk=ext, via transaction
online  omit    int     off       |+memory=int disk=int, via savevm
online  omit    int     on        | unsupported for now
online  no      omit    any       |+memory=no disk=default, via transaction
online  no      no      any       | invalid combination (nothing to snapshot)
online  no      int     any       | unsupported for now
online  no      ext     any       |+memory=no disk=ext, via transaction
online  int/ext any     on        | invalid combination (disk-only vs. memory)
online  int     omit    off       |+memory=int disk=int, via savevm
online  int     no/ext  off       | unsupported for now
online  int     int     off       |+memory=int disk=int, via savevm
online  ext     omit    off       |*memory=ext disk=default, via migrate+trans
online  ext     no      off       |+memory=ext disk=no, via migrate
online  ext     int     off       | unsupported for now
online  ext     ext     off       |*memory=ext disk=ext, via migrate+transaction

* docs/schemas/domainsnapshot.rng (memory): New RNG element.
* docs/formatsnapshot.html.in: Document it.
* src/conf/snapshot_conf.h (virDomainSnapshotDef): New fields.
* src/conf/snapshot_conf.c (virDomainSnapshotDefFree)
(virDomainSnapshotDefParseString, virDomainSnapshotDefFormat):
Manage new fields.
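The table above can be sketched as a small resolver. This is a hypothetical Python encoding, not part of the series; the function name, the string modes, and the "migrate+transaction" labels are my own:

```python
def resolve(domain, memory, disk, disk_only):
    """Map (domain state, memory mode, disk mode, --disk-only) to a
    (memory, disk, qemu mechanism) triple, per the planned table."""
    # Offline domains: memory must be absent or 'no'; disks go via qemu-img.
    if domain == "offline":
        if memory in ("int", "ext"):
            raise ValueError("no memory to save")
        if disk == "no":
            raise ValueError("nothing to snapshot")
        if disk == "ext":
            return ("no", "ext", "qemu-img")
        return ("no", "int", "qemu-img")   # disk omitted or 'int'
    # Online with --disk-only: no memory state may be requested.
    if disk_only:
        if memory in ("int", "ext"):
            raise ValueError("disk-only vs. memory")
        if disk == "no":
            raise ValueError("nothing to snapshot")
        if disk == "int":
            raise NotImplementedError("unsupported for now")
        return ("no", "ext" if disk == "ext" else "default", "transaction")
    # Online, full snapshot: an omitted <memory> defaults to 'internal'.
    mem = "int" if memory == "omit" else memory
    if mem == "no":
        if disk == "omit":
            return ("no", "default", "transaction")
        if disk == "no":
            raise ValueError("nothing to snapshot")
        if disk == "int":
            raise NotImplementedError("unsupported for now")
        return ("no", "ext", "transaction")
    if mem == "int":
        if disk in ("omit", "int"):
            return ("int", "int", "savevm")
        raise NotImplementedError("unsupported for now")
    # mem == 'ext': memory saved via migration to file.
    if disk == "omit":
        return ("ext", "default", "migrate+transaction")
    if disk == "no":
        return ("ext", "no", "migrate")
    if disk == "int":
        raise NotImplementedError("unsupported for now")
    return ("ext", "ext", "migrate+transaction")
```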
* tests/domainsnapshotxml2xmltest.c: New test.
* tests/domainsnapshotxml2xmlin/*.xml: Update existing tests.
* tests/domainsnapshotxml2xmlout/*.xml: Likewise.
---
 docs/formatsnapshot.html.in                       | 11 ++++
 docs/schemas/domainsnapshot.rng                   | 23 ++++++++
 src/conf/snapshot_conf.c                          | 66 +++++++++++++++++-----
 src/conf/snapshot_conf.h                          |  4 ++
 tests/domainsnapshotxml2xmlin/external_vm.xml     | 10 ++++
 tests/domainsnapshotxml2xmlin/noparent.xml        |  9 +++
 tests/domainsnapshotxml2xmlout/all_parameters.xml |  1 +
 tests/domainsnapshotxml2xmlout/disk_snapshot.xml  |  1 +
 tests/domainsnapshotxml2xmlout/external_vm.xml    | 43 ++++++++++++++
 tests/domainsnapshotxml2xmlout/full_domain.xml    |  1 +
 tests/domainsnapshotxml2xmlout/metadata.xml       |  1 +
 tests/domainsnapshotxml2xmlout/noparent.xml       |  1 +
 .../noparent_nodescription.xml                    |  1 +
 .../noparent_nodescription_noactive.xml           |  1 +
 tests/domainsnapshotxml2xmltest.c                 |  1 +
 15 files changed, 161 insertions(+), 13 deletions(-)
 create mode 100644 tests/domainsnapshotxml2xmlin/external_vm.xml
 create mode 100644 tests/domainsnapshotxml2xmlin/noparent.xml
 create mode 100644 tests/domainsnapshotxml2xmlout/external_vm.xml

diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in
index ec5ebf3..2af32bc 100644
--- a/docs/formatsnapshot.html.in
+++ b/docs/formatsnapshot.html.in
@@ -106,6 +106,17 @@
         description is omitted when initially creating the snapshot,
         then this field will be empty.
       </dd>
+      <dt><code>memory</code></dt>
+      <dd>On input, this is an optional request for how to handle VM
+        state. For an offline domain or a disk-only snapshot,
+        attribute <code>snapshot</code> must be <code>no</code>, since
+        there is no VM state saved; otherwise, the attribute can
+        be <code>internal</code> if the VM state is piggy-backed with
+        other internal disk state, or <code>external</code> along with
+        a second attribute <code>file</code> giving the absolute path
+        of the file holding the VM state. <span class="since">Since
+        0.10.2</span>
+      </dd>
       <dt><code>disks</code></dt>
       <dd>On input, this is an optional listing of specific
         instructions for disk snapshots; it is needed when making a
diff --git a/docs/schemas/domainsnapshot.rng b/docs/schemas/domainsnapshot.rng
index ecaafe9..45d55b5 100644
--- a/docs/schemas/domainsnapshot.rng
+++ b/docs/schemas/domainsnapshot.rng
@@ -31,6 +31,29 @@
       </element>
     </optional>
     <optional>
+      <element name='memory'>
+        <choice>
+          <attribute name='snapshot'>
+            <choice>
+              <value>no</value>
+              <value>internal</value>
+            </choice>
+          </attribute>
+          <group>
+            <optional>
+              <attribute name='snapshot'>
+                <value>external</value>
+              </attribute>
+            </optional>
+            <attribute name='file'>
+              <ref name='absFilePath'/>
+            </attribute>
+          </group>
+        </choice>
+        <empty/>
+      </element>
+    </optional>
+    <optional>
       <element name='disks'>
         <zeroOrMore>
           <ref name='disksnapshot'/>
diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
index 16c844d..6f77026 100644
--- a/src/conf/snapshot_conf.c
+++ b/src/conf/snapshot_conf.c
@@ -94,6 +94,7 @@ void virDomainSnapshotDefFree(virDomainSnapshotDefPtr def)
     VIR_FREE(def->name);
     VIR_FREE(def->description);
     VIR_FREE(def->parent);
+    VIR_FREE(def->file);
     for (i = 0; i < def->ndisks; i++)
         virDomainSnapshotDiskDefClear(&def->disks[i]);
     VIR_FREE(def->disks);
@@ -182,6 +183,9 @@ virDomainSnapshotDefParseString(const char *xmlStr,
     int active;
     char *tmp;
     int keepBlanksDefault = xmlKeepBlanksDefault(0);
+    char *memorySnapshot = NULL;
+    char *memoryFile = NULL;
+    bool offline = !!(flags & VIR_DOMAIN_SNAPSHOT_PARSE_OFFLINE);

     xml = virXMLParseCtxt(NULL, xmlStr, _("(domain_snapshot)"), &ctxt);
     if (!xml) {
@@ -208,15 +212,11 @@ virDomainSnapshotDefParseString(const char *xmlStr,
             virReportError(VIR_ERR_XML_ERROR, "%s",
                            _("a redefined snapshot must have a name"));
             goto cleanup;
-        } else {
-            ignore_value(virAsprintf(&def->name, "%lld",
-                                     (long long)tv.tv_sec));
         }
-    }
-
-    if (def->name == NULL) {
-        virReportOOMError();
-        goto cleanup;
+        if (virAsprintf(&def->name, "%lld", (long long)tv.tv_sec) < 0) {
+            virReportOOMError();
+            goto cleanup;
+        }
     }

     def->description = virXPathString("string(./description)", ctxt);
@@ -247,6 +247,8 @@ virDomainSnapshotDefParseString(const char *xmlStr,
                            state);
             goto cleanup;
         }
+        offline = (def->state == VIR_DOMAIN_SHUTOFF ||
+                   def->state == VIR_DOMAIN_DISK_SNAPSHOT);

         /* Older snapshots were created with just <domain>/<uuid>, and
          * lack domain/@type.  In that case, leave dom NULL, and
@@ -274,11 +276,42 @@ virDomainSnapshotDefParseString(const char *xmlStr,
         def->creationTime = tv.tv_sec;
     }

+    memorySnapshot = virXPathString("string(./memory/@snapshot)", ctxt);
+    memoryFile = virXPathString("string(./memory/@file)", ctxt);
+    if (memorySnapshot) {
+        def->memory = virDomainSnapshotLocationTypeFromString(memorySnapshot);
+        if (def->memory <= 0) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("unknown memory snapshot setting '%s'"),
+                           memorySnapshot);
+            goto cleanup;
+        }
+        if (memoryFile &&
+            def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("memory filename '%s' requires external snapshot"),
+                           memoryFile);
+            goto cleanup;
+        }
+    } else if (memoryFile) {
+        def->memory = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL;
+    } else if (flags & VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE) {
+        def->memory = (offline ?
+                       VIR_DOMAIN_SNAPSHOT_LOCATION_NONE :
+                       VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL);
+    }
+    if (offline && def->memory &&
+        def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_NONE) {
+        virReportError(VIR_ERR_XML_ERROR, "%s",
+                       _("memory state cannot be saved with offline snapshot"));
+        goto cleanup;
+    }
+    def->file = memoryFile;
+    memoryFile = NULL;
+
     if ((i = virXPathNodeSet("./disks/*", ctxt, &nodes)) < 0)
         goto cleanup;
-    if (flags & VIR_DOMAIN_SNAPSHOT_PARSE_DISKS ||
-        (flags & VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE &&
-         def->state == VIR_DOMAIN_DISK_SNAPSHOT)) {
+    if (flags & VIR_DOMAIN_SNAPSHOT_PARSE_DISKS) {
         def->ndisks = i;
         if (def->ndisks && VIR_ALLOC_N(def->disks, def->ndisks) < 0) {
             virReportOOMError();
@@ -310,6 +343,8 @@ cleanup:
     VIR_FREE(creation);
     VIR_FREE(state);
     VIR_FREE(nodes);
+    VIR_FREE(memorySnapshot);
+    VIR_FREE(memoryFile);
     xmlXPathFreeContext(ctxt);
     if (ret == NULL)
         virDomainSnapshotDefFree(def);
@@ -531,8 +566,13 @@ char *virDomainSnapshotDefFormat(const char *domain_uuid,
     }
     virBufferAsprintf(&buf, "  <creationTime>%lld</creationTime>\n",
                       def->creationTime);
-    /* For now, only output <disks> on disk-snapshot */
-    if (def->state == VIR_DOMAIN_DISK_SNAPSHOT) {
+    if (def->memory) {
+        virBufferAsprintf(&buf, "  <memory snapshot='%s'",
+                          virDomainSnapshotLocationTypeToString(def->memory));
+        virBufferEscapeString(&buf, " file='%s'", def->file);
+        virBufferAddLit(&buf, "/>\n");
+    }
+    if (def->ndisks) {
         virBufferAddLit(&buf, "  <disks>\n");
         for (i = 0; i < def->ndisks; i++) {
             virDomainSnapshotDiskDefPtr disk = &def->disks[i];
diff --git a/src/conf/snapshot_conf.h b/src/conf/snapshot_conf.h
index c00347f..00775ae 100644
--- a/src/conf/snapshot_conf.h
+++ b/src/conf/snapshot_conf.h
@@ -66,6 +66,9 @@ struct _virDomainSnapshotDef {
     long long creationTime; /* in seconds */
     int state; /* enum virDomainSnapshotState */

+    int memory; /* enum virDomainMemorySnapshot */
+    char *file; /* memory state file when snapshot is external */
+
     size_t ndisks; /* should not exceed dom->ndisks */
     virDomainSnapshotDiskDef *disks;

@@ -93,6 +96,7 @@ typedef enum {
     VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE = 1 << 0,
     VIR_DOMAIN_SNAPSHOT_PARSE_DISKS    = 1 << 1,
     VIR_DOMAIN_SNAPSHOT_PARSE_INTERNAL = 1 << 2,
+    VIR_DOMAIN_SNAPSHOT_PARSE_OFFLINE  = 1 << 3,
 } virDomainSnapshotParseFlags;

 virDomainSnapshotDefPtr virDomainSnapshotDefParseString(const char *xmlStr,
diff --git a/tests/domainsnapshotxml2xmlin/external_vm.xml b/tests/domainsnapshotxml2xmlin/external_vm.xml
new file mode 100644
index 0000000..3bcd150
--- /dev/null
+++ b/tests/domainsnapshotxml2xmlin/external_vm.xml
@@ -0,0 +1,10 @@
+<domainsnapshot>
+  <name>my snap name</name>
+  <description>!@#$%^</description>
+  <state>running</state>
+  <memory snapshot='external' file='/dev/HostVG/GuestMemory'/>
+  <parent>
+    <name>earlier_snap</name>
+  </parent>
+  <creationTime>1272917631</creationTime>
+</domainsnapshot>
diff --git a/tests/domainsnapshotxml2xmlin/noparent.xml b/tests/domainsnapshotxml2xmlin/noparent.xml
new file mode 100644
index 0000000..cbac0d8
--- /dev/null
+++ b/tests/domainsnapshotxml2xmlin/noparent.xml
@@ -0,0 +1,9 @@
+<domainsnapshot>
+  <name>my snap name</name>
+  <description>!@#$%^</description>
+  <state>running</state>
+  <creationTime>1272917631</creationTime>
+  <domain>
+    <uuid>9d37b878-a7cc-9f9a-b78f-49b3abad25a8</uuid>
+  </domain>
+</domainsnapshot>
diff --git a/tests/domainsnapshotxml2xmlout/all_parameters.xml b/tests/domainsnapshotxml2xmlout/all_parameters.xml
index eb2ee85..4178ac6 100644
--- a/tests/domainsnapshotxml2xmlout/all_parameters.xml
+++ b/tests/domainsnapshotxml2xmlout/all_parameters.xml
@@ -6,6 +6,7 @@
     <name>earlier_snap</name>
   </parent>
   <creationTime>1272917631</creationTime>
+  <memory snapshot='internal'/>
   <domain>
     <uuid>9d37b878-a7cc-9f9a-b78f-49b3abad25a8</uuid>
   </domain>
diff --git a/tests/domainsnapshotxml2xmlout/disk_snapshot.xml b/tests/domainsnapshotxml2xmlout/disk_snapshot.xml
index 0a4b179..57aef16 100644
--- a/tests/domainsnapshotxml2xmlout/disk_snapshot.xml
+++ b/tests/domainsnapshotxml2xmlout/disk_snapshot.xml
@@ -6,6 +6,7 @@
     <name>earlier_snap</name>
   </parent>
   <creationTime>1272917631</creationTime>
+  <memory snapshot='no'/>
   <disks>
     <disk name='hda' snapshot='no'/>
     <disk name='hdb' snapshot='no'/>
diff --git a/tests/domainsnapshotxml2xmlout/external_vm.xml b/tests/domainsnapshotxml2xmlout/external_vm.xml
new file mode 100644
index 0000000..8814bce
--- /dev/null
+++ b/tests/domainsnapshotxml2xmlout/external_vm.xml
@@ -0,0 +1,43 @@
+<domainsnapshot>
+  <name>my snap name</name>
+  <description>!@#$%^</description>
+  <state>running</state>
+  <parent>
+    <name>earlier_snap</name>
+  </parent>
+  <creationTime>1272917631</creationTime>
+  <memory snapshot='external' file='/dev/HostVG/GuestMemory'/>
+  <disks>
+    <disk name='hda' snapshot='no'/>
+  </disks>
+  <domain type='qemu'>
+    <name>QEMUGuest1</name>
+    <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+    <metadata>
+      <app1:foo xmlns:app1="http://foo.org/">fooish</app1:foo>
+      <app2:bar xmlns:app2="http://bar.com/" maman="baz">barish</app2:bar>
+    </metadata>
+    <memory unit='KiB'>219100</memory>
+    <currentMemory unit='KiB'>219100</currentMemory>
+    <vcpu placement='static' cpuset='1-4,8-20,525'>1</vcpu>
+    <os>
+      <type arch='i686' machine='pc'>hvm</type>
+      <boot dev='hd'/>
+    </os>
+    <clock offset='utc'/>
+    <on_poweroff>destroy</on_poweroff>
+    <on_reboot>restart</on_reboot>
+    <on_crash>destroy</on_crash>
+    <devices>
+      <emulator>/usr/bin/qemu</emulator>
+      <disk type='block' device='disk' snapshot='no'>
+        <source dev='/dev/HostVG/QEMUGuest1'/>
+        <target dev='hda' bus='ide'/>
+        <address type='drive' controller='0' bus='0' target='0' unit='0'/>
+      </disk>
+      <controller type='usb' index='0'/>
+      <controller type='ide' index='0'/>
+      <memballoon model='virtio'/>
+    </devices>
+  </domain>
+</domainsnapshot>
diff --git a/tests/domainsnapshotxml2xmlout/full_domain.xml b/tests/domainsnapshotxml2xmlout/full_domain.xml
index 27cf41d..65d1469 100644
--- a/tests/domainsnapshotxml2xmlout/full_domain.xml
+++ b/tests/domainsnapshotxml2xmlout/full_domain.xml
@@ -6,6 +6,7 @@
     <name>earlier_snap</name>
   </parent>
   <creationTime>1272917631</creationTime>
+  <memory snapshot='internal'/>
   <domain type='qemu'>
     <name>QEMUGuest1</name>
     <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
diff --git a/tests/domainsnapshotxml2xmlout/metadata.xml b/tests/domainsnapshotxml2xmlout/metadata.xml
index 93c9f39..f961458 100644
--- a/tests/domainsnapshotxml2xmlout/metadata.xml
+++ b/tests/domainsnapshotxml2xmlout/metadata.xml
@@ -6,6 +6,7 @@
     <name>earlier_snap</name>
   </parent>
   <creationTime>1272917631</creationTime>
+  <memory snapshot='internal'/>
   <domain type='qemu'>
     <name>QEMUGuest1</name>
     <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
diff --git a/tests/domainsnapshotxml2xmlout/noparent.xml b/tests/domainsnapshotxml2xmlout/noparent.xml
index cbac0d8..0cbbb65 100644
--- a/tests/domainsnapshotxml2xmlout/noparent.xml
+++ b/tests/domainsnapshotxml2xmlout/noparent.xml
@@ -3,6 +3,7 @@
   <description>!@#$%^</description>
   <state>running</state>
   <creationTime>1272917631</creationTime>
+  <memory snapshot='internal'/>
   <domain>
     <uuid>9d37b878-a7cc-9f9a-b78f-49b3abad25a8</uuid>
   </domain>
diff --git a/tests/domainsnapshotxml2xmlout/noparent_nodescription.xml b/tests/domainsnapshotxml2xmlout/noparent_nodescription.xml
index 0de202d..4eb4016 100644
--- a/tests/domainsnapshotxml2xmlout/noparent_nodescription.xml
+++ b/tests/domainsnapshotxml2xmlout/noparent_nodescription.xml
@@ -3,5 +3,6 @@
   <description>!@#$%^</description>
   <state>running</state>
   <creationTime>1272917631</creationTime>
+  <memory snapshot='internal'/>
   <active>1</active>
 </domainsnapshot>
diff --git a/tests/domainsnapshotxml2xmlout/noparent_nodescription_noactive.xml b/tests/domainsnapshotxml2xmlout/noparent_nodescription_noactive.xml
index 25b60f3..94d59a3 100644
--- a/tests/domainsnapshotxml2xmlout/noparent_nodescription_noactive.xml
+++ b/tests/domainsnapshotxml2xmlout/noparent_nodescription_noactive.xml
@@ -3,4 +3,5 @@
   <description>!@#$%^</description>
   <state>running</state>
   <creationTime>1272917631</creationTime>
+  <memory snapshot='no'/>
 </domainsnapshot>
diff --git a/tests/domainsnapshotxml2xmltest.c b/tests/domainsnapshotxml2xmltest.c
index e363c99..84278d6 100644
--- a/tests/domainsnapshotxml2xmltest.c
+++ b/tests/domainsnapshotxml2xmltest.c
@@ -110,6 +110,7 @@ mymain(void)
     DO_TEST("noparent_nodescription", NULL, 1);
     DO_TEST("noparent", "9d37b878-a7cc-9f9a-b78f-49b3abad25a8", 0);
     DO_TEST("metadata", "c7a5fdbd-edaf-9455-926a-d65c16db1809", 0);
+    DO_TEST("external_vm", "c7a5fdbd-edaf-9455-926a-d65c16db1809", 0);

     virCapabilitiesFree(driver.caps);
-- 
1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
> From: Eric Blake <eblake@redhat.com>
> Each <domainsnapshot> can now contain an optional <memory> element that
> describes how the VM state was handled, similar to disk snapshots. The
> new element will always appear in output; for back-compat, an input
> that lacks the element will assume 'no' or 'internal' according to the
> domain state.
> So for 0.10.2, I plan to implement this table of combinations,
1.0.1, now.  I'll go ahead and push the first four patches with all
comments addressed (Osier had some comments on v1), once the release is
out the door; I don't think it will introduce too many problems into
the rest of your series.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

From: Eric Blake <eblake@redhat.com>

There were no previous callers with require_match set to true. I
originally implemented this bool with the intent of supporting ESX
snapshot semantics, where the choice of internal vs. external vs.
non-checkpointable must be made at domain start, but as ESX has not
been wired up to use it yet, we might as well fix it to work with our
next qemu patch for now, and worry about any further improvements
(changing the bool to a flags argument) if the ESX driver decides to
use this function in the future.

* src/conf/snapshot_conf.c (virDomainSnapshotAlignDisks): Alter
logic when require_match is true to deal with new XML.
---
 src/conf/snapshot_conf.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
index 6f77026..691950a 100644
--- a/src/conf/snapshot_conf.c
+++ b/src/conf/snapshot_conf.c
@@ -368,9 +368,8 @@ disksorter(const void *a, const void *b)
  * the domain, with a fallback to a passed in default.  Convert paths
  * to disk targets for uniformity.  Issue an error and return -1 if
  * any def->disks[n]->name appears more than once or does not map to
- * dom->disks.  If require_match, also require that existing
- * def->disks snapshot states do not override explicit def->dom
- * settings. */
+ * dom->disks.  If require_match, also ensure that there are no
+ * conflicting requests for both internal and external snapshots.  */
 int
 virDomainSnapshotAlignDisks(virDomainSnapshotDefPtr def,
                             int default_snapshot,
@@ -416,7 +415,6 @@ virDomainSnapshotAlignDisks(virDomainSnapshotDefPtr def,
                            _("no disk named '%s'"), disk->name);
             goto cleanup;
         }
-        disk_snapshot = def->dom->disks[idx]->snapshot;

         if (virBitmapGetBit(map, idx, &inuse) < 0 || inuse) {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
@@ -426,15 +424,22 @@ virDomainSnapshotAlignDisks(virDomainSnapshotDefPtr def,
         }
         ignore_value(virBitmapSetBit(map, idx));
         disk->index = idx;
-        if (!disk_snapshot)
-            disk_snapshot = default_snapshot;
+
+        disk_snapshot = def->dom->disks[idx]->snapshot;
         if (!disk->snapshot) {
-            disk->snapshot = disk_snapshot;
-        } else if (disk_snapshot && require_match &&
-                   disk->snapshot != disk_snapshot) {
+            if (disk_snapshot &&
+                (!require_match ||
+                 disk_snapshot == VIR_DOMAIN_SNAPSHOT_LOCATION_NONE))
+                disk->snapshot = disk_snapshot;
+            else
+                disk->snapshot = default_snapshot;
+        } else if (require_match &&
+                   disk->snapshot != default_snapshot &&
+                   !(disk->snapshot == VIR_DOMAIN_SNAPSHOT_LOCATION_NONE &&
+                     disk_snapshot == VIR_DOMAIN_SNAPSHOT_LOCATION_NONE)) {
             const char *tmp;
-            tmp = virDomainSnapshotLocationTypeToString(disk_snapshot);
+
+            tmp = virDomainSnapshotLocationTypeToString(default_snapshot);
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                            _("disk '%s' must use snapshot mode '%s'"),
                            disk->name, tmp);
-- 
1.7.12.4
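The per-disk resolution rule that this patch gives virDomainSnapshotAlignDisks can be sketched for a single disk as follows. This is a hypothetical Python illustration of the logic in the diff, not libvirt code; NONE/INTERNAL/EXTERNAL stand in for the VIR_DOMAIN_SNAPSHOT_LOCATION_* values:

```python
NONE, INTERNAL, EXTERNAL = "none", "internal", "external"

def align_disk(requested, domain_default, fallback, require_match):
    """Resolve one disk's snapshot mode.

    requested      -- mode from the <domainsnapshot> XML, or None if omitted
    domain_default -- per-disk <disk snapshot='...'/> from the domain XML,
                      or None if unset
    fallback       -- default passed in by the caller
    require_match  -- reject conflicting internal/external requests
    """
    if requested is None:
        # Honor the domain's own setting unless require_match forces the
        # caller's default; a NONE domain setting always wins.
        if domain_default and (not require_match or domain_default == NONE):
            return domain_default
        return fallback
    # An explicit request must agree with the caller's default, except that
    # NONE is allowed when the domain also says NONE.
    if (require_match and requested != fallback and
            not (requested == NONE and domain_default == NONE)):
        raise ValueError("disk must use snapshot mode '%s'" % fallback)
    return requested
```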

From: Eric Blake <eblake@redhat.com>

Now that the XML supports listing internal snapshots, it is worth
always populating the <memory> and <disks> element to match.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Always parse
disk info and set memory info.
---
 src/qemu/qemu_driver.c | 47 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 267bbf1..39a8eba 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11189,8 +11189,10 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
     char uuidstr[VIR_UUID_STRING_BUFLEN];
     virDomainSnapshotDefPtr def = NULL;
     bool update_current = true;
-    unsigned int parse_flags = 0;
+    unsigned int parse_flags = VIR_DOMAIN_SNAPSHOT_PARSE_DISKS;
     virDomainSnapshotObjPtr other = NULL;
+    int align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL;
+    int align_match = true;

     virCheckFlags(VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE |
                   VIR_DOMAIN_SNAPSHOT_CREATE_CURRENT |
@@ -11214,8 +11216,6 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
         update_current = false;
     if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)
         parse_flags |= VIR_DOMAIN_SNAPSHOT_PARSE_REDEFINE;
-    if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
-        parse_flags |= VIR_DOMAIN_SNAPSHOT_PARSE_DISKS;

     qemuDriverLock(driver);
     virUUIDFormat(domain->uuid, uuidstr);
@@ -11242,6 +11242,9 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
                        _("cannot halt after transient domain snapshot"));
         goto cleanup;
     }
+    if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) ||
+        !virDomainObjIsActive(vm))
+        parse_flags |= VIR_DOMAIN_SNAPSHOT_PARSE_OFFLINE;

     if (!(def = virDomainSnapshotDefParseString(xmlDesc, driver->caps,
                                                 QEMU_EXPECTED_VIRT_TYPES,
@@ -11282,6 +11285,15 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
         }

         /* Check that any replacement is compatible */
+        if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) &&
+            def->state != VIR_DOMAIN_DISK_SNAPSHOT) {
+            virReportError(VIR_ERR_INVALID_ARG,
+                           _("disk-only flag for snapshot %s requires "
+                             "disk-snapshot state"),
+                           def->name);
+            goto cleanup;
+
+        }
         if (def->dom &&
             memcmp(def->dom->uuid, domain->uuid, VIR_UUID_BUFLEN)) {
             virReportError(VIR_ERR_INVALID_ARG,
@@ -11331,10 +11343,13 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
             other->def = NULL;
             snap = other;
         }
-        if (def->state == VIR_DOMAIN_DISK_SNAPSHOT && def->dom) {
-            if (virDomainSnapshotAlignDisks(def,
-                                            VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL,
-                                            false) < 0)
+        if (def->dom) {
+            if (def->state == VIR_DOMAIN_DISK_SNAPSHOT) {
+                align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL;
+                align_match = false;
+            }
+            if (virDomainSnapshotAlignDisks(def, align_location,
+                                            align_match) < 0)
                 goto cleanup;
         }
     } else {
@@ -11353,13 +11368,14 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
                              "implemented yet"));
                 goto cleanup;
             }
-            if (virDomainSnapshotAlignDisks(def,
-                                            VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL,
-                                            false) < 0)
-                goto cleanup;
-            if (qemuDomainSnapshotDiskPrepare(vm, def, &flags) < 0)
+            align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL;
+            align_match = false;
+            if (virDomainSnapshotAlignDisks(def, align_location,
+                                            align_match) < 0 ||
+                qemuDomainSnapshotDiskPrepare(vm, def, &flags) < 0)
                 goto cleanup;
             def->state = VIR_DOMAIN_DISK_SNAPSHOT;
+            def->memory = VIR_DOMAIN_SNAPSHOT_LOCATION_NONE;
         } else {
             /* In a perfect world, we would allow qemu to tell us this.
              * The problem is that qemu only does this check
@@ -11370,9 +11386,14 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
              * the boot device.  This is probably a bug in qemu, but we'll
              * work around it here for now.
              */
-            if (!qemuDomainSnapshotIsAllowed(vm))
+            if (!qemuDomainSnapshotIsAllowed(vm) ||
+                virDomainSnapshotAlignDisks(def, align_location,
+                                            align_match) < 0)
                 goto cleanup;
             def->state = virDomainObjGetState(vm, NULL);
+            def->memory = (def->state == VIR_DOMAIN_SHUTOFF ?
+                           VIR_DOMAIN_SNAPSHOT_LOCATION_NONE :
+                           VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL);
         }
     }
-- 
1.7.12.4

From: Eric Blake <eblake@redhat.com> Both system checkpoint snapshots and disk snapshots were iterating over all disks, doing a final sanity check before doing any work. But since future patches will allow offline snapshots to be either external or internal, it makes sense to share the pass over all disks, and then relax restrictions in that pass as new modes are implemented. Future patches can then handle external disks when the domain is offline, then handle offline --disk-snapshot, and finally, combine with migration to file to gain a complete external system checkpoint snapshot of an active domain without using 'savevm'. * src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare) (qemuDomainSnapshotIsAllowed): Merge... (qemuDomainSnapshotPrepare): ...into one function. (qemuDomainSnapshotCreateXML): Update caller. --- src/qemu/qemu_driver.c | 92 +++++++++++++++++++------------------------------- 1 file changed, 34 insertions(+), 58 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 39a8eba..69474de 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10546,34 +10546,6 @@ cleanup: } -static bool -qemuDomainSnapshotIsAllowed(virDomainObjPtr vm) -{ - int i; - - /* FIXME: we need to figure out what else here might succeed; in - * particular, if it's a raw device but on LVM, we could probably make - * that succeed as well - */ - for (i = 0; i < vm->def->ndisks; i++) { - virDomainDiskDefPtr disk = vm->def->disks[i]; - if (disk->type == VIR_DOMAIN_DISK_TYPE_NETWORK && - (disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_SHEEPDOG || - disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_RBD)) - continue; - - if ((disk->device == VIR_DOMAIN_DISK_DEVICE_LUN) || - (disk->device == VIR_DOMAIN_DISK_DEVICE_DISK && - disk->format > 0 && disk->format != VIR_STORAGE_FILE_QCOW2)) { - virReportError(VIR_ERR_OPERATION_INVALID, - _("Disk '%s' does not support snapshotting"), - disk->src); - return false; - } - } - - return true; -} /* this function expects 
the driver lock to be held by the caller */ static int @@ -10727,8 +10699,8 @@ endjob: } static int -qemuDomainSnapshotDiskPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, - unsigned int *flags) +qemuDomainSnapshotPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, + unsigned int *flags) { int ret = -1; int i; @@ -10740,7 +10712,8 @@ qemuDomainSnapshotDiskPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, int external = 0; qemuDomainObjPrivatePtr priv = vm->privateData; - if (reuse && !qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION)) { + if (def->state == VIR_DOMAIN_DISK_SNAPSHOT && + reuse && !qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("reuse is not supported with this QEMU binary")); goto cleanup; @@ -10748,15 +10721,16 @@ qemuDomainSnapshotDiskPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, for (i = 0; i < def->ndisks; i++) { virDomainSnapshotDiskDefPtr disk = &def->disks[i]; + virDomainDiskDefPtr dom_disk = vm->def->disks[i]; switch (disk->snapshot) { case VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL: - if (active) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("active qemu domains require external disk " - "snapshots; disk %s requested internal"), - disk->name); - goto cleanup; + if (def->state != VIR_DOMAIN_DISK_SNAPSHOT && + dom_disk->type == VIR_DOMAIN_DISK_TYPE_NETWORK && + (dom_disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_SHEEPDOG || + dom_disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_RBD)) { + found = true; + break; } if (vm->def->disks[i]->format > 0 && vm->def->disks[i]->format != VIR_STORAGE_FILE_QCOW2) { @@ -10768,10 +10742,25 @@ qemuDomainSnapshotDiskPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, vm->def->disks[i]->format)); goto cleanup; } + if (def->state == VIR_DOMAIN_DISK_SNAPSHOT && active) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("active qemu domains require external disk " + "snapshots; disk %s requested internal"), + disk->name); + goto cleanup; 
+ } found = true; break; case VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL: + if (def->state != VIR_DOMAIN_DISK_SNAPSHOT) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("system checkpoint external snapshot for " + "disk %s not implemented yet"), + disk->name); + goto cleanup; + } + if (!disk->format) { disk->format = VIR_STORAGE_FILE_QCOW2; } else if (disk->format != VIR_STORAGE_FILE_QCOW2 && @@ -10819,11 +10808,11 @@ qemuDomainSnapshotDiskPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, if (!found) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("disk snapshots require at least one disk to be " + _("snapshots require at least one disk to be " "selected for snapshot")); goto cleanup; } - if (active) { + if (def->state != VIR_DOMAIN_DISK_SNAPSHOT && active) { if (external == 1 || qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION)) { *flags |= VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC; @@ -11034,7 +11023,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, /* For multiple disks, libvirt must pause externally to get all * snapshots to be at the same point in time, unless qemu supports * transactions. For a single disk, snapshot is atomic without - * requiring a pause. Thanks to qemuDomainSnapshotDiskPrepare, if + * requiring a pause. Thanks to qemuDomainSnapshotPrepare, if * we got to this point, the atomic flag now says whether we need * to pause, and a capability bit says whether to use transaction. */ @@ -11060,7 +11049,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, /* No way to roll back if first disk succeeds but later disks * fail, unless we have transaction support. - * Based on earlier qemuDomainSnapshotDiskPrepare, all + * Based on earlier qemuDomainSnapshotPrepare, all * disks in this list are now either SNAPSHOT_NO, or * SNAPSHOT_EXTERNAL with a valid file name and qcow2 format. 
*/ qemuDomainObjEnterMonitorWithDriver(driver, vm); @@ -11370,31 +11359,18 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; align_match = false; - if (virDomainSnapshotAlignDisks(def, align_location, - align_match) < 0 || - qemuDomainSnapshotDiskPrepare(vm, def, &flags) < 0) - goto cleanup; def->state = VIR_DOMAIN_DISK_SNAPSHOT; def->memory = VIR_DOMAIN_SNAPSHOT_LOCATION_NONE; } else { - /* In a perfect world, we would allow qemu to tell us this. - * The problem is that qemu only does this check - * device-by-device; so if you had a domain that booted from a - * large qcow2 device, but had a secondary raw device - * attached, you wouldn't find out that you can't snapshot - * your guest until *after* it had spent the time to snapshot - * the boot device. This is probably a bug in qemu, but we'll - * work around it here for now. - */ - if (!qemuDomainSnapshotIsAllowed(vm) || - virDomainSnapshotAlignDisks(def, align_location, - align_match) < 0) - goto cleanup; def->state = virDomainObjGetState(vm, NULL); def->memory = (def->state == VIR_DOMAIN_SHUTOFF ? VIR_DOMAIN_SNAPSHOT_LOCATION_NONE : VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL); } + if (virDomainSnapshotAlignDisks(def, align_location, + align_match) < 0 || + qemuDomainSnapshotPrepare(vm, def, &flags) < 0) + goto cleanup; } if (snap) -- 1.7.12.4
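The merged qemuDomainSnapshotPrepare() boils down to a per-disk switch over the requested snapshot location. As a rough stand-alone illustration of the accepted combinations (the enum and function names below are simplified stand-ins, not the real libvirt identifiers, and several details such as the sheepdog/RBD special case are glossed over):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the libvirt enums; not the real identifiers. */
typedef enum { LOC_NONE, LOC_INTERNAL, LOC_EXTERNAL } snap_location;
typedef enum { FMT_DEFAULT, FMT_RAW, FMT_QCOW2 } disk_fmt;

/* Sketch of the per-disk checks: internal snapshots require a qcow2 (or
 * unspecified) image and are rejected for disk-only snapshots of active
 * domains; external snapshots default an unspecified format to qcow2 and
 * reject other formats; a disk excluded from the snapshot always passes. */
static bool disk_snapshot_allowed(snap_location loc, disk_fmt fmt,
                                  bool active_disk_only)
{
    switch (loc) {
    case LOC_INTERNAL:
        if (fmt != FMT_DEFAULT && fmt != FMT_QCOW2)
            return false;            /* internal needs qcow2 */
        return !active_disk_only;    /* active disk-only must be external */
    case LOC_EXTERNAL:
        /* an unspecified format is defaulted to qcow2 by the driver */
        return fmt == FMT_DEFAULT || fmt == FMT_QCOW2;
    case LOC_NONE:
        return true;                 /* disk not part of the snapshot */
    }
    return false;
}
```

The real function additionally counts external disks to decide whether the ATOMIC flag can be honored without pausing.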

On 11/01/2012 10:22 AM, Peter Krempa wrote:
From: Eric Blake <eblake@redhat.com>
Both system checkpoint snapshots and disk snapshots were iterating over all disks, doing a final sanity check before doing any work. But since future patches will allow offline snapshots to be either external or internal, it makes sense to share the pass over all disks, and then relax restrictions in that pass as new modes are implemented. Future patches can then handle external disks when the domain is offline, then handle offline --disk-snapshot, and finally, combine with migration to file to gain a complete external system checkpoint snapshot of an active domain without using 'savevm'.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare) (qemuDomainSnapshotIsAllowed): Merge... (qemuDomainSnapshotPrepare): ...into one function. (qemuDomainSnapshotCreateXML): Update caller. --- src/qemu/qemu_driver.c | 92 +++++++++++++++++++------------------------------- 1 file changed, 34 insertions(+), 58 deletions(-)
I've pushed 1-4, based on ACKs previously received, and after rebasing to mention 1.0.1 where needed. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The code that saves domain memory by migration to file can be reused while doing external snapshots of a machine. This patch extracts the common code and places it in a separate function. --- src/qemu/qemu_driver.c | 237 +++++++++++++++++++++++++++---------------------- 1 file changed, 133 insertions(+), 104 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 69474de..377439a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2605,7 +2605,7 @@ bswap_header(struct qemud_save_header *hdr) { /* return -errno on failure, or 0 on success */ static int -qemuDomainSaveHeader(int fd, const char *path, char *xml, +qemuDomainSaveHeader(int fd, const char *path, const char *xml, struct qemud_save_header *header) { int ret = 0; @@ -2752,103 +2752,38 @@ cleanup: return fd; } -/* This internal function expects the driver lock to already be held on - * entry and the vm must be active + locked. Vm will be unlocked and - * potentially free'd after this returns (eg transient VMs are freed - * shutdown). So 'vm' must not be referenced by the caller after - * this returns (whether returning success or failure). 
- */ +/* Helper function to execute a migration to file with a correct save header + * the caller needs to make sure that the processors are stopped and do all other + * actions besides saving memory */ static int -qemuDomainSaveInternal(struct qemud_driver *driver, virDomainPtr dom, - virDomainObjPtr vm, const char *path, - int compressed, const char *xmlin, unsigned int flags) +qemuDomainSaveMemory(struct qemud_driver *driver, + virDomainObjPtr vm, + const char *path, + const char *xml, + int compressed, + bool was_running, + unsigned int flags, + enum qemuDomainAsyncJob asyncJob) { - char *xml = NULL; struct qemud_save_header header; bool bypassSecurityDriver = false; - int ret = -1; - int rc; - virDomainEventPtr event = NULL; - qemuDomainObjPrivatePtr priv; bool needUnlink = false; - size_t len; - unsigned long long offset; - unsigned long long pad; + int ret = -1; int fd = -1; int directFlag = 0; virFileWrapperFdPtr wrapperFd = NULL; unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING; - - if (qemuProcessAutoDestroyActive(driver, vm)) { - virReportError(VIR_ERR_OPERATION_INVALID, - "%s", _("domain is marked for auto destroy")); - goto cleanup; - } - if (virDomainHasDiskMirror(vm)) { - virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", - _("domain has active block copy job")); - goto cleanup; - } + unsigned long long pad; + unsigned long long offset; + size_t len; memset(&header, 0, sizeof(header)); memcpy(header.magic, QEMUD_SAVE_PARTIAL, sizeof(header.magic)); header.version = QEMUD_SAVE_VERSION; + header.was_running = was_running ? 
1 : 0; header.compressed = compressed; - priv = vm->privateData; - - if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm, - QEMU_ASYNC_JOB_SAVE) < 0) - goto cleanup; - - memset(&priv->job.info, 0, sizeof(priv->job.info)); - priv->job.info.type = VIR_DOMAIN_JOB_UNBOUNDED; - - /* Pause */ - if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { - header.was_running = 1; - if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_SAVE, - QEMU_ASYNC_JOB_SAVE) < 0) - goto endjob; - - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("guest unexpectedly quit")); - goto endjob; - } - } - /* libvirt.c already guaranteed these two flags are exclusive. */ - if (flags & VIR_DOMAIN_SAVE_RUNNING) - header.was_running = 1; - else if (flags & VIR_DOMAIN_SAVE_PAUSED) - header.was_running = 0; - - /* Get XML for the domain. Restore needs only the inactive xml, - * including secure. We should get the same result whether xmlin - * is NULL or whether it was the live xml of the domain moments - * before. 
*/ - if (xmlin) { - virDomainDefPtr def = NULL; - - if (!(def = virDomainDefParseString(driver->caps, xmlin, - QEMU_EXPECTED_VIRT_TYPES, - VIR_DOMAIN_XML_INACTIVE))) { - goto endjob; - } - if (!virDomainDefCheckABIStability(vm->def, def)) { - virDomainDefFree(def); - goto endjob; - } - xml = qemuDomainDefFormatLive(driver, def, true, true); - } else { - xml = qemuDomainDefFormatLive(driver, vm->def, true, true); - } - if (!xml) { - virReportError(VIR_ERR_OPERATION_FAILED, - "%s", _("failed to get domain xml")); - goto endjob; - } len = strlen(xml) + 1; offset = sizeof(header) + len; @@ -2865,7 +2800,7 @@ qemuDomainSaveInternal(struct qemud_driver *driver, virDomainPtr dom, ((offset + pad) % QEMU_MONITOR_MIGRATE_TO_FILE_BS)); if (VIR_EXPAND_N(xml, len, pad) < 0) { virReportOOMError(); - goto endjob; + goto cleanup; } offset += pad; header.xml_len = len; @@ -2883,22 +2818,21 @@ qemuDomainSaveInternal(struct qemud_driver *driver, virDomainPtr dom, fd = qemuOpenFile(driver, path, O_WRONLY | O_TRUNC | O_CREAT | directFlag, &needUnlink, &bypassSecurityDriver); if (fd < 0) - goto endjob; + goto cleanup; + if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags))) - goto endjob; + goto cleanup; /* Write header to file, followed by XML */ - if (qemuDomainSaveHeader(fd, path, xml, &header) < 0) { - VIR_FORCE_CLOSE(fd); - goto endjob; - } + if (qemuDomainSaveHeader(fd, path, xml, &header) < 0) + goto cleanup; /* Perform the migration */ if (qemuMigrationToFile(driver, vm, fd, offset, path, qemuCompressProgramName(compressed), bypassSecurityDriver, - QEMU_ASYNC_JOB_SAVE) < 0) - goto endjob; + asyncJob) < 0) + goto cleanup; /* Touch up file header to mark image complete. */ @@ -2908,26 +2842,126 @@ qemuDomainSaveInternal(struct qemud_driver *driver, virDomainPtr dom, * that's acceptable. 
*/ if (VIR_CLOSE(fd) < 0) { virReportSystemError(errno, _("unable to close %s"), path); - goto endjob; + goto cleanup; } + if (virFileWrapperFdClose(wrapperFd) < 0) - goto endjob; - fd = qemuOpenFile(driver, path, O_WRONLY, NULL, NULL); - if (fd < 0) - goto endjob; + goto cleanup; + + if ((fd = qemuOpenFile(driver, path, O_WRONLY, NULL, NULL)) < 0) + goto cleanup; memcpy(header.magic, QEMUD_SAVE_MAGIC, sizeof(header.magic)); + if (safewrite(fd, &header, sizeof(header)) != sizeof(header)) { virReportSystemError(errno, _("unable to write %s"), path); - goto endjob; + goto cleanup; } + if (VIR_CLOSE(fd) < 0) { virReportSystemError(errno, _("unable to close %s"), path); - goto endjob; + goto cleanup; } ret = 0; +cleanup: + VIR_FORCE_CLOSE(fd); + virFileWrapperFdCatchError(wrapperFd); + virFileWrapperFdFree(wrapperFd); + + if (ret != 0 && needUnlink) + unlink(path); + + return ret; +} + +/* This internal function expects the driver lock to already be held on + * entry and the vm must be active + locked. Vm will be unlocked and + * potentially free'd after this returns (eg transient VMs are freed + * shutdown). So 'vm' must not be referenced by the caller after + * this returns (whether returning success or failure). 
+ */ +static int +qemuDomainSaveInternal(struct qemud_driver *driver, virDomainPtr dom, + virDomainObjPtr vm, const char *path, + int compressed, const char *xmlin, unsigned int flags) +{ + char *xml = NULL; + bool was_running = false; + int ret = -1; + int rc; + virDomainEventPtr event = NULL; + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (qemuProcessAutoDestroyActive(driver, vm)) { + virReportError(VIR_ERR_OPERATION_INVALID, + "%s", _("domain is marked for auto destroy")); + goto cleanup; + } + if (virDomainHasDiskMirror(vm)) { + virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", + _("domain has active block copy job")); + goto cleanup; + } + + if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm, + QEMU_ASYNC_JOB_SAVE) < 0) + goto cleanup; + + memset(&priv->job.info, 0, sizeof(priv->job.info)); + priv->job.info.type = VIR_DOMAIN_JOB_UNBOUNDED; + + /* Pause */ + if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { + was_running = true; + if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_SAVE, + QEMU_ASYNC_JOB_SAVE) < 0) + goto endjob; + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest unexpectedly quit")); + goto endjob; + } + } + + /* libvirt.c already guaranteed these two flags are exclusive. */ + if (flags & VIR_DOMAIN_SAVE_RUNNING) + was_running = true; + else if (flags & VIR_DOMAIN_SAVE_PAUSED) + was_running = false; + + /* Get XML for the domain. Restore needs only the inactive xml, + * including secure. We should get the same result whether xmlin + * is NULL or whether it was the live xml of the domain moments + * before.
*/ + if (xmlin) { + virDomainDefPtr def = NULL; + + if (!(def = virDomainDefParseString(driver->caps, xmlin, + QEMU_EXPECTED_VIRT_TYPES, + VIR_DOMAIN_XML_INACTIVE))) { + goto endjob; + } + if (!virDomainDefCheckABIStability(vm->def, def)) { + virDomainDefFree(def); + goto endjob; + } + xml = qemuDomainDefFormatLive(driver, def, true, true); + } else { + xml = qemuDomainDefFormatLive(driver, vm->def, true, true); + } + if (!xml) { + virReportError(VIR_ERR_OPERATION_FAILED, + "%s", _("failed to get domain xml")); + goto endjob; + } + + ret = qemuDomainSaveMemory(driver, vm, path, xml, compressed, + was_running, flags, QEMU_ASYNC_JOB_SAVE); + if (ret < 0) + goto endjob; + /* Shut it down */ qemuProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_SAVED, 0); virDomainAuditStop(vm, "saved"); @@ -2943,25 +2977,20 @@ qemuDomainSaveInternal(struct qemud_driver *driver, virDomainPtr dom, endjob: if (vm) { if (ret != 0) { - if (header.was_running && virDomainObjIsActive(vm)) { + if (was_running && virDomainObjIsActive(vm)) { rc = qemuProcessStartCPUs(driver, vm, dom->conn, VIR_DOMAIN_RUNNING_SAVE_CANCELED, QEMU_ASYNC_JOB_SAVE); if (rc < 0) VIR_WARN("Unable to resume guest CPUs after save failure"); } - virFileWrapperFdCatchError(wrapperFd); } if (qemuDomainObjEndAsyncJob(driver, vm) == 0) vm = NULL; } cleanup: - VIR_FORCE_CLOSE(fd); - virFileWrapperFdFree(wrapperFd); VIR_FREE(xml); - if (ret != 0 && needUnlink) - unlink(path); if (event) qemuDomainEventQueue(driver, event); if (vm) -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
The code that saves domain memory by migration to file can be reused while doing external snapshots of a machine. This patch extracts the common code and places it in a separate function. --- src/qemu/qemu_driver.c | 237 +++++++++++++++++++++++++++---------------------- 1 file changed, 133 insertions(+), 104 deletions(-)
ACK on this patch still stands. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The default behavior while creating external checkpoints is to pause the guest while the memory state is captured. We want the users to sacrifice space saving for creating the memory save image while the guest is live to minimize downtime. This patch adds a flag that causes the guest not to be paused before taking the snapshot. *include/libvirt/libvirt.h.in: - add new paused reason: VIR_DOMAIN_PAUSED_SNAPSHOT - add new flag for taking snapshot: VIR_DOMAIN_SNAPSHOT_CREATE_LIVE *tools/virsh-domain-monitor.c: - add string representation for VIR_DOMAIN_PAUSED_SNAPSHOT *tools/virsh-snapshot.c: - add support for VIR_DOMAIN_SNAPSHOT_CREATE_LIVE *tools/virsh.pod: - add docs for --live option added to use VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag --- include/libvirt/libvirt.h.in | 4 ++++ src/libvirt.c | 6 ++++++ tools/virsh-domain-monitor.c | 2 ++ tools/virsh-snapshot.c | 6 ++++++ tools/virsh.pod | 12 ++++++++++-- 5 files changed, 28 insertions(+), 2 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 2b17cef..fe58c08 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -179,6 +179,7 @@ typedef enum { VIR_DOMAIN_PAUSED_WATCHDOG = 6, /* paused due to a watchdog event */ VIR_DOMAIN_PAUSED_FROM_SNAPSHOT = 7, /* paused after restoring from snapshot */ VIR_DOMAIN_PAUSED_SHUTTING_DOWN = 8, /* paused during shutdown process */ + VIR_DOMAIN_PAUSED_SNAPSHOT = 9, /* paused while creating a snapshot */ #ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_PAUSED_LAST @@ -3770,6 +3771,9 @@ typedef enum { the domain */ VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC = (1 << 7), /* atomically avoid partial changes */ + VIR_DOMAIN_SNAPSHOT_CREATE_LIVE = (1 << 8), /* create the snapshot + while the guest is + running */ } virDomainSnapshotCreateFlags; /* Take a snapshot of the current VM state */ diff --git a/src/libvirt.c b/src/libvirt.c index 2a01b80..1da3973 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -17770,6 +17770,12 @@ 
virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot) * running after the snapshot. This flag is invalid on transient domains, * and is incompatible with VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE. * + * If @flags includes VIR_DOMAIN_SNAPSHOT_CREATE_LIVE, then the domain + * is not paused while creating the snapshot. This increases the size + * of the memory dump file, but reduces downtime of the guest while + * taking the snapshot. (Note: this feature works only with external + * checkpoints) + * * If @flags includes VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY, then the * snapshot will be limited to the disks described in @xmlDesc, and no * VM state will be saved. For an active guest, the disk image may be diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c index b264f15..faf23fe 100644 --- a/tools/virsh-domain-monitor.c +++ b/tools/virsh-domain-monitor.c @@ -223,6 +223,8 @@ vshDomainStateReasonToString(int state, int reason) return N_("from snapshot"); case VIR_DOMAIN_PAUSED_SHUTTING_DOWN: return N_("shutting down"); + case VIR_DOMAIN_PAUSED_SNAPSHOT: + return N_("creating snapshot"); case VIR_DOMAIN_PAUSED_UNKNOWN: case VIR_DOMAIN_PAUSED_LAST: ; diff --git a/tools/virsh-snapshot.c b/tools/virsh-snapshot.c index 1641f90..77364e8 100644 --- a/tools/virsh-snapshot.c +++ b/tools/virsh-snapshot.c @@ -127,6 +127,7 @@ static const vshCmdOptDef opts_snapshot_create[] = { {"reuse-external", VSH_OT_BOOL, 0, N_("reuse any existing external files")}, {"quiesce", VSH_OT_BOOL, 0, N_("quiesce guest's file systems")}, {"atomic", VSH_OT_BOOL, 0, N_("require atomic operation")}, + {"live", VSH_OT_BOOL, 0, N_("take a live snapshot")}, {NULL, 0, 0, NULL} }; @@ -155,6 +156,8 @@ cmdSnapshotCreate(vshControl *ctl, const vshCmd *cmd) flags |= VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE; if (vshCommandOptBool(cmd, "atomic")) flags |= VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC; + if (vshCommandOptBool(cmd, "live")) + flags |= VIR_DOMAIN_SNAPSHOT_CREATE_LIVE; dom = vshCommandOptDomain(ctl, cmd, 
NULL); if (dom == NULL) @@ -263,6 +266,7 @@ static const vshCmdOptDef opts_snapshot_create_as[] = { {"reuse-external", VSH_OT_BOOL, 0, N_("reuse any existing external files")}, {"quiesce", VSH_OT_BOOL, 0, N_("quiesce guest's file systems")}, {"atomic", VSH_OT_BOOL, 0, N_("require atomic operation")}, + {"live", VSH_OT_BOOL, 0, N_("take a live snapshot")}, {"diskspec", VSH_OT_ARGV, 0, N_("disk attributes: disk[,snapshot=type][,driver=type][,file=name]")}, {NULL, 0, 0, NULL} @@ -292,6 +296,8 @@ cmdSnapshotCreateAs(vshControl *ctl, const vshCmd *cmd) flags |= VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE; if (vshCommandOptBool(cmd, "atomic")) flags |= VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC; + if (vshCommandOptBool(cmd, "live")) + flags |= VIR_DOMAIN_SNAPSHOT_CREATE_LIVE; dom = vshCommandOptDomain(ctl, cmd, NULL); if (dom == NULL) diff --git a/tools/virsh.pod b/tools/virsh.pod index 07d6a67..a331e56 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -2594,7 +2594,7 @@ used to represent properties of snapshots. =item B<snapshot-create> I<domain> [I<xmlfile>] {[I<--redefine> [I<--current>]] | [I<--no-metadata>] [I<--halt>] [I<--disk-only>] [I<--reuse-external>] -[I<--quiesce>] [I<--atomic>]} +[I<--quiesce>] [I<--atomic>] [I<--live>]} Create a snapshot for domain I<domain> with the properties specified in I<xmlfile>. Normally, the only properties settable for a domain snapshot @@ -2647,6 +2647,10 @@ this. If this flag is not specified, then some hypervisors may fail after partially performing the action, and B<dumpxml> must be used to see whether any partial changes occurred. +If I<--live> is specified, libvirt takes the snapshot while the guest is +running. This increases the size of the memory image of the external +checkpoint. This is currently supported only for external checkpoints. + Existence of snapshot metadata will prevent attempts to B<undefine> a persistent domain. 
However, for transient domains, snapshot metadata is silently lost when the domain quits running (whether @@ -2655,7 +2659,7 @@ by command such as B<destroy> or by internal guest action). =item B<snapshot-create-as> I<domain> {[I<--print-xml>] | [I<--no-metadata>] [I<--halt>] [I<--reuse-external>]} [I<name>] [I<description>] [I<--disk-only> [I<--quiesce>] [I<--atomic>] -[[I<--diskspec>] B<diskspec>]...] +[I<--live>] [[I<--diskspec>] B<diskspec>]...] Create a snapshot for domain I<domain> with the given <name> and <description>; if either value is omitted, libvirt will choose a @@ -2700,6 +2704,10 @@ this. If this flag is not specified, then some hypervisors may fail after partially performing the action, and B<dumpxml> must be used to see whether any partial changes occurred. +If I<--live> is specified, libvirt takes the snapshot while the guest is +running. This increases the size of the memory image of the external +checkpoint. This is currently supported only for external checkpoints. + =item B<snapshot-current> I<domain> {[I<--name>] | [I<--security-info>] | [I<snapshotname>]} -- 1.7.12.4
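The new flag is just another bit in the virDomainSnapshotCreateFlags word; the values below are copied from the libvirt.h.in hunk above, and the helper mirrors how cmdSnapshotCreate() ORs option bits together (a stand-alone sketch, not the actual virsh source):

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as declared in include/libvirt/libvirt.h.in by this patch. */
enum {
    VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC = (1 << 7), /* atomically avoid partial changes */
    VIR_DOMAIN_SNAPSHOT_CREATE_LIVE   = (1 << 8), /* snapshot a running guest */
};

/* Mirrors the pattern used by the virsh snapshot commands: each boolean
 * command-line option contributes one bit to the flags passed to the API. */
static unsigned int build_snapshot_flags(bool atomic, bool live)
{
    unsigned int flags = 0;
    if (atomic)
        flags |= VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC;
    if (live)
        flags |= VIR_DOMAIN_SNAPSHOT_CREATE_LIVE;
    return flags;
}
```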

On 11/01/2012 10:22 AM, Peter Krempa wrote:
The default behavior while creating external checkpoints is to pause the guest while the memory state is captured. We want to let users trade the extra space of a larger memory save image for taking the snapshot while the guest is live, minimizing downtime.
This patch adds a flag that causes the guest not to be paused before taking the snapshot. *include/libvirt/libvirt.h.in: - add new paused reason: VIR_DOMAIN_PAUSED_SNAPSHOT - add new flag for taking snapshot: VIR_DOMAIN_SNAPSHOT_CREATE_LIVE *tools/virsh-domain-monitor.c: - add string representation for VIR_DOMAIN_PAUSED_SNAPSHOT *tools/virsh-snapshot.c: - add support for VIR_DOMAIN_SNAPSHOT_CREATE_LIVE *tools/virsh.pod: - add docs for --live option added to use VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag ---
+++ b/src/libvirt.c @@ -17770,6 +17770,12 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot) * running after the snapshot. This flag is invalid on transient domains, * and is incompatible with VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE. * + * If @flags includes VIR_DOMAIN_SNAPSHOT_CREATE_LIVE, then the domain + * is not paused while creating the snapshot. This increases the size + * of the memory dump file, but reduces downtime of the guest while + * taking the snapshot. (Note: this feature works only with external + * checkpoints)
It is entirely feasible that we could do an internal snapshot without pausing the guest, if future qemu gave us support; so our documentation should be couched in terms of possible restrictions, not permanent restrictions. s/(Note: this feature works only with external checkpoints)/Some hypervisors only support this flag during external checkpoints./ The virsh.pod wording is fine. ACK with that change. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The new external system checkpoints will require an async job while the snapshot is taken. This patch adds QEMU_ASYNC_JOB_SNAPSHOT to track this job type. --- src/qemu/qemu_domain.c | 3 +++ src/qemu/qemu_domain.h | 1 + src/qemu/qemu_process.c | 19 +++++++++++++++++++ 3 files changed, 23 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d01e366..a5592b9 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -65,6 +65,7 @@ VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST, "migration in", "save", "dump", + "snapshot", ); @@ -79,6 +80,7 @@ qemuDomainAsyncJobPhaseToString(enum qemuDomainAsyncJob job, case QEMU_ASYNC_JOB_SAVE: case QEMU_ASYNC_JOB_DUMP: + case QEMU_ASYNC_JOB_SNAPSHOT: case QEMU_ASYNC_JOB_NONE: case QEMU_ASYNC_JOB_LAST: ; /* fall through */ @@ -101,6 +103,7 @@ qemuDomainAsyncJobPhaseFromString(enum qemuDomainAsyncJob job, case QEMU_ASYNC_JOB_SAVE: case QEMU_ASYNC_JOB_DUMP: + case QEMU_ASYNC_JOB_SNAPSHOT: case QEMU_ASYNC_JOB_NONE: case QEMU_ASYNC_JOB_LAST: ; /* fall through */ diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 8a66f14..9c2f67c 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -92,6 +92,7 @@ enum qemuDomainAsyncJob { QEMU_ASYNC_JOB_MIGRATION_IN, QEMU_ASYNC_JOB_SAVE, QEMU_ASYNC_JOB_DUMP, + QEMU_ASYNC_JOB_SNAPSHOT, QEMU_ASYNC_JOB_LAST }; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 3ac5282..67a7157 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3046,6 +3046,25 @@ qemuProcessRecoverJob(struct qemud_driver *driver, } break; + case QEMU_ASYNC_JOB_SNAPSHOT: + qemuDomainObjEnterMonitor(driver, vm); + ignore_value(qemuMonitorMigrateCancel(priv->mon)); + qemuDomainObjExitMonitor(driver, vm); + /* resume the domain but only if it was paused as a result of + * creating the snapshot. 
*/ + if (state == VIR_DOMAIN_PAUSED && + ((job->asyncJob == QEMU_ASYNC_JOB_SNAPSHOT && + reason == VIR_DOMAIN_PAUSED_MIGRATION) || + reason == VIR_DOMAIN_PAUSED_UNKNOWN)) { + if (qemuProcessStartCPUs(driver, vm, conn, + VIR_DOMAIN_RUNNING_UNPAUSED, + QEMU_ASYNC_JOB_NONE) < 0) { + VIR_WARN("Could not resume domain %s after snapshot", + vm->def->name); + } + } + break; + case QEMU_ASYNC_JOB_NONE: case QEMU_ASYNC_JOB_LAST: break; -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
The new external system checkpoints will require an async job while the snapshot is taken. This patch adds QEMU_ASYNC_JOB_SNAPSHOT to track this job type. --- src/qemu/qemu_domain.c | 3 +++ src/qemu/qemu_domain.h | 1 + src/qemu/qemu_process.c | 19 +++++++++++++++++++ 3 files changed, 23 insertions(+)
+++ b/src/qemu/qemu_process.c @@ -3046,6 +3046,25 @@ qemuProcessRecoverJob(struct qemud_driver *driver, } break;
+ case QEMU_ASYNC_JOB_SNAPSHOT: + qemuDomainObjEnterMonitor(driver, vm); + ignore_value(qemuMonitorMigrateCancel(priv->mon)); + qemuDomainObjExitMonitor(driver, vm); + /* resume the domain but only if it was paused as a result of + * creating the snapshot. */ + if (state == VIR_DOMAIN_PAUSED && + ((job->asyncJob == QEMU_ASYNC_JOB_SNAPSHOT &&
In this particular case branch, job->asyncJob is always equal to QEMU_ASYNC_JOB_SNAPSHOT.
+ reason == VIR_DOMAIN_PAUSED_MIGRATION) ||
Don't you mean VIR_DOMAIN_PAUSED_SNAPSHOT here? I think you can simplify the code by squashing this in for less redundant code. ACK with this change: diff --git i/src/qemu/qemu_process.c w/src/qemu/qemu_process.c index 67a7157..84be755 100644 --- i/src/qemu/qemu_process.c +++ w/src/qemu/qemu_process.c @@ -3025,18 +3025,21 @@ qemuProcessRecoverJob(struct qemud_driver *driver, case QEMU_ASYNC_JOB_SAVE: case QEMU_ASYNC_JOB_DUMP: + case QEMU_ASYNC_JOB_SNAPSHOT: qemuDomainObjEnterMonitor(driver, vm); ignore_value(qemuMonitorMigrateCancel(priv->mon)); qemuDomainObjExitMonitor(driver, vm); /* resume the domain but only if it was paused as a result of - * running save/dump operation. Although we are recovering an - * async job, this function is run at startup and must resume - * things using sync monitor connections. */ + * running a migration-to-file operation. Although we are + * recovering an async job, this function is run at startup + * and must resume things using sync monitor connections. */ if (state == VIR_DOMAIN_PAUSED && ((job->asyncJob == QEMU_ASYNC_JOB_DUMP && reason == VIR_DOMAIN_PAUSED_DUMP) || (job->asyncJob == QEMU_ASYNC_JOB_SAVE && reason == VIR_DOMAIN_PAUSED_SAVE) || + (job->asyncJob == QEMU_ASYNC_JOB_SNAPSHOT && + reason == VIR_DOMAIN_PAUSED_SNAPSHOT) || reason == VIR_DOMAIN_PAUSED_UNKNOWN)) { if (qemuProcessStartCPUs(driver, vm, conn, VIR_DOMAIN_RUNNING_UNPAUSED, @@ -3046,25 +3049,6 @@ qemuProcessRecoverJob(struct qemud_driver *driver, } break; - case QEMU_ASYNC_JOB_SNAPSHOT: - qemuDomainObjEnterMonitor(driver, vm); - ignore_value(qemuMonitorMigrateCancel(priv->mon)); - qemuDomainObjExitMonitor(driver, vm); - /* resume the domain but only if it was paused as a result of - * creating the snapshot. 
*/ - if (state == VIR_DOMAIN_PAUSED && - ((job->asyncJob == QEMU_ASYNC_JOB_SNAPSHOT && - reason == VIR_DOMAIN_PAUSED_MIGRATION) || - reason == VIR_DOMAIN_PAUSED_UNKNOWN)) { - if (qemuProcessStartCPUs(driver, vm, conn, - VIR_DOMAIN_RUNNING_UNPAUSED, - QEMU_ASYNC_JOB_NONE) < 0) { - VIR_WARN("Could not resume domain %s after snapshot", - vm->def->name); - } - } - break; - case QEMU_ASYNC_JOB_NONE: case QEMU_ASYNC_JOB_LAST: break; -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
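The predicate in the squashed version pairs each interrupted migration-to-file job with the pause reason that job would have set; a stand-alone sketch of that condition (with simplified stand-in enum names, not the real libvirt identifiers):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for qemuDomainAsyncJob and the paused-reason enum. */
typedef enum { JOB_SAVE, JOB_DUMP, JOB_SNAPSHOT } async_job;
typedef enum { PAUSED_SAVE, PAUSED_DUMP, PAUSED_SNAPSHOT,
               PAUSED_USER, PAUSED_UNKNOWN } paused_reason;

/* During job recovery, resume the guest CPUs only when the pause was
 * caused by the job being recovered (or the reason is unknown); a guest
 * paused for any other reason, e.g. by the user, must stay paused. */
static bool should_resume(async_job job, paused_reason reason)
{
    return (job == JOB_DUMP && reason == PAUSED_DUMP) ||
           (job == JOB_SAVE && reason == PAUSED_SAVE) ||
           (job == JOB_SNAPSHOT && reason == PAUSED_SNAPSHOT) ||
           reason == PAUSED_UNKNOWN;
}
```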

On 11/02/12 18:21, Eric Blake wrote:
On 11/01/2012 10:22 AM, Peter Krempa wrote:
The new external system checkpoints will require an async job while the snapshot is taken. This patch adds QEMU_ASYNC_JOB_SNAPSHOT to track this job type. --- src/qemu/qemu_domain.c | 3 +++ src/qemu/qemu_domain.h | 1 + src/qemu/qemu_process.c | 19 +++++++++++++++++++ 3 files changed, 23 insertions(+)
+++ b/src/qemu/qemu_process.c @@ -3046,6 +3046,25 @@ qemuProcessRecoverJob(struct qemud_driver *driver, } break;
+ case QEMU_ASYNC_JOB_SNAPSHOT: + qemuDomainObjEnterMonitor(driver, vm); + ignore_value(qemuMonitorMigrateCancel(priv->mon)); + qemuDomainObjExitMonitor(driver, vm); + /* resume the domain but only if it was paused as a result of + * creating the snapshot. */ + if (state == VIR_DOMAIN_PAUSED && + ((job->asyncJob == QEMU_ASYNC_JOB_SNAPSHOT &&
In this particular case branch, job->asyncJob is always equal to QEMU_ASYNC_JOB_SNAPSHOT.
+ reason == VIR_DOMAIN_PAUSED_MIGRATION) ||
Don't you mean VIR_DOMAIN_PAUSED_SNAPSHOT here?
Yes, happens way too often when cutting&pasting :(
I think you can simplify the code by squashing this in for less redundant code. ACK with this change:
The simplification makes sense. It's possible we will need to change the code back to a separate branch once we implement a more elaborate async job recovery for snapshots.

Before now, libvirt supported only internal snapshots for active guests. This patch renames this function to qemuDomainSnapshotCreateActiveInternal to prepare the grounds for external active snapshots. --- src/qemu/qemu_driver.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 377439a..7dedb96 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10645,13 +10645,14 @@ qemuDomainSnapshotCreateInactive(struct qemud_driver *driver, return qemuDomainSnapshotForEachQcow2(driver, vm, snap, "-c", false); } + /* The domain is expected to be locked and active. */ static int -qemuDomainSnapshotCreateActive(virConnectPtr conn, - struct qemud_driver *driver, - virDomainObjPtr *vmptr, - virDomainSnapshotObjPtr snap, - unsigned int flags) +qemuDomainSnapshotCreateActiveInternal(virConnectPtr conn, + struct qemud_driver *driver, + virDomainObjPtr *vmptr, + virDomainSnapshotObjPtr snap, + unsigned int flags) { virDomainObjPtr vm = *vmptr; qemuDomainObjPrivatePtr priv = vm->privateData; @@ -11440,8 +11441,8 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, if (qemuDomainSnapshotCreateInactive(driver, vm, snap) < 0) goto cleanup; } else { - if (qemuDomainSnapshotCreateActive(domain->conn, driver, - &vm, snap, flags) < 0) + if (qemuDomainSnapshotCreateActiveInternal(domain->conn, driver, + &vm, snap, flags) < 0) goto cleanup; } -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
Before now, libvirt supported only internal snapshots for active guests. This patch renames this function to qemuDomainSnapshotCreateActiveInternal to prepare the grounds for external active snapshots. --- src/qemu/qemu_driver.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
ACK. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

When pausing the guest while migration is running (to speed up convergence) the virDomainSuspend API checks if the migration job is active before entering the job. This could cause a possible race if the virDomainSuspend is called while the job is active but ends before the Suspend API enters the job (this would require that the migration is aborted). This would cause a incorrect event to be emitted. --- src/qemu/qemu_driver.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 7dedb96..98dc441 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1658,14 +1658,6 @@ static int qemudDomainSuspend(virDomainPtr dom) { priv = vm->privateData; - if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) { - reason = VIR_DOMAIN_PAUSED_MIGRATION; - eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED; - } else { - reason = VIR_DOMAIN_PAUSED_USER; - eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED; - } - if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_SUSPEND) < 0) goto cleanup; @@ -1675,6 +1667,14 @@ static int qemudDomainSuspend(virDomainPtr dom) { goto endjob; } + if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) { + reason = VIR_DOMAIN_PAUSED_MIGRATION; + eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED; + } else { + reason = VIR_DOMAIN_PAUSED_USER; + eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED; + } + state = virDomainObjGetState(vm, NULL); if (state == VIR_DOMAIN_PMSUSPENDED) { virReportError(VIR_ERR_OPERATION_INVALID, -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
When pausing the guest while migration is running (to speed up convergence), the virDomainSuspend API checks whether the migration job is active before entering the job. This could cause a race if virDomainSuspend is called while the job is active but the job ends before the Suspend API enters it (this would require that the migration is aborted). An incorrect event would then be emitted.
Not just aborted, but also completed normally. At any rate, I agree that we shouldn't be relying on the state of job.asyncJob until we have obtained the sync job.
--- src/qemu/qemu_driver.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 7dedb96..98dc441 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1658,14 +1658,6 @@ static int qemudDomainSuspend(virDomainPtr dom) {
priv = vm->privateData;
- if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) { - reason = VIR_DOMAIN_PAUSED_MIGRATION; - eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED; - } else { - reason = VIR_DOMAIN_PAUSED_USER; - eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED; - } - if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_SUSPEND) < 0) goto cleanup;
@@ -1675,6 +1667,14 @@ static int qemudDomainSuspend(virDomainPtr dom) { goto endjob; }
+ if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) { + reason = VIR_DOMAIN_PAUSED_MIGRATION; + eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED; + } else { + reason = VIR_DOMAIN_PAUSED_USER; + eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED; + } +
I think we're missing something. The idea here is that if the user starts an async job, then they can request pause but not resume until that async job has completed. How do we make sure resume is blocked? By setting reason to something other than VIR_DOMAIN_PAUSED_USER. The existing code worked for outgoing live migration by setting up the async job, then, if the user requested a pause, setting things to the same state as if the user had requested paused migration in the first place. So in a similar vein - if the user requests a live snapshot, but then changes their mind and pauses, we should set things to the same state as if they had requested a paused snapshot in the first place. [time passes, as well as an IRC chat...] Oh, you did exactly that in patch 10/20! In that case, this code is fixing a pre-existing bug, in preparation for the work in patch 10. ACK as-is. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
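The ordering bug the patch fixes can be condensed into a small standalone sketch. The enum and function names below are illustrative, not the real libvirt API; the point is only that the pause reason must be derived from the async job state after the SUSPEND job has been acquired, because the migration job may end (abort or complete) in between:

```c
#include <assert.h>

enum async_job { JOB_NONE, JOB_MIGRATION_OUT };
enum pause_reason { PAUSED_USER, PAUSED_MIGRATION };

static enum pause_reason pick_reason(enum async_job job)
{
    return job == JOB_MIGRATION_OUT ? PAUSED_MIGRATION : PAUSED_USER;
}

/* Buggy ordering: the reason is chosen from a pre-lock snapshot of the
 * async job state, which may be stale by the time the job is entered. */
static enum pause_reason suspend_buggy(enum async_job job_before_lock,
                                       enum async_job job_after_lock)
{
    enum pause_reason reason = pick_reason(job_before_lock); /* stale read */
    (void)job_after_lock; /* QEMU_JOB_SUSPEND would be acquired here */
    return reason;
}

/* Fixed ordering: the async job state is examined only once the
 * SUSPEND job is held, so the read is consistent. */
static enum pause_reason suspend_fixed(enum async_job job_before_lock,
                                       enum async_job job_after_lock)
{
    (void)job_before_lock;              /* not consulted before the job is held */
    return pick_reason(job_after_lock); /* read under the job */
}
```

If the migration finishes between the early check and the job acquisition, the buggy variant still reports a migration pause while the fixed variant correctly reports a user pause.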

On 11/02/12 18:59, Eric Blake wrote:
On 11/01/2012 10:22 AM, Peter Krempa wrote:
When pausing the guest while migration is running (to speed up convergence), the virDomainSuspend API checks whether the migration job is active before entering the job. This could cause a race if virDomainSuspend is called while the job is active but the job ends before the Suspend API enters it (this would require that the migration is aborted). An incorrect event would then be emitted.
Not just aborted, but also completed normally. At any rate, I agree that we shouldn't be relying on the state of job.asyncJob until we have obtained the sync job.
--- src/qemu/qemu_driver.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
[...]
In that case, this code is fixing a pre-existing bug, in preparation for the work in patch 10. ACK as-is.
Thanks. I pushed this one. I should have ordered it at the beginning of the series to avoid confusion. Peter

This patch adds support to take external system checkpoints. The functionality is layered on top of the previous disk-only snapshot code. When the checkpoint is requested the domain memory is saved to the memory image file using migration to file. (The user may specify to do take the memory image while the guest is live with the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag.) The memory save image shares format with the image created by virDomainSave() API. --- This version adds support for pausing the guest while the snapshot is being taken. If this happens while the migration job is running the machine is unpaused after the snapshot is done. (This should mirror the behavior of live migration). As there isn't any support for aborting other jobs than migration this should be safe. --- src/qemu/qemu_driver.c | 242 +++++++++++++++++++++++++++++++++---------------- 1 file changed, 162 insertions(+), 80 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 98dc441..21d558e 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1670,6 +1670,9 @@ static int qemudDomainSuspend(virDomainPtr dom) { if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) { reason = VIR_DOMAIN_PAUSED_MIGRATION; eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED; + } else if (priv->job.asyncJob == QEMU_ASYNC_JOB_SNAPSHOT) { + reason = VIR_DOMAIN_PAUSED_SNAPSHOT; + eventDetail = -1; /* don't create lifecycle events when doing snapshot */ } else { reason = VIR_DOMAIN_PAUSED_USER; eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED; @@ -1684,9 +1687,12 @@ static int qemudDomainSuspend(virDomainPtr dom) { if (qemuProcessStopCPUs(driver, vm, reason, QEMU_ASYNC_JOB_NONE) < 0) { goto endjob; } - event = virDomainEventNewFromObj(vm, - VIR_DOMAIN_EVENT_SUSPENDED, - eventDetail); + + if (eventDetail >= 0) { + event = virDomainEventNewFromObj(vm, + VIR_DOMAIN_EVENT_SUSPENDED, + eventDetail); + } } if (virDomainSaveStatus(driver->caps, driver->stateDir, vm) < 0) goto endjob; @@ 
-11000,31 +11006,25 @@ cleanup: /* The domain is expected to be locked and active. */ static int -qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, - struct qemud_driver *driver, - virDomainObjPtr *vmptr, +qemuDomainSnapshotCreateDiskActive(struct qemud_driver *driver, + virDomainObjPtr vm, virDomainSnapshotObjPtr snap, - unsigned int flags) + unsigned int flags, + enum qemuDomainAsyncJob asyncJob) { - virDomainObjPtr vm = *vmptr; qemuDomainObjPrivatePtr priv = vm->privateData; virJSONValuePtr actions = NULL; - bool resume = false; int ret = -1; int i; bool persist = false; int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */ - bool atomic = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC) != 0; bool reuse = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0; virCgroupPtr cgroup = NULL; - if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0) - return -1; - if (!virDomainObjIsActive(vm)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("domain is not running")); - goto endjob; + goto cleanup; } if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && @@ -11032,7 +11032,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, virReportError(VIR_ERR_INTERNAL_ERROR, _("Unable to find cgroup for %s"), vm->def->name); - goto endjob; + goto cleanup; } /* 'cgroup' is still NULL if cgroups are disabled. */ @@ -11044,34 +11044,14 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, if (qemuDomainSnapshotFSFreeze(driver, vm) < 0) { /* helper reported the error */ thaw = -1; - goto endjob; + goto cleanup; } else { thaw = 1; } } - /* For multiple disks, libvirt must pause externally to get all - * snapshots to be at the same point in time, unless qemu supports - * transactions. For a single disk, snapshot is atomic without - * requiring a pause. Thanks to qemuDomainSnapshotPrepare, if - * we got to this point, the atomic flag now says whether we need - * to pause, and a capability bit says whether to use transaction. 
- */ - if (!atomic && virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { - if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_SAVE, - QEMU_ASYNC_JOB_NONE) < 0) - goto cleanup; - - resume = true; - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("guest unexpectedly quit")); - goto cleanup; - } - } if (qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION)) { - actions = virJSONValueNewArray(); - if (!actions) { + if (!(actions = virJSONValueNewArray())) { virReportOOMError(); goto cleanup; } @@ -11082,7 +11062,9 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, * Based on earlier qemuDomainSnapshotPrepare, all * disks in this list are now either SNAPSHOT_NO, or * SNAPSHOT_EXTERNAL with a valid file name and qcow2 format. */ - qemuDomainObjEnterMonitorWithDriver(driver, vm); + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + goto cleanup; + for (i = 0; i < snap->def->ndisks; i++) { virDomainDiskDefPtr persistDisk = NULL; @@ -11136,9 +11118,104 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, } } qemuDomainObjExitMonitorWithDriver(driver, vm); - if (ret < 0) + +cleanup: + virCgroupFree(&cgroup); + + if (ret == 0 || !qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION)) { + if (virDomainSaveStatus(driver->caps, driver->stateDir, vm) < 0 || + (persist && virDomainSaveConfig(driver->configDir, vm->newDef) < 0)) + ret = -1; + } + + if (thaw != 0 && + qemuDomainSnapshotFSThaw(driver, vm, thaw > 0) < 0) { + /* helper reported the error, if it was needed */ + if (thaw > 0) + ret = -1; + } + + return ret; +} + + +static int +qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn, + struct qemud_driver *driver, + virDomainObjPtr *vmptr, + virDomainSnapshotObjPtr snap, + unsigned int flags) +{ + bool resume = false; + int ret = -1; + virDomainObjPtr vm = *vmptr; + qemuDomainObjPrivatePtr priv = vm->privateData; + char *xml = NULL; + bool memory = snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; + bool 
atomic = !!(flags & VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC); + bool transaction = qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION); + + if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm, + QEMU_ASYNC_JOB_SNAPSHOT) < 0) goto cleanup; + /* we need to resume the guest only if it was previously running */ + if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { + resume = true; + + /* For multiple disks, libvirt must pause externally to get all + * snapshots to be at the same point in time, unless qemu supports + * transactions. For a single disk, snapshot is atomic without + * requiring a pause. Thanks to qemuDomainSnapshotPrepare, if + * we got to this point, the atomic flag now says whether we need + * to pause, and a capability bit says whether to use transaction. + */ + if ((memory && !(flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE)) || + (atomic && !transaction)) { + if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_SNAPSHOT, + QEMU_ASYNC_JOB_SNAPSHOT) < 0) + goto endjob; + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest unexpectedly quit")); + goto endjob; + } + } + } + + /* do the memory snapshot if necessary */ + if (memory) { + /* allow the migration job to be cancelled or the domain to be paused */ + qemuDomainObjSetAsyncJobMask(vm, DEFAULT_JOB_MASK | + JOB_MASK(QEMU_JOB_SUSPEND) | + JOB_MASK(QEMU_JOB_MIGRATION_OP)); + + if (!(xml = qemuDomainDefFormatLive(driver, vm->def, true, false))) + goto endjob; + + if ((ret = qemuDomainSaveMemory(driver, vm, snap->def->file, + xml, QEMUD_SAVE_FORMAT_RAW, + resume, 0, + QEMU_ASYNC_JOB_SNAPSHOT)) < 0) + goto endjob; + + /* forbid any further manipulation */ + qemuDomainObjSetAsyncJobMask(vm, DEFAULT_JOB_MASK); + } + + /* now the domain is now paused if: + * - if a memory snapshot was requested + * - an atomic snapshot was requested AND + * qemu does not support transactions + * + * Next we snapshot the disks. 
+ */ + if ((ret = qemuDomainSnapshotCreateDiskActive(driver, vm, snap, flags, + QEMU_ASYNC_JOB_SNAPSHOT)) < 0) + goto endjob; + + /* the snapshot is complete now */ if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_HALT) { virDomainEventPtr event; @@ -11148,53 +11225,41 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, virDomainAuditStop(vm, "from-snapshot"); /* We already filtered the _HALT flag for persistent domains * only, so this end job never drops the last reference. */ - ignore_value(qemuDomainObjEndJob(driver, vm)); + ignore_value(qemuDomainObjEndAsyncJob(driver, vm)); resume = false; - thaw = 0; vm = NULL; if (event) qemuDomainEventQueue(driver, event); } -cleanup: - if (resume && virDomainObjIsActive(vm)) { - if (qemuProcessStartCPUs(driver, vm, conn, - VIR_DOMAIN_RUNNING_UNPAUSED, - QEMU_ASYNC_JOB_NONE) < 0 && - virGetLastError() == NULL) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("resuming after snapshot failed")); - goto endjob; - } - } - - if (vm && (ret == 0 || - !qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION))) { - if (virDomainSaveStatus(driver->caps, driver->stateDir, vm) < 0 || - (persist && - virDomainSaveConfig(driver->configDir, vm->newDef) < 0)) - ret = -1; - } + ret = 0; endjob: - if (cgroup) - virCgroupFree(&cgroup); - if (vm && thaw != 0 && - qemuDomainSnapshotFSThaw(driver, vm, thaw > 0) < 0) { - /* helper reported the error, if it was needed */ - if (thaw > 0) - ret = -1; - } - if (vm && (qemuDomainObjEndJob(driver, vm) == 0)) { + if (vm && !qemuDomainObjEndAsyncJob(driver, vm)) { /* Only possible if a transient vm quit while our locks were down, - * in which case we don't want to save snapshot metadata. */ + * in which case we don't want to save snapshot metadata. 
+ */ *vmptr = NULL; ret = -1; } +cleanup: + VIR_FREE(xml); + if (resume && vm && virDomainObjIsActive(vm) && + qemuProcessStartCPUs(driver, vm, conn, + VIR_DOMAIN_RUNNING_UNPAUSED, + QEMU_ASYNC_JOB_NONE) < 0 && + virGetLastError() == NULL) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("resuming after snapshot failed")); + + return -1; + } + return ret; } + static virDomainSnapshotPtr qemuDomainSnapshotCreateXML(virDomainPtr domain, const char *xmlDesc, @@ -11220,7 +11285,8 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY | VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT | VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE | - VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC, NULL); + VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC | + VIR_DOMAIN_SNAPSHOT_CREATE_LIVE, NULL); if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE) && !(flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)) { @@ -11428,21 +11494,37 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } } + /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */ + if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE) && + (!virDomainObjIsActive(vm) || + snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("live snapshot creation is supported only " + "with external checkpoints")); + goto cleanup; + } + /* actually do the snapshot */ if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE) { /* XXX Should we validate that the redefined snapshot even * makes sense, such as checking that qemu-img recognizes the * snapshot name in at least one of the domain's disks? 
*/ - } else if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { - if (qemuDomainSnapshotCreateDiskActive(domain->conn, driver, - &vm, snap, flags) < 0) - goto cleanup; - } else if (!virDomainObjIsActive(vm)) { - if (qemuDomainSnapshotCreateInactive(driver, vm, snap) < 0) - goto cleanup; + } else if (virDomainObjIsActive(vm)) { + if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY || + snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + /* external checkpoint or disk snapshot */ + if (qemuDomainSnapshotCreateActiveExternal(domain->conn, driver, + &vm, snap, flags) < 0) + goto cleanup; + } else { + /* internal checkpoint */ + if (qemuDomainSnapshotCreateActiveInternal(domain->conn, driver, + &vm, snap, flags) < 0) + goto cleanup; + } } else { - if (qemuDomainSnapshotCreateActiveInternal(domain->conn, driver, - &vm, snap, flags) < 0) + /* inactive */ + if (qemuDomainSnapshotCreateInactive(driver, vm, snap) < 0) goto cleanup; } -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds support to take external system checkpoints.
The functionality is layered on top of the previous disk-only snapshot code. When the checkpoint is requested the domain memory is saved to the memory image file using migration to file. (The user may specify to do take the memory image while the guest is live with the
s/do //
VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag.)
The memory save image shares format with the image created by virDomainSave() API. --- This version adds support for pausing the guest while the snapshot is being taken. If this happens while the migration job is running the machine is unpaused after the snapshot is done. (This should mirror the behavior of live migration). As there isn't any support for aborting other jobs than migration this should be safe. --- src/qemu/qemu_driver.c | 242 +++++++++++++++++++++++++++++++++---------------- 1 file changed, 162 insertions(+), 80 deletions(-)
+qemuDomainSnapshotCreateDiskActive(struct qemud_driver *driver, + virDomainObjPtr vm, virDomainSnapshotObjPtr snap, - unsigned int flags) + unsigned int flags, + enum qemuDomainAsyncJob asyncJob) { - virDomainObjPtr vm = *vmptr; qemuDomainObjPrivatePtr priv = vm->privateData; virJSONValuePtr actions = NULL; - bool resume = false; int ret = -1; int i; bool persist = false; int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */
Ouch. You left the quiesce logic in qemuDomainSnapshotCreateDiskActive, but changed the caller so that this point can be reached after the domain has already been paused. We already reject _QUIESCE without _DISK_ONLY, so we know there is no memory to worry about, but the quiesce must occur before pausing things. That means you need to move the quiesce and thaw logic out of CreateDiskActive and into CreateActiveExternal.
+static int +qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn, + struct qemud_driver *driver, + virDomainObjPtr *vmptr, + virDomainSnapshotObjPtr snap, + unsigned int flags) +{ + /* we need to resume the guest only if it was previously running */ + if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { + resume = true; + + /* For multiple disks, libvirt must pause externally to get all + * snapshots to be at the same point in time, unless qemu supports + * transactions. For a single disk, snapshot is atomic without + * requiring a pause. Thanks to qemuDomainSnapshotPrepare, if + * we got to this point, the atomic flag now says whether we need + * to pause, and a capability bit says whether to use transaction. + */ + if ((memory && !(flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE)) || + (atomic && !transaction)) {
Hmm, I'm thinking I still got this slightly wrong: if memory is specified, then we are guaranteed that the guest will pause before the disk side of things, even if we support transaction. The only time we can get away without pausing the guest is for disk-only snapshots with transaction support. But as written, this would end up pausing if memory and _LIVE are specified but we lack transaction, which undoes the point of _LIVE. So we need a !memory conjunct in the second half of the ||.
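The corrected "must libvirt pause the guest up front?" predicate can be written out as a truth-table-checkable helper (a sketch with illustrative names, not the in-tree code): a memory checkpoint pauses via libvirt only when _LIVE was not requested, and a disk-only snapshot needs a pause only when atomicity is required and qemu lacks the 'transaction' command:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the corrected pause condition:
 *   (memory && !live) || (!memory && atomic && !transaction)
 * The !memory conjunct keeps a _LIVE external checkpoint live even
 * when qemu lacks transaction support. */
static bool need_pause(bool memory, bool live, bool atomic, bool transaction)
{
    return (memory && !live) || (!memory && atomic && !transaction);
}
```

Without the !memory conjunct, the (memory, live, atomic, no-transaction) case would pause the guest and defeat _LIVE.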
+ + /* now the domain is now paused if: + * - if a memory snapshot was requested + * - an atomic snapshot was requested AND + * qemu does not support transactions + * + * Next we snapshot the disks. + */ + if ((ret = qemuDomainSnapshotCreateDiskActive(driver, vm, snap, flags, + QEMU_ASYNC_JOB_SNAPSHOT)) < 0) + goto endjob;
@@ -11428,21 +11494,37 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } }
+ /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */ + if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE) && + (!virDomainObjIsActive(vm) || + snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("live snapshot creation is supported only " + "with external checkpoints")); + goto cleanup; + }
More on why you can't mix _QUIESCE with _LIVE external memory snapshots: we don't control when the guest will finally converge and pause, so we can't trigger the guest agent to quiesce at the right time. We don't want to quiesce up front, as that defeats the purpose of a live snapshot, if the guest can't do any I/O for the entire live migration-to-file. So here, we should error out if _QUIESCE and _LIVE are both given; or even stronger - error out if both _QUIESCE and external memory are requested (it is always easier to relax later if someone can prove it would be useful, than it is to provide it now and wish we hadn't). On the bright side, this forbidden combination is no real loss: if you are snapshotting memory, then you don't need to worry about inconsistent disk state, since you have the memory that goes along with any in-flight I/O at the time of the snapshot - quiescing only makes sense if you are going to save disk state without in-flight I/O to go along with it. Looking at the code, we already forbid _QUIESCE if _DISK_ONLY is not present; and re-reading my table in commit 4201a7ea, all that remains is to reject the combination of DISK_ONLY and a memory snapshot request. Likewise, _LIVE and _REDEFINE are incompatible. Hmm, I see that src/libvirt.c already filters out some impossible flag combinations, so maybe my proposals below should be moved. I'm playing with squashing this in, but I also need to fix the quiesce outside pause issue before I can give ACK, if you can beat me to it. 
diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index d5cfdcc..35c8670 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -11498,14 +11498,23 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */ - if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE) && + if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE && (!virDomainObjIsActive(vm) || - snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) { + snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL || + flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)) { virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("live snapshot creation is supported only " "with external checkpoints")); goto cleanup; } + if ((snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL || + snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL) && + flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("disk-only snapshot creation is not compatible with " + "memory snapshot")); + goto cleanup; + } /* actually do the snapshot */ if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE) { -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
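The flag-combination rules from the squash above can be condensed into a small validation helper. This is only an illustrative sketch (the real checks live inline in qemuDomainSnapshotCreateXML and operate on the actual flag bits): _LIVE requires an active domain with an external memory snapshot and no _REDEFINE, while _DISK_ONLY is incompatible with any memory snapshot; the return value mimics the libvirt convention of 0 on success and -1 on error.

```c
#include <assert.h>
#include <stdbool.h>

enum mem_loc { LOC_NONE, LOC_INTERNAL, LOC_EXTERNAL };

/* Hypothetical condensation of the flag checks in the squash above. */
static int check_snapshot_flags(bool active, enum mem_loc memory,
                                bool live, bool disk_only, bool redefine)
{
    if (live && (!active || memory != LOC_EXTERNAL || redefine))
        return -1; /* "live snapshot ... only with external checkpoints" */
    if (disk_only && memory != LOC_NONE)
        return -1; /* "disk-only ... not compatible with memory snapshot" */
    return 0;
}
```

The forbidden combinations line up with the reasoning above: a memory snapshot already captures in-flight I/O, so there is nothing for a disk-only or quiesced variant to add.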

On 11/02/2012 04:28 PM, Eric Blake wrote:
On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds support to take external system checkpoints.
The functionality is layered on top of the previous disk-only snapshot code. When the checkpoint is requested the domain memory is saved to the memory image file using migration to file. (The user may specify to do take the memory image while the guest is live with the
s/do //
VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag.)
The memory save image shares format with the image created by virDomainSave() API. ---
Ouch. You left the quiesce logic in qemuDomainSnapshotCreateDiskActive, but changed the caller so that this point can be reached after the domain has already been paused. We already reject _QUIESCE without _DISK_ONLY, so we know there is no memory to worry about, but the quiesce must occur before pausing things.
That means you need to move the quiesce and thaw logic out of CreateDiskActive and into CreateActiveExternal.
I'm playing with squashing this in, but I also need to fix the quiesce outside pause issue before I can give ACK, if you can beat me to it.
Here's what I ended up with; I can ACK your patch plus this squashed in, although it wouldn't hurt to do another round of reviews and/or testing. diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index d5cfdcc..2a9e09b 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -11020,7 +11020,6 @@ qemuDomainSnapshotCreateDiskActive(struct qemud_driver *driver, int ret = -1; int i; bool persist = false; - int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */ bool reuse = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0; virCgroupPtr cgroup = NULL; @@ -11039,20 +11038,6 @@ qemuDomainSnapshotCreateDiskActive(struct qemud_driver *driver, } /* 'cgroup' is still NULL if cgroups are disabled. */ - /* If quiesce was requested, then issue a freeze command, and a - * counterpart thaw command, no matter what. The command will - * fail if the guest is paused or the guest agent is not - * running. */ - if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE) { - if (qemuDomainSnapshotFSFreeze(driver, vm) < 0) { - /* helper reported the error */ - thaw = -1; - goto cleanup; - } else { - thaw = 1; - } - } - if (qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION)) { if (!(actions = virJSONValueNewArray())) { virReportOOMError(); @@ -11131,13 +11116,6 @@ cleanup: ret = -1; } - if (thaw != 0 && - qemuDomainSnapshotFSThaw(driver, vm, thaw > 0) < 0) { - /* helper reported the error, if it was needed */ - if (thaw > 0) - ret = -1; - } - return ret; } @@ -11157,24 +11135,43 @@ qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn, bool memory = snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; bool atomic = !!(flags & VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC); bool transaction = qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION); + int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */ if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm, QEMU_ASYNC_JOB_SNAPSHOT) < 0) goto cleanup; + /* If quiesce was requested, then issue a freeze command, and a + * 
counterpart thaw command, no matter what. The command will + * fail if the guest is paused or the guest agent is not + * running. */ + if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE) { + if (qemuDomainSnapshotFSFreeze(driver, vm) < 0) { + /* helper reported the error */ + thaw = -1; + goto endjob; + } else { + thaw = 1; + } + } + /* we need to resume the guest only if it was previously running */ if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { resume = true; - /* For multiple disks, libvirt must pause externally to get all - * snapshots to be at the same point in time, unless qemu supports - * transactions. For a single disk, snapshot is atomic without - * requiring a pause. Thanks to qemuDomainSnapshotPrepare, if - * we got to this point, the atomic flag now says whether we need - * to pause, and a capability bit says whether to use transaction. + /* For external checkpoints (those with memory), the guest + * must pause (either by libvirt up front, or by qemu after + * _LIVE converges). For disk-only snapshots with multiple + * disks, libvirt must pause externally to get all snapshots + * to be at the same point in time, unless qemu supports + * transactions. For a single disk, snapshot is atomic + * without requiring a pause. Thanks to + * qemuDomainSnapshotPrepare, if we got to this point, the + * atomic flag now says whether we need to pause, and a + * capability bit says whether to use transaction. */ if ((memory && !(flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE)) || - (atomic && !transaction)) { + (!memory && atomic && !transaction)) { if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_SNAPSHOT, QEMU_ASYNC_JOB_SNAPSHOT) < 0) goto endjob; @@ -11230,6 +11227,7 @@ qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn, * only, so this end job never drops the last reference. 
*/ ignore_value(qemuDomainObjEndAsyncJob(driver, vm)); resume = false; + thaw = 0; vm = NULL; if (event) qemuDomainEventQueue(driver, event); @@ -11238,16 +11236,6 @@ qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn, ret = 0; endjob: - if (vm && !qemuDomainObjEndAsyncJob(driver, vm)) { - /* Only possible if a transient vm quit while our locks were down, - * in which case we don't want to save snapshot metadata. - */ - *vmptr = NULL; - ret = -1; - } - -cleanup: - VIR_FREE(xml); if (resume && vm && virDomainObjIsActive(vm) && qemuProcessStartCPUs(driver, vm, conn, VIR_DOMAIN_RUNNING_UNPAUSED, @@ -11258,6 +11246,22 @@ cleanup: return -1; } + if (vm && thaw != 0 && + qemuDomainSnapshotFSThaw(driver, vm, thaw > 0) < 0) { + /* helper reported the error, if it was needed */ + if (thaw > 0) + ret = -1; + } + if (vm && !qemuDomainObjEndAsyncJob(driver, vm)) { + /* Only possible if a transient vm quit while our locks were down, + * in which case we don't want to save snapshot metadata. + */ + *vmptr = NULL; + ret = -1; + } + +cleanup: + VIR_FREE(xml); return ret; } @@ -11498,14 +11502,23 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */ - if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE) && + if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE && (!virDomainObjIsActive(vm) || - snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) { + snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL || + flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)) { virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("live snapshot creation is supported only " "with external checkpoints")); goto cleanup; } + if ((snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL || + snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL) && + flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("disk-only snapshot creation is not compatible with " + "memory 
snapshot")); + goto cleanup; + } /* actually do the snapshot */ if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE) { -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Some of the pre-snapshot check have restrictions wired in regarding configuration options that influence taking of external checkpoints. This patch removes restrictions that would inhibit taking of such a snapshot. --- src/qemu/qemu_driver.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 21d558e..53f6e4b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10789,14 +10789,6 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, break; case VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL: - if (def->state != VIR_DOMAIN_DISK_SNAPSHOT) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("system checkpoint external snapshot for " - "disk %s not implemented yet"), - disk->name); - goto cleanup; - } - if (!disk->format) { disk->format = VIR_STORAGE_FILE_QCOW2; } else if (disk->format != VIR_STORAGE_FILE_QCOW2 && @@ -10842,12 +10834,15 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, } } - if (!found) { + /* external snapshot is possible without specifying a disk to snapshot */ + if (!found && + !(def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("snapshots require at least one disk to be " - "selected for snapshot")); + _("internal and disk-only snapshots require at least " + "one disk to be selected for snapshot")); goto cleanup; } + if (def->state != VIR_DOMAIN_DISK_SNAPSHOT && active) { if (external == 1 || qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION)) { @@ -11429,7 +11424,8 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, snap = other; } if (def->dom) { - if (def->state == VIR_DOMAIN_DISK_SNAPSHOT) { + if (def->state == VIR_DOMAIN_DISK_SNAPSHOT || + def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; align_match = false; } @@ -11459,9 +11455,10 @@ qemuDomainSnapshotCreateXML(virDomainPtr 
domain, def->memory = VIR_DOMAIN_SNAPSHOT_LOCATION_NONE; } else { def->state = virDomainObjGetState(vm, NULL); - def->memory = (def->state == VIR_DOMAIN_SHUTOFF ? - VIR_DOMAIN_SNAPSHOT_LOCATION_NONE : - VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL); + if (def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + align_match = false; + align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; + } } if (virDomainSnapshotAlignDisks(def, align_location, align_match) < 0 || -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
Some of the pre-snapshot checks have wired-in restrictions regarding configuration options that influence taking external checkpoints.
This patch removes the restrictions that would inhibit taking such a snapshot. --- src/qemu/qemu_driver.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-)
- if (!found) { + /* external snapshot is possible without specifying a disk to snapshot */ + if (!found && + !(def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) {
'!(a == b)' is awkward; write it as 'a != b'
@@ -11459,9 +11455,10 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, def->memory = VIR_DOMAIN_SNAPSHOT_LOCATION_NONE; } else { def->state = virDomainObjGetState(vm, NULL); - def->memory = (def->state == VIR_DOMAIN_SHUTOFF ? - VIR_DOMAIN_SNAPSHOT_LOCATION_NONE : - VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL); + if (def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + align_match = false; + align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; + }
This hunk is removing too much. ACK with this squashed in: diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index ce77b27..ef9771e 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -10839,7 +10839,7 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def, /* external snapshot is possible without specifying a disk to snapshot */ if (!found && - !(def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) { + def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("internal and disk-only snapshots require at least " "one disk to be selected for snapshot")); @@ -11460,12 +11460,15 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, align_match = false; def->state = VIR_DOMAIN_DISK_SNAPSHOT; def->memory = VIR_DOMAIN_SNAPSHOT_LOCATION_NONE; + } else if (def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + def->state = virDomainObjGetState(vm, NULL); + align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; + align_match = false; } else { def->state = virDomainObjGetState(vm, NULL); - if (def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { - align_match = false; - align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; - } + def->memory = (def->state == VIR_DOMAIN_SHUTOFF ? + VIR_DOMAIN_SNAPSHOT_LOCATION_NONE : + VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL); } if (virDomainSnapshotAlignDisks(def, align_location, align_match) < 0 || -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/03/12 00:35, Eric Blake wrote:
On 11/01/2012 10:22 AM, Peter Krempa wrote:
Some of the pre-snapshot checks have wired-in restrictions regarding configuration options that influence taking external checkpoints.
This patch removes the restrictions that would inhibit taking such a snapshot. --- src/qemu/qemu_driver.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-)
Thanks for your comments and reviews. I pushed patches 5-11 as they form a block of functionality that can be separated. I've done some testing (well, except disk quiescing, as I don't have a guest with the agent configured) and it worked as expected. I'll repost the rest of the series after some rebasing and tweaks later this week. Peter

Two other places were left with the old code to look up snapshots. Change them to use the snapshot lookup helper. --- src/qemu/qemu_driver.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 53f6e4b..d9b6c9c 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11938,13 +11938,8 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, goto cleanup; } - snap = virDomainSnapshotFindByName(vm->snapshots, snapshot->name); - if (!snap) { - virReportError(VIR_ERR_NO_DOMAIN_SNAPSHOT, - _("no domain snapshot with matching name '%s'"), - snapshot->name); + if (!(snap = qemuSnapObjFromSnapshot(vm, snapshot))) goto cleanup; - } if (!vm->persistent && snap->def->state != VIR_DOMAIN_RUNNING && @@ -12306,13 +12301,8 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot, goto cleanup; } - snap = virDomainSnapshotFindByName(vm->snapshots, snapshot->name); - if (!snap) { - virReportError(VIR_ERR_NO_DOMAIN_SNAPSHOT, - _("no domain snapshot with matching name '%s'"), - snapshot->name); + if (!(snap = qemuSnapObjFromSnapshot(vm, snapshot))) goto cleanup; - } if (!(flags & VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY)) { if (!(flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY) && -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
Two other places were left with the old code to look up snapshots. Change them to use the snapshot lookup helper. --- src/qemu/qemu_driver.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-)
ACK. You can float this one sooner if you'd like. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The workhorse part of qemuDomainSaveImageStartVM can be reused while loading external snapshots. This patch splits the code out into a new function qemuDomainSaveImageLoad that is free of setting lifecycle events. --- src/qemu/qemu_driver.c | 53 ++++++++++++++++++++++++++++++++++---------------- 1 file changed, 36 insertions(+), 17 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d9b6c9c..04906d4 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4918,19 +4918,19 @@ error: return -1; } -static int ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5) ATTRIBUTE_NONNULL(6) -qemuDomainSaveImageStartVM(virConnectPtr conn, - struct qemud_driver *driver, - virDomainObjPtr vm, - int *fd, - const struct qemud_save_header *header, - const char *path, - bool start_paused) +/* this helper loads the save image into a new qemu process */ +static int +qemuDomainSaveImageLoad(virConnectPtr conn, + struct qemud_driver *driver, + virDomainObjPtr vm, + int *fd, + const struct qemud_save_header *header, + const char *path) + { - int ret = -1; - virDomainEventPtr event; int intermediatefd = -1; virCommandPtr cmd = NULL; + int ret = -1; if (header->version == 2) { const char *prog = qemudSaveCompressionTypeToString(header->compressed); @@ -4938,7 +4938,7 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, virReportError(VIR_ERR_OPERATION_FAILED, _("Invalid compressed save format %d"), header->compressed); - goto out; + goto cleanup; } if (header->compressed != QEMUD_SAVE_FORMAT_RAW) { @@ -4954,7 +4954,7 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, _("Failed to start decompression binary %s"), prog); *fd = intermediatefd; - goto out; + goto cleanup; } } } @@ -4984,6 +4984,30 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, ret = -1; } +cleanup: + virCommandFree(cmd); + if (virSecurityManagerRestoreSavedStateLabel(driver->securityManager, + vm->def, path) < 0) + VIR_WARN("failed to restore save state label on %s", path); + + + return ret; +} + 
+static int ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5) ATTRIBUTE_NONNULL(6) +qemuDomainSaveImageStartVM(virConnectPtr conn, + struct qemud_driver *driver, + virDomainObjPtr vm, + int *fd, + const struct qemud_save_header *header, + const char *path, + bool start_paused) +{ + virDomainEventPtr event; + int ret = -1; + + ret = qemuDomainSaveImageLoad(conn, driver, vm, fd, header, path); + if (ret < 0) { virDomainAuditStart(vm, "restored", false); goto out; @@ -5024,11 +5048,6 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, ret = 0; out: - virCommandFree(cmd); - if (virSecurityManagerRestoreSavedStateLabel(driver->securityManager, - vm->def, path) < 0) - VIR_WARN("failed to restore save state label on %s", path); - return ret; } -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
The workhorse part of qemuDomainSaveImageStartVM can be reused while loading external snapshots. This patch splits the code out into a new function qemuDomainSaveImageLoad that is free of setting lifecycle events. --- src/qemu/qemu_driver.c | 53 ++++++++++++++++++++++++++++++++++---------------- 1 file changed, 36 insertions(+), 17 deletions(-)
ACK -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

The current snapshot revert API supports changing the state of the machine after the snapshot is reverted to either running or paused. This patch adds the ability to revert the domain into the stopped state as well. --- include/libvirt/libvirt.h.in | 1 + src/libvirt.c | 9 +++++---- tools/virsh-snapshot.c | 3 +++ tools/virsh.pod | 15 +++++++++------ 4 files changed, 18 insertions(+), 10 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index fe58c08..ea57d00 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -3872,6 +3872,7 @@ typedef enum { VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING = 1 << 0, /* Run after revert */ VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED = 1 << 1, /* Pause after revert */ VIR_DOMAIN_SNAPSHOT_REVERT_FORCE = 1 << 2, /* Allow risky reverts */ + VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED = 1 << 3, /* Revert into stopped state */ } virDomainSnapshotRevertFlags; /* Revert the domain to a point-in-time snapshot. The diff --git a/src/libvirt.c b/src/libvirt.c index 1da3973..4a4518f 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -18698,11 +18698,12 @@ virDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, goto error; } - if ((flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING) && - (flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) { + if ((!!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING) + + !!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED) + + !!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED)) > 1) { virReportInvalidArg(flags, - _("running and paused flags in %s are mutually exclusive"), - __FUNCTION__); + _("running, paused and stopped flags in %s are " + "mutually exclusive"), __FUNCTION__); goto error; } diff --git a/tools/virsh-snapshot.c b/tools/virsh-snapshot.c index 77364e8..818ef56 100644 --- a/tools/virsh-snapshot.c +++ b/tools/virsh-snapshot.c @@ -1469,6 +1469,7 @@ static const vshCmdOptDef opts_snapshot_revert[] = { {"current", VSH_OT_BOOL, 0, N_("revert to current snapshot")}, {"running", VSH_OT_BOOL, 0, N_("after reverting, 
change state to running")}, {"paused", VSH_OT_BOOL, 0, N_("after reverting, change state to paused")}, + {"stopped", VSH_OT_BOOL, 0, N_("after reverting, change state to stopped")}, {"force", VSH_OT_BOOL, 0, N_("try harder on risky reverts")}, {NULL, 0, 0, NULL} }; @@ -1488,6 +1489,8 @@ cmdDomainSnapshotRevert(vshControl *ctl, const vshCmd *cmd) flags |= VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING; if (vshCommandOptBool(cmd, "paused")) flags |= VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED; + if (vshCommandOptBool(cmd, "stopped")) + flags |= VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED; /* We want virsh snapshot-revert --force to work even when talking * to older servers that did the unsafe revert by default but * reject the flag, so we probe without the flag, and only use it diff --git a/tools/virsh.pod b/tools/virsh.pod index a331e56..7c65443 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -2803,7 +2803,7 @@ Output the name of the parent snapshot, if any, for the given I<snapshot>, or for the current snapshot with I<--current>. =item B<snapshot-revert> I<domain> {I<snapshot> | I<--current>} -[{I<--running> | I<--paused>}] [I<--force>] +[{I<--running> | I<--paused>} | I<--stopped>] [I<--force>] Revert the given domain to the snapshot specified by I<snapshot>, or to the current snapshot with I<--current>. Be aware @@ -2814,11 +2814,14 @@ the original snapshot was taken. Normally, reverting to a snapshot leaves the domain in the state it was at the time the snapshot was created, except that a disk snapshot with -no vm state leaves the domain in an inactive state. Passing either the -I<--running> or I<--paused> flag will perform additional state changes -(such as booting an inactive domain, or pausing a running domain). Since -transient domains cannot be inactive, it is required to use one of these -flags when reverting to a disk snapshot of a transient domain. +no vm state leaves the domain in an inactive state. 
Passing one of the +I<--running>, I<--paused> or I<--stopped> flag will perform additional +state changes such as booting an inactive domain, pausing a running domain +or shutting the domain down after the snapshot is reverted. Since +transient domains cannot be inactive, it is required to use one of +I<--running> or I<--paused> flags when reverting to a disk snapshot of a +transient domain. The I<--stopped> flag cannot be used on snapshots +of transient domains. There are two cases where a snapshot revert involves extra risk, which requires the use of I<--force> to proceed. One is the case of a -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
The current snapshot revert API supports changing the state of the machine after the snapshot is reverted to either running or paused.
This patch adds the ability to revert the domain into the stopped state as well. --- include/libvirt/libvirt.h.in | 1 + src/libvirt.c | 9 +++++---- tools/virsh-snapshot.c | 3 +++ tools/virsh.pod | 15 +++++++++------ 4 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index fe58c08..ea57d00 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -3872,6 +3872,7 @@ typedef enum { VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING = 1 << 0, /* Run after revert */ VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED = 1 << 1, /* Pause after revert */ VIR_DOMAIN_SNAPSHOT_REVERT_FORCE = 1 << 2, /* Allow risky reverts */ + VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED = 1 << 3, /* Revert into stopped state */
Hmm, this might even be the argument I was looking for earlier about whether it makes sense to mix QUIESCE and memory state (still, using QUIESCE only makes sense for non-LIVE checkpoints). If we are going to revert into a stopped state, that means that we are going to be using the disk state without any memory and so no in-flight I/O; if that is to be allowed, we want a way to quiesce then pause the domain then save state, so we can make up our mind whether to restore just the disk state or everything; but it would also mean that the saved ram state needs to flag that it was done while the guest was quiesced, so that the first thing done on a non-stopped revert is to thaw the guest file system. On the other hand, I don't know how many people will revert to just disk state and not also load up the associated RAM state. This flag might not get much use.
+++ b/src/libvirt.c @@ -18698,11 +18698,12 @@ virDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, goto error; }
- if ((flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING) && - (flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) { + if ((!!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING) + + !!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED) + + !!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED)) > 1) { virReportInvalidArg(flags, - _("running and paused flags in %s are mutually exclusive"), - __FUNCTION__); + _("running, paused and stopped flags in %s are " + "mutually exclusive"), __FUNCTION__);
Also needs documentation several lines earlier.
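The `!!` summing idiom in the hunk above generalizes to any number of mutually exclusive flags. A minimal standalone sketch of the pattern (the enum values here are illustrative stand-ins, not the actual libvirt headers):

```c
#include <assert.h>

/* Illustrative values mirroring virDomainSnapshotRevertFlags; the
 * real constants live in include/libvirt/libvirt.h.in. */
enum {
    REVERT_RUNNING = 1 << 0,
    REVERT_PAUSED  = 1 << 1,
    REVERT_FORCE   = 1 << 2,
    REVERT_STOPPED = 1 << 3,
};

/* '!!' normalizes each masked value to 0 or 1, so the sum is the
 * number of mutually exclusive state flags that are set; a result
 * greater than 1 means the caller passed an invalid combination. */
static int exclusive_state_flags(unsigned int flags)
{
    return !!(flags & REVERT_RUNNING) +
           !!(flags & REVERT_PAUSED) +
           !!(flags & REVERT_STOPPED);
}
```

FORCE deliberately stays out of the sum, since it may be combined with any of the three state flags.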
+++ b/tools/virsh.pod @@ -2803,7 +2803,7 @@ Output the name of the parent snapshot, if any, for the given I<snapshot>, or for the current snapshot with I<--current>.
=item B<snapshot-revert> I<domain> {I<snapshot> | I<--current>} -[{I<--running> | I<--paused>}] [I<--force>] +[{I<--running> | I<--paused>} | I<--stopped>] [I<--force>]
Since it is mutually exclusive, this should be: [{I<--running> | I<--paused> | I<--stopped>}] [I<--force>] ACK with the 2 doc changes; we can save decisions about use of quiesce for later. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/02/2012 07:48 PM, Eric Blake wrote:
+++ b/include/libvirt/libvirt.h.in @@ -3872,6 +3872,7 @@ typedef enum { VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING = 1 << 0, /* Run after revert */ VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED = 1 << 1, /* Pause after revert */ VIR_DOMAIN_SNAPSHOT_REVERT_FORCE = 1 << 2, /* Allow risky reverts */ + VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED = 1 << 3, /* Revert into stopped state */
Hmm, this might even be the argument I was looking for earlier about whether it makes sense to mix QUIESCE and memory state (still, using QUIESCE only makes sense for non-LIVE checkpoints). If we are going to revert into a stopped state, that means that we are going to be using the disk state without any memory and so no in-flight I/O; if that is to be allowed, we want a way to quiesce then pause the domain then save state, so we can make up our mind whether to restore just the disk state or everything; but it would also mean that the saved ram state needs to flag that it was done while the guest was quiesced, so that the first thing done on a non-stopped revert is to thaw the guest file system.
On the other hand, I don't know how many people will revert to just disk state and not also load up the associated RAM state. This flag might not get much use.
I thought about it a bit more; this flag is useful after all for reverting to the point of a snapshot without running, because we can then inspect disk state, and when ready to run, do yet another revert without the flag to restore any in-flight I/O. Again, I don't know how often it will be used, and whether we should worry about allowing one to mix quiesce with non-live external snapshots, but I agree with adding the flag. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

This patch adds support for external disk snapshots of inactive domains. The snapshot is created by calling qemu-img create -o backing_file=/path/to/disk /path/snapshot_file -f backing_file=/path/to/backing/file,backing_fmt=format_of_backing_file on each of the disks selected for snapshotting. --- Diff to v1: -added probing of backing file type -switched to virCommand --- src/qemu/qemu_driver.c | 124 ++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 113 insertions(+), 11 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 04906d4..4cea78f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10663,13 +10663,116 @@ qemuDomainSnapshotFSThaw(struct qemud_driver *driver, /* The domain is expected to be locked and inactive. */ static int -qemuDomainSnapshotCreateInactive(struct qemud_driver *driver, - virDomainObjPtr vm, - virDomainSnapshotObjPtr snap) +qemuDomainSnapshotCreateInactiveInternal(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap) { return qemuDomainSnapshotForEachQcow2(driver, vm, snap, "-c", false); } +/* The domain is expected to be locked and inactive. 
*/ +static int +qemuDomainSnapshotCreateInactiveExternal(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + bool reuse) +{ + int i = 0; + char *backingFileArg = NULL; + virDomainSnapshotDiskDefPtr snapdisk; + virDomainDiskDefPtr defdisk; + virCommandPtr cmd = NULL; + const char *qemuImgPath; + + int ret = -1; + + if (!(qemuImgPath = qemuFindQemuImgBinary(driver))) + return -1; + + for (i = 0; i < snap->def->ndisks; i++) { + snapdisk = &(snap->def->disks[i]); + defdisk = snap->def->dom->disks[snapdisk->index]; + + if (snapdisk->snapshot == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + /* check if destination file exists */ + if (reuse && virFileExists(snapdisk->file)) { + VIR_DEBUG("Skipping snapshot creation: File exists: %s", + snapdisk->file); + continue; + } + + if (!snapdisk->format) + snapdisk->format = VIR_STORAGE_FILE_QCOW2; + + /* probe the disk format */ + if (defdisk->format <= 0) { + defdisk->format = virStorageFileProbeFormat(defdisk->src, + driver->user, + driver->group); + if (defdisk->format < 0) + goto cleanup; + } + + if (virAsprintf(&backingFileArg, "backing_file=%s,backing_fmt=%s", + defdisk->src, + virStorageFileFormatTypeToString(defdisk->format)) < 0) { + virReportOOMError(); + goto cleanup; + } + + if (!(cmd = virCommandNewArgList(qemuImgPath, + "create", + "-f", + virStorageFileFormatTypeToString(snapdisk->format), + "-o", + backingFileArg, + snapdisk->file, + NULL))) + goto cleanup; + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + virCommandFree(cmd); + VIR_FREE(backingFileArg); + cmd = NULL; + } + } + + /* update disk definitions */ + for (i = 0; i < snap->def->ndisks; i++) { + snapdisk = &(snap->def->disks[i]); + defdisk = vm->def->disks[snapdisk->index]; + + if (snapdisk->snapshot == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + VIR_FREE(defdisk->src); + if (!(defdisk->src = strdup(snapdisk->file))) { + virReportOOMError(); + goto cleanup; + } + defdisk->format = snapdisk->format; + } + } + + ret = 0; 
+ +cleanup: + /* unlink images if creation has failed */ + if (ret < 0 && i > 0) { + for (; i > 0; i--) { + snapdisk = &(snap->def->disks[i]); + if (unlink(snapdisk->file) < 0) + VIR_WARN("Failed to remove snapshot image '%s'", + snapdisk->file); + } + } + + VIR_FREE(backingFileArg); + virCommandFree(cmd); + + return ret; +} + /* The domain is expected to be locked and active. */ static int @@ -11462,12 +11565,6 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, goto cleanup; if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("disk snapshots of inactive domains not " - "implemented yet")); - goto cleanup; - } align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; align_match = false; def->state = VIR_DOMAIN_DISK_SNAPSHOT; @@ -11540,8 +11637,13 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } } else { /* inactive */ - if (qemuDomainSnapshotCreateInactive(driver, vm, snap) < 0) - goto cleanup; + if (snap->def->state == VIR_DOMAIN_DISK_SNAPSHOT) { + if (qemuDomainSnapshotCreateInactiveExternal(driver, vm, snap, false) < 0) + goto cleanup; + } else { + if (qemuDomainSnapshotCreateInactiveInternal(driver, vm, snap) < 0) + goto cleanup; + } } /* If we fail after this point, there's not a whole lot we can -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds support for external disk snapshots of inactive domains. The snapshot is created by calling qemu-img create -o backing_file=/path/to/disk /path/snapshot_file -f backing_file=/path/to/backing/file,backing_fmt=format_of_backing_file
Still didn't match the code: qemu-img create -f format_of_snapshot -o backing_file=/path/to/backing,backing_fmt=format_of_backing /path/to/snapshot If disk->format is unspecified, then what we do should depend on driver->allowDiskFormatProbing; if true, omit the argument (it's okay for qemu to do the probing); if false, error out.
on each of the disks selected for snapshotting. --- Diff to v1: -added probing of backing file type -switched to virCommand
+qemuDomainSnapshotCreateInactiveExternal(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + bool reuse) +{ + int i = 0;
Unused assignment, given...
+ char *backingFileArg = NULL; + virDomainSnapshotDiskDefPtr snapdisk; + virDomainDiskDefPtr defdisk; + virCommandPtr cmd = NULL; + const char *qemuImgPath; + + int ret = -1; + + if (!(qemuImgPath = qemuFindQemuImgBinary(driver))) + return -1; + + for (i = 0; i < snap->def->ndisks; i++) {
...this initialization.
+ snapdisk = &(snap->def->disks[i]); + defdisk = snap->def->dom->disks[snapdisk->index]; + + if (snapdisk->snapshot == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + /* check if destination file exists */ + if (reuse && virFileExists(snapdisk->file)) { + VIR_DEBUG("Skipping snapshot creation: File exists: %s", + snapdisk->file); + continue; + }
Missing the converse check: if the file exists but reuse is not okay, then it had better be a block device or an empty file; similar to online disk snapshots, we don't want to allow the user to accidentally overwrite files that contain data.
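A minimal sketch of the kind of converse check being asked for, using plain stat() and a hypothetical helper name (the real patch would go through libvirt's utilities and report proper errors):

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Standalone sketch (not libvirt code): if the destination already
 * exists and reuse was not requested, accept it only when it is a
 * block device or an empty regular file, so that a file containing
 * data is never silently overwritten. */
static bool snapshot_target_is_safe(const char *path, bool reuse)
{
    struct stat sb;

    if (stat(path, &sb) < 0)
        return true;   /* does not exist yet; qemu-img will create it */

    if (reuse)
        return true;   /* caller explicitly asked to reuse the file */

    if (S_ISBLK(sb.st_mode))
        return true;   /* block devices are acceptable targets */

    return S_ISREG(sb.st_mode) && sb.st_size == 0;
}
```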
+ + if (!snapdisk->format) + snapdisk->format = VIR_STORAGE_FILE_QCOW2; + + /* probe the disk format */ + if (defdisk->format <= 0) { + defdisk->format = virStorageFileProbeFormat(defdisk->src, + driver->user, + driver->group);
I don't think we need to probe here. If we don't know the current disk format, then if driver->allowDiskFormatProbing, just omit the backing_fmt listing; if not, error out and prevent the snapshot creation.
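The suggested policy can be boiled down to a three-way decision; a sketch with hypothetical names (return codes are illustrative, not libvirt's conventions):

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the review suggestion: decide what to do about the
 * backing_fmt= option based on whether the source format is known
 * and whether format probing is allowed by the driver config.
 *    1 -> append backing_fmt=<format> to the -o argument
 *    0 -> omit backing_fmt and let qemu probe
 *   -1 -> refuse to create the snapshot */
static int backing_fmt_policy(int disk_format, bool allow_probing)
{
    if (disk_format > 0)
        return 1;
    return allow_probing ? 0 : -1;
}
```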
+ if (defdisk->format < 0) + goto cleanup; + } + + if (virAsprintf(&backingFileArg, "backing_file=%s,backing_fmt=%s", + defdisk->src, + virStorageFileFormatTypeToString(defdisk->format)) < 0) { + virReportOOMError(); + goto cleanup; + }
Rather than allocating into a temporary, it may be easier to rearrange things to use virCommandAddArgFormat.
+ + if (!(cmd = virCommandNewArgList(qemuImgPath, + "create", + "-f", + virStorageFileFormatTypeToString(snapdisk->format), + "-o", + backingFileArg, + snapdisk->file, + NULL))) + goto cleanup; + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + virCommandFree(cmd); + VIR_FREE(backingFileArg); + cmd = NULL; + } + } + + /* update disk definitions */
This covers rollbacks if qemu-img fails, but leaves things a bit awkward if we hit OOM. In particular...
+ for (i = 0; i < snap->def->ndisks; i++) { + snapdisk = &(snap->def->disks[i]); + defdisk = vm->def->disks[snapdisk->index]; + + if (snapdisk->snapshot == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + VIR_FREE(defdisk->src); + if (!(defdisk->src = strdup(snapdisk->file))) { + virReportOOMError(); + goto cleanup; + } + defdisk->format = snapdisk->format; + } + } + + ret = 0; + +cleanup: + /* unlink images if creation has failed */ + if (ret < 0 && i > 0) { + for (; i > 0; i--) { + snapdisk = &(snap->def->disks[i]); + if (unlink(snapdisk->file) < 0) + VIR_WARN("Failed to remove snapshot image '%s'", + snapdisk->file);
...you end up corrupting the diskdef to point to an unlinked file on OOM.
+ } + } + + VIR_FREE(backingFileArg); + virCommandFree(cmd); + + return ret; +} +
/* The domain is expected to be locked and active. */ static int @@ -11462,12 +11565,6 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, goto cleanup;
if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("disk snapshots of inactive domains not " - "implemented yet")); - goto cleanup; - } align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; align_match = false; def->state = VIR_DOMAIN_DISK_SNAPSHOT;
If the domain is offline, I'd treat def->state as VIR_DOMAIN_SHUTOFF, saving VIR_DOMAIN_DISK_SNAPSHOT for the case where we know the domain was running at the time but no memory was saved.
@@ -11540,8 +11637,13 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } } else { /* inactive */ - if (qemuDomainSnapshotCreateInactive(driver, vm, snap) < 0) - goto cleanup; + if (snap->def->state == VIR_DOMAIN_DISK_SNAPSHOT) { + if (qemuDomainSnapshotCreateInactiveExternal(driver, vm, snap, false) < 0) + goto cleanup; + } else { + if (qemuDomainSnapshotCreateInactiveInternal(driver, vm, snap) < 0) + goto cleanup; + }
This isn't quite right. For offline snapshots, we can mix and match internal and external snapshots at will, which means we should call both functions, and ensure that the internal version does nothing unless an internal snapshot is requested. Also, shouldn't you be passing (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXTERNAL) != 0 rather than false? I'll work up some proposed fixes; some of them can be deferred to followup patches, so I'll try to come up with the minimum squash for what I would ack. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/02/2012 10:49 PM, Eric Blake wrote:
On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds support for external disk snapshots of inactive domains. The snapshot is created by calling qemu-img create -o backing_file=/path/to/disk /path/snapshot_file -f backing_file=/path/to/backing/file,backing_fmt=format_of_backing_file
Still didn't match the code: qemu-img create -f format_of_snapshot -o backing_file=/path/to/backing,backing_fmt=format_of_backing /path/to/snapshot
If disk->format is unspecified, then what we do should depend on driver->allowDiskFormatProbing; if true, omit the argument (it's okay for qemu to do the probing); if false, error out.
on each of the disks selected for snapshotting. ---
@@ -11462,12 +11565,6 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, goto cleanup;
if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("disk snapshots of inactive domains not " - "implemented yet")); - goto cleanup; - } align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; align_match = false; def->state = VIR_DOMAIN_DISK_SNAPSHOT;
If the domain is offline, I'd treat def->state as VIR_DOMAIN_SHUTOFF, saving VIR_DOMAIN_DISK_SNAPSHOT for the case where we know the domain was running at the time but no memory was saved.
This isn't quite right. For offline snapshots, we can mix and match internal and external snapshots at will, which means we should call both functions, and ensure that the internal version does nothing unless an internal snapshot is requested.
Or, we can make life simpler by refusing to mix things (and maybe revisit that restriction in a later patch, if people want to do it after all).
Also, shouldn't you be passing (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXTERNAL) != 0 rather than false?
I'll work up some proposed fixes; some of them can be deferred to followup patches, so I'll try to come up with the minimum squash for what I would ack.
Here's the first round of things to squash - I noticed that my earlier suggestion of checking for _LIVE and _REDEFINE being mutually exclusive is done too late - that needs to be done _before_ the point where we alter the current snapshot when update_current is true. Also, this tries to keep things so that def->state == VIR_DOMAIN_DISK_SNAPSHOT is only possible for a running domain; an offline domain is always recorded as VIR_DOMAIN_SHUTOFF, whether the snapshots are internal or external. I'm not sure how much of this should be done as an independent patch; in fact, it may make sense to split this into multiple patches since I'm attempting to address multiple issues. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/06/12 01:06, Eric Blake wrote:
On 11/02/2012 10:49 PM, Eric Blake wrote:
On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds support for external disk snapshots of inactive domains. The snapshot is created by calling qemu-img create -o backing_file=/path/to/disk /path/snapshot_file -f backing_file=/path/to/backing/file,backing_fmt=format_of_backing_file
Still didn't match the code: qemu-img create -f format_of_snapshot -o backing_file=/path/to/backing,backing_fmt=format_of_backing /path/to/snapshot
If disk->format is unspecified, then what we do should depend on driver->allowDiskFormatProbing; if true, omit the argument (it's okay for qemu to do the probing); if false, error out.
on each of the disks selected for snapshotting. ---
@@ -11462,12 +11565,6 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, goto cleanup;
if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { - if (!virDomainObjIsActive(vm)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("disk snapshots of inactive domains not " - "implemented yet")); - goto cleanup; - } align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; align_match = false; def->state = VIR_DOMAIN_DISK_SNAPSHOT;
If the domain is offline, I'd treat def->state as VIR_DOMAIN_SHUTOFF, saving VIR_DOMAIN_DISK_SNAPSHOT for the case where we know the domain was running at the time but no memory was saved.
This isn't quite right. For offline snapshots, we can mix and match internal and external snapshots at will, which means we should call both functions, and ensure that the internal version does nothing unless an internal snapshot is requested.
Or, we can make life simpler by refusing to mix things (and maybe revisit that restriction in a later patch, if people want to do it after all).
Also, shouldn't you be passing (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXTERNAL) != 0 rather than false?
I'll work up some proposed fixes; some of them can be deferred to followup patches, so I'll try to come up with the minimum squash for what I would ack.
Here's the first round of things to squash - I noticed that my earlier suggestion of checking for _LIVE and _REDEFINE being mutually exclusive is done too late - that needs to be done _before_ the point where we alter the current snapshot when update_current is true. Also, this tries to keep things so that def->state == VIR_DOMAIN_DISK_SNAPSHOT is only possible for a running domain; an offline domain is always recorded as VIR_DOMAIN_SHUTOFF, whether the snapshots are internal or external.
I'm not sure how much of this should be done as an independent patch; in fact, it may make sense to split this into multiple patches since I'm attempting to address multiple issues.
I'll be posting a v3 of the rest of the series soon. I made heavy changes to this patch so you should probably wait until then so we don't do any duplicate work.

Peter

On 11/05/2012 05:06 PM, Eric Blake wrote:
Here's the first round of things to squash - I noticed that my earlier suggestion of checking for _LIVE and _REDEFINE being mutually exclusive is done too late - that needs to be done _before_ the point where we alter the current snapshot when update_current is true. Also, this tries to keep things so that def->state == VIR_DOMAIN_DISK_SNAPSHOT is only possible for a running domain; an offline domain is always recorded as VIR_DOMAIN_SHUTOFF, whether the snapshots are internal or external.
Looks like I forgot to attach it after all.
I'm not sure how much of this should be done as an independent patch; in fact, it may make sense to split this into multiple patches since I'm attempting to address multiple issues.
This still applies, and in fact I know you are already refactoring things; so I'll see what happens for v3.

From d02d92d6f24a18d1e50508dd03e13f9b8945c781 Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake@redhat.com>
Date: Mon, 5 Nov 2012 17:09:36 -0700
Subject: [PATCH] fixup #1 to 15/20

---
 src/qemu/qemu_driver.c | 87 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 56 insertions(+), 31 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 080a862..c393b88 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -10865,11 +10865,11 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def,
 {
     int ret = -1;
     int i;
-    bool found = false;
     bool active = virDomainObjIsActive(vm);
     struct stat st;
     bool reuse = (*flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0;
     bool atomic = (*flags & VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC) != 0;
+    bool found_internal = false;
     int external = 0;
     qemuDomainObjPrivatePtr priv = vm->privateData;
@@ -10890,7 +10890,6 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm, virDomainSnapshotDefPtr def,
             dom_disk->type == VIR_DOMAIN_DISK_TYPE_NETWORK &&
             (dom_disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_SHEEPDOG ||
              dom_disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_RBD)) {
-            found = true;
             break;
         }
         if (vm->def->disks[i]->format > 0 &&
@@ -10910,7 +10909,7 @@
                                disk->name);
                 goto cleanup;
             }
-            found = true;
+            found_internal = true;
             break;

         case VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL:
@@ -10944,7 +10943,6 @@
                                disk->name, disk->file);
                 goto cleanup;
             }
-            found = true;
             external++;
             break;
@@ -10959,15 +10957,34 @@
         }
     }

-    /* external snapshot is possible without specifying a disk to snapshot */
-    if (!found &&
-        def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) {
+    /* internal snapshot requires an internal disk */
+    if (def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL &&
+        !found_internal) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
-                       _("internal and disk-only snapshots require at least "
+                       _("internal snapshots require at least "
+                         "one disk to be selected for snapshot"));
+        goto cleanup;
+    }
+    /* For now, we don't allow mixing internal and external disks.
+     * XXX technically, we could mix internal and external disks for
+     * offline snapshots */
+    if (found_internal && external) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("mixing internal and external snapshots is not "
+                         "supported yet"));
+        goto cleanup;
+    }
+    /* disk snapshot requires at least one disk */
+    if (def->state == VIR_DOMAIN_DISK_SNAPSHOT && !external) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("disk-only snapshots require at least "
                          "one disk to be selected for snapshot"));
         goto cleanup;
     }
+    /* Alter flags to let later users know what we learned. */
+    if (external && !active)
+        *flags |= VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY;

     if (def->state != VIR_DOMAIN_DISK_SNAPSHOT && active) {
         if (external == 1 ||
             qemuCapsGet(priv->caps, QEMU_CAPS_TRANSACTION)) {
@@ -11460,6 +11477,25 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
                                                 parse_flags)))
         goto cleanup;

+    /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */
+    if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE &&
+        (!virDomainObjIsActive(vm) ||
+         snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL ||
+         flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)) {
+        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+                       _("live snapshot creation is supported only "
+                         "with external checkpoints"));
+        goto cleanup;
+    }
+    if ((snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL ||
+         snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL) &&
+        flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) {
+        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+                       _("disk-only snapshot creation is not compatible with "
+                         "memory snapshot"));
+        goto cleanup;
+    }
+
     if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE) {
         /* Prevent circular chains */
         if (def->parent) {
@@ -11574,7 +11610,10 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
     if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) {
         align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL;
         align_match = false;
-        def->state = VIR_DOMAIN_DISK_SNAPSHOT;
+        if (virDomainObjIsActive(vm))
+            def->state = VIR_DOMAIN_DISK_SNAPSHOT;
+        else
+            def->state = VIR_DOMAIN_SHUTOFF;
         def->memory = VIR_DOMAIN_SNAPSHOT_LOCATION_NONE;
     } else if (def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) {
         def->state = virDomainObjGetState(vm, NULL);
@@ -11617,25 +11656,6 @@
         }
     }

-    /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */
-    if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE &&
-        (!virDomainObjIsActive(vm) ||
-         snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL ||
-         flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)) {
-        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
-                       _("live snapshot creation is supported only "
-                         "with external checkpoints"));
-        goto cleanup;
-    }
-    if ((snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL ||
-         snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL) &&
-        flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) {
-        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
-                       _("disk-only snapshot creation is not compatible with "
-                         "memory snapshot"));
-        goto cleanup;
-    }
-
     /* actually do the snapshot */
     if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE) {
         /* XXX Should we validate that the redefined snapshot even
@@ -11655,9 +11675,14 @@
             goto cleanup;
         }
     } else {
-        /* inactive */
-        if (snap->def->state == VIR_DOMAIN_DISK_SNAPSHOT) {
-            if (qemuDomainSnapshotCreateInactiveExternal(driver, vm, snap, false) < 0)
+        /* inactive; qemuDomainSnapshotPrepare guaranteed that we
+         * aren't mixing internal and external, and altered flags to
+         * contain DISK_ONLY if there is an external disk. */
+        if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) {
+            bool reuse = !!(flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT);
+
+            if (qemuDomainSnapshotCreateInactiveExternal(driver, vm, snap,
+                                                         reuse) < 0)
                 goto cleanup;
         } else {
             if (qemuDomainSnapshotCreateInactiveInternal(driver, vm, snap) < 0)
-- 
1.7.11.7

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Snapshots are external if they are "disk snapshots" or if they have an external memory image. Other possibilities are not supported (yet).
---
 src/conf/snapshot_conf.c | 11 +++++++++++
 src/conf/snapshot_conf.h | 2 ++
 src/libvirt_private.syms | 1 +
 3 files changed, 14 insertions(+)

diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
index 691950a..5d9df57 100644
--- a/src/conf/snapshot_conf.c
+++ b/src/conf/snapshot_conf.c
@@ -1023,3 +1023,14 @@ cleanup:
     }
     return ret;
 }
+
+
+bool
+virDomainSnapshotIsExternal(virDomainSnapshotObjPtr snap)
+{
+    if (snap->def->state == VIR_DOMAIN_DISK_SNAPSHOT ||
+        snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)
+        return true;
+
+    return false;
+}
diff --git a/src/conf/snapshot_conf.h b/src/conf/snapshot_conf.h
index 00775ae..b5c6e25 100644
--- a/src/conf/snapshot_conf.h
+++ b/src/conf/snapshot_conf.h
@@ -155,6 +155,8 @@ int virDomainListSnapshots(virDomainSnapshotObjListPtr snapshots,
                            virDomainSnapshotPtr **snaps,
                            unsigned int flags);

+bool virDomainSnapshotIsExternal(virDomainSnapshotObjPtr snap);
+
 VIR_ENUM_DECL(virDomainSnapshotLocation)
 VIR_ENUM_DECL(virDomainSnapshotState)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index e94b478..5a07139 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1083,6 +1083,7 @@ virDomainSnapshotFindByName;
 virDomainSnapshotForEach;
 virDomainSnapshotForEachChild;
 virDomainSnapshotForEachDescendant;
+virDomainSnapshotIsExternal;
 virDomainSnapshotLocationTypeFromString;
 virDomainSnapshotLocationTypeToString;
 virDomainSnapshotObjListGetNames;
-- 
1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
Snapshots are external if they are "disk snapshots" or if they have an external memory image. Other possibilities are not supported (yet).
And given the discussion on 15/20, even though it might be technically possible to mix internal and external disk images into a single snapshot, we are going to forbid that mix for now (because it has interesting ramifications on how to do a revert), so this approach looks sane.
--- src/conf/snapshot_conf.c | 11 +++++++++++ src/conf/snapshot_conf.h | 2 ++ src/libvirt_private.syms | 1 + 3 files changed, 14 insertions(+)
diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
index 691950a..5d9df57 100644
--- a/src/conf/snapshot_conf.c
+++ b/src/conf/snapshot_conf.c
@@ -1023,3 +1023,14 @@ cleanup:
     }
     return ret;
 }
+
+
+bool
+virDomainSnapshotIsExternal(virDomainSnapshotObjPtr snap)
+{
+    if (snap->def->state == VIR_DOMAIN_DISK_SNAPSHOT ||
+        snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)
+        return true;
+
+    return false;
+}
This isn't quite true. If the snapshot is offline, then snap->def->state will be VIR_DOMAIN_SHUTOFF, not VIR_DOMAIN_DISK_SNAPSHOT. If state is shutoff, you're going to have to loop over the disks to see if any of them requested external. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

This patch adds an event to be raised if a PM suspended guest is restored from a snapshot.
---
 examples/domain-events/events-c/event-test.c | 3 +++
 include/libvirt/libvirt.h.in | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/examples/domain-events/events-c/event-test.c b/examples/domain-events/events-c/event-test.c
index 39bea49..f0e3fde 100644
--- a/examples/domain-events/events-c/event-test.c
+++ b/examples/domain-events/events-c/event-test.c
@@ -204,6 +204,9 @@ static const char *eventDetailToString(int event, int detail) {
         case VIR_DOMAIN_EVENT_PMSUSPENDED_DISK:
             ret = "Disk";
             break;
+        case VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT:
+            ret = "from snapshot";
+            break;
         }
         break;
     }
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index ea57d00..58c02fb 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -3221,6 +3221,8 @@ typedef enum {
 typedef enum {
     VIR_DOMAIN_EVENT_PMSUSPENDED_MEMORY = 0, /* Guest was PM suspended to memory */
     VIR_DOMAIN_EVENT_PMSUSPENDED_DISK = 1, /* Guest was PM suspended to disk */
+    VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT = 2, /* Snapshot of a machine
+                                                       in S3/S4 was reverted */

 #ifdef VIR_ENUM_SENTINELS
     VIR_DOMAIN_EVENT_PMSUSPENDED_LAST
-- 
1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds an event to be raised if a PM suspended guest is restored from a snapshot. --- examples/domain-events/events-c/event-test.c | 3 +++ include/libvirt/libvirt.h.in | 2 ++ 2 files changed, 5 insertions(+)
Hmm, the fact that we allow a system checkpoint from a suspended guest probably means that we have some cleanups to do in the snapshot-create and snapshot-revert cases. For example, during create, if the CREATE_REDEFINE flag is set, right now we have logic that allows the user to change between running and paused, but prevents changing between running and shutoff; but we lack logic that prevents changing between pmsuspended and other states. I don't know if you cover this later in your series, or if your v3 will have to touch more places in the code, but we certainly have to think about it.

Also, does qemu even support the ability to restore a guest into an S3 state? Remember, with S3, the qemu process still exists, and we can wake it up; with S4, the qemu process is gone, and the guest appears to be shut down (the difference being that it will boot differently the next time it starts). Reverting into S4 is trivial - put the disk state back. But reverting into S3 requires cooperation from qemu; and right now, I don't think we have that. If you migrate an S3 guest from one machine to another, right now qemu will wake the guest on the destination, since it doesn't know how to migrate S3 state.
diff --git a/examples/domain-events/events-c/event-test.c b/examples/domain-events/events-c/event-test.c index 39bea49..f0e3fde 100644 --- a/examples/domain-events/events-c/event-test.c +++ b/examples/domain-events/events-c/event-test.c @@ -204,6 +204,9 @@ static const char *eventDetailToString(int event, int detail) { case VIR_DOMAIN_EVENT_PMSUSPENDED_DISK: ret = "Disk"; break; + case VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT: + ret = "from snapshot"; + break;
Hmm. I'm a bit wary of our existing code - given that PMSUSPENDED_MEMORY (S3, qemu still exists) and PMSUSPENDED_DISK (S4, qemu process no longer exists so guest appears to be shut down) are quite different in effect, I'm not sure we are even tracking existing states correctly. Are we sure we should be adding yet another state, and if so, does PMSUSPENDED_FROM_SNAPSHOT mean the guest is in S3 (a qemu process exists and is ready to wake) or in S4 (no qemu process exists, but booting will be fast)? It is ambiguous to claim both S3 and S4 from the same event, since they are so different. In short, I'm not sure about this patch, and think we have stepped into a rat's nest of awkward code to begin with in relation to S3/S4. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Only external disk snapshots were taken into account while checking if snapshot deletion is possible. This patch adds checking also for external checkpoint.s
---
 src/qemu/qemu_driver.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 4cea78f..44bf6dc 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -1998,7 +1998,7 @@ cleanup:
 }

-/* Count how many snapshots in a set have external disk snapshots. */
+/* Count how many snapshots in a set are external snapshots or checkpoints. */
 static void
 qemuDomainSnapshotCountExternal(void *payload,
                                 const void *name ATTRIBUTE_UNUSED,
@@ -2007,7 +2007,7 @@ qemuDomainSnapshotCountExternal(void *payload,
     virDomainSnapshotObjPtr snap = payload;
     int *count = data;

-    if (snap->def->state == VIR_DOMAIN_DISK_SNAPSHOT)
+    if (virDomainSnapshotIsExternal(snap))
         (*count)++;
 }

@@ -12427,7 +12427,7 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot,
     if (!(flags & VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY)) {
         if (!(flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY) &&
-            snap->def->state == VIR_DOMAIN_DISK_SNAPSHOT)
+            virDomainSnapshotIsExternal(snap))
             external++;
         if (flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN)
             virDomainSnapshotForEachDescendant(snap,
-- 
1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
Only external disk snapshots were taken into account while checking if snapshot deletion is possible. This patch adds checking also for external checkpoint.s
s/\.s/s./
--- src/qemu/qemu_driver.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
ACK. If you wanted, you could squash this in with 16/20. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

This patch adds limited support for deleting external snaphots. The machine must not be active and only whole subtrees of snapshots can be deleted as reparenting was not yet implemented for external snapshots.
---
This patch introduces a possible segfault, but I haven't had time to chase it.
---
 src/qemu/qemu_domain.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++---
 src/qemu/qemu_driver.c | 23 ++++++++--
 2 files changed, 135 insertions(+), 10 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index a5592b9..e8f9aed 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -1724,13 +1724,102 @@ qemuDomainSnapshotForEachQcow2(struct qemud_driver *driver,
                                op, try_all, def->ndisks);
 }

-/* Discard one snapshot (or its metadata), without reparenting any children. */
-int
-qemuDomainSnapshotDiscard(struct qemud_driver *driver,
-                          virDomainObjPtr vm,
-                          virDomainSnapshotObjPtr snap,
-                          bool update_current,
-                          bool metadata_only)
+/* Discard one external snapshot (or its metadata), without reparenting any children */
+static int
+qemuDomainSnapshotDiscardExternal(struct qemud_driver *driver,
+                                  virDomainObjPtr vm,
+                                  virDomainSnapshotObjPtr snap,
+                                  bool update_current,
+                                  bool metadata_only)
+{
+    char *snapFile = NULL;
+    int i;
+    int ret = -1;
+    virDomainSnapshotObjPtr parentsnap = NULL;
+
+    if (!metadata_only) {
+        if (virDomainObjIsActive(vm)) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                           _("Can't delete snapshot '%s'. Domain '%s' is active."),
+                           snap->def->name, vm->def->name);
+            goto cleanup;
+        }
+
+        if (snap->def->file) {
+            if (unlink(snap->def->file) < 0) {
+                virReportSystemError(errno,
+                                     _("Failed to remove memory checkpoint "
+                                       "save file '%s'"), snap->def->file);
+                goto cleanup;
+            }
+            VIR_FREE(snap->def->file);
+        }
+
+        for (i = 0; i < snap->def->ndisks; i++) {
+            virDomainSnapshotDiskDefPtr disk = &(snap->def->disks[i]);
+            if (disk->file && unlink(disk->file) < 0) {
+                virReportSystemError(errno,
+                                     _("Failed to remove disk snapshot file '%s'"),
+                                     disk->file);
+                goto cleanup;
+            }
+            VIR_FREE(disk->file);
+        }
+    }
+
+    if (snap == vm->current_snapshot) {
+        if (update_current && snap->def->parent) {
+            parentsnap = virDomainSnapshotFindByName(vm->snapshots,
+                                                     snap->def->parent);
+            if (!parentsnap) {
+                VIR_WARN("missing parent snapshot matching name '%s'",
+                         snap->def->parent);
+            } else {
+                parentsnap->def->current = true;
+
+                if (qemuDomainSnapshotWriteMetadata(vm, parentsnap,
+                                                    driver->snapshotDir) < 0) {
+                    VIR_WARN("failed to set parent snapshot '%s' as current",
+                             snap->def->parent);
+                    parentsnap->def->current = false;
+                    parentsnap = NULL;
+                }
+            }
+        }
+        vm->current_snapshot = parentsnap;
+    }
+
+    if (virAsprintf(&snapFile, "%s/%s/%s.xml", driver->snapshotDir,
+                    vm->def->name, snap->def->name) < 0) {
+        virReportOOMError();
+        goto cleanup;
+    }
+
+    if (unlink(snapFile) < 0)
+        VIR_WARN("Failed to unlink %s", snapFile);
+    virDomainSnapshotObjListRemove(vm->snapshots, snap);
+
+    ret = 0;
+
+cleanup:
+    if (ret < 0 && !snapFile &&
+        qemuDomainSnapshotWriteMetadata(vm, snap, driver->snapshotDir) < 0)
+        VIR_WARN("Failed to store modified snapshot config");
+
+    VIR_FREE(snapFile);
+
+    return ret;
+}
+
+/* Discard one internal snapshot (or its metadata),
+ * without reparenting any children.
+ */
+static int
+qemuDomainSnapshotDiscardInternal(struct qemud_driver *driver,
+                                  virDomainObjPtr vm,
+                                  virDomainSnapshotObjPtr snap,
+                                  bool update_current,
+                                  bool metadata_only)
 {
     char *snapFile = NULL;
     int ret = -1;
@@ -1791,6 +1880,25 @@ cleanup:
     return ret;
 }

+/* Discard one snapshot (or its metadata), without reparenting any children. */
+int
+qemuDomainSnapshotDiscard(struct qemud_driver *driver,
+                          virDomainObjPtr vm,
+                          virDomainSnapshotObjPtr snap,
+                          bool update_current,
+                          bool metadata_only)
+{
+    int ret;
+
+    if (virDomainSnapshotIsExternal(snap))
+        ret = qemuDomainSnapshotDiscardExternal(driver, vm, snap, update_current, metadata_only);
+    else
+        ret = qemuDomainSnapshotDiscardInternal(driver, vm, snap, update_current, metadata_only);
+
+    return ret;
+}
+
+
 /* Hash iterator callback to discard multiple snapshots. */
 void qemuDomainSnapshotDiscardAll(void *payload,
                                   const void *name ATTRIBUTE_UNUSED,
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 44bf6dc..46b7e3a 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -12376,6 +12376,13 @@ qemuDomainSnapshotReparentChildren(void *payload,
         return;
     }

+    if (virDomainSnapshotIsExternal(snap)) {
+        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+                       _("Reparenting of external snapshots is not implemented"));
+        rep->err = -1;
+        return;
+    }
+
     VIR_FREE(snap->def->parent);
     snap->parent = rep->parent;

@@ -12433,10 +12440,20 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot,
             virDomainSnapshotForEachDescendant(snap,
                                                qemuDomainSnapshotCountExternal,
                                                &external);
-        if (external) {
+        if (external &&
+            virDomainObjIsActive(vm)) {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
-                           _("deletion of %d external disk snapshots not "
-                             "supported yet"), external);
+                           _("deletion of %d external disk snapshots is "
+                             "supported only on inactive guests"), external);
+            goto cleanup;
+        }
+
+        if (external &&
+            !(flags & (VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN |
+                       VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY))) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                           _("only deletion of external snapshot subtrees is "
+                             "implemented"));
             goto cleanup;
         }
     }
-- 
1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds limited support for deleting external snaphots. The
s/snaphots/snapshots/
machine must not be active and only whole subtrees of snapshots can be deleted as reparenting was not yet implemented for external snapshots.
These are reasonable restrictions for the first implementation. We may relax some of them later - for example, if qemu adds support for doing blockcommit from the active layer, then that would let us do a live deletion of an external snapshot.
--- This patch introduces a possible segfault, but I haven't had time to chase it.
Can you describe the steps you reproduced it with? I'm not sure I saw it either.
+++ b/src/qemu/qemu_domain.c @@ -1724,13 +1724,102 @@ qemuDomainSnapshotForEachQcow2(struct qemud_driver *driver, op, try_all, def->ndisks); }
-/* Discard one snapshot (or its metadata), without reparenting any children. */ -int -qemuDomainSnapshotDiscard(struct qemud_driver *driver, - virDomainObjPtr vm, - virDomainSnapshotObjPtr snap, - bool update_current, - bool metadata_only) +/* Discard one external snapshot (or its metadata), without reparenting any children */ +static int +qemuDomainSnapshotDiscardExternal(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + bool update_current, + bool metadata_only) +{ + char *snapFile = NULL; + int i; + int ret = -1; + virDomainSnapshotObjPtr parentsnap = NULL; + + if (!metadata_only) { + if (virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Can't delete snapshot '%s'. Domain '%s' is active."), + snap->def->name, vm->def->name); + goto cleanup; + } + + if (snap->def->file) { + if (unlink(snap->def->file) < 0) { + virReportSystemError(errno, + _("Failed to remove memory checkpoint " + "save file '%s'"), snap->def->file); + goto cleanup; + } + VIR_FREE(snap->def->file); + }
This part makes sense.
+ + for (i = 0; i < snap->def->ndisks; i++) { + virDomainSnapshotDiskDefPtr disk = &(snap->def->disks[i]); + if (disk->file && unlink(disk->file) < 0) { + virReportSystemError(errno, + _("Failed to remove disk snapshot file '%s'"), + disk->file); + goto cleanup; + } + VIR_FREE(disk->file); + }
But this part is rather drastic. More on this below.
+ } +
From here...
+ if (snap == vm->current_snapshot) { + if (update_current && snap->def->parent) { + parentsnap = virDomainSnapshotFindByName(vm->snapshots, + snap->def->parent); + if (!parentsnap) { + VIR_WARN("missing parent snapshot matching name '%s'", + snap->def->parent); + } else { + parentsnap->def->current = true; + + if (qemuDomainSnapshotWriteMetadata(vm, parentsnap, + driver->snapshotDir) < 0) { + VIR_WARN("failed to set parent snapshot '%s' as current", + snap->def->parent); + parentsnap->def->current = false; + parentsnap = NULL; + } + } + } + vm->current_snapshot = parentsnap; + } + + if (virAsprintf(&snapFile, "%s/%s/%s.xml", driver->snapshotDir, + vm->def->name, snap->def->name) < 0) { + virReportOOMError(); + goto cleanup; + } + + if (unlink(snapFile) < 0) + VIR_WARN("Failed to unlink %s", snapFile); + virDomainSnapshotObjListRemove(vm->snapshots, snap);
...to here, the code is almost identical to the internal snapshot case, except that you reordered things so that they are unsafe (that is, the internal case ensures that we malloc in order to detect any OOM _prior_ to modifying any parent snapshot). I think the tail of these two functions should instead be common functionality...
+ + ret = 0; + +cleanup: + if (ret < 0 && !snapFile && + qemuDomainSnapshotWriteMetadata(vm, snap, driver->snapshotDir) < 0) + VIR_WARN("Failed to store modified snapshot config"); + + VIR_FREE(snapFile); + + return ret; +} + +/* Discard one internal snapshot (or its metadata), + * without reparenting any children. + */ +static int +qemuDomainSnapshotDiscardInternal(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + bool update_current, + bool metadata_only) { char *snapFile = NULL; int ret = -1; @@ -1791,6 +1880,25 @@ cleanup: return ret; }
+/* Discard one snapshot (or its metadata), without reparenting any children. */ +int +qemuDomainSnapshotDiscard(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + bool update_current, + bool metadata_only) +{ + int ret; + + if (virDomainSnapshotIsExternal(snap)) + ret = qemuDomainSnapshotDiscardExternal(driver, vm, snap, update_current, metadata_only); + else + ret = qemuDomainSnapshotDiscardInternal(driver, vm, snap, update_current, metadata_only);
...here.
+ + return ret; +} + + /* Hash iterator callback to discard multiple snapshots. */ void qemuDomainSnapshotDiscardAll(void *payload, const void *name ATTRIBUTE_UNUSED, diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 44bf6dc..46b7e3a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -12376,6 +12376,13 @@ qemuDomainSnapshotReparentChildren(void *payload, return; }
+ if (virDomainSnapshotIsExternal(snap)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("Reparenting of external snapshots is not implemented")); + rep->err = -1; + return; + }
Why not? If I have the snapshot chain: internal_1 <- internal_2 <- external and delete just internal_2, why can't I reparent external to point to internal_1? I'm wondering if the memory corruption is happening here by returning failure in part of the tree, while the rest of the tree assumes success, so that the end result is a corrupted tree that points into free'd storage.
+ VIR_FREE(snap->def->parent); snap->parent = rep->parent;
@@ -12433,10 +12440,20 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot, virDomainSnapshotForEachDescendant(snap, qemuDomainSnapshotCountExternal, &external); - if (external) { + if (external && + virDomainObjIsActive(vm)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("deletion of %d external disk snapshots not " - "supported yet"), external); + _("deletion of %d external disk snapshots is " + "supported only on inactive guests"), external); + goto cleanup; + }
Like I said earlier, this may be a reasonable first round, since we can't do live blockcommit yet. However, I think we're missing a bigger picture here. Remember, with internal snapshots, the way things work is that the same file holds both the snapshot (a point in time) and the delta (all changes since that time). Right now, the action of deleting a snapshot says to forget about the point in time, but preserve all changes since that point in time, if those changes belong to either the current state or to another branch of the snapshot hierarchy. The only time data is actually deleted is if it is not in current use and there is no other snapshot remaining that can get back to the data.

But above, you just blindly unlink()d the qcow2 wrapper file, which in effect discarded the delta (all changes since the snapshot), and _reverted_ us back to the point in time where the snapshot was taken. Probably not what the user meant; or if it is, this is a NEW mode of operation, so it should only be allowed if the user passes in a new flag, maybe named VIR_DOMAIN_SNAPSHOT_DELETE_REVERT, which says to delete the external snapshot files _and_ revert to the state at the time it was taken. In fact, this particular mode of operation might fit better with virDomainRevertToSnapshot instead of virDomainSnapshotDelete (again, controlled by a flag), but let's first come to an agreement on the scenarios that users might want...

1. I created a disk snapshot, and it created new qcow2 files. Now I no longer need to go to that point in time, but I want to keep the qcow2 files, without losing my changes since the snapshot.
Solution: 'virsh snapshot-delete --metadata' merely forgets that we took the snapshot, but keeps the backing chain as-is.

2. I created a disk snapshot, but now I no longer need to go back, and I want my disk chain as short as it was before I created the snapshot, and without losing my changes since the snapshot.
Solution: use 'qemu-img commit' to squash the delta back into the original file, then delete the backing file as no longer necessary. If any further snapshots depended on the backing file, then they also need a touchup (reparenting) to deal with the change in the backing chain, or else they must also be deleted. 'qemu-img commit' can only be run while the disk is offline; worse, it is time-consuming, so we now have to deal with the fact that deleting a snapshot probably ought to be an async job with status updates, rather than a blocking job. But if we implement this solution, we can also use it for 'virsh blockcommit' for offline commits. This mode most closely matches what we do when deleting an internal snapshot.

3. I created an external checkpoint, and now I no longer need to go to that point in time.
Solution: Delete the memory state file, then all we are left with is a disk snapshot, so we can now use scenario 1 or 2 to clean up the rest.

4. I created a disk snapshot, then did some experiments, and decided I don't need the result and want to go back to that point in time without remembering the experiments.
Solution: combination revert and delete, where we reset the domain to use the backing file and delete the qcow2 wrapper files.

5. I created an external checkpoint of a running machine, and now want to go back to that point in time.
Solution: like 4, this is a combination revert and delete, where we delete qcow2 wrapper files, but we also boot using the saved memory image and backing files.

6. I created an external checkpoint, did some experiments, and want to remember those results; but I also want to go back to the checkpoint and do some more experiments on a different branch.
Solution: here, we need the ability to create NEW qcow2 files that also wrap the common base image. Since virDomainRevertToSnapshot doesn't have a way for us to pass in the new file names we wish to use, this needs to go through virDomainSnapshotCreateXML, which needs a new flag that says that we are simultaneously reverting to the state of an existing snapshot and creating a new snapshot from that point rather than at the current state of execution.

7. Once we allow creation of snapshot forks from a common base point, we need to worry about deleting one fork but keeping other forks alive - for an external checkpoint, the memory file is now effectively shared by multiple snapshots, and must not be deleted until the last snapshot referencing the file is deleted.

8. It is also possible to use multiple snapshots to create a longer chain, such as 'base <- snap1 <- snap2 <- current', and then delete snap1 while still remembering snap2. Right now, we can use blockcommit while qemu is live (and a sequence of qemu-img commit and qemu-img rebase calls for offline) to merge snap1 into base; we could also use blockpull (via qemu-img rebase while offline, but currently lacking support if qemu is running) to pull snap1 into snap2. That is, forbidding the user from deleting an external snapshot unless they also delete descendants is not necessarily desirable.

I guess I've turned this more into a brain dump of where we should be headed - there is really a lot of functionality involved in reverting to external files, and we should be exposing choices to the user whether to pull into current, push into backing, discard files, or create parallel branches from the same starting point; and solutions to some of these issues touch not only snapshot deletion, but also snapshot revert and snapshot create.
For this particular patch, though, I think we definitely want to introduce a new flag that says when we are doing a combined revert and delete, so that we can leave the option of deletion without losing current state (similar to how internal snapshot deletion works) until a later patch where we use qemu-img commit to do the work. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 11/06/12 23:39, Eric Blake wrote:
On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds limited support for deleting external snaphots. The
s/snaphots/snapshots/
machine must not be active and only whole subtrees of snapshots can be deleted as reparenting was not yet implemented for external snapshots.
These are reasonable restrictions for the first implementation. We may relax some of them later - for example, if qemu adds support for doing blockcommit from the active layer, then that would let us do a live deletion of an external snapshot.
--- This patch introduces a possible segfault, but I haven't had time to chase it.
Can you describe the steps you reproduced it with? I'm not sure I saw it either.
+++ b/src/qemu/qemu_domain.c @@ -1724,13 +1724,102 @@ qemuDomainSnapshotForEachQcow2(struct qemud_driver *driver, op, try_all, def->ndisks); }
-/* Discard one snapshot (or its metadata), without reparenting any children. */ -int -qemuDomainSnapshotDiscard(struct qemud_driver *driver, - virDomainObjPtr vm, - virDomainSnapshotObjPtr snap, - bool update_current, - bool metadata_only) +/* Discard one external snapshot (or its metadata), without reparenting any children */ +static int +qemuDomainSnapshotDiscardExternal(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + bool update_current, + bool metadata_only) +{ + char *snapFile = NULL; + int i; + int ret = -1; + virDomainSnapshotObjPtr parentsnap = NULL; + + if (!metadata_only) { + if (virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Can't delete snapshot '%s'. Domain '%s' is active."), + snap->def->name, vm->def->name); + goto cleanup; + } + + if (snap->def->file) { + if (unlink(snap->def->file) < 0) { + virReportSystemError(errno, + _("Failed to remove memory checkpoint " + "save file '%s'"), snap->def->file); + goto cleanup; + } + VIR_FREE(snap->def->file); + }
This part makes sense.
+ + for (i = 0; i < snap->def->ndisks; i++) { + virDomainSnapshotDiskDefPtr disk = &(snap->def->disks[i]); + if (disk->file && unlink(disk->file) < 0) { + virReportSystemError(errno, + _("Failed to remove disk snapshot file '%s'"), + disk->file); + goto cleanup; + } + VIR_FREE(disk->file); + }
But this part is rather drastic. More on this below.
+ } +
From here...
+ if (snap == vm->current_snapshot) { + if (update_current && snap->def->parent) { + parentsnap = virDomainSnapshotFindByName(vm->snapshots, + snap->def->parent); + if (!parentsnap) { + VIR_WARN("missing parent snapshot matching name '%s'", + snap->def->parent); + } else { + parentsnap->def->current = true; + + if (qemuDomainSnapshotWriteMetadata(vm, parentsnap, + driver->snapshotDir) < 0) { + VIR_WARN("failed to set parent snapshot '%s' as current", + snap->def->parent); + parentsnap->def->current = false; + parentsnap = NULL; + } + } + } + vm->current_snapshot = parentsnap; + } + + if (virAsprintf(&snapFile, "%s/%s/%s.xml", driver->snapshotDir, + vm->def->name, snap->def->name) < 0) { + virReportOOMError(); + goto cleanup; + } + + if (unlink(snapFile) < 0) + VIR_WARN("Failed to unlink %s", snapFile); + virDomainSnapshotObjListRemove(vm->snapshots, snap);
...to here, the code is almost identical to the internal snapshot case, except that you reordered things so that they are unsafe (that is, the internal case ensures that we malloc in order to detect any OOM _prior_ to modifying any parent snapshot). I think the tail of these two functions should instead be common functionality...
+ + ret = 0; + +cleanup: + if (ret < 0 && !snapFile && + qemuDomainSnapshotWriteMetadata(vm, snap, driver->snapshotDir) < 0) + VIR_WARN("Failed to store modified snapshot config"); + + VIR_FREE(snapFile); + + return ret; +} + +/* Discard one internal snapshot (or its metadata), + * without reparenting any children. + */ +static int +qemuDomainSnapshotDiscardInternal(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + bool update_current, + bool metadata_only) { char *snapFile = NULL; int ret = -1; @@ -1791,6 +1880,25 @@ cleanup: return ret; }
+/* Discard one snapshot (or its metadata), without reparenting any children. */ +int +qemuDomainSnapshotDiscard(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + bool update_current, + bool metadata_only) +{ + int ret; + + if (virDomainSnapshotIsExternal(snap)) + ret = qemuDomainSnapshotDiscardExternal(driver, vm, snap, update_current, metadata_only); + else + ret = qemuDomainSnapshotDiscardInternal(driver, vm, snap, update_current, metadata_only);
...here.
+ + return ret; +} + + /* Hash iterator callback to discard multiple snapshots. */ void qemuDomainSnapshotDiscardAll(void *payload, const void *name ATTRIBUTE_UNUSED, diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 44bf6dc..46b7e3a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -12376,6 +12376,13 @@ qemuDomainSnapshotReparentChildren(void *payload, return; }
+ if (virDomainSnapshotIsExternal(snap)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("Reparenting of external snapshots is not implemented")); + rep->err = -1; + return; + }
Why not? If I have the snapshot chain:
internal_1 <- internal_2 <- external
and delete just internal_2, why can't I reparent external to point to internal_1? I'm wondering if the memory corruption is happening here by returning failure in part of the tree, while the rest of the tree assumes success, so that the end result is a corrupted tree that points into free'd storage.
+ VIR_FREE(snap->def->parent); snap->parent = rep->parent;
@@ -12433,10 +12440,20 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot, virDomainSnapshotForEachDescendant(snap, qemuDomainSnapshotCountExternal, &external); - if (external) { + if (external && + virDomainObjIsActive(vm)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("deletion of %d external disk snapshots not " - "supported yet"), external); + _("deletion of %d external disk snapshots is " + "supported only on inactive guests"), external); + goto cleanup; + }
Like I said earlier, this may be a reasonable first round, since we can't do live blockcommit yet.
However, I think we're missing a bigger picture here. Remember, with internal snapshots, the way things work is that the same file holds both the snapshot (a point in time) and the delta (all changes since that time). Right now, the action of deleting a snapshot says to forget about the point in time, but preserve all changes since that point in time, if those changes belong to either the current state or to another branch of the snapshot hierarchy. The only time data is actually deleted is if it is not in current use and there is no other snapshot remaining that can get back to the data.
But above, you just blindly unlink()d the qcow2 wrapper file, which in effect discarded the delta (all changes since the snapshot), and _reverted_ us back to the point in time where the snapshot was taken.
I got so carried away wiring up the revert code that this also ended up being a form of revert.
Probably not what the user meant; or if it is, this is a NEW mode of operation, so it should only be allowed if the user passes in a new flag, maybe named VIR_DOMAIN_SNAPSHOT_DELETE_REVERT, which says to delete the external snapshot files _and_ revert to the state at the time it was taken. In fact, this particular mode of operation might fit better with virDomainRevertToSnapshot instead of virDomainSnapshotDelete (again, controlled by a flag), but let's first come to an agreement on the scenarios that users might want...
Yes, that will definitely be better to have in the revert API.
1. I created a disk snapshot, and it created new qcow2 files. Now I no longer need to go to that point in time, but I want to keep the qcow2 files, without losing my changes since the snapshot. Solution: 'virsh snapshot-delete --metadata' merely forget that we took the snapshot, but keep the backing chain as-is.
This should work as-is right now; unfortunately, it leaves behind the mess of files that are/were part of the backing chain.
2. I created a disk snapshot, but now I no longer need to go back, and I want my disk chain as short as it was before I created the snapshot, and without losing my changes since the snapshot. Solution: use 'qemu-img commit' to squash the delta back into the original file, then delete the backing file as no longer necessary. If any further snapshots depended on the backing file, then they also need a touchup (reparenting) to deal with the change in the backing chain, or else they must also be deleted. 'qemu-img commit' can only be run while the disk is offline; worse, it is time-consuming, so we now have to deal with the fact that deleting a snapshot probably ought to be an async job with status updates, rather than a blocking job. But if we implement this solution, we can also use it for 'virsh blockcommit' for offline commits. This mode most closely matches what we do when deleting an internal snapshot.
I'm not sure if this is something we want to do entirely in libvirt. When committing into the original image (or other part of the backing chain) you might invalidate a subtree that will be created with the
3. I created an external checkpoint, and now I no longer need to go to that point in time. Solution: Delete the memory state file, then all we are left with is a disk snapshot, so we can now use scenario 1 or 2 to clean up the rest.
Agreed. When you have a separate copy at every snapshot all things get easier. Unfortunately we can't do that with disk images :)
4. I created a disk snapshot, then did some experiments, and decided I don't need the result and want to go back to that point in time without remembering the experiments. Solution: combination revert and delete, where we reset the domain to use the backing file and delete the qcow2 wrapper files.
5. I created an external checkpoint of a running machine, and now want to go back to that point in time. Solution: like 4, this is a combination revert and delete, where we delete qcow2 wrapper files, but we also boot using the saved memory image and backing files.
This is the original functionality I wanted to implement in patch 20/20.
6. I created an external checkpoint, did some experiments, and want to remember those results; but I also want to go back to the checkpoint and do some more experiments on a different branch. Solution: here, we need the ability to create NEW qcow2 files that also wrap the common base image. Since virDomainRevertToSnapshot doesn't have a way for us to pass in the new file names we wish to use, this needs to go through virDomainSnapshotCreateXML, which needs a new flag that says that we are simultaneously reverting to the state of an existing snapshot and creating a new snapshot from that point rather than at the current state of execution.
I saw your submission. But this will probably have to be a two-step procedure: 1) create a snapshot on the original branch to have a point to return to, 2) create the branch and revert to that point in time. Otherwise it will be hard to return to the original branch. Basically we need some kind of "current" snapshot in each of the branches. Technically that should be easy for a management application, but it won't be that comfortable to do in virsh.
7. Once we allow creation of snapshot forks from a common base point, we need to worry about deleting one fork but keeping other forks alive - for an external checkpoint, the memory file is now effectively shared by multiple snapshots, and must not be deleted until the last snapshot referencing the file is deleted.
Also, we just can't commit one of the forks into the base image, as the other fork would be invalidated. We'd need to pull those changes into the children.
8. It is also possible to use multiple snapshots to create a longer chain, such as 'base <- snap1 <- snap2 <- current', and then delete snap1 while still remembering snap2. Right now, we can use blockcommit while qemu is live (and a sequence of qemu-img commit and qemu-img rebase calls for offline) to merge snap1 into base; we could also use blockpull (via qemu-img rebase while offline, but currently lacking support if qemu is running) to pull snap1 into snap2. That is, forbidding the user from deleting an external snapshot unless they also delete descendants is not necessarily desirable.
I guess I've turned this more into a brain dump of where we should be headed - there is really a lot of functionality involved in reverting to external files, and we should be exposing choices to the user whether to pull into current, push into backing, discard files, or create parallel branches from the same starting point; and solutions to some of these issues touch not only snapshot deletion, but also snapshot revert and snapshot create. For this particular patch, though, I think we definitely want to introduce a new flag that says when we are doing a combined revert and delete, so that we can leave the option of deletion without losing current state (similar to how internal snapshot deletion works) until a later patch where we use qemu-img commit to do the work.
I will move the relevant parts of this patch into the revert API, as they are semantically closer to that functionality. Peter

On 11/09/12 12:36, Peter Krempa wrote:
On 11/06/12 23:39, Eric Blake wrote:
On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds limited support for deleting external snaphots. The
s/snaphots/snapshots/
machine must not be active and only whole subtrees of snapshots can be deleted as reparenting was not yet implemented for external snapshots.
These are reasonable restrictions for the first implementation. We may relax some of them later - for example, if qemu adds support for doing blockcommit from the active layer, then that would let us do a live deletion of an external snapshot.
---
[...]
6. I created an external checkpoint, did some experiments, and want to remember those results; but I also want to go back to the checkpoint and do some more experiments on a different branch. Solution: here, we need the ability to create NEW qcow2 files that also wrap the common base image. Since virDomainRevertToSnapshot doesn't have a way for us to pass in the new file names we wish to use, this needs to go through virDomainSnapshotCreateXML, which needs a new flag that says that we are simultaneously reverting to the state of an existing snapshot and creating a new snapshot from that point rather than at the current state of execution.
I saw your submission. But this will probably have to be a 2 step procedure:
1) create a snapshot at the original branch to have a point to return to 2) create the branch and revert to that point in time
Otherwise it will be hard to return to that original branch. Basically we need some kind of "current" snapshot in each of the branches. Technically that should be easy for a management application, but it won't be that comfortable to do in virsh.
An option would be to automatically create a "current" snapshot when leaving a branch, re-using the disk image files; it would then need to be deleted automatically as soon as the restore to it is complete.

This patch adds support for reverting of external snapshots. The support is somewhat limited yet (you can only revert to a snapshot that has no children or delete the children that would have their image chains invalidated). While reverting an external snapshot, the domain has to be taken offline as there's no possiblity to start a migration from file on a running machine. This poses a few dificulties when the user has virt-viewer attached as apropriate events need to be re-emited even if the machine doesn't change states. --- src/qemu/qemu_driver.c | 333 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 328 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 46b7e3a..dd4c216 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -12012,6 +12012,327 @@ qemuDomainSnapshotRevertInactive(struct qemud_driver *driver, return ret > 0 ? -1 : ret; } +/* helper to create lifecycle for possible outcomes of reverting snapshots */ +static int +qemuDomainSnapshotCreateEvent(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainState old_state, + virDomainState new_state) +{ + int reason = 0; /* generic unknown reason */ + int type = -1; + virDomainEventPtr event; + + switch (new_state) { + case VIR_DOMAIN_RUNNING: + switch (old_state) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + type = VIR_DOMAIN_EVENT_STARTED; + reason = VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_PMSUSPENDED: + case VIR_DOMAIN_PAUSED: + type = VIR_DOMAIN_EVENT_RESUMED; + reason = VIR_DOMAIN_EVENT_RESUMED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_LAST: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_PAUSED: + switch (old_state) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + /* the machine was started here, so we need an aditional event 
*/ + event = virDomainEventNewFromObj(vm, + VIR_DOMAIN_EVENT_STARTED, + VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT); + if (!event) + goto no_memory; + qemuDomainEventQueue(driver, event); + /* fallthrough! */ + case VIR_DOMAIN_PMSUSPENDED: + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_BLOCKED: + type = VIR_DOMAIN_EVENT_SUSPENDED; + reason = VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_PAUSED: + case VIR_DOMAIN_LAST: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_SHUTOFF: + switch (old_state) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_PAUSED: + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_PMSUSPENDED: + type = VIR_DOMAIN_EVENT_STOPPED; + reason = VIR_DOMAIN_EVENT_STOPPED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_LAST: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_PMSUSPENDED: + switch (old_state) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + /* the machine was started here, so we need an aditional event */ + event = virDomainEventNewFromObj(vm, + VIR_DOMAIN_EVENT_STARTED, + VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT); + if (!event) + goto no_memory; + + qemuDomainEventQueue(driver, event); + /* fallthrough! 
*/ + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_PAUSED: + case VIR_DOMAIN_RUNNING: + type = VIR_DOMAIN_EVENT_PMSUSPENDED; + reason = VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_PMSUSPENDED: + case VIR_DOMAIN_LAST: + /* no event */ + break; + } + break; + + /* fallthrough */ + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_LAST: + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Suspicious domain state '%d' after snapshot"), + new_state); + return -1; + } + + if (!(event = virDomainEventNewFromObj(vm, type, reason))) + goto no_memory; + + qemuDomainEventQueue(driver, event); + + return 0; + +no_memory: + virReportOOMError(); + return -1; +} + + +/* The domain and driver is expected to be locked */ +static int +qemuDomainSnapshotRevertExternal(virConnectPtr conn, + struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainSnapshotObjPtr snap, + unsigned int flags) +{ + int ret = -1; + int loadfd = -1; + + virDomainDefPtr save_def = NULL; + struct qemud_save_header header; + virFileWrapperFdPtr wrapperFd = NULL; + + virDomainState old_state = virDomainObjGetState(vm, NULL); + virDomainDefPtr config = NULL; + + struct qemu_snap_remove rem; + + /* check if snapshot has children */ + if (snap->nchildren > 0 && + !(flags & VIR_DOMAIN_SNAPSHOT_REVERT_FORCE)) { + virReportError(VIR_ERR_SNAPSHOT_REVERT_RISKY, + _("Can't revert to snapshot '%s'. " + "Snapshot has children."), snap->def->name); + goto cleanup; + } + + if (vm->current_snapshot) { + vm->current_snapshot->def->current = false; + if (qemuDomainSnapshotWriteMetadata(vm, vm->current_snapshot, + driver->snapshotDir) < 0) + goto cleanup; + vm->current_snapshot = NULL; + /* XXX Should we restore vm->current_snapshot after this point + * in the failure cases where we know there was no change? */ + } + + /* Prepare to copy the snapshot inactive xml as the config of this + * domain. 
Easiest way is by a round trip through xml. + * + * XXX Should domain snapshots track live xml rather + * than inactive xml? */ + snap->def->current = true; + if (snap->def->dom) { + char *xml; + if (!(xml = qemuDomainDefFormatXML(driver, + snap->def->dom, + VIR_DOMAIN_XML_INACTIVE | + VIR_DOMAIN_XML_SECURE | + VIR_DOMAIN_XML_MIGRATABLE))) + goto cleanup; + + config = virDomainDefParseString(driver->caps, xml, + QEMU_EXPECTED_VIRT_TYPES, + VIR_DOMAIN_XML_INACTIVE); + VIR_FREE(xml); + if (!config) + goto cleanup; + } else { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Domain definition not found in snapshot '%s'"), + snap->def->name); + goto cleanup; + } + + /* try to open the memory save image if one exists */ + if (snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) { + if (!snap->def->file || !virFileExists(snap->def->file)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Memory save file '%s' does not exist."), + snap->def->file); + goto cleanup; + } + + /* open the save image file */ + if ((loadfd = qemuDomainSaveImageOpen(driver, snap->def->file, + &save_def, &header, + false, &wrapperFd, NULL, + -1, false, false)) < 0) + goto cleanup; + + /* opening succeeded, there's a big probability the revert will work. + * We can now get rid of the active qemu, if it runs */ + } + + if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0) + goto cleanup; + + /* Now we destroy the (possibly) running qemu process. + * Unfortunately, the guest can be restored only using incomming migration. 
+ */ + if (virDomainObjIsActive(vm)) { + qemuProcessStop(driver, vm, + VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT, 0); + virDomainAuditStop(vm, "from-snapshot"); + } + + /* remove snapshot descendatns, as we're invalidating the image chain */ + + rem.driver = driver; + rem.vm = vm; + rem.metadata_only = false; + rem.err = 0; + rem.current = false; + virDomainSnapshotForEachDescendant(snap, + qemuDomainSnapshotDiscardAll, + &rem); + if (rem.err < 0) + goto endjob; + + /* assign domain definition */ + virDomainObjAssignDef(vm, config, false); + config = NULL; + + /* wipe and re-create disk images */ + if (qemuDomainSnapshotCreateInactiveExternal(driver, vm, snap, true) < 0) + goto endjob; + + /* save the config files */ + if (virDomainSaveConfig(driver->configDir, vm->def) < 0) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("Write to config file failed")); + goto endjob; + } + + /* re-start the guest if necessary */ + if (!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED) && + (snap->def->state == VIR_DOMAIN_RUNNING || + snap->def->state == VIR_DOMAIN_PAUSED || + flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING || + flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) { + + if (loadfd < 0) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Can't revert snapshot to running state. 
" + "Memory image not found.")); + goto endjob; + } + + if ((ret = qemuDomainSaveImageLoad(conn, driver, + vm, &loadfd, + &header, snap->def->file)) < 0) { + virDomainAuditStart(vm, "from-snapshot", false); + goto endjob; + } + + virDomainAuditStart(vm, "from-snapshot", true); + + if ((snap->def->state == VIR_DOMAIN_RUNNING && + !(flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) || + flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING) { + if (qemuProcessStartCPUs(driver, vm, conn, + VIR_DOMAIN_RUNNING_FROM_SNAPSHOT, + QEMU_ASYNC_JOB_NONE) < 0) + goto endjob; + } + } + + /* create corresponding life cycle events */ + if (qemuDomainSnapshotCreateEvent(driver, vm, old_state, + virDomainObjGetState(vm, NULL)) < 0) + goto endjob; + + ret = 0; +endjob: + if (qemuDomainObjEndJob(driver, vm) == 0) { + vm = NULL; + } else if (ret < 0 && !vm->persistent) { + qemuDomainRemoveInactive(driver, vm); + vm = NULL; + } + +cleanup: + virDomainDefFree(config); + virDomainDefFree(save_def); + VIR_FORCE_CLOSE(loadfd); + virFileWrapperFdFree(wrapperFd); + + return ret; +} + static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, unsigned int flags) { @@ -12072,12 +12393,14 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, "to revert to inactive snapshot")); goto cleanup; } - if (snap->def->state == VIR_DOMAIN_DISK_SNAPSHOT) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("revert to external disk snapshot not supported " - "yet")); + + /* Check if reverting an external snapshot */ + if (virDomainSnapshotIsExternal(snap)) { + ret = qemuDomainSnapshotRevertExternal(snapshot->domain->conn, + driver, vm, snap, flags); goto cleanup; } + if (!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_FORCE)) { if (!snap->def->dom) { virReportError(VIR_ERR_SNAPSHOT_REVERT_RISKY, @@ -12096,7 +12419,6 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, } } - if (vm->current_snapshot) { vm->current_snapshot->def->current = false; if (qemuDomainSnapshotWriteMetadata(vm, 
vm->current_snapshot, @@ -12129,6 +12451,7 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, goto cleanup; } + /* Internal snapshot revert code starts below */ if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0) goto cleanup; -- 1.7.12.4

On 11/01/2012 10:22 AM, Peter Krempa wrote:
This patch adds support for reverting of external snapshots. The support is somewhat limited yet (you can only revert to a snapshot that has no children or delete the children that would have their image chains invalidated).
Sounds like we need to work on the snapshot-create-and-revert operation; I'll see if I can propose something for that.
While reverting an external snapshot, the domain has to be taken offline as there's no possiblity to start a migration from file on a running
s/possiblity/possibility/
machine. This poses a few dificulties when the user has virt-viewer
s/dificulties/difficulties/
attached as apropriate events need to be re-emited even if the machine
s/apropriate/appropriate/ s/re-emited/re-emitted/
doesn't change states.
The need to start a new qemu process in order to revert to an external checkpoint sounds like a case for the already-existing VIR_DOMAIN_SNAPSHOT_REVERT_FORCE - if that flag is missing, then we refuse to revert to a running state just like we refuse to revert across guest ABI changes; when the flag is present, then the user is admitting that spawning a new qemu process (and thus temporarily breaking virt-viewer connection) is acceptable.
--- src/qemu/qemu_driver.c | 333 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 328 insertions(+), 5 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 46b7e3a..dd4c216 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -12012,6 +12012,327 @@ qemuDomainSnapshotRevertInactive(struct qemud_driver *driver, return ret > 0 ? -1 : ret; }
+/* helper to create lifecycle for possible outcomes of reverting snapshots */ +static int +qemuDomainSnapshotCreateEvent(struct qemud_driver *driver, + virDomainObjPtr vm, + virDomainState old_state, + virDomainState new_state) +{ + int reason = 0; /* generic unknown reason */ + int type = -1; + virDomainEventPtr event; + + switch (new_state) { + case VIR_DOMAIN_RUNNING: + switch (old_state) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_RUNNING:
Do we need an EVENT_STARTED even when we were previously running? Or is that only if we had to spawn a new qemu process (in which case, this function might need a bool parameter that says whether we are reusing an existing qemu or creating a new one).
+ case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + type = VIR_DOMAIN_EVENT_STARTED; + reason = VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_PMSUSPENDED: + case VIR_DOMAIN_PAUSED: + type = VIR_DOMAIN_EVENT_RESUMED; + reason = VIR_DOMAIN_EVENT_RESUMED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_LAST: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_PAUSED: + switch (old_state) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + /* the machine was started here, so we need an aditional event */
s/aditional/additional/
+ event = virDomainEventNewFromObj(vm, + VIR_DOMAIN_EVENT_STARTED, + VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT); + if (!event) + goto no_memory; + qemuDomainEventQueue(driver, event); + /* fallthrough! */ + case VIR_DOMAIN_PMSUSPENDED: + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_BLOCKED: + type = VIR_DOMAIN_EVENT_SUSPENDED; + reason = VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_PAUSED: + case VIR_DOMAIN_LAST: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_SHUTOFF: + switch (old_state) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_PAUSED: + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_PMSUSPENDED: + type = VIR_DOMAIN_EVENT_STOPPED; + reason = VIR_DOMAIN_EVENT_STOPPED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_LAST: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_PMSUSPENDED: + switch (old_state) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + /* the machine was started here, so we need an aditional event */
s/aditional/additional/
+ event = virDomainEventNewFromObj(vm, + VIR_DOMAIN_EVENT_STARTED, + VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT); + if (!event) + goto no_memory; + + qemuDomainEventQueue(driver, event); + /* fallthrough! */ + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_PAUSED: + case VIR_DOMAIN_RUNNING: + type = VIR_DOMAIN_EVENT_PMSUSPENDED; + reason = VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT; + break;
Back to my questions in patch 17 about whether we even have the capability of reverting to S3 state, and how we have to be careful about the difference between S3 (qemu continues to exist) and S4 (looks like SHUTOFF except for how the guest will boot next).
+
+        case VIR_DOMAIN_PMSUSPENDED:
+        case VIR_DOMAIN_LAST:
+            /* no event */
+            break;
+        }
+        break;
+
+        /* fallthrough */
+    case VIR_DOMAIN_NOSTATE:
+    case VIR_DOMAIN_BLOCKED:
+    case VIR_DOMAIN_SHUTDOWN:
+    case VIR_DOMAIN_CRASHED:
+    case VIR_DOMAIN_LAST:
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Suspicious domain state '%d' after snapshot"),
+                       new_state);
+        return -1;
+    }
+
+    if (!(event = virDomainEventNewFromObj(vm, type, reason)))
+        goto no_memory;
+
+    qemuDomainEventQueue(driver, event);
+
+    return 0;
+
+no_memory:
+    virReportOOMError();
+    return -1;
+}
Is this helper function worth splitting into a separate commit, rather than trying to add it and external reverts all at once?
+
+
+/* The domain and driver is expected to be locked */
+static int
+qemuDomainSnapshotRevertExternal(virConnectPtr conn,
+                                 struct qemud_driver *driver,
+                                 virDomainObjPtr vm,
+                                 virDomainSnapshotObjPtr snap,
+                                 unsigned int flags)
+{
+    int ret = -1;
+    int loadfd = -1;
+
+    virDomainDefPtr save_def = NULL;
+    struct qemud_save_header header;
+    virFileWrapperFdPtr wrapperFd = NULL;
+
+    virDomainState old_state = virDomainObjGetState(vm, NULL);
+    virDomainDefPtr config = NULL;
+
+    struct qemu_snap_remove rem;
+
+    /* check if snapshot has children */
+    if (snap->nchildren > 0 &&
+        !(flags & VIR_DOMAIN_SNAPSHOT_REVERT_FORCE)) {
+        virReportError(VIR_ERR_SNAPSHOT_REVERT_RISKY,
+                       _("Can't revert to snapshot '%s'. "
+                         "Snapshot has children."), snap->def->name);
+        goto cleanup;
+    }
This says that if the user _does_ pass FORCE, then we proceed with the revert anyway, even though it invalidates the children. I wonder if instead we need a flag that says 'revert to this snapshot and simultaneously delete its children'.

Also, in the case of external snapshots, we have two plausible points of where to revert to. For symmetry with internal checkpoints, the default with no flag has to be reverting to the point in time where the snapshot was taken. But for external snapshots, if we allow creation of snapshot branches (again, back to my create-and-revert idea), then it makes sense to revert to the named snapshot _and all subsequent changes_ tracked by the external file; sounds like yet another flag worth adding.
+
+    if (vm->current_snapshot) {
+        vm->current_snapshot->def->current = false;
+        if (qemuDomainSnapshotWriteMetadata(vm, vm->current_snapshot,
+                                            driver->snapshotDir) < 0)
+            goto cleanup;
+        vm->current_snapshot = NULL;
+        /* XXX Should we restore vm->current_snapshot after this point
+         * in the failure cases where we know there was no change? */
+    }
+
+    /* Prepare to copy the snapshot inactive xml as the config of this
+     * domain. Easiest way is by a round trip through xml.
+     *
+     * XXX Should domain snapshots track live xml rather
+     * than inactive xml? */
+    snap->def->current = true;
+    if (snap->def->dom) {
+        char *xml;
+        if (!(xml = qemuDomainDefFormatXML(driver,
+                                           snap->def->dom,
+                                           VIR_DOMAIN_XML_INACTIVE |
+                                           VIR_DOMAIN_XML_SECURE |
+                                           VIR_DOMAIN_XML_MIGRATABLE)))
+            goto cleanup;
+
+        config = virDomainDefParseString(driver->caps, xml,
+                                         QEMU_EXPECTED_VIRT_TYPES,
+                                         VIR_DOMAIN_XML_INACTIVE);
+        VIR_FREE(xml);
+        if (!config)
+            goto cleanup;
+    } else {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Domain definition not found in snapshot '%s'"),
+                       snap->def->name);
+        goto cleanup;
+    }
+
+    /* try to open the memory save image if one exists */
+    if (snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL) {
+        if (!snap->def->file || !virFileExists(snap->def->file)) {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("Memory save file '%s' does not exist."),
+                           snap->def->file);
NULL dereference if !snap->def->file. Your error message needs to be different for disk-only snapshots than for missing memory file for external checkpoint.
+            goto cleanup;
+        }
+
+        /* open the save image file */
+        if ((loadfd = qemuDomainSaveImageOpen(driver, snap->def->file,
+                                              &save_def, &header,
+                                              false, &wrapperFd, NULL,
+                                              -1, false, false)) < 0)
+            goto cleanup;
+
+        /* opening succeeded, there's a big probability the revert will work.
+         * We can now get rid of the active qemu, if it runs */
+    }
+
+    if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0)
+        goto cleanup;
+
+    /* Now we destroy the (possibly) running qemu process.
+     * Unfortunately, the guest can be restored only using incomming migration.
s/incomming/incoming/
+     */
+    if (virDomainObjIsActive(vm)) {
+        qemuProcessStop(driver, vm,
+                        VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT, 0);
+        virDomainAuditStop(vm, "from-snapshot");
+    }
+
+    /* remove snapshot descendatns, as we're invalidating the image chain */
s/descendatns/descendants/
+
+    rem.driver = driver;
+    rem.vm = vm;
+    rem.metadata_only = false;
+    rem.err = 0;
+    rem.current = false;
+    virDomainSnapshotForEachDescendant(snap,
+                                       qemuDomainSnapshotDiscardAll,
+                                       &rem);
+    if (rem.err < 0)
+        goto endjob;
+
+    /* assign domain definition */
+    virDomainObjAssignDef(vm, config, false);
+    config = NULL;
+
+    /* wipe and re-create disk images */
+    if (qemuDomainSnapshotCreateInactiveExternal(driver, vm, snap, true) < 0)
+        goto endjob;
Oh, here I see where you are passing 'true' for the reuse parameter where I complained that patch 15 was always passing false. But there's a difference between the user passing VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT (we should reuse the file exactly as-is, under the assumption that the user created it as a wrapper around the existing file with no contents) and the revert case (the wrapper is NOT thin, so it needs to be re-created or at least wiped to be made thin again). I'm wondering if this will need to be made a tri-state parameter, or more work done here to actually wipe things.
+
+    /* save the config files */
+    if (virDomainSaveConfig(driver->configDir, vm->def) < 0) {
+        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                       _("Write to config file failed"));
+        goto endjob;
+    }
+
+    /* re-start the guest if necessary */
+    if (!(flags & VIR_DOMAIN_SNAPSHOT_REVERT_STOPPED) &&
+        (snap->def->state == VIR_DOMAIN_RUNNING ||
+         snap->def->state == VIR_DOMAIN_PAUSED ||
+         flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING ||
+         flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) {
+
+        if (loadfd < 0) {
+            virReportError(VIR_ERR_INVALID_ARG, "%s",
+                           _("Can't revert snapshot to running state. "
+                             "Memory image not found."));
+            goto endjob;
+        }
I'm wondering if some of these errors should be detected up front, rather than after the point that we already shut down an already-running domain in preparation for loading the new state.
+...

@@ -12096,7 +12419,6 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
         }
     }
-
     if (vm->current_snapshot) {
         vm->current_snapshot->def->current = false;
         if (qemuDomainSnapshotWriteMetadata(vm, vm->current_snapshot,
Spurious whitespace change?
@@ -12129,6 +12451,7 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
         goto cleanup;
     }
+    /* Internal snapshot revert code starts below */
     if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0)
         goto cleanup;
-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On 11/01/2012 10:22 AM, Peter Krempa wrote:
> This is a second spin of the patches:
> 
> Changes to previous version:
> - pushed fix for private_syms file
> 
> patches 1-4 are Eric's patches that this series builds upon
> patches 5, were ACKed in v1
> patches 6,7,9,10,14 are fixed versions after review
> patch 8 is new in the series
> the rest needs review.
> 
> You can fetch the changes at:
> git fetch git://pipo.sk/pipo/libvirt.git snap-revert
I've now reviewed the series, and have some serious design questions about what we want to be doing in the last few patches (probably best by adding new flags in libvirt.h.in, affecting all of create, revert, and delete, in order to do some fused operations).

Not covered in those reviews, but related to this series - we need a patch to virsh snapshot-create-as to make it easy to specify that we want an external checkpoint; maybe by copying after --diskspec, something like --memspec snapshot=external,file=/path/to/file